
Gerhard Klebe

Drug Design
Methodology, Concepts,
and Mode-of-Action


With 494 Figures and 44 Tables


Gerhard Klebe
Institute of Pharmaceutical Chemistry
Philipps-University Marburg
Marburg, Germany

Translator
Leila Telan
Düsseldorf, Germany

ISBN 978-3-642-17906-8 ISBN 978-3-642-17907-5 (eBook)


ISBN 978-3-642-17908-2 (print and electronic bundle)
DOI 10.1007/978-3-642-17907-5
Springer Heidelberg New York Dordrecht London

This work is based on the second edition of “Wirkstoffdesign”, by Gerhard Klebe, published
by Spektrum Akademischer Verlag 2009, ISBN: 978-3-8274-2046-6
Library of Congress Control Number: 2013933987

© Springer-Verlag Berlin Heidelberg 2013


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts
in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being
entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication
of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from
Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center.
Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

The present handbook on drug design builds on the German version first written by
Hans-Joachim Böhm, Hugo Kubinyi, and me in 1996. After 12 years of success on
the market, the German version of this handbook was entirely rewritten and
significantly extended, then by me as the sole author. The new edition particularly
considers novel approaches in drug discovery and many successful examples
reported in literature on structure-based drug design and mode-of-action analysis.
This novel version appeared in 2009 on the German market. Several attempts were
made to translate this book into English to make it available to a wider audience.
This intention was driven by the fact that the author was repeatedly approached
with the question as to why such a successful book is not available in the English
language. An analysis of the textbook market made apparent that no similar
compendium was (and still is) available covering the same field of interest. Finally,
Springer agreed to the translation project, and Dr. Leila Telan, a gifted bilingual
medicinal chemist and physician, was willing to take on the task of producing
a first draft of a cover-to-cover translation of the German original. This version was
corrected, and some chapters were extended, by the author. The book is meant for
students of chemistry, pharmacy, biochemistry, biology, chemical biology, and
medicine interested in the design of new active agents and the structural foundations
of drug action. It is also tailored to experts in the drug industry who want to
obtain a more comprehensive overview of various aspects of the drug discovery
process.
Such a book project would not have been possible without the help of many
friends and colleagues. First of all, I want to express my sincere thanks to Dr. Leila
Telan, Düsseldorf, Germany, who produced the first version of this translation. Her
version and the modifications of the author have been carefully proofread by many
colleagues in the field. Their help is highly appreciated. Furthermore, I would like
to acknowledge the help of Prof. Dr. Hugo Kubinyi, Heidelberg, Germany, who
assisted in correcting the first version of the English translation. Particular thanks
go to Dr. Simon Cottrell, Cambridge, England, and to Dr. Nathan Kilah, Hobart,
Tasmania, Australia, for their excellent and very thorough proofreading of the
different chapters. The project was ideally guided by Dr. Daniel Quinones and
Dr. Sylvia Blago, Springer, Heidelberg, Germany. The author is grateful to the
publisher for their assistance and technical support in producing the electronic and
printed version of this handbook.

Marburg, Germany, May 2013 Gerhard Klebe


Contents

Part I Fundamentals in Drug Research

1 Drug Research: Yesterday, Today, and Tomorrow
2 In the Beginning, There Was Serendipity
3 Classical Drug Research
4 Protein–Ligand Interactions as the Basis for Drug Action
5 Optical Activity and Biological Effect

Part II The Search for the Lead Structure

6 The Classical Search for Lead Structures
7 Screening Technologies for Lead Structure Discovery
8 Optimization of Lead Structures
9 Designing Prodrugs
10 Peptidomimetics

Part III Experimental and Theoretical Methods

11 Combinatorics: Chemistry with Big Numbers
12 Gene Technology in Drug Research
13 Experimental Methods of Structure Determination
14 Three-Dimensional Structure of Biomolecules
15 Molecular Modeling
16 Conformational Analysis

Part IV Structure–Activity Relationships and Design Approaches

17 Pharmacophore Hypotheses and Molecular Comparisons
18 Quantitative Structure–Activity Relationships
19 From In Vitro to In Vivo: Optimization of ADME and Toxicology Properties
20 Protein Modeling and Structure-Based Drug Design
21 A Case Study: Structure-Based Inhibitor Design for tRNA-Guanine Transglycosylase

Part V Drugs and Drug Action: Successes of Structure-Based Design

22 How Drugs Act: Concepts for Therapy
23 Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate
24 Aspartic Protease Inhibitors
25 Inhibitors of Hydrolyzing Metalloenzymes
26 Transferase Inhibitors
27 Oxidoreductase Inhibitors
28 Agonists and Antagonists of Nuclear Receptors
29 Agonists and Antagonists of Membrane-Bound Receptors
30 Ligands for Channels, Pores, and Transporters
31 Ligands for Surface Receptors
32 Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs

Appendix
Illustration Source References
Name Index
Subject Index

Introduction

Drug design is a science, a technology, and an art all in one. An invention is the
result of a creative act, and a discovery is the detection of an already-existing
reality. Design encompasses the two processes with emphasis on a targeted
approach based on the available knowledge and technology. Furthermore, the
creativity and intuition of the researcher play a decisive role.
Drugs are all substances that affect a system by inducing a particular effect. In
the context of this book, drugs are substances that exhibit a biochemical or
pharmacological effect, in most cases medications, that achieve a therapeutic result
in humans.
The idea of rational drug design is not new. Organic compounds were prepared
more than a century ago with the goal of attaining new medicines. The sedatives
chloral hydrate (1869) and urethane (1885), and the antipyretics phenacetin
(1888) and acetylsalicylic acid (1897) are early examples of how, starting from
a working hypothesis, compounds with favorable therapeutic properties can be
made in a targeted way. The fact that the hypotheses in all four cases were more or
less incorrect (▶ Sects. 2.1, ▶ 2.2, and ▶ 3.1) simultaneously demonstrates one of
the main problems of drug design.
In the case of the artistic design of a poster or commodity, or, in the case of
engineering, the design of an automobile, a computer, or a machine, the result is
usually predictable. In contrast, the design of a drug is even today not completely
foreseeable. The consequences of the smallest structural changes of a drug on its
biological properties and target tissue are too multifaceted and at present too poorly
understood.
Until modern times, scientists worked on the principle of trial and error to
find new medicines. In doing so, they derived mostly empirical rules that have
contributed to a knowledge base for rational drug design, which individual
researchers have translated more or less successfully into practice. Today new
technologies are available for drug research, for instance, combinatorial chemistry,
gene technology, automated high-throughput screening, protein crystallography
and fragment screening, virtual screening, and the application of bio- and
chemoinformatics.

In many cases the molecular mechanisms of the mode of action of medicines are
fairly well understood, but in other cases we are at the threshold of comprehension.
Many of these mechanisms will be discussed in this book. Progress in protein
crystallography and NMR spectroscopy allows the determination of the three-
dimensional structure of protein–ligand complexes on a routine basis. As is
shown in many of the illustrations in this book (for a general explanation of
“reading” these illustrations, see the appendix at the end of this book) these
structures make a decisive contribution to the targeted design of drugs. 3D struc-
tures with up to atomic resolution are known for approximately 550,000 small
molecules and more than 85,000 proteins and protein–ligand complexes, and the
numbers are increasing exponentially. Methods for the prediction of the 3D struc-
tures of small molecules are now mature, and semiempirical and ab initio quantum
chemical calculations on drugs are now routinely performed. The sequencing of the
human genome is complete, and the genomes of other organisms are reported nearly
every week, including those of important human pathogens. The age of structural
genomics has begun, and it is only a matter of time before the 3D structures of entire
gene families are available. Given enough sequence homology, modeling programs
can nowadays achieve an impressive reliability. In the meantime, the composition of
entire genomes is being processed with structure-prediction programs. There are
already interesting approaches for the de novo prediction of 3D protein structures,
and the first correct 3D structural predictions have been successfully accomplished.
Structure-based and computer-aided design of new drugs is here to stay in
practical drug research. Computer programs serve the search for, modeling of,
and targeted design of new drugs. In countless cases these techniques have assisted
the discovery and optimization of new drugs. On the other hand, a too-strict and
one-sided focus on the computational results bears the danger of losing sight of the
available knowledge of the relationship between the chemical structure and bio-
logical activity. Another danger is the limited consideration of an active agent only
with respect to its interaction with one single target without considering the other
essential requirements for a drug, for instance, the pharmacokinetic and toxicolog-
ical properties. In the last decade, intensive research effort has gone into the
compilation of empirical guidelines to predict bioavailability, toxicological pro-
files, and metabolic properties (ADME parameters). The ability to predict the
metabolic profile for a given xenobiotic by the arsenal of cytochrome P450
enzymes or to predict for each individual patient the metabolic peculiarities is
still a dream. Nonetheless, just such an individually adjusted therapy and dosing
regime is within the realm of possibilities. It is also conceivable that in the
foreseeable future, gene sequencing of each of us will be financially feasible and
will require a manageable and justifiable amount of time and effort. This will open
entirely new perspectives for drug research. Whether this pushes open the gate to
individualized personal medicines will be a question of cost. The aim of this book
is to introduce the methods required for drug design, particularly those based on
structural and mechanistic evidence. Using well-selected examples, the route to the
discovery and development of new medicines is discussed and considered in light
of constantly changing conditions.

Drug research is a multidisciplinary field in which chemists, pharmacists,
technologists, molecular biologists, biochemists, pharmacologists, toxicologists,
and clinicians work together to pave the way for a substance to become
a therapeutic. Because of this, the majority of drug development is done in an
industrial setting. It is only there that the financial requirements and structural
organization are in place to allow a successful cooperation of all disciplines that are
necessary to channel the research in the required manner toward a common goal.
The fundamentals and future-oriented innovations of drug research are, however,
increasingly being established in academia. Interestingly, an increasing share of
research activity at universities has recently been devoted to drug development
for infectious diseases and for diseases that particularly afflict developing
countries, which have been sorely neglected by the profit-oriented pharmaceutical
industry of the industrialized world. This neglect is even more alarming when we consider
that our improved quality of life and prolonged life expectancy are attributable to,
above all else, a victory over devastating infectious diseases. We can only hope
that politicians recognize this situation in time and make the resources and
organizational infrastructure available so that the academic research groups can
step into the breach in an efficient and goal-oriented way.
The rising costs of research and development, an already high standard of health
care in many indications, and distinctly increased safety awareness with the
concomitant demanding standards of the regulatory authorities have caused the number
of new chemical entities (NCEs) to decrease steadily over the last decades: from
70–100 per year from 1960 to 1969, to 60–70 from 1970 to 1979, to an average of 50
between 1980 and 1989, to 40–45 in the 1990s, and even fewer in the new millennium.
Despite this, there have still been new developments, and distinct progress
has been made in the therapy of, for example, psychiatric diseases, arterial
hypertension, gastrointestinal ulcers, and leukemia, in addition to the broadening of
indications for older compounds. Among the blockbusters of recent years,
a disproportionately large percentage was found by using a rational approach.
The cost of developing and launching a new drug has increased continuously; at
present, it is between US $800 million and $1,600 million. Only large pharmaceutical companies
can still afford these costs, with the associated risk of failure in the last phases of
clinical trials, or a misjudgment of the therapeutic potential of a new drug.
There is talk nowadays of a paradigm shift in pharmaceutical research. In
research this refers to the use of new technologies; in the market place this refers
to a concentration process of corporate mergers and acquisitions. The last decade
brought about many such “mega-mergers.” Larger and larger sales figures are being
achieved by fewer and fewer companies. In parallel to this, a very dynamic and
hardly insignificant scene has developed of small- to medium-sized, highly flexible
biotech companies. The areas of gene technology, combinatorial chemistry, sub-
stance profiling, and rational design are particularly well represented in numerous
such companies. Larger companies try to outsource their riskier research concepts
to these companies and contract their services for everything up to the development
of clinical candidates. However, the success of this scene has led to the result that
the “good” companies have been swallowed by the “big” companies. Many former
employees of “big pharma” have established their own small companies with an
innovative idea. If the idea was good and successful, after a few years these
innovators find themselves once again incorporated into the organization of
a “big pharma” company.
At the same time the prescribing practices in all areas of health care have
changed. Formerly it was the physician alone, occasionally in consultation with
a pharmacist, who was responsible for the pharmacological therapy of the patient.
Today cost-cutting measures, “negative lists,” health insurance, the purchasing
departments of hospitals and pharmacies, the ubiquitous Internet, and even public
opinion influence therapies to an ever larger extent.
The drug market, worth US $600 billion, is extremely attractive.
Furthermore, it is characterized by dynamic growth that is decidedly
stronger than in other markets. The best-selling drug in 2005, Lipitor® (Sortis® in
Europe; atorvastatin), achieved US $12.2 billion in annual sales. Only illegal
narcotics like heroin and cocaine have higher sales figures.
Tailored medications – Will the latest technologies really deliver on this prom-
ise? What makes drug research so difficult? To use a parable, it is something like
playing against an almighty chess computer. The rules are known to both sides, but
it is very difficult to comprehend the consequences of each individual move during
a complicated middle game. A biological organism is an extremely complicated
system. The effect of a drug on the system and the effect of the system on the drug
are multifaceted. Every structural change made with the goal of optimizing one
particular characteristic simultaneously changes the finely tuned equilibrium of the
other characteristics of the drug.
The knowledge of the interplay between the chemical structure and the biolog-
ical effect must be united with the newest technology and results of genetic research
to purposefully develop new medicines. It is also necessary to define the range of
applications and the limitations of new technologies. Theory and modeling cannot
exist detached from experiment. The results of calculations depend strongly on the
boundary parameters of the simulation. The results collected for one system are only
conditionally transferable to other systems. Only an experienced specialist is in
a position to fully exploit the special potential of theoretical approaches. The claims
that some software and venture capital companies make, that their results automatically
lead to success, should be considered with some skepticism. This book should
be helpful in these situations too, to separate the wheat from the chaff and to
identify the application range of these methods as well as their limitations.
This book is about drug research and the mode of action of medicines. It is
different from classical textbooks on pharmaceutical chemistry in its structure and
goals. The principles, methods, and problems associated with the search for new
medicines are the themes. Classes of drugs are not discussed, but rather the way that
these drugs were discovered and some insights into the structural requirements for
their action on a particular target protein. As the title suggests, the book is meant for
students of chemistry, pharmacy, biochemistry, biology, and medicine who are
interested in the art of designing new medicines and the structural fundamentals of
how drugs act on their targets.

In the first section, after an introduction to the history of medicines and to
serendipity as an unpredictable but always very successful element of
drug research, examples from classical drug research are presented.
A discussion about the fundamentals of drug action, the ligand–receptor interaction,
and the influence of the three-dimensional structure on the efficacy of a drug round
the section out. In the second section, the search for lead structures and their
optimization and the use of prodrug strategies are introduced. New screening
technologies but also the systematic modification of structures by using the concept
of bioisosteres and a peptidomimetic approach are discussed. In the third section,
experimental and theoretical methods applied in drug research are described.
Combinatorial chemistry has afforded access to a wide variety of test substances.
Gene technology has produced the target proteins in their pure form, and has helped
to characterize these proteins’ properties and function from the molecular level to
the cellular assembly, all the way to the organism level. It has built a bridge between
understanding the effects of a drug therapy on the complex microstructure of a cell
and on the systems biology of an organism. The spatial structures of proteins and
protein–ligand complexes are accessible through NMR spectroscopy and X-ray
crystallography. Their structural principles are becoming better understood and are
increasingly giving us access to the binding geometry of drugs. Computer
methods, including molecular dynamics simulations and complex conformational analyses,
have also sharpened our understanding of targeted drug design. The fourth section
introduces design techniques such as pharmacophore and receptor modeling, and
discusses the methods of, and uses for, quantitative structure–activity relationships
(QSAR). Insights into the transport and distribution of drugs in biological systems
are given, and different techniques for structure-based design are presented. A drug-design
case study from the author’s research closes the section. The fifth section of
this book focuses on the core question of pharmacology: How do drugs actually work?
Enzymes, receptors, channels, transporters, and surface proteins are divided into
individual chapters and discussed as groups of target proteins. The spatial structure
of the protein and modes of action are used to elucidate in detail why a drug works
and why it must exhibit a particular geometry and structure to work. By way of example,
the contributions of structure-based and computer-aided design to the discovery of
new drugs are presented in these chapters, and other aspects are also brought into the
spotlight.
Because of the concept of this book, many important drugs are not considered or
are only fleetingly mentioned. The same is true of receptor theory, pharmacokinet-
ics and metabolism, the basics of gene technology, and statistical methods. The
biochemical, molecular biological, and pharmacological fundamentals of the mode
of action of drugs, which are important for the understanding of the theme of drug
design, are only commented upon in outline form. Other disciplines that are critical
for the development of an active substance into a medicine and its application to
patients, such as pharmaceutical formulation, toxicological testing, and clinical
trials, are not covered in this book.
The selection of examples from therapeutic areas was made subjectively and for
didactic reasons, based on case studies and with the aim of bringing other aspects
of drug research to the foreground. A balanced presentation of the methods of drug
design and their practical application was attempted. The interested reader does not
have to read the book from beginning to end. If the reader’s interest is purely in
drugs and their mode of action, then they can also begin with ▶ Chap. 22. There are
many cross references in the text to help the reader find the passages in other parts
of the book that are necessary for an exact comprehension of what is being discussed
at any given point. The references and literature suggestions that follow cite
particularly recommendable monographs and are ordered alphabetically; journals and
series on the themes that are discussed in later chapters are not mentioned
specifically again.

Literature

Monographs

Brunton L, Lazo J, Parker K (2005) Goodman & Gilman’s the pharmacological basis of
therapeutics, 11th edn. McGraw-Hill, Europe
Ganellin CR, Roberts SM (eds) (1993) Medicinal chemistry. The role of organic chemistry in drug
research, 2nd edn. Academic Press, London
King FD (ed) (2003) Medicinal chemistry: principles and practice, 2nd edn. The Royal Society of
Chemistry, Cambridge
Krogsgaard-Larsen P, Bundgaard H (eds) (1991) A textbook of drug design and development.
Harwood Academic Publishers, Chur, Switzerland
Lednicer D (ed) (1993) Chronicles of drug discovery, vol 3. American Chemical Society,
Washington, DC and earlier volumes from this series
Lemke TL, Williams DA (2008) Foye’s principles of medicinal chemistry, 6th edn. Williams &
Wilkins, Baltimore
Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry. Wiley-
VCH, Weinheim, Series with Guest Editors
Maxwell RA, Eckhardt SB (1990) Drug discovery. A casebook and analysis. Humana Press,
Clifton
Mutschler E, Derendorf H (1995) Drug action, basic principles and therapeutic aspects. CRC
Press, Boca Raton/Ann Arbor/London/Tokyo
Silverman RB (2004) The organic chemistry of drug design and drug action, 2nd edn. Elsevier/
Academic Press, Burlington
Wermuth CG, Koga N, König H, Metcalf BW (eds) (1992) Medicinal chemistry for the 21st
century. Blackwell Scientific, Oxford

Journals and Series


Annual Reports in Medicinal Chemistry
Chemistry & Biology
ChemMedChem
Drug Discovery Today
Drug News and Perspectives
Journal of Computer-Aided Molecular Design
Journal of Medicinal Chemistry
Methods and Principles in Medicinal Chemistry
Nature
Nature Reviews Drug Discovery
Perspectives in Drug Discovery and Design
Pharmacochemistry Library
Progress in Drug Research
Quantitative Structure-Activity Relationships
Reviews in Computational Chemistry
Science
Scientific American
Trends in Pharmacological Sciences
Nowadays the Internet, discussion platforms, and the tremendously valuable tool of Wikipedia are
available to everyone and provide access to an enormous source of information.
Part I
Fundamentals in Drug Research

This colored copper plate engraving from arguably the most beautiful plant book,
the Hortus Eystettensis by Basilius Besler, Eichstätt, 1613, shows the squill, Scilla
alba (modern name: Urginea maritima L.). This plant was known to the ancient
Egyptians, Greeks, and Romans as a remedy for many ailments, but especially
dropsy (today: congestive heart failure). It was venerated faithfully as a general
defense against harm. It was not until the twentieth century that the active components of
squill, the glycosides scillaren and proscillaridin, were isolated in their pure form,
and a derivative with improved bioavailability, meproscillarin (Clift®), became available
for pharmaceutical therapy.
1 Drug Research: Yesterday, Today, and Tomorrow

The targeted route to medicines is an old dream of humanity. Even the alchemists
sought after the Elixir, the Arcanum that was meant to heal all disease. It still has
not been found today. On the contrary, drug therapy has become even more
complicated as our knowledge of the different disease etiologies has become
more complex.
Nonetheless, the success of drug research is impressive. For hundreds of years,
alcohol, opium, and solanaceae alkaloids (from thorn apples) were the only prepa-
ratory measures for surgery. Today general anesthesia, neuroleptanalgesia, and
local anesthetics allow absolutely pain-free surgical and dental procedures to be
carried out. Until the twentieth century, plagues and infectious diseases killed more
people than all wars. Today, thanks to hygiene, vaccines, chemotherapeutics, and
antibiotics, these diseases have been suppressed, at least in industrialized countries.
The dangerously increasing numbers of therapy-resistant bacterial and viral path-
ogens (e.g., tuberculosis) have presented new problems and make the development
of new medications urgently necessary. The H2-receptor inhibitors and proton-
pump inhibitors have drastically reduced the number of surgical procedures to treat
gastric and duodenal ulcers. Combinations of these inhibitors with antibiotics have
brought even more advances in that they allow a causal therapy (▶ Sect. 3.5).
Cardiovascular diseases, diabetes, and psychiatric diseases (diseases of the central
nervous system, CNS) are treated mostly symptomatically, that is, the cause of the
disease is not addressed, but rather the negative effects of the disease on the
organism. Often the therapy is limited to slowing the progression of these diseases
or increasing the quality of life. Synthetic corticosteroids have led to significant
pain reduction and retardation of the pathological bone degeneration associated
with chronic inflammatory diseases (e.g., rheumatoid and chronic polyarthritis).
The spectrum of cancer therapy ranges from healing, particularly in combination
with surgical and radiation therapy, all the way to complete failure of all therapeutic
measures.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_1, © Springer-Verlag Berlin Heidelberg 2013

The history of drug research can be divided into several sequential phases:
• the beginning, when empirical methods were the only source of new medicines,
• targeted isolation of active compounds from plants,
• the beginning of a systematic search for new synthetic materials with biological
effects and the introduction of animal models as surrogates for patients,
• the use of molecular and other in vitro test systems as precise models and as
a replacement for animal experiments,
• the introduction of experimental and theoretical methods such as protein
crystallography, molecular modeling, and quantitative structure–activity
relationships for the targeted structure-based and computer-supported design of
drugs, and
• the discoveries of new targets and the validation of their therapeutic value
through genomic, transcriptomic, and proteomic analysis, knock-in and
knock-out animal models, and gene silencing with siRNA.
Each preceding phase loses its importance with the arrival of the next phase.
Interestingly, in modern drug research individual phases run in the opposite direc-
tion. That is, first a target structure is discovered in the sequenced genome of an
organism and its function is modulated to validate it as a candidate for drug therapy.
Then the structure-based and computer-aided design of an active substance is
undertaken in close cooperation with multiple in vitro tests to clarify the activity
and the activity spectrum. Next, the animal experiments substantiate the clinical
relevance, and in the final step clinical trials confirm a test substance’s suitability as
a medicine for patients.

1.1 It All Began with Traditional Medicines

The beginnings of drug therapy can be found in traditional medicines. The narcotic
effect of the milk of the poppy, the use of autumn crocus (Colchicum autumnale)
for gout, and the diuretic effect of squill (Urginea maritima) for dropsy (today:
congestive heart failure) have been known since antiquity. The dried herbs and
extracts from these and other plants have served as the most important source of
medicines for more than 5,000 years. The oldest written records of these uses are
from 3000 BC.
Around 1550 BC the ancient Egyptian Ebers Papyrus listed approximately 800
prescriptions, of which many contained additional rituals to invoke the help of the
gods. The five-volume book De Materia Medica of Dioscorides (Greek physician,
first century AD) is the most scientifically rigorous work of antiquity. It contains
descriptions of 800 medicinal plants, 100 animal products, and 90 minerals. Its
influence reached into the late Arabic medicine and the early modern age.
The most famous medicine of antiquity was undoubtedly Theriac. Its precursor,
Mithridatum, served the King of Pontus, Mithridates VI (120–63 BC) as an antidote
for poisonings of all kinds. Theriac can be traced to Andromachus, the private
physician of the emperor Nero, and originally contained 64 ingredients. This
preparation remained very widespread even into the eighteenth century. It was
prepared in many variations with up to 100 ingredients. In some cities it was even
prepared under state control to ensure that none of the ingredients were left out! Its
use evolved into a panacea for all diseases. In addition, every imaginable wonder
drug was in use; examples include earthworm oil, unicorn powder, gastric
calculus stones, human cranium powder (Lat. cranium, skull), mummy dust, and
many more.
Traditional Chinese medicine was very advanced even in ancient times.
A special feature of its formulations was, and is, that four different qualities
are held responsible for the effect. The chief (jun) is the carrier of the effect;
the adjutant (chen) supports the effect or induces a different effect. The assistant
(zuo) can also support the main effect or can serve to ameliorate side effects, and
one or more messengers (shi) moderate the desired effect. The Chinese Pen-Ts’ao
school (first and second century AD), whose goal it was to live for as long as
possible without aging (!), recommended the following dosing regime:
When treating a disease with a medicine, if a strong effect is desired, one should begin with
a dose that is not larger than a grain of millet. If the disease is healed, no more medicine
should be given. If the disease is not healed, the dose should be doubled. If that does not
heal the disease, the dose should be increased tenfold. When the disease is healed, the
therapy should always be discontinued.

The Chinese Materia Medica published by Li Shizhen in 1590 is made up of
52 volumes. It contains almost 1,900 medical principles, plants, insects, animals,
and minerals incorporated into 10,000 detailed recipes for their preparation. The
Chinese Pharmacopeia from 1990 contains only two volumes. One of those
volumes contains 784 traditional medicines; the other contains 967 medications
from “Western” medicine.
Paracelsus (born Theophrastus Bombastus von Hohenheim; 1493/1494–1541)
made a great breakthrough for scientific medical research. He understood the
human to be a “chemical laboratory” and held the ingredients of drugs themselves,
the Quinta essentia, responsible for their healing effects. Despite this, up until
the beginning of the nineteenth century all therapeutic principles were based on
extracts from plants, animal ingredients, or minerals; only in the rarest
cases were pure organic compounds used. That changed fundamentally with the
advent of organic chemistry. The great age of natural products from plants (for
examples see 1.1–1.9, Fig. 1.1) and of the active substances derived from
them had begun. Premature hopes that were invested in some of these substances
around the turn of the twentieth century, for example in heroin (▶ Sect. 3.3) or
cocaine (▶ Sect. 3.4), were very quickly dashed, but natural products from
plants established the fundamentals of, and form an exceedingly large part of, our
modern pharmacy. Natural products and their analogues and derivatives are also
well represented among the best-selling drugs today.

1.2 Animal Experiments as a Starting Point for Drug Research

The wealth of experience gained by traditional medicine is based on many thousands
of years of sometimes accidental, sometimes intentional observations of
therapeutic effects on humans. Planned investigations on animals were relatively
rare. The biophysical experiment of Luigi Galvani, an anatomy professor in
[Chemical structures: 1.1 Morphine, 1.2 Caffeine, 1.3 Quinine, 1.4 Colchicine, 1.5 Cocaine, 1.6 Ephedrine, 1.7 Coniine, 1.8 Atropine (racemate), 1.9 Reserpine]

Fig. 1.1 Many important natural products were isolated in the nineteenth century, and a few were
synthesized. Morphine 1.1 was isolated from opium by Friedrich Wilhelm Adam Sertürner in
1806, caffeine 1.2 was isolated from coffee, and quinine 1.3 was isolated from cinchona bark by
Friedlieb Runge in 1819. Quinine was discovered independently by Pierre Joseph Pelletier and
Joseph Bienaimé Caventou, who one year later isolated colchicine 1.4 from autumn crocus.
Cocaine 1.5 was extracted from coca leaves by Albert Niemann in 1860, and ephedrine 1.6 was
extracted from the Chinese plant Ma Huang (Ephedra vulgaris) by Nagayoshi Nagai. In 1886 the
first alkaloid, coniine 1.7, which is found in hemlock, was synthesized by Albert Ladenburg; in
1901 atropine 1.8 from Deadly Nightshade was synthesized by Richard Willstätter. Reserpine 1.9,
from Rauwolfia serpentina, was first prepared in the middle of the twentieth century, and its
structure was elucidated.
Bologna, which was first described in his book De viribus electricitatis in motu
musculari in 1791, has become famous. In 1780 his students had already observed
how frog legs would twitch when the nerve was dissected while a static electricity
generator was in use; such devices were standard equipment
in many laboratories at the time. He wanted to demonstrate in standardized
experiments whether the twitching was also caused by thunderstorms. He hung the
legs on an iron window grill with a copper hook — they twitched simply upon
contact with the grill. The voltage difference between the two metals was enough to
stimulate the nerve, even without an electrical discharge.
The systematic investigation of the biological effects in animals of plant
extracts, animal venoms, and synthetic substances began in the nineteenth century.
In 1847 the first pharmacology department was founded at the Imperial University
in Dorpat (today: Tartu, Estonia). The famous pharmacologist, Sir James W. Black,
who developed the first β-blocker (an antihypertensive, ▶ Sect. 29.3) at ICI, and
later took part in the development of the first H2 antagonists (see gastrointestinal
ulcer medications, ▶ Sect. 3.5) at Smith, Kline & French, compared pharmacolog-
ical testing to a prism: what pharmacologists see in their substances’ properties
directly depends on the model that was used to test the substances.
Just as a prism would, the models distort our vision in different ways. There is no
such thing as a depressed rabbit or a schizophrenic rat. Even if there were such
animals, they would not be able to share their subjective perceptions and emotions
with us. Gene-modified animals (▶ Sect. 12.5), such as the Alzheimer mouse, are also
approximations of reality that have been distorted through a different prism, to use
Black’s analogy. This fact is often underestimated in industrial practice. Scientists
tend to optimize their experiments on a particular, isolated model. In doing so,
many factors and characteristics that are essential for a medicine, for instance the
selectivity or bioavailability, are inadequately considered.
There is no way out of this dilemma. We need simple in vitro models (Sect. 1.5)
to be able to test large series of potentially active compounds, and we need the
animal models to correlate the data and to make predictions about the therapeutic
effects on humans. In the past, therapeutic progress was preferentially achieved
when a new in vivo or in vitro pharmacological model was available for a new effect
(see the H2 receptor antagonists, ▶ Sect. 3.5).
Typical mistakes in the selection of models and interpretation and comparison of
experimental results arise from different modes of application and the correlation of
results obtained in different species of animals. It does not make sense to optimize
the therapeutic range of a substance in one species, and the toxicology in another.
Further, comparing effects after a fixed dose, without determining an effective dose
also distorts the results because very strong and weak substances fall outside the
measurement range. Measuring the effect strictly according to a schedule is also
questionable because neither the latency period, that is, the time before an effect is
seen, nor the time of maximum biological effect is recorded. In whole-animal
models, auxiliary medications are usually applied, which can also influence the
experimental results. Anesthetized animals often give entirely different results than
conscious animals.
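
The distortion caused by a fixed-dose comparison can be illustrated with a small numerical sketch. It assumes a simple sigmoidal (Hill-type) dose–response relationship, which is an assumption made here for illustration and is not taken from the text; the three substances, their ED50 values, and the fixed dose are entirely hypothetical.

```python
def hill_effect(dose, ed50, e_max=100.0, hill=1.0):
    """Percent of maximal effect at a given dose (assumed Hill model)."""
    return e_max * dose**hill / (ed50**hill + dose**hill)


# Hypothetical substances whose potencies (ED50, mg/kg) span four orders of magnitude.
ed50_values = {"strong": 0.01, "medium": 1.0, "weak": 100.0}

# The single dose (mg/kg) at which every substance is tested.
fixed_dose = 1.0

effects = {
    name: hill_effect(fixed_dose, ed50) for name, ed50 in ed50_values.items()
}

for name, effect in effects.items():
    print(f"{name:>6}: ED50 = {ed50_values[name]:g} mg/kg, "
          f"effect at fixed dose = {effect:.1f} %")
```

At the fixed dose the strong substance already sits at the upper plateau of the curve and the weak one is barely distinguishable from no effect, so the measured effects say little about how far apart the potencies really are. Determining an effective dose for each substance avoids this loss of information.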
1.3 The Battle Against Infectious Disease

Plagues and infectious diseases, with malaria and tuberculosis at the top of the list,
have killed more people over the ages than all of the wars in the history of
humanity. Twenty-two million people died during the first wave of the 1918
influenza (“Spanish flu”). Up until the middle of the twentieth century, millions
of people died every year of malaria, and unfortunately, today these numbers are
shooting up again (▶ Sect. 3.2). Until the turn of the twentieth century, ipecac
(Psychotria ipecacuanha) and cinchona (Cinchona officinalis L.) were the only
therapeutic approaches to this disease. The impressive successes in the fight against
plagues came in large part from the last 80 years of drug research. We have the
sulfonamides (▶ Sect. 2.3) and their combinations with dihydrofolate reductase
inhibitors (▶ Sect. 27.2), the antibiotics (▶ Sects. 2.4, ▶ 6.4, and ▶ 32.6), and
the synthetic tuberculostatic medicines (▶ Sect. 6.5) to thank for this. When
Selman A. Waksman (1888–1973) received the Nobel Prize for the discovery of
streptomycin (▶ Sect. 6.4), a little girl congratulated him with a bouquet of
flowers. She was the first patient with meningeal tuberculosis to be healed with
streptomycin. Today we cannot appreciate the atmosphere in a tuberculosis hospital
from our own experience, but only from Thomas Mann’s The Magic
Mountain (German: Zauberberg).
However, the infectious diseases, including tuberculosis, are on the advance
again. In the past many antibiotics were too broadly used. This and the spread of
resistant pathogens in hospitals have led to the situation that many cases are only
treatable with very specific antibiotics. If resistance develops to these antibiotics, all
of our weapons are blunt. New viral infections are looming. Before the advent of the
immune disease AIDS (acquired immune deficiency syndrome) there were very
few cases of pneumonia from the fungus Pneumocystis jirovecii (formerly
Pneumocystis carinii); nowadays the numbers have increased tremendously. This
type of pneumonia is the primary cause of death of AIDS patients and
immunosuppressed patients after organ transplantation. A great effort has been
made to find drugs for AIDS and its complications. On the other hand, many
widespread tropical diseases, for instance malaria and Chagas disease, have been
inadequately researched, and expanding resistance to the currently available med-
ications represents an increasing worldwide problem. Because these diseases are
rampant in parts of the world where people lack the economic resources to finance
chemotherapy, more and more pharmaceutical companies have withdrawn from
these research areas for financial reasons. The chances of recovering the development
costs from this social stratum are poor. Here global politics must establish
some structure so that these people are able to benefit from the technological
progress made by modern drug research. An example of this is the Bill and Melinda
Gates Foundation, which is dedicated to the treatment and eradication of diseases
around the entire world, but with particular emphasis on developing countries.
Improved hygiene has also helped to reduce the risk of infection, for
instance traumatic fever or Shigella dysentery (discussed in ▶ Chap. 21, “A Case
Study: Structure-Based Inhibitor Design for tRNA-Guanine Transglycosylase”).


Above all else, it was the vaccines that contributed to the eradication of many
infectious diseases. Now as before, hopes rest on new and combined vaccines for
the prevention of AIDS, malaria, and gastrointestinal ulcers, the latter of which we
now know to be caused by the bacterium Helicobacter pylori (▶ Sect. 3.5).

1.4 Biological Concepts in Drug Research

Acetylcholine 1.10 (Fig. 1.2), which was synthesized in 1869 by Adolf v. Baeyer, is
a neurotransmitter, that is, a transfer agent for nerve impulses. In 1921 Otto
Loewi, a pharmacologist, proved its biological effect in an elegant experiment.
Two isolated frog hearts were perfused with the same solution. The vagal nerve of
one of the hearts was stimulated, leading to a slowing of the heart rate, a so-called
bradycardia. Shortly afterward, the second heart also began to beat more slowly,
which was a clear indication of a humoral (Lat. humor, umor, fluid) signal transfer.
Soon after that acetylcholine was recognized as the responsible “Vagusstoff”.
Acetylcholine is itself not usable as a therapeutic because it is metabolized too
quickly by acetylcholine esterases (▶ Sect. 23.7).
In 1901 Thomas Bell Aldrich (1861–1938) and Jokichi Takamine isolated the
first human hormone, adrenaline 1.11 (Fig. 1.2). This hormone and its N-desmethyl
derivative, noradrenaline 1.12 (Fig. 1.2), are produced in a central location, the
adrenal glands, and are released under stress conditions into the entire system with
the exceptions of the CNS and the placenta, which have their own barriers against
most polar compounds. These substances cause different reactions in different parts
of the organism, where they react with the relevant receptors. The specificity is
poor, and a plethora of pharmacodynamic effects result: pulse and blood pressure
rise, and the organism is prepared for “flight” – which has been an exceedingly
important function over the course of evolution.
Noradrenaline and adrenaline (also called norepinephrine and epinephrine,
respectively) are also neurotransmitters (▶ Sect. 29.3), just like acetylcholine, the
biogenic amines 1.13–1.15, the amino acids 1.16–1.19, and peptides, such as 1.20
and 1.21 (Fig. 1.2). Neurotransmitters are produced locally in the nerve cells,
stored, and upon stimulation of the nerve, released. After interaction with receptors
on the neighboring nerve cell, they are quickly metabolized or taken up again by the
same neuron that released them. Depending on the name of the neurotransmitter,
one speaks of the adrenergic, cholinergic, and dopaminergic (etc.) systems. The
effect that adrenaline invokes is referred to as adrenergic, and an antagonist to this
system is called antiadrenergic. However, this nomenclature is not always strictly
adhered to. It is common to see combinations of the name of the neurotransmitter
with the term agonist or antagonist, or sometimes blocker instead of antagonist, for
instance a dopamine agonist, a histamine antagonist, or a β-blocker for antagonists
of β-adrenergic receptors. A plethora of drugs have arisen from the structural
variations of neurotransmitters.
[Chemical structures: 1.10 Acetylcholine; 1.11 Adrenaline (R = CH3); 1.12 Noradrenaline (R = H); 1.13 Dopamine; 1.14 Histamine; 1.15 Serotonin; 1.16 Aspartic acid; 1.17 Glutamic acid; 1.18 Glycine; 1.19 γ-Aminobutyric acid; 1.20 Met-Enkephalin (Tyr-Gly-Gly-Phe-Met); 1.21 Leu-Enkephalin (Tyr-Gly-Gly-Phe-Leu)]

Fig. 1.2 The natural hormones and neurotransmitters acetylcholine 1.10, adrenaline 1.11,
noradrenaline 1.12, dopamine 1.13, histamine 1.14, and serotonin 1.15, the excitatory amino acids
aspartic acid 1.16 and glutamic acid 1.17, the inhibitory amino acids glycine 1.18 and
γ-aminobutyric acid (GABA) 1.19, and several peptides, such as the enkephalins 1.20 and 1.21,
substance P and others serve as lead structures for drugs for a variety of cardiovascular and CNS
diseases (see ▶ Chaps. 3, “Classical Drug Research”; ▶ 29, “Agonists and Antagonists of Mem-
brane-Bound Receptors”; and ▶ 30, “Ligands for Channels, Pores, and Transporters”).

At the end of the 1920s the steroid hormones were isolated, and their structures
were determined in short order (▶ Sect. 28.5). Altogether the discoveries of the
mid-twentieth century heralded the “golden age” of drug research. The systematic
variation of the principles responsible for biological activity and our increasing
knowledge of the mode of action have led to the synthesis of enzyme inhibitors and
receptor agonists and antagonists, which together with natural product derivatives
from plants make up the largest part of our modern pharmacy.
1.5 In Vitro Models and Molecular Test Systems

Around 40 years ago, we began to think about testing substances in simple in vitro
models. With these models biological testing takes place in test tubes rather than
animals. There are many compelling reasons to avoid animal experiments. They
increasingly provoke public criticism and are time and cost intensive. In the
beginning cell culture models were preferentially employed, for example tumor
cell cultures for testing cytostatic therapies, or embryonic chicken heart cells for
cardio-active compounds. Later these were joined by receptor-binding studies. The
first molecular test models were enzyme-inhibitor assays in which the inhibitory
activity of a molecule could be evaluated on one particular target protein in the
absence of interfering side effects (▶ Chap. 7, “Screening Technologies for
Lead Structure Discovery”). With the progress of gene technology methods
(▶ Chap. 12, “Gene Technology in Drug Research”), not only is the preparation
of the enzyme simplified, but also receptor-binding studies can be carried out on
standardized materials. Today it is possible to achieve an exact evaluation of the
entire activity spectrum of any substance on any enzyme, receptors of all types and
subtypes, ion channels, and transporters. In the meantime, in industrial drug dis-
covery this procedure has become routine. Before biological screening begins, the
following questions have to be answered: what therapeutic goal should be achieved
and is this goal achievable? Therapeutic concepts are established based on the
pathophysiology and the causes of its alteration. Regulatory interventions with
drugs should re-establish the normal physiological conditions as closely as possible.
In doing so, a distinct problem occurs. Nature works on two orthogonal principles: the
specificity of the mode of action and a pronounced spatial separation of effects, that is,
compartmentalization. Adrenaline that is produced in the adrenal glands works
on the entire body except for the brain. If it is released there, it works only in the
synapse between two nerve cells. As far as the specificity goes, the chemists can beat
nature most of the time, but they fail when it comes to spatial separation by a wide
margin.
Through the progress made in gene technology (▶ Chap. 12, “Gene Technology
in Drug Research”) we can investigate active substances much more exactly than
before; but by using isolated enzymes and binding studies we are a long way away
from the reality of animal models, and even further away from humans. In analogy
to the difference between an animal experiment and an isolated-organ experiment,
a well-established correlation between the results obtained in cell culture and an
in vitro test and the desired therapeutic effect is a prerequisite to successfully using
the in vitro model. The quantitative relationship between different biological effects
(▶ Chap. 18, “Quantitative Structure–Activity Relationships”) establishes the con-
nection between animal models and humans.
One modern researcher stands out in the area of CNS-active compounds
especially, but also in areas of cardiovascular-active compounds and antihista-
mines. Paul Janssen (1926–2003) was the director of the company Janssen
Pharmaceuticals in Beerse, Belgium. In the years after World War II, his company
discovered over 70 new active substances, carried out the preclinical and clinical
development, and established them as therapies. In doing so, his company
established itself as the most successful in pharmaceutical history. His recipe
for success was not a secret. Paul Janssen was a master of structural variation,
a Beethoven of drug discovery. The systematic combination of pharmacologically
interesting structural fragments, and the elegant evaluation of receptor-binding
studies, in vitro models, and animal experiments were the foundation of
his successes.

1.6 The Successful Therapy of Psychiatric Illness

Up until the middle of the last century psychiatric hospitals were purely custodial care
facilities; they were almost indistinguishable from prisons in terms of the restriction of
personal freedom of the individual. The discovery of neuroleptics, antidepressants,
anticonvulsives, and sedatives revolutionized psychiatry. Typical examples of these
classes of drugs are depicted in Fig. 1.3. With the repertoire of drugs that are available
today, schizophrenia, chronic anxiety, and depression are treated predominantly in open-ward
psychiatry. Many patients can be treated in an ambulatory setting. In 1933 Manfred Sakel
(1901–1957), who worked at the psychiatric university hospital in Vienna, noticed that
when schizophrenics were given insulin to stimulate their appetites, they became
calmer. Encouraged by this result, he increased the dose to the point of hypoglycemic
coma, which is a form of deep unconsciousness induced by too little blood sugar. Insulin
shock, pentetrazole, and electroshock became the standard treatment over the next two
decades for psychotic illness, an impressive and frightening proof of the absence of
therapeutic alternatives.
This situation changed in the 1950s with the discovery of reserpine 1.9 (Fig. 1.1,
Sect. 1.1), a herbal natural product. This substance exerts its effect by emptying the
reserves of the neurotransmitters noradrenaline, serotonin, and dopamine in nerve
cells. Reserpine was the first substance to display a prominent neuroleptic effect,
that is, it is sedating and calming, and it was the first compound to be used for
psychotic illness, for which the biological effect could be explained by a mode of
action. In addition, reserpine was used as an antihypertensive medication. Because
of its very broad and unspecific effect it is rarely used today for psychiatric illness
or arterial hypertension.
The role of dopamine 1.13 (Fig. 1.2, Sect. 1.4) in the etiology of schizophrenia
became clear with the discovery of chlorpromazine 1.22 (Fig. 1.3, ▶ Sects. 8.5 and
▶ 19.10), a substance that showed a favorable clinical effect. In contrast to the
unspecific reserpine, chlorpromazine is a pure dopamine antagonist. The applica-
tion of chlorpromazine and analogous tricyclic neuroleptics caused symptoms that
occur in Parkinson’s disease. This was the first indication that an endogenous
dopamine deficiency is the cause of that disease.
Chlordiazepoxide (Librium®, ▶ Sect. 2.7), the first tranquilizer of the group of
benzodiazepines, was found by accident. Only one year after its introduction and
for many years after that, the chemically closely related medication diazepam 1.23
(Valium®, Fig. 1.3) was the worldwide best-selling drug. The Rolling Stones
[Chemical structures: 1.22 Chlorpromazine; 1.23 Diazepam; 1.24 Imipramine (R = CH3); 1.25 Desipramine (R = H); 1.26 Fluoxetine]

Fig. 1.3 A revolution in the therapy of psychiatric illness was brought about by the discovery of
potent neuroleptics such as chlorpromazine 1.22, tranquilizers such as diazepam 1.23, and
antidepressants such as imipramine 1.24. For the first time, these compounds allowed
a purposeful treatment of schizophrenia, chronic anxiety, and depression. Examples of newer
antidepressants with specific modes of action on transport systems (▶ Sect. 4.6) for noradrenaline
and serotonin are desipramine 1.25 and fluoxetine 1.26, respectively.

commemorated it in their multifaceted song “Mother’s Little Helper.” Many companies
started generously funded synthesis programs, and chemists and pharmacologists
applied their entire arsenal of methods. Their success justified their efforts.
Substances with different modes of action resulted: further tranquilizers, sedatives,
hypnotics, and even antagonists. Even today the benzodiazepines (▶ Sect. 30.5)
belong to the most popular and widespread medications.
The first antidepressant, iproniazid (▶ Sects. 6.7 and ▶ 27.8), was also an accidental
discovery. It blocks the metabolism of the biogenic amines
dopamine, serotonin, noradrenaline, and adrenaline by inhibiting the enzyme
monoamine oxidase (▶ Sect. 27.8). In addition to other severe side effects, the first
unspecific representatives caused hypertensive crises, and when taken with certain
foods a few fatalities occurred. Tyramine, a substance found in cheese, wine, and beer
(therefore the term “cheese effect”) was not duly metabolized. This caused a life-
threatening rise in noradrenaline, a hormone that raises blood pressure.
The antidepressant imipramine 1.24 (Fig. 1.3, ▶ Sect. 8.5) resulted from the
synthesis of analogues of chlorpromazine. Interestingly and despite its close struc-
tural relationship, it is not a neuroleptic but rather it works in the opposite way.
It blocks the transporter for noradrenaline and serotonin, and this prevents the
reuptake of these neurotransmitters from the synaptic gap. Desipramine 1.25 and
fluoxetine 1.26 are even more selective in that they inhibit only the noradrenaline or
the serotonin transporter of nerve cells.

1.7 Modeling and Computer-Aided Design

An extremely capable tool is available for modeling the properties and reactions of
molecules, and particularly their intermolecular interactions: the computer. In
addition to processing complex numerical problems, it is the translation of the
results into color graphics that exceedingly accommodates the human ability to
grasp pictures faster and more easily than text or columns of numbers. That is not
a surprise. Our brains process text sequentially, but pictures are comprehended in
parallel. X-ray crystallography and multidimensional NMR spectroscopic tech-
niques (▶ Chap. 13, “Experimental Methods of Structure Determination”) contrib-
ute to our understanding of molecules as much as quantum mechanical and force
field calculations (▶ Chap. 15, “Molecular Modeling”).
Is molecular modeling an invention of modern times? Yes and No. Friedrich
August Kekulé (1829–1896) supposedly derived his cyclic structure for benzene
from a vision of a snake that circled upon itself and bit its own tail (incidentally, the
snake Uroborus is an age-old alchemist symbol). This now-famous dream may be,
however, traced to a memory of the book Constitutionsformeln der Organischen
Chemie by the Austrian schoolteacher Joseph Loschmidt (1821–1895; Fig. 1.4).
Loschmidt admittedly would take pleasure in contemplating pictures of models that
are quite similar to his own. More and more today we place the three-dimensional
structure, the steric dimensions, and the electronic qualities of molecules in the
foreground. Advances in theoretical organic chemistry and X-ray crystallography
have made this possible. The first structure-based design was carried out on
hemoglobin, the red blood pigment, in the research group of Peter Goodford.
Hemoglobin’s affinity for oxygen is modulated by so-called allosteric effector
molecules that bind in the core of the tetrameric protein. From the three-
dimensional structure he deduced simple dialdehydes and their bisulfite addition
products. These substances bind to hemoglobin in the predicted way and shift the
oxygen-binding curve in the expected direction.
The first drug developed by using a structure-based approach is the antihy-
pertensive agent captopril, an angiotensin-converting enzyme (ACE) inhibitor
(▶ Sect. 25.4). Although the lead structure was a snake venom, the decisive
breakthrough was made after modeling the binding site. For this, the binding site
of carboxypeptidase, another zinc protease, was used because its three-dimensional
structure was known at the time.
The road to a new drug is difficult and tedious. A nested overview of the interplay
between the different methods and disciplines from a modern point of view is
illustrated in the scheme in Fig. 1.5. In the last few years molecular modeling
(▶ Chap. 15, “Molecular Modeling”) and particularly the modeling of ligand–
receptor interactions (▶ Chap. 4, “Protein–Ligand Interactions as the Basis

Fig. 1.4 Loschmidt’s book Constitutionsformeln der Organischen Chemie (1861) contains struc-
tures that anticipate both the formulation of the benzene ring as well as the modern modeling
structure. Kekulé must have known about this book because he disparaged it in a letter to Emil
Erlenmeyer in January 1862 in that he referred to it as “Confusionsformeln.” Loschmidt did not
become famous for his book, but rather because he carried out an experiment in 1865 that
determined the number of molecules in a mole to be 6.021023, a constant that was later to be
named after him.

for Drug Action”), have gained importance. Although modeling is employed


predominantly for the targeted structure modification of lead compounds, it is
also suitable for the structure-based and computer-aided design of drugs
(▶ Chap. 20, “Protein Modeling and Structure-Based Drug Design”) and lead
structure discovery (▶ Sect. 7.6). Examples of these approaches are given in
▶ Chaps. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”;
▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing
Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; ▶ 27, “Oxidoreductase Inhibi-
tors”; ▶ 28, “Agonists and Antagonists of Nuclear Receptors”; ▶ 29, “Agonists and
Antagonists of Membrane-Bound Receptors”; ▶ 30, “Ligands for Channels, Pores,
and Transporters”; ▶ 31, “Ligands for Surface Receptors”; ▶ 32, “Biologicals:
Peptides, Proteins, Nucleotides, and Macrolides as Drugs”.
In addition to modeling and computer-aided design, structure–activity relation-
ship analysis (▶ Chap. 18, “Quantitative Structure–Activity Relationships”) has
contributed to the understanding of the correlation between the chemical structure
of compounds and their biological effects. By using these methods, the influence
of lipophilic, electronic, and steric factors on the variation of the biological
activity, transport, and distribution of drugs in biological systems could be
systematized for the first time on statistically significant foundations.

[Scheme of Fig. 1.5. Routes to lead structures: identification of a biological target, proof
of principle, and molecular test system; literature, patents, and competitor products
(‘me too’ research); screening of natural products, synthetics, peptides, and combinatorial
chemistry; biological concept and clinical side effects. Design cycle: experimental design
and synthetic design; synthesis; biological testing; structure–activity relationships, QSAR,
and molecular modeling; computer-aided design (protein crystallography, NMR, 3D database
searches, de novo design). Outcome: candidate for further development, developmental
substance, formulation, drug.]

Fig. 1.5 The way to a drug is long. The upper part of the figure shows routes to lead structures.
The middle part describes the design cycle, which in practically all cases must be run through
repeatedly. Each of these phases is described in detail in the following chapters. Iterative
optimization yields candidates for further development, such as preclinical and toxicological
studies. It is from these studies that the actual drug candidates are selected. Formulation,
clinical trials, and registration then lead to a new medicine. The last phases are not
presented in this book.

1.8 The Results of Drug Research and the Drug Market

The development of different methods in drug research has already been described
in the last section. Table 1.1 gives a short historical overview of the most prominent
results.

Table 1.1 Important milestones in drug research


Year  Substance             Indication/Mode of action
1806  Morphine              Hypnotic
1875  Salicylic acid        Anti-inflammatory
1884  Cocaine               Stimulant, local anesthetic
1888  Phenacetin            Analgesic and antipyretic
1889  Acetylsalicylic acid  Analgesic and antipyretic
1903  Barbiturate           Sedative
1909  Arsphenamine          Anti-syphilitic
1921  Procaine              Local anesthetic
1922  Insulin               Antidiabetic
1928  Estrone               Female sex hormone
1928  Penicillin            Antibiotic
1935  Sulfamidochrysoidine  Bacteriostatic
1944  Streptomycin          Antibiotic
1945  Chloroquine           Antimalarial
1952  Chlorpromazine        Neuroleptic
1956  Tolbutamide           Oral antidiabetic
1960  Chlordiazepoxide      Tranquilizer
1962  Verapamil             Calcium channel blocker
1963  Propranolol           Antihypertensive (β-blocker)
1964  Furosemide            Diuretic
1971  L-DOPA                Parkinson’s disease
1973  Tamoxifen             Breast cancer (estrogen receptor antagonist)
1975  Nifedipine            Calcium channel blocker
1976  Cimetidine            Gastrointestinal ulcer (H2 blocker)
1981  Captopril             Antihypertensive (ACE inhibitor)
1981  Ranitidine            Gastrointestinal ulcer (H2 blocker)
1983  Ciclosporin A         Immunosuppressant
1984  Enalapril             Antihypertensive (ACE inhibitor)
1985  Mefloquine            Antimalarial
1986  Fluoxetine            Antidepressant (5-HT-transport inhibitor)
1987  Artemisinin           Antimalarial
1987  Lovastatin            Cholesterol biosynthesis inhibitor
1988  Omeprazole            Gastrointestinal ulcer (H+/K+-ATPase inhibitor)
1990  Ondansetron           Antiemetic (5-HT3 blocker)
1991  Sumatriptan           Migraine (5-HT1B/1D agonist)
1993  Risperidone           Antipsychotic (D2/5-HT2 blocker)
1994  Famciclovir           Antiviral/herpes (DNA polymerase inhibitor)
1995  Losartan              Arterial hypertension (ATII antagonist)
1995  Dorzolamide           Glaucoma (carbonic anhydrase inhibitor)
1996  Saquinavir            HIV protease inhibitor
1996  Ritonavir             HIV protease inhibitor
1996  Indinavir             HIV protease inhibitor
1996  Nevirapine            HIV reverse transcriptase inhibitor
1997  Sibutramine           Obesity (uptake inhibitor)
1997  Orlistat              Obesity (lipase inhibitor)
1997  Tolcapone             Parkinson’s disease (COMT inhibitor)
1998  Sildenafil            Erectile dysfunction (PDE5 inhibitor)
1998  Montelukast           Broncholytic (leukotriene receptor antagonist)
1999  Infliximab            Antirheumatic (TNFα antagonist)
2000  Celecoxib             Analgesic (COX-2 inhibitor)
2000  Verteporfin           Macular degeneration (photodynamic therapy)
2001  Imatinib              Chronic myeloid leukemia (kinase inhibitor)
2002  Bosentan              Arterial hypertension (endothelin-1 receptor antagonist)
2002  Aprepitant            Antiemetic (neurokinin receptor antagonist)
2003  Enfuvirtide           HIV fusion inhibitor (oligopeptide)
2004  Ximelagatran          Coagulation inhibitor (thrombin inhibitor)
2004  Bortezomib            Multiple myeloma (proteasome inhibitor)
2005  Bevacizumab           Cytostatic (angiogenesis inhibitor)
2006  Natalizumab           Multiple sclerosis (monoclonal antibody; integrin inhibitor)
2006  Aliskiren             Antihypertensive (renin inhibitor)
2007  Maraviroc             HIV fusion inhibitor (CCR5 antagonist)
2007  Sitagliptin           Type-II diabetes (DPP-IV inhibitor)
2008  Raltegravir           HIV integrase inhibitor
2009  Rivaroxaban           Oral anticoagulant (FXa inhibitor)
2010  Mifamurtide           Osteosarcoma (bone cancer)
2011  Fingolimod            Immunomodulating drug (multiple sclerosis treatment)

The assessment of the efficacy and safety of a drug has reached an extraordinarily
high standard today. To some extent this development goes hand in hand with our
goal of finding new medicines, but it is also a hindrance. Acetylsalicylic acid
(Aspirin®) is without any doubt a valuable drug. Today this compound would
have great difficulty passing clinical trials. Acetylsalicylic acid is an irreversible
enzyme inhibitor, it has relatively weak efficacy, it causes gastric bleeding in high
doses, and it has a very short biological half-life. Each of these problems would be
a strong argument against its continued development today. It probably would
have already failed in screening. In a risk–benefit analysis, however, it is better than
most of the alternatives. Where is the problem? It probably lies in the analytical–
deterministic mindset that dominates science, and therefore also drug research. It is
often overlooked that a system as complicated and complex as a human being, to
whom we apply a drug therapy, cannot always be adequately addressed by such an
approach.
Despite public healthcare systems that constitute a barrier between the supplier
and the consumer, the drug market, with worldwide sales of more than US$880
billion, is highly competitive. Two forces affect this market: the state of science
and technology and the needs of patients. A few drugs command a large portion of
sales. Constantly changing “hit lists” of the best-selling drugs can be found on the
internet. Because of the mergers of established pharmaceutical companies in
recent years, the market has contracted to fewer, bigger companies. It is frequently
the case that a single drug can make or break a company. Often only two to three
drugs make up more than 50% of a large company’s sales. A historical example is
Glaxo. This company made its way out of the midfield to the top with ranitidine.
Astra experienced a similar boom with omeprazole. Today, after the merger with
Zeneca, it is among the biggest companies in this field. Sankyo also had
a single drug, lovastatin, that boosted sales enormously. With its drugs sildenafil
(Viagra®) and atorvastatin (Sortis®/Lipitor®), Pfizer’s profits shot to unimaginable
highs. In recent years we have seen an increasing concentration of
pharmaceutical companies, so that the market is making a transition to an oligopoly
dominated by multinational corporations. Keep in mind that sales giants such
as GlaxoSmithKline (GSK), Novartis, Sanofi-Aventis, Bayer HealthCare, Bristol-Myers
Squibb, and AstraZeneca originated only in the last 10 years through
mergers. Companies such as Pfizer and Roche have grown significantly through
acquisitions. The role research plays for pharmaceutical companies is apparent when one
considers that typically 15–20% of turnover is invested in this area. It is certain that
the concentration of the pharmaceutical market is not complete. We can only wait and
see how the landscape continues to shift and adapt at an almost annual pace.

1.9 Controversial Drugs

Drugs remain in the focal point of public interest. Whereas for decades it was the
physician alone who prescribed medication, today it is the patient, frightened by the
lay press or better informed through labeling or reputable literature, who wants to
take control of, or at least share in, the decision making.
The issues can be illustrated by one example. Psychotropic pharmaceuticals
exert an impressive effect on personality and behavior. At least since the
introduction of Valium® (diazepam) these drugs have been in the media spotlight.
They are invaluable for the treatment of psychiatric illness. On the other hand, the
danger of misuse and addiction is particularly high. Some of these drugs are even
used as self-medication, without strict adherence to the indication guidelines.
Fluoxetine 1.26 (Prozac®, Fig. 1.3, Sect. 1.6) was introduced in 1988 by Eli Lilly,
and brought unequivocal progress in the treatment of depression. On this one
medication alone there are now over ten popular science books with controversial
content. Peter Kramer’s book Listening to Prozac takes an overall sympathetic
tone with the assertion that depressed patients feel better and more “in harmony”
with their personality after treatment with fluoxetine. This book was on the New
York Times bestseller list for more than 21 weeks. Peter Breggin’s book Talking Back
to Prozac criticized fluoxetine, the company Eli Lilly, and the U.S. Food and Drug
Administration (FDA) polemically. The side effects, risks, and particularly the
addictive potential were placed in the foreground. Both books contain correct
assertions, and both books lead to the wrong conclusions. Prozac® is a valuable
medicine for the treatment of clinically manifest depression; for the treatment
of mundane unhappiness or as a general stimulant, however, it is a drug with
many risks.
To make a risk–benefit analysis of a medication, it is important to consider not
only the desired effect but also the severity of the illness and the objective and
subjective side effects. In oncology one accepts even severe side effects for the
possibility of improving the patient’s condition. If an end-stage cancer patient is
refused an effective pain therapy because of the risk of addiction, then that must be
seen as malpractice. On the other hand, many people handle highly potent
medications recklessly. The misuse of antibiotics, the faith in the almighty power of
tranquilizers and antidepressants, or the chronic use of analgesics and laxatives
does more harm than good.

1.10 Synopsis

• Drug research can be divided into several sequential phases starting with empirical
observations of the uptake of natural products from food, the development of
in vitro test systems, increasing understanding of structures and modes of action,
to in vivo models and gene technology.
• It all started with traditional medicines. The first prescriptions date back to the
ancient Egyptians and to traditional Chinese medicine.
• Paracelsus founded scientific medical research and understood humans to be
a “chemical laboratory.” The ingredients of drugs were first held responsible for
healing effects.
• With the advent of organic chemistry, the first therapeutic principles based on
pure organic compounds became available. The great age of natural products
from plants and their active ingredients began.
• Systematic studies on animals began in the nineteenth century and can be
seen as a starting point for drug research. In vitro models are needed to
test large series of potentially active compounds, but animal models are required
to correlate the data and make predictions about the therapeutic effects
in humans.
• Our present life expectancy would not be possible without the successful
fight against infectious diseases. The broad application of antibiotics and
the spread of resistant pathogens, however, have led to situations in which
the best weapons against infectious diseases are becoming increasingly blunt.
Research against widespread tropical diseases has been neglected, and
the currently increasing resistance to available medications represents a
worldwide problem.
• The elucidation of biological concepts, pathways, and regulatory cycles by
endogenous compounds has strongly stimulated drug research. Many developed
drugs have arisen from structural variations of neurotransmitters, hormones,
steroids, or natural substrates.
• Systematic substance testing began with the establishment of in vitro models
that replaced biological testing on animals by assays in test tubes. Gene
technology has made it possible to prepare sufficient amounts of pure proteins
for testing.
• The discovery of neuroleptics, antidepressants, anticonvulsives, and sedatives
has revolutionized the treatment of psychiatric diseases.
• Molecular modeling and computer-aided design along with structural
biology give access to rational considerations on drug action. The first
structure-based design project was carried out on hemoglobin, and the first
drug developed by using a structure-based approach was the antihyperten-
sive captopril.
• The assessment of drug efficacy and safety has reached an extraordinarily high
standard today. The worldwide drug market, with nearly a thousand billion
US dollars in sales per year, is large and highly competitive. Only a few
drugs command a large portion of the sales and determine the particular
dynamics in the market; the current tendency is corporate contraction to
fewer and bigger companies. Often a single drug can make or break
a company.
• Drugs remain in the focal point of public interest. It is no longer the physician
alone who influences the prescription of medication; multiple sources of infor-
mation have an impact and inform the patient. A proper risk–benefit analysis of
a medication, taking into consideration not only the desired therapeutic effect
but also the severity of an illness, is needed.

Bibliography

General Literature
Barondes SH (1993) Molecules and mental illness, Scientific American Library. W. H. Freeman
and Company, New York
Beddell CR (ed) (1992) The design of drugs to macromolecular targets. Wiley, Chichester
Fischer D, Breitenbach J (eds) (2003) Die Pharmaindustrie. Spektrum Akademischer Verlag,
Heidelberg/Berlin
Friedrich C, Müller-Jahncke W-D (2005) Von der Frühen Neuzeit bis zur Gegenwart, vol 2.
GOVI-Verlag, Eschborn
Higby G (ed) (1997) The inside story of medicine. A Symposium. Madison, Wi
Herrmann EC, Franke R (eds) (1995) Computer-aided drug design in industrial research, Ernst
Schering research foundation workshop 15. Springer, Berlin
Müller K (ed) (1995) De novo design. Persp Drug Discov Design, vol 3. Escom, Leiden
Müller-Jahncke W-D, Friedrich C (2005) Arzneimittelgeschichte. Wissenschaftliche
Verlagsgesellschaft, Stuttgart
Perun TJ, Propst CL (eds) (1989) Computer-aided drug design. Methods and applications. Marcel
Dekker, New York
Porter R, Teich M (eds) (1995) Drugs and narcotics in history. Cambridge
Restak RM (1994) Receptors. Bantam Books, New York
Schmitz R (1998) Geschichte der Pharmazie, vol 1. GOVI-Verlag, Eschborn
Verband Forschender Arzneimittelhersteller e.V. (2009) http://www.vfa.de/de/presse/statcharts/arzneimittelmarkt/.
Accessed 22 Nov 2011
Werth B (1994) The Billion-Dollar Molecule. One Company’s Quest for the Perfect Drug.
Touchstone, New York

Special Literature

Beddell CR, Goodford PJ, Norrington FE et al (1976) Compounds designed to fit a site of known
structure in human hemoglobin. Br J Pharmac 57:201–209
Breggin PR, Breggin GR (1994) Talking back to Prozac. St. Martin’s Press, New York
Kramer P (1993) Listening to Prozac. Viking, New York
Mutschler E (1987) Arzneimittel – Erfolge, Misserfolge, Hoffnungen. Deutsche Apoth-Ztg
127:2025–2033
Newman DJ, Cragg GM (2007) Natural products as sources of new drugs over the last 25 years.
J Nat Prod 70:461–477
Noe CR, Bader A (1993) Facts are better than dreams. Chem Brit 29:126–128, Kekulés and
Loschmidts Formeln
2 In the Beginning, There Was Serendipity

“A lucky accident dropped the medicine into our hands”; this is how a publication on
August 14, 1886, from Arnold Cahn and Paul Hepp in the Centralblatt für Klinische
Medizin began. The history of drug research is punctuated by lucky accidents.
As a general rule, detailed knowledge of biological systems was absent. So it is not
surprising that the working hypotheses were often wrong, and the obtained results
differed from the expectations. The case of accidental success fell into the back-
ground over time. Today happenstance as a strategy has been replaced by the arduous
and ambitious goal of preparing drugs by using a straightforward approach. The only
exception to this is the kind of shotgun-style testing of large and diverse chemical
compound libraries, including microbial and plant extracts that is done with the goal
of finding new lead structures. In this case, serendipity is desired in order to find as
large and diverse a palette as possible of lead structures (▶ Chaps. 6, “The Classical
Search for Lead Structures” and ▶ 7, “Screening Technologies for Lead Structure
Discovery”) with potential for further optimization (▶ Chaps. 8, “Optimization of Lead
Structures” and ▶ 9, “Designing Prodrugs”).

2.1 Acetanilide Instead of Naphthalene: A New, Valuable Antipyretic

Back to Cahn and Hepp. What happened? There are several legends about this
lucky accident. The most plausible version is that the antipyretic effect of naph-
thalene, which was widely available from coal tar, was tested. The substance indeed
showed fever-lowering qualities. The responsible substance however, was not naph-
thalene but rather something entirely different: acetanilide 2.1 (Fig. 2.1). Further
experiments confirmed the efficacy. Shortly thereafter, the company Kalle & Co.
introduced it to the market with the name “Antifebrin.”
Phenacetin 2.2 (Fig. 2.1) was subsequently developed based upon a targeted
approach. At the time, Bayer in Elberfeld had 30 t of p-nitrophenol, a side product
from dye production, on their waste heap. The then 25-year-old Carl Duisberg, who
later became the chairman of Bayer Farbenfabriken AG and who also took a leading
role in the foundation of I.G. Farbenindustrie in 1924, wanted to use it for the
preparation of acetanilide as it could easily be reduced to p-aminophenol. The
known toxicity of phenol groups led to the design of p-ethoxyacetanilide 2.2
(phenacetin), which actually did have the desired qualities and served as an analgesic for
headaches and as an antipyretic for a century. Unfortunately its metabolite 2.4, which
still contains the ethoxy group, leads to the production of methemoglobin, an
oxidized form of the red blood pigment that is incapable of carrying oxygen.
Furthermore, chronic misuse, for instance taking kilogram quantities of phenacetin
over a lifetime, leads to kidney damage. Paradoxically, the main metabolite of
phenacetin, p-hydroxyacetanilide 2.5 (Fig. 2.1, acetaminophen in American English,
or paracetamol in UK English), is actually responsible for the effect, and it is less toxic
and better tolerated. In the USA alone, paracetamol achieved over US$1.3 billion in
annual sales. This is even more than for acetylsalicylic acid.

Fig. 2.1 By starting with the accidentally discovered acetanilide 2.1, Carl Duisberg planned
the synthesis of phenacetin 2.2 from p-nitrophenol 2.3. In contrast to the toxic metabolite 2.4, the
main metabolite, paracetamol (Amer. acetaminophen) 2.5, is well tolerated.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_2, © Springer-Verlag Berlin Heidelberg 2013

2.2 Anesthetics and Sedatives: Pure Accidental Discovery

In 1799 Humphry Davy (1778–1829) discovered the euphoric effect of nitrous oxide
(N2O), which was appropriately named “laughing gas.” The dentist Horace Wells
(1815–1848) saw a traveling theater production of a “sniffing party” with N2O in
1844 in which a participant suffered from a flesh wound, apparently without pain. To
test this effect, Wells had one of his own teeth extracted, also without pain. He then
repeated the procedure on many people, with success. A public demonstration went
wrong though, and this drove him to suicide four years later. The same effect was
observed in 1842 by Crawford W. Long (1815–1878) with ether, but he did not report
it immediately. After administering ether, he was able to remove an ulcer from the
neck of a volunteer. William T. Morton (1819–1868) successfully carried out the first
ether anesthesia in the same hospital as Wells. Starting in 1847, chloroform was used
as an anesthetic. A few years later anesthesia became standard for surgical
procedures, a real blessing for suffering humanity.
Oskar Liebreich (1839–1908) wanted to develop a depot form of chloroform 2.6
in 1868. Because chloral hydrate 2.7 can be cleaved with base in an aqueous milieu, he
hoped that this could also happen in the body. Chloral hydrate is in fact a sedative,
but this is because of its active metabolite, trichloroethanol 2.8 (Fig. 2.2), and not
because it releases chloroform.

Fig. 2.2 The anesthetic chloroform 2.6 is formed upon treatment of chloral hydrate 2.7 with base.
This reaction does not work in vivo, however. The active metabolite of 2.7 is trichloroethanol 2.8.

In 1885 Oswald Schmiedeberg (1838–1921) tested urethane 2.9 (ethylcarbamate,
Fig. 2.3) because he thought that it would release ethanol in the organism. Urethane
itself is the active agent. Its optimization later led to isoamylcarbamate 2.10
(Hedonal®, 1899). Based on this, open and cyclic carbamates and ureas were
investigated. In 1903 the first barbiturate sedative, barbital (Veronal®), resulted.
In the decades that followed, a wealth of better-tolerated barbiturates with a broader
pharmacokinetic spectrum was introduced.

Fig. 2.3 The hypothetical “prodrug” of ethanol, urethane 2.9 (R = −CH2CH3), led to the
development of isoamylcarbamate 2.10 (R = −CH2CH2CH(CH3)2), which in turn led to the first
barbiturate, barbital 2.11.

2.3 Fruitful Synergies: Dyes and Pharmaceuticals

Dyes and pharmaceuticals have stimulated each other. The first synthetic dye was
the result of a failed drug synthesis. In 1856 August Wilhelm von Hofmann assigned
the task of synthesizing quinine, an alkaloid used for treating malaria (▶ Sects. 1.1
and ▶ 3.2), to the then 17-year-old William Henry Perkin (1838–1907). Starting
with only the molecular formula, it was anticipated that the oxidation of an
allyl-substituted toluidine would deliver the desired product. Now that the structural
formula is known, we understand that this could not possibly have worked! Upon
oxidation of aniline that was contaminated with o- and p-toluidine, Perkin isolated
a dark precipitate. It contained a dye, mauveine 2.12 (Fig. 2.4), that colored silks
a brilliant mauve. Other dyes were prepared in rapid succession. The development
and later proliferation of the dye industry in England and Germany in the second
half of the nineteenth century can be traced back to this accidental discovery.

Fig. 2.4 An unsuccessful quinine synthesis founded the dye industry. The structures of many
organic compounds were still entirely unknown in the middle of the nineteenth century. The
attempt to prepare quinine via a simple route (upper reaction: 2 C10H13N (allyltoluidine) +
3 [O] → C20H24N2O2 (quinine) + H2O) could not have worked. The oxidation of an impure
aniline (below; R = H or o-, p-methyl) gave mauveine 2.12 in 1856, which was used to dye silk
a brilliant mauve color. It was the first synthetic dye!
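
The reasoning from molecular formulas alone can be made concrete with a small atom count. The following minimal sketch checks that the planned reaction in Fig. 2.4, 2 C10H13N + 3 [O] → C20H24N2O2 + H2O, is balanced on paper. The helper names parse_formula and total_atoms are illustrative assumptions, and the parser handles only simple formulas without brackets.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple molecular formula such as 'C20H24N2O2' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def total_atoms(side) -> Counter:
    """Sum the atom counts over (coefficient, formula) pairs on one side of a reaction."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in parse_formula(formula).items():
            total[element] += coefficient * n
    return total


# Planned quinine synthesis as written in Fig. 2.4; [O] is treated as one oxygen atom.
reactants = [(2, "C10H13N"), (3, "O")]
products = [(1, "C20H24N2O2"), (1, "H2O")]

is_balanced = total_atoms(reactants) == total_atoms(products)
print(dict(total_atoms(reactants)))
print("balanced:", is_balanced)
```

The atom count balances, which is exactly why the plan looked reasonable at the time. A balanced molecular formula says nothing about connectivity, however, so it cannot show whether the quinine ring system is actually reachable from the starting material.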
Toward the end of the nineteenth century, increasing competition and a difficult
economic situation in the dye market prompted a reactive expansion into
industrial pharmaceutical research. In 1896 a pharmaceutical research laboratory
was founded in the 33-year-old Bayer Farbenfabrik. At that time innumerable
synthetic dyes were known; it is therefore not surprising that these substances
were tested for pharmacological effects.
Of all people, wine adulterators played an important role in the discovery of the
first synthetic laxative. To stop people from selling Trester wine (so-called
Nachwein) as a natural wine (Naturwein), in 1900 the dye phenolphthalein was
added as an easily detectable indicator. The Hungarian pharmacologist Zoltán
von Vámossy (1868–1953) investigated the effects of this compound. Back then,
the conventions of the pharmacologists were still rather primitive. The intravenous
application of 0.01–0.03 g to rabbits caused death “with loud shrieking, convul-
sions, and paralysis”. Vámossy then decided to feed 1–2 g to a rabbit and 5 g to a
4 kg lap dog. Because these oral doses were all well tolerated, Vámossy took 1.5 g
of phenolphthalein himself, and a friend took 1.0 g. The effects were explosive:
rumbling in the bowels, diarrhea, and for two additional days loose stools. It was
later established that 150–200 mg would have been a therapeutic dose.

Fig. 2.5 The laxative effect of phenolphthalein 2.13 became apparent while testing it as an additive for
cheap wines. The antisyphilis compound arsphenamine 2.14 (Salvarsan®, here shown as monomer)
is simply an azo dye in which the –N═N– group was exchanged for an –As═As– group.

Fig. 2.6 The red azo dye sulfamidochrysoidine 2.15 is effective only after cleavage to the
colorless sulfanilamide 2.16, which is a bacterial antimetabolite of p-aminobenzoic acid 2.17.

An entire range of antibacterial and antiparasitic dyes are based on the work
of Robert Koch (1843–1910). He showed that bacteria and parasites accumulate
dyes specifically. Based on this, Paul Ehrlich (1854–1915) hoped to kill pathogens
selectively with suitably chosen dyes. In 1891 he cured two mild cases of malaria by
treating the patients with methylene blue. In the following years he tested hundreds of
different pigments, and thousands more analogues were later synthesized in the
laboratories of Bayer and Hoechst. In 1909 Paul Ehrlich pursued a rational design
when he exchanged both of the nitrogen atoms of an –N═N– group of an azo dye
for arsenic atoms. Arsphenamine 2.14 (Salvarsan®, Fig. 2.5) was the first effective
compound to treat syphilis; the first chemotherapeutic. It became an extraordinary
economic success for the company Hoechst.
The breakthrough with chemotherapeutics was made by the physician Gerhard
Domagk (1895–1964). At the age of 31, he took over the newly formed department
of experimental pathology at Bayer in Elberfeld. Azo dyes bearing sulfonamide
groups had already been designed by the chemists Fritz Mietzsch and Josef Klarer,
but they showed no in vitro activity; Domagk tested these substances in
streptococcus-infected mice. By using this model, he found the first active substances in
1932. Sulfamidochrysoidine 2.15 (Prontosil®, Fig. 2.6), a dark-red dye that could
treat even severe streptococcal infections, resulted in 1935. The sulfonamides
became world famous a year later when the son of the US president, Franklin
D. Roosevelt, Jr., was treated with one to cure a severe sinus infection. But
even here a false hypothesis led to success. It was not the azo dye itself, but rather
its metabolite, sulfanilamide 2.16, that was effective. Sulfanilamide replaces
p-aminobenzoic acid 2.17 (Fig. 2.6), which is needed for the bacterial synthesis
of an enzymatic cofactor, dihydrofolic acid.

2.4 Fungi Kill Bacteria and Help with Syntheses

The discovery of the antibiotic effect of Penicillium notatum by Alexander Fleming
(1881–1955) in 1928 is the most famous example of a serendipitous discovery.
Fleming noticed that a spoiled staphylococcus culture had been contaminated with
a fungus. In the area around the fungus, no bacteria could grow. Further
investigations showed that this fungus could also curb other bacteria. Fleming
called the still-unknown agent penicillin. It was not until 1940 that it was isolated
and characterized by Ernst Boris Chain (1906–1979) and Howard Florey
(1898–1968). In 1941 an English policeman was the first patient to be treated with
penicillin. Despite a temporary improvement, and even though penicillin could be
isolated from his urine, he died after a few days as no more penicillin was available
for his continued therapy. The fungus Penicillium chrysogenum, which produces
more penicillin than Penicillium notatum and is easier to cultivate, was isolated
from a moldy melon in Illinois. The tedious route to the structural elucidation of
penicillin and the successful work to systematically vary its structure are scientific
masterworks of the first order. Even more difficult problems had to be overcome
to optimize its biotechnological mass production. Today the
modified penicillins 2.18 and cephalosporins 2.19 (Fig. 2.7), which make up a broad
range of antibiotics with outstanding bioavailability, are available. The newer
analogues have a broader spectrum of activity against many pathogens and are
distinguished by a generally improved stability to the penicillin-degrading enzyme
β-lactamase. Fleming was a researcher to whom Pasteur’s thesis “chance favors the
prepared mind” fully applies. One day in 1921, while working in his laboratory with
a cold, he tried a rather unconventional experiment. He added a drop of his own nasal
mucus to a bacterial culture and found a few days later that the bacteria had been
killed. This “experiment” led to the discovery of lysozyme, an enzyme that
hydrolyzes the bacterial cell wall. As a therapy it is unfortunately unsuitable because it does
not attack most human pathogens.

Fig. 2.7 Fleming’s accidental discovery of the antibiotic effects of a fungus has delivered a wide
palette of penicillins 2.18 and cephalosporins 2.19, each with different R groups.

Chance and a fungus played an important role in the industrial synthesis of
corticosteroids. An important step in the synthesis is the introduction of an oxygen
atom at a particular position in the steroid scaffold, position 11. In 1952 chemists at
the Upjohn company were searching for a soil bacterium that could hydroxylate a steroid in
this position. Just when they finally decided to set an agar plate on the windowsill
of the laboratory, Rhizopus arrhizus landed exactly there. This fungus transforms
progesterone (▶ Sect. 28.5) to 11α-hydroxyprogesterone. With its help the yield
could be increased to 50%. The closely related fungus Rhizopus nigricans even
afforded 90% of the desired product.

2.5 The Discovery of the Hallucinogenic Effect of LSD

In the 1930s Albert Hofmann (1906–2008) was working on the partial synthesis of
ergoline alkaloids at Sandoz. In 1938 he wanted to find a way to transfer the
respiratory and cardiovascular stimulatory effect of N,N-diethyl nicotinamide
2.20 onto this class of compounds. In analogy to 2.20, he prepared N,N-diethyl
lysergamide 2.21 (Fig. 2.8) with the hope of maintaining the stimulatory circulatory
and respiratory effects. Apart from the observation that the experimental animals became agitated
under anesthesia, the substances showed no particular effect. Therefore they were
not pursued at first. Hofmann prepared the substances a second time five years
later because he wanted to investigate them more thoroughly. During purification
and recrystallization he reported feeling “a strange agitation combined
with a slight dizziness.” At home he fell into “a not-unpleasant inebriated condition
that was characterized by extremely animated fantasies . . . after about 2 hours, the
condition went away.” Hofmann suspected a connection to the compounds he had
prepared and conducted a self-experiment with 0.25 mg a few days later. That
was the smallest dose with which he expected to see an effect. The outcome was
dramatic: the experience was the same as the first time, but much more intense. He
had a technician accompany him home on his bicycle. During the ride, his condition
took on a threatening form, and he fell into a severe crisis dominated by dizziness
and anxiety. The world took on a grotesque form. Later it was determined that
0.02–0.1 mg is enough to cause hallucinations. The substance was temporarily
marketed as Delysid® for use in psychotherapy and to treat anxiety and compulsive
disorders.

Fig. 2.8 N,N-Diethyl nicotinamide 2.20 is a centrally active derivative of nicotinic acid.
Hofmann wanted to synthesize a general stimulant analogously by preparing the N,N-diethyl
amide of lysergic acid. The result was the hallucinogen lysergic acid diethyl amide 2.21 (LSD).

2.6 The Synthetic Route Determines the Structure

The structure of the first calcium channel blocker, verapamil 2.22, was determined
by its synthesis (Fig. 2.9). Verapamil counteracts the effects of β-adrenergic
agonists, but it is not a β-blocker. It was only after its introduction to the market
that Albrecht Fleckenstein clarified its mode of action: it blocks the inward,
membrane-voltage-dependent flow of calcium ions through the calcium channels
(▶ Sect. 30.1) in heart and endothelial cells. The hypotensive effect was initially
seen as a side effect, but in the following years it became the most important reason
for use. The second group of therapeutically important calcium channel blockers,
represented by nifedipine 2.23, was inspired by a synthetic principle: a reaction from 1882,
the Hantzsch synthesis of dihydropyridines (Fig. 2.9). Remarkably, the pharmacological
experiments on nifedipine had to be carried out in a darkened room because
of its photosensitivity. All the more credit is due for the fact that it was developed into
a medicine despite this characteristic.

Fig. 2.9 Ferdinand Dengel, a chemist at the former Knoll AG, wanted to prepare a cardiovascular
therapeutic by alkylating a nitrile. To avoid a double substitution, he started with the sterically
demanding isopropyl group. The result was the first calcium channel blocker, verapamil 2.22. The
isopropyl group is the optimal alkyl group because it stabilizes the biologically active conformation.
The synthetic route played an important role in the development of the second calcium
channel blocker, nifedipine 2.23. In 1948, Friedrich Bossert at Bayer was given the task of finding
new substances that dilate the coronary arteries. After years of work, in 1964 he turned to the easily
prepared dihydropyridines, which surprisingly displayed the desired effects. In this case, the
space-filling nitro group promotes the biologically active conformation (▶ Sect. 17.9).

2.7 Surprising Rearrangements Lead to Medicines

Leo Sternbach (1908–2005), a chemist at Hoffmann-La Roche, was involved in
a program in the mid-1950s to find structurally novel tranquilizers. Sternbach
remembered a synthetic program on pigments from a decade before in which
N-oxide 2.24 (Fig. 2.10) was also prepared. Its reaction with secondary amines
delivered the expected products, which were pharmacologically absolutely
uninteresting. The work was practically ended in 1957, and the laboratory was
being cleaned up when it was noticed that a crystalline base and its hydrochloride
salt had precipitated from a solution. The substance was the product of a reaction
between N-oxide 2.24 and methylamine, but it was never tested due to other
priorities. The subsequent pharmacological testing convincingly showed outstanding
qualities. It was only later established that an unexpected ring rearrangement reaction
had occurred to afford chlordiazepoxide 2.25 (Librium®, Fig. 2.10).
There are other examples of this sort. In 1974 W. Berney was working on
spirodihydronaphthalenes 2.26 (Fig. 2.11) with the goal of preparing CNS-active
substances. Upon acid treatment, he obtained a compound that was highly potent
in vitro and in vivo against a series of human-pathogenic fungi in a routine broad
screening at Sandoz Research Institute in Vienna. In 1985 the substance was
introduced as naftifine 2.27, and later a more potent analogue, terbinafine 2.28
(Fig. 2.11), followed. Both substances showed a previously unknown mode of
action. They damage the fungal membrane by blocking ergosterol
biosynthesis. This happens at a very early step through inhibition of the
enzyme squalene epoxidase.

Fig. 2.10 Treatment of 2.24 with methylamine delivers the rearrangement product
chlordiazepoxide 2.25 (Librium®) instead of the expected one. This first test compound became the first of
the benzodiazepine class to be marketed.

Fig. 2.11 Instead of showing CNS activity, naftifine 2.27, prepared from spiro-compound 2.26, is an
antimycotic. A comparison with the more potent terbinafine 2.28 shows that the phenyl group can
advantageously be replaced with a tert-butylethynyl group.

2.8 A Long List of Accidents

The list of accidental discoveries, of which only a few are described here, could be
extended ad infinitum. A few more examples are briefly mentioned without
chemical formulae.
• Pethidine (▶ Sect. 3.3), the first fully synthetic opiate analgesic, was synthesized
in the 1930s as part of an anticonvulsives research program, by starting from
atropine.
• The suitability of antihistamines for the prevention of motion sickness was
discovered in Boston because of a treatment for a skin rash. A patient reported
that her motion sickness, which always occurred when riding a Boston street car,
went away. The “clinical trial” was carried out in 1947 on hundreds of sailors on
the transatlantic voyage of the USNS General Ballou.
• Haloperidol (▶ Sect. 3.3) was meant to be an analgesic; it turned out to be
a neuroleptic.
• Imipramine is structurally very similar to the neuroleptic chlorpromazine
(▶ Sects. 1.6 and ▶ 8.5). Nonetheless it has the opposite effect and is an
antidepressant.
• Phenylbutazone was meant to be an additive used to dissolve the anti-
inflammatory aminophenazone. The substance turned out to be an anti-
inflammatory agent itself as did its metabolite, oxyphenbutazone.
• An attempt to isolate the causative agent of bipolar disorder from the urine
of patients afforded only uric acid. Because uric acid is poorly soluble,
lithium urate was tested. This led to the discovery of the antidepressant effect
of lithium salts.
• Clonidine was meant to be a local treatment for the runny nose that accompanies
the common cold. Instead of the expected effect, a profound hypotensive
effect was surprisingly found. Despite intensive structural variations, none of
clonidine’s analogues have surpassed its potency.
• Levamisole was developed as a broad-spectrum anthelmintic (anti-worm agent).
Instead, an immunomodulatory effect was accidentally found that now stands in
the therapeutic foreground.
• Praziquantel was originally meant to be an antidepressant. Because of its high
polarity, it cannot cross the blood–brain barrier. An outstanding suitability for
the treatment of the tropical disease bilharziasis (schistosomiasis) was found through broad
biological testing.
• A chemist at Searle who was working on dipeptides licked his fingers while
flipping through the pages of a book. The sweet taste that he noticed turned out to
be caused by the artificial sweetener aspartame. Saccharin was also found in
a very similar way. In the case of cyclamate, a smoker noticed a sweet taste to his
cigarettes.
• Even today when one would think that rational concepts dominate drug research,
the lucky accident still helps to make “blockbusters.” In the pursuit of a
phosphodiesterase inhibitor to hinder the degradation of cyclic guanosine
monophosphate (cGMP), an improved treatment for angina pectoris was not
found (▶ Sect. 25.8). Instead it became conspicuous that the male subjects in the
clinical trial did not want to give up the substance. After the side effect of
a stronger penile erection was recognized, the side effect became the main effect.
The compound sildenafil was marketed for the treatment of erectile dysfunction
as Viagra ®, and developed into a billion-dollar product.

2.9 Where Would We Be Without Serendipity?

In the English-speaking world, a word is in use that is difficult to translate into other
languages: serendipity. This term, as an expression of a lucky accident, was coined
by Sir Horace Walpole in 1754. It is derived from a Persian fairytale in which three
princes of Serendip (earlier Ceylon, today Sri Lanka) have accidental and unex-
pected luck and make interesting discoveries entirely analogously to the many
examples in this chapter. Serendipity has played an exceedingly important role in
general in science, and especially in drug research. How would our modern
medicine supply look without all of these lucky accidents? By no means should
an arbitrary approach be taken, and an accidental discovery be counted upon. To the
contrary, chemists and pharmacologists have always developed concrete ideas as
to how and why particular structural variations on a lead compound should be
pursued. Some of these hypotheses were correct, and others were false. What the
successful researchers always had in common was that, when
a hypothesis failed or an unexpected result was found, they recognized the potential
consequences of the result, drew the correct conclusions, and did the right
things. The following chapters will show numerous examples of successful targeted
drug design in cases in which the correct working hypothesis was realized. The
search for a new active substance is, however, not a process that can be pushed
through by a purely technically oriented management. As a general rule, short-term
planning and bureaucratic control have only negative consequences. On the other
hand the search for new medicines requires a concerted effort from many different
groups of specialists, who must work together in a suitable organizational structure.
The subsequent preclinical and clinical development of a newly found active
substance is an extremely expensive and time-consuming process that must be
carefully planned, carried out, and controlled. For this, other instruments are
necessary than are used for drug discovery.

2.10 Synopsis

• The history of early drug research is full of lucky accidents. Many active
principles of substances were discovered by serendipity, but mostly success
can be attributed to an outstanding researcher with a “prepared mind” who
observed important effects.
• Dyes and pharmaceuticals, both developed in the early stages of the emerging
chemical industry, stimulated each other in very fruitful synergies.
• The discovery by Alexander Fleming of the first antibiotic principle, the penicillins,
as a defense mechanism of a fungus against bacteria, is one of the most
famous examples of a serendipitous discovery.
• The partial synthesis of ergoline alkaloids led to the discovery of the hallucinogenic
effects of LSD. In those days, researchers frequently conducted self-experiments
to first test active principles in humans.
• Unexpected synthetic products, surprising structural rearrangements, and initially
false working hypotheses produced new, pharmacologically interesting
substances with surprising or outstanding qualities.
• Even today, when rational concepts and the understanding of mode-of-action
dominate drug research, the lucky accident can still help to make “blockbusters,”
as proven recently by the example of sildenafil (Viagra®).

Bibliography

Primary Literature

Ban TA (2006) The role of serendipity in drug discovery. Dialogues Clin Neurosci 8:335–344
Burger A (1983) A guide to the chemical basis of drug design. Wiley, New York
de Stevens G (1986) Serendipity and structured research in drug discovery. Fortschr Arzneimittelforsch 30:189–203
Kubinyi H (1999) Chance favors the prepared mind. From serendipity to rational drug design. J Receptor Signal Transd Res 19:15–39
Restak RM (1994) Receptors. Bantam Books, New York
Roberts RM (1989) Serendipity. Accidental discoveries in science. Wiley, New York
Sneader W (1990) Chronology of drug introductions. In: Hansch C, Sammes PG, Taylor JB (eds) Comprehensive medicinal chemistry, vol 1, Kennewell PD (ed). Pergamon Press, Oxford, pp 7–80

Secondary Literature

Cahn A, Hepp P (1886) Das Antifebrin, ein neues Fiebermittel. Centralblatt für Klinische Medizin 7:561–564
Hofmann A (1993) LSD – mein Sorgenkind. dtv/Klett-Cotta
Sternbach LH (1978) The Benzodiazepine story. Fortschr Arzneimittelforsch 22:229–266
Stütz A (1987) Allylamine derivatives – a new class of active substances in antifungal chemotherapy. Angew Chem Int Ed 26:320–328
von Vámossy Z (1900) Ist Phenolphthalein ein unschädliches Mittel zum Kenntlichmachen von Tresterweinen? Chemiker-Zeitung 24:679–680

3 Classical Drug Research

The hundred years of pharmaceutical research from 1880 to 1980 were marked
by trial and error, but also by elegant ideas and their translation into therapeutically
valuable principles. Many lead structures were found by accident (see ▶ Chap. 2,
“In the Beginning, There Was Serendipity”); others came from traditional medicines
or from biochemical concepts. In contrast to modern drug research, classical
design was the result of rather limited knowledge of the pathophysiology and
cellular and molecular etiology of disease, and was restricted to animal experi-
ments. Nonetheless, this phase, and particularly the last 50 years, has been excep-
tionally successful. The targeted fight against infectious diseases and the successful
treatment of many psychiatric and other important diseases can be attributed to this
period in drug development. With this came a significant increase in quality of life
and life expectancy. In the following sections, selected examples are used to
demonstrate different aspects of classical pharmaceutical research.

3.1 Aspirin: A Never-Ending Story

The history of acetylsalicylic acid (ASA, Aspirin ®) reflects the progress of phar-
maceutical research like no other example. This is especially true for the elucida-
tion of the mode of action, and the newly found targeted therapies that resulted.
Willow bark extracts have been used since antiquity for the treatment of inflammation.
When Napoleon marched across Europe between 1806 and 1813, the bark was
even used as a substitute for cinchona bark (▶ Sect. 3.2). Salicin 3.1, a glucoside of the
o-hydroxybenzyl alcohol saligenin, is responsible for the effect. Upon hydrolysis
and oxidation, the actual active compound, salicylic acid 3.2 (Fig. 3.1), is formed.
In 1897 the then 29-year-old Bayer chemist Felix Hoffmann began a systematic
search for derivatives of salicylic acid. His father, who suffered from severe
rheumatoid arthritis, had asked him to. High doses of salicylic acid caused unpleas-
ant gastric irritation and vomiting. Hoffmann prepared simple derivatives of
salicylic acid, and was successful within the year. On October 10, 1897 he synthe-
sized acetylsalicylic acid 3.3 (ASA, Fig. 3.1) for the first time in a pure form.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_3

Fig. 3.1 Salicylic acid 3.2 is the oxidation and cleavage product of salicin 3.1, which is isolated
from willow bark. Acetylsalicylic acid (ASA) 3.3 is not simply a prodrug of salicylic acid, but
rather a drug with its own mode of action.

It was a stroke of luck. Although ASA has a very short half-life in plasma, it is
strongly analgesic, antipyretic, and anti-inflammatory. The clinical trial was
carried out at the Diakonissenkrankenhaus in Halle an der Saale on 50 patients. On
February 1, 1899 Bayer registered ASA as Aspirin® (A for acetyl and spir for Spiraea,
another plant that contains salicylic acid) as a trademark under the number 36 433.
From then on it was sold as 1 g of powder in envelopes, and shortly thereafter as
tablets. Detractors alleged that it was only developed in tablet form so that Bayer
could emboss their famous Bayer cross onto it. Aspirin quickly gained a leading
place in drug therapy. One hundred years after its market introduction, 40,000 t of
ASA are produced and pressed into tablets every year, worldwide. At the end of
1994 the Bayer plant in Bitterfeld produced 400,000 Aspirin® tablets per hour, 3.5
billion per year. The importance that the trademark Aspirin had for Bayer became
clear in 1994 when the company paid US$1 billion to take over the self-medication
business from Sterling-Winthrop, which included the trademark rights for
Aspirin, which had been lost in 1918.
The Spanish philosopher José Ortega y Gasset called the previous century the
“age of Aspirin.” In his book The Revolt of the Masses, he wrote:
The ordinary person lives today more easily, comfortably and safely than the most powerful
of the past. Why should he care that he is not richer than others when the world is, and
roads, trains, hotels, telegraphs, personal safety, and Aspirin® are at his disposal?

Jaroslav Hašek, Kurt Tucholsky, Giovanni Guareschi, Graham Greene, John
Steinbeck, Agatha Christie, Truman Capote, Hans Hellmut Kirst, and Edgar
Wallace also wrote about Aspirin. The singer Enrico Caruso treated his headaches
with only “German Aspirin,” out of principle. Even Franz Kafka and Thomas Mann
raved about its outstanding effects in their letters. In 1986 on an official visit to
Germany, Queen Elizabeth II said that:
German successes span the entire breadth of human life. From philosophy, music and
literature, to the discovery of X-rays and the mass production of Aspirin ®.

The compliment was wonderful, but one must also consider that all of these
scientific discoveries are slightly more than 100 years old! ASA was considered to
be a prodrug of salicylic acid and a drug of unknown mode of action until John Robert
Vane (Nobel Prize 1982) and Sergio H. Ferreira discovered in 1971 that salicylic
acid and other nonsteroidal anti-inflammatory drugs inhibit prostaglandin G/H
synthase (cyclooxygenase, COX). COX, a ubiquitously present, membrane-bound
enzyme, transforms arachidonic acid 3.4 via a cyclic endoperoxide into PGH2 3.5,
which in turn is transformed into prostacyclin 3.6, thromboxane A2 3.7, and other
prostaglandins. Large quantities of prostaglandins are produced in inflamed tissue, so
that the inhibition of cyclooxygenase intervenes in the cause of the process
itself (Fig. 3.2).

Fig. 3.2 Arachidonic acid 3.4 undergoes an oxidative cyclization and a peroxidase reaction in the
prostaglandin biosynthesis to give the primary product PGH2 3.5. Prostacyclin synthase then
transforms PGH2 into prostacyclin 3.6, which protects the gastric mucosa, dilates blood vessels,
and inhibits platelet (thrombocyte) aggregation. The platelet thromboxane synthase transforms
PGH2 into thromboxane A2 3.7, which promotes aggregation. ASA irreversibly inhibits
cyclooxygenase. By using low ASA doses, the thromboxane A2 synthesis in the platelets is more strongly
inhibited than the production of prostacyclin in the vascular walls.

ASA is in fact a metabolic precursor of salicylic acid. In contrast to other
anti-inflammatory drugs, including salicylic acid, however, it has an astonishing mode
of action (▶ Sect. 27.9). It has been known for some time that ASA selectively
acetylates the hydroxyl group of the amino acid serine 530 of cyclooxygenase. In
1995 the three-dimensional complex structure of a bromine analogue was solved
for the first time. This confirmed that ASA, analogously to other COX
inhibitors, docks near the arachidonic acid binding site (▶ Sect. 27.9). Therefore,
despite its relatively weak binding, ASA is in an outstanding position to acetylate
this serine. Serine 530 is not involved in the catalytic mechanism, but the additional
volume of the acetyl group impedes arachidonic acid’s entrance to the binding site
and therefore the synthesis of the prostaglandin precursors. A COX mutant that
carries an alanine instead of a serine at position 530 is enzymatically fully active
and is still inhibited by all other anti-inflammatory compounds. This mutant is, as
expected, only weakly inhibited by ASA.

Fig. 3.3 Celecoxib 3.8 and valdecoxib 3.9 are specific inhibitors of cyclooxygenase COX-2,
which, rather than COX-1, is in particular responsible for the fast synthesis of prostaglandins in
inflamed tissue.

Continued research on nonsteroidal anti-inflammatory drugs
was stimulated by the discovery in 1991 of a second cyclooxygenase, COX-2. All
anti-inflammatory drugs until then were unselective, or they exerted their effect
overwhelmingly through COX-1 and only slightly through COX-2. The most important
side effect of ASA and other anti-inflammatory drugs is the gastrointestinal damage
that can occur at high doses; this results from the inhibition of the COX-1-dependent
synthesis of prostacyclin 3.6, which protects the gastric mucosa. In
contrast to the ubiquitously occurring COX-1, COX-2 is responsible for the fast
synthesis of prostaglandins in inflamed tissue. It has been possible to bring many
drugs to the market that are more than 1,000-fold more selective for COX-2 than
COX-1, for instance, 3.8 and 3.9 (Fig. 3.3 and ▶ Sect. 27.9).
But do not worry, Aspirin® will live forever. Its success is growing in another
market. Even at low doses ASA inhibits the synthesis of thromboxane A2 3.7, which
initiates the aggregation of platelets (thrombocytes). Because of its irreversible
inhibition of cyclooxygenase, and the inability of platelets to synthesize new
enzyme, a one-time contact with the substance is enough to suppress the synthesis
for the lifetime of the thrombocyte, that is, for about a week. In tissues other than
thrombocytes, the enzyme is replaced. Therefore the physiological antagonist
of thromboxane, the aggregation-inhibiting prostacyclin that is produced in the
walls of the vasculature, can be replenished (Fig. 3.2).
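
The consequence of irreversible inhibition in a cell that cannot make new enzyme can be illustrated with a deliberately simple sketch. It assumes (these are illustrative assumptions, not statements from the text) that a single dose inactivates cyclooxygenase in a given fraction of circulating platelets, that platelets are replaced at a constant rate, and that the whole pool is renewed within a platelet lifetime of 7 days, taken from the "about a week" mentioned above. The function name and parameters are hypothetical.

```python
def active_cox_fraction(days_since_dose, platelet_lifetime_days=7.0,
                        fraction_inhibited=1.0):
    """Fraction of circulating platelets with active cyclooxygenase.

    Toy model: inhibited platelets never recover; activity returns only
    because old platelets are replaced by new ones at a constant rate.
    """
    if days_since_dose < 0:
        raise ValueError("days_since_dose must be non-negative")
    # Share of the platelet pool that has been renewed since the dose.
    replaced = min(days_since_dose / platelet_lifetime_days, 1.0)
    # Only the platelets not yet replaced still carry acetylated enzyme.
    return 1.0 - fraction_inhibited * (1.0 - replaced)


if __name__ == "__main__":
    for day in range(0, 8):
        print(day, round(active_cox_fraction(day), 2))
```

In this picture, activity in platelets returns only as fast as the cells themselves are renewed, whereas a tissue that resynthesizes the enzyme would recover independently of cell turnover. Real platelet kinetics are more complex than a linear replacement.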
In conditions of increased coagulation tendency, ASA shifts the
biosynthesis away from the “bad” thromboxane in the direction of the “good”
prostacyclin. This effect is the basis for the therapeutic use of ASA in cases of
thrombosis susceptibility, for instance, before and after a heart attack or stroke.
Considering the now-known mechanism of the effect, the dose can be decreased
tenfold! That reduces the risk of gastrointestinal bleeding as a possible side effect.
Based on these observations, it is now recommended that ASA be taken
prophylactically before long-haul flights. The constricted sitting and lack of move-
ment coupled with the dry air and reduced pressure in the cabin lead to dehydration
and cause a “thickening” of the blood. The economy-class syndrome typically leads
to jet legs and increases the risk of embolism and vein thromboses. Here ASA can
offer a measure of protection. On the other hand, its use before surgical procedures
is not recommended. No surgeon wants an increased bleeding risk for the patient as
a result of diminished coagulation competence during a procedure.
Felix Hoffmann’s approach of using simple derivatization to improve the
tolerability of a substance led to a new therapeutic principle 100 years ago, the
value of which cannot be overstated. The triumph of ASA was, and
is, unstoppable. A German/Austrian study on 13,300 patients showed that ASA
therapy reduces the mortality of a heart attack by 17%, and the number of non-fatal
repeat attacks by 30%. On October 9, 1985 the US FDA, a normally
conservative organization, announced that the daily consumption of ASA can
reduce the chances of a recurrent heart attack by 20%, and in some high-risk
populations by even more than 50%. A further study on 22,000 physicians investigated
the influence of regular ASA use on the chances of heart attack. Here, the
physicians were not the experimenters but the subjects. The study was prematurely
ended when it was established that the control group had 18 lethal and 171 non-lethal
heart attacks, whereas the ASA-treated group had 5 lethal and 99 non-lethal
heart attacks: altogether a reduction of about 45%. A study on 90,000 nurses showed the
same protective effect in women. The risk of a first heart attack was reduced by
30%. This marked the introduction of ASA as a “preventive medicine.”
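
The reduction quoted for the physicians' study can be recomputed from the event counts given above. The following sketch assumes that the 22,000 participants were split into two equally sized groups of 11,000; that split is an assumption made here for illustration and is not stated in the text. With equal groups the result depends only on the ratio of the event counts. The function name is hypothetical.

```python
def relative_risk_reduction(events_control, events_treated,
                            n_control, n_treated):
    """Relative risk reduction: 1 - (risk in treated / risk in control)."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    return 1.0 - risk_treated / risk_control


GROUP_SIZE = 11_000  # assumption: 22,000 physicians split evenly

# Event counts as quoted in the text (lethal + non-lethal heart attacks).
control_events = 18 + 171
treated_events = 5 + 99

rrr_all = relative_risk_reduction(control_events, treated_events,
                                  GROUP_SIZE, GROUP_SIZE)
rrr_lethal = relative_risk_reduction(18, 5, GROUP_SIZE, GROUP_SIZE)

print(f"all heart attacks:    {rrr_all:.0%} fewer")
print(f"lethal heart attacks: {rrr_lethal:.0%} fewer")
```

Under these assumptions the overall reduction comes out a little below one half, and the reduction in lethal events alone is considerably larger, although it rests on very small numbers.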
A six-year study of 600,000 volunteers is worth an entry in the Guinness Book of
World Records. After the results were in, it appeared that ASA reduces the risk of
lethal colon cancer by 40%. Even this effect has a plausible explanation.
Malondialdehyde, a metabolite of prostaglandins, damages DNA. Mutations in
the so-called tumor-suppressor gene TP53 occur particularly frequently in human
colon tumors. This causes the cancer cells to lose the ability to regulate their
growth, and they grow uncontrollably. The explanation could also be entirely different,
however. As a result of gastrointestinal bleeding, a possible side effect of ASA, the treated group was
probably more frequently examined than the control group. It is entirely conceivable
that the colon cancer was therefore found in an earlier stage in which it was
more easily operable.
Since 1992 Aspirin® has been available as a chewable tablet. In this form it is buffered
with calcium carbonate, the absorption is much faster, and the side effects are
reduced. ASA has had an unbelievable career, particularly if one considers that it
would never have had a chance of being approved under modern criteria. Its short
plasma half-life, the irreversible protein inhibition, and the high doses would have
met today’s exclusion criteria. A definitive end point in its hypothetical modern
development would be the teratogenicity seen in rats. A pathological result in
toxicity studies with this animal model would definitely lead to discontinuation,
because who would dare to wager that a teratogenic effect occurs in rodents but
not in humans? Aspirin®: really a never-ending story.

3.2 Malaria: Success and Failure

The therapy of malaria begins with the discovery of cinchona, around which there
are numerous legends. The nicest and most frequently cited version is that of the
fever-stricken Countess of Chinchón, the wife of the Spanish viceroy in Lima, Peru,
who was healed by the doctor Juan de Vega in 1638. On the advice of the town
magistrate of Loja, quinquina, the “bark of the barks” (hence the confusing name
“cinchona bark”), was brought in from 800 km away. The Countess was allegedly
healed and from then on distributed the powder herself. In the older works, the
cinchona bark was also called “Countess powder” or “Jesuit powder.” Perhaps it
was also true that the Indians, who were forced into compulsory service in the silver
mines by their Christian conquerors, chewed the bark to fight off shivering in the
cold. The clever Jesuits took note of these observations, and thought that chewing
the bark would also help with the shivering that comes from a malarial fever
episode. Cinchona then came back to Europe with the Jesuits.
Malaria, the remittent fever, is a widespread tropical and sub-tropical disease.
Because it is transmitted by the anopheles mosquito, it occurs particularly in
wetlands. Even the city of Buenos Aires (Span. “good airs”) was badly hit by malaria
(Ital. mala aria = “bad airs”). Alexander the Great, the Gothic King Alaric, and
the German Emperors Otto II and Heinrich IV died of it. Even Albrecht Dürer
(1471–1528) apparently suffered from malaria. He sent his private physician
a drawing of himself in which he was wearing only a loincloth. His right hand is
over his spleen with the additional text that do der gelb Fleck ist vnd mit dem Finger
drawff dewt, do ist mir we (there where the yellow spot is and where the finger
points, is where it hurts). In Europe malaria was still widespread until the middle of
the last century. In the north of Germany, the last epidemics were in the years 1896,
1918, and 1926.
The miasma, emissions from the ground, swamps, and corpses, were long seen
as the source of malaria and other epidemics. The Roman author Marcus Terentius
Varro (116–27 BC) already suspected that small invisible organisms might be
responsible. Toward the end of the nineteenth century, the anopheles mosquito was
identified as the vector, and a plasmodium was recognized as the cause of malaria.
Around 1930 about 700 million people were infected, and in 2003 the number was
estimated to be 300–500 million. Up to 1.2 million people die every year, mostly
children under the age of 5, and many others retain permanent damage. Psychiatric
changes are also a consequence. The term “spleen” for eccentricity originally came
from the enlarged spleen that malaria causes.
It should not go unmentioned that heterozygous (i.e., genetically mixed) carriers
of sickle cell anemia are protected from malaria. This genetic form of anemia was
the first disease for which the molecular cause could be identified (▶ Sect. 12.12).
A single amino acid in the hemoglobin of those afflicted is mutated. This causes
hemoglobin to aggregate, and the erythrocyte shrinks together. The malaria parasite
cannot adequately reproduce in such an erythrocyte. This partial protection from
malaria has abetted the spread of sickle cell anemia in malaria-endemic areas, but
not in other areas.

[Chemical structures: 3.10 Quinine, 3.11 Plasmoquine, 3.12 Mepacrine, 3.13 Chloroquine,
3.14 Mefloquine, 3.15 Amodiaquine]

Fig. 3.4 Simple synthetic analogues with antimalarial effects were derived from quinine 3.10.
Plasmoquine 3.11 still contains the methoxyquinoline ring of quinine, but it is in a different
position. The later-developed analogues mepacrine 3.12 and chloroquine 3.13 show strong
similarity to quinine. The newer derivatives mefloquine 3.14 and amodiaquine 3.15 are also
structurally closely related to quinine.

The active substance in the cinchona bark, the alkaloid quinine 3.10 (Fig. 3.4),
was isolated in 1820. Aside from its positive therapeutic effects, it also had
considerable side effects; nonetheless up until a few years ago it was the most
important antimalarial, particularly for the parenteral treatment of severe malaria.
The first synthetic alternative, plasmoquine 3.11, became available in 1927, but it is
seldom used due to its side effects. The later-developed, more potent analogues
3.12–3.14 show a clear structural relationship to the lead structure quinine
(Fig. 3.4). It was only through the protection from malaria that the exploitation of
the colonies was possible. The World Health Organization, WHO, initiated a global
malaria-eradication program in 1955 mainly through the use of the insecticide
dichlorodiphenyltrichloroethane 3.16 (DDT, Fig. 3.5).
The success was overwhelming: the number of cases and fatalities was reduced
to practically zero (Table 3.1). In 1953 it was estimated that five million lives had
been saved since 1942. In India alone the number of cases went from 75 million to
750,000, and the number of annual fatalities was reduced to 1,500. DDT has saved
more lives than all antimalarial drugs put together! The acute toxicity of DDT is
actually not a problem for mammals and humans. Unfortunately, it turned out that
DDT decomposes extremely slowly in the environment, and it becomes enriched as it
moves up the food chain, especially in birds and fish. It also accumulates in human
fat and in breast milk. The chronic toxicity comes from long-term retention of
one year or more, and that is a serious problem.

[Chemical structures: 3.16 DDT, 3.17 DDE]

Fig. 3.5 The insecticide p,p′-dichlorodiphenyltrichloroethane 3.16 (DDT) saved more human
lives than all antimalarials put together. The latest investigations show, though, that the
antiandrogenic effect of the main metabolite p,p′-dichlorodiphenyldichloroethylene 3.17
(DDE) is possibly the main culprit responsible for reproductive disorders found in animals,
including perhaps humans.

Table 3.1 Number of malaria cases in different countries before and after the introduction
of DDT 3.16 (Fig. 3.5). The numbers in parentheses are the years (Jukes TH (1974) Naturwiss
61:6–16)

Country       Cases of malaria (year)
              Before DDT              After DDT
Italy         411,602 (1946)          37 (1969)
Spain         19,644 (1950)           28 (1969) a
Yugoslavia    169,545 (1937)          15 (1969) a
Bulgaria      144,631 (1946)          10 (1969) a
Romania       338,198 (1948)          4 (1969) a
Turkey        1,188,969 (1950)        2,173 (1969)
India         75 million per year     750,000 (1969)
Sri Lanka     2.8 million (1946)      110 (1961)
                                      31 (1962)
                                      17 (1963)
                                      2.5 million (1968/1969) b
Taiwan        >1 million (1945)       9 (1969)
Venezuela     817,115 (1943)          800 (1958)
Mauritius     46,395 (1948)           17 (1969)
a Imported cases
b After DDT spraying was discontinued in 1963

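The case numbers in Table 3.1 can also be read as relative decreases. The following minimal Python sketch does this arithmetic for a few rows. The figures are copied from the table; the function name percent_reduction is chosen here only for illustration. Because the "before" and "after" years differ from country to country, and the Indian baseline is an annual estimate, the results should be taken as a rough orientation and not as a precise measure of the effect of DDT.

```python
# Case counts copied from Table 3.1: (before DDT, after DDT).
cases = {
    "Italy": (411_602, 37),
    "Turkey": (1_188_969, 2_173),
    "India": (75_000_000, 750_000),
    "Venezuela": (817_115, 800),
    "Mauritius": (46_395, 17),
}


def percent_reduction(before: int, after: int) -> float:
    """Relative decrease in reported cases, in percent of the 'before' value."""
    return 100.0 * (before - after) / before


# Print one line per country with the relative decrease.
for country, (before, after) in cases.items():
    print(f"{country:10s} {percent_reduction(before, after):7.3f} %")
```

The same function applied to the Sri Lankan figures for 1946 and 1968/1969 gives a reduction of only about a tenth, which reflects the resurgence after spraying was discontinued.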
The moving book, Silent Spring by Rachel Carson, was published in 1962.
Despite warnings from experts, DDT spraying for mosquitoes was stopped
in Sri Lanka in 1963, and the number of malaria cases raced to 2.4 million by
1968/1969. By then it was too late to use DDT again because the mosquitoes had
become resistant, and this was certainly also partially due to the residual DDT that
remained in the environment in the intervening years.
Further investigations showed that a DDT metabolite, dichlorodiphenyldichloroethylene
3.17 (DDE, Fig. 3.5), has surprisingly strong antiandrogenic effects, that
is, it blocks the effects of male hormones. DDE may therefore be responsible for the
DDT-dependent reproductive and developmental disorders that are seen in some
species, perhaps also in humans. It is remarkable that the effect of this metabolite
was only discovered 50 years after DDT was introduced.
Not only did the mosquitoes become resistant to DDT; the parasite also became
resistant to the drugs. For this reason, the history of chemotherapeutic developments
for malaria has been a rollercoaster ride of promising new compounds followed by
the more or less rapid development and spread of resistant parasites.
Chloroquine 3.13 was prepared in 1934 in the Bayer laboratories, but was
judged to be “too toxic”; it was “rediscovered” by the Americans and deployed
as a malaria therapeutic par excellence. Efficacious, well tolerated, and above all
else inexpensive to produce, it, along with the above-described mosquito extermi-
nation with DDT and landscaping measures, brought us within reach of a victory
over malaria. But resistant parasites emerged almost simultaneously and indepen-
dently from one another in the 1960s in different parts of Southeast Asia, Oceania,
and South America. They possessed a mutated transport protein in the membrane of
their gastriole that recognizes chloroquine as a substrate. By using this protein they
were able to expel chloroquine from its target. In the meantime, resistant parasites
have spread throughout almost the entire geographic range of malaria. Chloroquine
lost its once phenomenal status for the therapy of malaria tropica. Since then,
researchers have sought a malaria therapeutic with qualities similar to those of
chloroquine, so far without success. The structurally related
amodiaquine 3.15 (Fig. 3.4) is in fact effective against weakly chloroquine-resistant
strains, but it is largely ineffective against highly resistant strains (especially in
Southeast Asia). Moreover, upon long-term use as a prophylaxis, it carries the risk
of irreversible liver damage or a life-threatening agranulocytosis. In the short term,
it appeared that the antifolate combination of sulfadoxine/pyrimethamine 3.18/3.19
(Fansidar®, Fig. 3.6) could replace chloroquine (Fig. 3.4), but the first resistance occurred
much faster than with chloroquine. Starting from the point of origin in Southeast
Asia, the resistance has spread over the entire world.
The wars of the last century have also promoted the search for new antimalarial
drugs. Tremendous effort was made at the Walter Reed Army Institute of
Research in the USA. Over the course of 40 years, and particularly during WWII
and the Vietnam War, more than 250,000 substances were tested for an antimalarial
effect. Measured against the effort expended, the success was modest:
the two aryl amino alcohols halofantrine 3.20 and mefloquine 3.14, and the
8-aminoquinoline tafenoquine 3.21, which still has not completed clinical trials,
were the result of strenuous labor. After its introduction, halofantrine was withdrawn
from the market because it caused lethal arrhythmias (▶ Sect. 30.3). In
Southeast Asia the resistance to mefloquine developed so quickly that it can only
be used in combinations with artesunate 3.22. Because mefloquine has been used
sparingly due to its price, most of the parasite strains are still sensitive to it. For this
reason, today mefloquine is one of the most important malaria prophylactics for
Western tourists. Artesunate is a semisynthetic derivative of dihydroartemisinin
3.24, which in turn is derived from artemisinin, a natural product isolated from
annual mugwort (Artemisia annua). Artemisinin’s
very unusual endoperoxide structure is essential for its activity. Intense research
is currently devoted to clarifying whether the iron(II)-catalyzed production
of radicals, which then react with the immediate cell structures (iron-triggered
cluster bomb), or a specific calcium pump inhibition is its mode of action. At any
rate, these are the most potent medicines to fight malaria to date. Scientists consider
it to be only a matter of time until resistance to artemisinin develops.
The artemisinin-based combination therapy is the current recommendation of the
WHO. Artemisinin is combined with whatever is available, even with substances against
which massive resistance is already established. At the moment it is combined with the
Chinese-developed aryl amino alcohol lumefantrine 3.23, which is usually
still effective. The combinations of dihydroartemisinin/piperaquine 3.24/3.25
and artesunate/pyronaridine 3.22/3.26 (Fig. 3.6) are in advanced stages of clinical
trials.
Both combination partners were developed in China in the 1960s and 1980s,
respectively. They belong to the same class as chloroquine, even though
pyronaridine has an azaacridine instead of a quinoline scaffold. Resistance to both
of these compounds is already widespread in Southeast Asia. The combination of
dapsone/chlorproguanil (LapDap®) 3.27/3.28 was introduced only a few years ago,
and both compounds are representatives of a long-used class: the antifolates. Even
in this case, the majority of Southeast Asian strains are already resistant. True
novelties in the mode of action are rare. In 1997 a very expensive combination
medication atovaquone/proguanil 3.29/3.30 (Malarone®) was introduced that
synergistically inhibits the mitochondrial respiratory chain. Fosmidomycin 3.31, an
inhibitor of the parasite-specific mevalonate-independent isoprenoid synthesis
pathway, is currently in clinical trials. Increased efforts are necessary to find new
substances. Ideally, modes of action that have not been exploited yet should be
pursued. It is only in this way that we can be armed and ready for the time that
resistance to artemisinin spreads.

3.3 Morphine Analogues: A Molecule Cut to Pieces

Research on the opiates has taught us how complex natural products can be
systematically simplified, and structurally abbreviated analogues can be prepared
that have the identical effect, but sometimes with even better specificity. It has also
shown that there is sometimes no obvious solution for a specific problem. The
separation of the analgesic and addictive qualities could not be achieved, or only
inadequately.
The narcotic, analgesic, and euphoric effects of opium, which is isolated from
poppies, have been known for at least 5,000 years. Opium was used for operations,
but is also a traditional drug of abuse. The importance of its abuse in the cultural
history of humanity is illustrated, among other places, in the “Opium Wars” of the
nineteenth century. In 1840 the Chinese wanted to stop the English from importing
opium and burned 20,000 cases of it; this led to a 2-year-long war between the two
countries.

[Chemical structures: 3.18 Sulfadoxine, 3.19 Pyrimethamine, 3.20 Halofantrine, 3.21 Tafenoquine,
3.22 Artesunate, 3.23 Lumefantrine, 3.24 Dihydroartemisinin, 3.25 Piperaquine, 3.26 Pyronaridine,
3.27 Dapsone, 3.28 Chlorproguanil, 3.29 Atovaquone, 3.30 Proguanil, 3.31 Fosmidomycin]

Fig. 3.6 The latest research in antimalarials shows that many products can be used in
combination. First Fansidar®, a combination of sulfadoxine 3.18 and pyrimethamine 3.19, was the
drug of choice. The development of rapid resistance has made this once-promising treatment
useless in the meantime. To date, hopes rest on the artemisinin derivatives 3.22 and 3.24. A new
beacon of hope is found in fosmidomycin 3.31, which has a novel mode of action in that it inhibits
the mevalonate-independent biosynthetic route to isoprenoids.


[Chemical structures: 3.32 Morphine, R1 = R2 = H; 3.33 Codeine, R1 = Me, R2 = H;
3.34 Heroin, R1 = R2 = Acetyl; 3.35 Naloxone]

Fig. 3.7 Morphine 3.32 and codeine 3.33 served as lead structures for heroin 3.34, which has
better CNS bioavailability, and naloxone 3.35, a morphine antagonist.

In 1804/5 the pharmacy assistant Friedrich Wilhelm Adam Sertürner of the
Hof-Apotheke in Paderborn isolated the sleep-inducing principle of opium.
He named it morpheum (later morphine) after Morpheus, the Greek god of
dreams and son of Hypnos. Morphine addiction took on a whole new dimension
after the invention of the hypodermic needle and syringe in 1853 by Charles G.
Pravaz and Alexander Wood. As a result, morphine and heroin addiction spread
widely, and in the history of humanity it is one of many examples of the misuse of
a beneficial discovery.
Morphine 3.32 (Fig. 3.7) is one of the few examples of a natural product that is
still used today in its original form. It is among the most potent analgesics known.
If it is administered according to the correct dose and schedule, the danger of
addiction is low. The addictive potential is often overestimated by physicians such
that patients with severe pain are often inadequately treated with opiates. Morphine
is also a prime example of the success of systematic structural variation in the
direction of more-easily manufactured, simpler analogues as well as more selective
activity. The first modified products were simple derivatives such as the methyl
ether codeine 3.33, which is also found in the poppies. Codeine is weaker than
morphine, but it is bioavailable after oral administration. It has a pronounced
antitussive effect and a low addictive potential. Unfortunately, the opposite is
true for the potent, fast-acting diacetyl derivative heroin 3.34. It has enormous
addictive potential. Today it seems ironic that at the end of the nineteenth century
Heinrich Dreser, a senior pharmacologist at Bayer, wanted to discontinue the
development of Aspirin® because of a suspected cardiotoxicity in favor of developing
heroin as a well-tolerated and potent cough medicine (sic!), at least until he
realized the mistake. Of all the morphine derivatives, codeine and heroin are the
most widespread: codeine in numerous combination preparations, and heroin in
the drug scene. Some N-alkyl derivatives of morphine and close analogues, for
instance, naloxone 3.35, are opiate antagonists, that is, they inhibit the effect of
morphine (Fig. 3.7).
The structural elucidation of morphine took more than 120 years; its total
synthesis, the ultimate structural proof, was completed in 1952 by Marshall Gates
and Gilg Tschudi. Morphine contains five rings: an aromatic benzene ring, two
unsaturated six-membered rings, the nitrogen-containing piperidine ring, and an
oxygen-containing five-membered ring. Systematic structural modifications had the
goal of simplifying the structure, for example, by opening one or more rings, or
removing them altogether.

[Chemical structures: 3.36 Pethidine, 3.37 Atropine, 3.38 Levomethadone, 3.39 Etorphine]

Fig. 3.8 The architecture of morphine was dissected in many ways. The potent pethidine
3.36 was the first fully synthetic opiate analgesic, but it was discovered in the 1930s in a search for
spasmolytics by varying the structure of atropine 3.37. It is recognizable, however, that pethidine
retains the benzene ring of morphine as well as its piperidine ring. Levomethadone 3.38 is derived
from pethidine. The addition of another ring led to substances whose potency surpasses
that of morphine by orders of magnitude. Etorphine 3.39 is 2,000–10,000 times more potent than
morphine in animals. Since 1963 it has been used in African wildlife preserves to immobilize large
animals such as elephants and rhinoceroses.

In 1939, the potent analogue pethidine 3.36 (Fig. 3.8) was the first fully synthetic
analgesic, though it was originally based on the spasmolytic atropine 3.37. Despite
this, it is recognized to be a morphine analogue. In levomethadone 3.38 the
piperidine ring of pethidine is opened, an oxygen atom from the ester group is
removed, and another aromatic ring is added. There are thousands of other ana-
logues, some of which have been introduced to therapy. Aside from the decon-
struction of morphine, the construction of additional rings has surprisingly led to
analogues with more potency, for example, etorphine 3.39 (Fig. 3.8).
For a long time it was a complete mystery why our bodies would have extra
receptors for the contents of poppy plants, so-called opiate receptors. The solution
came with the discovery of the endogenous morphine-like peptides Met- and Leu-
enkephalin (▶ Sect. 10.2), which are the natural ligands for these receptors. The
discovery stimulated an intensive search for orally active peptides or
peptidomimetics devoid of addictive potential. The result of the work was more
than sobering. Although orally active analogues were found, their addictive potential
was identical to that of morphine and most morphine-derived analogues.

[Chemical structures: 3.40 Loperamide, 3.41 Haloperidol]

Fig. 3.9 Structural derivatives of morphine and its analogues have led to selective antidiarrhea
agents, loperamide 3.40, for instance, as well as neuroleptics such as haloperidol 3.41.

A few synthetic analogues have, in addition to agonistic activity, a weak antag-
onistic effect as well. The potential for these substances to be abused by addicts is
less than with the classical morphine analogues. Combination preparations of
agonists and antagonists are also available. With appropriate use, the analgesic
effect of the agonist dominates because it is present in excess. If the medicine is
injected intravenously, the more-strongly binding antagonist displaces the agonist,
and the desired euphoric effect never sets in.
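The competition between agonist and antagonist described above can be illustrated with a simple equilibrium model. The Python sketch below assumes a single binding site, reversible competitive binding, and free concentrations that are not depleted by binding; under these assumptions the fraction of receptors occupied by the agonist is (A/KA) / (1 + A/KA + B/KB). All numerical values (dissociation constants and concentrations in nM) are hypothetical and chosen only to show the trend. They are not data for any real preparation, and the model says nothing about how the route of administration changes the concentrations reached at the receptor.

```python
def agonist_occupancy(agonist_conc: float, antagonist_conc: float,
                      kd_agonist: float, kd_antagonist: float) -> float:
    """Fraction of receptors occupied by the agonist at equilibrium in the
    presence of a competitive antagonist (one-site competition model)."""
    a = agonist_conc / kd_agonist        # agonist concentration in units of its Kd
    b = antagonist_conc / kd_antagonist  # antagonist concentration in units of its Kd
    return a / (1.0 + a + b)


# Hypothetical values: the antagonist binds ten times more tightly (lower Kd).
KD_AGONIST_NM = 10.0
KD_ANTAGONIST_NM = 1.0

# Case 1: the agonist is present in large excess over the antagonist.
excess = agonist_occupancy(1000.0, 1.0, KD_AGONIST_NM, KD_ANTAGONIST_NM)

# Case 2: both reach the receptor at the same concentration.
equal = agonist_occupancy(1000.0, 1000.0, KD_AGONIST_NM, KD_ANTAGONIST_NM)

print(f"agonist in excess:    {excess:.3f}")
print(f"equal concentrations: {equal:.3f}")
```

In this sketch the agonist occupies most receptors as long as it is present in excess, whereas the more tightly binding antagonist takes over as soon as both arrive at comparable concentrations.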
The work with regard to improved selectivity was also successful. Today cough
medicines and antidiarrhea medicines, for example, loperamide 3.40 (Fig. 3.9), are
available that have no central morphine-like effects. This substance is able to pass
through the blood–brain barrier but is immediately expelled by an active transporter.
Upon inhibition of this transporter, for instance, when it is co-administered with
quinidine, loperamide also has classical opiate effects. Its structure unites elements
of pethidine 3.36 and levomethadone 3.38.
In this section only a few representatives of the many thousand structural
modifications of morphine can be discussed. The approach of Paul Janssen should
not remain unmentioned though; he started with pethidine 3.36 with the goal of
preparing a strong analgesic, but instead experienced an unexpected success in
another area. The result was the neuroleptic haloperidol 3.41 (Fig. 3.9), a drug for
the treatment of schizophrenia, the mode of action of which is mediated by an
antagonistic effect at the dopamine D2 receptor (▶ Sect. 29.4).

3.4 Cocaine: Drug and Valuable Lead Structure

No other substance sparkles in so many ways as cocaine. In the introduction it was
already mentioned that it is at the pinnacle of all illegal drugs. Cocaine was also the
chemical starting material for a wide palette of valuable local anesthetics and
antiarrhythmics. We can thank the lead-structure cocaine for local anesthesia,
pain-free dentistry, and nerve-block anesthesia for smaller surgical procedures.

[Chemical structures: 3.42 Cocaine, 3.43 Benzocaine, 3.44 Lidocaine, 3.45 Mepivacaine]

Fig. 3.10 The local anesthetic effect of cocaine 3.42 was recognized early on. The independently
found lead structure benzocaine 3.43 and the basic moiety of cocaine were models for synthetic
local anesthetics. The structural relationship is clearly recognizable in lidocaine 3.44, which also
acts as an antiarrhythmic, and in mepivacaine 3.45.

The translation of the quite positive central effects of cocaine onto analogues
devoid of addictive potential is still in progress. The example of morphine leads
one to fear that this goal might not be possible.
Coca leaves and cocaine 3.42 (Fig. 3.10) are among the oldest known drugs.
Chewing dried coca leaves has a long tradition in Peru and Bolivia. In 1744
Garcilaso de la Vega wrote that coca “satisfies hunger, gives new energy to the
tired and exhausted, and lets the unhappy forget their troubles”. The Scottish
author, Robert Louis Stevenson (Treasure Island) wrote in his novella The Strange
Case of Dr. Jekyll and Mr. Hyde about a personality split that a doctor undergoes
under the influence of drugs; he wrote the first draft of this novella in only three
days and nights while under the influence of cocaine. In 1863 the French chemist
Angelo Mariani (1838–1914) patented a mixture of coca extract and wine as
Vin Mariani. It made him a rich man. In 1886 the pharmacist John S. Pemberton
developed a coca-containing stimulant and headache remedy that he named Coca
Cola. He sold the rights in 1891 to a colleague, Asa G. Candler, who founded the
Coca Cola Company one year later. Up until 1906 Coca Cola indeed contained
a small amount of cocaine, but today it only contains the harmless stimulant
caffeine. Back at the turn of the last century, cocaine was already fashionable,
particularly in artistic circles. The Viennese psychiatrist Sigmund Freud (1856–
1939) experimented with cocaine intensively and rather uncritically. He considered
it to be a wonder drug, took it himself regularly, and recommended it generously for
use in therapy, for the treatment of stomach aches, and for a depressed mood. Later,
after massive criticism from his colleagues he turned away from it.
Cocaine causes the release of dopamine from its transporter (see ▶ Sect. 30.7).
Usually it is sniffed, occasionally it is intravenously injected, or it is mixed in drinks
or taken orally. Sniffing delivers it quickly to the brain where it displaces dopamine
from the binding site of the transporter, and this leads to more dopamine
in the synaptic cleft. The free base (crack), which is made by treating the salt with sodium
bicarbonate, is absorbed very quickly through the lungs when it is smoked, and
causes a euphoria that is even stronger than when the salt (coke,
powder, snow) is sniffed. Because cocaine does not bind for long, the transporter
is quickly reloaded with dopamine. The same effect can be induced again after
a little while. Other cocaine analogues that bind for longer do not allow the effect to
be repeated for hours. Psychological dependence occurs very quickly, even after the
first use in the case of crack cocaine. Physical withdrawal symptoms, as seen with
heroin addicts, usually do not occur.
The credit for discovering the local anesthetic effect of cocaine does not go to
Freud but rather a friend of his, the ophthalmologist Carl Koller (1857–1944).
Freud had planned to investigate this effect, but in 1884 he wanted to visit his
fiancée, Martha Bernays, quickly first. Koller picked up on Freud’s
suggestion and carried out the decisive experiment on the eye in his absence. The
synthetic benzoic acid esters and anilides that were initially used as local anes-
thetics were not derived from cocaine 3.42, but rather from p-aminobenzoic acid
esters; benzocaine 3.43 was already in use in therapy in 1902. A structural rela-
tionship to cocaine is, however, easily seen in modern local anesthetics such as
lidocaine 3.44 and mepivacaine 3.45 (Fig. 3.10).

3.5 H2 Antagonists: Ulcer Therapy Without Surgery

The history of the treatment of gastroduodenal ulcers is long and educational. Basic
research clarified the important mechanisms without providing a new drug. The
development of the therapy occurred in several phases. Again and again, better was
the enemy of good. In the beginning the treatment consisted of antacids, and later
anticholinergics. In severe cases only surgery helped. The H2 antagonists made the
breakthrough to purely pharmaceutical treatment. Now we are experiencing the victory
lap of the proton-pump inhibitors, which are used in different combinations with
antibiotics. Perhaps in the future this will be augmented or even replaced by a vaccine.
Gastric and duodenal ulcers are usually chronic illnesses and are widespread in
the general population. Any damage to the mucosal membrane of the stomach leads
to damage to the underlying cells through proteolytic enzymes and gastric acid.
Acetylcholine 3.46, histamine 3.47, and gastrin, a mixture of peptides with 17 (little
gastrin) and 34 (big gastrin) amino acids, stimulate the production of acid.
For decades the treatment of gastroduodenal ulcers was based on reducing the
amount of acid, for instance, with sodium bicarbonate, calcium carbonate, magnesium
salts, and aluminum oxide hydrate. Advanced ulcers had to be treated surgically.
Anticholinergics, antagonists of the acetylcholine receptor, should, in
principle, have been suitable for ulcer treatment; however, unspecific antagonists
are out of the question because of their severe side effects. It was not until
pirenzepine 3.48 (Fig. 3.11), a selective so-called M1 antagonist, was developed
that this class could be used in therapy. Here the undesirable side effects of
unspecific anticholinergics are only apparent at relatively high doses.

[Chemical structures: 3.46 Acetylcholine, 3.47 Histamine, 3.48 Pirenzepine, 3.49 Diphenhydramine]

Fig. 3.11 Acetylcholine 3.46 and histamine 3.47 stimulate the acid production in the stomach.
The acetylcholine receptor antagonist pirenzepine 3.48 was the first drug specifically for ulcer
therapy. Classical H1 antihistamines such as diphenhydramine 3.49 cannot antagonize histamine in
the stomach.

The role of histamine in acid secretion was initially called into question because
the classical antihistamines, later defined as H1 antihistamines, did not reduce acid
secretion. These substances, for instance, diphenhydramine 3.49 (Fig. 3.11) antag-
onize histamine in the intestines, lungs, and in allergic reactions. Today a wide
palette of different histamine antagonists is available for the treatment of allergic
rhinitis (hay fever). The most important side effect, particularly with the older
substances, is a more or less pronounced sedation. Histamine-induced gastric acid
secretion, the effect on the heart, and uterus contractions are not inhibited by
diphenhydramine and other analogues. It was first suspected in 1948 that there
might be two different histamine receptors, H1 and H2. The H1-type is inhibited by
diphenhydramine, but the H2-type, which is responsible for the above-mentioned
effects is not. Both belong to the family of G protein-coupled receptors
(▶ Sect. 29.1). In the meantime two additional members of the family, the H3 and
H4 receptors, have been discovered. In 1964 James W. Black (1924–2010) at Smith
Kline & French in England began to develop three models to test the inhibition of
these other, H2-mediated effects of histamine. One was an in vivo model
measuring gastric perfusion on anesthetized rats, and two were in vitro models
evaluating the histamine-induced stimulation of a guinea pig heart and a rat uterus.
James Black later received not only the Nobel Prize, but was also knighted by Queen
Elizabeth II, two rather unusual honors for an industrial pharmaceutical researcher.
Despite all strategies that were available for the development of receptor antag-
onists, the search for an H2 antagonist was to no avail for years. The American
management in Philadelphia became impatient and wanted to end the program. The
first promising result came just in the nick of time. Because all lipophilic analogues
were ineffective, the earlier, more polar compounds that had already been investigated
were reinvestigated. A compound that had already been synthesized in 1928
and determined to be ineffective, Nα-guanylhistamine 3.50 (Fig. 3.12), now
proved to be a weak antagonist. The effect had been overlooked because 3.50
is actually a partial agonist and therefore shows a weak histamine-like effect.
Within a few days the first lead structure with interesting activity,
S-[2-(imidazol-4-yl)ethyl]isothiourea 3.51, was identified (Fig. 3.12).

[Chemical structures: 3.50 X = -NH-; 3.51 X = -S-; 3.52 Burimamide, R = H, X = -CH2-;
3.53 Metiamide, R = CH3, X = -S-; 3.54 Cimetidine]

Fig. 3.12 Nα-Guanylhistamine 3.50 and S-[2-(imidazol-4-yl)ethyl]isothiourea 3.51 served as
lead structures for the H2-type antihistamines. The first clinically tested H2 antagonists,
burimamide 3.52 and metiamide 3.53, were unsuitable for therapy. Only the development of
cimetidine 3.54 led to a breakthrough and an exceedingly successful therapy.

The extension of the side chains of both of these compounds delivered partial
agonists, the antagonistic effects of which were too weak. It was only in 1972 after
they abandoned the hypothesis that the basic nitrogen in the side chain was
necessary for activity that they, after chain elongation and an N-methyl substitution
of the thiourea, arrived at the first clinically useful H2 antagonist burimamide 3.52.
Human trials confirmed the efficacy, but the bioavailability was poor. The next
milestone was achieved with the development of metiamide 3.53 (Fig. 3.12), which
is 5–10-times more potent than burimamide and clinically demonstrated the desired
ulcer-healing effect. In some patients, however, a granulocytopenia occurred,
which is a dangerous suppression of the white blood cells and cannot be tolerated.
The medical need was great. It was not foreseeable whether the observed effect
was a result of H2 antagonism. We have the company to thank for taking on the risk
of further research. The sulfur atom of the thiourea was suspect. An isosteric
exchange for an oxygen atom delivered a less-potent urea analogue. Exchange for
an ═NH group led back to a guanidine, which was strongly basic, but a potent
antagonist nonetheless. Substitution of the imino group with an NO2 or a CN group
led to less-basic analogues, the antagonistic potency of which was comparable to that of
metiamide. The somewhat more active of the two analogues, cimetidine 3.54
(Fig. 3.12), was clinically tested. It was introduced in England in November 1976 and
in the USA in August 1977. By 1979 it was available in over
100 countries. Shortly thereafter, in 1983, cimetidine (Tagamet®) became the
most-prescribed drug in many countries, and its sales reached about US $1 billion.
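The role of basicity in the step from the guanidine to the NO2- and CN-substituted analogues can be made tangible with the Henderson–Hasselbalch relationship: for a base whose conjugate acid has a given pKa, the protonated fraction at a given pH is 1 / (1 + 10^(pH − pKa)). The Python sketch below applies this formula. The pKa values are approximate literature figures for the isolated functional groups and are assumptions introduced here for illustration; they are not taken from this chapter, and the imidazole ring of the molecules is ignored.

```python
PHYSIOLOGICAL_PH = 7.4


def fraction_protonated(pka: float, ph: float = PHYSIOLOGICAL_PH) -> float:
    """Henderson-Hasselbalch: fraction of a base present in its protonated
    (charged) form, given the pKa of its conjugate acid."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Approximate pKa values of the conjugate acids (illustrative assumptions).
assumed_pka = {
    "guanidine": 13.6,
    "thiourea": -1.2,
    "nitroguanidine": -0.9,
    "cyanoguanidine": -0.4,
}

# Print the protonated fraction of each group at physiological pH.
for group, pka in assumed_pka.items():
    print(f"{group:15s} pKa {pka:5.1f}  protonated fraction {fraction_protonated(pka):.2e}")
```

With these assumed values the guanidine group is almost completely protonated at physiological pH, whereas the nitro- and cyanoguanidine groups, like the thiourea they replace, remain essentially uncharged.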

Such a successful drug makes other companies restless. There are many cases
in the history of pharmaceutical research in which a major new concept was taken up
and developed further by other companies. Other examples of this are the structurally
entirely different calcium channel blockers verapamil and nifedipine (▶ Sect. 2.6)
and the angiotensin-converting enzyme inhibitors captopril and enalapril
(▶ Sect. 25.4).
The same happened in the development of the H2 antagonists. Ulcer therapy had
been researched since 1960 at Allen & Hanburys, a subsidiary of Glaxo. One of
the first lead structures 3.55 (Fig. 3.13), an aminotetrazole with about the same
potency as burimamide, was systematically varied without success. Their research
management also wanted to stop the project to concentrate on the anticholinergics.
The breakthrough came upon replacement of the tetrazole ring with a furan. It was
not exactly an obvious idea because the previously synthesized compounds always
had at least one nitrogen atom in the ring. The —CH2SCH2CH2— chain was taken
over from metiamide 3.53, and a dimethylaminomethylene group was added to
improve water solubility; the result was AH 18665 3.56 (Fig. 3.13).
The chemists also synthesized a cyanoguanidine, AH 18801 3.57, that was
comparable to cimetidine 3.54 in terms of potency. The substance's physical properties
were, however, unsatisfactory: the melting point was too low. The nitrovinyl
analogue 3.58 was expected to remedy this. It was synthesized, and it was an oil!
That was not seen as a prohibitive problem because, in compensation, it was 10 times
more potent than the cyanoguanidine 3.57 in the rat. Ranitidine 3.58 (Fig. 3.13) was
developed as a drug and introduced in 1981 as Zantac ® and Sostril ®. Compared to
cimetidine, ranitidine was 4–5-times more efficacious in humans and had the
advantage that it was more selective. In 1987 ranitidine overtook cimetidine. In
1994 with US $4 billion in sales, it became the most economically successful drug
in annual sales at that time. Within a few years, Glaxo was catapulted to the
pinnacle of the world rankings of pharmaceutical corporations. Glaxo used this
opportunity. The research of this company and its strategy in drug development
are among the finest in the industry today. Through mergers with and acquisitions of
competitors, Glaxo, or “GSK” as it is known today, has become one of the largest
pharmaceutical corporations in the market.
In the meantime, an antitumor effect in colon, gastric, and renal cancer has been
reported for cimetidine. Apparently it suppresses the tumor-mediated interleukin-1-
induced selectin activation (▶ Sect. 31.3).
It is understandable from the chemical structure that cimetidine has a high
affinity for cytochrome P450 enzymes, particularly CYP 3A4 (▶ Sect. 27.6). As
a consequence, interactions with other drugs that depend on CYP 3A4 for metabolism
are common. What was first seen as an indispensable imidazole moiety in
3.54 blocks the catalytic iron center in the P450 enzymes. Ranitidine 3.58 carries
a furan ring in the same position and lacks the P450 inhibition. After cimetidine and
ranitidine, very few other drugs have made their way to the market. Nizatidine 3.59
and famotidine 3.60 contain a thiazole ring as a heterocycle (Fig. 3.13). In 3.60, the
electron-withdrawing group of the guanidine moiety is replaced by a sulfonamide
group.

Fig. 3.13 The lead structures 3.55–3.57 were steps on the way to ranitidine 3.58, which in the 1980s was the economically most important drug. Nizatidine 3.59 and famotidine 3.60 represent newer developments. Omeprazole 3.61 is a proton pump inhibitor. Structures shown: 3.55; 3.56 AH 18665 (X = S); 3.57 AH 18801 (X = N-CN); 3.58 ranitidine (X = CH-NO2); 3.59 nizatidine; 3.60 famotidine; 3.61 omeprazole.

It is true even for the H2 blockers that good drugs are replaced by better ones.
Once stimulated to secrete acid, the cells use an H+/K+-ATPase, an
enzyme that pumps protons out of the cell in exchange for potassium at the cost of
energy. If “the faucet is turned off” at this step, not only the histamine-induced acid
production but also the acetylcholine- and gastrin-mediated acid production is
stopped. Omeprazole 3.61 was developed as a prodrug that, upon
rearrangement, acts as an irreversible inhibitor of this proton pump (▶ Sect. 9.5).
The effect of omeprazole therefore lasts longer, and the reduction in acid secretion
is stronger than with the H2 antagonists. Gastric and duodenal ulcers heal more
quickly and reliably. These substances also hit it big. At the end of the last century,
Losec ®, Antra ® (both from Astra), and Prilosec ® (Merck & Co., USA) had com-
bined global sales of over US $6 billion despite the fact that they were introduced to
the market much later than ranitidine. The enantiomerically pure form
esomeprazole (Nexium ®) even reached US $7 billion in sales in 2007.
That is not even the end of the story. Although in principle it had been known
since 1983, the relevance of the bacterium Helicobacter pylori for the etiology of
ulcers was first discussed in 1994 at a conference of the US National Institutes of
Health (NIH). This bacterium infects a large portion of the population in childhood.
Frequently it is spread within a family; a kiss can be enough to infect someone. It
causes gastrointestinal damage in a portion of those infected, which can lead to an
ulcer. In the meantime it is held responsible not only for ulcers but also for at least
two different forms of gastric cancer. It survives assault by many antibacterial
agents as well as the acidic milieu of the stomach. It has a urease that releases
ammonia in its immediate vicinity, which in turn neutralizes the gastric acid.
The drugs of choice to treat such infections are combinations of H2 blockers,
proton pump inhibitors, and antibiotics. H. pylori seems to quickly develop antibiotic
resistance, though. Since the beginning of 1995 the first animal model has been
available, a mouse with a sustained H. pylori infection; this should promote further
research in this important area. There is a vaccine currently in development.
A portion of the vaccinated patients mounted enough of an immune response to
defend themselves against the bacteria. For practical use, however, its reliability must
be improved. Perhaps in the foreseeable future we will have an ulcer therapy that is
completely different, for instance, a swallowed vaccine that delivers life-long
protection. The revolution is in sight: a one-time treatment without repeated
gastroscopy. The patients will be delighted. Others will see this dramatic change
in therapy with mixed emotions.

3.6 Synopsis

• Even though the period of classical drug research was strongly governed by trial
and error, it has been exceptionally successful. Many leads were found by
accident or from traditional medicine, though limited knowledge of pathophys-
iology or molecular disease etiology was available.
• Acetylsalicylic acid or Aspirin ® is one of our oldest but also most prototypical
drugs. Originating from bark extracts and chemically modified to improve taste
and tolerance, it achieves its actual potency and mode of action by irreversibly
inhibiting cyclooxygenase.
• Since then two isoforms of cyclooxygenase have been characterized: one is
constitutively present, and the other is induced in inflamed tissue.
Acetylsalicylic acid inhibits both unselectively, giving rise to some undesirable
side effects.
• Due to irreversible inhibition of COX in platelets, Aspirin influences
the ratio of synthesized thromboxane to prostacyclin, which lowers the
blood's tendency to coagulate. As a consequence, Aspirin is
recommended as “preventive medicine” to protect against thrombosis or to
reduce mortality from heart attacks.
• Malaria is a widespread tropical/subtropical disease transmitted by the Anopheles
mosquito and caused by the Plasmodium parasite, which invades erythrocytes in
humans. The disease had been nearly eradicated by fighting the mosquito with
the insecticide DDT. One of the oldest active substances to hit the parasite is
quinine, isolated from cinchona bark.
• After DDT spraying against the mosquitoes was stopped, malaria raged again. Increasing
resistance of the parasite to known drugs occurred, and the development of new
chemotherapeutics for malaria has been a rollercoaster ride of promising compounds
followed by the emergence of resistant parasites.
• Morphine, isolated from poppies, is used as the unchanged natural product and is
a potent analgesic. When administered correctly, the risk of addiction is low. Its
complex structure of five fused rings has been simplified and cut into pieces to
give more easily accessible analogues with higher selectivity.
• Cocaine, which is the active ingredient in coca leaves, is one of our oldest drugs.
Its euphoric effect is achieved by displacing dopamine from its transporter in the
synaptic cleft. The cocaine structure served as a lead structure for
the development of anesthetics.
• Ulcer therapy went through several phases of drug development leading to active
substances with increasingly efficient mode of action to reduce production of
gastric acid.
• Starting with antacids and rather unspecific anticholinergics, selective H2 antagonists
were developed as a real breakthrough in the purely pharmaceutical treatment
of ulcers. They act upon the H2 receptor, a member of the G protein-coupled
receptors (GPCRs). A protein that pumps protons for acid release is stimulated
through these receptors. Proton-pump inhibitors such as omeprazole directly
block the function of the proton-secreting H+/K+-ATPase that builds up acidic
milieu.
• The bacterium Helicobacter pylori causes gastrointestinal damage leading
to ulcers. It can be eradicated by a combination of a proton-pump inhibitor
with an antibiotic. A vaccine against the bacterium could deliver life-long
protection.

Bibliography

General Literature

Burger A (1983) A guide to the chemical basis of drug design. Wiley, New York
Ryan J, Newman A, Jacobs M (eds) (2000) The pharmaceutical century. Ten decades of drug
discovery. American Chemical Society, Washington, DC, Supplement to ACS Publications
Sneader W (1996) Drug prototypes and their exploitation. Wiley, Chichester
Sneader W (2005) Drug discovery. A history. Wiley, Chichester
Verg E (1988) Meilensteine: 125 Jahre Bayer, 1863–1988. Bayer AG, Leverkusen

Special Literature
Aspirin – eine unendliche Geschichte, Research. Das Bayer-Forschungsmagazin, Issue 6, S. 4–21
(1992) and other articles in this magazine
Battistini B, Botting R, Bakhle YS (1994) COX-1 and COX-2: toward the development of more
selective NSAIDs. Drug News Perspect 7:501–512
Kelce WR et al (1995) Persistent DDT metabolite p,p'-DDE is a potent androgen receptor
antagonist. Nature 375:581–585
Patrono C (1989) Aspirin and human platelets: from clinical trials to acetylation of cyclooxygenase
and back. Trends Pharmacol Sci 10:453–458
Schlitzer M (2007) Malaria chemotherapeutics part I: history of antimalarial drug development,
currently used therapeutics, and drugs in clinical development. ChemMedChem 2:944–986
Wiesner J, Ortmann R, Jomaa H, Schlitzer M (2003) New antimalarial drugs. Angew Chem Int Ed
Engl 42:5274–5293
4 Protein–Ligand Interactions as the Basis for Drug Action

To purposefully design an active substance the following questions must first be
answered: How does a drug act? How does Aspirin ® relieve headaches? Why do
β-blockers lower blood pressure? Where does a calcium channel blocker act? How
does cocaine work? How do sulfonamides prevent the proliferation of bacterial
pathogens? An active substance must bind to a very special target molecule in the
body to exert its pharmacological action. Usually this is a protein, but nucleic acids
in the form of RNA and DNA can also be target structures for active molecules. An
important prerequisite for the binding is that the active substance has the correct
size and shape to fit into a cavity on the surface of the protein, a binding pocket, as
well as possible. Furthermore, it is also necessary that the surface properties of
ligand and protein fit together so that specific interactions can form. In 1894, Emil
Fischer compared the exact fit of a substrate for the catalytic center of an enzyme to
the picture of a lock and key. In 1913, Paul Ehrlich formulated the principle Corpora non
agunt nisi fixata, which literally translated means “bodies do not act if they are not
bound.” With this he wanted to express that drugs that are meant to kill bacteria or
parasites must be “fixed,” that is, bound by certain structures. Both concepts form
the starting point for rational drug research. In the broadest sense, they are valid
even today. After being taken, a drug must arrive at its target tissue and enter into
interactions with biological macromolecules there. Specific active substances have
a high affinity to a binding site on these macromolecules and are adequately
selective. It is only in this way that the desired biological effect can be deployed
without extensive side effects.
The most important terms that have to do with the modes of action of drugs are
briefly defined in Table 4.1. These terms are described in ▶ Chaps. 23,
“Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic
Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”; ▶ 26,
“Transferase Inhibitors”; ▶ 27, “Oxidoreductase Inhibitors”; ▶ 28, “Agonists
and Antagonists of Nuclear Receptors”; ▶ 29, “Agonists and Antagonists of
Membrane-Bound Receptors”; ▶ 30, “Ligands for Channels, Pores, and Transporters”;
▶ 31, “Ligands for Surface Receptors”; and ▶ 32, “Biologicals: Peptides,
Proteins, Nucleotides, and Macrolides as Drugs” in detail with examples of target

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_4, © Springer-Verlag Berlin Heidelberg 2013

Table 4.1 Brief definitions of the most important terms

Term                   Definition
Ligand                 A (usually small) molecule that binds to a biological macromolecule
Enzyme                 An endogenous biocatalyst that can transform one or more substrates into one or more products
Substrate              A ligand that is a starting material for an enzymatic reaction
Inhibitor              A ligand that prevents the binding of a substrate either directly (competitive) or indirectly (allosteric), reversibly or irreversibly
Receptor               A membrane-bound or soluble protein (or a protein complex) that initiates an effect after binding an agonist
Agonist                A receptor ligand that exhibits an intrinsic effect, that is, it causes a receptor response
Antagonist             A receptor ligand that either directly (competitive) or indirectly (allosteric) prevents the binding of an agonist
Partial agonist        A weak agonist that has a high affinity to the binding site, and in this manner acts as an antagonist
Inverse agonist        A ligand that stabilizes the inactive conformation of a receptor or ion channel
Functional antagonist  A substance that prevents a receptor response by another mode of action
Allosteric effector    A ligand that influences the function of a protein by causing a change in the 3D structure of the protein
Ion channel            A pore in a protein that allows specific ions to flow in and out across the cell membrane along a concentration gradient. Opening and closing is affected by binding a ligand or by a membrane potential change
Transporter            A protein that transports molecules or ions across the cell membrane against the concentration gradient by consuming energy
Antimetabolite         A substance that interferes with the biosynthesis of a central metabolic product either as a false substrate or as an inhibitor

structures. Drugs often act as inhibitors of enzymes or as agonists or antagonists
on receptors. Enzyme inhibitors and receptor antagonists occupy a binding site
and prevent the substrate or endogenous ligand from docking there. Agonists
exhibit an additional quality, a so-called intrinsic effect: the receptor adopts
a three-dimensional structure that triggers a response in a downstream process.
Although ion channels, pores, and transport systems are also receptors in the
broadest sense, they are considered as a separate group. Often the term “receptor” is
used rather loosely as a general term for any biological macromolecule that
interacts with a drug.
Biomolecules communicate frequently with one another by the recognition and
formation of large common surface contacts. It is over these contacts that the
primary attack and entry of viruses, bacteria, and parasites into the host cell take
place. Many cells receive a signal via surface receptors upon binding
a macromolecule. Even the rolling behavior of leukocytes in the vasculature is
governed by such surface receptors. These systems are increasingly being tapped
for drug therapy (▶ Chap. 31, “Ligands for Surface Receptors”) in that
active macromolecular substances, so-called biologicals or biopharmaceuticals
(▶ Chap. 32, “Biologicals: Peptides, Proteins, Nucleotides and Macrolides as
Drugs”), are more often finding application as therapeutics in our pharmaceutical
arsenal.

4.1 The Lock-and-Key Principle

In the early 1880s, Emil Fischer used different enzymes to investigate the cleavage of
glucosides that differed only in the stereochemistry of the glycosidic carbon
atom. He noticed that particular glucosides could only be cleaved with one group of
enzymes. Other glucosides, on the other hand, could only be cleaved with another
group of enzymes. He drew the correct conclusions from his observation and in
1894 formulated them in an article in the Berichte der Deutschen Chemischen
Gesellschaft (Reports of the German Chemical Society):
The limited effect of enzymes on the glucosides can also be explained by the assumption that
a chemical process can be initiated only by those [enzymes] that have a similar geometric
construction that approximates that of the molecule [substrates]. To use a picture, I want to
say that enzymes and glucosides must fit together like a lock and key to be able to exert
a chemical effect upon one another. This idea has gained plausibility and value for
stereochemistry research after the phenomenon was transferred from the biological to the
chemical field.

In the same year he refined this picture:


Apparently here the geometrical construction exerts such a large influence on the play of
chemical affinities that the comparison of the two molecules undergoing an interaction
seems to me to be comparable to a lock and key. If the fact that some yeasts can ferment
a larger number of hexoses than others is to be explained, the picture can be completed by
differentiating between master and special keys.

Emil Fischer did not pursue this image any further, and later even complained that it
is often quoted out of context. The configuration of the sugars interested him, that of
the isomeric glucosides did not. He expressed a rather distanced attitude to purely
theoretical considerations. In 1912, he wrote in a letter “I myself take not so much
pleasure in theoretical things.” This is remarkably modest for a man who exerted such
a great influence with his image of a lock and key! Emil Fischer would have certainly
been pleased and proud if he had seen the results of the X-ray structural analysis of
protein–ligand complexes, for instance, of retinol (vitamin A) bound to the retinol-
binding protein, which is the transport protein for this molecule (Fig. 4.1).
Fig. 4.1 Like a key in a lock, vitamin A (retinol) fits into the binding pocket of its transport protein. The surface of the ligand is green. The protein residues in the direct vicinity of the binding pocket are visible. To improve the clarity, the back of the binding site and the residues in front of the binding site have been omitted.

Many binding sites can discriminate with exceeding specificity between analogues
that are chemically closely related. Even the smallest mishap must not occur in
protein biosynthesis. Friedrich Cramer more closely investigated the recognition
mechanism for the incorporation of the amino acids valine and isoleucine. These
amino acids differ in their side chains only in that a methyl group is exchanged
for an ethyl group. The smaller valine residue should easily fit into the “lock” for
an isoleucine, though it might not bind as strongly. A clear distinction, which is
absolutely necessary for an error-free protein synthesis, can only occur through
repeated recognition. Indeed, that is the case. An energy-consuming, iterative, and
“skeptical” auditing process reduces the error quotient to less than 1:200,000.
Because of this harsh feedback and control process, even the correct binding partner
is sometimes unsuccessful. Over 80% are rejected as being “dubious.” The result is
a process with an accuracy of about 1:40,000.
The retinol-binding protein is less selective. In this case, such extreme precision is
apparently not necessary for flawless functioning. In addition to the “stretched” retinol
isomer, the “folded” retinol isomer and chemically related substances also bind to the
protein. Other proteins discriminate very little. Examples of less-selective proteins
include digestive enzymes (▶ Sect. 23.3), metabolic enzymes (e.g., cytochromes;
▶ Sect. 27.6), and glycoprotein GP 170, which is responsible for the drug resistance
of tumor cells (▶ Sect. 30.7). A bacterial transport protein, oligopeptide-binding
protein A, can bind any peptide with two to five amino acids with approximately the
same affinity; this represents an extreme case of “chemical promiscuity.”
Linus Pauling translated the “lock-and-key” principle to the transition states of
enzymatically catalyzed reactions. A flexible adaptation often occurs during the
binding of the substrate. The transition state of the reaction binds more strongly to
the enzyme than either the substrate or the product (▶ Sect. 22.3) and is stabilized
by the functional groups of the binding site. The “lock-and-key” principle has been
repeatedly challenged because of the mobility of the ligand in the binding site; but
even with a high-security lock, the pins are still mobile and play an essential role in
the mechanism.
In the 1950s, Daniel E. Koshland proposed the theory of “induced fit,” which says
that the ligand induces a conformational change in the protein by binding to it. The
theory works under the assumption of a particular effect, for instance, the enzymatic
cleavage of the substrate. This mechanism does not contradict the lock-and-key
principle because, as previously stated, even a high-security lock has mobile parts.
Small, induced adaptations play an essential role in the ligand–receptor complex.
Even the relocation of entire protein domains has been observed. As a rule,
the adaptability of the protein is related to its function. Proteins often have to be
adequately flexible to fulfill their biological functions.
For the rational design of ligands, there are two fundamentally different starting
points that differ in the informational content of the system. Either the exact
three-dimensional structure of the binding site is known or it is unknown. In the first
case, the lock is known, and the key “only” has to be cut (▶ Chap. 20, “Protein
Modeling and Structure-Based Drug Design”). In the other case, the active and
inactive analogues represent the fitting and ill-fitting keys. It is through the comparison
of the keys and systematic variations that better-fitting keys can be designed
(▶ Chap. 17, “Pharmacophore Hypotheses and Molecular Comparisons”). In the
following section, the binding of a low-molecular-weight drug (“ligand”) and
a macromolecular receptor will be more precisely illuminated. These target
structures for drugs can be outside or inside the cell, or they can be embedded in the
cell membrane. Therefore we will briefly address the construction and function of the
cell membrane before the protein–ligand interaction is brought to the foreground.

4.2 The Essential Role of the Membrane

The majority of biological processes in our body take place inside cells. These cells
are surrounded by a membrane that protects the cellular content from “leaking”.
The membrane also hinders undesirable xenobiotics from entering the cell and
mediates the contacts between cells. Membranes are also found within the cell,
where they form substructures (so-called compartments) and separate individual
cellular components from one another. In mammalian cells, the outer membrane is
made up of a lipid double-layer, in which proteins and cholesterol molecules are
embedded (Fig. 4.2). All molecules can move relatively freely; the arrangement is therefore called
a “fluid mosaic membrane.”
Lipid membranes of this type function as barriers for polar substances and as
permeable layers for non-polar molecules. The importance of membranes for the
transport and distribution of drugs is presented in detail in ▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”. Here,
only the important function that the lipid membrane has for the activity of
drug molecules is discussed. Membrane-embedded proteins belong to entirely
different classes. Among them are the membrane-anchored and membrane-residing
enzymes, the large class of G protein-coupled receptors (▶ Chap. 29, “Agonists and
Antagonists of Membrane-Bound Receptors”), ion channels, pores and transporters
(▶ Chap. 30, “Ligands for Channels, Pores, and Transporters”), and surface recep-
tors (▶ Chap. 31, “Ligands for Surface Receptors”).
Due to the phosphate and ethanolamine head groups, both of the outer layers
of the lipid double layer are very polar. The alkyl chains are found on the inside,
where the membrane is non-polar. Many drugs are also non-polar and accumulate
here in higher concentration than in solution. Amphiphilic (soap-like) molecules,
that is, substances that have both non-polar and polar character, arrange themselves
in the membrane so that the non-polar portion is on the inside (Fig. 4.2).

Fig. 4.2 Membranes from mammalian cells are constructed from a lipid double layer, in which
proteins (yellow) and individual cholesterol molecules (black) are embedded. The individual lipid
molecules (orange) point their polar groups to the exterior of the membrane, and their alkyl chains
to the interior. Therefore polar drugs (light blue) accumulate on the outside of the membrane.
Non-polar drugs (red) are enriched in the interior of the membrane. Amphiphilic drugs (violet) are
oriented into the membrane according to their structure. Despite this, all of the molecules can
move relatively freely. Therefore this is called a “fluid mosaic membrane”.

This orientation within the membrane plays a particularly important role when the
polar group is a positively charged nitrogen atom that can form additional electro-
static interactions with the phosphate group of the lipids.
In the meantime, this concept has been proven experimentally with numerous
independent methods. For many receptors it is accepted that the ligand binds at
a site in the protein that is only accessible from the inner layer of the membrane
(e.g., lipases, ▶ Sect. 23.7; or cyclooxygenases, ▶ Sect. 27.9). Therefore the
enrichment and arrangement of an active molecule in the membrane plays an
important role for the optimal approach to the binding site. If the molecule, on
the other hand, assumes an incorrect orientation, its docking is hindered.

4.3 The Binding Constant Ki Describes the Strength of Protein–Ligand Interactions

The binding of a ligand to its target protein is measurable. The extent of the binding
is characterized by the binding constant (Eq. 4.1). By definition, the dissociation
constant Kd is the reciprocal of the association constant Ka. With enzymes,
the so-called inhibition constant Ki is determined in a kinetic assay (▶ Sect. 7.2). At
low substrate concentration, it corresponds to the inhibitor concentration that is
necessary to reduce the rate of an enzyme reaction by one half. Although Ki is
therefore not defined exactly as a dissociation constant, the two quantities are
usually used interchangeably. In the following, the abbreviation Ki is used
in the same sense as a dissociation constant, which indicates the strength of the
interaction between protein and ligand. It is a thermodynamic equilibrium measure
that indicates what portion of the ligand is bound to the protein, on average. The law
of mass action can be expressed as:

\[ K_i = \frac{[\text{ligand}] \cdot [\text{protein}]}{[\text{ligand} \cdot \text{protein complex}]} \tag{4.1} \]

Ki has the dimensions of a concentration with the units of mol/L (M). The smaller
the Ki value is, the more strongly the ligand binds to the protein. If the concentration
of the ligand is significantly lower than Ki, only a very small portion of the protein
molecules are occupied by ligand molecules. A biological effect like that of the
inhibition of an enzyme cannot be observed. If the ligand concentration is equivalent
to Ki, half of the available protein molecules are occupied by a ligand. The Gibbs free
energy can be derived from the binding constants by a thermodynamic relationship
(which is valid for equilibria under so-called standard conditions; Eq. 4.2):

\[ \Delta G = RT \ln K_i \tag{4.2} \]

in which R is the gas constant, and T is the absolute temperature in Kelvin.
A binding constant of Ki = 10⁻⁹ M = 1 nM, which is a respectable value for an
active substance, corresponds to a Gibbs free energy of −53.4 kJ/mol at body
temperature. A change in Ki of one order of magnitude means a change in the Gibbs
free energy of 5.9 kJ/mol (or 1.4 kcal/mol).
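
The relationships in Eqs. 4.1 and 4.2 can be illustrated with a short numerical sketch. It assumes simple 1:1 binding and a ligand present in large excess over the protein, so that the free ligand concentration is approximately the total concentration; under this assumption Eq. 4.1 rearranges to an occupancy of [L]/([L] + Ki). The function names and the choice of 310.15 K for body temperature are illustrative assumptions and do not come from the original text.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def fraction_bound(ligand_conc: float, ki: float) -> float:
    """Fraction of protein occupied by ligand (both arguments in mol/L).

    Rearranged from Eq. 4.1 for 1:1 binding:
    [PL] / ([P] + [PL]) = [L] / ([L] + Ki), with [L] the free ligand concentration.
    """
    return ligand_conc / (ligand_conc + ki)


def gibbs_free_energy(ki: float, temperature: float = 310.15) -> float:
    """Gibbs free energy of binding in J/mol according to Eq. 4.2 (Ki in mol/L).

    The default temperature of 310.15 K (37 degrees C) is an assumed body temperature.
    """
    return R * temperature * math.log(ki)


if __name__ == "__main__":
    ki = 1e-9  # 1 nM, the example value used in the text
    for factor in (0.01, 1.0, 100.0):
        occupancy = fraction_bound(factor * ki, ki)
        print(f"[L] = {factor:g} x Ki: occupancy = {occupancy:.3f}")
    print(f"dG at Ki = 1 nM: {gibbs_free_energy(ki) / 1000:.1f} kJ/mol")
    step = gibbs_free_energy(10 * ki) - gibbs_free_energy(ki)
    print(f"Change per order of magnitude in Ki: {step / 1000:.1f} kJ/mol")
```

With these assumptions the sketch reproduces the values quoted above: half occupancy at a ligand concentration equal to Ki, about −53.4 kJ/mol for Ki = 1 nM, and about 5.9 kJ/mol per order of magnitude in Ki.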
Frequently, instead of a Ki value, a so-called IC50 value is given. In contrast to
the Ki value, the IC50 value depends on the concentration of the enzyme and the
substrate. The obtained value is affected by the affinity of the substrate for the
enzyme as substrate and inhibitor compete for the same binding site. The IC50
value can be transformed into a Ki value by use of the Cheng-Prusoff equation.
Experience has shown that both values in the first approximation run parallel to one
another so that the more easily determined IC50 value is well suited to characterize
a ligand in comparison to other compounds.
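
The conversion mentioned above can be sketched as follows. The form used here, Ki = IC50/(1 + [S]/Km), is the Cheng–Prusoff relation for a competitive inhibitor, where [S] is the substrate concentration in the assay and Km is the Michaelis constant of the substrate; other inhibition mechanisms require other forms. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
def ki_from_ic50(ic50: float, substrate_conc: float, km: float) -> float:
    """Convert an IC50 value into Ki for a competitive inhibitor.

    Cheng-Prusoff relation: Ki = IC50 / (1 + [S] / Km).
    All three quantities must be given in the same concentration unit.
    """
    if km <= 0:
        raise ValueError("Km must be positive")
    if substrate_conc < 0:
        raise ValueError("substrate concentration must not be negative")
    return ic50 / (1.0 + substrate_conc / km)


if __name__ == "__main__":
    # Hypothetical assay: IC50 = 2 uM measured at [S] = Km = 100 uM
    ki = ki_from_ic50(ic50=2e-6, substrate_conc=1e-4, km=1e-4)
    print(f"Ki = {ki:.2e} M")
```

At vanishing substrate concentration the correction factor approaches 1 and the IC50 value approaches Ki, in line with the definition of Ki given at the start of this section. The higher the substrate concentration relative to Km, the more the measured IC50 value overestimates Ki.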
Why is the Gibbs free energy used to describe the energetic relationships upon
complex formation? In chemistry and biology, processes run in open systems under
atmospheric pressure. Because the volume of the environment is enormous, it can be
assumed that the external pressure is unchanged even in processes in which produc-
tion of gas occurs. Therefore these processes are considered to be under constant-
pressure conditions. Nonetheless, a gas that was formed in the reaction must first find
space in the surrounding particles in the air. Therefore work must be performed. This
so-called pressure–volume work diminishes the maximum possible work to be
achieved by the system (internal energy, ΔU). The energy diminished by the
pressure–volume work is referred to as the enthalpy (ΔH). It is therefore the energy
converted during a process, corrected by the portion of the pressure–volume work.
The change in enthalpy is not the entire answer as to why a particular process,
such as the formation of a protein–ligand complex, spontaneously occurs. If we take
a hot and a cold chunk of metal and bring them into contact, everyone knows that
the heat will flow from the hot metal to the cold one. The opposite cannot be
observed, even though the energy content of the entire system would remain
unchanged for this process. Why does energy spontaneously flow from a hot to a
cold object and not the other way around? This has something to do with the tendency
of all natural processes to distribute energy evenly. In a hot metal block, the metal
atoms vibrate very strongly around their resting positions. Therefore the piece of
metal is hot. Some vibrational degrees of freedom are strongly activated. If the cold
metal block is brought into contact with the hot metal, these vibrations are transmitted.
In the end, the metal atoms in both blocks vibrate around their resting positions, but on
average not as vigorously as the atoms in the hot block moved before. The sum of the
energy content has remained constant; it is, however, now distributed over many more
degrees of freedom. The system can be described as having gone into a more disor-
dered state (many more atoms are now vibrating on average than in the beginning).
This happens for all spontaneously occurring processes. The entropy, S, is used as
a measure to describe the uniform distribution or random disorder. To correctly
describe the formation of a protein–ligand complex (Eq. 4.3), we need not only the
enthalpy (ΔH) that is exchanged between the two binding partners; how the
distribution over the degrees of freedom changes, and whether the system migrates
into a more disordered state, must also be considered. Therefore the term free energy
(ΔG) has been introduced because it considers not only the energy balance of the
process. It also considers the changes in entropy (TΔS) that reflect the spontaneous
distribution of energy over the degrees of freedom of the system. Spontaneously
occurring processes are characterized by a negative value for ΔG.

$$\Delta G = \Delta H - T\,\Delta S \qquad (4.3)$$

As shown in Eq. 4.3, ΔG is composed of an enthalpic component, ΔH, and
an entropic component, TΔS. The entropic component is weighted with the
temperature. It matters a great deal whether the entropy in a system is changed at
low temperature, where all the particles are largely in an ordered state, or whether it
occurs at high temperature where the disorder is already very high. Because of the
negative sign, an increase in the entropy causes a decrease in ΔG, and therefore
an increase in the binding affinity.
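
As a numerical illustration of Eq. 4.3 and of the figure quoted for one order of magnitude in Ki, the following Python sketch may help. It assumes the standard relation ΔG = RT ln Ki, with Ki as a dissociation-type constant in mol/L relative to a 1 mol/L standard state, and a temperature of 310 K. Both the temperature and the function names are assumptions made for this sketch.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def delta_g_from_ki(ki_molar, temperature=310.0):
    """Gibbs free energy of binding in kJ/mol from a Ki given in mol/L."""
    return R * temperature * math.log(ki_molar) / 1000.0


def delta_g(delta_h, t_delta_s):
    """Eq. 4.3: combine the enthalpic and entropic terms (same unit for both)."""
    return delta_h - t_delta_s


# Free-energy difference for one order of magnitude in Ki at the assumed 310 K.
per_decade = delta_g_from_ki(1e-8) - delta_g_from_ki(1e-9)
print(f"Change per order of magnitude in Ki: {per_decade:.1f} kJ/mol")
```

At the assumed 310 K the difference comes out close to the 5.9 kJ/mol quoted in the text; at 298 K it would be slightly smaller.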

4.4 Important Types of Protein–Ligand Interactions

Organic molecules can bind to proteins by forming chemical bonds between ligand
and protein as well as through non-covalent interactions. For example, a chemically
modified product of omeprazole reacts with its target protein and forms a covalent
bond (▶ Sect. 9.5). In this section, we want to limit ourselves to ligands that bind to
the protein by forming non-covalent interactions. For the following discussion, it
is helpful to classify protein–ligand interactions into different categories. The
different types of interactions are summarized in Fig. 4.3.
Hydrogen bonds (H-bonds) are very frequently observed between protein and
ligand. The proton-carrying partner in a biological system is usually an NH or
OH group, which is termed a hydrogen-bond donor. The opposite group is an
electronegative atom with a partial negative charge and is termed a hydrogen-bond
acceptor.

Fig. 4.3 Frequently occurring protein–ligand interactions. Important polar
interactions are hydrogen bonds and ionic interactions. Metalloproteases contain zinc
ions as a cofactor, the interaction of which with a ligand often yields important
contributions to the binding affinity. Non-polar parts of the protein and ligand
contribute hydrophobic interactions. Because of the particular electron distribution in
aromatic rings, the interaction between unsaturated ring systems is particularly large.
Panels: hydrogen bonds; ionic interactions (salt bridges); metal complexation;
hydrophobic interactions; cation–π interactions.

Examples of hydrogen-bond acceptors are oxygen and nitrogen atoms. Hydrogen
bonds are predominantly electrostatic interactions. They achieve their extraordinary
strength because the hydrogen atom of the donor group is bound to a strongly
electronegative atom, whereby the electron density of the hydrogen atom is
shifted to the neighboring atom. The sphere of influence of the hydrogen
atom effectively becomes smaller. This, in turn, allows the acceptor to come closer
to the proton than the sum of the van der Waals radii should actually allow.
The electrostatic attraction between the partners therefore becomes larger. The
geometry of an H-bond is shown in Fig. 4.4. A hydrogen bond is characterized
by a pronounced distance and angle dependence. It is directional; its geometry is
defined within narrow limits.
It is often found that the charged groups of the ligand bind to the oppositely
charged groups on the protein. Such ionic interactions (also known as salt bridges)
are particularly strong when the two groups are separated by 2.7–3.0 Å from one
another. Frequently an ionic interaction overlaps with a hydrogen bond. This is
called a charge-assisted hydrogen bond. We will see that in many protein–ligand
complexes, the association is determined in large part by such ionic interactions.
A few proteins contain metal ions as cofactors, for example, Zn2+ in metallo-
proteases (▶ Chap. 25, “Inhibitors of Hydrolyzing Metalloenzymes”). It is often

Fig. 4.4 Geometry of a hydrogen bond. The atoms N, H, and O adopt an almost linear orientation
to one another. The N···O distance is between 2.8 and 3.2 Å. The angle N–H···O is practically
always larger than 150°. A large variability is observed for the C═O···H angle. It is typically
between 100° and 180°.

the attractive interactions between the metal ion and the opposite charge in the
ligand that make a decisive contribution to the affinity in these structures.
Furthermore, there are a few groups that are particularly well suited to forming
complexes with transition metals. Among these are the thiols RSH, hydroxamic
acids RCONHOH, acid groups, and many nitrogen-containing heterocycles.
Whether the charge can increase the affinity contribution of hydrogen bonds
depends strongly on the protonation state in which the involved functional groups
are found. Drugs are usually weak acids or bases, that is, they contain so-called
titratable groups (▶ Sect. 19.4). Whether these groups, for example, a carboxylic
acid, an acidic sulfonamide, or a nitrogen-containing heterocycle, can release
or accept a proton and transform into a charged state depends strongly on the pH.
The same can happen with functional groups of the acidic or basic amino acid
residues. Then these groups can form charge-assisted hydrogen bonds that provide
a higher contribution to the binding affinity (Sect. 4.8).
The pKa value is considered to estimate whether a group is in the protonated or
deprotonated state. It indicates at which pH value the two forms, which are in
equilibrium with one another, are present in equal amounts. The situation might
become even more complicated because the pKa value can be shifted by the local
environment. In a hydrophobic environment, adopting a charged state is less
favorable for acidic and basic groups, that is, a shift to less acidic or less basic
character is the consequence. If an already-protonated, positively charged group in
the ligand faces an amino acid of the protein with the same charge, its protonation
becomes even more difficult to accomplish. The group therefore behaves as a weaker base.
The opposite is the case when putative positively charged basic groups bind in
a protein environment with a negative charge. Here, the charged state is even more
easily formed, which corresponds to having stronger basic character. Entirely
analogous considerations result for acidic groups, just with opposite signs.
Here a positively charged protein environment shifts acidic groups toward higher
acidity, and a negatively charged environment makes them behave less
acidic. In this way the protein environment can induce a significant pKa shift
of the titratable groups of the ligand. Uncharged H-bonds can become
charge-supported contacts that significantly contribute to the binding affinity
(▶ Sect. 21.9). With the help of electrostatic calculations an attempt can be made
to estimate the pKa shift upon complex formation (▶ Sect. 15.4).
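
How strongly a pKa shift changes the charge state can be illustrated with the Henderson-Hasselbalch relation. The following Python sketch is only an illustration: the pKa values, the pH, and the size of the shift are hypothetical numbers chosen for the example, and the function names are assumptions of this sketch.

```python
def fraction_protonated(pka, ph):
    """Fraction of a titratable group present in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def fraction_charged(pka, ph, is_base):
    """Charged fraction: the protonated form of a base, the deprotonated form of an acid."""
    protonated = fraction_protonated(pka, ph)
    return protonated if is_base else 1.0 - protonated


# Hypothetical basic group with pKa 9.0 in water at pH 7.4, and the same group
# after a hypothetical shift of its pKa to 6.5 in a hydrophobic pocket.
in_water = fraction_charged(9.0, 7.4, is_base=True)
in_pocket = fraction_charged(6.5, 7.4, is_base=True)
print(f"charged in water: {in_water:.2f}, charged in pocket: {in_pocket:.2f}")
```

At pH = pKa both forms are present in equal amounts, which is the definition given in the text; a shift of a few pKa units is enough to switch a group from predominantly charged to predominantly neutral.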

Fig. 4.5 Typical lipophilic groups in ligands are aliphatic and aromatic hydrocarbons,
halogen substituents, as well as non-polar heterocycles such as furan and thiophene.
Groups shown: iso-pentyl, cyclohexyl, phenyl, phenoxy, chlorophenyl, furanyl, thiophenyl.

Hydrophobic interactions form through the close proximity between non-polar
amino acid side chains of the protein and lipophilic groups on the ligand. Lipophilic
groups are aliphatic or aromatic hydrocarbon groups and also halogen substituents
(e.g., chlorine) and many heterocycles such as thiophene and furan (Fig. 4.5). All
areas that cannot form H-bonds or other polar interactions count as lipophilic parts
of the surface of a protein and ligand. In contrast to hydrogen bonds, hydrophobic
interactions are not directional. It does not matter in which relative orientation
the lipophilic groups are to each other. The interactions between aromatic rings, for
which there is indeed a preferred relative orientation, are an exception to this.
It has been shown that for ligands with large lipophilic groups hydrophobic
interactions often afford a significant contribution to the binding affinity. The
influence of direct attractive forces between the lipophilic groups is, however,
small. The hydrophobic interactions are mainly caused by the displacement, or
more correctly put, the liberation of water molecules from the lipophilic environ-
ment of the binding pocket. Moreover the ligand with its lipophilic substituents
leaves the bulk water phase in the vicinity of the protein. The solvent “cave”, in
which the ligand was hosted in water, collapses. This step is also coupled with
changes in the free energy. The role of water molecules is discussed in Sect. 4.6.
Yet another important interaction should be mentioned here. Obviously, quaternary
amines bind particularly well in binding pockets that are formed by the aromatic
side chains of the protein. This contact is largely based on the polarization interac-
tion between the positive charge and the electronic system of the aromatic rings.

4.5 The Strength of Protein–Ligand Interactions

When evaluating the strength of protein–ligand interactions, it is reasonable to first
consider the non-covalent interactions between isolated small molecules. Information
about these is available through quantum mechanical calculations (▶ Sect. 15.5) as
well as spectroscopic investigations. In this way, molecule pairs can be experimen-
tally investigated in the gas phase. The association energies that are obtained for the
molecules afford an impression about the strength of the direct interactions.
The influence of effects that originate from the liberation of the solvent water
(desolvation) is, of course, missing in such experiments. Some of these data are
summarized in Table 4.2.

Table 4.2 Experimental or quantum mechanically determined association energies
in the gas phase

Dimer                  Binding energy in kJ/mol
CH4···CH4              2
C6H6···C6H6            10
H2O···H2O              22
NH3···NH3              18
Na+···H2O              90
NH4+···CH3COO−         >400
Na+···Cl−              >400

The results show that electrostatic interactions are the dominating energetic
factor. The interaction between a cation and an anion in a vacuum is more than
400 kJ/mol. This corresponds to the strength of a covalent bond! The binding of an
ion pair in the gas phase is therefore much stronger than the typical protein–ligand
interactions in water that are summarized in Sect. 4.4.
Two water molecules bind to each other with 22 kJ/mol. This interaction is also
overwhelmingly of electrostatic nature in that the large dipole moment is
responsible for the strong binding. Interactions between small, non-polar molecules
are much weaker. Two methane molecules bind to each other with about 2 kJ/mol. This
is less than 10% of an H2O···H2O interaction. Correspondingly, methane boils at
about 112 K whereas water is a liquid at room temperature. The direct interactions
between polar groups are therefore orders of magnitude stronger than those between
non-polar groups.
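
The order of magnitude of the ion-pair energy, and the degree to which a polar medium attenuates it, can be estimated with Coulomb's law for two point charges. The sketch below is a deliberately crude model: the separation of 3.0 Å and the relative permittivity of 80 for bulk water are assumptions, and a continuum dielectric ignores the explicit structure of the solvent.

```python
COULOMB_CONSTANT = 1389.35  # e^2 * N_A / (4 * pi * eps0), in kJ * Angstrom / mol


def coulomb_energy(charge_1, charge_2, distance, dielectric=1.0):
    """Energy in kJ/mol of two point charges (units of e) at a distance in Angstrom."""
    if distance <= 0 or dielectric <= 0:
        raise ValueError("distance and dielectric must be positive")
    return COULOMB_CONSTANT * charge_1 * charge_2 / (dielectric * distance)


# Singly charged cation and anion at an assumed separation of 3.0 Angstrom.
in_vacuum = coulomb_energy(+1, -1, 3.0)
in_water = coulomb_energy(+1, -1, 3.0, dielectric=80.0)  # assumed permittivity
print(f"vacuum: {in_vacuum:.0f} kJ/mol, continuum water: {in_water:.1f} kJ/mol")
```

Even this simple estimate reproduces the trend: an interaction of several hundred kJ/mol in the gas phase shrinks to a few kJ/mol once the charges are embedded in a strongly polar medium.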

4.6 Blame It All on Water!

The data that were presented in the previous section could suggest that protein–
ligand interactions are mainly determined by H-bonds and ionic interactions.
All the more astonishing is the fact that the acetate ion, CH3COO−, does not
form a dimer with the guanidinium ion H2NC(═NH2+)NH2 in water. Likewise,
amides practically do not associate in water at all, even though hydrogen bonds
often occur between two amide groups in protein structures. How can that be? The
answer is: water is to blame for everything!
All biochemical reactions take place in water, and they only occur at all because
of this reason! The binding of a ligand to a protein occurs in an aqueous environ-
ment. At first, the “empty” binding pocket of the protein is filled with water. A few
water molecules form hydrogen bonds to the protein and are found in an energet-
ically favorable orientation. Other water molecules are in contact with lipophilic
areas on the protein surface and cannot build a perfect hydrogen-bond network.
The ligand is also solvated. When it diffuses into the binding pocket it displaces the
water molecules that are there and must additionally strip off its own solvation
shell. At the same time, the “cave” in which the ligand was situated in the water
phase collapses. Therefore not only are direct interactions between protein and
ligand formed; numerous H-bonds to water molecules are also broken.

Fig. 4.6 The influence of water molecules on the strength of protein–ligand interactions.
(a) Upon formation of an H-bond between protein and ligand, water molecules must be displaced.
These form hydrogen bonds to both protein and ligand prior to complex formation. The balance of
hydrogen bonds, that is, the number of H-bonds before and after binding, remains unchanged.
(b) Upon formation of hydrophobic contacts, water molecules are released from an environment
that was unfavorable for them into the bulk water phase. The number of H-bonds increases.

We want to consider the formation of a hydrogen bond as well as a lipophilic
contact between the protein and ligand more closely. Both processes are displayed
in Fig. 4.6. How are H-bonds formed between protein and ligand? Let us assume
that the polar groups of both partners are solvated. Then at least two water
molecules must be displaced to form an H-bond between the two partners.
The released water molecules can in turn form H-bonds with other water molecules.
In this way, exactly as many new H-bonds are formed as are broken. The total
number of H-bonds remains constant! The gain in free energy is determined by the
relative strength of the different H-bonds as well as the entropic contribution, which
is based on the change in the degree of order of the system (Sect. 4.7). The total
contribution to the free energy that results is difficult to quantitatively predict.
If a ligand manages to form more hydrogen bonds to the protein than were possible
with the solvent shell, very strong binding results. This is particularly the case if in
the binding pocket of the protein, the groups forming the polar H-bond are oriented
in a way that the water molecules alone cannot fully manage to satisfy all these
interactions. This is possible for the ligand because it has an optimal arrangement
of its donor and acceptor groups.
The formation of a hydrophobic contact also leads to the release of water
molecules (that were previously occupying the space) from the binding pocket.
Once released into the surrounding aqueous bulk phase, they form H-bonds with
each other (Fig. 4.6). Because previously H-bonds were possible neither to the
protein nor to the ligand, the total number of H-bonds now increases. Moreover, the
strength with which the water molecules were fixed in the binding pocket before
their release is decisive. If they were strongly fixed, the newly gained degrees of
translational freedom increase the disorder and therefore boost the entropy, which
is thermodynamically favorable for the free energy ΔG. If the displaced water
molecules were already severely disordered, their displacement causes very little
entropy gain. Newer findings have shown that the binding pocket does not need to
always be uniformly packed with water molecules. Narrow hydrophobic pockets in
particular are not perfectly solvated. This has consequences for the free energy
balance during binding because it is just this displacement of water molecules that
is decisive for the hydrophobic interactions.

4.7 Entropic Contributions to Protein–Ligand Interactions

In addition to the energetic contributions, the entropic component must also be
considered in the evaluation of the strength of protein–ligand interactions.
As described previously, the entropy S is a measure of the disorder of a system.
This allows an estimate to be made as to over how many degrees of freedom
a particular amount of energy is distributed. A degree of freedom can mean, for
example, a particular vibration of the system or a rotation of individual groups
around one another. A highly ordered system in which the energy is distributed over
only a few degrees of freedom has little entropy; increasing the disorder increases
the entropy and concomitantly decreases the free energy G.
At room or body temperature, proteins and ligands can move in all spatial
directions. Furthermore, a water shell is, of course, also mobile; the water mole-
cules diffuse back and forth. A few of them are spatially fixed for a longer period of
time because they are bound to the protein by several H-bonds. Such water
molecules can be identified by X-ray crystallography of the protein. A spatial
fixation of a molecule is entropically unfavorable. Other water molecules are freely
mobile and are therefore not captured in an X-ray crystal structure. Such water
molecules are in an entropically favorable state because their TΔS contribution is
more positive than for a spatially fixed water molecule.
The hydrophobic protein–ligand interaction is, in many cases, of an entropic nature,
above all, when individual, previously fixed water molecules are displaced from the
binding pocket and released into the surrounding bulk water. The entropic contribution
to protein–ligand interactions is therefore based not on direct interactions but rather on
how the number of degrees of freedom for the protein–ligand–water system changes
upon ligand–protein binding. The more water molecules are released from the hydro-
phobic environment, the greater the contribution to the binding affinity. The number of
released molecules is, in a first approximation, proportional to the size of the hydro-
phobic surface that is no longer accessible to water upon ligand binding, that is, in
other words, “buried.” Therefore this surface contribution often serves as a benchmark
for the estimation of the entropic portion.

Fig. 4.7 Illustration of the thermodynamic contribution to the free energy ΔG. Before binding, the
ligand can move freely; this gives rise to a certain translational and rotational entropy. Moreover,
the ligand is usually flexible, and adopts different conformations. Protein and ligand are solvated in
that H-bonds to water molecules are formed. Some water molecules are in loose contact with the
protein or the ligand without forming H-bonds. Translational and rotational degrees of freedom
are lost upon binding. The concomitant loss in entropy is unfavorable for the binding. Furthermore,
both the protein and the ligand must shed their water shells, which is also an unfavorable process for
the binding. The binding of the ligand leads to the formation of direct interactions to the protein and
it releases water molecules. Both of these are contributions that are favorable for the binding.
H-bonds are indicated by dashed lines and hydrophobic interactions by dotted lines.
Panels: receptor; ligand in solution; ligand–receptor complex. Labels: bound H2O molecules;
loosely associated H2O molecules; H2O molecules that can move freely in solution; free rotation.

In addition to the release of fixed water molecules, there is a further entropic
contribution to the binding energy. The association of a ligand to the protein
leads to a loss in translational and rotational degrees of freedom, and therefore to
a loss in entropy. Before the association, the ligand and protein move freely and
independently of one another. They each have three degrees of translational
and three degrees of rotational freedom. After binding, the protein and ligand rotate
and diffuse together so that three degrees of translational and three degrees of
rotational freedom are lost. Furthermore, a freely mobile, flexible ligand takes on
different conformations (▶ Chap. 16, “Conformational Analysis”) and is therefore
entropically favored. Once bound to the protein the ligand is restricted in its
conformational degrees of freedom to one or a few conformations that fit into the
binding pocket of the protein. It finds itself in an entropically unfavorable state.
Different enthalpic and entropic binding contributions are summarized in Fig. 4.7.
It is first assumed that the entropy TΔS contributes positively and the enthalpy
ΔH contributes negatively to ΔG. If the negative enthalpic contribution
overcompensates entropic losses, an overall negative ΔG results (cf. Eq. 4.3). In fact,
such enthalpy-driven binding is very frequently observed, but there are also known
cases, especially with large lipophilic ligands, in which the binding is entropy driven.
This means that the ligand binding is enthalpically unfavorable, but the effect is
overcompensated by the marked entropy increase, that is, ΔG is overall negative.
The entropy gain occurs, as mentioned, because of the release of fixed water
molecules. This, however, is not the only entropy contribution that changes upon
ligand binding. The protein changes too. For example, many side chains in proteins
are distributed over multiple conformational states. Upon binding a ligand, this
distribution can change. According to the total balance, the entropy can increase or
decrease through this change. The same is true for the rotation of side chains,
especially methyl groups. If the rotational behavior changes, the total entropy of the
ligand-binding process is influenced. The picture can even be complicated in that
some areas of the protein transform into a more ordered state, and others become
less ordered. In this way the entropic contribution is partially compensated. It is
often assumed that the changes in the entropic portion of the binding within a series
of very similar ligands are the same. Then such contributions can be neglected in
a relative comparison of ligands. Unfortunately, this simplified picture has proven
to be a fallacy. Just such an example is introduced in Sect. 4.10.

4.8 What Is the Contribution of a Hydrogen Bond to the Strength of Protein–Ligand Interactions?

Naturally, in any discussion about protein–ligand interactions, the question arises as
to how large the contribution of particular hydrogen bonds to the binding affinity
actually is. The question can be experimentally answered when two protein–ligand
complexes that differ by only one hydrogen bond are compared to one
another. Such a comparison is possible, for example, by using protein mutants in
which an amino acid that contributes an H-bond to the ligand is exchanged
for another amino acid that cannot do this. Alan Fersht conducted an elegant
experiment for the protein tyrosyl-tRNA synthetase in complex with the substrate
tyrosyl adenylate (Fig. 4.8). Numerous H-bonds are formed between the protein and
substrate, for example, between the phenolic OH group of tyrosine 34 and
the substrate. The mutant Tyr34 → Phe, in which tyrosine is replaced by a non-polar
phenylalanine, was prepared, and the binding of the substrate to the mutant
protein was tested. The binding was weakened by 2 kJ/mol. Analogously, other
mutants were investigated. The loss of a neutral H-bond led to a loss in binding
affinity between 2 and 6 kJ/mol. The H-bonds in which one partner is charged are
stronger. The mutation Tyr169 → Phe decreases the binding affinity by 15.6 kJ/mol.
Fidarestat 4.1 is a potent aldose reductase inhibitor (▶ Sect. 27.5). It forms
a hydrogen bond to the NH function of the amide group of Leu300 with its
carboxamide group (Fig. 4.9). If the leucine is exchanged for a proline, the
possibility of forming the H-bond is lost because proline has no free NH group.
This exchange means a loss in free energy of 7.8 kJ/mol. When the partitioning of
enthalpy ΔH and entropy TΔS is measured by microcalorimetry, it can be seen
that the H-bond loss is largely of an enthalpic nature (▶ Sect. 7.7). In comparison,
the inhibitor sorbinil 4.2, in which the carboxamide group is missing, should
be considered. Interestingly, the free energy of binding for the wild-type protein
and the Leu300 → Pro mutant is practically identical. Because the group to form the

Fig. 4.8 Numerous intermolecular hydrogen bonds are formed in the complex between
tyrosyl-tRNA synthetase and the substrate tyrosyl adenylate. The exchange of amino acid Tyr34
for Phe or Tyr169 for Phe leads to the situation that in each case the hydrogen bond can no
longer be formed. This results in a loss of binding affinity. Residues labeled in the figure:
Tyr34, Cys35, Gly36, Asp38, His48, Thr51, Tyr169, Asp176, Gly192, Gln195.

Fig. 4.9 Fidarestat 4.1 (left) forms a hydrogen bond with its carboxamide group to the NH
function of Leu300 (blue). By exchanging Leu for Pro (red), the H-bond can no longer be formed.
This leads to a ΔΔG loss of 7.8 kJ/mol, which is paid for mostly by the enthalpy (ΔΔH: 6.9
kJ/mol). The carboxamide group is missing in sorbinil 4.2 (right). The exchange leucine → proline
leaves the free energy of binding ΔΔG practically unchanged. Sorbinil, however, binds to the wild
type (leucine, blue) enthalpically more favorably and entropically less favorably than to the
proline mutant (red). An entrapped water molecule mediates an H-bond between sorbinil and
Leu300. This brings an enthalpic advantage to the wild type of about 5 kJ/mol. At the same time,
the entrapment of a water molecule is entropically disadvantageous for the wild type (−TΔΔS: 6
kJ/mol) and compensates the enthalpic advantage.
Values given in the figure for the exchange Leu300 → Pro:
4.1: ΔΔG: 7.8 kJ/mol, ΔΔH: 6.9 kJ/mol, −TΔΔS: 0.9 kJ/mol
4.2: ΔΔG: −0.8 kJ/mol, ΔΔH: 5.1 kJ/mol, −TΔΔS: −5.9 kJ/mol

Fig. 4.10 A plot of the binding constants Ki of 80 crystallographically investigated
protein–ligand complexes shows that Ki has no direct relationship to the number of hydrogen
bonds that exist between protein and ligand. Axes: −lg Ki versus the number of hydrogen bonds n.

H-bond with the NH group of Leu300 is missing in sorbinil, the loss of the NH
function in the protein is hardly noticeable. This explains the practically unchanged
free energy of binding. Nonetheless, the sorbinil complexes with the wild-type
protein and the mutant are different. The binding with the wild type is enthalpically
more favorable, but it is entropically more expensive than with the mutant.
The crystal structure indicates that a water molecule mediates an H-bond between
the ether group of sorbinil and the NH function of Leu300 (Fig. 4.9). This yields
an enthalpy gain of about 5 kJ/mol. At the same time, the uptake of water
is entropically disfavored. This contribution of nearly 6 kJ/mol just compensates
the enthalpic gain so that there is practically no affinity gain in ΔG in the balance.
The proline mutant cannot form a water-mediated contact to sorbinil because of the
missing NH function. Therefore the enthalpic gain from the H-bond is lost. There is
also no entropic loss from capturing a water molecule.
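
The enthalpy–entropy bookkeeping for the two inhibitors can be checked directly from the values shown with Fig. 4.9. The short Python sketch below only adds the two measured terms, ΔΔG = ΔΔH + (−TΔΔS); the dictionary layout and the names are assumptions of this sketch.

```python
# Changes on going from the wild type to the Leu300 -> Pro mutant, in kJ/mol,
# as given with Fig. 4.9.
MEASUREMENTS = {
    "fidarestat 4.1": {"ddH": 6.9, "minus_TddS": 0.9},
    "sorbinil 4.2": {"ddH": 5.1, "minus_TddS": -5.9},
}


def ddg(ddh, minus_tdds):
    """Free-energy change as the sum of the enthalpic and the entropic term."""
    return ddh + minus_tdds


results = {
    name: round(ddg(values["ddH"], values["minus_TddS"]), 1)
    for name, values in MEASUREMENTS.items()
}
for name, value in results.items():
    print(f"{name}: ddG = {value:+.1f} kJ/mol")
```

For fidarestat both terms point in the same direction, whereas for sorbinil they nearly cancel, which is the compensation described above.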
The three-dimensional structures of a large number of protein–ligand complexes
have been elucidated. Many of these complexes contain hydrogen bonds between
the protein and ligand. The entire issue of the contribution of hydrogen bonds
to the binding affinity becomes apparent in Fig. 4.10. Here the experimentally
determined binding constants for 80 protein–ligand complexes are plotted against
the number of hydrogen bonds. The measured binding constants spread over a
considerable range for a given number of hydrogen bonds. The contribution of
a single H-bond is therefore by no means constant, but rather it varies significantly.
An H-bond can even reduce the binding affinity because of an
unfavorable desolvation effect. If two ligands are compared that are only different

Table 4.3 Binding constants Ki for the thermolysin inhibitors 4.3, which contain either
a phosphonamide (X = ─NH─), a phosphonate (X = ─O─), or a phosphinate (X = ─CH2─)
group. The phosphonamide group ─PO2NH─ complexes the zinc ion and simultaneously forms an
H-bond with Ala113. (The structure drawing of 4.3 with Ala 113 and Zn2+ is not reproduced here.)

Binding constant Ki in µM

R          X = ─NH─    X = ─O─    X = ─CH2─
OH         0.76        660        1.4
Gly-OH     0.27        230        0.3
Phe-OH     0.08        53         0.07
Ala-OH     0.02        13         0.02
Leu-OH     0.01        9          0.01

in the functional group that forms the H-bonds with the protein, the affinity can
increase, remain the same, or even decrease.
An impressive example of the importance of hydrogen bonds is displayed by the
inhibitors 4.3 of the metalloprotease thermolysin, which were synthesized in the
research group of Paul Bartlett. There, a phosphonamide ─PO2NH─ was replaced
by a phosphinate ─PO2CH2─ or a phosphonate ─PO2O─. The results of these
exchanges are summarized in Table 4.3. Although the X-ray structure shows that
the NH group forms an H-bond, it can nonetheless be replaced with a CH2 group
without loss of binding affinity. This result is understandable if we consider the
number of hydrogen bonds before and after ligand binding for the phosphonamide
and for the phosphinate, as we did in Fig. 4.6. In both cases the number of H-bonds
is unchanged. If the NH group is replaced by an oxygen atom, the binding affinity
decreases by a factor of roughly 1,000. In water, the oxygen atom that is in the place
of the NH group can form a hydrogen bond to the bulk water. In the protein–ligand
complex of the phosphonate ─PO2O─, the electronegative oxygen atom is found
exactly opposite the oxygen of the carbonyl group of Ala113. Two acceptor groups
are directly facing one another. A hydrogen bond cannot be formed here. The
inventory of hydrogen bonds remains unbalanced. Furthermore, the two groups
repel one another, which results in a poorer binding. A similar case is
illustrated in Table 4.4. Here the binding affinities of three thrombin inhibitors
4.4 that were synthesized at Eli Lilly are compared with each other. The amine
(X = ─NH─) can form an H-bond with Gly219 and binds the most strongly.
The ether (X = ─O─) binds 5,000 times more weakly because of an electrostatic repulsion
80 4 Protein–Ligand Interactions as the Basis for Drug Action

Table 4.4 Binding of 4.4 to the serine proteases thrombin and trypsin

H
N N CHO
X
O O

O
H
N NH

Gly 216 H2N NH


4.4

IC50 values in mg/mL


Enzyme X ¼ ─NH─ ─O─ –CH2–
Thrombin 0.009 52 0.07
Trypsin 0.009 43 0.018

between the ether oxygen atom and the carbonyl group of the protein. The aliphatic
compound (X ¼ ─CH2─) shows remarkable binding compared to X ¼ ─NH─ that
is merely reduced by a factor of eight (thrombin) and two (trypsin).
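
The factors quoted for Tables 4.3 and 4.4 can be recomputed from the tabulated values and, as a rough guide, translated into free-energy differences. The following is a minimal sketch rather than part of the original analysis: it assumes a temperature of 298 K and the relation ΔΔG = RT ln(ratio), and the function name is purely illustrative. For Table 4.4 the conversion is only approximate, because IC50 values are not binding constants.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def ddg_from_ratio(k_new, k_ref, temperature_k=298.0):
    """Free-energy penalty in kJ/mol for going from k_ref to k_new.

    Positive values mean weaker binding. The temperature is an assumption.
    """
    return R * temperature_k * math.log(k_new / k_ref)

# Table 4.3, R = Leu-OH (only ratios are used, so the unit cancels)
ki_leu = {"NH": 0.01, "O": 9.0, "CH2": 0.01}
# Table 4.4, thrombin IC50 values
ic50_thrombin = {"NH": 0.009, "O": 52.0, "CH2": 0.07}

for label, data in (("thermolysin, R = Leu-OH", ki_leu),
                    ("thrombin", ic50_thrombin)):
    for x in ("O", "CH2"):
        ratio = data[x] / data["NH"]
        penalty = ddg_from_ratio(data[x], data["NH"])
        print(f"{label}: NH -> {x}: factor {ratio:.0f}, "
              f"approx. {penalty:.1f} kJ/mol")
```

Working with ratios keeps the sketch independent of the concentration unit used in the tables.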

4.9 The Strength of Hydrophobic Protein–Ligand Interactions

We have seen that the direct attractive forces between lipophilic groups are
considerably smaller than those between polar groups. Hydrophobic interactions
are mainly based on the displacement of water molecules. It has been shown in
many experiments that their contribution to the binding affinity is, as a first
approximation, proportional to the size of the lipophilic surface that is buried upon ligand
binding and therefore no longer accessible to water. Typically, the
contribution is found to lie between approximately 50 and 200 J/mol per Å² of lipophilic
contact area. An example of this is retinol. It binds to the retinol-binding protein
(Fig. 4.1) with a binding constant of 190 nM, exclusively through lipophilic
contacts. This corresponds to a free energy of 39.8 kJ/mol. As a result of the
binding, a lipophilic area of 250 Å² is buried. The contribution per Å² amounts
to 39,800/250 = 159 J/mol.
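
The retinol estimate can be retraced numerically. The sketch below assumes the standard relation ΔG = RT ln Kd and a temperature of 310 K; the text does not state the temperature, so the result only approximately reproduces the quoted 39.8 kJ/mol. The function name is illustrative, and the computed ΔG carries the negative sign of a favorable binding process.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)

def binding_free_energy(kd_molar, temperature_k=310.0):
    """Return dG in J/mol for a dissociation constant given in mol/L.

    The default temperature is an assumption, not a value from the text.
    """
    return R * temperature_k * math.log(kd_molar)

dg_retinol = binding_free_energy(190e-9)   # 190 nM, from the text
buried_area = 250.0                        # buried lipophilic area in Å^2, from the text
per_area = abs(dg_retinol) / buried_area   # J/mol per Å^2

print(f"dG = {dg_retinol / 1000:.1f} kJ/mol")
print(f"contribution = {per_area:.0f} J/mol per Å^2")
```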
Six HIV protease inhibitors (▶ Sect. 24.6) are listed in Fig. 4.11. During the
course of a lead structure optimization, the hydrophobic surface of 4.5 was enlarged
by adding hydrophobic groups. It could be confirmed crystallographically that the
binding mode did not change. If the changes in the molecular volume in this series
are plotted against the affinity, a linear relationship is obtained. The binding affinity
increases by 65 J/mol per Å².
In many cases, the hydrophobic interactions make a dominant contribution to the
free energy of binding. In Fig. 4.12 the lipophilic surface area that is buried upon
complex formation of the same 80 protein–ligand complexes as in Fig. 4.10 is
shown together with their experimentally determined binding constants. Here too,
the values are scattered over a broad range.

[Structure 4.5 with the varied substituent X = H, Cl, CH3, CF3, Br, or I]

Fig. 4.11 The scaffold of the HIV protease inhibitor 4.5 was enlarged during the course of a lead
structure optimization by adding hydrophobic groups to the aromatic N-benzyl group. An
unchanged binding mode was evidenced crystallographically. The additional molecular volume
improved the binding affinity in a linear manner by about 65 J/mol per Å².

[Plot: −lg Ki (0 to 16) against the buried hydrophobic surface area in Å² (0 to 400)]

Fig. 4.12 In analogy to Fig. 4.10, a plot of the binding constants Ki of the 80
crystallographically investigated protein–ligand complexes against the buried
hydrophobic surface area shows that there is no simple function for this measure
either.

4.10 Binding and Mobility: Compensation of Enthalpy and Entropy

According to Eq. 4.3, enthalpy and entropy are in a close physical relationship, and
their sum results in the free energy of binding. If the formation of protein–ligand
complexes is considered, the ΔG values of weakly binding millimolar complexes
and strongly binding nanomolar complexes fall within a window of ca. 35–55 kJ/mol. A lead
structure optimization (▶ Chap. 8, “Optimization of Lead Structures”) usually
covers an even smaller range. Typically, the binding constants are improved by
5–6 orders of magnitude, which corresponds to 25–30 kJ/mol. Upon exchanging
functional groups in a lead structure, the enthalpy ΔH usually varies over
a considerably broader range. If the change in ΔG is much smaller during the
course of this replacement, then for purely mathematical reasons the changes in the
enthalpy ΔH must be compensated by an opposite change in the entropic term TΔS. It is
only in this way that the large variations in the two properties can lead to the result
that ΔG remains in a small window. An important question is derived from this: Is
there a connection that causes enthalpy and entropy, which act as opponents, to
partially compensate one another during the optimization? And how can both
quantities nonetheless be optimized without their effects canceling each other out,
which would leave ΔG unchanged?
Entropic optimization aims at increasing the hydrophobic surface of a ligand that
becomes buried upon binding. The intuitive reasoning behind this is that the
enlarged ligand displaces an increasing number of water molecules upon binding.
The design of a rigid ligand with correctly frozen conformational degrees of freedom
usually leads to an improvement in the entropic binding contribution (▶ Sect. 24.6).
To increase the enthalpic binding of a ligand to the protein, above all, additional polar
interactions must be incorporated. This, however, as a rule comes at a price in that the
additional polar groups must first release their solvation shell. This contribution to
desolvation must be paid for. If an amidine group is added to the para position of the
unsubstituted phenyl group of the thrombin inhibitor 4.6, a significant improvement
in the affinity is obtained in 4.7, which is accompanied by a strongly enhanced
enthalpic contribution (Fig. 4.13). The inhibitor forms a salt bridge with its benzamidine group to
an aspartate residue in thrombin. It is therefore strongly spatially fixed, which is
entropically unfavorable. The inhibitor 4.6, which lacks the polar group, binds with
a similar geometry. It cannot, however, form the salt bridge. The structure indicates
an increased residual mobility of the inhibitor in the binding pocket, which is
advantageous from an entropic point of view.
The two compounds 4.8 and 4.9 also represent thrombin inhibitors. They
differ in the size of the cycloalkyl group on the basic scaffold that fills
a hydrophobic pocket of the protein. Both inhibitors have practically the
same binding affinity for thrombin. However, the free energy of binding is
partitioned very differently into the enthalpy and entropy components. The
compound with the cyclopentyl substituent has an enthalpic advantage and an
entropic disadvantage compared to the six-membered-ring derivative. From
where does this surprising effect originate? The crystal structures of the two
derivatives with thrombin show an important difference with regard to the
cycloalkyl group. Whereas the five-membered ring is easily recognized in
the electron density (▶ Sect. 13.5), practically no density at all is visible
where the six-membered ring should be encountered. Such an observation in
an X-ray structure indicates a high degree of disorder in a particular moiety of
the protein–ligand complex. This disorder can be of a purely static nature
whereby the six-membered ring is scattered over many orientations. Alternatively,
it can also be the result of a much larger residual mobility in the protein-bound
state than observed for the five-membered-ring derivative. Molecular
dynamics simulations (▶ Sect. 15.7) confirmed this difference. In the case of the
five-membered-ring compound, the cyclopentyl group remains in a hydrophobic
pocket and from time to time undergoes a jump rotation. In doing so, the
virtually planar ring jumps between two orientations and exchanges its upper
and lower face. This hardly changes the placement of the ring in
the pocket. Furthermore, compound 4.8 does not form a hydrogen bond to the
carbonyl group of Gly216. The six-membered-ring derivative 4.9 behaves entirely
differently. Here the cyclohexyl group moves out of the binding pocket during the
course of the simulation and returns after some time. At the same time, 4.9 forms
a transient hydrogen bond to Gly216. It is because of this that 4.9 maintains a
large amount of residual mobility.

[Structures 4.6, 4.7, 4.8, and 4.9 with their thermodynamic binding profiles]

Compound    ΔG (kJ/mol)    ΔH (kJ/mol)    −TΔS (kJ/mol)
4.6         −31.7          −13.6          −18.1
4.7         −46.7          −40.6          −6.1
4.8         −35.4          −16.9          −18.5
4.9         −36.2          −10.5          −25.7

Fig. 4.13 Replacement of a phenyl group in 4.6 by a para-benzamidinophenyl group in 4.7 leads to
a significant improvement in the affinity of this thrombin inhibitor, which is largely due to an
enthalpic gain resulting from the formation of a salt bridge to Asp189 (▶ Sect. 23.3). The
homologous ligands 4.8 and 4.9 bind equally strongly to thrombin, but the binding affinity is divided
into the enthalpic and entropic contributions entirely differently. Compound 4.9 has a significantly
higher residual mobility in the binding pocket than 4.8, which results in an entropic advantage for
this derivative, even though the poorer contacts to the protein cause an enthalpic disadvantage.
This difference in the dynamic behavior of 4.8 and 4.9 explains the divergent
thermodynamic profile. The cyclopentyl derivative has an entropic disadvantage
because it is largely fixed in the binding pocket. The well-defined orientation
is an advantage for enthalpic interactions. The good and stable contacts to
the protein ensure an increased contribution to the interaction energy. The situation is
different for the six-membered-ring derivative. Its looser fixation in the binding
pocket means a smaller loss in degrees of freedom upon complex formation.
This results in an entropic advantage. Enthalpically, however, this behavior is
disadvantageous. Because the cyclohexyl group temporarily leaves the binding pocket, interactions
with the protein can only be formed with reduced strength.
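
The compensation described here can be checked against the values reported in Fig. 4.13. The following minimal sketch uses ΔG = ΔH − TΔS with the entropic term entered as −TΔS, exactly as listed in the figure; the dictionary layout and function names are illustrative choices, not part of the original work.

```python
# Thermodynamic profiles from Fig. 4.13, all values in kJ/mol:
# compound -> (dH, -T*dS, reported dG)
PROFILES = {
    "4.6": (-13.6, -18.1, -31.7),
    "4.7": (-40.6, -6.1, -46.7),
    "4.8": (-16.9, -18.5, -35.4),
    "4.9": (-10.5, -25.7, -36.2),
}

def free_energy(dh, minus_tds):
    """dG = dH - T*dS, with the entropic term supplied as -T*dS."""
    return dh + minus_tds

def compare(reference, variant):
    """Return (ddH, d(-TdS), ddG) on going from reference to variant."""
    dh_ref, ts_ref, _ = PROFILES[reference]
    dh_var, ts_var, _ = PROFILES[variant]
    ddh = dh_var - dh_ref
    dts = ts_var - ts_ref
    return ddh, dts, ddh + dts

for name, (dh, minus_tds, reported) in PROFILES.items():
    print(f"{name}: dH + (-TdS) = {free_energy(dh, minus_tds):.1f} "
          f"(reported dG {reported:.1f})")

ddh, dts, ddg = compare("4.8", "4.9")
print(f"4.8 -> 4.9: ddH = {ddh:+.1f}, d(-TdS) = {dts:+.1f}, "
      f"ddG = {ddg:+.1f} kJ/mol")
```

On going from 4.8 to 4.9, the enthalpic loss and the entropic gain are each several kJ/mol, whereas their sum stays below 1 kJ/mol.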
What can be learned from this example? Even when ligands have a very similar
structure, the binding behavior can be significantly different. Their residual mobility in
the binding pocket can have decisive consequences for the thermodynamic binding
contributions. Obviously a mutual compensation of enthalpy and entropy leads to
an unchanged free energy. This interplay of residual mobility in the binding pocket
and quality of the formed interactions has, of course, consequences for the optimiza-
tion process. Medicinal chemists like to think in terms of group contributions to
binding affinity experienced during the exchange of particular functional groups.
Statistical analyses of such group contributions have been carried out and can be
applied as a rule of thumb to guide optimization strategies. The thinking is usually
done additively. How much is gained if a particular group is combined with another in
a molecule that is to be optimized? One should be careful with these considerations.
Small differences in the binding behavior cause such simple rules to fail.
The optimization of the thrombin inhibitor 4.10 to 4.11 serves
as an example (Fig. 4.14). Two changes are made. First, a hydrophobic
substituent at the end of the molecular scaffold is enlarged from an
n-propyl to a phenylethyl group. This means a significant increase in the hydrophobic
molecular surface area. Second, an amino group introduced next to the
hydrophobic group is intended to form a hydrogen bond to Gly216. The two changes from
4.10 to 4.11 lead to an improvement in the affinity of ΔΔG = −18.6 kJ/mol.
Both modifications can also be introduced sequentially via the intermediates
4.12 and 4.13. If the hydrophobic group is first enlarged from 4.10 to 4.12, only a
small amount of binding affinity is gained. If 4.12 is further optimized to 4.11, a significant
affinity gain is obtained. Does the amino group really contribute so much to the affinity?
The reverse approach can also be taken, and the amino group can be introduced
into 4.10 to give 4.13. For this change an improvement of only ΔΔG = −9.6 kJ/mol is
obtained. The final enlargement of the hydrophobic surface area from 4.13 to 4.11
yields another 9.0 kJ/mol of affinity gain.
This example shows that simple additivity rules fail. As in the example with the
five- and six-membered-ring derivatives 4.8 and 4.9, the balance of the residual
mobility, partial solvation of the binding pocket, and quality of the formed interactions
exerts a decisive influence on the increase in affinity. The interplay of these
partially compensating enthalpic and entropic binding contributions is responsible
for this complex picture.

[Scheme: structures 4.10, 4.12, 4.13, and 4.11 arranged as a cycle]

Compound    ΔG (kJ/mol)
4.10        −19.9
4.12        −23.0
4.13        −29.5
4.11        −38.5

Step            ΔΔG (kJ/mol)
4.10 → 4.12     −3.1
4.12 → 4.11     −15.5
4.10 → 4.13     −9.6
4.13 → 4.11     −9.0
4.10 → 4.11     −18.6

Fig. 4.14 Optimization of the thrombin inhibitor 4.10 to 4.11 increases affinity by ΔΔG = −18.6
kJ/mol. This is achieved by increasing the size of the hydrophobic side chain (red) from n-propyl
to phenyl and attaching an amino group (blue). The changes can also be accomplished in stepwise
fashion. Increasing the hydrophobic surface to give 4.12 enhances affinity by only 3.1 kJ/mol; the major
contribution of 15.5 kJ/mol is provided by the subsequently introduced amino
group. Adding the amino group first to give 4.13 contributes 9.6 kJ/mol, and the subsequent
enlargement of the hydrophobic substituent increases affinity by another 9.0 kJ/mol. The explanation
for the lack of additivity is found in the complex interplay of residual mobility, desolvation, and
strength of the formed enthalpic interactions.
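
The failure of additivity can be made explicit by treating the four compounds of Fig. 4.14 as a thermodynamic cycle. The sketch below uses only the ΔG values given in the figure; the variable names and the definition of the non-additivity term (observed total minus the sum of the two single modifications starting from 4.10) are illustrative assumptions.

```python
# Free energies of binding from Fig. 4.14 in kJ/mol
DG = {"4.10": -19.9, "4.12": -23.0, "4.13": -29.5, "4.11": -38.5}

def step(start, end):
    """Change in dG on going from one compound to another."""
    return DG[end] - DG[start]

hydrophobic_first = step("4.10", "4.12")   # enlarge hydrophobic group only
amino_after = step("4.12", "4.11")         # then add the amino group
amino_first = step("4.10", "4.13")         # add the amino group only
hydrophobic_after = step("4.13", "4.11")   # then enlarge hydrophobic group
total = step("4.10", "4.11")

# If group contributions were additive, the two single modifications
# made to 4.10 would sum to the total effect.
additive_estimate = hydrophobic_first + amino_first
nonadditivity = total - additive_estimate

print(f"hydrophobic group: {hydrophobic_first:.1f} alone, "
      f"{hydrophobic_after:.1f} with amino group present")
print(f"amino group: {amino_first:.1f} alone, "
      f"{amino_after:.1f} with enlarged hydrophobic group")
print(f"total {total:.1f} vs. additive estimate {additive_estimate:.1f} "
      f"(difference {nonadditivity:.1f} kJ/mol)")
```

The same group thus appears to be worth a different amount depending on which modification is already in place, which is the cooperative effect described in the text.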

4.11 Lessons for Drug Design

This chapter should not give the impression that a quantitative prediction about the
strength of protein–ligand interactions is impossible. Despite the complex character
of protein–ligand interactions, some simple rules should always be consulted first.
• Many strong protein–ligand interactions are characterized by extensive lipophilic
contacts. An increase in the lipophilic contact area between the protein
and the ligand often leads to an improvement in the binding affinity. This means
that the search for unoccupied lipophilic pockets in the protein should be the first
step in the design and optimization of new ligands. Admittedly, this approach
should not be taken too far because a huge increase in the total lipophilicity of
a molecule increasingly reduces its water solubility.
• An additional H-bond does not guarantee an increase in the binding affinity.
An H-bond contributes to the total inventory if a stronger interaction of the
participating groups occurs in the protein–ligand complex compared to those in
bulk water. On the other hand, a buried polar atom that cannot be accommodated
with an H-bond almost always leads to a loss in binding affinity. It must be
ensured in ligand design that polar atoms find binding partners in case they
are no longer water-accessible in the formed protein–ligand complex.
• Each ligand displaces water molecules upon protein binding. There are binding
pockets in proteins that are formed in a way that they cannot be optimally
solvated by water. In these cases, a ligand can be in the position to form more
H-bonds to the protein than is possible with water. The binding affinity of such
ligands can be very high.
• Rigid ligands can bind more strongly than flexible ligands because the loss of
internal degrees of freedom is less for rigid ligands.
• Water can form strong H-bonds, but is often not as good a ligand for transition
metals as thiols, acids, hydroxamic acids, and related groups. Accordingly,
a direct interaction with the metal ion is important for most proteins that contain
a transition metal (▶ Chap. 25, “Inhibitors of Hydrolyzing Metalloenzymes”).
Generally, all protein–ligand interactions that either cannot at all or can only
very poorly be replaced by water contribute strongly to the affinity.
The relative contributions of enthalpy and entropy to the binding affinity ΔG,
the actual property that is to be optimized in drug design, are important for the
characterization of ligand binding. The affinity can be improved by optimizing
the enthalpic or the entropic contribution or, ideally, both in parallel. To this end,
different parameters of the protein–ligand interaction must be addressed
(▶ Sect. 8.8). It remains an open question whether an enthalpically or an entropically
driven binding is advantageous for a particular drug. The successful strategy
will depend on whether the active substance is required to show adequate
tolerance toward quickly developing resistance mutations (▶ Sects. 24.5, ▶ 31.4, and
▶ 32.5), high target selectivity, or even a desired broad binding promiscuity
toward multiple members of a protein family (▶ Sects. 25.6, ▶ 26.4, and ▶ 27.4).

4.12 Synopsis

• Emil Fischer introduced the “lock-and-key” principle to describe the interaction
of a small molecule substrate and a macromolecular receptor. More than
50 years later, Koshland extended this picture by induced-fit considerations
that allow both binding partners to change conformations and mutually adapt to
one another to optimally interact.
• The cells are surrounded by a lipid double-layer membrane with polar
head groups on the exterior and hydrophobic alkyl chains in the interior.
This membrane is a barrier for polar substances, but sufficiently lipophilic
compounds can penetrate and even pass through the membrane.
• The strength of protein–ligand interactions is measured by the binding constant,
which quantifies the stability of a protein–ligand complex as a dissociation
constant according to the law of mass action for complex formation.
• The binding constant is logarithmically related to the Gibbs free energy of
binding. The free energy is composed of an enthalpic and entropic contribution.
The enthalpic part summarizes all terms that relate to the interaction energy
of the binding partners. The entropic part considers the order of the system and
how its energy content is distributed over the degrees of freedom of the system.
• Protein–ligand complexes usually form through non-covalent interactions, pre-
dominantly through hydrogen bonds. The strength of hydrogen bonds strongly
depends on the distributions of charges among the interacting functional groups.
Whether a group is charged or not depends on its protonation state, which is
defined by the pKa value of the titratable groups involved in the protein–ligand
interactions.
• Depending on the local environment in a binding pocket, the pKa values of
titratable groups can vary significantly and can, by this, transform a normal
H-bond into a much stronger charge-assisted H-bond.
• Hydrophobic interactions form through the close proximity of non-polar
functional groups of the binding partners. As direct interactions, they are rather
weak. Nevertheless they can afford a significant contribution to binding affinity
through the release of water molecules from either the lipophilic environment of
the binding pocket or from the ligand surface next to a lipophilic surface patch.
• The strength of protein–ligand interactions is strongly influenced by the water
environment. Both the protein binding pocket and the ligand are solvated before
complex formation and functional groups of protein and ligand will form
H-bonds to water molecules. The total balance of the hydrogen-bond inventory
before and after complex formation matters for binding affinity considerations.
Only if the newly formed hydrogen bonds in the complex are greater in
number and/or stronger than those previously formed to water does a net affinity
increase result.
• The release of water molecules from hydrophobic surface patches can increase
affinity by enthalpy and entropy. Release of fixed water molecules increases the
degrees of freedom and boosts entropy. Release of highly disordered water
molecules into the bulk water environment can contribute to an enthalpic gain.
• Entropic contributions to binding arise from an increase of the degrees of
freedom of the protein–ligand–water system and, as a first approximation,
correlate with the size of the hydrophobic surface buried in the formed complex.
• Free energy variations are observed over a window of about 30–55 kJ/mol in
protein–ligand complexes. Variations in enthalpy (ΔH) and entropy (TΔS) can
be much larger. This results from extensive enthalpy/entropy compensation.
Entropically favored increases in the degrees of freedom, release of water
molecules, or enhanced residual mobility are usually detrimental to improvements
in the enthalpy that result from strong interactions.
• The pronounced interdependence of enthalpy and entropy along with dynamic
versus interaction geometric phenomena causes simple additive rules about func-
tional group contributions to fail. Instead pronounced cooperative effects are in
operation.

Bibliography

General Literature

Andrews PR (1993) Drug-receptor interactions. In: Kubinyi H (ed) 3D-QSAR in drug design.
Theory, methods and applications. Escom, Leiden, pp 13–40
Andrews PR, Craik DJ, Martin JL (1984) Functional group contributions to drug-receptor
interactions. J Med Chem 27:1648–1657
Böhm HJ, Klebe G (1996) What can we learn from molecular recognition in protein-ligand
complexes for the design of new drugs? Angew Chem Int Ed Engl 35:2588–2614
Böhm H-J, Schneider G (2003) Protein-ligand interactions. From molecular recognition to drug
design. In: Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in
medicinal chemistry. Wiley-VCH, Weinheim
Creighton TE (1992) Proteins: structures and molecular properties, 2nd edn. W.H. Freeman,
New York
Gohlke H, Klebe G (2002) Approaches to the description and prediction of binding affinity of
small-molecule ligands to macromolecular receptors. Angew Chem Int Ed Engl 41:2644–2676
Kuntz ID, Chen K, Sharp KA, Kollman PA (1999) The maximal affinity of ligands. Proc Natl Acad
Sci USA 96:9997–10002

Special Literature
Ehrlich P (1913) Chemotherapeutics: scientific principles, methods and results. Lancet 182:445–451
Fersht AR, Shi JP, Knill-Jones J et al (1985) Hydrogen bonding and biological specificity analysed
by protein engineering. Nature 314:235–238
Gerlach C, Smolinski M et al (2007) Thermodynamic inhibition profile of a cyclopentyl- and
a cyclohexyl derivative towards thrombin: the same, but for deviating reasons. Angew Chem
Int Ed Engl 46:8511–8514
Lichtenthaler FW (1994) 100 Years “Schluessel-Schloss-Prinzip”: what made Emil Fischer use
this analogy? Angew Chem Int Ed Engl 33:2364–2374
Mason RP, Rhodes DG, Herbette LG (1991) Reevaluating equilibrium and kinetic binding
parameters for lipophilic drugs based on a structural model for drug interaction with biological
membranes. J Med Chem 34:869–877
Morgan BP, Scholtz JM, Ballinger MD, Zipkin ID, Bartlett PA (1991) Differential binding energy:
a detailed evaluation of the influence of hydrogen-bonding and hydrophobic groups on the
inhibition of thermolysin by phosphorus-containing inhibitors. J Am Chem Soc 113:297–307
Petrova T, Steuber H et al (2005) Factorizing selectivity determinants of inhibitor binding toward
aldose and aldehyde reductases: structural and thermodynamic properties of the aldose reduc-
tase mutant Leu300Pro-Fidarestat complex. J Med Chem 48:5659–5665
5 Optical Activity and Biological Effect

The three-dimensional shape of a molecule has a decisive influence on its biological
activity. The configuration of a molecule is made up of the bonds between the
atoms. Substances with an asymmetric center that are considered here are
optically active and exist in two different forms. They are asymmetrically built
and are related to one another like an image and its mirror image. They
are called chiral. It is impossible to convert one form into the other without breaking
and remaking bonds. Chirality is often unimportant to chemists because the image
and mirror image behave exactly the same in a symmetrical environment. If they
are brought into an asymmetrical environment, for instance at the binding site of
a protein, this is no longer true. The consequences of this for drug design and
therapy are the topic of this chapter.
At the beginning of the nineteenth century, Jean Baptiste Biot observed that
some quartz crystals rotated the plane of linearly polarized light to the right, and
others to the left. Macroscopically this optical activity is imprinted in the asym-
metric, handed (enantiomorphic) form of the crystals; they exist as left and right-
handed mirror-image forms. A little later, Biot found that not only crystals but also
organic compounds like turpentine oil or sugar solutions rotated polarized light in
a particular direction.

5.1 Louis Pasteur Sorts Crystals

The decisive experiment was carried out by the then 26-year-old Louis Pasteur in
Paris in 1848. Several literature reports were inconsistent with his theory that an
obvious relationship must exist between crystal forms and their optical properties.
During a careful investigation of the sodium–ammonium salt of the optically
inactive tartaric acid, he discovered that the crystals had different forms. They
were either right- or left-handed and could be sorted by hand. The crystals of
the enantiomers 5.1 and 5.2 (Fig. 5.1) gave solutions that had an opposite rotational
direction. This confirmed his suspicion. Before Pasteur could present his results to
the Academy of Sciences, he had to repeat the experiment publicly (!) in Biot’s
presence at the Collège de France. He was lucky. It was only because his solutions
were allowed to slowly evaporate at room temperature that his experiment was
successful. Above the critical temperature of 28 °C, a stoichiometric 1:1 mixture of
both enantiomeric forms, a racemate, would have homogeneously crystallized
(Sect. 5.4).

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_5, © Springer-Verlag Berlin Heidelberg 2013

[Structures: 5.1 D-(−)-tartaric acid and 5.2 L-(+)-tartaric acid, related by a mirror plane; 5.3 meso-tartaric acid with inversion symmetry]

Fig. 5.1 Optical isomerism in tartaric acid. The enantiomers (−)-tartaric acid 5.1 (mp.
168–170 °C, [α]D20 = −12°) and (+)-tartaric acid 5.2 (mp. 168–170 °C, [α]D20 = +12°) cannot be
superimposed upon each other either in the plane of the paper or in 3D space. Each has only
a twofold rotational axis (orange axes) that bisects the central C–C bond. The two mirror-image
forms rotate the plane of polarized light in opposite directions. In contrast, meso-tartaric acid
5.3 (mp. = 140 °C) has an inversion center of symmetry (the purple center on the central C–C
bond). Solutions of meso-tartaric acid have no optical activity because the contribution from each
stereogenic center compensates for the other. Racemic tartaric acid (mp. = 206 °C, no rotation) is
a 1:1 mixture of both enantiomers of tartaric acid 5.1 and 5.2. Such mixtures are optically inactive
and are called racemates (Lat. racemus, the grape; tartaric acid is found in grapes and wine).
A few years later Pasteur managed another important observation: mold con-
tamination of a racemic tartaric acid solution caused optical activity to develop.
One enantiomer of tartaric acid is metabolized significantly faster than the other.
With this, he discovered two important methods to separate racemates into enan-
tiomers. Whereas mechanical sorting is limited to a very few examples, enzymatic
kinetic resolution of enantiomers has found broad applications (Sect. 5.4).
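
Why a racemate shows no rotation, and why optical activity develops once one enantiomer is consumed faster, can be illustrated with a small calculation. The sketch below assumes ideal behavior, in which the rotation of a mixture scales linearly with the excess of one enantiomer over the other; the magnitude of 12° is taken from Fig. 5.1, while the function name and the example compositions are hypothetical.

```python
SPECIFIC_ROTATION = 12.0  # magnitude in degrees for pure (+)-tartaric acid, from Fig. 5.1

def mixture_rotation(fraction_plus):
    """Specific rotation of a (+)/(−) mixture, assuming ideal linear behavior.

    fraction_plus is the mole fraction of the (+) enantiomer; the remainder
    is taken to be the (−) enantiomer.
    """
    if not 0.0 <= fraction_plus <= 1.0:
        raise ValueError("fraction_plus must lie between 0 and 1")
    excess = fraction_plus - (1.0 - fraction_plus)
    return SPECIFIC_ROTATION * excess

# Hypothetical compositions: a racemate, and mixtures in which part or all
# of the (−) enantiomer has been removed, for example by metabolism.
for fraction in (0.5, 0.75, 1.0):
    print(f"(+) fraction {fraction:.2f}: rotation {mixture_rotation(fraction):+.1f} deg")
```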

5.2 Structural Basis of Optical Activity

An explanation for optical isomerism became possible with the help of the theory of
the tetrahedral carbon atom, which was independently developed in 1874 by Jacobus
Henricus van’t Hoff and Joseph-Achille Le Bel. When a carbon atom carries four
different substituents, an asymmetric or, as it is sometimes called, stereogenic
center is produced. This property is not limited to carbon: nitrogen atoms (in ammonium
salts) or silicon atoms with four different substituents, phosphorus atoms, for instance in
phosphonic or phosphoric acid esters, or even sulfur atoms in sulfoxides (with two
different substituents, oxygen, and the lone electron pair) can also be asymmetric. The
spatial arrangement in these compounds gives rise to two mirror-image isomers, each of
which rotates polarized light to the same degree but in opposite directions. These forms
are called enantiomers (earlier antipodes). With the exception of their optical
activity, enantiomers are identical in all of their chemical and physicochemical
properties, but only as long as they are in an achiral environment.

[Structures: 5.4 twistane, 5.5 methaqualone, 5.6]

Fig. 5.2 Even molecules without stereogenic centers can form an image–mirror-image pair
because of their spatial construction; an example is twistane 5.4. If rotation around the bonds is
limited, as in the case of the sedative methaqualone 5.5, enantiomers are separable (so-called
atropisomers). In non-planar fused ring systems like the dibenzocycloheptadiene derivative 5.6,
the enantiomeric separation depends on the barrier of inversion for the ring system.
Compounds with two chiral centers that are configured as an image and mirror
image within the same molecule do not exhibit optical activity macroscopically.
meso-Tartaric acid 5.3 (Fig. 5.1), an inversion-symmetrical molecule, exists as
a racemic mixture of chiral conformers. Each conformer exists as an “internal”
racemic mixture because in one energetically favored conformation the molecule
exhibits inversion symmetry. Its left part can be inverted by point reflection through
the center of the central C—C bond into its right part. Optical activity is also present in
other forms of molecular asymmetry. An example is any regular or irregular tetrahe-
dral orientation of different substituents on any other scaffold than a single carbon
atom. Another case can be found in compounds in which two groups are strongly
rotationally hindered around a common bond. An asymmetrical center results, giving
rise to optically active rotational isomers, so-called atropisomers (Fig. 5.2).
The experimentally determined rotational value (+) or () (previously called
d or l) is used to characterize enantiomeric compounds. The spatial configuration
of a stereogenic center in a molecule is described as D or L (Lat. dextro, levo).
This notation is based on the Fischer convention and is related to the absolute
92 5 Optical Activity and Biological Effect

Fischer Projection
CHO
H OH
CHO CHO
HO H
H OH HO H
H OH
CH2OH CH2OH
5.7 5.8 H OH
CH2OH
D-Glyceraldehyde L-Glyceraldehyde

5.9 D-Glucose

Stereoprojection

CHO CHO COOH


H OH HO H H2N H
CH2OH CH2OH

5.7 5.8 5.10 L-Alanine

Fig. 5.3 The rotation (+ or ) and the Fischer assignment (D or L) is reported as part of the
characterization of optically active compounds. To determine the Fischer assignment, the longest
carbon chain is drawn vertically with the highest-oxidized carbon atom on top (e.g., 5.9). The
standard is set by the asymmetric carbon (red) of the D- and L-glyceraldehyde pair (5.7 and 5.8).
With sugars (e.g., glucose 5.9) or amino acids (e.g., alanine 5.10), the carbon that is marked with
the arrow decides whether the molecule is D or L.

configuration of D- and L-glyceraldehyde, 5.7 and 5.8 (Fig. 5.3). Most sugars, for
instance glucose 5.9, can be traced back to D-glyceraldehyde 5.7, and the natural
amino acids of proteins, for instance alanine 5.10, can be traced back to
L-glyceraldehyde 5.8. For this reason, today the D/L nomenclature is still frequently
applied to sugars and amino acids. The enantiomers of tartaric acid correspond to
the D-(–) or L-(+) form.
The Cahn–Ingold–Prelog rule allows an unambiguous stereochemical assign-
ment (Fig. 5.4). According to the convention, the optical center is oriented so that
the substituent with the smallest atomic number is at the back (e.g., a hydrogen
atom or a lone pair of electrons). As an intuitive model, picture this substituent as
the column of a steering wheel; the other substituents then lie in the plane of the
wheel. If these substituents, taken in descending order of atomic number, follow
a clockwise rotation, the stereogenic center has an R configuration; the opposite
direction gives the S configuration (from the Latin: rectus and sinister). The only
disadvantage of this nomenclature system is that the assignment of a stereocenter
can change merely because the atomic number, valency, or oxidation state of
a substituent changes. The homologous L-amino acids serine and cysteine, which are
stereochemical analogues that differ only in that an oxygen is exchanged for a sulfur
atom, are classified as (S)-serine and (R)-cysteine.
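
The serine/cysteine case can be reproduced with a small sketch. It is not a full implementation of the Cahn–Ingold–Prelog rules: it assumes that each substituent can be ranked by its own atomic number and, as a tie-breaker, by the atomic numbers of the atoms in the next sphere. The function name `cip_label`, the helper `l_amino_acid`, and the idealized coordinates are illustrative assumptions and are not taken from the text. The sign of the triple product of the three highest-ranked substituent vectors then distinguishes a clockwise (R) from a counterclockwise (S) sequence.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]
# (label, priority key, position relative to the stereocenter)
Substituent = Tuple[str, tuple, Vec]


def cross(u: Vec, v: Vec) -> Vec:
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u: Vec, v: Vec) -> float:
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cip_label(substituents: Sequence[Substituent]) -> str:
    """Return 'R' or 'S' for a tetrahedral center (simplified ranking)."""
    # Priority key: (atomic number, next-sphere atomic numbers, descending).
    ranked = sorted(substituents, key=lambda s: s[1], reverse=True)
    keys = [s[1] for s in ranked]
    if len(ranked) != 4 or len(set(keys)) != 4:
        raise ValueError("need four substituents with distinct priority keys")
    a, b, c = (s[2] for s in ranked[:3])
    # Viewed with the lowest-priority group at the back, a -> b -> c runs
    # clockwise exactly when the triple product a . (b x c) is negative.
    return "R" if dot(a, cross(b, c)) < 0 else "S"


def l_amino_acid(side_chain_key: tuple) -> list:
    """Idealized L-amino acid geometry (assumed coordinates, H at the back)."""
    return [
        ("NH2", (7, (1, 1)), (0.0, 1.0, 0.33)),
        ("COOH", (6, (8, 8, 8)), (-0.87, -0.5, 0.33)),
        ("side chain", side_chain_key, (0.87, -0.5, 0.33)),
        ("H", (1, ()), (0.0, 0.0, -1.0)),
    ]


serine = cip_label(l_amino_acid((6, (8, 1, 1))))     # CH2OH: C(O,H,H)
cysteine = cip_label(l_amino_acid((6, (16, 1, 1))))  # CH2SH: C(S,H,H)
print(serine, cysteine)
```

The geometry is identical in both calls; only the side-chain key changes. Because sulfur outranks the oxygen atoms of the carboxyl group in the second sphere, the side chain of cysteine moves from third to second priority and the descriptor flips.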
If one stereogenic center is present in a molecule, there are two enantiomers.
Each additional symmetry-independent stereocenter increases the number of

Cahn–Ingold–Prelog Rules
• Large atomic numbers have priority over low ones (e.g., Br>Cl>F>O>N>C>H)
• Free electron pairs always have the lowest priority
• Larger atomic masses have priority (e.g., for isotopes D>H)
• In case the first sphere is identical (i.e., C), the next sphere is considered:
  C[C+C+C] > C[C+C+H] > C[C+H+H] > C[H+H+H]
  [structure drawings of further ranked substituent examples omitted]
• Multiple bonds are considered as multiple single bonds, e.g., aldehyde
  CHO = C(O+O+H) > CH2OH = C(O+H+H)
• If the substituents are chiral, then R>S, R,R>R,S, and S,S>S,R
• In the case of differently configurated double bonds Z>E
  (Z = zusammen = together and E = entgegen = opposite for the configuration of double bonds)

[Structures: (R)-glyceraldehyde 5.7 and (S)-glyceraldehyde 5.8]

Fig. 5.4 The R/S nomenclature that was proposed by R. S. Cahn, C. K. Ingold, and V. Prelog is
unambiguous. Priority rules were established for the four different substituents on the
tetrahedral stereogenic center. The substituent with the lowest priority is placed in the
back, and the remaining substituents, taken in order of decreasing priority, define the
direction of rotation.

stereoisomers by a factor of 2. For n asymmetric centers, there are $2^n$ optical isomers.
They occur as $2^{n-1}$ racemic mixtures because each has two isomers that behave as
mirror images of each other. Diastereomers cannot be superimposed onto each
other by any translation and rotation in space or by generating a mirror image
because the chirality of the stereocenters differs relative to each other. As a result
they have different physicochemical and chemical properties. All pairwise race-
mates of a diastereomeric mixture are present as a 1:1 mixture of enantiomers, but
their relative portions in the total composition can vary greatly. Labetalol 5.11
(Fig. 5.5) is just such a diastereomer pair that consists of two racemates, that is, two
enantiomeric pairs. As a mixed antagonist, it affects the α-, β1-, and β2-adrenergic
receptors (cf. ▶ Sect. 29.3). Because of the asymmetric architecture of biological
macromolecules, the individual components of this mixture vary significantly in
their quantitative and qualitative biological properties (Sect. 5.5, 5.7).
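
The counting argument can be checked with a short enumeration. This is a minimal sketch that assumes symmetry-independent stereocenters (so no meso forms occur); the function names are illustrative. Each stereoisomer is written as a string of R/S descriptors, and its enantiomer is obtained by inverting every center.

```python
from itertools import product


def stereoisomers(n: int) -> list:
    """All R/S descriptor strings for n independent stereocenters."""
    return ["".join(p) for p in product("RS", repeat=n)]


def mirror(label: str) -> str:
    """The enantiomer inverts every stereocenter."""
    return label.translate(str.maketrans("RS", "SR"))


def racemic_pairs(n: int) -> list:
    """Group the 2**n stereoisomers into 2**(n-1) enantiomeric pairs."""
    pairs = set()
    for iso in stereoisomers(n):
        pairs.add(tuple(sorted((iso, mirror(iso)))))
    return sorted(pairs)


# Two centers, as in labetalol 5.11: four isomers in two racemates.
print(stereoisomers(2))
print(racemic_pairs(2))
```

Members of the same pair are enantiomers; isomers taken from different pairs, such as RR and RS, are diastereomers of each other.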

Fig. 5.5 Because it has two different asymmetric centers, labetalol 5.11 is a diastereomeric
mixture of four different compounds with different activities on the same receptor. The
antagonistic potency of the (R,R)-, (R,S)-, (S,R)-, and (S,S)-isomers on the α1 receptor is:
S,R >> S,S ≈ R,R > R,S; on the β1 receptor: R,R >> R,S > S,S ≈ S,R; and on the β2 receptor:
R,R >> R,S >> S,S ≈ S,R.
[Structure drawings of labetalol 5.11 and its (R,R)-, (S,S)-, (R,S)-, and (S,R)-isomers omitted.]

5.3 The Isolation, Synthesis, and Biosynthesis of Enantiomers

Racemic acids and bases can often be separated by using other enantiomerically
pure, optically active bases and acids because the diastereomeric salts that form
have different solubilities. The chemical reaction of racemic acids, amines, and
alcohols with optically active alcohols or acids results in diastereomeric reaction
products. Because of their different characteristics, it is possible to separate them
and finally isolate the desired optically active product by chemical cleavage.
Syntheses that do not start with optically active starting materials, and that use
no optically active auxiliaries, always lead to racemic mixtures, that is, an exact
50:50 mixture of both enantiomers. Access to optically active compounds can be
obtained when synthetic reaction components are taken from the “chiral pool”.
Here, all optically active natural products, their derivatives, and degradation prod-
ucts that are available in an optically pure form can be used as easily accessible
synthetic building blocks. Syntheses with chiral catalysts are particularly elegant. In
most cases the optimization of the yield and enantiomeric purity, which is
expressed as the ee value (ee = enantiomeric excess), requires considerable process
development. The chromatographic separation of racemates on optically active solid
supports is more appropriate for semipreparative or analytical purposes.
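
The ee value follows the usual definition: the excess of the major over the minor enantiomer as a percentage of the total. The following sketch uses an illustrative function name, and the amounts may be given in any common unit (moles, peak areas, and so on).

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Enantiomeric excess in percent from the amounts of both enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return abs(major - minor) / total * 100.0


print(enantiomeric_excess(50.0, 50.0))  # racemate
print(enantiomeric_excess(99.0, 1.0))   # 99:1 mixture
```

A racemate therefore has an ee of 0%, and a 99:1 mixture has an ee of 98%, not 99%.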
Enzymatic and biotechnological techniques have increasingly gained favor in
recent years. Proteases, esterases, lipases, or hydantoinases react more or less
selectively: the two enantiomers are converted at distinctly different rates, so that
preferentially only one enantiomer of a racemic mixture is transformed to the product. The selectivity and yield
of such a reaction can be optimized through the careful selection of the medium and
other reaction conditions.
The production of optically pure ephedrine is an example of an industrial
application of biotechnological synthesis that has been in use for decades. This
phytopharmacon is found in combination preparations for the adjuvant therapy of

Fig. 5.6 The biotechnological production of ephedrine is accomplished by the fermentation of
sugar with baker’s yeast Saccharomyces cerevisiae to pyruvic acid. Pyruvic acid is coupled to
benzaldehyde with decarboxylation to form (R)-(–)-1-hydroxy-1-phenylacetone 5.12. Upon further
chemical transformation (CH3NH2/H2/Pt), (1R,2S)-(–)-ephedrine 5.13 is obtained in optically pure
form. (1S,2S)-(+)-pseudoephedrine 5.14 is a diastereomer of ephedrine. The configuration of one
of the two chiral centers is different.

rhinitis, bronchitis, and asthma. The synthetic intermediate 5.12 (Fig. 5.6) is
obtained from a mixture of benzaldehyde, sugar, and yeast. It is then transformed
to (1R,2S)-(–)-ephedrine 5.13, which is identical to the natural product in both of
its optical centers. The C1 isomer (1S,2S)-(+)-pseudoephedrine 5.14 is
a diastereomer of ephedrine. Its optical rotation, melting point, and biological
characteristics are different from ephedrine’s.
Innumerable other microbial syntheses deliver optically pure products with or
without the use of achiral, racemic, or enantiomerically pure starting materials. The
biotechnological syntheses of a variety of antibiotics, above all the penicillins and
cephalosporins (▶ Sects. 2.4 and ▶ 23.7), are of particular economic importance.
Even the biotechnological preparation of synthetic intermediates for chiral drugs is
gaining increasing importance.

5.4 Lipases Separate Racemates

Because of their asymmetric architecture, lipases are well suited to separate race-
mates. This can either happen if one of the two enantiomers binds as a substrate
better and reacts faster, or if a chemical reaction takes place in the binding pocket of
the protein with disparate efficiency. Lipases are often used for kinetic resolution
because their architecture and their lipophilic surface allow them to sustain their
reactivity in organic solvents. They belong to a larger family of hydrolyzing
enzymes (▶ Chap. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme
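
[Figure 5.7: free-energy diagram (ΔG versus reaction coordinate). From the acyl–enzyme complex E–A, the (S)-amine 5.16 passes through the transition state E–S to E+S, and the (R)-amine 5.15 passes through E–R to E+R. Annotated: Δ(R–S)ΔG = −19.4 ± 6 kJ/mol, with the contributions Δ(R–S)H and TΔ(R–S)S.]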

Fig. 5.7 The reaction of (R)- and (S)-phenylethylamine, 5.15 and 5.16, with Candida antarctica
lipase begins with the formation of an acyl–enzyme complex, E–A. The faster-reacting R-amine
5.15 (red) forms a lower-energy transition state that leads to the free enzyme and the R-amide
(E+R). Analogously the S-amide (E+S) forms from the higher-energy E–S transition state (blue)
from the S-amine 5.16. The difference in ΔG‡ is 19.4 kJ/mol and favors the R form. The ΔG‡
difference is based on a combined enthalpic and entropic contribution in which the R form is
enthalpically favored, and entropically disfavored. The S form is enthalpically disfavored but has
an entropic advantage.

Intermediate”). A nucleophilic serine in the catalytic center forms an
acyl–enzyme complex upon hydrolysis of an amide or ester substrate. The protein is
thereby itself converted to an ester through the OH group of the serine, the so-called
acyl form (▶ Sect. 23.2). Such a complex can then react with another nucleophile,
for instance an amine. The amine attacks the internal enzyme ester, the bond to the
serine oxygen atom is broken, and a new amide bond is formed. If a racemic amine
is offered, one enantiomer reacts preferentially. In this
way, the racemate is resolved.
How does the enzyme manage to distinguish between both enantiomers of an
amine? The reaction of (R)- and (S)-phenylethylamine 5.15 and 5.16 with the lipase
from Candida antarctica was carefully investigated (Fig. 5.7). The energy barrier for the
faster-reacting R form is lower than for the slower S form. A more exact evaluation of
the kinetic parameters showed that this is above all due to an enthalpic advantage of
the (R)-amine. The S form has an entropic advantage. Altogether the enthalpic
component predominates so that the free energy of activation (ΔG‡) favors the R form (Fig. 5.7).
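
To get a feeling for what a barrier difference of 19.4 kJ/mol means, it can be converted into a ratio of rate constants. The sketch below assumes transition-state theory with identical pre-exponential factors for both enantiomers, so that k_R/k_S = exp(ΔΔG‡/RT). The temperature of 298.15 K is an assumption; the text does not state the experimental temperature. The function name is illustrative.

```python
import math

R_GAS = 8.314462618  # gas constant in J/(mol K)


def rate_ratio(ddg_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Ratio k_fast/k_slow for a given difference in activation free energy."""
    return math.exp(ddg_kj_per_mol * 1000.0 / (R_GAS * temperature_k))


# Barrier difference from Fig. 5.7; temperature is an assumed 298.15 K.
print(round(rate_ratio(19.4)))
```

Under these assumptions the (R)-amine reacts roughly 2,500 times faster than the (S)-amine. Because the ratio depends exponentially on ΔΔG‡/RT, a barrier difference that is smaller by only a few kJ/mol lowers the selectivity considerably.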
How is this discrimination to be understood? Structural transition-state analogues
were synthesized. In the place of the unstable tetrahedral carbon atom intermediate,
a phosphorus atom was introduced (5.17 and 5.18, Fig. 5.8). This trick gives a stable
compound that is very similar to the transition state form at the carbon atom. These
analogues were synthesized with both enantiomeric amines, and complexes with the
lipase were prepared. Marco Bocola managed to get a crystal structure of both.

Fig. 5.8 Shown in (a) is the phosphorus transition-state analogue 5.18 for the lipase with the
(S)-amine. The crystal structure and simulations indicate that it is less rigidly fixed in the
transition state and rarely adopts the geometry with an H-bond (purple) to histidine (on the
lower edge of the binding pocket) that is necessary for the reaction to occur. The relevant
complex with the transition-state analogue 5.17 of the faster-reacting (R)-amine is shown
in (b). This substrate is highly restricted in the binding pocket. Its methyl group (above
right) is embedded in a small niche in the binding pocket. This substrate exclusively adopts
the geometry with the H-bond to histidine. This orientation is required for a successful
substrate reaction. Therefore, the (R)-amine 5.17 reacts with the enzyme faster.
[Structure drawings of (a) 5.18 and (b) 5.17, with the residues Ser, His224, and Trp104 labeled, omitted.]

Interestingly the transition-state analogue of the faster-reacting R form fits into the
binding pocket well (Fig. 5.8). On the other hand, the S form demonstrated great
residual mobility in the catalytic center. Molecular dynamics simulations
with both forms confirmed the picture: whereas the R analogue has a well-defined
and temporally stable geometry, which is ideal for the reaction, the S analogue is very
mobile and rarely adopts an orientation that is productive for the catalytic reaction in
the lipase. Therefore a successful reaction of this substrate occurs much less often. On
the other hand, the R analogue, fixed in a vice-like clamp and waiting for its reaction,
forms good enthalpic contacts with the enzyme. It takes on a form that is practically
complementary to the enzyme pocket. This results in a large enthalpic advantage. The
fixation has its entropic price though. The methyl group on the stereogenic center
embeds itself in a small niche in the binding pocket. The S analogue does not have
this possibility because its methyl group is oriented in the mirrored direction. In this
case, the anchor that can be embedded in the binding pocket is missing. It has a high
mobility in the catalytic center and does not lose as many degrees of freedom
compared to the situation before enzyme binding. Entropically this is advantageous.
Enthalpically, however, the substrate loses a good interaction and the complementary
fit is rarely achieved. In the end, the enthalpic component prevails so that the
(R)-amine is transformed significantly faster. This is more than enough to ensure
that, in practice, only the (R)-amide is formed in high yield. This lipase can also be
immobilized onto a solid support and loaded into a glass column. After the acyl form
is prepared on the column, a racemic mixture of the amine only needs to be poured
onto the column. The (S)-amine and (R)-amide must then simply be collected in
a flask. If the solvent is well chosen, the amide crystallizes directly from the solution,
and can be mechanically separated.
Interestingly, the enantiopreference of the kinetic resolution is lost with
increasing temperature or enlargement of the enzyme pocket. An enlargement can
be achieved by exchanging a tryptophan along the rim of the catalytic pocket for
a histidine. The higher temperature or increased space in the binding pocket
increases the mobility of both substrates in the lipase. The enthalpic advantage of
the faster-reacting R-amine is lost. The entropic difference of both substrates levels
out under these conditions.
This example shows on a molecular level how a lipase achieves kinetic resolution.
With knowledge of the energetic parameters and structural information, an attempt
can be made to tailor lipases for other transformations. Because of the importance of
such reactions, the targeted design of enzyme catalysts has developed into an ever
more important theme for the synthesis of chiral building blocks in new drugs.

5.5 Differences in the Activity of Enantiomers

Flora and fauna stand out because of their symmetry. Consider the face, the arms and
legs, the ribs, or an orchid flower. The exceptions, for instance a snail shell, are rare or
occur, as in the case of the flounder, only under special evolutionary conditions. The
inner organs of vertebrates are oriented partially paired and partially asymmetrically.
On the molecular level, there is no corresponding symmetry: optically active
building blocks prevail. All specific interaction partners of biologically active
molecules are chiral. Enzymes and receptors are built of L-amino acids. Nucleic
acids are built on a scaffold of D-ribose or D-deoxyribose building blocks. Most
naturally occurring sugars have a D configuration. Important vitamins, hormones,
and messengers exist in an optically homogeneous form. Accordingly, it is to be
anticipated that enantiomers of an optically active ligand have different effects.
This has been proven with many thousands of examples. Enantiomers most often show
significant differences in their efficacy and the quality of their effect.
According to the suggestion of Everhardus J. Ariëns, the more active enantiomer
is referred to as the eutomer, and the less active or inactive enantiomer as the distomer. The
quotient of both affinities or effects is defined as the eudismic ratio, and the
logarithm of this value is called the eudismic index. It should be considered that
this value must be determined on extremely pure compounds. As little as 1% of the
eutomer as an impurity in an entirely inactive distomer can simulate 1% relative
activity in the distomer!
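
The influence of such an impurity can be estimated with a short calculation. The sketch assumes that the eutomer sample is pure, that activities are expressed relative to the pure eutomer (= 1), and that the measured activity of the distomer sample is the fraction-weighted sum of both contributions. The function names are illustrative.

```python
import math


def apparent_eudismic_ratio(true_ratio: float, eutomer_impurity: float) -> float:
    """Eudismic ratio measured when the distomer sample contains eutomer.

    true_ratio: eutomer/distomer activity ratio (math.inf for an inactive
    distomer); eutomer_impurity: fraction of eutomer in the distomer sample.
    """
    distomer_activity = 1.0 / true_ratio
    observed = (1.0 - eutomer_impurity) * distomer_activity + eutomer_impurity
    return math.inf if observed == 0.0 else 1.0 / observed


def eudismic_index(ratio: float) -> float:
    """The eudismic index is the decadic logarithm of the eudismic ratio."""
    return math.log10(ratio)


# 1% eutomer in an entirely inactive distomer caps the measurable ratio at 100.
print(apparent_eudismic_ratio(math.inf, 0.01))
```

By the same estimate, measuring a eudismic ratio on the order of 500,000 requires a distomer sample that contains less than about two parts per million of the eutomer.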
The more the activities of enantiomers in a racemic pair differ, the further the
eudismic ratio departs from 1. Examples of this are given in the compounds
5.20–5.22 (Fig. 5.9). A eudismic ratio of 500,000 was measured for a chloride ion
transporter inhibitor. In this case the chemists pulled out all the stops for the
purification of the less-effective enantiomer. Theoretically, a nanomolar-effective
compound should give even higher values. A few naturally occurring peptide
antibiotics contain D-amino acids. This affords them better metabolic stability.
For the same reason, D-amino acids are incorporated into many synthetic peptide
molecules. In the best cases, a stronger and longer-acting analogue is obtained.
Synthetic analogues of peptides with a retro–inverso configuration represent
a special case. The direction of the peptide chain, or a part of the peptide chain is
reversed in these cases, that is, compared to the original peptide, the amino and
carboxyl groups of single amino acids are reversed. In order to maintain the relative
configuration, D-amino acids or their analogues are used instead of L-amino acids.
In this way it is possible to deceive some enzymes or receptors; they bind the
natural peptide and the retro–inverso peptide in the same way. This is true for
thiorphan 5.23 and its retro–inverso analogue 5.24, for two enzymes, but not for
a third one (Fig. 5.10). As a general rule retro–inverso peptides are metabolically
more stable than their original peptide analogues.
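
On the sequence level, the retro–inverso operation amounts to two steps: reversing the chain direction and inverting every stereocenter. The sketch below assumes the common one-letter notation in which uppercase letters denote L-amino acids and lowercase letters D-amino acids; glycine is achiral and is left unchanged. The function name and the example, Leu-enkephalin (YGGFL), are chosen for illustration and are not taken from the text.

```python
def retro_inverso(sequence: str) -> str:
    """Reverse a one-letter peptide sequence and invert each chiral residue.

    Uppercase = L-amino acid, lowercase = D-amino acid (assumed notation).
    """
    return "".join(
        residue if residue.upper() == "G" else residue.swapcase()
        for residue in reversed(sequence)
    )


print(retro_inverso("YGGFL"))
```

Applying the function twice returns the original sequence. The string manipulation captures only the topology; whether a given retro–inverso peptide is actually recognized by an enzyme or receptor must be determined experimentally, as the thiorphan example shows.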
Enantiomers differ not only in the strength of their effects, but also in their quality.
These differences can manifest as undesirable side effects of the antipode, for
instance with the chiral barbiturate 5.25 (Fig. 5.11). The most severe drug side effects
of the last 50 years were the embryonal malformations brought about by the
sleeping pill thalidomide 5.26 (Contergan®); these were caused by one of the two
enantiomers (Fig. 5.11). In the 1950s, thalidomide was claimed to be the best-
tolerated sleeping pill, with the fewest side effects. In 1957 it was introduced to the
market and was available in pharmacies as an over-the-counter drug. There were no
concerns that even women in the first months of their pregnancies were taking these
sleeping pills. In 1961 it was withdrawn from the market because of its teratogenic

Compound                          Effect, center, or receptor   Eudismic ratio
5.19 Propranolol                  β-Blockade                    100
                                  Membrane effect               1
5.20 Metacholine                  Cholinergic effect            320
5.21 Anticholinergic agent        Ester group center            50–100
                                  Amino alcohol center          2–4
5.22 Butaclamol, (+)-enantiomer   α1 receptor                   73
                                  D2 receptor                   1250
                                  5-HT1 receptor                8
                                  5-HT2 receptor                73
                                  Muscarinic receptor           0.5
[Structure drawings of 5.19–5.22 omitted; stereocenters are marked with asterisks.]

Fig. 5.9 Enantiomers have different biological effects. The eudismic ratio of propranolol 5.19 is
100 for β-antagonism, and for unspecific membrane interaction, it is, expectedly, 1. Identical
partial structures can have entirely different eudismic ratios; compare, for instance, the optical
center of the alcohol moiety of the cholinergic compound metacholine 5.20 with the identical
center on the anticholinergic compound 5.21. Compound 5.21 also proves that the eudismic ratios
of different centers in a compound are independent of each other. The example butaclamol 5.22
also shows that the same substance can have different eudismic ratios on different receptors.

effects. If drug testing had been then what it is today, this catastrophe would certainly
have been recognized earlier and probably largely avoided. It would not, however, have
been prevented by the administration of only one enantiomer. Both enantiomers
racemize in vitro, that is, one converts into the other even in a test tube.

Compound               Enzyme        Ki value (μmol/L)
5.23 Thiorphan         NEP 24.11     0.0019
                       Thermolysin   1.8
                       ACE           0.14
5.24 retro-Thiorphan   NEP 24.11     0.0023
                       Thermolysin   2.3
                       ACE           >10
[Structure drawings of 5.23 and 5.24 omitted.]

Fig. 5.10 Thiorphan 5.23 inhibits the metabolism of enkephalins and contains
a β-mercaptopropionic acid, the absolute configuration of which is analogous to L-phenylalanine.
Application of the retro–inverso concept gives aminothiol 5.24, the absolute configuration of
which corresponds to D-phenylalanine. The identical binding mode to the zinc protease was
determined for both thiorphan 5.23 and retro-thiorphan 5.24. Thermolysin and neutral endopeptidase
24.11 (NEP 24.11, previously referred to as enkephalinase) are inhibited by both compounds to the
same extent. On the other hand, angiotensin-converting enzyme (ACE), another zinc protease,
discriminates decidedly between these substances.

Accordingly, the effect was confirmed in vivo after administration of the suppos-
edly safe enantiomer led to teratogenic effects in an animal model.
The “other” enantiomer can also open new therapeutic opportunities. The
enantiomer of a synthetic opiate, for instance propoxyphene 5.27 (Fig. 5.11) has
weak analgesic and narcotic effects, but good cough-suppressing effects. Enantio-
mers can also influence each other in their effects, and even cancel one another out.
In the case of the calcium channel ligand 5.28, one enantiomer is an agonist and the
other is an antagonist.
In the time period between 1983 and 2002, 38% of all approved drugs were
achiral, 39% were enantiomerically pure, and 23% were racemic or diastereo-
meric mixtures. The fact is that racemic mixtures of chiral drugs were much more
easily accepted in earlier decades than they are today. This was certainly not
caused by a stereophobia on the part of the chemical industry. It was more an
expression of inadequate understanding of the stereospecificity and side effects,
and perhaps also because economic considerations were in the foreground; kinetic
resolution and/or enantiomerically pure syntheses are very expensive. You can
certainly see that the proportion of enantiomerically pure drugs is gaining in the
marketplace (Fig. 5.12).
In the 1970s, Ariëns was the first to decisively come out against the use of
racemic mixtures in therapy. Racemates are, in his view, compounds with 50%
impurity. The non-active or less-active enantiomer is seen as enantiomeric ballast.
He used the diastereomeric mixture labetalol 5.11 (Fig. 5.5, Sect. 5.2) as a showcase

[Figure 5.11: structures of 5.25 N-methyl-5-phenyl-5-propylbarbiturate, 5.26 thalidomide, 5.27 propoxyphene, and 5.28 Bay K 8644; stereocenters are marked with asterisks.]

Fig. 5.11 Enantiomers also differ in their mode of action. The (R)-(–)-enantiomer of barbiturate
5.25 is a hypnotic agent, whereas the (S)-(+)-enantiomer causes seizures. In rats and mice only the
(S)-(–)-enantiomer of thalidomide 5.26 (Contergan®) is teratogenic, that is, it causes
embryopathies. Thalidomide 5.26 racemizes in vitro as well as in rabbits. Therefore even the
(R)-(+)-enantiomer is teratogenic in rabbits. Propoxyphene 5.27 is a potent analgesic, the effect of
which depends on the (2S,3R)-(+)-enantiomer, dextropropoxyphene. The (2R,3S)-(–)-enantiomer
is a cough suppressant. The (R)-(+)-enantiomer of Bay K 8644 5.28 is a weak calcium channel
blocker. The (S)-(–)-enantiomer stabilizes calcium channels in the open form and is therefore an
agonist, that is, a calcium channel opener.

example, which is not a “mixed α,β-antagonist” but rather a mixture of four
different drugs. The effect of this “combination” is a result of the effects of each
enantiomer. In most cases Ariëns’ criticism is fully justified.
In the design and development of new drugs, it must be ensured that the biological
activity is as specific as possible and the side effects are minimal. Compound
uniformity is usually easier to achieve for an enantiomer than for a racemate,
which is a mixture of two substances, or even for a diastereomeric mixture.
The choice of the correct enantiomer can even reduce or prevent undesirable side
effects of metabolites. Selegilin 5.29, a monoamine oxidase inhibitor, is metabo-
lized to the CNS-effective compounds methamphetamine 5.30 and amphetamine
5.31 (Fig. 5.13). Fortunately, the more active enantiomer of 5.29 forms the less active
enantiomers of these two metabolites! If the correct enantiomer of the racemate is used, the desired
effect is increased and the undesired CNS side effects are reduced.
There are also a few counter examples. The (–)-enantiomer of the calcium
channel blocker verapamil (▶ Sect. 2.6) is more effective than the (+)-enantiomer.
The therapeutic spectrum of both enantiomers is practically identical. After
oral application, the (–)-enantiomer is quickly metabolized. Therefore the
(+)-enantiomer contributes substantially to the desired effect. In this case it would
not be economical to try to separate the racemic mixture.

Fig. 5.12 The proportion of achiral, enantiomerically pure, and racemic or diastereomeric drugs
approved in the period from 1983 to 2003. In the meantime, the proportion of newly approved
drugs has shifted decidedly in the direction of enantiomerically pure compounds.

[Figure 5.13: metabolism of selegilin 5.29 to 5.30 (R = CH3) and 5.31 (R = H); stereocenters are marked with asterisks.]

Fig. 5.13 Upon metabolism of the monoamine oxidase inhibitor selegilin 5.29, which is used to
treat Parkinson’s disease, the more potent (R)-(–)-enantiomer is converted to methamphetamine
5.30 and amphetamine 5.31 in their less CNS-active enantiomeric forms. It therefore has less
severe side effects than the less-active (S)-(+)-selegilin, which is metabolized to the more
CNS-active stimulants.

Ibuprofen 5.32, an anti-inflammatory drug of the arylpropionic acid class
(Fig. 5.14 and ▶ Sect. 27.9), is a special case. The potencies of the enantiomers are
very different in vitro. In vivo, however, the inactive (R)-(–)-enantiomer is
converted to a large extent to the (S)-(+)-enantiomer. The reverse reaction does
not take place. Therefore the racemate and each enantiomer are therapeutically
identical, even at the same dose. Only the side-effect spectrum is different because
the inversion of the (R)-(–)-enantiomer is not 100% complete.
Sometimes the effort to produce a pure enantiomer is hardly justifiable. In
such cases the effects and the side effects of both forms must be compared.

[Figure 5.14: (R)-(–)-ibuprofen 5.32 undergoes metabolic inversion to (S)-(+)-ibuprofen 5.32; no inversion occurs in the reverse direction.]

Fig. 5.14 The (R)-(–)-enantiomer of ibuprofen 5.32 undergoes a metabolic inversion of its
stereocenter, and the (S)-(+)-enantiomer is formed. As a cyclooxygenase inhibitor in vitro, the
(S)-(+)-form is more potent than the (R)-(–)-form. The less-active form is converted to the more-
active enantiomer in vivo. Therefore both compounds exhibit equal anti-inflammatory properties
in animal models.

According to the result, in special cases the continued use of the racemate or the
development of an achiral analogue can be considered. At any rate, today these data
must be complete before the drug can receive approval.

5.6 Image and Mirror Image: Why Is It Different for the Receptor?

Enantiomers and diastereomers have different biological characteristics because the
proteins to which they bind have a handedness. They occur naturally in only one
form. The amino acids with their chiral centers and the secondary structural
elements (▶ Sect. 14.2) with their helical rotational direction are responsible for
these properties. If a protein is offered a left or right-handed ligand, different
binding modes are to be expected, just as two right hands come together to shake
hands more easily than a right and a left hand can.
Up to now only a few successful examples of the structure determination of
protein–ligand complexes have been reported with the ligand bound in the left as
well as right-handed form. This is only possible when both enantiomers have
enough affinity for the target protein, that is, they both bind so strongly to the
protein that an X-ray crystal structure could be determined.
The R- and S-enantiomers of the compound BX5633 (5.33) inhibit the serine
protease trypsin (▶ Sect. 23.3) equally well. They have a stereogenic center next to
an acid group. The crystal structure determination explains this lack of discrimina-
tion. The inhibitor’s acid group is oriented outside of the binding pocket so that no
specific interaction is to be expected (Fig. 5.15). A stereopreference cannot exist.
Both enantiomers 5.34 and 5.35 bind to carbonic anhydrase II, a zinc hydrolase
(▶ Sect. 25.7). There is a difference of a factor of 100 in their affinities. As the
X-ray structure with both enantiomers shows, they have similar binding modes
(Fig. 5.16). All properties that relate to the solvation of the ligands must be the same
for both enantiomers. The difference in affinity is therefore only caused by differ-
ences in the binding mode. The sulfonamide groups of both enantiomeric ligands

Fig. 5.15 The (R)- (gray) and (S)-enantiomers (beige) of the inhibitor BX5633 5.33 bind with the
same affinity to trypsin. Because the protein adopts practically the same geometry with both
inhibitors, only one structure is shown. The crystal structure shows that both have almost
identical binding modes. The acid function on the stereogenic center points out of the binding
pocket and into the surrounding aqueous medium. Therefore no stereochemical discrimination
can take place.
[Structure drawing of 5.33 omitted.]

bind almost identically to the catalytic zinc. Further, the endocyclic SO2 group
forms very similar hydrogen bonds to Gln92. The hydrophobic isobutyl side chains
are in similar parts of the binding pockets. The six-membered ring, however, must
adopt a conformation in the case of the more-weakly binding enantiomer that is
highly strained. The price for taking on this strained conformation is paid for in the
reduced binding affinity to the enzyme.
The enantiomeric agonists 5.36 and 5.37 bind in the ligand-binding domain of
the retinoic acid receptor with a difference of a factor of 1,000 (▶ Sect. 28.2). The
receptor itself adopts the same geometry (Fig. 5.17). The alcohol function in the
middle of the molecule is at the stereogenic center. In both cases, the hydrogen
bond to Met272 is formed. As a result, the neighboring amide must take on
a deviating orientation in the binding pocket. On the “right” side, the tetraline
moiety for both stereoisomers is in a similar place. On the “left” side, the benzoic
acid moiety of both enantiomers forms a hydrogen-bond network with Arg278,
Ser289, and Leu233. The fluorine-substituted benzene ring is flipped by 180° in one
complex relative to the other. These different orientations, together with the divergently
oriented amide bond, are responsible for the severe difference in the binding
affinity of the mirror-image agonists.
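
The affinity differences quoted in this section can be translated into differences in binding free energy via ΔΔG = RT ln(ratio). The sketch below assumes a temperature of 298.15 K, which is not specified in the text; the function name is illustrative.

```python
import math

R_GAS = 8.314462618  # gas constant in J/(mol K)


def ddg_from_affinity_ratio(ratio: float, temperature_k: float = 298.15) -> float:
    """Difference in binding free energy (kJ/mol) for a given affinity ratio."""
    return R_GAS * temperature_k * math.log(ratio) / 1000.0


# Factor 100 (carbonic anhydrase II) and factor 1,000 (retinoic acid receptor).
for factor in (100.0, 1000.0):
    print(factor, round(ddg_from_affinity_ratio(factor), 1))
```

Under this assumption, a factor of 100 corresponds to about 11 kJ/mol and a factor of 1,000 to about 17 kJ/mol. Each further order of magnitude in affinity costs roughly 5.7 kJ/mol, an amount that a strained ring conformation or a few altered contacts can account for.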

5.7 An Excursion in the World of Antipodes

Experience has taught us that if an enantiomer crystallizes with a particular auxiliary
base or acid, the other enantiomer will crystallize with the antipode of the
auxiliary in the same way if the identical reaction conditions are applied. Poly-
peptides composed of L-amino acids form right-handed helices, and polypeptides
made of D-amino acids form left-handed helices.

[Chemical structures of the enantiomeric sulfonamides 5.34 and 5.35]

Fig. 5.16 The enantiomeric sulfonamides 5.34 (gray) and 5.35 (beige) bind in a similar way to
the enzyme carbonic anhydrase. Because the protein adopts practically the same geometry with
both inhibitors, only one structure is shown. The zinc ion in the catalytic center (purple sphere) is
coordinated to the sulfonamide groups. The SO2 groups in the six-membered ring form a hydrogen
bond to Gln92 (green). The hydrophobic isobutylamino moieties on the chiral centers project into
a hydrophobic pocket and fill this out to the same extent. In doing this, the six-membered ring must
adopt a deviating conformation in both enantiomers. In one stereoisomer this conformation is
much more strained than in the other, and causes a loss in binding affinity.

Some naturally occurring peptides form ion channels in lipid layers. Their
synthetic antipodes are also able to do this. The more interesting question is: how
does the mirror image of an enzyme behave? In 1992 Stephen Kent and co-workers
prepared HIV protease, a homodimer made up of 2 × 99 amino acids, entirely from
D-amino acids. The naturally occurring protein was also prepared in parallel. The
L-enzyme reacts only with L-peptide substrates and the D-enzyme reacts only with
the all-D enantiomer. The same is true for chiral inhibitors of the HIV-1 protease.
An achiral inhibitor, on the other hand, inhibits both enzymes in the same way.
Rubredoxin, an electron-transport protein, was prepared as the D-protein for the sole
purpose of mixing it with the naturally occurring L-protein to make the racemate!
If the effort involved is considered, this is certainly an approach that takes some getting
used to. The reward for the work was very high-quality crystals. The racemate
crystallized in a centrosymmetric space group (▶ Sect. 13.2), which allowed a better
resolution of the 3D structure than was possible with the natural, all-L enantiomer.

[Chemical structures of (R)-5.36 and (S)-5.37]

Fig. 5.17 The enantiomeric agonists 5.36 (beige) and 5.37 (gray) bind the retinoic acid
receptor with 1,000-fold difference in affinity. Because the protein adopts practically the same
geometry with both ligands, only one structure is shown. Both ligands form H-bonds with their OH
groups to the sulfur in Met272. In doing so, the fluorine-substituted aromatic ring of the benzoic acid
moiety on the left with its central amide bond has to adopt a deviating orientation. The tetrahydro-
naphthalene (tetraline) moiety, on the other hand, is positioned in the same way in both enantiomers.

What does a visit to the mirror-image world look like? Achiral drugs would have
an identical potency and mode of action. On the other hand, many enantiomerically
pure drugs would be useless. We would have to watch out for chiral barbiturates
such as 5.25. They would sooner cause a seizure than act as a sedative. In cases in
which chiral antibiotics were used to treat bacterial infections, it would first have to
be established whether the infecting bacteria came from the mirror-image world or
the normal world. The administration of trimethoprim (▶ Sect. 37.2) and
a sulfonamide (both achiral) would help at any rate.
There would be tremendous problems with nutrition. The carbohydrate and
protein metabolism would not work anymore, nor would the resorption of mono-
mers from the gastrointestinal tract. We would not be able to recognize some plants
by their smell. (R)-Carvone smells of caraway seeds, (S)-carvone smells of spear-
mint. Our beloved sugar would have lost its sweet taste, and fruit juices and
lemonade would taste sour. Coffee, tea, and cola would retain their stimulatory
effects because caffeine is achiral. Diet drinks would have to be sweetened with
saccharin or cyclamate (both achiral) because aspartame is chiral.
Let us return to the normal world! But first, let us have a quick glass of vodka. It
could also be cognac, whisky, or a dry red wine. The taste would be the same as in
the normal world, or would it not? Despite the many hundred flavor components of
wine, the exchange of a single chiral center could have the consequence that
a connoisseur might no longer recognize the chateau. The euphoric effects would
be the same, though this would not be the case for the hard, optically active drugs
such as heroin, cocaine, or LSD.

5.8 Synopsis

• Compounds with an asymmetric or chiral center give rise to enantiomers, two
isomeric forms that relate to each other like an image and mirror image and
cannot be interconverted without breaking and reforming bonds.
• Enantiomers exhibit the same properties as long as they are found in a non-chiral
environment. If exposed to the asymmetric environment of a protein-binding
site, they experience different interactions and thus produce distinct biological
properties.
• Chiral centers are mostly found at atoms carrying four different substituents, but
an overall handed scaffold can also give rise to chirality. If n independent
stereocenters are present, $2^n$ stereoisomers (enantiomers and diastereomers) are
produced, occurring as $2^{n-1}$ racemic mixtures (pairs of equally present enantiomers),
as long as there is no internal inversion, mirror, or improper rotation symmetry present.
• Chiral centers are named according to the Cahn–Ingold–Prelog priority rules that
bring the substituents in a unique sequence according to their atomic numbers.
The substituent with the lowest priority has to be oriented to the back, and the
sense of rotation of the remaining substituents, followed in order of decreasing
priority, determines R/S.
• Enantiomers can be separated by fractional crystallization after being converted
into diastereomeric salts with appropriate chiral auxiliaries. Also enzymes such
as lipases, esterases, or proteases can be used for resolution because they
transform one enantiomer faster than the other for steric and kinetic reasons.
• Most natural products are optically active and occur in just one form. Biologi-
cally active enantiomers are called eutomers, inactive ones distomers.
• Biological activities of enantiomers and diastereomers can vary greatly both in
strength and quality. Application of racemates has to be examined carefully for
each individual case. Side effects, chemical stability, and deviating metabolism
can have decisive influence on the activity profile.
• On the molecular level the affinity discrimination of enantiomers is explained by
deviating binding modes in the binding pocket of the target protein resulting in
differences of the observed interaction pattern or strain of the adopted bound
conformation.
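
The counting rule for stereoisomers stated in the synopsis can be illustrated with a small sketch. It assumes n independent stereocenters and no internal symmetry (no meso forms), labels each center simply as R or S, and takes the enantiomer of a given isomer to be the one with every descriptor inverted. The function names are illustrative and not part of the text.

```python
from itertools import product


def enumerate_descriptors(n):
    """Return all R/S descriptor strings for n independent stereocenters."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return ["".join(combo) for combo in product("RS", repeat=n)]


def enantiomer(descriptor):
    """Invert every center; valid only under the no-symmetry assumption."""
    swap = {"R": "S", "S": "R"}
    return "".join(swap[center] for center in descriptor)


def racemic_pairs(n):
    """Group the 2**n stereoisomers into 2**(n-1) enantiomeric pairs."""
    seen = set()
    pairs = []
    for descriptor in enumerate_descriptors(n):
        if descriptor in seen:
            continue
        mirror = enantiomer(descriptor)
        seen.update((descriptor, mirror))
        pairs.append((descriptor, mirror))
    return pairs


if __name__ == "__main__":
    for pair in racemic_pairs(3):
        print(pair)
```

For three stereocenters the sketch lists eight stereoisomers grouped into four enantiomeric pairs; members of different pairs are diastereomers of each other.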

Bibliography

General Literature

Ariëns EJ, Soudijn W, Timmermans PBMWM (1983) Stereochemistry and biological activity of
drugs. Blackwell Scientific, Oxford
Brown C (ed) (1990) Chirality in drug design and synthesis. Academic, London
Caner H, Groner E, Levy L (2004) Trends in the development of chiral drugs. Drug Discov Today
9:105–110
Eichelbaum M, Testa B, Somogyi A (2002) Handbook of experimental pharmacology, stereo-
chemical aspects of drug action and disposition. Springer, Heidelberg
Holmstedt B, Frank H, Testa B (1990) Chirality and biological activity. Alan R. Liss, New York
Klebe G (2004) Differences in binding of stereoisomers to protein active sites. In: Pifat-Mrzljak
G (ed) Supramolecular structure and function 8. Kluwer Academic/Plenum, New York,
pp 31–53
Smith DF (ed) (1989) CRC handbook of stereoisomers: therapeutic drugs. CRC Press, Boca Raton

Special Literature

Ariëns EJ (1984) Stereochemistry, a basis for sophisticated nonsense in pharmacokinetics and
clinical pharmacology. Eur J Clin Pharmacol 26:663–668
Ariëns EJ (1993) Nonchiral, homochiral and composite chiral drugs. Trends Pharmacol Sci
14:68–75
Ariëns EJ et al (1976) Stereoselectivity and affinity in molecular pharmacology. Fortschr
Arzneimittelforsch 20:101–142
Bocola M, Stubbs MT, Sotriffer C, Hauer B, Friedrich T, Dittrich K, Klebe G (2003) Structural and
energetic determinants for enantiopreferences in kinetic resolution of lipases. Protein Eng
16:319–322
Greer J, Erickson JW, Baldwin JJ, Varney MD (1994) Application of the three-dimensional
structures of protein target molecules in structure-based drug design. J Med Chem
37:1035–1054
Jung G (1992) Proteins from the D-chiral world. Angew Chem Int Ed Engl 31:1457–1459
Klaholz BP, Mitschler A, Belema M, Zusi C, Moras D (2002) Enantiomer discrimination
illustrated by high-resolution crystal structures of the human nuclear receptor hRARγ. Proc Natl
Acad Sci USA 97:6322–6327
Mason S (1986) The origin of chirality in nature. Trends Pharmacol Sci 7: 20–23, and other articles
from the same author on pp. 60–64, 112–116, 155–158, 200–205, 227–230 and 281–285
Stinson SC (1994) Chiral drugs. Chem Eng News, pp 38–72, and 9 Oct 1995, pp 44–74
Stubbs MT, Huber R, Bode W (1995) Crystal structures of factor Xa-specific Inhibitors in complex
with Trypsin: structural grounds for inhibition of factor Xa and selectivity against thrombin.
FEBS Lett 375:103–107
Part II
The Search for the Lead Structure

The starting point in the development of a new drug is the search for an appropriate
lead structure for a target protein. Next, such a target structure within the genome or
proteome must be validated as being relevant as a therapeutic principle. The
production of the pure target structure is possible by using gene technology
methods. After a high-throughput screening assay is established, thousands of test
molecules can be evaluated for binding to the target protein. The X-ray crystal
structure is solved and serves both the search for and the optimization of lead structures.
Without techniques such as bio- and chemoinformatics, molecular modeling,
and computational chemistry, this type of search and optimization is unthinkable
(announcement poster from the research group of the author on the occasion of
a conference in 2003 in Rauischholzhausen, Marburg).
6 The Classical Search for Lead Structures

The starting point in the search for a new drug is the lead structure. Such
a substance already has a desirable biological effect, but some specific characteristics
are still inadequate for its therapeutic use. The definition of the term “lead
structure” also means that analogues can be prepared by targeted chemical varia-
tions which produce compounds better than the lead structure in, for instance, their
potency or selectivity. The goal is the optimization of all characteristics until a final
substance is ready for therapeutic use.
Most of our medicines originate directly or indirectly from natural
products, that is, from plants, animals, or microbial sources, or from endogenous
substances such as hormones and neurotransmitters. Only a few natural products
have become drugs themselves. Examples include morphine, codeine, papaverine,
digoxin, ephedrine, ciclosporin, and hirudin, the latter of which was isolated from
leeches. Examples of endogenous drugs are the thyroid hormone T3, insulin, coag-
ulation factor VIII, erythropoietin, and further proteins for substitution therapy.
Most naturally occurring compounds serve as lead structures. They are chemically
manipulated with the goal of optimizing their desirable characteristics and mini-
mizing their side effects (▶ Chap. 8, “Optimization of Lead Structures”). Examples
are found in the many natural products and endogenous receptor agonists that have
been modified into selective agonists and antagonists (▶ Sects. 6.2, ▶ 6.3, ▶ 6.4, and
▶ 6.6). Drugs are also derived from enzyme substrates (▶ Sect. 6.6 and ▶ Chaps. 23,
“Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic
Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”;
▶ 26, “Transferase Inhibitors”; ▶ 27, “Oxidoreductase Inhibitors”). These can be
substrates of endogenous enzymes, for instance, those that play a role in blood pressure
regulation or inflammation, or substrates of enzymes from viruses, bacteria,
or parasites whose metabolism is to be specifically shut down.
In the last 100 years preparative organic chemistry has played a decisive role not
only in the systematic variation of lead structures but also in lead structure discovery.
The search for new active substances has delivered many drugs that have no structural
relationship to endogenous examples. In other cases, the relationship between the
biological effect and the mode of action was clarified long after their discovery.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_6,
© Springer-Verlag Berlin Heidelberg 2013

6.1 How It Began: Hits by In Vivo Screening

The first example of discovering an active principle through testing occurred in the
eighteenth century, and is found in the effects of digitalis. The Scottish physician,
William Withering, while working in England, was consulted by a patient who
suffered from an extremely weak heart. After the doctor was unable to help him, the
patient consulted a gypsy woman, who prescribed a herbal therapy. Impressed by
the recovery of the patient, Withering sought out the woman and asked for the
recipe. He received it in exchange for a handsome fee. The mixture contained an
extract of the (poisonous) purple foxglove, Digitalis purpurea. The physician
investigated the potency of different preparations of this plant by giving
the medicines to 163 patients! With this experiment, he established that the best
formulation was made up of the dried, powdered leaves. After the observation was
made that a toxic dose is quickly reached, he recommended that diluted
preparations be administered in repeated doses until the desired effect was
achieved. Even though digitalis is still used today for congestive heart failure,
no one would recommend that Withering’s experimental technique be used to
establish the therapeutic potential of a substance. This approach was neither
ethical nor practical.

6.2 Lead Structures from Plants

The example of the previous section shows that nature has furnished plants with
highly potent substances. A plethora of secondary metabolites, for example, alka-
loids, terpenes, flavones, and glycosides are also available. The contents of about
a hundred different plant species have either directly or indirectly, in the form of
analogues, found their way into human therapy. Traditional medicines use about
5,000–10,000 of the several hundred thousand already known species from the rich
plant kingdom. Morphine, caffeine, quinine, cocaine, ephedrine, coniine, atropine,
and reserpine were already mentioned in ▶ Sect. 1.1. Further plant-based pharma-
ceuticals that are used in therapy, or that have served as lead structures for the
development of medicines are compounds 6.1–6.7 (Fig. 6.1), and, in addition,
emetine, pilocarpine, podophyllotoxin, and the vinca alkaloids vinblastine and
vincristine.
Why do plants contain so many valuable therapeutic compounds? There is no
human-related answer because plants did not evolve so that they could become
human medicines. The plants, however, had to respond to their environment, and
a competition with other species occurred. The decisive disadvantage of being
a plant is that it cannot run away! That is not a disadvantage when it comes to
reproduction. Bees take care of the first part, and aerodynamic seeds help with the
rest. An effective protective mechanism against, for instance, fungal infection and
pests such as caterpillars, sheep, and cattle served as a selection advantage for
some plants. The substances that offer an advantage taste bitter, hot, or are
toxic. They exert their effects by interacting with the enzymes or receptors

[Chemical structures: 6.1 Tubocurarine; 6.2 Papaverine; 6.3 Digitoxin (R = H); 6.4 Digoxin (R = OH); 6.5 Paclitaxel; 6.6 Artemisinin; 6.7 Huperzine A]

Fig. 6.1 Natural products from plants that have been introduced to therapy or have served as lead
structures include, in addition to the substances introduced in ▶ Sect. 1.1, tubocurarine (curare)
6.1, papaverine 6.2, digitoxin 6.3, digoxin 6.4, and the related cardiac glycosides. Newer natural
products from plants with great therapeutic potential include paclitaxel (Taxol®) 6.5 for tumor
therapy, artemisinin 6.6 for malaria therapy (▶ Sect. 3.3), and the acetylcholinesterase inhibitor
huperzine A 6.7 for the potential treatment of Alzheimer’s disease.

of the “enemy.” The stronger the effect, the better the protection. A successful
principle of evolution is the development of defensive substances that do not kill,
but cause an unpleasant experience for the predator, which in turn teaches the
enemy to stay away. That is how butterflies survive that accumulate poisonous
plant-based substances in their bodies, and even those others that just imitate the
appearance of these butterflies. After the first experience with the poisonous
species, birds give both species a wide berth.
Plant substances have already undergone a selection process on biologically
relevant proteins; during the course of evolution they have “seen” receptors and
binding sites. Further, the course of their biosynthesis takes place in the binding site
of a protein, that is, they have functionality that mediates affinity to a protein.
Certainly, there are many plant substances that coincidentally have a biological effect
in humans. Morphine contains a basic nitrogen, a phenolic hydroxyl group, an ether
bridge, and a hydrophobic domain: a medicinal chemist would also choose such
a mixture of functional groups, without the complicated ring structure, in the
conception of an active substance.
The isolation of natural products from plants for lead discovery has been valued
very differently over the last decades. Large pharmaceutical companies
have repeatedly started ambitious programs to elucidate the mechanism of action of
traditional medicines, only to abandon the area again disappointed. The disappoint-
ments are a result of an unfavorable relationship between effort and reward. All too
often only a toxin is isolated instead of a valuable lead structure, and all too often an
already-known principle is found. Nonetheless, the search continues. Nature offers
structural variation that the chemist can only dream of.

6.3 Lead Structures from Animal Venoms and Other Ingredients

In contrast to the plants, the evolution of animal venoms occurred with the objective
of subduing prey or defending against an enemy. Many of these substances are
proteins, peptides, and alkaloids. They function as potent poisons that can quickly
paralyze or kill a victim. Because of this, many active substances from animals are
unsuitable for therapy, but others, for the exact same reason, are interesting lead
structures. Animal products offer many surprises, as illustrated in the following two
examples.
Despite its simple structure, epibatidine 6.8 (Fig. 6.2), which was isolated from
the Ecuadorian poison dart frog Epipedobates tricolor, is a 100-fold more-potent
analgesic than morphine! It does not affect the opiate receptor, but rather it is an
agonist at the nicotinic acetylcholine (nACh) receptor (▶ Sect. 30.4). That comes as
no surprise when its structural similarity to nicotine 6.9 is considered. Epibatidine
has a binding constant of 0.04 nM on the nACh receptor, which is 50-fold stronger
than nicotine. Unfortunately, its analgesic effects are coupled with a pronounced
body temperature reduction (hypothermia).
Dolastatin-15 6.10 (Fig. 6.2) was isolated from the wedge sea hare, Dolabella
auricularia, a marine snail. It is an interesting lead structure for antitumor com-
pounds. Synthetic analogues of 6.10 cause the complete disappearance of tumors in
some animal models. The diversity of marine animals in particular has historically
been a rich source of new and interesting lead structures and modes of action.

[Chemical structures: 6.8 Epibatidine; 6.9 Nicotine; 6.10 Dolastatin-15; 6.11 Tetrodotoxin; 6.12 Batrachotoxin]

Fig. 6.2 Epibatidine 6.8, a non-opiate analgesic that binds 50-fold more potently to the nicotinic
acetylcholine receptor than nicotine 6.9, comes from a South American frog (▶ Sect. 30.4).
Dolastatin-15 6.10, which was isolated from a marine snail, is an interesting lead structure for
cancer therapeutics. The toxin of the fugu fish, tetrodotoxin 6.11, is not a lead structure but rather
a sodium channel blocker for experimental (in vitro) use. The steroid alkaloid batrachotoxin 6.12 is
the most potent animal venom known. The LD50 value in mice, that is, the dose necessary to kill
50% of the experimental animals within 24 h, is 200 ng/kg.

Other animal substances have gained importance in experimental pharmacology.
Among them are the poison of the notorious fugu fish, tetrodotoxin 6.11, and the
steroid alkaloid batrachotoxin 6.12 from the skin of the Colombian poison dart frog
(Fig. 6.2). Whereas tetrodotoxin specifically blocks sodium channels, batrachotoxin
stabilizes sodium channels in the open form.
Peptides from snake venom made a decisive contribution to the development of
the antihypertensive angiotensin-converting enzyme inhibitors (▶ Sect. 25.4).
Research in the area of thrombin inhibitors has in recent years turned toward
the active ingredient of leech saliva, hirudin. Aside from the direct use of hirudin,
longer-acting derivatives, shorter peptides that only bind on the fibrinogen-binding
site, and protein conjugates with other thrombin inhibitors have been derived from
the structure.

Animal and human proteins as well as polymeric carbohydrates are extraordinarily
important for substitution therapies. Insulin (isolated from the pig pancreas)
is at the top of the list, followed by aprotinin, a protease inhibitor
(isolated from cattle lungs), digestive enzymes, and the coagulation inhibitor
heparin. Now that the possibility of the gene-technological production of insulin
is available, its isolation from animal organs has become less important. Other
proteins, for example, the erythrocyte-stimulating hormone erythropoietin
(▶ Sect. 29.8), human growth hormone, tissue plasminogen activator (tPA), urokinase,
and factor VIII, are all manufactured by using gene technology nowadays
(▶ Sect. 32.1). In this way, these proteins are available in practically unlimited
quantities.
The protease ancrod, isolated from the venom of the Malayan pit viper
Agkistrodon rhodostoma, cleaves the precursor of fibrin, fibrinogen, to a product
that can no longer aggregate. Thus the viscosity and the coagulation ability of the
blood are reduced (▶ Sect. 23.4). An elevated thrombosis risk can be significantly
reduced through this mechanism. To isolate the active component of this venom,
several hundred snakes have to be “milked” regularly.

6.4 Lead Structures from Microbial Organisms

When speaking of active substances from microorganisms, antibiotics must be
mentioned first. The β-lactams penicillin and cephalosporin (▶ Sects. 2.4 and
▶ 23.7) are highlighted as particularly valuable lead structures. Aside from oral
bioavailability, the therapeutic goals were broad-spectrum activity and metabolic
stability. Tetracycline 6.13 (Fig. 6.3) was also intensively structurally modified. It
attacks the ribosome during protein biosynthesis (▶ Sect. 32.6). Other microbial
antibiotics, for instance, streptomycin 6.14, are used directly in therapy.
The immunosuppressants ciclosporin A (▶ Sects. 4.7 and ▶ 10.1), FK 506, and
rapamycin also originated from microorganisms. Ciclosporin A is a convincing
example of how difficult it is to predict the potential of a new therapeutic substance.
Sandoz almost abandoned its development because of “lack of market potential.”
This decision would have had fatal consequences because a large portion of the
success of transplantation surgery today can be attributed to this substance. Instead,
ciclosporin became one of the company’s best-selling products.
The fungus Claviceps purpurea, which grows in grain (ergot, Secale cornutum),
contains toxic alkaloids. For hundreds of years, the consumption of bread that had
been made from contaminated flour was the cause of severe poisonings. The
structures of these alkaloids, for example, ergotamine 6.15 (Fig. 6.3), were in
large part elucidated at Sandoz. Their systematic modification led to active sub-
stances for many indications, e.g., for inducing contractions during labor, migraine
therapy, perfusion disorders, and arterial hypertension. Today they have little
importance because of their limited therapeutic index. Another representative of
this class is the hallucinogen lysergic acid diethylamide (▶ Sect. 2.5), which was
discovered by accident.

[Chemical structures: 6.13 Tetracycline; 6.14 Streptomycin (R1 = −CH2OH, R2 = −NHCH3); 6.15 Ergotamine; 6.16 Asperlicin; 6.17 Devazepide]

Fig. 6.3 Penicillins, cephalosporins (▶ Sects. 2.4 and ▶ 23.7), and tetracycline 6.13 were impor-
tant lead structures for even better antibiotics. In contrast, streptomycin 6.14 is used in therapy
itself. Ergotamine 6.15 is a typical representative of the ergot alkaloids, from which a plethora of
different drugs have been derived. Likewise, asperlicin 6.16 is a structurally complex microbial
natural product. The 10,000-fold more potent derivative devazepide 6.17 was derived from it.

Lovastatin and some analogues (▶ Sects. 9.2 and ▶ 27.3) are exceedingly
important therapeutic substances that were isolated from microorganisms; they
interfere in the biosynthesis of cholesterol. Cholecystokinin (CCK) is a peptide
hormone that acts at a G protein-coupled receptor (▶ Sect. 29.1). It induces
multifaceted effects in the central nervous system and gastrointestinal tract. The
non-peptide CCK antagonist asperlicin 6.16 (IC50 = 1.4 μM) originated from
extracts of Aspergillus alliaceus. After intensive structural variation, the much
simpler devazepide 6.17 (IC50 = 80 pM) was designed, which has more than
10,000-fold better affinity to the CCK receptor (Fig. 6.3). This antagonist is orally
bioavailable and is an appetite stimulator.
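
The size of this improvement can be checked with a short calculation. The sketch below reads the quoted IC50 values as 1.4 µM for asperlicin and 80 pM for devazepide, which is an interpretation of the printed units, and simply divides one by the other. The function and constant names are illustrative only.

```python
def fold_improvement(ic50_reference_molar, ic50_optimized_molar):
    """Return how many times lower the optimized IC50 is than the reference IC50."""
    if ic50_reference_molar <= 0 or ic50_optimized_molar <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_reference_molar / ic50_optimized_molar


# Values as read from the text; the unit reading is an assumption.
ASPERLICIN_IC50 = 1.4e-6   # 1.4 micromolar, asperlicin 6.16
DEVAZEPIDE_IC50 = 80e-12   # 80 picomolar, devazepide 6.17

if __name__ == "__main__":
    ratio = fold_improvement(ASPERLICIN_IC50, DEVAZEPIDE_IC50)
    print(f"{ratio:,.0f}-fold")
```

With these values the ratio comes out at roughly 17,500, in line with the statement that the affinity improved by more than 10,000-fold.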
The enzyme streptokinase for the dissolution of blood clots, and bacterial
collagenase for wound treatment are examples of therapeutically important proteins
that were isolated from microorganisms.

6.5 Dyes and Intermediates Lead to New Drugs

In 1903, Paul Ehrlich investigated hundreds of dyes in mice that had been infected
with trypanosomes. The result of this research was Nagana Red, the first drug against
infection with Trypanosoma brucei, the causative agent of cattle trypanosomiasis. Other
dyes followed, as did colorless compounds that contained amide instead of azo
groups. It was only in 1916, after Ehrlich’s death, that Bayer, having investigated
more than a thousand analogues, produced its wonder drug suramin
(Germanin®) 6.18 (Fig. 6.4). The work in this area led to the discovery of the
antibacterial sulfonamides in the 1930s (▶ Sect. 2.3). Thousands, if not tens of
thousands, of analogues were synthesized and tested. Many were introduced to the
market. Depending on the structure, they cover an extraordinarily broad spectrum
of different pharmacokinetic characteristics.
No actual biological activity was expected from the synthetic intermediates.
They were seen merely as starting material for the desired end product. Despite this,
many intermediates were routinely tested for biological activity, and it was a good
thing too!

[Chemical structure: 6.18 Suramin]

Fig. 6.4 Bayer’s suramin 6.18, which is also known as E 205 or Germanin®, had strategic
importance for the colonies. An English engineer who was suffering from African sleeping
sickness (trypanosomiasis) and was near death despite aggressive treatment with diverse antimony
and arsenic preparations was cured after a few injections of this substance. The solvent for
the preparation of the intravenous injection solution was rain water in the tropical clinical trials(!).
After a short time, suramin was considered to be a “wonder drug.” Despite the fact that the
structure was kept secret, French researchers worked out their own synthesis within a short time.
Suramin is still used for the treatment of trypanosomiasis because it has good efficacy and a long-lasting
effect.

[Chemical structures: 6.19 Thiacetazone; 6.20 Isoniazid (R = −NH-NH2); 6.21 Isonicotinic acid (R = −OH); 6.22 Nicotinic acid]

Fig. 6.5 Thiacetazone 6.19 and isoniazid 6.20 are tuberculostatics that originated as synthetic
intermediates. Isoniazid penetrates the cell wall and irreversibly binds to the enzymatic cofactor
NADH after radical generation. The originally accepted hypothesis that, upon metabolic
degradation to isonicotinic acid 6.21, it acts as an antimetabolite for nicotinic acid 6.22, proved
to be incorrect.

Gerhard Domagk, the discoverer of sulfonamides (▶ Sect. 2.3), investigated just
such a synthetic intermediate in addition to the many end-target substances and
found a surprisingly good effect against tuberculosis. Structural optimization
afforded thiacetazone 6.19 (Fig. 6.5), which unfortunately turned out to be
hepatotoxic. In the search for a follow-up substance, Bayer started a concerted
program with 5,000 compounds. In 1951 another synthetic intermediate showed
surprisingly potent tuberculostatic activity. Isoniazid 6.20 (Fig. 6.5) was 15 times
more active than the best antituberculosis antibiotic at the time, streptomycin 6.14
(Fig. 6.3). The discovery was in the air. Two other research groups, both in the USA,
simultaneously and independently discovered the effect of this substance, which,
upon enzymatic radical generation, irreversibly binds to the cofactor NADH of
a fatty-acid-synthesizing enzyme of the tuberculosis bacillus. The hypothesis that
metabolic cleavage yields isonicotinic acid 6.21, which in turn exerts its effect by acting
as an antimetabolite of nicotinic acid 6.22 (Fig. 6.5), was evidently wrong.
Inhibitors of the enzyme dihydrofolate reductase, for instance, methotrexate 6.23
(Fig. 6.6), are used in the treatment of leukemia (▶ Sect. 27.2). During the inves-
tigation of analogues, a simple synthetic intermediate, mercaptopurine 6.24 was
tested. It showed efficacy, but was too toxic. The further development delivered
azathioprine 6.25, which releases mercaptopurine in the organism (Fig. 6.6). As an
immunosuppressive, azathioprine was even better than the then-used corticoste-
roids (▶ Sect. 28.5). Until the introduction of ciclosporin (▶ Sect. 10.1) it was used
in all organ transplantations. Another intermediate from this class, allopurinol 6.26
(Fig. 6.6), is a xanthine oxidase inhibitor. It is used for the treatment of gout.

6.6 Mimicry: How to Copy Endogenous Ligands

As of the middle of the nineteenth century, biological substances, enzyme substrates,
neurotransmitters, and hormones were increasingly being used as

[Structures: 6.23 Methotrexate; 6.24 Mercaptopurine; 6.25 Azathioprine; 6.26 Allopurinol]

Fig. 6.6 Simple synthetic intermediates to methotrexate 6.23 turned out to be new drugs.
Mercaptopurine 6.24 and azathioprine 6.25 are immunosuppressants, and allopurinol 6.26 is
used to treat gout.

[Scheme: substrate; transition state; groups that imitate transition states (X = -CH2-, -NH-, -O-; X = -CF3, -CF2-, -Aryl)]

Fig. 6.7 Examples of substrate, transition state, and groups that imitate the enzymatic transition
state of an amide hydrolysis reaction. A few of the groups reversibly form covalent bonds to the
serine in the catalytic pocket of a serine protease (see ▶ Sect. 23.2).

archetypes for new medicines. The directed design of drugs from these lead
structures led to the “golden age” of pharmaceutical research (▶ Sect. 1.4).
The principal approach is demonstrated here on the example of enzyme inhib-
itors. Enzymes catalyze chemical reactions in that they stabilize the transition state
of the reaction. In doing so, they decrease the activation energy, and the reaction
can proceed at a lower temperature (▶ Sect. 22.3). This specificity can be exploited
particularly well for the optimization of enzyme inhibitors. By starting with knowl-
edge of the reaction mechanism, substrate groups are assembled that are structurally
analogous to the transition state (Fig. 6.7). They imitate it but do not lead to

[Scheme: adenosine 6.27 is converted by adenosine deaminase to inosine 6.28 via a hypothetical transition state of the enzyme reaction; structures of pentostatin 6.29, nebularine 6.30, and the hypothetical active form of 6.30]

Fig. 6.8 Pentostatin 6.29 and nebularine 6.30 inhibit the enzymatic transformation of adenosine
6.27 to inosine 6.28. Compound 6.29 binds 7 orders of magnitude more tightly than the substrate
adenosine (Ki = 2.5 pM), and the active form of 6.30 is even 10 orders of magnitude more potent
(Ki = 0.3 pM). The structures of pentostatin as well as the active form of nebularine correspond to
the transition state of the enzymatic reaction.

a product. In this way, in a single step, through an entirely purposeful chemical
transformation, a substrate can be converted into a potent and selective inhibitor.
The correct inhibitor binding geometry improves the affinity by several orders of
magnitude. The two natural products pentostatin 6.29 and nebularine 6.30 (Fig. 6.8)
are inhibitors of the enzymatic transformation of adenosine 6.27 to inosine 6.28 and
impressive examples of transition-state mimetics. The introduction of a hydroxyl
group with the correct stereochemistry increased the affinity of the ligand to the
enzyme by many orders of magnitude.
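The phrase "orders of magnitude" can be translated into binding free energy with the standard thermodynamic relation between an inhibition constant and the free energy of binding (ΔG = RT ln Ki). The following minimal Python sketch assumes a temperature of 298.15 K; the function name is chosen for illustration only and does not come from the text.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)

def delta_delta_g(orders_of_magnitude, temperature_k=298.15):
    """Change in binding free energy (kJ/mol) when the affinity
    improves by a factor of 10**orders_of_magnitude.
    Negative values mean tighter binding."""
    return -R_KJ * temperature_k * math.log(10) * orders_of_magnitude

for n in (1, 7, 10):
    print(f"{n:2d} orders of magnitude: {delta_delta_g(n):7.1f} kJ/mol")
```

Under this assumption each order of magnitude in affinity corresponds to a little less than 6 kJ/mol, which illustrates how much binding energy a single well-placed hydroxyl group must contribute in the examples above.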
Never before was the search for new drugs as successful as it was in the two to
three decades of the “golden age.” Subsequently the success rate fell. Research
became more expensive and laborious. How is this explainable? Because of the
success during this period, many indication areas achieved a very high standard of
care. That makes it difficult for modern research to be as successful as before, even
with the use of superior tools. Other reasons include higher requirements for
efficacy and safety.

6.7 Side Effects Indicate New Therapeutic Options

Many drugs came from the observation of side effects during clinical or practical
use (see ▶ Sect. 2.8). The diuretic effects of mercury compounds were discovered
purely by accident (▶ Sect. 30.9). In 1919 physicians in the First Medical Univer-
sity Hospital in Vienna were testing a new treatment for syphilis. It was observed in
a 21-year-old woman that her urine production increased from 200–500 mL a day to
1.2–2.0 L on the third day of treatment with the test substance. This result led to the
development of the first effective diuretic (medicine to increase urine production).
Fortunately, we are no longer dependent on extremely toxic mercury compounds
for the therapy of venereal disease or as diuretics!
In 1948 it was observed in vulcanization factories that the antioxidant disulfiram
6.31 (Fig. 6.9) caused workers to become intolerant of alcoholic drinks. This
discovery led to the use of the substance for the treatment of chronic alcoholism.

[Structures: 6.31 Disulfiram; 6.32 Iproniazid; 6.33 Dicoumarol; 6.34 Warfarin; 6.35 Penicillamine]

Fig. 6.9 Tetraethylthiuram disulfide 6.31 or disulfiram, better known as Antabuse®, is an
aldehyde dehydrogenase inhibitor. The accumulation of the toxic acetaldehyde leads to nausea.
Iproniazid 6.32, a simple derivative of isoniazid 6.20 (Fig. 6.5), is a monoamine oxidase inhibitor
(▶ Sect. 27.8). It acts as an antidepressant by prolonging the effects of the biogenic amines. The rat
poison warfarin 6.34 is derived from dicoumarol 6.33. Even though the coagulation parameters
must be closely monitored, it is still the standard of therapy for diseases that are coupled with
a thrombosis risk, for example, heart attack or stroke. Penicillamine 6.35 is a complexation agent
for heavy metals; it is used, among other indications, for the treatment of Wilson’s disease, which
is an inherited disease that leads to the accumulation of copper in the tissues. It was only later that
its efficacy in chronic rheumatic diseases was discovered.

The metabolic intermediate of ethanol, acetaldehyde, is not metabolized any
further. This leads to generalized poisoning symptoms such as nausea, palpitations,
and cold sweats. The effect is, however, difficult to control. Alcohol consumption
after treatment has occasionally been fatal.
A classic example of the discovery of an important indication by observing side
effects can be found in the sulfonamides. The sulfonamide diuretics and the oral
antidiabetics (▶ Sect. 30.2), drugs of choice to treat certain forms of diabetes, were
found in this way (▶ Sect. 8.4).
Iproniazid 6.32 (Fig. 6.9) is a derivative of isoniazid 6.20 (Fig. 6.5). In 1957
a tuberculosis patient noticed a distinctive mood brightening, which led to its broad
use for the treatment of chronic depression. The substance had to be withdrawn
from the market a few years later due to severe side effects (▶ Sect. 27.8).
Sweet clover has been used in Europe to feed livestock for hundreds of years.
During its introduction in the 1920s to the USA and Canada, it was initially stored
inappropriately, with disastrous consequences. Massive bleeding and fatalities in
the cattle were attributed to the spoiled sweet clover (i.e., hemorrhagic sweet clover
disease). The active substance, dicoumarol 6.33 (Fig. 6.9), was introduced into
therapy in 1942, but its effects were unreliable. The Wisconsin Alumni Research
Foundation investigated 150 analogues and produced warfarin 6.34, which was sold
as a rat poison. The name is derived from the company’s acronym WARF, and the
ending “arin” from coumarin. In 1951 an American soldier attempted suicide with
a high dose of warfarin. Because he survived, a clinical trial was initiated. Despite
the need for frequent and tight control of the coagulation values, treatment with
warfarin is the standard therapy today after a heart attack or stroke.
Penicillamine 6.35 (Fig. 6.9) provides an example of an important indication exten-
sion. It was introduced for the treatment of Wilson’s disease, an inherited metabolic
disease that leads to copper accumulation in tissue. Because 6.35 forms complexes well,
it is also appropriate for the treatment of heavy-metal poisonings. It was only later, after
its practical use, that its much larger importance as a basis therapy for rheumatic disease
was recognized. The mechanism of action remains largely unclear.

6.8 From the Traditional Search to the Screening of Large Compound Libraries

The approaches that are described in the previous sections are still used in industrial
pharmaceutical research today. Because of the enormous costs associated with the
development of drugs, the search for original lead structures is an increasingly
important goal. Large sums are paid for novel therapeutic approaches, test models,
or 3D structures of target proteins. This information can lead to an advantage over
the competition that indeed takes time to realize, but must be zealously defended
and brought to fruition.
According to the principle of risk diversification and the maximal exploitation of
all imaginable resources, today pharmaceutical companies subscribe to a strategy of
broadly established screening of huge substance libraries of plant extracts,

microbial fermentations, and synthetically prepared compounds. The last category
comes from in-house chemistry as well as purchased compounds and combinatorial
substance libraries (▶ Chap. 11, “Combinatorics: Chemistry with Big Numbers”).
Furthermore, a large part of the search for new lead structures nowadays takes place
by computer methods.
The identification of therapeutically relevant target proteins plays an
ever-increasing role in the discovery of new lead structures. The elucidation of the
human genome (▶ Sect. 12.3) has delivered the sequences of all human proteins.
By comparing the expression pattern between diseased and healthy cells, it is
possible to recognize particular proteins as a cause or consequence of a given
pathology (▶ Sect. 12.8). Should such a protein be detected, the next steps
are certain. The therapeutic concept is tested on a genetically modified animal
(▶ Sect. 12.5), or the gene is silenced (▶ Sect. 12.7), a molecular test system is
established, and the 3D structure of the protein is elucidated. In parallel, all
available techniques for lead structure search are employed. Because this process
chain is being carried out with increasingly high throughput, the capacity for lead-
structure searching must be constantly extended.
Many companies try to simultaneously develop chemically unrelated lead struc-
tures for the same indication. The elaborateness of the animal models for the
preclinical profiling and the preparations for clinical testing require so much
labor and expense that it seems hardly justifiable to start such a program with
only one compound class. Risk minimization and distribution are required for the
search as well as the development of a medicine. Techniques that are used for the
detection of new lead structures are presented in the next chapter (▶ Chap. 7,
“Screening Technologies for Lead Structure Discovery”).

6.9 Synopsis

• Many active substances originate from natural products found in plants, animals,
and microbial sources. Their mode of action has been copied as an active
principle for the development of drugs.
• Endogenous substances such as hormones and neurotransmitters also served as
references for drug development.
• Only a few natural products became drugs themselves.
• Usually targeted chemical variations are required to optimize a lead for meta-
bolic stability, half-life, or selectivity to be ready for therapeutic use.
• Plants contain many valuable therapeutic compounds usually developed as an
effective protective mechanism against all sorts of enemies.
• Nature offers a tremendous body of structural variations; however, ambitious
programs to elucidate mechanisms of action of traditional medicines all too often
only isolate toxins and discover already-known principles.
• Animals have developed venoms as offensive or defensive mechanisms to be
used for predation or against enemies. They are mostly proteins, peptides, or
alkaloids that either kill or paralyze a victim.

• Snake venoms served as references for the development of anti-hypertensive
drugs; active principles to block blood clotting (e.g., by leeches or bats) were
turned into active ingredients for anticoagulation drugs.
• Proteins for substitution therapy (such as insulin, erythropoietin, factor VII) are
manufactured by gene technology.
• Microorganisms have provided leads for antibiotics (e.g., penicillins), which had
to be optimized for oral availability, broad-spectrum activity, and metabolic
stability.
• The immunosuppressant ciclosporin A, a cyclic peptide; ergotamine, a toxic alka-
loid in ergot; lovastatin, an inhibitor of cholesterol biosynthesis; or streptokinase to
dissolve blood clots, are successful drugs originating from microorganisms.
• Dyes and many synthetic intermediates produced in chemical industry were
investigated for biological effects and provided important compound classes
such as the sulfonamides.
• Small but essential structural changes of endogenous ligands transform enzyme
substrates, neurotransmitters, and hormones into successful drugs.
• Many drugs originated from clinical observations of side effects during practical
use, for instance, the anti-diabetic effect of sulfonyl ureas from the observation
of side effects of sulfonamides.
• To exploit all imaginable resources to discover leads today, huge substance
libraries of plant extracts, microbial fermentations, and libraries of synthetically
prepared compounds are screened.

Bibliography

General Literature
Burger A (1983) A guide to the chemical basis of drug design. Wiley, New York
Sneader W (1990) Chronology of drug introductions. In: Hansch C, Sammes PG, Taylor JB (eds)
Comprehensive medicinal chemistry. vol 1, Kennewell PD (ed). Pergamon, Oxford, pp 7–80
Verg E (1988) Meilensteine. 125 Jahre Bayer 1863–1988. Bayer AG, Leverkusen

Special Literature
Badio B et al (1994) Epibatidine: discovery and definition as a potent analgesic and nicotinic
agonist. Med Chem Res 4:440–448 and other works (Special journal edition dedicated to
Epibatidine)
Buss AD, Waigh RD (1995) Natural products as leads for new pharmaceuticals. In: Wolff M (ed)
Burger’s medicinal chemistry and drug discovery. Wiley, New York, pp 983–1033
Hylands PJ, Nisbet LJ (1991) The search for molecular diversity (I): natural products. Ann Rep
Med Chem 26:259–269
Pettit GR et al (1993) Isolation of dolastatins 10–15 from the marine mollusc Dolabella
auricularia. Tetrahedron 41:9151–9170
Suffness M (1993) Taxol: from discovery to therapeutic use. Ann Rep Med Chem 28:305–314
Tempesta MS, King SR (1994) Ethnobotany as a source for new drugs. Ann Rep Med Chem
29:325–330
7 Screening Technologies for Lead Structure Discovery

In the last chapter, examples were presented of how lead structures can be discovered
by purposefully searching, particularly by using examples from nature or compounds
with known modes of action. Even if a large number of natural products and synthetic
substances are available, it is not always easy to filter the active molecules out and to
assess their value for a given indication. This requires a time- and cost-intensive sorting
or screening of enormous substance libraries. “Screening” means the more or less
specific biological testing of compounds. Although today molecular test systems and
cell culture models are used almost exclusively, the cost for testing a compound is
between US $2 and US $5. Because typically millions of compounds are tested,
a screening campaign can easily cost several million US dollars.
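The cost estimate above is simple arithmetic, which the following minimal Python sketch makes explicit. The per-compound costs are taken from the text; the library size of one million compounds and the function name are assumptions made purely for illustration.

```python
def campaign_cost(n_compounds, cost_per_test_usd):
    """Total cost in US dollars of testing every compound once."""
    return n_compounds * cost_per_test_usd

library_size = 1_000_000  # hypothetical library size
low = campaign_cost(library_size, 2.0)
high = campaign_cost(library_size, 5.0)
print(f"Estimated campaign cost: US ${low:,.0f} to US ${high:,.0f}")
```

Repeated testing to validate hits, as described for the first screening phase, would add to this figure.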
The screening process can be divided into three phases. First there is an
automatic introductory screening, which is usually carried out by robots and
encompasses libraries of millions of compounds. The first substances that show
an interaction are identified as “hits” that have to be validated by repeated testing.
Next, a more detailed screening follows, with which the chemical space around the
identified compounds is explored. The goal is to establish a structure–activity
relationship (▶ Chap. 18, “Quantitative Structure–Activity Relationships”) and to
improve the pharmacological and physicochemical properties (▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”). Along the
way, lead structures (so-called “leads”) are discovered. Then in the last phase the
lead optimization takes place through detailed biological testing, through which
a drug candidate is selected for clinical testing (▶ Chap. 8, “Optimization of Lead
Structures”). How can we find appropriate hits from the enormous amount of test
candidates that have the potential to be developed into a medicine? The question is
answered by screening for biological effects.

7.1 Screening for Biological Activity by HTS

The prerequisite for a large-scale screening was the development of in vitro test
systems as a surrogate for animal experiments. The first were carried out on

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_7, 129


# Springer-Verlag Berlin Heidelberg 2013

isolated enzymes and membrane homogenates for receptor-binding studies.
Later gene technology (▶ Sect. 12.6) made sufficient quantities of pure proteins
available for the development of molecular test systems. This offered the
advantage that homogeneous proteins, preferentially human proteins, could be
tested.
In the mid-1990s, automated test systems with an extremely high capacity
(high-throughput screening, HTS) led to a tremendous boom. The discovery of
candidates for drug development is now attempted by using the entire methodo-
logical repertoire of biochemistry in a test tube. Meanwhile it is known how to
reprogram cells and organisms so that the function of single genes is highlighted.
The special trick with all of these test methods lies in translating the molecular
effect into a macroscopically visible signal.
Despite the enormous effort that is associated with HTS, and a hit rate that does
not always justify it, HTS is here to stay in pharmaceutical research. There are
always interesting lead structures to be found in this way (▶ Chaps. 23, “Inhib-
itors of Hydrolases with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Pro-
tease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”; ▶ 26,
“Transferase Inhibitors”; ▶ 27, “Oxidoreductase Inhibitors”; ▶ 28, “Agonists
and Antagonists of Nuclear Receptors”; ▶ 29, “Agonists and Antagonists of
Membrane-Bound Receptors”; ▶ 30, “Ligands for Channels, Pores, and Trans-
porters”; ▶ 31, “Ligands for Surface Receptors”; ▶ 32, “Biologicals: Peptides,
Proteins, Nucleotides, and Macrolides as Drugs”). A weakness may be the
limited diversity of synthetic substances, compared with the structural com-
plexity of plant and microbial metabolites. Another limitation of in vitro test
systems is that neither the entire effect spectrum nor many other effects such as
transport, distribution, metabolism, and excretion (▶ Chap. 19, “From In Vitro
to In Vivo: Optimization of ADME and Toxicology Properties”) can be
assessed.
The composition of suitable screening libraries is exceedingly critical. Fre-
quently molecules and test candidates are used that were prepared during the
course of other drug-development projects. As such, these molecules already
have the size of a typical drug. Usually only modest, almost always micromolar
binding to the test receptor is found. To improve the properties of such a hit, it
must be structurally modified. As a general rule, this is accomplished by adding
more chemical groups. This means that the molecular weight can quickly reach or
exceed 500–600 Da, which is considered to be the upper threshold for good
bioavailability (▶ Sect. 9.1). The optimization of such a screening hit therefore
means that the size must be reduced first, so that it can be increased again during
a goal-oriented optimization. Yet the size reduction often comes with a loss in
binding. Therefore the criterion “ligand efficiency” was introduced to judge
a screening hit’s optimization potential. For this, the binding affinity of the hit is
considered in relation to its number of non-hydrogen atoms. Small substances
that have good binding in relation to their size are seen as particularly
promising candidates for an optimization program.
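The text does not give a formula for ligand efficiency. A commonly used definition, assumed here, divides the free energy of binding by the number of non-hydrogen atoms. The following Python sketch applies it to two hypothetical hits; the dissociation constants, atom counts, and function name are invented for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ligand_efficiency(kd_molar, n_heavy_atoms, temperature_k=298.15):
    """Ligand efficiency in kcal/mol per non-hydrogen atom,
    assuming LE = -dG / N with dG = RT ln(Kd)."""
    delta_g = R_KCAL * temperature_k * math.log(kd_molar)
    return -delta_g / n_heavy_atoms

# Hypothetical examples: a drug-sized hit and a small fragment-like hit
drug_sized = ligand_efficiency(kd_molar=10e-6, n_heavy_atoms=35)
small_hit = ligand_efficiency(kd_molar=200e-6, n_heavy_atoms=12)
print(f"drug-sized hit: {drug_sized:.2f} kcal/mol per atom")
print(f"small hit:      {small_hit:.2f} kcal/mol per atom")
```

In this sketch the smaller compound binds more weakly in absolute terms but contributes more binding energy per atom, which is the sense in which small, efficient binders are regarded as the more promising starting points.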

7.2 Color Change Demonstrates Activity

Important target proteins for drug development are proteases and esterases, which
are enzymes that cleave peptide and ester bonds (▶ Chaps. 23, “Inhibitors of
Hydrolases with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhib-
itors”; ▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”). How can their enzy-
matic activity be visualized? One prepares synthetic substrates that are similar to
the natural substrate. They carry, however, a para-nitroanilide or a para-
nitrophenolate group coupled by a peptide or ester bond (Fig. 7.1). When the
enzyme cleaves this substrate, yellow nitrophenolate or nitroanilide is released,
and the absorption properties of the solution change measurably. This
is observed spectroscopically. If then, during screening, a compound acts as an
inhibitor, the enzymatic cleavage of the synthetic substrate is more or less
suppressed, and the yellow color is diminished. In this way the inhibition potency
of test substances can be determined (Fig. 7.1).
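How a color change is turned into a number can be sketched as follows. This minimal Python example assumes that the rate of absorbance increase at 405 nm has been measured for the uninhibited enzyme, for a blank without enzyme, and for a well containing a test compound. All values and names are hypothetical.

```python
def percent_inhibition(rate_with_compound, rate_uninhibited, rate_blank=0.0):
    """Percent inhibition from absorbance slopes (e.g., dA405/min).

    rate_uninhibited: enzyme plus substrate, no test compound
    rate_blank:       substrate only (non-enzymatic background)
    """
    window = rate_uninhibited - rate_blank
    if window <= 0:
        raise ValueError("uninhibited rate must exceed the blank rate")
    return 100.0 * (1.0 - (rate_with_compound - rate_blank) / window)

# Hypothetical slopes in absorbance units per minute
inhibition = percent_inhibition(rate_with_compound=0.033,
                                rate_uninhibited=0.120,
                                rate_blank=0.004)
print(f"{inhibition:.0f}% inhibition")
```

Subtracting the blank corrects for substrate that hydrolyzes without the enzyme; whether such a correction is needed depends on the particular assay.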

[Scheme: enzymatic cleavage of a peptide (p-nitroanilide) or ester (p-nitrophenyl) substrate with release of RCOO− and the yellow, mesomerically stabilized anion; detection at 405 nm]

Fig. 7.1 A p-nitrophenolate or a p-nitroanilide group is added to the terminus of a natural protease
or esterase substrate. The enzyme cleaves the p-nitrophenolate or p-nitroanilide, which becomes
visible as a yellow-colored mesomerically stabilized anion (absorption maximum at 405 nm). If
a competitive inhibitor is added along with the substrate to the enzyme, the cleavage reaction rate
is suppressed depending on the binding strength. This is apparent by the more or less strong yellow
color of the solution and can be quantitatively measured.

A broad palette of chromophoric reactions could be developed that are
suitable for the characterization of enzymatic activity. Many enzymes, for example,
dehydrogenases, need NAD(P)H as a natural cofactor, which is subsequently
oxidized to NAD(P)+ (▶ Sect. 27.1). Because the NAD(P)H starting material, in
contrast to the product, absorbs at 340 nm, the progress of the enzymatic reaction
can be followed at this wavelength. As a variation, two enzymatic reactions can be
coupled to one another. This possibility is interesting when the substrate that is
easily spectroscopically followed is produced in an upstream reaction. In this case
the reaction of the enzyme of interest is not actually directly observed. Rather, the
activity of interest is registered based on the consumption of the upstream reaction
products in the subsequent enzyme reaction. Although absorption spectroscopic
assays are preferred for technical reasons, tests that are based on the reaction of
radiolabeled compounds play an even more important role. The activity of kinases
is, for example, followed by using 32P-labeled adenosine triphosphate. The terminal
phosphate group of the labeled substrate is transferred by the kinase to the protein
that is to be phosphorylated (▶ Sect. 26.3). The incorporation rate serves as a measure of the
kinase activity. Receptor-binding studies are carried out with a known radioactively
labeled ligand. The assay investigates to what extent test compounds can displace
the radioactively labeled ligand from the receptor-binding site. This type of test
does not necessarily represent a functional assay though. Agonistic and antagonistic
binding (▶ Chaps. 28, “Agonists and Antagonists of Nuclear Receptors” and ▶ 29,
“Agonists and Antagonists of Membrane-Bound Receptors”) must still be
distinguished.
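The NAD(P)H readout at 340 nm described above can be converted into a reaction rate with the Beer-Lambert law. The following Python sketch is a minimal illustration. The molar absorption coefficient of 6220 M^-1 cm^-1 is a commonly cited literature value that does not appear in the text, and the path length of 1 cm is likewise an assumption (microtiter plates have shorter, volume-dependent path lengths).

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, commonly cited value (assumption)

def nadh_rate_um_per_min(delta_a340_per_min, path_length_cm=1.0):
    """Convert an absorbance slope at 340 nm into the rate of
    NAD(P)H consumption in micromolar per minute.

    A falling absorbance (negative slope) gives a positive rate.
    """
    molar_per_min = -delta_a340_per_min / (EPSILON_NADH_340 * path_length_cm)
    return molar_per_min * 1e6

# Hypothetical measurement: absorbance falls by 0.0622 per minute
rate = nadh_rate_um_per_min(-0.0622)
print(f"NAD(P)H consumption: {rate:.1f} micromolar per minute")
```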

7.3 Getting Faster and Faster: More and More Compounds by Using Less and Less Material

Antibodies play an important role in assay development. The enormous specificity
of antibody–antigen interactions can be exploited as a highly sensitive system
(▶ Sect. 32.3). In classical immunoassays, either the release of a radioactively
labeled substance is followed (Radioimmunoassay, RIA), or an enzymatic reaction
is provoked (enzyme-linked immunosorbent assay, ELISA). The latter technique
has enjoyed a distinctly larger application range, mostly because radioactivity is
best avoided as a measured quantity. Because they only recognize a single molec-
ular species, immunoassays are not only highly specific but also versatile.
Screening techniques are optimized to be automated and miniaturized. Driven
by the desire for higher capacity, these tests are hardly ever carried out in 96-well
(8 × 12) microtiter plates anymore. The wells of these plates hold a reaction volume
of about 0.3 mL. In the meantime 384-well (16 × 24) microtiter plates are used or
even 1536-well (32 × 48) plates, the volumes of which are only a few microliters
per well. The aggregation behavior of hydrophobic test compounds poses a large
problem. The aqueous buffer solutions that are used for these assays can cause these
compounds to aggregate. This aggregation generates hydrophobic surfaces, on
which the proteins can adsorb. The concentration of free protein is reduced,

which can make it appear as though the protein is strongly inhibited. The addition
of detergents can reverse this effect.
By using a sophisticated robot system, 100,000 assays a day can be carried
out. This leads to an enormous flood of data to be evaluated. The reduced test
volume has the advantage that much less material is consumed. Furthermore, the
measurements can be carried out quickly. At the same time the sample manipu-
lation has become ever more difficult. One only has to consider the evaporation
of such small amounts of solution, the enormously increasing logistics of
comprehending so much data in parallel, or the reproducibility of the results,
and the necessary sensitivity to measure weak signals with certainty to appreciate
the difficulty.
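The logistics mentioned above can be made tangible with a short calculation. This Python sketch uses the plate formats and the capacity of 100,000 assays a day given in the text; the library size of one million compounds is hypothetical, and for simplicity it assumes one compound per well and no wells reserved for controls.

```python
import math

def plates_needed(n_compounds, wells_per_plate):
    """Number of microtiter plates if every well holds one compound."""
    return math.ceil(n_compounds / wells_per_plate)

def screening_days(n_compounds, assays_per_day=100_000):
    """Days needed to test each compound once at the given capacity."""
    return math.ceil(n_compounds / assays_per_day)

library_size = 1_000_000  # hypothetical library size
for wells in (96, 384, 1536):
    print(f"{wells:4d}-well format: {plates_needed(library_size, wells):6d} plates")
print(f"Duration at 100,000 assays a day: {screening_days(library_size)} days")
```

In practice some wells on every plate are reserved for reference measurements, so the real plate count would be somewhat higher.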
In order to improve this last aspect, ever more sensitive detection procedures are
used. Fluorescence measuring techniques are particularly sensitive. In the sim-
plest case, a fluorescing substrate such as coumarin (▶ Sect. 14.6) is incorporated in
the place of para-nitroanilide. The protein–ligand binding can also be followed by
fluorescence anisotropy (or polarization). A known ligand is coupled to
a fluorophore and excited with polarized light. The emitted fluorescence is in this
case also polarized. In the time that the excited molecule can freely diffuse in
solution, the extent of the induced polarization decreases. Because a small molecule
can diffuse much faster than a big one, its polarization signal decreases much faster
than if it were bound to a large protein. This measurable difference reflects the
slower diffusion of the ligand once it is bound to the large protein.
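A fluorescence anisotropy readout is usually calculated from the emission intensities measured parallel and perpendicular to the polarization of the exciting light. The following Python sketch uses the standard definition of anisotropy. The instrument correction factor (G factor), the intensity values, and the reference anisotropies of free and bound ligand are hypothetical, and the simple linear estimate of the bound fraction assumes that binding does not change the brightness of the fluorophore.

```python
def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy r = (I_par - G*I_perp) / (I_par + 2*G*I_perp)."""
    total = i_parallel + 2.0 * g_factor * i_perpendicular
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (i_parallel - g_factor * i_perpendicular) / total

def fraction_bound(r_observed, r_free, r_bound):
    """Fraction of labeled ligand bound to the protein (linear estimate)."""
    return (r_observed - r_free) / (r_bound - r_free)

# Hypothetical intensities and reference anisotropies
r = anisotropy(i_parallel=1200.0, i_perpendicular=900.0)
bound = fraction_bound(r, r_free=0.05, r_bound=0.25)
print(f"anisotropy = {r:.3f}, fraction bound = {bound:.2f}")
```

A test compound that displaces the labeled ligand from the protein lowers the bound fraction and therefore the measured anisotropy.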
Even better sensitivity can be achieved with so-called FRET measuring
techniques (fluorescence resonance energy transfer). A resonance energy transfer
can occur between donor and acceptor fluorophores with overlapping emission and absorption spectra if both
are separated by no more than 50 Å. If, for example, a phosphatase assay is desired,
a phosphorylated peptide substrate must be coupled with a covalently bound donor
fluorophore. The substrate is added with the test compound. Depending on how
potent the inhibiting test compound is, the enzyme’s activity is reduced, and less
substrate is cleaved. Then an antibody is added that binds to the phosphorylated
substrate. The antibody is also coupled to a fluorescence acceptor, the absorption
maximum of which overlaps with the emission spectrum of the donor fluorophore.
If a fair amount of phosphorylated substrate is still present, that is, the test
compound is a potent inhibitor, the spatial proximity of the donor and acceptor
leads to a strong FRET signal. This can be quantitatively measured.
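The strong distance dependence behind the FRET signal follows from the Förster relation, in which the transfer efficiency falls off with the sixth power of the donor-acceptor distance. The Python sketch below illustrates this. The Förster radius of 50 Å is an assumed, typical value chosen to match the distance quoted above; real values depend on the dye pair.

```python
def fret_efficiency(distance_angstrom, forster_radius_angstrom=50.0):
    """Energy-transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)

for r in (25.0, 50.0, 75.0, 100.0):
    print(f"r = {r:5.1f} A -> E = {fret_efficiency(r):.3f}")
```

At the Förster radius half of the excitation energy is transferred. Doubling the distance reduces the efficiency to a few percent, which is why only antibody-bound substrate contributes appreciably to the signal.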
In the meantime, progress in assay miniaturization allows the detection of single
molecules. This is possible by using fluorescence correlation spectroscopy (FCS).
A confocal laser microscope irradiates approximately a femtoliter of test solution.
If a single fluorophore diffuses through the volume of interest, it causes a time-
resolved fluctuation in the fluorescence signal. An exact analysis of these signals
delivers information about the concentration and diffusion constants. The diffusion
velocity, on the other hand, depends on whether the fluorescence-marker-labeled
substance is bound to a protein or not. If the proteins as well as the ligands are
tagged with different markers, the association and dissociation can even be
followed.
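Why a femtoliter observation volume gives access to single molecules can be checked with a short calculation of the average number of molecules present. In this Python sketch the concentration of 1 nM is a hypothetical example; the volume of one femtoliter is taken from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_in_volume(concentration_molar, volume_liter):
    """Average number of molecules in the observed volume."""
    return concentration_molar * volume_liter * AVOGADRO

n = molecules_in_volume(concentration_molar=1e-9, volume_liter=1e-15)
print(f"Average number of molecules in 1 fL at 1 nM: {n:.2f}")
```

At nanomolar concentrations there is on average less than one labeled molecule in the focus, so each passage of a fluorophore produces a distinct fluctuation of the signal.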

7.4 From Binding to Function: Testing in Entire Cells

The binding of a ligand to a protein says nothing about the concomitant function or
change in function. Often it is easy to relate the observed inhibition in an enzyme
assay to a function. The correlation is less obvious with receptors and ion channels
(▶ Chaps. 28, “Agonists and Antagonists of Nuclear Receptors”; ▶ 29, “Agonists
and Antagonists of Membrane-Bound Receptors”; ▶ 30, “Ligands for Channels,
Pores, and Transporters”). If the biochemical pathways and cell cycle regulation
are considered, it becomes even more complex to assign function for enzymes. This
correlation is not so easily reproduced in a test tube. Therefore assays must also be
developed to study function that allow the response of an entire cell to be measured
upon ligand binding. It is possible to culture cells from many different tissues, which
then allows the study of tissue-specific receptors.
Typically the activity of ion channels can be investigated by using binding tests or
radioactive assays. The so-called patch–clamp technique allows the influence of
a drug candidate to be even better characterized. An electrode is attached to the surface
of a cell, and a voltage or current is applied. In this way the opening or closing of single
channels can be registered, particularly when a test molecule is added. This technique
certainly cannot match the capacity of the high-throughput techniques. It is
better used to elucidate the function of hits from a prescreening. Fluorescence methods
are more popular for the first step. As an example, Ca2+-channel function can be
assessed by measuring an increase in intracellular calcium levels by using a dye that
fluoresces in the presence of calcium ions.
Other tests employ the coupling to a reporter gene. Receptor stimulation
initiates a signaling cascade that, for some receptors, leads to the transcription of
gene products that are controlled by the relevant promoters (▶ Sect. 28.1). If the
sequence of the relevant gene is replaced with that of a reporter, such as β-
galactosidase, luciferase, or green fluorescent protein (GFP), then these proteins are
produced by the cell instead. This can subsequently be observed as an easily
detectable signal (Fig. 7.2). As examples, if the produced β-galactosidase cleaves
X-gal, a blue dye is released, luciferase develops an ATP-dependent chemilumines-
cence, and the green-fluorescent protein is detectable because of its own intrinsic
fluorescence.

7.5 Back to Whole-Animal Models: Screening on Nematodes

Primary substance testing on animals as it was once carried out is ethically
unjustifiable today. Further, an animal model is not predictive for target-oriented
optimization. Nevertheless it does have advantages. The reaction of an entire
organism to a substance is immediately transparent, the bioavailability is directly
measured, and side effects as well as synergistic effects are straightaway obvious.
Back in 1963, Sydney Brenner recognized the complexity of molecular biology in
that he emphasized the biochemical control of cellular development. He proposed
that the pinworm (the nematode Caenorhabditis elegans) would be the simplest

[Fig. 7.2, schematic: DNA with the promoter for gene A and gene A; preparation of the
construct, in which the GFP gene is placed behind the promoter for gene A; cell penetration;
test model, in which activation by active substances produces GFP, whose emission (hν) is
the registered signal.]

Fig. 7.2 Genes are controlled by promoters. Promoter-initiated gene activation leads to the
synthesis of the relevant protein. By using green fluorescent protein (GFP), an easily observed
assay can be constructed based on this principle. For this the gene promoter that is activated by
agonist binding is coupled to the GFP gene. Activation of the promoter then delivers not the
original gene product, but rather GFP. The presence of GFP is easily observed
because of its fluorescence upon excitation with ultraviolet light.

multicellular organism to investigate. This nematode normally lives in soil and
feeds on bacteria. It can also be easily cultured in microtiter plates and fed with
Escherichia coli bacteria. It is a hermaphrodite, has a short lifespan, reproduces
within 3 days, can be conserved in liquid nitrogen, is transparent, and
homologous genes have been found in humans for 60–80% of its genes. The
roundworm genome has been sequenced, and we now understand how to easily
manipulate it. Because it is transparent, any internal changes can be easily observed,
for instance, with proteins that have been tagged with fluorescent markers. Its 959
somatic cells form many different organs, including a nervous system with 302
neurons. Can substance testing be carried out in such a life form? The ethical
threshold may be set lower in this case. But then, how predictive would any tests
be? Can such an animal be used to predict mood changes, depression, or appetite
and its relation to obesity? This is only possible if the causes of these diseases are
known on the molecular level, for example, a defect caused by an altered serotonin-
mediated signaling. In such a situation the worm can serve as a model. A first step
toward the discovery of a potential target is selective gene silencing. This is
possible by using RNA interference (▶ Sect. 12.7). If the roundworm (nematode) is
exposed to a substance library, it is possible to see a change in appearance or
behavior. Is the life expectancy lengthened or shortened? These are indications that
the compounds could interfere with the aging process or are toxic. If there are
changes in muscle cells, perhaps it might be useful for neurodegenerative muscle

disease. Aside from macroscopic changes in the body form, changes in the gene
expression pattern can also be analyzed (▶ Sect. 12.9). Are mutations in proteins
apparent? Certainly the worm does not have the same metabolic pathways as we do.
Even its disease models only partially represent the pathophysiology that is seen in
human disease. Nonetheless the direct testing of compounds on the roundworm seems
to afford a new perspective for screening substance libraries. As an alternative, the
fruit fly (Drosophila melanogaster) or the zebrafish (Danio rerio) are also available
as test organisms. They help to test the validity of a therapeutic approach early in
a program.

7.6 In Silico Screening of Virtual Libraries

As described in the previous section, experimental high-throughput screening
(HTS) has been automated with great effort. When fed with compounds from
combinatorial chemistry (▶ Chap. 11, “Combinatorics: Chemistry with Big Numbers”),
several hundred thousand substances can be scanned by using HTS. At first
it seemed that this would be the end of all rational structure-based techniques. In
view of the enormous financial investment and the disappointingly low hit rate, the
initial euphoria soon waned. Therefore, as an alternative, the technique
of searching huge databases on the computer by fitting small molecules into
a predefined binding pocket (docking, ▶ Sect. 20.8) was developed (virtual
screening).
The unsatisfactory hit rate from HTS is attributed to the size, structural diversity,
and poorly selected composition of the substance library with respect to the actual
properties of the target protein. The recognition of false-positive and
false-negative hits in biological systems causes large problems. Disappointing hit
rates have been reported for the translation of initial hits into potential lead
structures for lead optimization. This is all the more reason to attempt to develop
virtual screening techniques into a complementary and alternative method. The
prerequisites for the successful use of these techniques are entirely different
from those of the technology-driven HTS: virtual screening can only reasonably be
applied if the factors that are responsible for a putative drug to bind to its target
protein are understood on the molecular level.
The starting point for this is the spatial structure of the target protein, which is
usually determined by NMR spectroscopy (▶ Chap. 13, “Experimental Methods of
Structure Determination”) or X-ray structure analysis (Fig. 7.3). Models can be
increasingly derived from structurally homologous proteins of known geometry
(▶ Sect. 20.5). To successfully bind to a protein, the ligand must adopt a shape that
is complementary to the binding pocket. Molecules are flexible and can change
their shape through bond rotations that require very little energy (▶ Chap. 16,
“Conformational Analysis”). In addition to spatial fit in a suitable conformation,
the functional groups of a potential ligand must find complementary functional
groups in the binding pocket of the protein. Hydrogen bonds must be formed
between ligand and protein, and hydrophobic molecular portions must find their

[Fig. 7.3, schematic: panels a–h of the virtual screening cycle, including a “Computer
Screening” step; see caption.]

Fig. 7.3 The spatial structure of a protein is the starting point for virtual screening (a). The binding
pocket is explored with a variety of different probe atoms, for instance, for hydrogen bond acceptors
or donors (b). Regions that are particularly favorable for such interacting groups are highlighted on
the computer graphics. If the “hot spots” in these areas are summarized, a spatial pattern of properties
that a potential ligand should have becomes apparent (c). This pattern is called a “pharmacophore” and
serves as the search criterion for a database retrieval (d). Potential ligands from a large database are
filtered and energetically evaluated by docking (e). The found hits are either commercially available
or synthesized in the laboratory (f). Next biological testing takes place (g), and if the binding is
successful, the lead structure is crystallized with the protein. The subsequent structural determination
(h) serves as a starting point for further design cycles.

counterpart in the protein (▶ Chap. 4, “Protein–Ligand Interactions as the Basis for
Drug Action”). For this, the protein binding pocket is analyzed to highlight the
areas that are essential for binding.
For a particular atom type, for instance, a hydrogen-bond donor or acceptor, the
binding pocket is systematically scanned. By using computer graphics, it is possible
to see where functional groups attached to a candidate ligand might be optimally
placed (▶ Sect. 17.10). The composite picture of all such placed atom types in the
binding pocket that are indicated by this analysis reveals a spatial pattern of
physicochemical properties that a ligand must meet to successfully bind to the
protein (“hot spots” ▶ Sect. 17.1 and ▶ 17.10). With these criteria in hand,
a molecular database can be searched that is composed of already-synthesized
compounds or compounds that have been virtually assembled on the computer.

In case a hit from the latter group is found, the compound can be subsequently
synthesized. The search is divided into multiple filtering steps that become increas-
ingly stringent and sophisticated with successive reduction of the search quantity.
With the help of fast docking programs (▶ Sect. 20.8), molecules are fitted into the
binding pocket and a binding geometry is generated, from which the expected
binding affinity can be estimated. This step is the decisive one, but unfortunately it
is also the most difficult (▶ Sect. 20.9). In ▶ Chap. 21, “A Case Study: Structure-
Based Inhibitor Design for tRNA-Guanine Transglycosylase”, examples are
presented that were found by virtual screening.
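
The filter cascade described above can be sketched in a few lines of Python. This is only a minimal illustration: the compound records, the property names, the thresholds, and the `dock_score` field are hypothetical placeholders and are not taken from the text or from any particular docking program.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float       # molecular weight in Da
    h_bond_donors: int
    h_bond_acceptors: int
    dock_score: float       # hypothetical docking score, more negative = better


def property_filter(c):
    """Step 1: fast, coarse filter on simple properties (assumed thresholds)."""
    return c.mol_weight <= 500 and c.h_bond_donors <= 5


def pharmacophore_filter(c):
    """Step 2: stand-in for a pharmacophore match. The pocket is assumed to
    require at least one donor and two acceptors."""
    return c.h_bond_donors >= 1 and c.h_bond_acceptors >= 2


def docking_filter(c, cutoff=-7.0):
    """Step 3: most stringent step, keeps only well-scored binding poses."""
    return c.dock_score <= cutoff


def run_cascade(library, filters):
    """Apply the filters in order and record how many compounds survive."""
    survivors = list(library)
    counts = [len(survivors)]
    for keep in filters:
        survivors = [c for c in survivors if keep(c)]
        counts.append(len(survivors))
    return survivors, counts


# Hypothetical mini-library
library = [
    Compound("cpd-A", 320.0, 2, 4, -8.1),
    Compound("cpd-B", 610.0, 1, 6, -9.0),
    Compound("cpd-C", 280.0, 0, 3, -7.5),
    Compound("cpd-D", 410.0, 3, 2, -5.2),
    Compound("cpd-E", 450.0, 1, 5, -7.4),
]

hits, counts = run_cascade(
    library, [property_filter, pharmacophore_filter, docking_filter]
)
print(counts, [c.name for c in hits])
```

The cheap filters run first so that the expensive docking step only has to deal with the compounds that remain.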
The evaluation of the generated binding geometries is accomplished with suffi-
cient accuracy in about 70% of cases nowadays. An improvement in predictive
power requires that we understand the ligand–protein recognition process better
(▶ Chap. 4, “Protein–Ligand Interactions as the Basis for Drug Action”). The role
of water in the binding, the induced steric and dielectric adaptation, the plastic
behavior and residual mobility of proteins and bound ligands, and the dynamic
changes during complex formation are still poorly understood. The composition of
the databases themselves plays a decisive role in the search’s success. Enlarging the
database alone is not enough. The enrichment of the compounds that could fulfill
the requirements is crucial. Screening is often compared to the search for a needle
in a haystack. When looking for such a needle, it is not helpful to simply double the
size of the haystack! The haystack must be spiked with more promising needles. To
achieve this, all available knowledge about the structure, function, and dynamic
behavior of the target protein must be used to define the database search. Compar-
isons between proteins and protein binding pockets, especially among members of
the same protein family, can offer decisive information (▶ Sects. 20.3, ▶ 20.4,
▶ 20.5, ▶ 20.6). In principle, all of the data that are needed about the composition
of a suitable compound library for a virtual screening are already intrinsically coded
in the structure and geometric interaction properties of the binding pocket. It is only
a question of applying it correctly. Another decisive criterion for a hit is an adequate
pharmacokinetic profile so that satisfactory bioavailability can be achieved
(▶ Chap. 19, “From In Vitro to In Vivo: Optimization of ADME and Toxicology
Properties”).
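
The haystack argument can be illustrated with a short calculation. The numbers below are invented for illustration only; the sketch merely shows that doubling a library of unchanged composition leaves the hit rate untouched, whereas a smaller library enriched in suitable compounds raises it.

```python
def hit_rate(n_actives, n_total):
    """Fraction of compounds in a library that are active."""
    return n_actives / n_total


def enrichment_factor(focused_rate, baseline_rate):
    """How much more often a hit is found in the focused library."""
    return focused_rate / baseline_rate


# Hypothetical libraries
baseline = hit_rate(10, 100_000)    # original haystack
doubled = hit_rate(20, 200_000)     # twice the size, same composition
focused = hit_rate(10, 5_000)       # smaller library, enriched in "needles"

ef = enrichment_factor(focused, baseline)
print(baseline, doubled, focused, ef)
```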

7.7 Biophysics Supports Screening

Surface plasmon resonance techniques are being increasingly used to screen for
new lead structures. For this a target molecule is anchored onto the gold-coated
surface of a sensor chip. The underside of a glass carrier is irradiated with light
(Fig. 7.4). Changes in the refractive index, which are measured as a shift in the
resonance angle of the totally reflected light, are a measure of the change in mass on the sensor surface.
If a compound binds, the resulting change in mass on the gold surface can be
registered. Because the technique is fast and time resolved, other kinetic parameters
such as the association or dissociation rate constants of the binding event can be
measured in addition to the stoichiometry. One problem associated with screening

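[Fig. 7.4, schematic: light source, polarized light, prism, sensor chip with gold film,
flow channel with target protein and test ligand, reflected light, and detector. Plots:
intensity versus angle for states I and II, and a sensorgram of the resonance signal versus
time with the kon and koff phases.]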

Fig. 7.4 The principle of surface plasmon resonance (SPR). The method registers changes in the
refractive index on the surface of a sensor chip (green). The changes on the gold surface
that are caused by the binding of the substrate molecule (yellow) to an anchored receptor (red) lead
to a shift in the resonance angle of the reflected light (I and II). In this way, not only the binding affinity
but also the kinetic association (kon) and dissociation (koff) parameters are measured.

in microtiter plates is the huge amount of time that is needed to load the plate with
compounds. One way around this bottleneck is to apply the entire compound library
to a sensor chip in a microarray format by using spraying techniques. This means
that all the low-molecular-weight ligands are now anchored on the chip. If a test receptor
protein is added to such a chip, a mass difference is detected where the protein
binds. Because of the spatial resolution of the chip, it can easily be determined
which library compound is responsible for the interaction. The disadvantage of the
method is that the test compounds must be attached with a chemical anchor that
allows them to be immobilized on the chip surface. Surface plasmon resonance has
meanwhile achieved a sensitivity that allows the detection of even very small test
compounds with a mass as small as 100 Da. Therefore the approach can be
reversed: Now the protein is immobilized at the surface and ligand binding from
solution can be recorded.
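
The link between the association and dissociation rate constants and the binding affinity can be made concrete with a simple model. The following Python sketch is an illustration only: the 1:1 binding model, the function names, and all numerical values are assumptions and are not taken from the text.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant Kd = koff / kon (in mol/L)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Signal during the association phase of a 1:1 binding model."""
    k_obs = k_on * conc + k_off                    # observed rate constant, 1/s
    r_eq = r_max * conc / (conc + k_off / k_on)    # plateau at equilibrium
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Signal decay after wash-out; t is counted from the start of wash-out."""
    return r_start * math.exp(-k_off * t)


# Hypothetical values for illustration only
k_on = 1.0e5      # association rate constant, 1/(M s)
k_off = 1.0e-2    # dissociation rate constant, 1/s
conc = 1.0e-6     # ligand concentration, M
r_max = 100.0     # maximal response, arbitrary units

kd = dissociation_constant(k_on, k_off)

# Sensorgram: 60 s of association followed by 60 s of dissociation
association = [association_response(t, conc, k_on, k_off, r_max) for t in range(61)]
dissociation = [dissociation_response(t, association[-1], k_off) for t in range(1, 61)]
sensorgram = association + dissociation
```

In such a model a slowly dissociating ligand (small koff) shows up as a slowly decaying second half of the sensorgram, even if its Kd equals that of a ligand that binds and leaves quickly.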
In Sect. 7.1, the concept of “ligand efficiency” was introduced. To take the latter
aspect into consideration, test libraries are being increasingly supplied with com-
pounds that have a molecular weight of less than 250 Da. In the meantime the term
chemical “fragment” has become popular for these search candidates. The term is
a bit unfortunate because the molecules are actually “complete” small molecules
and not, as the term might suggest, simply a “fragment”, that is, an
additional building block to be attached to a lead structure.
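
Why small fragments are attractive despite their weak binding can be shown with a short calculation. The definition from Sect. 7.1 is not repeated here; the sketch below assumes the commonly used form, in which ligand efficiency is the negative free energy of binding divided by the number of non-hydrogen atoms. The two compounds and their values are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3   # kJ/(mol K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Free energy of binding from the dissociation constant: dG = RT ln Kd."""
    return GAS_CONSTANT * temperature * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temperature=298.15):
    """Assumed definition: -dG per non-hydrogen atom, in kJ/mol per atom."""
    return -binding_free_energy(kd_molar, temperature) / heavy_atoms


# Hypothetical compounds
le_fragment = ligand_efficiency(1.0e-3, heavy_atoms=12)   # 1 mM fragment
le_lead = ligand_efficiency(1.0e-8, heavy_atoms=38)       # 10 nM lead structure
print(round(le_fragment, 2), round(le_lead, 2))
```

Under these assumptions the millimolar fragment contributes more binding energy per atom than the much more potent but larger compound.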
Proteins denature when they are heated. The temperature at which the unfolding
process (▶ Sect. 14.2) occurs is defined as the “melting temperature”. This
temperature can be measured very sensitively with a thermal sensor. The binding of a ligand to a protein

changes this melting point. As described in Sect. 7.3, fluorescence measurements
are extremely sensitive indicators. Melting can be registered because
the unfolded proteins interact with a fluorescent dye, and the change in fluorescence
signal can be detected. The temperature shift caused by ligand binding can be used
as evidence as to whether a ligand is bound to a protein or not. It has also been
possible to construct quantitative binding assays exploiting this effect. This very
sensitive technique is also suitable to detect weakly binding fragments.
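
How a temperature shift is read from two melting curves can be sketched as follows. The curves are idealized two-state sigmoids with invented melting temperatures, and the midpoint is taken as the temperature of half-maximal signal; in practice the maximum of the first derivative or a model fit is often used instead. All names and values are assumptions for illustration.

```python
import math


def melting_curve(temps, t_m, slope=1.5):
    """Idealized two-state unfolding curve: fraction unfolded between 0 and 1."""
    return [1.0 / (1.0 + math.exp((t_m - t) / slope)) for t in temps]


def estimate_tm(temps, signal):
    """Temperature at which the signal crosses half of its range,
    found by linear interpolation between neighboring data points."""
    low, high = min(signal), max(signal)
    half = low + 0.5 * (high - low)
    for i in range(1, len(temps)):
        s0, s1 = signal[i - 1], signal[i]
        if s0 < half <= s1:
            fraction = (half - s0) / (s1 - s0)
            return temps[i - 1] + fraction * (temps[i] - temps[i - 1])
    raise ValueError("signal never crosses its half-maximum")


temps = [25.0 + 0.5 * i for i in range(141)]     # 25 to 95 degrees Celsius
apo = melting_curve(temps, t_m=52.0)             # protein alone (hypothetical)
bound = melting_curve(temps, t_m=55.5)           # protein plus ligand (hypothetical)

delta_tm = estimate_tm(temps, bound) - estimate_tm(temps, apo)
print(round(delta_tm, 2))
```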
Mass spectrometry has developed significantly in the last decades. By applying
very gentle bombardment conditions it is possible to detach single electrons from
huge biomacromolecules, or even to generate negatively charged species. In the
best case, it is possible to detect the investigated protein in its intact form as
a singly charged ion. The charged particles are then accelerated between charged
parallel capacitor plates. The flow of charged particles can be bent by
the application of a magnetic field. The flight path of a particular particle depends
on its mass and charge. In this way it is possible to separate and detect particles
based on their mass-to-charge ratio. This principle has been refined with the most
sophisticated technology and adept combination of electrical and magnetic fields so
that it is now possible to detect mass differences of only a few daltons among
even huge proteins. Clever experimental conditions allow a given situation in
solution, for instance, a protein–ligand complex, to be carried across into the gas
phase without decomposition. There it is ionized and detected in the mass spectrom-
eter. With this technique an assay is at our disposal that can be used to detect the
binding of very small ligands to proteins. It is even possible to cause the tailored
decomposition of the complexes by varying the acceleration voltage. By registering
the voltage at which the decomposition occurs, the strength of the protein–ligand
complex can be assessed. Because the decomposition occurs in the gas phase,
information about the binding strength of such complexes in a water-free environ-
ment is available.
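
The separation by mass-to-charge ratio can be illustrated with a small calculation. The sketch assumes positive ions that carry their charge as attached protons, which is an assumption made here for simplicity and is not stated in the text; the protein and ligand masses are hypothetical.

```python
PROTON_MASS = 1.007276   # Da


def mass_to_charge(neutral_mass, charge):
    """m/z of a positive ion that carries `charge` additional protons (assumed)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


protein_mass = 25_000.0   # Da, hypothetical protein
ligand_mass = 180.0       # Da, hypothetical fragment

for z in (8, 9, 10):
    free = mass_to_charge(protein_mass, z)
    complexed = mass_to_charge(protein_mass + ligand_mass, z)
    # The peak of the complex is shifted by ligand_mass / z
    print(z, round(free, 3), round(complexed, 3), round(complexed - free, 3))

shift_z10 = mass_to_charge(protein_mass + ligand_mass, 10) - mass_to_charge(protein_mass, 10)
```

The higher the charge state, the smaller the shift on the m/z axis that a bound ligand causes, which is one reason why high resolution is needed for large proteins.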
Ligands can also be “fished” with proteins. For this, a protein for which a ligand
is sought is exposed to an entire library of test compounds in aqueous solution.
Whatever compounds from the library bind to the protein are captured. The protein
is then separated with a microfilter, and the bound ligand is released by
chemically denaturing the protein. The solution with the released ligands is then
processed, and a micro-HPLC separation is carried out. The chromatographically
separated ligands are then subjected to a very sensitive analysis to determine which
members of the original library were fished out by the protein.
The binding process of a ligand to a protein represents a chemical reaction. As
with all chemical reactions, a more or less pronounced heat of reaction can be
observed. The process can either release (exothermic) or absorb heat (endothermic).
This heat signal can be recorded to register the binding event of a ligand to a protein.
A very sensitive calorimeter is required. When equipped with an electronically
controlled compensatory heating, these devices can achieve astonishing sensitivity.
As an example, such a device was built to study the activity of a butterfly that was
being enticed with different pheromones. The heat that was generated with the stroke
of the wing was detected as a signal by the calorimeter.

[Fig. 7.5, plot. Upper panel: heat signal (μJ/s) versus time (min); the integrated peak
areas give ΔH. Lower panel: heat per injection (kJ/mol) versus molar ratio; the curve
yields ΔH, ΔG, and the stoichiometry.]

Fig. 7.5 In isothermal titration calorimetry, a solution of a ligand is added dropwise to a solution
of a protein. The binding to the protein leads to an exothermic or an endothermic reaction.
The heat that evolves upon the addition of each drop is the area under the single signal peaks.
The total integral of all signal peaks is the binding enthalpy ΔH. With increasing amount of
ligand the protein becomes saturated so that the signal intensity of the heat signal decreases.
The binding constant (dissociation constant) can be derived from the shape of the curve and
the free energy ΔG can be obtained from the relationship ΔG = RT ln Kd. The stoichiometry
of the reaction is simultaneously obtained. The entropy is calculated by using the equation:
ΔG = ΔH − TΔS.

A dissolved ligand can be titrated by dropwise injection into the solution of
a target protein in such a calorimeter. Each drop results in a heat signal. Upon
increasing saturation of the protein, the heat signal decreases so that a curve can be
generated from which the binding constant of the ligand can be deduced (Fig. 7.5).
If all the signals are integrated over the entire titration, the total heat of reaction for
the binding event is determined. With this, two different thermodynamic binding
characteristics are measured. The free energy ΔG is determined from the equilibrium
constant, and the enthalpy ΔH is given by the integrated heat signal (▶ Sect.
4.3). By using Eq. 4.3, the entropy of binding can be calculated. It is important that
in addition to the proof of ligand binding to the protein, the most relevant thermodynamic
parameters ΔG, ΔH, and ΔS are assessable in one experiment at one
temperature.
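
How the three thermodynamic parameters follow from one titration can be sketched with the two relationships given in the caption of Fig. 7.5, ΔG = RT ln Kd and ΔG = ΔH − TΔS. The dissociation constant, the enthalpy, and the temperature used below are hypothetical values chosen only for illustration.

```python
import math

GAS_CONSTANT = 8.314462618e-3   # kJ/(mol K)


def free_energy_from_kd(kd_molar, temperature):
    """dG = RT ln Kd, with Kd as the dissociation constant in mol/L."""
    return GAS_CONSTANT * temperature * math.log(kd_molar)


def entropy_from_dg_dh(delta_g, delta_h, temperature):
    """Rearranged dG = dH - T dS, giving dS in kJ/(mol K)."""
    return (delta_h - delta_g) / temperature


# Hypothetical result of a single titration
temperature = 298.15    # K
kd = 1.0e-6             # mol/L, from the shape of the titration curve
delta_h = -50.0         # kJ/mol, from the integrated heat signals

delta_g = free_energy_from_kd(kd, temperature)
delta_s = entropy_from_dg_dh(delta_g, delta_h, temperature)
minus_t_delta_s = -temperature * delta_s

print(round(delta_g, 2), round(delta_h, 2), round(minus_t_delta_s, 2))
```

In this invented case the binding is enthalpically driven and entropically unfavorable, because the term −TΔS is positive.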
The method of isothermal titration calorimetry is not for high throughput. It is
better for the analysis and description of the binding process. Because of its
importance, particularly with the optimization of ligands in mind, this method is
considered again in ▶ Sect. 8.8.

7.8 Screening by Using Nuclear Magnetic Resonance

The method of NMR spectroscopy is presented in ▶ Sect. 13.7 in greater depth.
Here it suffices to say that it has to do with the orientation of magnetic moments of
the nuclei in a substance sample. By applying a carefully chosen spatial and time-
resolved sequence of electromagnetic fields, it is possible to specifically activate
nuclei that are oriented within these magnetic fields. This can be carried out for one
type of nucleus in a protein. If a solution of test ligands or an entire mixture of
ligands is added to such a solution, protein binding can occur, assuming that the
ligands are suitable. According to their binding strength, they reside for a particular
length of time on the magnetically saturated protein. In doing so the magnetic
signal is transferred from the protein to the ligand. Upon dissociation, the changed
magnetic characteristics can be spectroscopically detected because the relaxation
time of the transferred magnetization is faster in the uncomplexed state. The
solution is measured with and without the magnetized protein. Then the difference
between the spectra is evaluated. Signals are recognizable only for ligands that
had been bound to the protein in the interim and have therefore experienced
magnetization transfer. The so-called saturation transfer difference (STD) spec-
trum can be used to screen for possible ligands (Fig. 7.6). Many different variations
and elaborate experimental protocols have been developed for the above-described

[Fig. 7.6, schematic with the labels “RF (selective)”, “fast”, and “minus”; see caption.]

Fig. 7.6 To determine the saturation transfer difference (STD) with NMR spectroscopy, a library
of test ligands is added to a target protein (ellipse). Potential binders reside for
a finite time span bound to the protein. If the nuclear spin of one type of nucleus in the protein is
selectively saturated (red) by using a suitable resonance frequency (RF), the protein magnetization
can be transferred (nuclear Overhauser effect, see ▶ Sect. 13.7) to the ligand that was bound in the
meantime. These ligands become apparent in that their spectrum is altered even though they
have already dissociated from the protein. If the difference between the spectra in the presence of the
saturated and unsaturated protein is displayed, it is possible to determine which ligands had been
bound to the protein. Many variations and sophisticated experimental protocols have
been developed for the principle of magnetization transfer.

principle of magnetization transfer. Even so-called reporter or spy
ligands, which have an easily measured NMR signal, are used. The resonance of
fluorine atoms is particularly well suited. For this, a fluorine-containing reporter
ligand that binds to the protein is needed; its binding should not be too strong
though. The ligand should be easily displaced from the protein by the test ligand.
This release is detectable as a change in the fluorine NMR spectrum, and reveals the
binding of the test ligand in this way. As is explained in detail in ▶ Sect. 13.7,
the spatial structure of proteins can be determined by isotopic labeling and the
measurement of mutually coupled NMR spectra. By such means, where a test
ligand binds to a protein can be accurately determined by evaluating the specific
resonance shifts of the labeled protein. In the best case, it is even possible to see two
ligands binding at once or two different ligands binding on different
non-overlapping positions in the binding pocket. The research group of Stephen Fesik
at Abbott developed this method. It is known as “SAR by NMR” (SAR stands for
structure–activity relationship) and is used for lead-structure identification and
optimization. A nanomolar inhibitor for the matrix metalloproteinase stromelysin
(▶ Sect. 25.6) was found with this method. First a potent head group was sought
that could bind to the zinc ion in the catalytic center of this protease. Just such
a molecule, acetohydroxamic acid 7.1, was found with an admittedly weak but
specific binding of Kd = 17 mM (Fig. 7.7). After the discovery of this ligand, the
binding site on zinc was saturated by this compound. Further NMR measurements
concentrated on the search for a ligand suited to fill the neighboring S1′ binding
pocket. For this, a small library of heteroarylphenyl and biphenyl derivatives was
employed. 4-Cyano-4′-hydroxybiphenyl 7.2 was identified as a hit. On the right
side of Fig. 7.7 both ligands are shown in the binding pocket. The evaluation of the
structural data showed that the hydroxylated phenyl ring binds in proximity to the
methyl group of the acetohydroxamic acid. Therefore connecting the fragments was
the next obvious thing to do. An ethylenoxy group was used as a bridge and was
coupled to the cyanobiphenyl moiety. NMR spectroscopy confirmed this structural
hypothesis and an inhibitor, 7.3, with an affinity of 25 nM was produced.
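
The gain from linking the two fragments can be estimated with a short calculation. The sketch uses the values given for Fig. 7.7 (Kd = 17 mM for 7.1, Kd = 0.02 mM for 7.2, and 25 nM for 7.3). It makes several simplifying assumptions that are not part of the text: a temperature of 298.15 K, a standard concentration of 1 mol/L, strictly additive fragment contributions as the reference case, and the 25 nM value treated as if it were a dissociation constant.

```python
import math

GAS_CONSTANT = 8.314462618e-3   # kJ/(mol K)
TEMPERATURE = 298.15            # K, assumed


def free_energy(kd_molar):
    """dG = RT ln Kd in kJ/mol, relative to a 1 mol/L standard state."""
    return GAS_CONSTANT * TEMPERATURE * math.log(kd_molar)


kd_71 = 17e-3     # acetohydroxamic acid 7.1
kd_72 = 0.02e-3   # 4-cyano-4'-hydroxybiphenyl 7.2
kd_73 = 25e-9     # linked inhibitor 7.3 (treated as a Kd, an assumption)

dg_71 = free_energy(kd_71)
dg_72 = free_energy(kd_72)
dg_73 = free_energy(kd_73)

# Reference case: the free energies of the fragments simply add up
dg_additive = dg_71 + dg_72
kd_additive = kd_71 * kd_72          # in mol/L for a 1 mol/L standard state

print(round(dg_71, 1), round(dg_72, 1), round(dg_additive, 1), round(dg_73, 1))
print(f"additive estimate: {kd_additive:.1e} M, observed: {kd_73:.1e} M")
```

Under these assumptions the linked compound binds somewhat better than the simple sum of the fragment contributions would suggest, although the linker atoms and the difference between IC50 and Kd make this only a rough comparison.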

7.9 Crystallographic Screening for Small Molecular Fragments

Crystal structure analysis delivers the most exact spatial position of a molecule in
the binding pocket of a protein. Even the geometry of small, very weakly binding
molecules is easily recognized. In structures that have a resolution better than
2–2.5 Å (▶ Sect. 13.5), water molecules are usually still recognizable as discrete
density maxima. Often, they indicate sites in the binding pocket that can be
equivalently accommodated by polar functional groups of ligands (Fig. 7.8). In
the early 1990s, Dagmar Ringe in the research group of Greg Petsko exposed
protein crystals intentionally to solvent molecules to allow the solvent to diffuse
into the crystals (▶ Sect. 20.2). The solvent molecules can act as probes in that they
populate binding regions of the protein pockets. As an example, the areas where

[Fig. 7.7, panels a–e. (a) Acetohydroxamic acid 7.1 (Kd = 17 mM) at the Zn2+ ion of
stromelysin, next to the S1′ pocket. (b) 7.1 (Kd = 17 mM) at Zn2+ together with 7.2
(Kd = 0.02 mM) in the S1′ pocket. (c) Linked inhibitor 7.3 (IC50 = 25 nM). (d, e) Views of
the binding pocket with His205, His211, Val163, and Zn2+.]

Fig. 7.7 In the “SAR by NMR” method, ligands with weak affinity to a protein, in this case
stromelysin, are sought from a large complex mixture. ¹⁵N-labeled protein is used and so-called
¹H-¹⁵N HSQC spectra are measured. If a ligand such as acetohydroxamic acid 7.1 becomes
apparent through a shift in the resonance of specific amino acids that protrude into the binding
pocket, the binding geometry can be deduced (a, d). Later the binding site is saturated with these
ligands. Further NMR measurements are carried out to identify ligands for neighboring binding
positions. These are revealed by the shift in the resonances of neighboring amino acids. That is how
4-cyano-4′-hydroxybiphenyl 7.2 was discovered (b, d). A chemical coupling of both hits 7.1 and
7.2 with a –CH2CH2O– linker produced 7.3, which is a nanomolar inhibitor of the protease
stromelysin (c, e).

isopropanol, acetonitrile, or acetone are encountered in thermolysin, a zinc protease,
are shown in Fig. 7.8.
Even phenol, a small organic molecule, manages to diffuse into the binding pocket.
Benzylsuccinic acid, a lead structure with a typical fragment size, binds to the zinc
protease. Its binding position has been determined by crystallography. The phenyl
ring of this molecule sits in the position that is also explored by phenol. One of the
acid groups of the succinic acid is in the position that was indicated by the carbonyl
carbon of acetone. The second acid group coordinates to the zinc ion and occupies
positions where water molecules resided in the uncomplexed state (Fig. 7.8). There
are many protein–ligand complexes in which small molecules from the crystallization
solution or cryobuffer were adsorbed. These can be used as probes to map out

[Fig. 7.8, panels a and b: binding pocket of thermolysin with Phe114, Asn112, Arg203, and
Zn2+. Legend of the probe molecules: water, acetonitrile, acetone, isopropanol, phenol.
Panel b shows benzylsuccinic acid.]

Fig. 7.8 It was possible to soak small probe molecules (so-called “fragments”) into crystals of the
protease thermolysin. (a) Superposition of multiple structures in which water (red spheres),
isopropanol (C atoms are gray), acetone (C atoms are light blue), acetonitrile (C atoms are
green), and phenol (C atoms are violet) had penetrated the crystals. They describe potential
positions for functional groups of putative ligands. The structure of benzylsuccinic acid, a weakly
binding inhibitor of thermolysin, is also shown in (b). That molecule coordinates with one of its
acid groups to the catalytic zinc ion (upper row). Both oxygen atoms of the acid group displace
two water molecules that are present in the non-complexed structure. The other carboxylate group
forms a salt bridge with the neighboring Arg203. The oxygen of an acetone molecule was found at
almost the same position. The phenyl ring of benzylsuccinic acid occupies nearly the same
position as the phenol molecule detected in the fragment structure. Benzylsuccinic acid can be
used as a starting structure for further optimization.

a binding pocket. A creative scientist will directly exploit their position for the design
of new drug candidates. From there, it was obvious to use crystal structure analysis as
a method to screen small molecules or “fragments” (MW <250 Da).
Even today a crystal structure determination is fairly laborious. All the same, it
can be largely automated so that a few hundred molecules can be processed. In
addition, the tendency of small molecules to diffuse into mature protein crystals can
also be used (so-called “soaking”; ▶ Sect. 13.9). If a “cocktail” of multiple test
substances is used, the screening can be accelerated. A protein crystal can be
exposed to up to 10 compounds at once. The composition of the cocktails is
designed so that a mixture of different shapes (long and stretched, angular, spherical,
etc.) is present. This makes it easier to distinguish them later in the electron
density (see ▶ Sect. 12.5). To optimize the effort-to-yield ratio for the crystallo-
graphic screening, often a different screening method is carried out first to pre-filter
possible hits. Only compounds that have been identified as hits in the first screening
are used in the subsequent crystallographic screening. However, only a few of the
techniques that have been described in the previous sections are really suitable to find
a small, weakly binding candidate from a fragment library. Frequently this concerns
only millimolar-binding candidates.
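
The composition of such cocktails can be sketched as a simple distribution problem. The following Python example is a hypothetical illustration, not a published protocol: each fragment carries an assumed shape label, and fragments of the same shape are dealt into different cocktails so that the members of one cocktail remain distinguishable in the electron density.

```python
import math


def build_cocktails(fragments, max_size=10):
    """Distribute (name, shape_class) pairs into cocktails of at most
    `max_size` members. Fragments are sorted by shape and then dealt out
    in turn, so that equal shapes end up in different cocktails where possible."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    if not fragments:
        return []
    n_cocktails = math.ceil(len(fragments) / max_size)
    ordered = sorted(fragments, key=lambda fragment: fragment[1])
    cocktails = [[] for _ in range(n_cocktails)]
    for index, fragment in enumerate(ordered):
        cocktails[index % n_cocktails].append(fragment)
    return cocktails


# Hypothetical library of 12 fragments with four assumed shape classes
shapes = ["elongated", "angular", "spherical", "flat"]
library = [(f"frag-{i:02d}", shapes[i % 4]) for i in range(12)]

cocktails = build_cocktails(library, max_size=4)
for number, cocktail in enumerate(cocktails, start=1):
    print(number, cocktail)
```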
The hits from the crystallographic fragment screening can be further developed
(▶ Sect. 20.7). One possibility is to probe the different regions of the binding
pocket and then connect the pieces with a linker, analogously to what was
described in Sect. 7.8 for the “SAR by NMR” method. In another, usually more
successful variation, the fragment hits are chemically elaborated upon. For this
approach additional moieties are added on the basis of the crystal structure. In this
way the original hit, which serves as a seed, can be enlarged to bind more strongly to
the protein.

7.10 Tethered Ligands Explore Protein Surfaces

Ligands bind with very poor affinity to flat pockets that are open to the surrounding
solvent. Therefore, it is extremely difficult to detect their binding or obtain
a crystal structure with a ligand bound in such an area. James Wells and his
colleagues at the Sunesis company in San Francisco developed the idea to tether
ligands for this type of binding. From a chemical point of view, this means that
a reaction is carried out with the exposed thiol of a cysteine residue on the protein’s
surface. Such a cysteine must be available in the native protein, or it is appropriately
introduced by mutagenesis (▶ Sect. 12.2). Under suitable reaction conditions, the
ligand is anchored with a disulfide bond, which is formed through the thiol group
of the exposed cysteine (Fig. 7.9). Only those test candidates from the compound
library will react that are able to form an interaction with the surface in the vicinity
of the cysteine thiol group. For all intents and purposes, they explore the surround-
ing region, react with the cysteine, and remain coupled to the surface by the
disulfide bridge. Successfully formed complexes are then detected by mass
spectrometry. James Wells and Robert Stroud chose thymidylate synthase as their

Fig. 7.9 The thiol group of the exposed cysteine is used as an anchor group for the formation of
disulfide bonds with ligand candidates from a compound library. There, suitable ligands react that
are also able to interact with the surface region in the vicinity of the cysteine thiol. A crystal
structure was determined from just such a covalently linked complex (Fig. 7.12). After optimiza-
tion of the initially discovered hit, the disulfide anchor can be discarded and a non-covalent
inhibitor can be developed.

first test example. This enzyme plays an important role in the de novo synthesis of
thymidine, an essential building block for DNA. Cells with a high division rate
especially need this building block, so that inhibitors of this enzyme might represent
potent anti-infective or antitumor agents (▶ Sect. 27.2).
Thymidylate synthase has a cysteine residue in position 146, in the vicinity of
the catalytic site. From a library of 1200 disulfides, compounds 7.4–7.7 proved to be
binders whereas the very similar derivatives 7.8–7.11 were not selected (Fig. 7.10).
Accordingly, the phenylsulfonamide together with the proline moiety seemed to be
essential for binding. Next the disulfide anchor was removed, and the binding
constant for N-tosyl-D-proline 7.12 was measured to be 1.1 mM (Fig. 7.11). To
further test the concept, Cys146 was exchanged for a serine (Fig. 7.12). When no
binding was apparent with this mutant, the neighboring His147 was mutated to
a cysteine, but this mutant could not fish out the N-tosylproline moiety either. In
contrast, the position-143 mutant was successful (Fig. 7.12). In that case a leucine
was exchanged for a cysteine. The subsequently determined crystal structure
showed that the N-tosylprolyl moiety was bound almost identically in both covalently
anchored complexes and in the complex without an S–S anchor (Fig. 7.12). This
is convincing proof that the covalent coupling is not responsible for the binding
geometry. In fact, the technique allows small, initially weakly binding ligands to be
fished out of a large library. In two steps, the original millimolar hit 7.12 was
developed into a nanomolar inhibitor: the side chain of the natural cofactor
methylenetetrahydrofolic acid 7.13 was transferred to give 7.14, which was then
optimized to 7.15 (Fig. 7.11).
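The affinity gain along this series can be put on an energy scale. The following is a minimal Python sketch, not taken from the original work, that converts the inhibition constants quoted in Fig. 7.11 into standard binding free energies via ΔG° = RT ln Ki. The temperature of 298.15 K, the 1 M standard state, and the function names are assumptions made for illustration. The heavy-atom count of 18 for 7.12 follows from the molecular formula C12H15NO4S of N-tosyl-D-proline.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def binding_free_energy(ki_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from Ki in mol/L (1 M standard state)."""
    return R_KJ * temperature_k * math.log(ki_molar)


def ligand_efficiency(ki_molar, heavy_atoms, temperature_k=298.15):
    """Binding free energy per non-hydrogen atom, in kJ/mol per atom (positive = favorable)."""
    return -binding_free_energy(ki_molar, temperature_k) / heavy_atoms


# Ki values as quoted in Fig. 7.11, converted to mol/L
ki_values = {"7.12": 1.1e-3, "7.14": 24e-6, "7.15": 330e-9}

for name, ki in ki_values.items():
    print(f"{name}: Ki = {ki:.2e} M, dG = {binding_free_energy(ki):.1f} kJ/mol")

# Heavy-atom count of 7.12 (assumption derived from C12H15NO4S: 12 C + 1 N + 4 O + 1 S)
le_712 = ligand_efficiency(ki_values["7.12"], heavy_atoms=18)
print(f"Ligand efficiency of 7.12: {le_712:.2f} kJ/mol per heavy atom")
```

Under these assumptions, the roughly 3,300-fold improvement from 7.12 to 7.15 corresponds to about 20 kJ/mol of additional binding free energy.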
The method of “tethering” can be fairly generally applied. It has especially
achieved success in the search for ligands that disrupt the formation of protein–
protein surface contacts (▶ Sect. 10.6). A great advantage of the technique is that
it is not necessary to develop an additional biochemical binding assay. Weakly

Fig. 7.10 From a library of 1,200 disulfides, the compounds 7.4–7.7 (left) proved to be
binders, whereas the structurally similar derivatives 7.8–7.11 (right) did not bind
to the protein.

binding ligands are covalently “tethered” and cannot be washed away as happens in
the case of simple complex formation. Further, the covalently bound chemical
probes allow the adaptive capacity of the surface region to be explored.

7.11 Synopsis

• Large substance libraries are screened for biological effects to filter out active
molecules and assess their value for a given indication.
• Three phases are distinguished: a broad, automated initial screening for
hits, a more detailed screening of chemical analogues around a hit to establish
the first structure–activity relationship, and a lead optimization to find candidates
for clinical testing.
• A prerequisite for high-throughput screening was the development of in vitro
test systems using pure proteins produced by gene technology along with the
entire arsenal of biochemical methods in the test tube so that the function of
single-gene products can be recorded.
• As a disadvantage, high-throughput screening does not assess the entire effect
spectrum and ignores effects such as transport, distribution, metabolism, and
excretion.
• Screening libraries are frequently assembled from molecules from other drug
development projects; as such, they are rather inefficient with regard to their
molecular size and their modest screening hit activity in the micromolar range.

[Structures for Fig. 7.11: 7.12, Ki = 1.1 mM; 7.14, Ki = 24 μM; 7.15, Ki = 330 nM; 7.13, methylenetetrahydrofolic acid]

Fig. 7.11 By transferring a side chain from the natural cofactor methylenetetrahydrofolic acid
7.13, N-tosyl-D-proline 7.12, a millimolar inhibitor, could be transformed into the nanomolar inhibitor
7.15 in two steps.

Small substances with high ligand efficiency and sufficient space for structural
optimization are particularly promising.
• Enzymatic function and its inhibition can be recorded by the production of
chromophoric reaction products.
• Radioactively labeled compounds or enzyme-linked immunosorbent assays are
versatile techniques to record protein function on the molecular level.
• Progress in assay miniaturization calls for sophisticated robotic systems, ever-
improving sensitivity of the read-out, including fluorescence measuring tech-
niques, and reliable logistics to handle the enormous data flow.
• Aggregate formation of hydrophobic test compounds can exert significant influ-
ence on the assay read-out or even cause false positive or negative hits.
• Testing on cell-based assays is performed to study changes in cellular or
organism-related function beyond pure binding of a test compound to a given
protein target.

Fig. 7.12 Superpositions of crystal structures of the enzyme thymidylate synthase with two
tethered ligands, one bound to Cys143 (C atoms of ligand 7.4 are green) and the other to
Cys146 (C atoms of ligand 7.4 are violet), both of which are N-tosyl-D-proline derivatives and
which are covalently anchored through S—S bridges. Upon cleavage of the disulfide anchor, the
free N-tosyl-D-proline (C atoms are gray, 7.12) proved to be a ligand with an affinity of 1.1 mM. Its
binding geometry is very similar to both of the covalently anchored derivatives.

• Primary animal testing in vertebrates has been abolished today for ethical
reasons and is increasingly being replaced by whole-animal screening
using nematodes, the simplest multicellular organisms, to record synergistic
and side effects.
• As a complementary and alternative method, virtual computer screening has
been developed to screen large compound libraries by docking ligand candidates
into the known spatial structure of a target protein.
• Binding events are recorded by biophysical methods such as surface plasmon
resonance, thermal stability shifting, mass spectrometry, or microcalorimetry.
They are used to detect ligands as potential binders.
• NMR spectroscopy can be used to detect ligand binding by magnetization
transfer. Multiple binders can be chemically linked to more strongly binding
ligands according to the SAR by NMR technique.
• Exposure of small molecular probes and fragments to protein crystals allows for
the structural characterization of the binding modes of weakly binding fragments
as a versatile starting point to lead optimization.
• Small-molecule fragments tethered to a protein through covalent attachment to
the exposed thiol group of a cysteine residue allow the exploration of the binding
properties of flat, solvent-exposed surface depressions and serve as a starting
point to develop antagonists to perturb the protein–protein interface in complex
formation.

Bibliography

General Literature

Blundell TL, Jhoti H, Abell C (2002) High-throughput crystallography for lead discovery in drug
design. Nat Rev Drug Discov 1:45–54
Hajduk PJ, Greer J (2007) A decade of fragment-based drug design: strategic advances and lessons
learned. Nat Rev Drug Discov 6:211–219
Jahnke W, Erlanson DA (2006) Fragment-based approaches in drug discovery. In: Mannhold R,
Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry, vol 34. Wiley-
VCH, Weinheim
Jones AK, Buckingham SD, Sattelle DB (2005) Chemistry-to-gene screens in Caenorhabditis
elegans. Nat Rev Drug Discov 4:321–330
Klebe G (2006) Virtual ligand screening: strategies, perspectives and limitations. Drug Discov
Today 11:580–592
Löfås S (2004) Optimizing the hit-to-lead process using SPR analysis. Assay Drug Dev Technol
2:407–415
Siegel MM (2002) Early discovery drug screening using mass spectrometry. Curr Topics Med
Chem 2:13–33
Sotriffer C (2010) Virtual screening. In: Mannhold R, Kubinyi H, Folkers G (eds) Methods and
principles in medicinal chemistry, vol 48. Wiley-VCH, Weinheim
Vogtherr M, Fiebig K (2003) NMR-based screening methods for lead discovery. In: Hillisch A,
Hilgenfeld R (eds) Modern methods of drug discovery. Birkhäuser Verlag, Boston, pp S183–
S120. ISBN 376436081X

Special Literature

Hajduk PJ, Sheppard G, Nettesheim DG, Olejniczak ET, Shuker SB, Meadows RP, Steinman DH,
Carrera GM Jr, Marcotte PA, Severin J, Walter K, Smith H, Gubbins E, Simmer R, Holzman
TF, Morgan DW, Davidsen SK, Summers JB, Fesik SW (1997) Discovery of potent nonpeptide
inhibitors of stromelysin using SAR by NMR. J Am Chem Soc 119:5818–5827
Erlanson DA, Braisted AC, Raphael DR, Randal M, Stroud RM, Gordon EM, Wells JA
(2000) Site-directed ligand discovery. Proc Natl Acad Sci USA 97:9367–9372

8 Optimization of Lead Structures

A lead structure is the starting point on the way to a drug. The potency, specificity,
and duration of effect must be optimized, and the side effects and toxicity must be
minimized in a usually elaborate, iterative process. Every change in the chemical
structure modulates the 3D structure of the molecule, its physicochemical properties,
and the activity spectrum. The isosteric replacement of atoms or groups,
the introduction of hydrophobic building blocks, the dissection of rings or the
restriction of flexible molecular portions into cyclic structures, and the optimiza-
tion of the substitution pattern are all possibilities to purposefully modify a target
structure.
Creativity and luck are always important prerequisites for success in pharmaceu-
tical research. Nonetheless, there is a treasure chest of decades of accumulated
experience that can be exceedingly supportive to the rational optimization process.
The computer-aided methods can contribute to their full capability in this field in
particular. Several general considerations and approaches to lead optimization are
presented in the sections of this chapter. A discussion of the structure-based
and computer-aided optimization of lead structures is presented in ▶ Chaps. 17,
“Pharmacophore Hypotheses and Molecular Comparisons” and ▶ 20, “Protein
Modeling and Structure-Based Drug Design”; examples for its application to differ-
ent therapeutic areas are presented in ▶ Chaps. 23, “Inhibitors of Hydrolases with an
Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors
of Hydrolyzing Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; ▶ 27, “Oxidore-
ductase Inhibitors”; ▶ 28, “Agonists and Antagonists of Nuclear Receptors”; ▶ 29,
“Agonists and Antagonists of Membrane-Bound Receptors”; ▶ 30, “Ligands for
Channels, Pores, and Transporters”; ▶ 31, “Ligands for Surface Receptors”; ▶ 32,
“Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs”.

8.1 Strategies for Drug Optimization

The optimization of active substances follows a process that is best characterized by
the words of the philosopher Sir Karl Popper:

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_8
© Springer-Verlag Berlin Heidelberg 2013

The truth is objective and absolute. But we can never be sure that we have found it. Our
knowledge is always an assumed knowledge. Our theories are hypotheses. We test for the
truth in that we exclude what is false. (Objective Knowledge, 1972)

Accordingly, the optimization of a compound’s potency follows a working
hypothesis, while an iterative process of trial and error refines the hypothesis. The
assembled data about the relationship between chemical structure and biological
activity serve the design of new structures. These are synthesized and tested, and
the working hypothesis is modified as appropriate. In negative cases, the
hypothesis is discarded and a new one is formulated that fits more harmoniously
with the biological data. The following qualities in the structure of the active
substance are distinguished from one another:
• The actual pharmacophore (Sects. 8.7 and ▶ 17.1) that is responsible for the
specific binding and upon which only limited chemical modification can be
carried out,
• The additional groups (adhesion groups) that improve the affinity and biolog-
ical activity,
• Further groups that do not influence the binding but rather the lipophilicity of
the molecule and with it the transport and distribution in biological systems
(▶ Chap. 19, “From In Vitro to In Vivo: Optimization of ADME and Toxicology
Properties”),
• The groups that must be cleaved or modified in the organism to release the
actual active form (▶ Chap. 9, “Designing Prodrugs”).
The most important steps in the optimization of lead structures are the systematic
changes in the shape and form, that is, the three-dimensional structure, and/or the
physicochemical properties. Single steps along this route are:
• Changes in the lipophilicity and the electronic properties through the introduc-
tion or removal of hydrophobic or hydrophilic groups,
• Variations of substituents at aromatic or heteroaromatic rings,
• Introduction or elimination of heteroatoms in chains or rings,
• Changes in chain length of aliphatic groups or linkers,
• Introduction of space-filling substituents to stabilize a particular conformation,
• Changes in the ring size of alicyclic or heterocyclic rings,
• Incorporation of flexible partial structures in rings,
• Incorporation of branches or attachments to rings (rigidifying),
• Opening of rings,
• Elimination of chiral centers to simplify a structure,
• Addition of chiral centers to increase the selectivity, or
• Shifts in the thermodynamic binding profile and the drug’s residence time at the
target protein.
These processes are usually unidirectional in classical drug optimization,
that is, the optimization takes place on one position of the molecule at a time, in
one single direction. In the past, such unidirectional optimization has led to many
disappointments because interdependent influences of the structural changes were
neglected, or the optimal lipophilicity was exceeded. John Topliss developed
a scheme for the variation of aromatic substituents that allows the biological activity
to be optimized in a minimum number of steps (Sect. 8.3). The application of
experimental design, simultaneously changing multiple parts of a molecule, and the
evaluation of the results by using quantitative structure–activity relationships
(▶ Chap. 18, “Quantitative Structure–Activity Relationships”) usually allows a fast
and effective optimization. In structure-based and computer-aided optimization, the
3D structure of the target protein and its complexes leads to directed structural
variations of the active substances. Here again, the aspects of total lipophilicity and
metabolism should not be neglected.

8.2 Isosteric Replacement of Atoms and Functional Groups

Isosteric replacement is the exchange of particular groups in a molecule for
sterically and electronically related groups. If the biological effect is essentially
maintained, the term bioisosteric replacement (Fig. 8.1) is used. In the simplest
case a single atom is exchanged, for instance, a Cl (lipophilic, weakly electron
withdrawing) is replaced by a Br (same characteristics as Cl) or methyl (lipophilic,
weakly electron donating), or an –O– (polar, H-bond acceptor) is exchanged for an
NH (polar, H-bond donor) or a –CH2– (lipophilic, unable to form H-bonds).
Furthermore, bioisosteric replacement also means the exchange of entire groups.

Substituents:
  F-, Cl-, Br-, CF3-, NO2-
  Methyl-, Ethyl-, Isopropyl-, Cyclopropyl-, tert-Butyl-
  -OH, -SH, -NH2, -OMe, -N(Me)2

Bridging Groups:
  -CH2-, -NH-, -O-
  -COCH2-, -CONH-, -COO-
  >C=O, >C=S, >C=NH, >C=NOH, >C=NOAlkyl

Atoms and Groups in Rings:
  -CH=, -N=
  -CH2-, -NH-, -O-, -S-
  -CH2CH2-, -CH2-O-, -CH=CH-, -CH=N-

Larger Groups:
  -NHCOCH3, -SO2CH3
  -COOH, -CONHOH, -SO2NH2, and heterocyclic groups (ring drawings not reproduced)

Fig. 8.1 A few possibilities for the isosteric replacement of atoms and/or groups.

[Structures for Fig. 8.2: 8.1 Triiodothyronine, T3; 8.2; 8.3 Acetylsalicylic acid; 8.4, R = -COOH or -SO2NH2]

Fig. 8.2 Isosteric replacement with retention, loss, and reversal of the biological activity. All
three iodine atoms of the thyroid hormone triiodothyronine 8.1 can be replaced with alkyl groups, and
compound 8.2 is still active. In the case of acetylsalicylic acid 8.3, the exchange of the –OCOCH3
for an –NHCOCH3 group led to the loss of the acylating ability and therefore a nearly complete loss of
the biological activity. The antimetabolite sulfanilamide 8.4 (R = SO2NH2) is derived from
p-aminobenzoic acid 8.4 (R = COOH), which is a critical intermediate in the bacterial dihydrofolate
synthesis; 8.4 (R = SO2NH2) is the result of the exchange of a carboxyl group for an isosteric
sulfonamide group.

For example, –COOH, an H-bond acceptor and donor, can be replaced with other
groups that have the same or modified properties, for instance, with the similarly
acidic tetrazole. Another example can be found in the exchange of a phenyl ring for
a thiophene or a furan building block (Fig. 8.1). The potential of isosteric replacement
is illustrated in the exchange of all three iodine atoms of triiodothyronine T3
8.1 for alkyl groups to give 3,5-dimethyl-3′-isopropylthyronine 8.2, which in turn
retains impressive affinity and agonistic activity on the thyroid hormone receptor.
In contrast to triiodothyronine, which is both iodinated and metabolized by
a deiodinase, the alkyl groups of 8.2 are no longer metabolically cleavable.
Bioisosteric replacement was and is one of the most important strategies in
pharmaceutical research. Nonetheless, surprises sometimes occur. The replacement
of an ester group by an amide group in the local anesthetics (▶ Sect. 3.4) improved
the metabolic stability, as expected. In the case of acetylsalicylic acid 8.3 (Fig. 8.2)
this exchange cannot be made. An analogous exchange of the –COO– group for
a –CONH– group results in a complete activity loss because the amide can no
longer acylate the cyclooxygenase enzyme (▶ Sect. 27.9). In the case of
p-aminobenzoic acid (R = –COOH, Fig. 8.2) the exchange of a carboxyl group
for a sulfonamide group gives sulfanilamide 8.4 (R = –SO2NH2), which is an
antimetabolite of p-aminobenzoic acid (▶ Sect. 2.3).

A lead structure is rarely studied exclusively by one research group. Other
companies adopt successful examples, at the very latest after the economic success
of a new medicine. The goal of this so-called “me-too” research is to modify the
competitor’s lead structure to arrive at patent-free analogues that are more efficacious,
more selective, or better tolerated. It must be accepted that even this form of
competition has led to the therapeutically most valuable compounds in many therapeutic
areas. On the one hand, a great deal of duplicate work has been performed;
on the other hand, new analogues with improved properties have been produced
and introduced to therapy, and these turned out to be successful in the long run. Penicillins
of the third and fourth generation with broad-spectrum activity and metabolic
stability, β-blockers with improved selectivity, and many other specific drugs would
simply not exist if it were not for the much-disparaged “me-too” research.

8.3 Systematic Variation of Aromatic Substituents

The goal of lead structure optimization has an impact on the planning of the
relevant experimental series. If the biological consequences of structural changes
are to be evaluated with minimal effort, careful design must precede the synthesis
of the substances. Here an almost unsolvable problem emerges in that, as a general
rule, the exchange of a substituent or group leads to complex changes in multiple
properties. The exchange of an ethyl group for a methyl group changes only the
lipophilicity and size of the substituent. If a methyl group is exchanged for
a chlorine atom, the polarizability, the electronic properties, and moreover the metabolism
are altered. Other substituents could additionally change the H-bond donor and
acceptor properties as well as the ionization and dissociation.
In 1971, Paul Craig proposed the use of a simple diagram for the structural variation
of aromatic substituents, with which the important characteristics of these substitu-
ents, for instance, lipophilicity and electronic properties, are plotted against each
other. The selection of substituents from different quadrants of this diagram allows
an evaluation of different combinations of properties. The concept can be extended to
multiple dimensions, possibly with the aid of mathematical and statistical methods.
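The quadrant idea behind such a diagram can be sketched in a few lines of Python. The substituent constants below (Hansch π for lipophilicity, Hammett σ for the para position) are approximate, commonly tabulated values included for illustration only. The function names and the rule for picking one representative per quadrant are assumptions, not part of Craig's proposal.

```python
# Approximate aromatic substituent constants (illustrative values):
# first entry  = Hansch pi (lipophilicity contribution)
# second entry = Hammett sigma, para position (electronic effect)
SUBSTITUENTS = {
    "H":    (0.00, 0.00),
    "CH3":  (0.56, -0.17),
    "Cl":   (0.71, 0.23),
    "Br":   (0.86, 0.23),
    "CF3":  (0.88, 0.54),
    "OCH3": (-0.02, -0.27),
    "OH":   (-0.67, -0.37),
    "NH2":  (-1.23, -0.66),
    "NO2":  (-0.28, 0.78),
    "CN":   (-0.57, 0.66),
}


def quadrant(name):
    """Return the quadrant of the pi/sigma diagram in which a substituent lies."""
    pi, sigma = SUBSTITUENTS[name]
    if pi == 0 and sigma == 0:
        return "origin"  # hydrogen is the reference point
    pi_sign = "+pi" if pi >= 0 else "-pi"
    sigma_sign = "+sigma" if sigma >= 0 else "-sigma"
    return f"{pi_sign}/{sigma_sign}"


def group_by_quadrant():
    """Collect all substituents per quadrant."""
    groups = {}
    for name in SUBSTITUENTS:
        groups.setdefault(quadrant(name), []).append(name)
    return groups


def pick_one_per_quadrant():
    """Hypothetical selection rule: take the substituent farthest from the origin."""
    picks = {}
    for quad, names in group_by_quadrant().items():
        if quad == "origin":
            continue
        picks[quad] = max(
            names,
            key=lambda n: SUBSTITUENTS[n][0] ** 2 + SUBSTITUENTS[n][1] ** 2,
        )
    return picks


for quad, names in sorted(group_by_quadrant().items()):
    print(quad, names)
print(pick_one_per_quadrant())
```

Choosing one substituent from each quadrant gives a small test set in which lipophilic and electronic effects are varied independently rather than together.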
In 1972, John Topliss made a suggestion that went further, which would be
called today an evolutionary strategy. One substituent at a time (e.g., hydrogen for
chlorine) is exchanged in the optimization of the substitution pattern of an aromatic
compound. The next compound is planned based on which of the first two compounds
demonstrated better effects. If the new substituent improves the effect,
a further substituent is chosen that has the same physicochemical properties to a greater
degree, or more of these substituents are added. If the new substituents impair the
biological activity, then a substituent is chosen that has the opposite physicochem-
ical properties. If two different substituents produce the same effect, it should be
evaluated whether changes in the physicochemical properties influence the activity
in the opposite direction. Despite its elegance, this strategy often fails for the
mundane reason that it is too time consuming to take such a stepwise approach.
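The decision logic described above can be sketched as a small lookup. The first branching shown here (4-chloro as the first analogue, followed by 3,4-dichloro, 4-methyl, or 4-methoxy) follows the commonly reproduced form of the Topliss scheme for aromatic substituents. The function names, the use of pIC50-like activity values, and the tolerance of 0.3 log units for "equal activity" are assumptions made for illustration.

```python
# First branching of the scheme after the unsubstituted parent (H)
# has been compared with its 4-chloro analogue.
FIRST_BRANCH = {
    "more": "3,4-Cl2",  # same properties in larger measure (more lipophilic, more electron withdrawing)
    "equal": "4-CH3",   # still lipophilic, but electron donating instead of withdrawing
    "less": "4-OCH3",   # opposite electronic effect with little change in lipophilicity
}


def compare(parent_activity, analogue_activity, tolerance=0.3):
    """Classify the analogue as 'more', 'equal', or 'less' active than the parent.

    Activities are assumed to be on a logarithmic scale (e.g., pIC50),
    so that higher numbers mean higher potency.
    """
    difference = analogue_activity - parent_activity
    if difference > tolerance:
        return "more"
    if difference < -tolerance:
        return "less"
    return "equal"


def next_analogue(parent_activity, chloro_activity, tolerance=0.3):
    """Suggest the next compound to synthesize after testing H and 4-Cl."""
    return FIRST_BRANCH[compare(parent_activity, chloro_activity, tolerance)]


print(next_analogue(6.0, 7.0))  # 4-Cl clearly better than H
print(next_analogue(6.0, 6.1))  # no significant difference
print(next_analogue(6.0, 5.0))  # 4-Cl clearly worse than H
```

Each call stands for one synthesis and test cycle, which illustrates why the stepwise procedure is slow in practice.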

As a consequence of the work of Craig and Topliss, further design methods were
developed. None of these methods should be applied too rigidly. Synthetic
planning must be oriented toward both the accessibility of the compounds and
the largest possible structural variation, that is, a diversity of physicochemical
properties and 3D structure. Since the introduction of combinatorial
chemistry (▶ Chap. 11, “Combinatorics: Chemistry with Big Numbers”), the ratio-
nal design of diverse substance libraries has taken on entirely new possibilities and
perspectives.

8.4 Optimizing the Activity and Selectivity Profile

The structural variation of a lead structure influences not only the potency
but also the activity spectrum. That can be thoroughly advantageous, but it also
brings with it the risk that the selectivity can deteriorate. A simple rule of thumb is
that enlarging the molecule, introducing optically active centers, and rigidification
improve the selectivity, assuming that the activity is not entirely lost. On the other
hand, removing a chiral center, establishing more flexibility, or reducing the size of
the molecule usually results in unspecific and weaker activity.
Because of the sequencing of the human genome, the gene family to which
a target protein belongs is known, as is the number of members of the gene family.
By using gene technology it is possible to construct single isoform test systems
(assays). As a result, today pharmaceutical research is in a position to make
a predictive selectivity profile. This has stimulated efforts to develop selective
drugs. An interesting corollary to these efforts is the fact that, as statistics show,
the molecular weight of drugs has increased in recent years, a confirmation of the
above-mentioned rule of thumb.
For drugs that are meant to act on neuroreceptors in the brain, the polarity is critical
to whether they can cross the blood–brain barrier. Polar compounds are unable to do
this and act only in the periphery, for instance, on the circulatory system. Examples of
this are adrenaline 8.5 and dopamine 8.6 (Fig. 8.3). The stepwise removal or masking
of polar groups brings the central effects into the foreground. Ephedrine 8.7 acts in the
brain and in the periphery; it is centrally stimulating and raises the blood pressure.
Amphetamine 8.8 (“speed”) and the intoxicant MDMA 8.9 (the designer drug
“ecstasy”) are weak bases. Their relatively nonpolar neutral forms easily overcome
the blood–brain barrier and their CNS effects dominate (Fig. 8.3).
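How much of a weak base is present in its neutral form at a given pH can be estimated from the Henderson–Hasselbalch relationship. The sketch below is an illustration only: the pKa of 10.0 is a hypothetical placeholder, not a measured value for any compound in Fig. 8.3, and a plasma pH of 7.4 is assumed.

```python
def neutral_fraction_of_base(pka, ph=7.4):
    """Fraction of a monoprotic base that is present as the uncharged free base.

    pka is the pKa of the conjugate acid (the protonated amine).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Hypothetical amine with pKa 10.0 at an assumed plasma pH of 7.4
fraction = neutral_fraction_of_base(10.0)
print(f"Neutral fraction: {fraction:.4f} ({fraction * 100:.2f} %)")
```

Even a small neutral fraction can be sufficient for membrane passage, because the protonation equilibrium is re-established continuously as the neutral form is removed.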
There are exceptions even here. L-DOPA 8.10 (Fig. 8.3) is an extremely polar
amino acid. It could never cross the blood–brain barrier by passive diffusion alone.
Instead it is recognized by an amino acid transporter and actively transported over
the membrane and into the brain. This simultaneously solves the problem of
bringing dopamine 8.6, which is used to treat Parkinson’s disease, into the brain
because L-DOPA is decarboxylated to dopamine there (▶ Sects. 9.4 and ▶ 27.8).
The decisive influence that even the smallest changes in the structure can have is
seen in the effect spectrum of the hormones and neurotransmitters noradrenaline and
adrenaline and their synthetic analogues. Whereas noradrenaline 8.11 (Fig. 8.4)

[Structures for Fig. 8.3. Polar molecules: 8.5 Adrenaline; 8.6 Dopamine, R = H; 8.10 L-DOPA, R = COOH. Intermediate polarity: 8.7 Ephedrine. Nonpolar molecules: 8.8 Amphetamine; 8.9 MDMA]

Fig. 8.3 The polar compounds adrenaline 8.5 and dopamine 8.6 are cardiovascularly active in the
periphery after intravenous administration. Ephedrine 8.7 is more lipophilic and therefore shows
both peripheral and central effects. The more nonpolar compound amphetamine 8.8 (“speed”) has
an overwhelmingly stimulatory effect in the CNS. 3,4-Methylenedioxymethamphetamine 8.9
(MDMA; “ecstasy”) is hallucinogenic. Polar groups are red and neutral or lipophilic groups are
blue.

[Structures for Fig. 8.4: 8.11 Noradrenaline, R = H, predominantly α-mimetic; 8.5 Adrenaline, R = CH3, α- and β-mimetic; 8.12 Isoprenaline, R = -CH(CH3)2, β-mimetic; 8.13 Dobutamine, β1-mimetic; 8.14 Salbutamol, β2-mimetic; 8.15 Clenbuterol, β2-mimetic]

Fig. 8.4 Noradrenaline 8.11, adrenaline 8.5, and isoprenaline 8.12 act to different extents on the
α and β receptors. Selective β1 and β2 agonists, for instance, 8.13, 8.14, and 8.15, act specifically as
cardiac stimulants or bronchodilators.

[Structures for Fig. 8.5: 8.16 Hydrochlorothiazide; 8.17 Furosemide; 8.18 Carbutamide, R = NH2; 8.19 Tolbutamide, R = CH3; 8.20 Glibenclamide]

Fig. 8.5 The sulfonamides hydrochlorothiazide 8.16, furosemide 8.17, and related diuretics are
different from most antibacterial analogues because of the unsubstituted sulfonamide group.
Carbutamide 8.18 and tolbutamide 8.19 were the first unspecific sulfonamides with hypoglycemic
effects that were later replaced with specific hypoglycemics of the glibenclamide-type 8.20.

affects the α-adrenergic receptors, its N-methyl derivative adrenaline 8.5 (Fig. 8.3)
acts on α and β receptors as a mixed α/β agonist. This difference was exploited by
enlarging the N-alkyl group to arrive at the specific β-agonist isoprenaline 8.12
(Fig. 8.4). Further differentiation of the effects could be achieved within the class
of β-adrenergic substances. Dobutamine 8.13 is missing the alcoholic hydroxyl
group of adrenaline. Despite its structural relationship to dopamine 8.6 (Fig. 8.3) it
is a β1 agonist with cardioselective effects. Specific β2 agonists, for instance
salbutamol 8.14 and clenbuterol 8.15 (Fig. 8.4), are used to treat asthma because
they are bronchodilators without the cardio-stimulatory effects of the unspecific β
agonists (▶ Sect. 29.3).
The sulfonamides are a prime example of the targeted optimization of lead
structures in different therapeutic indications. The diuretics as well as the
hypoglycemics (antidiabetics) emerged from the first antibacterial examples. It had already been
noticed in 1940 that sulfanilamide (▶ Sect. 2.3) inhibits the enzyme carbonic
anhydrase, and therefore should lead to increased urine production (▶ Sect. 25.7).
Among other substances, hydrochlorothiazide 8.16, furosemide 8.17 (Fig. 8.5), and
structurally related compounds gained therapeutic importance. In the early 1940s,
the hypoglycemic effects of a few sulfonamides were clinically observed. The
antibacterial and simultaneously hypoglycemic carbutamide 8.18 was introduced
to therapy in 1955, the lipophilic and therefore more bioavailable tolbutamide 8.19
was introduced later. Systematic structural variation finally led to glibenclamide
8.20 (Fig. 8.5 and ▶ Sect. 30.2), which is much more potent and specific.

8.5 From Agonists to Antagonists

There is no general recipe for the transformation of an agonist into an antagonist. An
example of this is found in the tedious route from the agonist histamine to the H2
antagonists, as is described in detail in ▶ Sect. 3.5. There are, however, recognized
principles that have proven to be of value. For example, the exchange of polar for nonpolar
substituents or the introduction of large groups such as additional aromatic rings
changes some receptor agonists to antagonists. The exchange of both phenolic
hydroxyl groups in isoprenaline 8.12 for two chlorine atoms (DCI, 8.21) or for an additional
aromatic ring (pronethalol, 8.22) delivered the first β-adrenergic antagonists, the
β-blockers. The introduction of an oxygen atom in the side chain and further structural
optimization afforded the first β1-selective antagonists, for example, practolol 8.23 and
metoprolol 8.24. The β1-selective partial agonist xamoterol 8.25 is a blocker as well as
an agonist (Fig. 8.6). It occupies β1 receptors and displays a moderately stimulating
effect. By occupying the receptor, it protects it from an excessive response upon
elevated adrenaline release, for instance, from exercise or stress.
Analogously, the exchange of the imidazole ring of histamine 8.26 for large
hydrophobic groups led to the first H1 antagonists, for instance, diphenhydramine
8.27 (Fig. 8.7). Sedation is the most troublesome side effect of the classic H1
antagonists, which are used to treat allergies. The non-sedating terfenadine

[Structures for Fig. 8.6: 8.21 DCI; 8.22 Pronethalol; 8.23 Practolol, R = -NHCOCH3; 8.24 Metoprolol, R = -CH2CH2OMe; 8.25 Xamoterol]

Fig. 8.6 3,4-Dichloroisoprenaline 8.21 (DCI) and pronethalol 8.22, the first unspecific
β-blockers, were derived from isoprenaline 8.12. Practolol 8.23 and metoprolol 8.24 are specific
β1 antagonists. Xamoterol 8.25 is a partial β1 agonist, a combined agonist and antagonist.

[Structures: 8.26 Histamine (H agonist); 8.27 Diphenhydramine, non-polar H1 antagonist (sedating); 8.28 Terfenadine, R = CH3, polar H1 antagonist (non-sedating); Fexofenadine, active metabolite, R = -COOH]

Fig. 8.7 By starting with histamine 8.26 and introducing large hydrophobic groups, the H1
antagonists, for instance, diphenhydramine 8.27, were obtained. The non-sedating
terfenadine 8.28 (R = CH3) crosses the blood–brain barrier but is immediately
expelled by a transporter. In the meantime the active metabolite, fexofenadine with
R = COOH, is in the market.

8.28 (R = CH3) can cross the blood–brain barrier because of its high lipophilicity, but
is immediately expelled by a transporter. Because of its cardiotoxicity, terfenadine
has since been withdrawn from the market and replaced by its active
metabolite fexofenadine 8.28 (R = COOH). The sedating side effects of antihistamines
also led to neuroleptics and antidepressants (▶ Sect. 1.6). Here, however, the
limits of rational drug optimization are apparent. Promethazine 8.29 is an antihis-
tamine with antiallergic action and sedating side effects. The neuroleptic chlor-
promazine 8.30 is a central depressant and therefore an antipsychotic; the
extraordinarily similar structure of imipramine 8.31 acts, on the other hand, as
a stimulant and is an antidepressant (Fig. 8.8). All three substances have different
mechanisms of action. The introduction of additional aromatic rings to other
receptor agonists, for instance, to the neurotransmitters acetylcholine and dopa-
mine, has led to antagonists (Fig. 8.9).

8.6 Optimizing Bioavailability and Duration of Effect

The absorption of the majority of pharmaceuticals depends only on their
lipophilicity. The more polar the drug, the more poorly it can penetrate the lipid
membrane, and the lower the absorption (▶ Sect. 19.6). Increasing the lipophilicity
improves the absorption (▶ Sect. 19.6). Extremely lipophilic compounds are insol-
uble in water, and the absorption is too slow. Lipophilic acids and bases offer
advantages here, if their acidity constant is not too far away from the neutral point,
pH 7. In their ionized form they are highly water soluble, while in their neutral form,
with which they are in equilibrium, they are lipophilic and membrane penetrable.

[Structures: 8.29 Promethazine, H1 antagonist; 8.30 Chlorpromazine, neuroleptic; 8.31 Imipramine, antidepressant]

Fig. 8.8 Closely related structures of active substances can have very different qualitative
activity. Chlorpromazine 8.30, a dopamine antagonist with neuroleptic activity, and imipramine
8.31, a dopamine transporter inhibitor with antidepressant activity, are both derived from
promethazine 8.29, an H1 antagonist with antiallergic activity.

[Structure: 8.26 Histamine (positively charged form at pH = 7) and the derived pharmacophore with the features A, D, and P]

Fig. 8.9 The active substance histamine 8.26 and pharmacophores that are attributed to it
(A acceptor, D donor, P positively charged group).

These correlations are discussed in detail in ▶ Sect. 19.5. The molecular size influ-
ences the bioavailability insofar as substances with a molecular weight above
500–600 Da are captured by the liver solely because of their molecular size and
are quickly excreted with the bile. Aside from this there are substances that penetrate
the membrane regardless of their polarity. These are taken up into the cell or are
eliminated from the cell by transporters (▶ Sect. 30.7). Among these are structural
analogues of amino acids and nucleosides. Classical strategies to extend the duration
of action are the conversion of free hydroxyl groups to ethers (see ▶ Sect. 9.2), the
replacement of esters with amides, and the replacement of metabolically labile amide
groups with isosteres. In a few cases, such structural changes are associated with
a reduction in potency, which is more than compensated for by a longer duration of
action. In the case of peptides the replacement of L-amino acids with D-amino acids,
the inversion of amide groups, and the replacement of larger structural elements with
peptidomimetic groups (▶ Sect. 10.4) have all proven successful.
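
The equilibrium between the ionized, water-soluble form and the neutral, membrane-penetrable form of a lipophilic acid or base, as described above, can be made concrete with the Henderson–Hasselbalch relationship. The following is a minimal sketch only: the function name and the example pKa values are illustrative assumptions and are not taken from this chapter.

```python
def neutral_fraction(pka: float, ph: float, is_acid: bool) -> float:
    """Fraction of a monoprotic acid or base that is present in neutral form.

    Henderson-Hasselbalch:
      acid HA:  [A-]/[HA]  = 10**(pH - pKa)
      base B:   [BH+]/[B]  = 10**(pKa - pH)
    The neutral fraction is therefore 1 / (1 + ratio of ionized to neutral).
    """
    exponent = (ph - pka) if is_acid else (pka - ph)
    return 1.0 / (1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Hypothetical compounds, chosen only to illustrate the trend at pH 7.
    examples = [
        ("acid, pKa 4.5", 4.5, True),
        ("acid, pKa 6.5", 6.5, True),
        ("base, pKa 7.5", 7.5, False),
        ("base, pKa 10.0", 10.0, False),
    ]
    for label, pka, is_acid in examples:
        fraction = neutral_fraction(pka, ph=7.0, is_acid=is_acid)
        print(f"{label}: {100 * fraction:.2f} % neutral at pH 7")
```

The closer the pKa lies to pH 7, the larger the share of the neutral species that is available for membrane passage, while a sufficient ionized share remains for water solubility.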
The metabolism of aliphatic amino groups can be suppressed with alkyl substi-
tution or branching at the α carbon. Secondary alcohols can be converted to the
more bioavailable tertiary alcohols by introducing an ethinyl group at the same
carbon atom (▶ Sect. 28.5). The introduction of an isosteric fluorine atom in the
para position as a replacement for hydrogen prevents hydroxylation in this position.
If steric considerations do not play a role, the para position can also be blocked
with a larger group, such as a chlorine atom or a methoxy group. For the neurotransmitters
dopamine, adrenaline, and noradrenaline, which are hydroxylated in the 3- and 4-positions,
the conversion to the monohydroxylated analogues, to 3,5-dihydroxy compounds, or to
the NH-isosteric indole group (Fig. 8.1, Sect. 8.2) led to metabolically more stable
and therefore longer-acting compounds.

8.7 Variations of the Spatial Pharmacophore

Rational design is characterized by the fact that the common features of all
active compounds, and the differences from less potent or inactive analogues,
can be derived from the structure of the pharmacophore. A pharmacophore
(Fig. 8.9) is defined as a spatial arrangement of particular functionalities that
are common to more than one drug and form the basis of the biological activity
(▶ Sect. 17.1).
During the course of rational optimization the molecular scaffold and the sub-
stituents at a pharmacophore are changed to maintain the principal function while
arriving at higher potency or better selectivity. Many computer methods have been
developed to generate ideas for the spatial isomorphic replacement of ligand scaf-
folds. By considering the conformational aspects of the molecules (▶ Chap. 16,
“Conformational Analysis”), they scan databases to find possible candidates that,
despite a different parent scaffold, can place the side chains and interacting groups
in the same spatial orientation. Examples of such approaches are presented in
▶ Sect. 10.8 and ▶ Chap. 17, “Pharmacophore Hypotheses and Molecular Com-
parisons”. But an indirect approach using the protein structure has also been tried.
For this, the spatial structure of the protein–ligand complex is the starting point
from which a part of the binding pocket is cut out, and new building blocks for the
ligand are sought. Subsequently the form and interaction properties of the cut-out
pocket are compared with a database of all known protein–ligand complexes
(▶ Sect. 20.4). If a subpocket is discovered that has similarities to the sought-
after pocket, then ligands that bind there provide an interesting design hypothesis.
The structure of the building blocks that occupy the newly discovered pocket can
generate ideas for isosteric structural elements in a modified ligand.
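
The idea of searching for different scaffolds that place the interacting groups in the same spatial orientation can be illustrated with a strongly simplified sketch. It assumes that each molecule has already been reduced to a single conformation with labelled feature points (A acceptor, D donor, P positively charged group, as in Fig. 8.9) and compares only the distances between these points. The function names, the coordinates, and the tolerance of 1 Å are hypothetical; real programs additionally handle conformational flexibility, partial matches, and mirror images.

```python
from itertools import combinations
from math import dist


def distance_signature(features):
    """Sorted list of (type_a, type_b, distance) for all pairs of features.

    `features` is a list of (feature_type, (x, y, z)) tuples in angstroms.
    """
    signature = []
    for (type_a, xyz_a), (type_b, xyz_b) in combinations(features, 2):
        first, second = sorted((type_a, type_b))
        signature.append((first, second, dist(xyz_a, xyz_b)))
    return sorted(signature)


def same_pharmacophore(query, candidate, tolerance=1.0):
    """True if both feature sets show the same pattern of pair distances.

    Note: pure distance comparison cannot distinguish mirror images.
    """
    sig_query = distance_signature(query)
    sig_candidate = distance_signature(candidate)
    # The feature-type pairs must be identical in both molecules.
    if [s[:2] for s in sig_query] != [s[:2] for s in sig_candidate]:
        return False
    # Corresponding distances must agree within the tolerance.
    return all(abs(q[2] - c[2]) <= tolerance
               for q, c in zip(sig_query, sig_candidate))


if __name__ == "__main__":
    # Hypothetical coordinates for a query and two database entries.
    query = [("A", (0.0, 0.0, 0.0)), ("D", (3.0, 0.0, 0.0)), ("P", (0.0, 4.0, 0.0))]
    other_scaffold = [("P", (1.0, 5.2, 1.0)), ("A", (1.0, 1.0, 1.0)), ("D", (4.1, 1.0, 1.0))]
    too_extended = [("A", (0.0, 0.0, 0.0)), ("D", (3.0, 0.0, 0.0)), ("P", (0.0, 8.0, 0.0))]
    print(same_pharmacophore(query, other_scaffold))
    print(same_pharmacophore(query, too_extended))
```

Because only internal distances are compared, the result does not depend on how the molecules are positioned or rotated in space, which is the property needed when scanning a database of unrelated scaffolds.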
A different strategy that also considers the pharmacophore can be successful.
In this approach the pharmacophore is retained and only those groups are modified
that affect the pharmacokinetic properties, that is, the transport, distribution,
metabolism, and excretion of a molecule. An efficient and pragmatic strategy is
important. For this, it is essential that not too many changes are made at the same
time, and the changes should not be too biased. With little synthetic effort, a broad
spectrum of physicochemical properties and spatial arrangements should be
covered.
In the meantime it has been established that binding to human plasma proteins
such as serum albumin and the acidic α1-glycoprotein is of decisive importance for
the transport and pharmacokinetic properties of a drug. Therefore binding to these
proteins is considered even in the early phase of drug development (▶ Chap. 19,
“From In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”). On
the other hand, binding to the hERG ion channels (so-called “antitarget”) is
avoided because blocking these channels can lead to arrhythmias (▶ Sect. 30.3).
Drug metabolism is in itself a very important theme and must be considered in
earlier phases of development. The cytochrome P450 enzymes are responsible
for the vast majority of chemical transformations that occur on xenobiotics
(▶ Sect. 27.6). To be able to predict the behavior of drug candidates at this stage
of the development process, the expected interactions with these metabolic
enzymes are evaluated in an early phase of optimization. The expression of P450
enzymes can also be induced by xenobiotics. The trigger for this could be the
binding to a transcription factor like the PXR receptor (▶ Sect. 28.7). Drug candi-
dates binding to this transcription factor can be evaluated early in their development
to avoid this undesirable enhanced metabolism.

8.8 Optimizing Affinity, Enthalpy, and Entropy of Binding and Binding Kinetics

Generally, the binding affinity to a target protein is primarily improved during
the course of optimization. If multiple candidates are available, the ligand
efficiency (▶ Sect. 7.1) in addition to the chemical accessibility leads the way.
Small, potent lead structures offer legitimate hope that they can be well optimized.
Very small compounds that have nanomolar affinity, despite their low molecular
weight, can be problematic. Most of the time an optimal interaction pattern
is already established. It is then almost impossible to transfer this pattern to another
molecular scaffold. Medicinal chemists have established a set of rules based
on experience (▶ Sect. 4.10). According to these rules it is possible to judge
how much a particular group, if correctly placed, can contribute to the binding
affinity.
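
How ligand efficiency can be used to compare candidates of different size may be sketched as follows. The sketch assumes the commonly used definition, the binding free energy divided by the number of non-hydrogen atoms, together with ΔG = RT ln Kd at an assumed temperature of 298.15 K. The function names and the two example candidates are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kJ/mol from a dissociation constant in mol/L."""
    return R_KJ * temperature_k * math.log(kd_molar)


def ligand_efficiency(kd_molar: float, heavy_atoms: int,
                      temperature_k: float = 298.15) -> float:
    """Free energy of binding per non-hydrogen atom, in kJ/mol per atom."""
    return -delta_g_from_kd(kd_molar, temperature_k) / heavy_atoms


if __name__ == "__main__":
    # Hypothetical candidates: a small, weak binder and a larger, stronger one.
    candidates = [("small fragment", 1e-4, 12), ("larger hit", 1e-6, 30)]
    for name, kd, atoms in candidates:
        print(f"{name}: dG = {delta_g_from_kd(kd):.1f} kJ/mol, "
              f"LE = {ligand_efficiency(kd, atoms):.2f} kJ/mol per atom")
```

In this hypothetical comparison the weaker but smaller compound binds more efficiently per atom, which is the argument for the statement that small, potent lead structures can be well optimized.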
It was shown in ▶ Sect. 4.10 that the affinity is a combination of the enthalpic
and entropic contributions. Usually one begins with a lead structure that has
a binding affinity in the micromolar range. Expressed as the Gibbs free
energy ΔG, this is usually about −30 kJ/mol. An increase in the binding affinity of
4–5 orders of magnitude corresponds to an improvement in ΔG of 20–30 kJ/mol. Where
should the screw be turned to optimize a lead structure? Does it make more sense to
improve the binding enthalpy, or is one better advised to improve the binding
entropy? Given the enthalpy/entropy compensation described in ▶ Sect. 4.10, is it
even possible to attempt optimization of both values independently? The prereq-
uisite for using such a concept in the optimization is the determination of both
values of a lead structure. Does this help in the choice of the right candidate for
optimization? In the case that the thermodynamic binding profiles of multiple
alternative lead candidates are known, should enthalpically or entropically
driven binders be chosen for optimization? It is very interesting to compare the
thermodynamic signatures of multiple generations of marketed products. The
binding profiles for HIV protease inhibitors (▶ Sect. 24.3) and HMG-CoA

[Bar charts, upper panel HIV protease inhibitors, lower panel statins: ΔG, ΔH, and −TΔS in kcal/mol, axis from 5 to −20]

Fig. 8.10 Between 1995 and 2006, the profile of multiple development generations of HIV
protease inhibitors (upper, for formulae see ▶ Fig. 24.15) and statins as HMG-CoA reductase
inhibitors (lower, for formulae see ▶ Fig. 27.13) could be optimized for their thermodynamic
signatures, that is, the extent to which they are driven by entropy or enthalpy. The free energy
ΔG is shown in red, the enthalpy ΔH in blue, and the entropic contribution −TΔS in green. The
more negative the column becomes, the stronger the binding affinity and the more the profile is
determined by enthalpy or entropy. The initially developed compounds such as indinavir,
saquinavir, nelfinavir, and pravastatin were entropic binders; in contrast, the newer derivatives
such as darunavir or rosuvastatin have an improved enthalpic profile.

reductase inhibitors (▶ Sect. 27.3) are displayed in Fig. 8.10. Notably, the profile has been
successfully shifted from initially strongly entropically driven binders to
enthalpically driven ones. This observation suggests that it is initially simpler to
optimize a substance’s entropic binding contribution than its enthalpic contribu-
tion. Most of the time this can be seen with the first lead structure, for which an
enlargement of the hydrophobic surface area leads to better binding. The affinity
that is gained is explained by the displacement of ordered water molecules
(▶ Sect. 4.6). Such contributions are assumed to be entropically favorable.
A strategy of introducing rigid rings can also be pursued. In doing so, the com-
pound loses degrees of freedom. If the geometry of the bound state is correctly
frozen, the binding is improved for entropic reasons. An example of this is the

[Structures 8.32 and 8.33 with their thermodynamic binding data]

Compound   ΔG (kJ/mol)   ΔH (kJ/mol)   −TΔS (kJ/mol)
8.32       −42.3         −6.2          −36.1
8.33       −49.2         −48.5         −0.7

Fig. 8.11 The rigid thrombin inhibitor 8.32 only has a small number of rotatable bonds. It has an
optimal shape complementarity to the binding pocket of thrombin. Its binding is, for the most part,
entropically driven. On the other hand, the considerably more flexible ligand 8.33 has a higher
enthalpic binding contribution.

binding of the largely rigid thrombin inhibitor 8.32, which binds in an almost
exclusively entropically driven manner to the protein (Fig. 8.11). In contrast, the
decidedly more flexible ligand 8.33 displays a large enthalpic binding contribu-
tion. Compound 8.32 represents the result of an optimization that led to a substance
with single-digit nanomolar binding and an optimal shape complementarity for the
binding pocket of thrombin.
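
The thermodynamic signatures discussed here follow the relationship ΔG = ΔH − TΔS. As a minimal sketch, the values given in Fig. 8.11 can be checked for consistency and assigned to a predominantly enthalpic or entropic profile. The class and function names are illustrative assumptions, as is the temperature of 298.15 K used to convert orders of magnitude in affinity into free energy.

```python
import math
from dataclasses import dataclass

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


@dataclass(frozen=True)
class BindingProfile:
    name: str
    dh: float         # enthalpy contribution, Delta H, in kJ/mol
    minus_tds: float  # entropic contribution, -T Delta S, in kJ/mol

    @property
    def dg(self) -> float:
        """Gibbs free energy of binding: Delta G = Delta H - T Delta S."""
        return self.dh + self.minus_tds

    def driver(self) -> str:
        """Label the profile by its more favorable (more negative) term."""
        return "enthalpic" if self.dh < self.minus_tds else "entropic"


def ddg_for_affinity_gain(orders_of_magnitude: float,
                          temperature_k: float = 298.15) -> float:
    """Change in Delta G (kJ/mol) for a gain of n powers of ten in affinity."""
    return -R_KJ * temperature_k * math.log(10.0) * orders_of_magnitude


if __name__ == "__main__":
    # Values for the thrombin inhibitors as listed in Fig. 8.11.
    for profile in (BindingProfile("8.32", -6.2, -36.1),
                    BindingProfile("8.33", -48.5, -0.7)):
        print(f"{profile.name}: dG = {profile.dg:.1f} kJ/mol ({profile.driver()})")
    for orders in (4, 5):
        print(f"{orders} orders of magnitude: {ddg_for_affinity_gain(orders):.1f} kJ/mol")
```

The sums reproduce the ΔG values stated in the figure, and the conversion shows that each order of magnitude in affinity corresponds to somewhat less than 6 kJ/mol at room temperature.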
It seems that generally applicable concepts exist for entropy-driven optimization.
If one can “always win entropically,” then for theoretical reasons enthalpically
favored lead structures should be preferred as a starting point for optimization.
However, caution is called for here. Why a ligand has a particular thermody-
namic profile must be clarified. The inhibitors 8.34 and 8.35 were discovered in
a virtual screening as aldose reductase inhibitors (Fig. 8.12). The chemical struc-
tures of both ligands are very similar. Nevertheless one is an enthalpically driven
binder, and the other is an entropically driven binder. The crystal structure of both
ligands with the protein delivered the reason: the enthalpically preferred inhibitor
8.34 entraps a water molecule, which mediates binding between the ligand and the
protein, whereas the other one does not. The incorporation of a water molecule is
entropically disfavored, and therefore the profile appears to be that of an enthalpic
binder. A resistance profile for inhibitors against mutants of the viral HIV protease
was investigated in the research group of Ernesto Freire at The Johns Hopkins
University in Baltimore (▶ Sect. 24.5). Interestingly, the result was that resistance
to the entropically favored inhibitors could be developed much faster than to
inhibitors with enthalpic advantages. This observation indicates that it is worth-
while to concentrate on enthalpically favored binders in cases in which resistance
can be expected to develop. In the investigated example the enthalpically driven

[Structures 8.34 and 8.35 with their thermodynamic binding data]

Compound   ΔG (kJ/mol)   ΔH (kJ/mol)   −TΔS (kJ/mol)
8.34       −35.4         −25.6         −9.8
8.35       −31.3         −8.7          −22.6

Fig. 8.12 Compounds 8.34 and 8.35 were discovered in a virtual screening as lead structures for
the inhibition of aldose reductase. Although they are structurally similar, 8.34 is a stronger
enthalpic binder and 8.35 is an entropic binder. The subsequent crystal structure analysis of the
complex with the reductase showed that 8.34 traps a water molecule upon binding, whereas this
was not observed with 8.35. Because the entrapment of a water molecule is entropically unfavor-
able, the binding of 8.34 is enthalpically preferred.

binder 8.33 had a less-rigid scaffold (Fig. 8.11). This allows it to more easily elude
changes that are caused by mutations. It is much more difficult for rigid ligands that
bind for entropic reasons to adapt to such steric modifications.
On the other hand, entropic binders can also have an advantage in escaping
resistance. If a ligand is entropically favored because it adopts multiple binding
modes, and even exhibits large residual mobility in the binding pocket when bound,
this can prove to be beneficial! If the protein tries to change the shape of its binding
pocket through resistance mutations to this inhibitor, an incorporated ligand that is
able to adopt multiple binding modes is left with alternative orientations, which,
despite the mutation, still offer good binding.
If it is clear that a lead structure is an enthalpically driven binder, and
superimposed effects such as the entrapment of water molecules have not distorted
the profile, how is the binding of an enthalpically driven binder optimized? Let us
remember the consideration in ▶ Sects. 4.5 and ▶ 4.8: hydrogen bonds, electrostatic
interactions, and van der Waals contacts determine the binding enthalpy. However,
a change in such an interaction property of a molecule is often coupled with
a compensation of enthalpy and entropy. The result is that ΔG and the binding
affinity do not change at all! The optimization process can be compared to the act of
getting around the inherent enthalpy/entropy compensation. Enthalpically favorable
hydrogen bonds should have an optimal geometry and should not induce severe
structural changes in the protein environment. Otherwise this can lead to an entropic
compensation by causing a shift in the dynamic degrees of freedom. It seems to be
more favorable to strengthen the hydrogen bonds in structurally rigid regions of the
binding pocket. There, enthalpy is better gained because the compensatory shift in
dynamic parameters is less likely. Newly introduced hydrogen bonds should also not reduce
the degree of desolvation of a bound ligand by inducing small structural
changes in the binding geometry of hydrophobic groups, which then become more strongly
exposed to the surrounding solvent environment. It is also important that the local
water structure in the binding pocket remains unchanged.
Another essential question has to do with the optimal interaction kinetics that
a ligand should have. Surface plasmon resonance was introduced in ▶ Sect. 7.7.
The question of whether a ligand binds quickly or slowly to a protein and with what
rate it is released again can be determined with this method. Ideally, how long
should a ligand stay bound to a protein, what is the optimal residence time? The
binding affinity is determined by the relative ratio of the association rate (kon) and
the dissociation rate (koff). It has been shown that structurally similar ligands can
have entirely different kinetic profiles. Which profile is optimal? A loss in affinity
can manifest itself as an increased dissociation rate, or a slower association rate, as
well as a combination of both effects. It was shown in the research group of Helena
Danielson in Uppsala that different binding profiles of therapeutically used HIV
protease inhibitors correlate with the development of resistance to mutants of the
protease. They also demonstrated that resistance forms more rapidly against drugs
that have a higher dissociation rate. This is a decisive criterion to direct drug
optimization in the correct direction. Certainly the kinetic binding profile must be
granted a greater priority in the future. Therefore, a more comprehensive correla-
tion between the structure and the binding is necessary so that this knowledge can
be used for targeted design. Until now, what differentiates a “fast” from a “slow” binder
has only been understood in very few cases. The relevant parameters have to do
with the induced-fit adaptations of the protein. They can also involve the ease with
which the desolvation of the previously uncomplexed binding pocket takes place
or the kinetics with which a ligand in the solvated state sheds its own water
shell. More attention must be paid to these protein- and ligand-based properties.
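
The relationship between affinity and the two rate constants discussed above can be sketched in a few lines. The sketch assumes a simple one-step binding model, in which the dissociation constant is the ratio koff/kon and the residence time is taken as the reciprocal of koff. The function names and the rate constants of the two example ligands are hypothetical.

```python
import math


def kd_from_rates(kon: float, koff: float) -> float:
    """Dissociation constant in mol/L from kon (1/(M s)) and koff (1/s)."""
    return koff / kon


def residence_time(koff: float) -> float:
    """Mean lifetime of the complex in seconds, taken as 1/koff."""
    return 1.0 / koff


def dissociation_half_life(koff: float) -> float:
    """Time in seconds after which half of the complexes have dissociated."""
    return math.log(2.0) / koff


if __name__ == "__main__":
    # Two hypothetical ligands with the same affinity but different kinetics.
    ligands = [("fast binder", 1e6, 1e-3), ("slow binder", 1e4, 1e-5)]
    for name, kon, koff in ligands:
        print(f"{name}: Kd = {kd_from_rates(kon, koff):.1e} M, "
              f"residence time = {residence_time(koff):.0f} s, "
              f"half-life = {dissociation_half_life(koff):.0f} s")
```

Both hypothetical ligands have the same equilibrium affinity, yet their complexes persist for very different times, which is why the kinetic profile carries information beyond the binding constant.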

8.9 Synopsis

• A lead structure is only the starting point on the way to a drug; potency,
specificity, and duration of action have to be optimized concurrently to minimize
side effects and toxicity.
• The structure of an active substance is determined by its pharmacophore, which
is responsible for target binding. Its adhesion groups enhance potency and
biological activity, its lipophilicity is responsible for transport and distribution,
and groups to be cleaved or modified release the active form.
• Multiple concepts to modify the chemical structure of a lead can be planned;
however, optimization is multifactorial due to highly correlated influences of the
attempted changes.
• Bioisosteric functional group replacement attempts the exchange of groups on
a given skeleton for sterically and electronically related groups that maintain
activity but improve other drug properties.
• Me-too research follows the goal of modifying the competitor’s lead structures
to arrive at patent-free analogues with improved properties.
• Assuming unchanged activity, enlarging a molecule, adding chiral centers,
and rigidification usually improve selectivity, whereas removing chiral
centers, allowing more flexibility, and reducing the size make a drug less
selective.
• The activity spectrum of a substance can be tailored even by the smallest
structural changes that modulate affinity, transportation, distribution, or metab-
olism. Therefore a particular compound class can show activity in quite different
therapeutic indications.
• Transforming agonists to antagonists does not follow clear-cut rules; however,
increasing the size and attaching hydrophobic groups such as aromatic
rings often shift the profile.
• The more polar a drug, the more poorly it can penetrate lipid membranes, and the
lower its absorption. On the other hand, special transporters can assist
penetration.
• Extension of the duration of action is mostly achieved by replacement of
metabolically labile groups with more stable isosteres, the introduction of
more branching groups, blockage of metabolically labile positions at aromatic
rings by F or Cl, or by exchanging L- for D-amino acids concurrently with the
inversion of amide groups.
• Molecular databases can be screened to detect other scaffolds or substitution
patterns that represent a given pharmacophore in an alternative fashion.
• In the early phase of drug development undesired binding to plasma proteins,
antitargets such as the hERG ion channel or preferred binding, inhibition, or
activation of transcription factors or metabolizing cytochrome P450 enzymes are
examined and possibly avoided.
• Proper adjustment of the thermodynamic binding profile can be essential for the
optimization of binding affinity and to endow a drug with the required target-
specific properties. Similarly the interaction kinetics determining binding on and
off rates or residence times are of decisive importance to develop drugs with, for
example, an optimal resistance profile.

Bibliography

General Literature
Sneader W (1985) Drug discovery: the evolution of modern medicines. Wiley, New York
Taylor JB, Triggle DJ (eds) (2007) Comprehensive medicinal chemistry II. Elsevier, Oxford
Wermuth CG (ed) (2008) The practice of medicinal chemistry, 3rd edn. Elsevier-Academic,
New York

Special Literature

Burger A (1991) Isosterism and bioisosterism in drug design. Fortschr Arzneimittelforsch
37:287–371
Copeland RA, Pompliano DL, Meek TD (2006) Drug–target residence time and its implications
for lead optimization. Nat Rev Drug Discov 5:730–740
Fokkens J, Klebe G (2006) A simple protocol to estimate protein binding affinity differences for
enantiomers without prior resolution of racemates. Angew Chem Int Ed Engl 45:985–989
Hansch C (1974) Bioisosterism. Intra-Science Chem Rept 8:17–25
Lipinski CA (1986) Bioisosterism in drug design. Ann Rep Med Chem 21:283–291
Ohtaka H, Freire E (2005) Adaptive inhibitors of the HIV-1 protease. Prog Biophys Mol Biol
88:193–208
Shuman CF, Markgren P-O, Hämäläinen M, Danielson UH (2003) Elucidation of HIV-1 protease
resistance by characterization of interaction kinetics between inhibitors and enzyme variants.
Antiviral Res 58:235–242
Steuber H, Heine A, Klebe G (2007) Structural and thermodynamic study on aldose reductase:
nitro-substituted inhibitors with strong enthalpic binding contribution. J Mol Biol 368:618–638
Thornber CW (1979) Isosterism and molecular modification in drug design. Chem Soc Rev
8:563–580

9 Designing Prodrugs

After the optimization of a lead structure there are still problems. Many substances
lack important characteristics that are required for therapy in humans, for instance,
adequate bioavailability, duration of action and metabolic stability, the ability to
penetrate the blood–brain barrier, selectivity, or good tolerability. Often it proves
impossible to address or improve these properties through structural variation. A
solution to this problem can be found through special preparations, for instance to
be used for poorly water-soluble substances, or via a derivatization to a prodrug.
This term refers to a non-active or poorly active precursor or derivative of an active
molecule. In the organism this form is converted to the actual active substance. In
most cases, this is achieved by enzymatic reactions; in a few cases it happens by
spontaneous chemical decomposition.
Aside from this, the metabolites of some drugs also show favorable therapeutic
properties. In some cases this has led to new and improved drugs, in other cases the
original substance was retained as a prodrug.

9.1 Foundations of Drug Metabolism

Multiple factors have crucial importance for the absorption, bioavailability, and
duration of action of an active substance. The most important are the solubility and
lipophilicity of the drug, which are nearly equal in importance, followed by the
molecular size and the metabolic stability. The terms absorption and bioavailability
have very different meanings. Absorption refers to the amount of active substance
that is taken up by the entire gastrointestinal tract. The bioavailability refers to just
the portion of the active substance that is available in the circulation after the first
pass through the liver.
After oral administration, the metabolism of the substance by enzymes begins.
Ester and amide bonds are hydrolyzed, often already in the stomach and intestines,
or by passage through the stomach and intestinal wall. The entire blood volume
that flows through the intestines goes first to the liver via the portal vein (Fig. 9.1).
This passage is called “first pass”. Because of its rich spectrum of hydrolyzing,

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_9,
© Springer-Verlag Berlin Heidelberg 2013

[Schematic with the labels: Drug, Gastrointestinal Wall, Portal Vein, Liver, Bile, Feces, Circulatory System, Organs, Kidney, Urine, Metabolites]

Fig. 9.1 Schematic sketch of the “lifecycle” of a drug after oral administration. The drug is
already metabolized during the passage through the stomach or intestinal wall, and above all, at the
first pass through the liver. Lipophilic drugs and substances with a molecular weight of more than
500–600 Da are excreted with the bile. Polar substances and conjugated and/or metabolic products
(metabolites) are excreted by the kidneys.

oxidizing, reducing, and conjugating enzymes, the liver is the main site of drug
degradation, that is, metabolism. A drug can have poor bioavailability despite
good absorption because of fast and pronounced metabolism in the liver. For many
substances, the first pass is already ‘the end of the road’. They are well absorbed,
but are immediately metabolized or excreted in the bile. The “first-pass effect”
refers to cases of successful and extensive metabolism in the very first passage.
Lipophilic active substances and those with a molecular weight of more than
500–600 Daltons (Da) are susceptible to particularly intense first-pass effects. Of
course, blood flows continuously through the liver, and metabolism carries on. The
substances are no longer in the blood stream at as high a concentration as they were
before the first liver passage because they have been distributed to the tissue. In
general, the hydrolytic cleavage of ester or amide groups leads to highly water-
soluble metabolites that can be excreted by the kidneys. Conjugation, that is, the
coupling of the substance with native polar substances, for instance, with sulfate
groups, the amino acid glycine, or the glucose oxidation product glucuronic acid,
leads to easily excreted products. In humans, conjugation has great importance. It
is more critical if the substance has neither easily degradable functional groups nor
conjugation positions. Nonetheless humans have enzymes that can metabolize xeno-
biotics. Among these, the cytochrome P450 isoenzymes are particularly important
because they are able to chemically change a molecule oxidatively at various posi-
tions. Usually this leads to better water solubility and therefore better-excretable
substances. Because these enzymes cannot predict what properties the metabolites
of these biotransformations will possess, it can occasionally happen that toxic com-
pounds ensue that have mutagenic or carcinogenic properties (▶ Sect. 27.6).
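
The difference between absorption and bioavailability described above can be illustrated with a simple multiplicative model. It assumes that the absorbed fraction is reduced successively by extraction in the gastrointestinal wall and by extraction during the first pass through the liver. This is a common simplification, not a formula given in this chapter, and the function name and example values are hypothetical.

```python
def oral_bioavailability(fraction_absorbed: float,
                         gut_wall_extraction: float,
                         hepatic_extraction: float) -> float:
    """Fraction of an oral dose that reaches the circulation unchanged.

    All three arguments are fractions between 0 and 1:
      fraction_absorbed    - share of the dose taken up from the gut
      gut_wall_extraction  - share metabolized in the gastrointestinal wall
      hepatic_extraction   - share removed during the first liver passage
    """
    for value in (fraction_absorbed, gut_wall_extraction, hepatic_extraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all fractions must lie between 0 and 1")
    return (fraction_absorbed
            * (1.0 - gut_wall_extraction)
            * (1.0 - hepatic_extraction))


if __name__ == "__main__":
    # Hypothetical drug: well absorbed, but extensively metabolized in the liver.
    f = oral_bioavailability(fraction_absorbed=0.9,
                             gut_wall_extraction=0.0,
                             hepatic_extraction=0.8)
    print(f"bioavailability: {100 * f:.0f} % of the dose")
```

The hypothetical example reflects the situation described in the text: a substance can be well absorbed and still show poor bioavailability because of a pronounced first-pass effect.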
Evolution has had time over millions of years to hone the degradation and
excretion of foreign substances. For many compounds however, the system fails.
Instead of detoxifying, the opposite happens, a “poisoning”. The carcinogenic
effect of polycyclic hydrocarbons is attributed to an oxidative assault, just as is

[Scheme: 9.1 Benzene, oxidized to the epoxide, followed by further metabolization and conjugation with macromolecules; 9.2 Toluene (CH3), 9.3 Benzoic Acid (COOH), 9.4 Hippuric Acid (CONHCH2COOH)]

Fig. 9.2 The oxidation of benzene 9.1 leads to a reactive and toxic intermediate. In contrast, the
oxidation of toluene 9.2 affords benzoic acid 9.3, which can be excreted by the kidney as its
nontoxic glycine conjugate 9.4.

the bone marrow damage and blood disease that is caused by benzene 9.1. The
simplest alkyl homologue of benzene, toluene 9.2, is less toxic simply
because it can be oxidized to benzoic acid 9.3, which, after conjugation with the
amino acid glycine, can be excreted as hippuric acid 9.4 (Fig. 9.2). There are even
more conjugation possibilities available for the benzoic acid intermediate.
One can speculate as to why no multienzyme complexes have evolved to
immediately convert toxic intermediates into polar, nontoxic metabolites. In any
case, it is an almost unsolvable problem because the properties of the metabolites
would have to be predicted for each xenobiotic. A modification that leads to
improved water solubility in one compound can cause a mutagenic effect in
another. For their own protection humans have, in fact, mechanisms for trapping
reactive metabolites. Here glutathione and glutathione transferase must be men-
tioned because they can detoxify electrophiles particularly well (▶ Sect. 27.7).
Perhaps toxic or carcinogenic effects were not a particularly decisive theme for
evolution until now. Tumors play a secondary role for most animals because of their
short lifespan. Up until just a few generations ago, war and infectious diseases were
the primary causes of death in humans. It has only been in recent times that the
average life expectancy increased. In the sense of evolution, aging individuals play
only a secondary role. Once reproduction is complete, the parents are only neces-
sary for the care of their young until early adulthood. One only needs to think of
female spiders that consider their mates to be nothing more than their next prey
immediately after copulation!
From the above-described examples of toxic chemicals, the wrong conclusion
should not be drawn that only human-made substances can cause cancer. That is
true for a few natural products as well, for instance, aflatoxins. These microbial
secondary metabolites, which form in spoiled nuts and other foodstuffs, are potent
carcinogens. Certain alkaloids, for example, from the Spurge family (Euphorbiaceae)
are also strongly cancer-promoting substances; they are so-called tumor promoters.
The principle of nil nocere (Lat. do not harm) is strictly applied to medicines,
and only slowly have these standards been applied to other materials in our

environment. For the testing and development of active compounds, this means that
particularly rigorous tests for carcinogenic, mutagenic, and teratogenic effects must
be conducted. A well-founded suspicion that a compound or one of its possible
metabolites displays such effects is sufficient reason to stop its further
development.

9.2 Esters Are Ideal Prodrugs

Establishing satisfactory water solubility in substances that are simultaneously
suitable for passive transport across membranes is a special challenge in
pharmaceutical optimization. Nowadays attention is paid to the correct balance of
these parameters already in the early phase of development (▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”). If it is not
possible to achieve this optimum with the actual active substance, esters are often
produced as suitable prodrugs. Esters are easily cleaved by ubiquitously occurring
esterases. The improved lipophilicity helps with the passive transport through
diffusion over membrane barriers, as found in the intestines and above all else
the blood–brain barrier. One prodrug that has sadly achieved infamy is heroin 9.5
(Fig. 9.3), the diacetyl ester of morphine (▶ Sect. 3.3). Because of its markedly
increased lipophilicity, heroin penetrates the blood–brain barrier quickly. The
pharmacologist Heinrich Dreser, who tested acetylsalicylic acid at Bayer, intro-
duced heroin to therapy in 1898 as a pain and cough medicine because of its
minimal respiratory depression. But heroin belongs to the substances with the
highest addictive potential. Its abuse is an enormous social problem in many
countries. It is used therapeutically in exceptional cases, for instance, for pain
therapy in cancer patients, particularly those who have exhausted other
therapeutic options.
Many other prodrugs are also esters. The transformation from an acid or alcohol
group to an ester usually leads to a better-absorbable product. The formerly used
antilipidemic clofibrate 9.6 (▶ Sect. 28.6) is just such an example of a bioavailable
ester of a biologically active free acid 9.7. The angiotensin-converting enzyme
inhibitor enalapril 9.8 (▶ Sect. 25.4) and its analogues are also prodrugs. The free
acid 9.9 is not absorbed, but it is the active form in vitro (Fig. 9.3). The diester is
chemically unstable and quickly forms the inactive diketopiperazine 9.10. It is
essential that only one of the acid groups is esterified to prevent the formation of
this side product. The monoester 9.8 is “interpreted” as a dipeptide and is transported
over the cell membrane by an oligopeptide transporter (▶ Sect. 30.7). The β-lactam
antibiotics (▶ Sect. 23.7) are also taken up by this transporter.
Hydroxymethylglutaryl-coenzyme A 9.11 (HMG-CoA) is enzymatically
reduced to mevalonic acid 9.12 in the biosynthesis of cholesterol (Fig. 9.4). The
antilipidemic lovastatin 9.13 (▶ Sect. 27.3) prevents this reaction by inhibiting
HMG-CoA reductase. It contains a lactone ring, which is transformed to its active
form 9.14 by hydrolysis. This form is structurally very similar to the product of the
enzymatic reaction, mevalonic acid 9.12.

[Structures for Fig. 9.3: heroin 9.5; clofibrate 9.6 (R = Et) and clofibric acid 9.7 (R = H); enalapril 9.8 (R = Et) and enalaprilat 9.9 (R = H); diketopiperazine 9.10 (R1 = phenethyl, R2 = Me)]

Fig. 9.3 Heroin 9.5, the diacetyl derivative of morphine, acts reliably and quickly, “heroically.”
Like morphine, it is slowly and inefficiently absorbed, but after intravenous application it crosses
the blood–brain barrier 100 times faster than morphine. There, the ester is converted by the
enzyme pseudocholinesterase to morphine, which can no longer leave the brain because of its
higher polarity. The cholesterol-lowering drug clofibrate 9.6 is a prodrug of the actual active
compound, the free acid 9.7. The antihypertensive enalapril 9.8 is also a prodrug of the active
compound 9.9. Here, high lipophilicity is not responsible for the absorption; rather, the compound
is actively transported by binding to a dipeptide transporter. The diester of enalapril is unsuitable
as a drug because it spontaneously forms the inactive diketopiperazine 9.10.

Other ester prodrugs were developed for depot formulations to achieve a longer
duration of action after subcutaneous or intramuscular administration.
The phenolic hydroxyl group of bambuterol 9.15 is masked as a carbamate.
Terbutaline 9.16 (Fig. 9.5) is formed from this prodrug after hydrolysis by
unspecific cholinesterases (▶ Sect. 23.7). By using this prodrug strategy it was
possible to make a long-acting bronchospasmolytic that only needs to be adminis-
tered once daily in contrast to the actual active substance, which must be admin-
istered three times daily.
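Why a slowly hydrolyzed prodrug lengthens the duration of action can be illustrated with a small kinetic sketch. It assumes, purely for illustration, that the prodrug is converted to the active compound and that the active compound is then eliminated in two consecutive first-order steps. The function names and rate constants are hypothetical; they are not measured values for bambuterol or terbutaline.

```python
import math

def active_amount(t, k_act, k_elim, dose=1.0):
    """Amount of active drug at time t for prodrug -> drug -> eliminated.

    Both steps are assumed to be first order (a simplification of this sketch).
    """
    if math.isclose(k_act, k_elim):
        # Limiting case of equal rate constants.
        return dose * k_act * t * math.exp(-k_act * t)
    return dose * k_act / (k_act - k_elim) * (
        math.exp(-k_elim * t) - math.exp(-k_act * t)
    )

def time_of_peak(k_act, k_elim):
    """Time at which the amount of active drug reaches its maximum."""
    if math.isclose(k_act, k_elim):
        return 1.0 / k_act
    return math.log(k_act / k_elim) / (k_act - k_elim)

# Hypothetical rate constants in 1/h; only their relative size matters here.
K_ELIM = 0.5
for label, k_act in [("fast activation", 2.0), ("slow activation", 0.1)]:
    print(label,
          "- peak after %.1f h," % time_of_peak(k_act, K_ELIM),
          "fraction of dose present as active drug after 24 h: %.4f"
          % active_amount(24.0, k_act, K_ELIM))
```

In this simplified picture, slow activation makes the formation step rate-limiting, so the active compound is still present long after a rapidly activated compound would have been eliminated, at the price of a lower peak level.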
Occasionally, a prodrug can be used to improve the taste, for instance, in the case
of the extremely bitter chloramphenicol 9.17. By converting it to the palmitate 9.18
(Fig. 9.5) the water solubility is strongly reduced, but the substance no longer tastes
bitter. The concomitant reduction in the absorption is of no consequence. The
substance is hydrolyzed to the highly soluble and easily absorbed chloramphenicol
in the duodenum by the pancreatic lipase enzymes.
The glucoside salicin (▶ Sect. 3.1) represents a true prodrug that after hydrolysis
and oxidation is converted to the anti-inflammatory salicylic acid. In contrast,
acetylsalicylic acid (ASA) is a mixed type. It has its own activity through the

[Structures for Fig. 9.4: HMG-CoA 9.11 is converted by HMG-CoA reductase to mevalonic acid 9.12; lovastatin 9.13 and its active metabolite 9.14]

Fig. 9.4 The enzymatic reduction of hydroxymethylglutaryl-coenzyme A 9.11 (HMG-CoA) to
mevalonic acid 9.12 is inhibited by the lactone-ring-opened active metabolite 9.14 of lovastatin
9.13 (▶ Sect. 27.3).

irreversible inhibition of cyclooxygenase, above all as a coagulation-inhibiting
substance. On the other hand, ASA has prodrug character because the metabolic
release of salicylic acid contributes a small part to the anti-inflammatory effect
(▶ Sect. 27.9). Furthermore, ASA is less irritating to the mucous membranes and
tastes less unpleasant than salicylic acid. For a drug with a molecular weight of
180 Da, this combination of favorable characteristics in one structure is a remarkable
achievement.
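The quoted molecular weight can be checked with a few lines of arithmetic. The sketch below assumes the molecular formula C9H8O4 for acetylsalicylic acid and rounded standard atomic masses; neither is stated in the text, and the helper name is only illustrative.

```python
# Average atomic masses in Da (rounded standard values; an assumption of this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def molecular_weight(composition):
    """Sum the atomic masses for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())

# Acetylsalicylic acid, assumed molecular formula C9H8O4.
ASA = {"C": 9, "H": 8, "O": 4}
print("ASA: %.1f Da" % molecular_weight(ASA))
```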
Esterification can also help with inadequate water solubility of an active sub-
stance. For this, esters with phosphoric acid or hemiesters with dicarboxylic acids
such as succinic acid are formed. The added groups carry a charge and increase the
water solubility of the active substance. In the organism, the esters are easily
hydrolyzed again. The anticonvulsive compound phenytoin could be converted to
a more-hydrophilic phosphate prodrug 9.19 (Fig. 9.5), which is easily hydrolyzed
by phosphatases (▶ Sect. 26.8). If a terminal sulfonamide group, as found in the
prodrugs of celecoxib (9.20, 9.21, Fig. 9.5), is acylated, water-soluble salts are more
easily formed. The acyl group is also easily hydrolyzed in the intestines.
Esterification with polyethylene glycol (PEG) can also be used to enhance
solubility. This very water-soluble polymer has been coupled through an ester
group to the natural product paclitaxel (▶ Sect. 6.2, ▶ 6.5). As PEG-paclitaxel,
this compound can be used as an intravenous chemotherapeutic.

9.3 Chemically Well Wrapped: Multiple Prodrug Strategies

The antibacterial sulfonamide sulfamidochrysoidine (▶ Sect. 2.3) is a prodrug. It is
only after cleavage of the azo bond that the metabolic product, sulfanilamide, acts
as an antimetabolite of p-aminobenzoic acid, which is critical for microorganisms.

[Structures for Fig. 9.5: bambuterol 9.15 and terbutaline 9.16 (bioactivation); chloramphenicol 9.17 (R = H) and its palmitate 9.18 (R = CO(CH2)14CH3); fosphenytoin 9.19; celecoxib prodrugs 9.20 (R = methyl) and 9.21 (R = ethyl); proguanil 9.22 and cycloguanil 9.23 (bioactivation); sulindac 9.24 and its sulfide 9.25 (bioactivation)]

Fig. 9.5 Bambuterol 9.15 is a carbamate-masked prodrug of the bronchospasmolytic terbutaline
9.16. It is transformed to the active compound slowly, by hydrolysis. The prodrug 9.18 of
chloramphenicol 9.17 masks only its extremely bitter taste. Phenytoin can be converted to
a phosphoric acid ester 9.19, which is significantly more water-soluble. The cyclooxygenase
inhibitor celecoxib can be converted to prodrugs (9.20, 9.21) by adding acyl groups; these have
much-improved water solubility. The antimalarial cycloguanil 9.23 is formed by a metabolic
cyclization of the inactive precursor proguanil 9.22. The anti-inflammatory sulindac 9.24 has
100 times better water solubility than its actual active form, the sulfide 9.25. In addition to this
reversible enzymatic reduction, an irreversible enzymatic oxidation to a biologically inactive
sulfone also occurs.

[Structures for Fig. 9.6: ximelagatran 9.26; sibrafiban 9.27]

Fig. 9.6 Ximelagatran 9.26 and sibrafiban 9.27 were developed to improve oral
bioavailability, and contain both an uncharged amidoxime group and an ester function as
a double prodrug.

Additional prodrugs are proguanil 9.22, which is converted to cycloguanil 9.23
(▶ Sect. 27.2), or the anti-inflammatory sulindac 9.24, which is metabolically
converted to the active sulfide 9.25 (Fig. 9.5).
Amidines are used as building blocks in thrombin inhibitors and antagonists of
the integrin receptor αIIbβ3 (▶ Sect. 31.2). These strongly basic groups are
detrimental for good bioavailability. Through oxidation to the corresponding
amidoximes, a less-basic group is formed that is not protonated under physiolog-
ical conditions. Reductases, which are present in the liver, kidney, lung, or brain,
release the original amidine structure. This concept, together with the esterifica-
tion of the terminal acid function, was applied in a double-prodrug strategy for the
thrombin inhibitor ximelagatran 9.26 and the receptor antagonist sibrafiban 9.27
(Fig. 9.6).
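The effect of basicity on the charge state at physiological pH can be estimated with the Henderson–Hasselbalch relationship. The following sketch is only illustrative: the pKa values of roughly 11.5 for an amidine and roughly 5 for an amidoxime are assumed, order-of-magnitude figures that do not come from the text, and the function name is hypothetical.

```python
def fraction_protonated(pka, ph=7.4):
    """Henderson-Hasselbalch: fraction of a base present as its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Assumed, approximate pKa values of the conjugate acids (not taken from the text).
ASSUMED_PKA = {"amidine": 11.5, "amidoxime": 5.0}

for group, pka in ASSUMED_PKA.items():
    print("%s: %.2f%% protonated at pH 7.4" % (group, 100.0 * fraction_protonated(pka)))
```

Under these assumptions the amidine is almost completely charged at pH 7.4, whereas the amidoxime is almost completely neutral, which is the property the double-prodrug strategy exploits.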
The bombing in 1943 of an Allied ship that was docked in an Italian harbor and carried
100 t of mustard gas 9.28 (bis-β-chloroethyl sulfide, Fig. 9.7) led to the observation
that many of those who were poisoned experienced a severe reduction in their white
blood cell counts. This severe toxicity for cells that quickly divide could be used for
killing tumor cells. The cytotoxic effect arises from multiple alkylations of DNA.
Consequently, replication and subsequent cell division are affected. A purposeful
search for analogues of mustard gas with less toxicity led via the N-derivative 9.29 to
the aromatic-substituted derivative 9.30, which still had inadequate tolerability and
tumor specificity. Tumor cells are especially rich in phosphatases. Because of this,
H. Arnold at the German company Chemie Grünenthal reasoned that phosphoric
acid derivatives of nitrogen mustard (N-lost) might be suitable for a tumor-specific
therapy. The most interesting compound was cyclophosphamide 9.31, a substance that
can cause the complete disappearance of tumors in animal experiments. The originally
assumed mechanism is not correct because the substance is inactive in vitro in cell
cultures of tumors. The metabolic activation occurs outside of the tumor, in the liver,
through oxidation (Fig. 9.7).

[Scheme for Fig. 9.7: mustard gas 9.28; N-analog 9.29 (R = CH3); N-aryl analog 9.30 (R = aryl); cyclophosphamide 9.31 undergoes metabolic activation in the liver to a hydroxylated intermediate, which gives the active form 9.32 and acrolein]

Fig. 9.7 The cytostatic N-methyl and N-aryl compounds 9.29 and 9.30 are derived from mustard
gas 9.28. The first step in the activation of the prodrug cyclophosphamide 9.31 is a metabolic
hydroxylation of the carbon next to the nitrogen atom. The biologically active agent 9.32 and the
toxic side product acrolein come from a labile intermediate that is formed by enzymatic degrada-
tion and spontaneous decomposition.

In the case of the cancer therapeutic 5-fluorouracil 9.33, the activation occurs
through tumor-specific enzymes. The triple-prodrug capecitabine 9.34 is initially
activated to 9.35 by a carboxylesterase in the liver (Fig. 9.8). Then cytidine
deaminase cleaves an amino group to give 9.36 in the liver as well as in the
tumor. Lastly thymidine phosphorylase releases the active substance 9.33 in the
tumor cell. There, the compound unleashes its effect by blocking thymidylate
synthase, an enzyme that plays an important role in the thymine biosynthesis
(▶ Sect. 27.2) in that it delivers building blocks for DNA synthesis. Because cancer
cells divide more quickly than healthy cells, they are more dependent on the activity
of thymidylate synthase.
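The logic of this three-step activation can be written down as a small data structure. The sketch below encodes only what the paragraph states (which enzyme acts on which intermediate, and in which tissue). It additionally assumes, as a simplification not made in the text, that enzyme distribution is all-or-nothing and that intermediates formed in the liver reach the tumor via the circulation. All names are hypothetical.

```python
# Activation steps as described in the text: (substrate, product, enzyme, tissues).
CAPECITABINE_CASCADE = [
    ("9.34", "9.35", "carboxylesterase", {"liver"}),
    ("9.35", "9.36", "cytidine deaminase", {"liver", "tumor"}),
    ("9.36", "9.33", "thymidine phosphorylase", {"tumor"}),
]

def species_formed_in(tissue, cascade, available):
    """Return the compounds that can be produced in `tissue`, given the set of
    compounds `available` there (assumed to arrive via the circulation)."""
    formed = set()
    pool = set(available)
    for substrate, product, _enzyme, tissues in cascade:
        if substrate in pool and tissue in tissues:
            formed.add(product)
            pool.add(product)
    return formed

# The liver sees the prodrug 9.34; the tumor sees 9.34 plus whatever the liver formed.
in_liver = species_formed_in("liver", CAPECITABINE_CASCADE, {"9.34"})
in_tumor = species_formed_in("tumor", CAPECITABINE_CASCADE, {"9.34"} | in_liver)
print("formed in liver:", sorted(in_liver))
print("formed in tumor:", sorted(in_tumor))
```

In this simplified model, 5-fluorouracil 9.33 appears only in the tumor compartment, which is the point of distributing the three activation steps over different enzymes.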

9.4 L-DOPA Therapy: A Clever Prodrug Concept

The neurotransmitters dopamine and acetylcholine fulfill different tasks in
particular parts of the central nervous system. Parkinson’s disease, also called
the “shaking palsy,” is a result of the degeneration of dopamine-producing cells
in the Substantia nigra in the midbrain. The ensuing disproportion between the

[Scheme for Fig. 9.8: capecitabine 9.34 → 9.35 (carboxylesterase, liver) → 9.36 (cytidine deaminase, liver and tumor) → 5-fluorouracil 9.33 (thymidine phosphorylase, tumor)]

Fig. 9.8 The triple-prodrug capecitabine 9.34 is activated to 9.35 by a carboxylesterase in the liver,
then it is transformed into 9.36 by a cytidine deaminase in the liver and the tumor, and a thymidine
phosphorylase releases the cancer therapeutic 5-fluorouracil 9.33 in the tumor.

dopaminergic and cholinergic nerve impulses leads to episodic chronic movement
disorders such as rigidity, tremor, shaking, and an inability to move normally.
Similar side effects are caused by substances that block the dopamine receptors,
for instance, the tricyclic neuroleptics (▶ Sect. 1.6). Intravenous administration of
dopamine 9.37 (Fig. 9.9) does not lead to the desired effect because the substance
cannot penetrate the blood–brain barrier. Because of its purely peripheral effect,
undesirable side effects on the heart and circulation are observed, for example, an
increase in heart rate and blood pressure.
The desired equilibrium in the brain can also be established by suppressing
the cholinergic system. This route is taken by giving anticholinergics, that is,
antagonists of the cholinergic receptors. The administration of the amino acid
L-DOPA 9.38 (Fig. 9.9) is a more elegant possibility for dopamine substitution.
This metabolic precursor of dopamine is an orally bioavailable, CNS-effective
medicine. It is even more polar than dopamine and can neither be absorbed from the
gastrointestinal tract nor can it cross the blood–brain barrier just by passive diffusion.
Because it is an amino acid, it uses an amino acid transporter (▶ Sect. 30.7).
With this, the first goal, CNS activity, is achieved. Oral L-DOPA administration
however, still presents too many side effects in the peripheral nervous system.
Furthermore, L-DOPA is very short acting as dopamine is quickly metabolized in
the brain. Therefore, one must try to prevent the metabolism of the substance while
simultaneously reducing its concentration in the periphery. The combination of

[Structures for Fig. 9.9: dopamine 9.37; L-DOPA 9.38; benserazide 9.39 (racemate); selegiline 9.40]

Fig. 9.9 Because dopamine 9.37 cannot enter the central nervous system, the metabolic precursor
L-DOPA 9.38 is used. To reduce the cardiovascular effects of dopamine, L-DOPA is combined
with a peripherally active decarboxylase inhibitor benserazide 9.39. The administration of
a monoamine oxidase inhibitor, for example, selegiline 9.40, prevents the fast degradation of
dopamine.

L-DOPA with the peripheral decarboxylase inhibitor benserazide 9.39 and the CNS-
effective monoamine oxidase inhibitor selegiline 9.40 (▶ Sect. 27.8) largely solves
this problem. The peripheral side effects are reduced and the CNS effects are
extended (Fig. 9.9). Despite this tour de force of drug design, which has led to
significant therapeutic progress, the metabolically produced dopamine still acts in
too many places. Aside from the residual peripheral side effects, sudden changes
between excessive movement, normal movement, and rigidity, insomnia, agitation,
and hallucinations are all manifestations of the generalized CNS activity.
It has been speculated in conjunction with this observation, whether, in addition
to endogenous and genetic factors, environmental factors, for example, the meta-
bolic transformation of structurally analogous foreign substances, might be respon-
sible for triggering Parkinson’s disease.

9.5 Drug Targeting, Trojan Horses, and Pro-prodrugs

The design of active substances that exert their effect only in, or overwhelmingly in,
one particular organ is called drug targeting. Aside from general principles, for
example an optimal lipophilicity as a prerequisite for crossing the blood–brain
barrier, specific metabolic transformations are used. The Parkinson’s disease drug
L-DOPA, which was introduced in the previous section, is such a prodrug. The
anticonvulsive medicine progabide 9.41 is a double prodrug because both func-
tional groups of the neurotransmitter are masked. After crossing the blood–brain
barrier and release of the amino and carboxyl groups, the actual active compound,
γ-aminobutyric acid (GABA, Fig. 9.10), is formed.

[Scheme for Fig. 9.10: progabide 9.41 crosses the blood–brain barrier and is converted to GABA 9.42]

Fig. 9.10 Because it is a lipophilic neutral molecule, progabide 9.41 can cross the blood–brain
barrier. It is transformed into the neurotransmitter γ-aminobutyric acid (GABA) 9.42 upon
metabolic release of the amino and carboxyl groups.

[Scheme for Fig. 9.11: the neutral, lipophilic drug–dihydropyridine conjugate 9.43 crosses the blood–brain barrier between periphery and brain; metabolic activation gives the charged, polar conjugate 9.44; in the periphery it is eliminated quickly, whereas in the brain metabolic cleavage releases the free drug]

Fig. 9.11 Drug targeting in the brain is accomplished with a drug–dihydropyridine conjugate
9.43. This substance can easily enter the central nervous system. Metabolic oxidation leads to
a permanently charged pyridine 9.44, which cannot cross the blood–brain barrier. The active
compound is released in the brain, and the polar conjugate is quickly excreted from the periphery.

The ability of the blood–brain barrier to exclude polar substances can also be used
as a prodrug concept. For this an active compound with a metabolically labile group
can be coupled to a dihydropyridine. The neutral conjugate 9.43 can cross the blood–
brain barrier. Oxidation leads to a permanently charged compound 9.44, which can
no longer leave the brain. Upon metabolic cleavage the free active compound is
released in situ (Fig. 9.11). If oxidation takes place in the periphery, the highly water-
soluble conjugate is excreted before the actual active substance is released. As nice as
this principle seems, it has not found its way into therapy yet.

[Structures for Fig. 9.12: aciclovir 9.45 (X = H); valaciclovir 9.46 (X = valyl ester group)]

Fig. 9.12 Aciclovir 9.45 is a Trojan horse. An enzymatic phosphorylation of its hydroxyl group
by a viral kinase affords its monophosphorylated form in virus-infected cells only, which is then
transformed to the triphosphate derivative by the cellular kinases. Valaciclovir 9.46 is a
pro-prodrug because it is first transformed to aciclovir by hydrolysis and subsequently activated.

Several analogues of nucleoside bases and nucleosides are Trojan horses. The
anti-herpes medicine aciclovir 9.45 enters the cell as its inactive form. The first
monophosphorylation occurs only in virus-infected cells by a virus-specific thymi-
dine kinase. Next cellular kinases carry out the formation of the triphosphate, the
actual active substance. Because of this aciclovir acts as a targeted antiviral. The
compound is, however, poorly absorbed. The more suitable valaciclovir 9.46
(Fig. 9.12) is understood to be a pro-prodrug. In the organism it is initially
hydrolyzed to aciclovir and then transformed into the active form by the viral
enzyme. Valaciclovir is more lipophilic than aciclovir, but despite this it is more
soluble in water and approximately 55% bioavailable.
Omeprazole 9.47 is the prodrug of an irreversible inhibitor of the H+/K+-
ATPase, the so-called proton pump. Only under strongly acidic conditions, in the
acid-producing cells of the stomach, is it transformed into sulfenic acid 9.48, which
is in equilibrium with cyclic sulfenamide 9.49 (Fig. 9.13). This reacts irreversibly
with an SH group of the enzyme to form a disulfide. Omeprazole is more effective
than the H2 antagonists (▶ Sect. 3.5) because it blocks not only the histamine-
induced acid secretion but all forms of acid secretion.
The different metabolic activity in different tissues can be used to achieve
a selective effect in one specific organ. In principle, adrenaline (▶ Sect. 1.4) as
well as some β-blockers are suitable for the treatment of glaucoma, because they
can normalize elevated intraocular pressure. However, they have substantial unde-
sirable side effects on the heart function and circulation. This can be avoided by the
administration of prodrugs that are metabolized more quickly in the eye, or only in
the eye, for example, a particularly robust ester 9.50 of adrenaline 9.51, or a ketone–
oxime ether 9.52 of timolol 9.53 (Fig. 9.14).
The area of drug targeting has developed into an exciting field in recent years.
Aside from the above-described prodrugs that release active compounds in the
target area, the concept of antibody-coupled drugs has been pursued especially
for the development of novel cancer therapeutics. Another approach is the
coupling of drugs to a cell-specific recognition sequence. The goal of this work
is to trick the membrane transporters of very specific cells so that the drug
conjugate gains entry. Tumor therapeutics that were derived from N-lost were
introduced in Sect. 9.3. These cytotoxic alkylating compounds, however, are very
reactive and should only be activated in the desired target tissue. For this, the

[Scheme for Fig. 9.13: omeprazole 9.47 is converted by H+ to the sulfenic acid 9.48 and the cyclic sulfenamide 9.49, which reacts with ATPase-SH to give the disulfide-linked ATPase adduct]

Fig. 9.13 In the presence of acids, omeprazole 9.47 is rearranged to a sulfenic acid 9.48, which is
in equilibrium with a cyclic sulfenamide 9.49. This reacts irreversibly with an SH group on the
H+/K+-ATPase, the so-called proton pump.

[Structures for Fig. 9.14: dipivefrin 9.50 (R = COC(CH3)3) and adrenaline 9.51 (R = H); timolol oxime ether 9.52 (X = N-OCH3), the ketone (X = O), and timolol 9.53 (X = H, OH)]

Fig. 9.14 The metabolic peculiarities of the eye are exploited for drug targeting in glaucoma
therapy. After penetrating the cornea, the bis-pivaloyl ester of adrenaline 9.51, dipivefrin 9.50, is
hydrolyzed 20 times faster than it is in the periphery. The oxime ether of timolol 9.52 is
metabolized through the ketone to the active form, timolol 9.53, only in the eye.

[Scheme for Fig. 9.15: prodrug 9.54 is cleaved by carboxypeptidase to release the aromatic N-lost derivative 9.55]

Fig. 9.15 The highly reactive cancer therapeutic derivative 9.55 is released
from prodrug 9.54, which is activated by a specific carboxypeptidase. The
carboxypeptidase is bound to an antibody that is targeted to the cancer cell.

following strategies were developed. Aromatic N-lost derivative 9.55 (Fig. 9.15) is
released from prodrug 9.54 by specific peptide cleavage with carboxypeptidase
G2, an enzyme that only exists in bacteria. This enzyme was coupled to
a monoclonal antibody (▶ Sect. 32.3) that specifically recognizes human colorectal
cancer cells. With this, the enzyme that “arms” the cancer drug is brought into
the immediate vicinity of the cancer cell. In the future, this antibody-guided
enzyme-activated prodrug therapy could make cancer therapy more tolerable
and less toxic by releasing the active substance locally and in a distinctly more
targeted way.

9.6 Synopsis

• If it is impossible to achieve sufficient bioavailability, duration of action,
membrane penetration or metabolic stability by chemical modifications,
a prodrug can be developed that corresponds to a non- or poorly active precursor
or derivative that is converted in the organism to its active form.
• After absorption, a drug is transported to the liver and exposed to degrading
enzymes that make it more water-soluble for excretion. The amount of the drug
that survives this first liver pass is referred to as the bioavailable portion and can
be distributed in the organism.
• Esters are often used as prodrugs to mask polar acid groups; they are cleaved by
ubiquitously present esterases.
• A large variety of chemical modifications have been applied to modulate the
physicochemical properties of drug molecules; however, they require special
enzymes in the targeted cells or organs for metabolic activation.
• L-DOPA, an amino acid analogue of dopamine, is delivered to the brain via an
amino acid transporter and rapidly decarboxylated. To avoid side effects in the
periphery, a combination with polar decarboxylase inhibitors is advisable.
• Drug targeting to particular organs or cells exploits specific metabolic trans-
formations only present in these compartments of the body.

• Antibody-coupled drugs are specifically delivered to those compartments or
organs that present the antibody-specific recognition site on the surface of
disease-related cells. To trick membrane transporters, drugs can be coupled to
cell-specific recognition sequences and thus gain entry to the cells.
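
The statement in the synopsis about the first liver pass can be turned into a small calculation. The sketch assumes the common simplification that oral bioavailability is the product of the fraction absorbed and the fraction escaping hepatic extraction, ignoring metabolism in the gut wall. This relationship, the function name, and the numbers are illustrative assumptions and are not taken from the text.

```python
def oral_bioavailability(fraction_absorbed, hepatic_extraction):
    """Bioavailable fraction F = f_abs * (1 - E_H), ignoring gut-wall metabolism."""
    for value in (fraction_absorbed, hepatic_extraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return fraction_absorbed * (1.0 - hepatic_extraction)

# Hypothetical example: 90% of the dose is absorbed and 40% of it is removed
# during the first pass through the liver.
print("F = %.2f" % oral_bioavailability(0.9, 0.4))
```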

Bibliography

General Literature

Balant LP, Doelker E (1995) Metabolic considerations in prodrug design. In: Wolff ME (ed)
Burger’s medicinal chemistry, vol I, 5th edn. Wiley, New York, pp 949–982
Bodor N (1987) Prodrugs and site-specific chemical delivery systems. Annu Rep Med Chem
22:303–313
Bundgaard H (ed) (1985) Design of prodrugs. Elsevier, Amsterdam
Bundgaard H (1991) Design and application of prodrugs. In: Krogsgaard-Larsen P,
Bundgaard H (eds) A textbook of drug design and development. Harwood Academic, Chur,
pp 113–191
Ettmayer P, Amidon GL, Clement B, Testa B (2004) Lessons learned from marketed and investigational
prodrugs. J Med Chem 47:2394–2404
Gibson GG (1994) Introduction to drug metabolism. Blackie, London
Rautio J (2012) Prodrugs and targeted delivery—towards better ADME properties. In:
Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry,
vol 47. Wiley-VCH, Weinheim
Silverman RB (2004) The organic chemistry of drug design and drug action, 2nd edn. Elsevier
Academic, Oxford, Chapter 7, Drug metabolism, and Chapter 8, Prodrugs and drug delivery
systems
Stella VJ, Borchardt RT, Hageman MJ, Oliyai R, Maag H, Tilley JW (eds) (2007) Prodrugs:
challenges and rewards, vol 2. Springer, New York
Testa B (2007) Prodrug and soft drug design. In: Taylor JB, Triggle DJ (eds) Comprehensive
medicinal chemistry II, vol 5. Elsevier, Oxford, pp 1009–1041
Testa B, Mayer JM (2003) Hydrolysis in drug and prodrug metabolism – chemistry, biochemistry
and enzymology. Wiley-VHCA, Zürich

Special Literature

Bodor N, Buchwald P (2005) Ophthalmic drug design based on the metabolic activity of the eye:
soft drugs and chemical delivery systems. AAPS J 7:E820–E833
Brewster ME, Pop E, Bodor N (1993) Chemical approaches to brain-targeting of biologically
active compounds. In: Kozikowski AP (ed) Drug design for neuroscience. Raven, New York
Napier MP, Sharma SK et al (2000) Antibody-directed enzyme prodrug therapy: efficacy and
mechanism of action in colorectal carcinoma. Clin Cancer Res 6:765–772
10 Peptidomimetics

Peptides are open-chain polymers made up of amino acids (Fig. 10.1). The main
chain is constructed of alternating amide groups —CONH— and aliphatic
carbon atoms, which are labeled Cα. The side chains branch from the main chain
at the Cα atom. The amide group is barely flexible (▶ Sect. 14.1). In contrast,
a rotation around the Cα–Cβ bond is possible. The side chains are flexible as well.
Because of this, each amino acid can take on multiple conformations. As a
consequence, peptides are very flexible molecules with many rotatable bonds and
a multitude of possibilities to adopt different spatial configurations. Formally, there
is no difference between the construction of peptides and proteins. Nonetheless,
oligomers of amino acids up to a size of 30–50 monomer building blocks are called
peptides, and the term protein is preferred for any members of this substance class
that are above this limit.
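
The conformational freedom described above is usually expressed through torsion angles, each defined by four consecutive atoms. The following sketch shows one common way to compute such an angle from Cartesian coordinates. The function names and the test coordinates are made up for illustration and are not real peptide coordinates. For the backbone angle φ the four atoms would be C(i−1), N, Cα and C; for ψ they would be N, Cα, C and N(i+1).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p1, p2, p3, p4):
    """Signed torsion angle in degrees about the p2-p3 bond."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_length = math.sqrt(_dot(b2, b2))
    # The atan2 form keeps the sign and stays stable near 0 and 180 degrees.
    y = _dot(_cross(n1, n2), b2) / b2_length
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: four atoms in one plane, outer atoms on opposite sides (trans).
print(abs(torsion_angle((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))))
```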

10.1 The Therapeutic Relevance of Peptides

Peptides are responsible for numerous biological functions in humans as enzyme
substrates and hormones. A few important examples are summarized in Table 10.1.
Accordingly, peptides are interesting for therapeutic purposes, and in fact, several
important drugs are peptides (Fig. 10.2).
The use of peptides as drugs is significantly limited by several factors:
• Peptides are poorly absorbed after oral administration; this is mostly because of
their high molecular weight and pronounced polarity.
• Peptides are easily degraded by proteases in the gastrointestinal tract and are
therefore metabolically unstable.
• The body is able to very quickly excrete peptides via the liver and kidneys.
Because peptides accomplish so many biological functions in our bodies,
there is tremendous interest in finding active substances that do not have the
above-mentioned detrimental properties, but that bind to the same receptors
analogously to peptides or block enzymes that transform peptide substrates.
A stepwise approach is taken in the search for such compounds. Peptide structures

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_10, 189


# Springer-Verlag Berlin Heidelberg 2013

[Figure 10.1: structural formula of Leu-enkephalin (Tyr-Gly-Gly-Phe-Leu) not reproduced; the torsion angles ω, φ, ψ, and χ and the atoms Cα, Cβ, and Cγ are marked on the phenylalanine residue.]

Fig. 10.1 The pentapeptide Leu-enkephalin as an example of a peptide structure. The left side
with the free NH2 group is the N terminus, and the other is the C terminus. Each amino acid
contributes three atoms to the peptide chain. Nature almost exclusively uses the 20 natural
(proteinogenic) L-amino acids for the construction of peptides (see Appendix 1). Depending on
the functional groups in the side chains, the distinction is made between hydrophilic acidic and
basic amino acids and those with hydrophobic aliphatic and aromatic side chains. The amino acids
are abbreviated with three-letter codes. A one-letter code is also used. The definition of the torsion
angles ω, φ, ψ, and χ is shown using the amino acid phenylalanine as an example. The angle ω is
practically always close to 180°. The spatial course of the peptide backbone is determined by the φ
and ψ angles (see ▶ Sect. 14.2). The first atom in the side chain is called the Cβ atom, and the next
is given the index γ.

Table 10.1 Several important peptide hormones.

Peptide                           Function
Leu-Enkephalin, Met-Enkephalin    Opiate receptor ligands, analgesics
Fibrinogen                        Platelet aggregation
Angiotensin II                    Blood pressure increase
Endothelin                        Blood pressure increase (among other actions)
Neuropeptide Y                    Blood pressure increase (among other actions)
Substance P                       Bronchoconstriction and pain mediation

Oxytocin: H-Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-NH2
Ciclosporin: [structural formula not reproduced]
Leuprolide: pGlu-His-Trp-Ser-Tyr-D-Leu-Leu-Arg-Pro-NHEt

Fig. 10.2 Peptides as drugs. Oxytocin is used to induce and strengthen contractions during labor.
The immunosuppressive ciclosporin prevents organ rejection after transplantation. Leuprolide
(pGlu = pyroglutamate) is an analogue of LHRH (luteinizing hormone releasing hormone), one
of the hypothalamic hormones that, via LH (luteinizing hormone), controls the synthesis of male
and female sexual hormones. Leuprolide is used to treat advanced-stage prostate cancer.

are replaced with isosteric building blocks so that the molecular recognition
properties of the peptide remain, but the undesirable characteristics are reduced.
Such peptidomimetics should have the following qualities:
• Few or no cleavable amide bonds to improve metabolic stability.
• Reduced molecular weight to improve oral bioavailability.
• The same spatial orientation of groups responsible for strong binding to the
receptor or enzyme as in the peptide.
Bacteria are the true masters of constructing peptide structures that frequently
achieve the desired metabolic stability. They incorporate amino acids that do not
belong to the typical 20 residues that are usually used for the construction of
proteins. Stereochemically inverted amino acids are also employed, and many of
these structures have a cyclic architecture. They have even evolved a dedicated
synthesis machinery for this: nonribosomal peptide synthesis (▶ Sect. 32.6). This
system of modular, coupled enzymes works like an assembly line. Depending on
the desired product, different enzymatic functional units are lined up, one after the
other, to successively assemble the amino acids, cyclizing the product in the final
step. The exchange of an enzymatic synthesis unit causes other amino acids to be
incorporated into the otherwise unchanged peptide. Even ester bonds can be
constructed with a very similar multienzyme complex. Many lead structures all the
way to complete drugs can be derived from these originally bacterial peptides, such
as ciclosporin in Fig. 10.2, which is a very important immunosuppressant. A large
number of macrolide antibiotics (▶ Sect. 32.6) are also synthesized in this way.
Recently a so-called chemoenzymatic synthetic strategy has been developed for
the construction of such macrolides. As discussed in ▶ Sect. 11.6, linear oligopeptides
can easily be synthesized by using the Merrifield synthesis. Non-natural amino acids
with L and D configurations can also be used to generate high combinatorial diversity.
It is very difficult to cyclize these linear oligopeptides to the desired macrocycle by
using chemical-synthetic methods. Here the nonribosomal peptide synthetic machin-
ery is of service. The synthetically prepared peptides are then funneled into the
enzymatic process chain and the cyclization domain from the bacteria catalyzes
the ring closure of the peptide: a perfect symbiosis between synthetic chemistry
and enzyme biology!

10.2 Designing Peptidomimetics

At the beginning of the 1980s, there was only one generally accepted example of
a low-molecular-weight active substance that takes over the function of an
endogenous peptide: the opiates. It is assumed that morphine 10.1 is a mimetic of the
endogenous peptide β-endorphin 10.2 (Fig. 10.3). A comparison of both structures
makes it immediately clear that morphine cannot possibly simulate all of the
functional groups of the peptide. Obviously not all are necessary for the biological
activity. This underscores the suspicion that other peptides also bind to receptors
with only a few functional groups. If this hypothesis is true, it should be possible to
identify the essential functional groups and find a small organic molecule that has
the necessary functional groups in the correct relative orientation.
The starting point for the design of peptidomimetics is the identification of the
biologically active peptide, the function of which is to be imitated. In the first step,
single amino acids are excluded to determine whether a portion of the peptide
retains sufficient activity. Next the importance of the individual side chains is
investigated. In a so-called alanine scan (Sect. 10.7), each amino acid is succes-
sively replaced with alanine. A severe loss of activity is an indication that the
removed side chain is important. Up to this point, only peptides made up of the natural
20 amino acids have been investigated. In the next step structural elements are
introduced that do not occur in the 20 proteinogenic amino acids. In principle, the
following are possibilities for peptide structure modification:
• The use of D- instead of L-amino acids.
• Modifications of the side chain of amino acids.
• Changes on the peptide main chain.
• Cyclization to stabilize the conformation.
• The use of templates that enforce a particular secondary structure, or that allow
the attachment of side chains in a defined spatial orientation.

[Figure 10.3: structural formula of morphine 10.1 not reproduced. β-Endorphin 10.2: Tyr-Gly-Gly-Phe-Met-Thr-Ser-Glu-Lys-Ser-Gln-Thr-Pro-Leu-Val-Thr-Leu-Phe-Lys-Asn-Ala-Ile-Ile-Lys-Asn-Ala-Tyr-Lys-Lys-Gly-Glu]

Fig. 10.3 Morphine 10.1 is a peptidomimetic for the endogenous peptide β-endorphin 10.2 and
the enkephalins (▶ Sect. 1.4). It binds as an agonist to the opiate receptor.

10.3 First Step to Variation: Modifying Side Chains

An improvement in a peptide’s binding properties can often be achieved by using
other side chains. For instance, in Fig. 10.4 a few analogues of the amino acid
phenylalanine are shown that could be used as possible replacements. An increase
in the binding affinity can be achieved if nonproteinogenic amino acids fill the
binding pocket more completely. Rigid analogues lead to improved binding if the
biologically active conformation, the one that is adopted in or at the receptor site,
is immobilized.
The introduction of nonproteinogenic amino acids can increase the metabolic
stability. The hydroxylation of aromatic side chains can be suppressed by using
a substituent, for example fluorine or a methoxy group, in the para position.
Stability to cleavage by the digestive enzyme chymotrypsin can be improved by
adding substituents to the Cβ atom because the modified side chain no longer fits
into the active site of this protease. A peptide’s proteolytic stability can also be
improved by exchanging L- for D-amino acids. As described above, bacteria have
already recognized this trick. Distributing D-amino acids randomly in the peptide
can furnish active substances with astonishing metabolic stability.

[Figure 10.4: structural formulas of phenylalanine (with the α and β positions marked) and a series of analogues not reproduced.]

Fig. 10.4 Sterically demanding, conformationally fixed, or metabolically stable analogues of the
amino acid phenylalanine; the structural enhancements are indicated in red.

10.4 A More Courageous Step: Modifying the Main Chain

An important step in the design of peptidomimetics is the replacement of amide
bonds in the main chain. A few commonly used groups are summarized in
Fig. 10.5. It can be difficult or even impossible to find replacements for amide
groups, which make hydrogen bonds to the protein with the C═O as well as NH
groups, that do not decidedly reduce the binding affinity. If the amides only bridge
functional groups to one another and do not form hydrogen bonds to the protein,

[Figure 10.5: structural formulas not reproduced. Shown are the amide bond and the replacement groups N-methyl amide, ketomethylene, hydroxyethylene, (E)-ethylene, carba, ether, reduced amide, retro-inverso amide, and phosphonamides, phosphonates, and phosphinates (X = –NH–, –O–, –CH2–).]

Fig. 10.5 Different functional groups that can serve as a replacement for amide bonds in
peptidomimetics.

then a large palette of different replacement groups is available. Substitution at the
amide nitrogen atom leads to metabolic stabilization because proteases can hardly
cleave N-methylated amide bonds. If the N-methylation of a main-chain amide
group leads to a loss in affinity, several different explanations come into question.
One is that the N-methylated compound can no longer form hydrogen bonds, and an
essential H-bond is lost in which the NH group was involved. Further, an
undesired conformational change might have occurred as a result of the additional
methyl group, or the methyl group might be sterically blocking the binding onto
the protein. On the other hand, an improvement in binding as a result of N-methylation
indicates that the biologically active conformation is stabilized. At room temperature,
an amide bond is practically exclusively in the trans geometry. Therefore it can also be
substituted with an ester bond that takes on the same geometry. In doing so, however,
the hydrogen-bond-donating properties of the amide are lost.
An N-methyl substitution improves the stability of the 180°-rotated conformation
of the amide. In the case of proline, the only proteinogenic amino acid with an
N-alkyl substitution, both the cis and trans amide configuration can be found.
The exchange for a 1,5-disubstituted tetrazole can replace the cis orientation of
a proline. In addition, trans-configured double bonds imitate the geometry of an
amide bond well. The polar characteristics, however, are lost. To a certain extent,
this can be compensated if the double bond is substituted with fluorine. The
reduction of an amide or an isosteric ester bond means the loss of the carbonyl
group and leads to increased flexibility. If the carbonyl group is exchanged
for an –S═O, –SO2, or –PO2 group, the H-bond-accepting characteristics
are amplified; however, the geometry also changes. The exchange
of an amide for a thioamide results in a weakening of the H-bond-accepting
properties and can serve as a test of the possible importance of H-bonds to carbonyl
groups in the peptide backbone. Nonetheless, a measure of caution is warranted
because the desolvation of a thiocarbonyl group is less difficult than that of
a carbonyl group. This overlaps with the observed affinity and can mask the
effect of the loss of the H-bond. The retro-inverso exchange of an amide bond
can lead to marked improvement in the proteolytic stability without losing the
binding qualities (▶ Sect. 5.5).
An entirely different concept is the incorporation of β-amino acids
(▶ Sect. 31.7). In contrast to the proteinogenic α-amino acids, these residues have
four chain members per monomer unit. The amide bonds are separated by two
aliphatic carbon atoms. Peptides that are made from these amino acids also show
secondary structural characteristics (Sects. 10.5 and ▶ 14.2). They have already
successfully been incorporated into naturally occurring peptides as mimetics and
can simulate peptide–protein interactions. Because of the altered sequence of amide
bonds, they are stable to proteolytic degradation.
If the cleavable bond of a protease substrate is replaced with an isosteric, non-
cleavable group, a substrate can be converted to an inhibitor (▶ Sect. 6.6). If the
newly introduced group forms particularly favorable interactions with the active
site of an enzyme, an exceedingly potent enzyme inhibitor can result. An example is
found in the ketomethylene group in serine and cysteine protease inhibitors as a
possible replacement for the amide bond that is destined for cleavage (▶ Chap. 23,
“Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”). The
hydroxyethylene group is especially suitable for aspartic protease inhibitors
(▶ Chap. 24, “Aspartic Protease Inhibitors”). Phosphonamides, phosphonates,
and phosphinates are often strong inhibitors of metalloproteases (▶ Chap. 25,
“Inhibitors of Hydrolyzing Metalloenzymes”).

10.5 Rigidifying the Backbone by Fixing Conformations

An important aspect of the design of peptidomimetics is the peptide conformation.
Peptides are flexible molecules and can take on different conformations. It is known,
however, that certain conformations are preferably adopted in proteins and in some
peptides. Among these are the two most important secondary structural elements:
the α helix and the β sheet (▶ Sect. 14.2). Furthermore, there are loops and turns at
the ends of these secondary structural elements that also adopt preferred patterns,
particularly the β turn (Fig. 10.6).
A β turn is formed when a hydrogen bond exists between the carbonyl group of
the amino acid i and the NH group of the amino acid i + 3. It is obvious that such
hydrogen bonds can only form for certain combinations of the torsion angles φ and
ψ, which are determined by the amino acids in the i + 1 and i + 2 positions
(▶ Sect. 14.2).
β turns are especially interesting because many peptides bind to proteins in a
β-turn conformation. Let us assume that the main chain of the peptide only serves
to position the side chains so that an optimal receptor interaction can occur. Then it
should be possible to replace the peptide chain with an entirely different scaffold,
upon which functional groups are attached that adopt the same spatial orientation as
the amino acid side chains.
If a β-turn-configured peptide binds to a receptor, then a rigid analogue that
“freezes” the β-turn conformation should lead to improved binding. The simplest
way to fix a β turn is the incorporation of the necessary sequence in a small cyclic
peptide. It is known from experimental structure determination that cyclic penta-
and hexapeptides almost always contain a β turn. The conformations of these
peptides were investigated at length in the research group of Horst Kessler at the
Universities of Frankfurt and Munich. It could be shown that the position of a β turn

[Figure 10.6: schematic of a β turn with the residues i to i + 3, the side chains Ri to Ri+3, and the torsion angles φ(i+1), ψ(i+1), φ(i+2), and ψ(i+2); not reproduced.]

Fig. 10.6 A β turn is a peptide conformation in which a hydrogen bond is formed between the
amino acids i and i + 3. Particular ranges for the values of the torsion angles φ(i+1), ψ(i+1),
φ(i+2), and ψ(i+2) are characteristic for the β turn.

[Figure 10.7: structural formulas not reproduced.]

Fig. 10.7 Typical β-turn mimics. The amino acids are added onto the template at the colored
positions.

[Figure 10.8: structural formulas not reproduced.]

Fig. 10.8 The illustrated rings replace one or two amino acids and force a particular
conformation.

in a sequence can be controlled. Proline as well as D-amino acids prefer the i + 1
position in these loops. The introduction of D-amino acids favors the formation of
a β turn over other possible conformations.
A β turn can also be forced by a non-peptide template. Numerous β-turn
mimetics have been proposed for this (Fig. 10.7). Some of the structures serve
as a template on which two peptide chains can be forced into an antiparallel
orientation. However, substitution by introduction of the R2 and R3 side chains is
synthetically difficult. Benzodiazepines are interesting scaffolds onto which all
four side chains R1–R4 can be coupled. Other peptide conformations can also be
fixed by the introduction of rigid groups. A few examples of conformation-
stabilizing ring systems are displayed in Fig. 10.8.
An especially convincing example of a scaffold mimic is the design of
a thyrotropin-releasing hormone (TRH) mimetic by P. N. Olson and co-workers.

[Figure 10.9: structural formulas of TRH 10.3, the derived pharmacophore, and the mimetic 10.4 not reproduced.]

Fig. 10.9 By starting with the structure of tripeptide TRH 10.3 and a hypothesis for the functional
groups that are essential for binding, the non-peptidic molecule 10.4 was designed, which also
binds to the TRH receptor.

TRH is the tripeptide pGlu–His–Pro–NH2 10.3. The approach is shown in Fig. 10.9.
After deducing a pharmacophore hypothesis, a rigid scaffold molecule was sought
upon which the side chains could be appended in the correct relative orientation.
Cyclohexane was chosen as a scaffold. Compound 10.4 is a potent TRH receptor
ligand. The substance acts as an agonist and elicits the same effects as TRH. An
improvement in cognitive function could be seen in animal experiments after the
administration of 10.4.

10.6 Peptidomimetics to Interfere with Protein–Protein Interactions

Proteins communicate with one another and transmit information and signals in that
they form mutual complexes through commonly shared surfaces. The area of the
shared contact surface is usually larger than a few thousand square Ångströms (Å²).
This is a large value when compared to the surface that a small organic molecule of
typical drug size occupies upon binding. Furthermore, the contact area between
two proteins is, as a general rule, not very jagged. It hardly resembles the deep
binding pockets in enzymes that can host small ligands. Nevertheless it would open
entirely new perspectives for drug therapy if such protein–protein contact surfaces
could be blocked with low-molecular-weight compounds. At first glance, this task
seems almost impossible. How can a small molecule bind to a flat, barely structured

Fig. 10.10 The NMR spectroscopic structure of the BCL-XL protein with the α-helical,
16-residue peptide fragment from the BAK protein (orange). The peptide binds in a deep groove
with the amino acids Ile85, Ile81, Leu78, Val74 (from left to right, side chains are in light blue).
The surface of the BCL-XL protein is shown in white. The hydrophobic amino acids of the peptide
all protrude into the cleft, and their contact surface is indicated by the light-blue net.

protein surface with an interaction that is strong enough not to be “washed away”
when the protein–protein contact forms? Furthermore, there is the problem that
amino acid residues on the convex surface of a protein have in general much more
space to flexibly adapt their conformation. A statistical analysis of the amino acid
composition across the contact surfaces in protein complexes showed a preference
for aromatic residues, aspartate, arginine and the aliphatic residues proline and
isoleucine. The selective exchange of amino acids in the contact surface also
showed that there are a few protruding residues that dominate the interaction
(so-called “hot spots,” ▶ Sect. 17.10). The search for possible binding sites of a
small molecule that can compete with the formation of the protein–protein interface
starts with a detailed analysis of the complementary geometry to the contacting
surfaces. Are there clustered areas with charged residues or does a structural
element such as a β turn or a helix penetrate a little more deeply into the opposite
contact surface? Next, the peptide sequence that corresponds to the contact surface
is synthesized. These can be portions that preferably adopt a helical structure or that
can be fixed in a turn pattern, for example as a cyclopeptide. If an active peptide is found, it
must be structurally characterized in complex with the opposite contact surface.
The complex of the BCL-XL (B-cell lymphoma) protein with a 16-residue
peptide that was cut from the BAK protein is shown in Fig. 10.10. BCL-XL belongs
to the proteins that prevent programmed cell death (apoptosis). Its function is
regulated by binding to pro- and antiapoptotic factors such as BAK. Inhibitors of
this contact formation might therefore deliver potential drugs for an anticancer
therapy. The binding of the helical peptide takes place in a stretched-out groove.
Small molecules have been discovered that fill this crevice (Fig. 10.11). The group
of Andrew Hamilton at Yale University has been searching for a basic scaffold that
can imitate the characteristics of a helix and simultaneously hold the side chains on
one side. Terphenyl derivatives 10.5–10.7 were found that can arrange the side
chains in a staggered conformation analogous to a helix. An alanine scan along the
BAK peptide showed that four hydrophobic residues (Val74, Leu78, Ile81, and
Ile85) are essential for binding. In addition, Asp83 forms a salt bridge to BCL-XL.
The terphenyl scaffold was therefore furnished with an acidic group at the end and
decorated with alkyl and aryl residues in the ortho positions. Compound 10.6 binds
to the BCL-XL protein with an affinity of 114 nM.
A different approach was taken at Abbott. Small molecules that interact
with the BCL protein were sought by NMR spectroscopy (▶ Sect. 7.8). The
millimolar inhibitors para-fluorobiphenylcarboxylic acid 10.8 (Fig. 10.11) and
1-hydroxytetraline 10.9 were discovered. Both bind to distinct but neighboring
positions: 10.8 replaces Asp83 and Leu78 of the binding domain of the BAK peptide,
and 10.9 occupies the Ile85 position. From the two discovered fragments, the scientists
at Abbott developed compound 10.10, which had two-digit nanomolar affinity for the
protein. Further optimization led to 10.11 (ABT-737), a highly potent antagonist that blocks the
entire family of antiapoptotic BCL-2 proteins. The synergistic effect of ABT-737
together with radiation and chemotherapy was demonstrated in animal experiments.
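
The gain achieved by linking the two fragments can be put on an energy scale. The following Python sketch is an illustration that is not part of the original text. It converts the dissociation constants given in Fig. 10.11 into standard binding free energies with the general thermodynamic relation ΔG° = RT ln Kd (standard state 1 M). The temperature of 298.15 K and the treatment of the Ki value of 10.10 as a dissociation constant are assumptions made for this example, as are all names in the code.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant in M."""
    return R * temperature * math.log(kd_molar)


# Values taken from Fig. 10.11; Ki of 10.10 is treated like a Kd (assumption).
compounds = {
    "10.8 (fragment)": 0.3e-3,
    "10.9 (fragment)": 4.3e-3,
    "10.10 (linked)": 36e-9,
}

for name, kd in compounds.items():
    print(f"{name:18s} Kd = {kd:.1e} M  dG = {binding_free_energy(kd):6.1f} kJ/mol")

fragment_sum = (binding_free_energy(compounds["10.8 (fragment)"])
                + binding_free_energy(compounds["10.9 (fragment)"]))
print(f"Sum of the two fragments: {fragment_sum:.1f} kJ/mol")
```

Under these assumptions, the linked compound binds more strongly than the simple sum of the two fragment contributions would suggest. Such a comparison is only a rough guide, because 10.10 is not a direct covalent combination of 10.8 and 10.9 alone.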
An analogous case was studied with the MDM2 protein at Roche. MDM2 is
overexpressed in many tumors. It binds to the tumor-suppressor protein p53, which
protects cells from converting to a malignant state. p53 is therefore the protein that is
most often inactivated during carcinogenesis. Inhibition of complex formation
between the overexpressed MDM2 protein and p53 could thus represent an
approach to a possible cancer therapy. Here too, an α-helical p53 peptide stretch
binds to a hydrophobic groove on the MDM2 protein. A cis-imidazoline with an
affinity of 100–300 nM was found in screening. The co-crystal structure was
determined with 10.12 (Fig. 10.11). The imidazoline scaffold imitates one side
of an α helix of the peptide from the p53 protein. The two p-bromophenyl rings
replace a Trp and a Leu. The ethyl ether group on the third aromatic ring is oriented into
the pocket that is filled with a phenylalanine in the peptide. The MDM2 protein is
blocked through this competitive binding, and the level of free p53 increases.
Through this, the p53 pathway in cancer cells is activated, and the cell cycle
comes to a complete stop. The cell may go into programmed cell death.
Tumor growth inhibition has already been demonstrated in animal models.
Another large class of proteins that is controlled by contacts with other proteins
is the integrins. Numerous low-molecular-weight inhibitors have been discovered
for this class. An example for the successful design of antagonists by starting from
cyclic peptides is presented in ▶ Sect. 31.2. Many G protein-coupled receptors
(▶ Sect. 29.1) are controlled by endogenous peptides or proteins. For this, the
peptide or protein binds to the receptor. The replacement of the peptide sequences
with an organic molecule that imitates the binding of the natural ligand has also
been attempted. An example of the design of such an active compound is given in
▶ Sects. 29.5 and 29.6. Although successful, the design concept that was followed

[Figure 10.11: structural formulas not reproduced. Binding data given in the figure:
10.5, Kd = 1.89 μM
10.6, Kd = 114 nM
10.7, Kd = 2.70 μM
10.8, Kd = 0.3 mM
10.9, Kd = 4.3 mM
10.10, Ki = 36 nM
10.11 (ABT-737), Ki = 1 nM
10.12]

Fig. 10.11 Different inhibitors of protein–protein contacts that imitate the α-helical structural
building blocks in the contact surface. The terphenyl derivatives 10.5–10.7 bind to the BCL-XL
protein in a pronounced crevice and block the binding site of a helix. The small fragments 10.8 and
10.9, which led to the development of inhibitors 10.10 and 10.11, were discovered in the same area
in an NMR spectroscopic screening. Compound 10.12 is a different helix mimetic that prevents the
interaction between the MDM2 and p53 proteins.

was wrong: the active peptide and the derived synthetic mimic do not bind in
an overlapping binding region of the receptor.

10.7 Tracing Selective NK Receptor Antagonists by Ala Scan

Tachykinins are neuropeptides that all contain the same lipophilic C terminus:
–Phe–X–Gly–Leu–Met–NH2. A well-investigated representative of the tachykinins
is substance P, Arg–Pro–Lys–Pro–Gln–Gln–Phe–Phe–Gly–Leu–Met–NH2 (10.13,
Table 10.2). Tachykinins bind to at least three different tachykinin receptors, the
NK1, NK2, and NK3 receptors. All three belong to the class of G protein-coupled
receptors (▶ Sect. 29.1). They mediate a variety of biological effects, for example,
bronchoconstriction or pain transmission. Consequently a receptor antagonist could
be helpful for the treatment of asthma as well as to fight pain.
The study that was carried out on the development of an NK2 receptor antagonist
at Parke–Davis in Cambridge is a classic example of conversion of a peptide to
a peptidomimetic (Table 10.2 and Fig. 10.12). A compound was sought that binds
to the same receptor as substance P. The starting point of the work was the hexapeptide
Leu–Gln–Met–Trp–Phe–Gly–NH2 (10.14), which was known from the literature to bind to
the NK2 receptor with an affinity of 11.7 nM. In the first step each amino acid was
systematically exchanged for alanine (10.15–10.20). In a few cases the

Table 10.2 The rational design of NK2 receptor ligands.

Step                                                    No.    Structure                                           Ki (nM)
Substance P                                             10.13  Arg-Pro-Lys-Pro-Gln-Gln-Phe-Phe-Gly-Leu-Met-NH2     295
Minimal fragment                                        10.14  Leu-Gln-Met-Trp-Phe-Gly-NH2                         11.7
Ala scan                                                10.15  Ala-Gln-Met-Trp-Phe-Gly-NH2                         40
                                                        10.16  Leu-Ala-Met-Trp-Phe-Gly-NH2                         138
                                                        10.17  Leu-Gln-Ala-Trp-Phe-Gly-NH2                         156
                                                        10.18  Leu-Gln-Met-Ala-Phe-Gly-NH2                         >10,000
                                                        10.19  Leu-Gln-Met-Trp-Ala-Gly-NH2                         8,300
                                                        10.20  Leu-Gln-Met-Trp-Phe-Ala-NH2                         28
                                                        10.21  Leu-Gln-Met-Trp-Phe-NH2                             200
Dipeptide                                               10.22  Z-Trp-Phe-NH2                                       2,700
Immobilization of the biologically active conformation  10.23  Z-Trp-(R,S)-(α-Me)Phe-NH2                           327
N-Terminal optimization                                 10.24  (2,3-di-OCH3)C6H3CH2OCO-Trp-(R,S)-(α-Me)Phe-NH2     37.6
Stereochemical optimization                             10.25  (2,3-di-OCH3)C6H3CH2OCO-Trp-(R)-(α-Me)Phe-NH2       10,000
                                                        10.26  (2,3-di-OCH3)C6H3CH2OCO-Trp-(S)-(α-Me)Phe-NH2       17.2
Addition of amino acid                                  10.27  (2,3-di-OCH3)C6H3CH2OCO-Trp-(S)-(α-Me)Phe-Gly-NH2   1.4

[Figure 10.12: structural formulas not reproduced. Substituents and affinities given in the figure:
10.22, R = H, Ki = 2700 nM
10.23, R = CH3, Ki = 327 nM
10.26, R = H, Ki = 17.2 nM
10.27, R = CH2CONH2, Ki = 1.4 nM]

Fig. 10.12 Important intermediates on the way to the NK2 receptor antagonist 10.27.

replacement with alanine resulted in only a weak decrease in the binding affinity.
As an example, the N-terminal leucine could be replaced with an alanine (10.15).
The conclusion was that the Leu side chain can only be of secondary importance for
receptor binding. The compounds in which tryptophan or phenylalanine was
replaced with alanine, however, showed very little affinity for the NK2 receptor.
This was the “smoking gun” showing that these two amino acids are essential for
binding. The removal of the C-terminal amino acid glycine (10.21) decreased the
affinity by a factor of 7. Obviously this amino acid also has some importance
for receptor binding. The testing of several N-terminal protected dipeptides led
to Z–Trp–Phe–NH2 (10.22, Ki = 2700 nM) as a lead structure for further work.
With this, the first stage of the project was accomplished. As a dipeptide, 10.22
represented an interesting lead structure for further work.
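
The logic of the alanine scan can be retraced with a few lines of code. The following Python sketch is an illustration that is not part of the original text. It generates the alanine variants of the hexapeptide 10.14 and relates the Ki values of 10.15–10.20 from Table 10.2 to that of the parent peptide. The function names and the 100-fold threshold used to flag a residue as essential are assumptions for this example. For 10.18 only a lower limit (>10,000 nM) is reported, so the corresponding factor is a lower bound.

```python
def alanine_scan(residues):
    """Return one variant per position with that residue replaced by Ala."""
    variants = []
    for i, res in enumerate(residues):
        if res == "Ala":
            continue  # replacing Ala with Ala gives no information
        variants.append(residues[:i] + ["Ala"] + residues[i + 1:])
    return variants


def fold_losses(ki_parent, ki_variants):
    """Factor by which Ki rises (affinity drops) relative to the parent."""
    return [ki / ki_parent for ki in ki_variants]


parent = ["Leu", "Gln", "Met", "Trp", "Phe", "Gly"]      # 10.14
KI_PARENT = 11.7                                          # nM, Table 10.2
KI_VARIANTS = [40.0, 138.0, 156.0, 10000.0, 8300.0, 28.0]  # 10.15-10.20, nM
FOLD_THRESHOLD = 100.0  # arbitrary cut-off chosen for this illustration

variants = alanine_scan(parent)
losses = fold_losses(KI_PARENT, KI_VARIANTS)

for res, variant, loss in zip(parent, variants, losses):
    print(f"{res} -> Ala  {'-'.join(variant)}-NH2  loss of affinity: {loss:6.1f}-fold")

essential = [res for res, loss in zip(parent, losses) if loss >= FOLD_THRESHOLD]
print("Residues flagged as essential:", essential)
```

With the values of Table 10.2, only the tryptophan and phenylalanine positions exceed the chosen threshold, in agreement with the conclusion drawn in the text.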
In the next stage, additional methyl groups were introduced at different
positions of the molecule. This limited the number of possible conformations.
A decrease in binding affinity was observed for many of the investigated compounds
with conformational restriction. A methyl group on the Cα atom of
phenylalanine, however, increased the binding affinity by a factor of 8 (10.23, Ki = 327 nM).
A possible explanation for this finding is that the conformation that is adopted in the
receptor is stabilized by the additional methyl group. Then the N-terminal part
of the molecule was varied. The replacement of the terminal phenyl ring with
a 2,3-dimethoxyphenyl group further increased the binding affinity by a factor of 10
(10.24, Ki = 37.6 nM). This value refers to the compound containing racemic
α-methylphenylalanine. The enantiomerically pure compound 10.26 with this building block in the

[Figure 10.13: structural formulas not reproduced. Substituents and activities given in the figure:
10.28, R = Et, X = H, IC50 = 3800 nM
10.29, R = H, X = H, IC50 > 10,000 nM
10.30, R = H, X = 3,5-di-CH3, IC50 = 1533 nM
10.31, R = Ac, X = 3,5-di-CH3, IC50 = 67 nM
10.32, R = Ac, X = 3,5-di-CF3, IC50 = 1.6 nM
10.33, IC50 = 3 nM
10.34, aprepitant]

Fig. 10.13 The optimization of lead structure 10.28, which was found by screening, to selective
NK1 receptor antagonists 10.32 and 10.33. In contrast to the metabolically labile benzyl esters
10.28–10.32, ketone 10.33 is also active in animal experiments. The first NK1 receptor antagonist
aprepitant 10.34 was successfully brought to the market by MSD for the prevention of acute
emesis.

S configuration binds with a Ki of 17.2 nM. The reintroduction of the C-terminal
glycine finally led to the highly potent compound 10.27 (Ki = 1.4 nM).
Independent of the work at Parke–Davis, lead structure 10.28 was optimized to
the NK1-specific receptor antagonists 10.32 and 10.33 at Merck Sharp & Dohme
(MSD). Although 10.28–10.32 were only effective in vitro, 10.33 is also active
in vivo because of its higher metabolic stability (Fig. 10.13). MSD was finally
successful with the structurally related aprepitant 10.34. The compound was intro-
duced as a medicine to prevent acute emesis (vomiting) during highly nausea-
inducing chemotherapy.

10.8 CAVEAT: Idea Generator for the Design of Peptidomimetics

In the previous sections it was often highlighted that the side chains of the amino
acids are responsible for the binding to receptors. Usually the main chain merely
plays the role of a scaffold that serves to bring the side chains into the necessary
spatial alignment for binding. As such, a rigid, non-peptidic scaffold onto which the
side chains can be attached in the same spatial orientation should be suitable to
design molecules with properties similar to those of peptides. This idea was embedded in
a computer program in the group of Paul Bartlett at the University of California in
Berkeley. The program CAVEAT allows the search for rigid molecules that
imitate a particular segment of a peptide scaffold. For this, the bonds on the peptide
backbone are described with vectors (Fig. 10.14). The 3D structure of the peptide
for the peptidomimetic being sought must be known as a prerequisite. The orientation
of the side chains is determined by the Cα–Cβ bond vectors. The relative
orientation of, for instance, three amino acid side chains is given by the position
of the relevant Cα–Cβ bond vectors. With this spatial pattern of vectors, a 3D
database of molecular scaffolds is searched for scaffolds that contain three
substitutable bonds oriented analogously to the three Cα–Cβ vectors. The result is a list of rigid,
usually cyclic molecular scaffolds, the free positions of which can be coupled to the
amino acid side chains.

[Figure 10.14: peptide lead structure with three Cα–Cβ bond vectors labeled A, B, and C]

Fig. 10.14 The principles of a 3D search for scaffold mimics with the CAVEAT program. First,
the relative orientation of the biologically active side chains in the peptide lead structure is defined
by the Cα–Cβ vectors. In this example the three essential amino acids Trp, Arg, and Tyr are taken.
The three vectors, A, B, and C are the essential information used to search the 3D database for rigid
scaffold structures that bear substitutable bonds in the same relative orientation. A list of cyclic
structures that represent possible templates for peptidomimetics is the result.
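
The vector search described above can be illustrated with a minimal sketch. It is not the CAVEAT algorithm itself: it assumes, for simplicity, that the relative orientation of two bond vectors is captured by the distance between their base atoms and the angle between their directions, and it simply tries every assignment of scaffold bonds to the query vectors. All names, coordinates, and tolerances are hypothetical.

```python
import math
from itertools import permutations

# A bond vector is a pair (base, tip) of 3D points, e.g. (C-alpha, C-beta).

def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def _norm(v):
    return math.sqrt(sum(a * a for a in v))

def pair_descriptor(v1, v2):
    """Base-to-base distance and angle (degrees) between two bond vectors."""
    (base1, tip1), (base2, tip2) = v1, v2
    d1, d2 = _sub(tip1, base1), _sub(tip2, base2)
    cos_angle = sum(a * b for a, b in zip(d1, d2)) / (_norm(d1) * _norm(d2))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding errors
    return _norm(_sub(base2, base1)), math.degrees(math.acos(cos_angle))

def scaffold_matches(query, scaffold, dist_tol=0.5, angle_tol=20.0):
    """True if some assignment of scaffold bonds reproduces every pairwise
    descriptor of the query vectors within the (assumed) tolerances."""
    n = len(query)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    wanted = {p: pair_descriptor(query[p[0]], query[p[1]]) for p in pairs}
    for perm in permutations(range(len(scaffold)), n):
        if all(
            abs(wanted[(i, j)][0] - d) <= dist_tol
            and abs(wanted[(i, j)][1] - a) <= angle_tol
            for (i, j) in pairs
            for d, a in [pair_descriptor(scaffold[perm[i]], scaffold[perm[j]])]
        ):
            return True
    return False

def shifted(vector, offset):
    """Translate a bond vector by a fixed offset."""
    base, tip = vector
    return (tuple(a + o for a, o in zip(base, offset)),
            tuple(a + o for a, o in zip(tip, offset)))

# Hypothetical query: three C-alpha -> C-beta vectors (coordinates invented).
query = [((0, 0, 0), (0, 0, 1.5)),
         ((3, 0, 0), (3, 1.5, 0)),
         ((0, 4, 0), (-1.5, 4, 0))]

# Same geometry, translated and listed in a different order.
scaffold_good = [shifted(query[2], (1, 1, 1)),
                 shifted(query[0], (1, 1, 1)),
                 shifted(query[1], (1, 1, 1))]

# One bond moved far away: the pairwise distances no longer fit.
scaffold_bad = [query[0], ((6, 0, 0), (6, 1.5, 0)), query[2]]

print(scaffold_matches(query, scaffold_good))
print(scaffold_matches(query, scaffold_bad))
```

Because only distances and angles are compared, the test is independent of how the scaffold is positioned in space. It cannot tell mirror-image arrangements apart, which a more complete description of the vector pairs would have to address.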

10.9 Design of Peptidomimetics: Quo Vadis?

In this chapter the systematic approach to the design of peptidomimetics has been
described. The approaches have proven themselves in many cases and have led to
many attractive drugs. Nevertheless there are also difficulties. The first problem is
the stepwise approach. A peptide is systematically modified, and the synthesized
structures serve only to identify the essential functional groups. The synthesis of
the many resultant derivatives, that is, practically all in which an amide group was
replaced by one of the structures in Fig. 10.4, is laborious. Furthermore these
compounds only serve as tools because most modified peptides have high molecular
weights, and this can result in poor oral bioavailability.
In the past, many new nonpeptidic active substances, especially as receptor
antagonists, were found in high-throughput screening, and these could frequently
be developed into clinical candidates in relatively little time. These successes have
pushed rational concepts for the development of peptidomimetics, which were once
in the foreground, somewhat into the background. Despite this, the design of
peptidomimetics remains an important research area in drug design. The terphenyl
scaffold helix mimetics serve as an example of this. Many enzyme inhibitors that
are introduced in ▶ Chaps. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme
Intermediate”; ▶ 24, “Aspartic Protease Inhibitors” and ▶ 25, “Inhibitors of Hydro-
lyzing Metalloenzymes” continue to have peptidic character. Here the peptide
substrate is clearly the “gold standard” for the design of a mimetic. As always,
the peptidomimetic concept plays an important role in lead structure optimization.

10.10 Synopsis

• Peptides are open-chain polymeric molecules made up of amino acids that
are mutually linked by amide bonds. Side chains branch from the main chain
at the Cα atoms and show a high degree of flexibility. If such a polymer contains
up to 30–50 amino acids, it is called a peptide; beyond this limit, it is called
a protein.
• Peptides are responsible for many biological functions; their applicability as
drugs is limited due to size, polarity, and poor proteolytic stability.
• Due to their multiple functions, peptides can be mimicked by smaller – similarly
binding – and metabolically stable peptidomimetics.
• Peptidomimetic design starts with the identification of the minimal peptide
sequence responsible for a biological effect, followed by successive replacement
of each amino acid in the chain with alanine to detect the side chains responsible
for activity. Finally, individual amino acids are replaced by non-proteinogenic
ones or similar chemical building blocks.
• Multiple surrogates for amino acid side chains have been developed and can be
tested to reveal better binding and more stable peptidomimetics. If not involved
in direct binding, main-chain amide bonds can be replaced by a large variety of
substitutes that achieve a similar geometry.
• Peptides are flexible and adopt multiple conformations. If a particular fold is
adopted to correctly orient interacting side chains, the peptide backbone can be
replaced by an entirely different scaffold that correctly positions the essential
interacting groups.
• Peptides fold upon themselves through particular turn patterns. These turns
stabilize a required conformation and can be chemically replaced by rigid
structural surrogates that freeze a given turn conformation.
• Proteins communicate with each other through the formation of large, mutu-
ally shared surface patches. Small molecules designed to bind to such flat
surfaces can antagonize complex formation and interfere with protein–protein
communication.
• Design of small molecules to block protein–protein interfaces exploits depres-
sions on the surface that accommodate spatial patterns such as turns or helical
portions of the penetrating contact surface of the partner protein.
• Peptides bind to receptors mostly via side chains, and the backbone provides the
scaffold for their attachment. Computer programs can be used to screen struc-
tural databases to retrieve alternative scaffolds that are able to orient substituents
in very similar fashion.

Bibliography

General Literature

Ahn J-M, Boyle NA, MacDonald MT, Janda KD (2002) Peptidomimetics and peptide backbone
modifications. Mini Rev Med Chem 2:463–473
Gante J (1994) Peptidomimetics—tailored enzyme inhibitors. Angew Chem Int Ed Engl
33:1699–1701
Giannis A, Kolter T (1993) Peptidomimetics for receptor ligands—discovery, development, and
medical perspectives. Angew Chem Int Ed Engl 32:1244–1267
Hirschmann R (1991) Medicinal chemistry in the golden age of biology: lessons from steroid and
peptide research. Angew Chem Int Ed Engl 30:1278–1301
Marahiel MA (2009) Working outside the protein-synthesis rules: Insights into non-ribosomal
peptide synthesis. J Pept Sci 15:799–807

Special Literature
Howson W (1995) Rational design of Tachykinin receptor antagonists. Drug News Perspect
8:97–103
Lauri G, Bartlett PA (1994) CAVEAT: a program to facilitate the design of organic molecules.
J Comput Aided Mol Des 8:51–66
Lelais G, Seebach D (2004) β2-Amino acids: synthesis, occurrence in natural products, and
components of β-peptides. Biopolymers 76:206–243
MacLeod AM, Merchant KJ, Cascieri MA et al (1993) N-Acyl-L-tryptophan benzyl esters: potent
substance P receptor antagonists. J Med Chem 36:2044–2045
Merchant KJ, Lewis RT, MacLeod AM (1994) Synthesis of homochiral ketones derived from
L-tryptophan: potent substance P receptor antagonists. Tetrahedron Lett 35:4205–4208
Olson GL, Bolin DR, Bonner MP et al (1993) Concepts and progress in the development of peptide
mimetics. J Med Chem 36:3039–3049
Part III
Experimental and Theoretical Methods
A crystal is the prerequisite for the 3D-structure determination of a protein with
X-ray crystallography (▶ Chap. 13). The figure shows crystals of a complex of
protein kinase A that were used to elucidate the reaction mechanism of this class of
enzymes (▶ Chap. 26). (Reprinted with the kind permission of Dr. Dirk
Bossenmeyer, Deutsches Krebsforschungszentrum, Heidelberg, Germany.)
11 Combinatorics: Chemistry with Big Numbers

The search for new lead structures and the optimization of their activity profile by
systematic modification are among the most time- and cost-demanding steps in drug
research. The optimization of a small organic molecule can serve as an example.
Even if the number of different groups per position is limited to relatively few,
several million structures are possible, as shown by the example of the
multisubstituted tetrahydroisoquinoline carboxylic acid amide 11.1 (Fig. 11.1). The
combinatorial explosion of all imaginable substitution possibilities can no longer be
realized with classical chemical techniques. The diversity increases even more
when the different stereoisomers are considered. The number is already consider-
ably larger than the number of all of the compounds referenced in Chemical
Abstracts (33 million) or in Beilstein (10 million compounds).
In the days when substances were tested on whole animals or in complex
pharmacological in vitro models, the biological tests were the rate-determining
step. The introduction of molecular test models, for example, enzyme or
receptor-binding tests, and extensive automation of screening has fundamentally
changed this situation. Testing of many thousands of compounds per day is
technically unproblematic (▶ Sect. 7.3). To use the capacity of these methods
to their fullest extent, the synthesis of thousands or even tens or hundreds of
thousands of different molecules is desirable. The strategy can then shift either to
automated parallel synthesis to cover a large number of single compounds or
the simultaneous production of compound mixtures by using combinatorial
chemistry.

11.1 How Nature Produces Chemical Multiplicity

Nature has shown a way to achieve combinatorial diversity with the nucleic acids
and with proteins. A 600-base-pair DNA sequence codes for a protein with 200 amino
acids. From the “pool” of four nucleotides that code for the 20 proteinogenic
amino acids in triplet sequences, 4⁶⁰⁰ (a number with more than 360 digits!) different
DNA sequences are possible. This translates to 20²⁰⁰ (a number with more than 260 digits!)
different amino acid sequences for the resulting protein. Short peptides with
enormous structural variety can be constructed with just the 20 proteinogenic
amino acids. If instead of amino acid A, a manageable number of modified
amino acids M is used, the number of possible analogues increases even more
(Table 11.1).

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_11,
© Springer-Verlag Berlin Heidelberg 2013

[Figure 11.1: structure of the tetrahydroisoquinoline carboxylic acid amide 11.1 with the substituent positions R1–R10 and two stereocenters (*)]

Fig. 11.1 The tetrahydroisoquinoline carboxylic acid amide 11.1 is to be substituted in 10 positions.
The groups in these positions are drawn from a total of 68 building blocks
(R1–R10 = 5, 10, 10, 4, 5, 5, 5, 2, 2, 20 groups). Twenty million compounds can be constructed in
this way. If the structural diversity that results from the two stereocenters (*) is considered, this
number increases again by a factor of 4.

Table 11.1 Four hundred dipeptides, 8,000 tripeptides, 160,000 tetrapeptides, and
64 million hexapeptides can be generated from the 20 natural amino acids, A. If
the palette is expanded to 100 modified, non-natural amino acids, M, the combinatorial
diversity increases dramatically.

Compounds                                Number
Natural amino acids, A                   20
Dipeptides, A–A                          400
Tripeptides, A–A–A                       8,000
Tetrapeptides, A–A–A–A                   160,000
Hexapeptides, A–A–A–A–A–A                64,000,000
Modified amino acids, M                  100 (for example)
Modified hexapeptides, M–M–M–M–M–M       1,000,000,000,000
Number of known compounds                >33,000,000
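
The numbers quoted for Fig. 11.1 and Table 11.1 follow from simple multiplication and can be checked with a short sketch. The variable and function names are illustrative only; the group counts are those given in the caption of Fig. 11.1.

```python
from math import prod

# Number of alternative groups at positions R1..R10 (caption of Fig. 11.1).
groups_per_position = [5, 10, 10, 4, 5, 5, 5, 2, 2, 20]

n_building_blocks = sum(groups_per_position)    # building blocks needed
n_compounds = prod(groups_per_position)         # one choice per position
n_with_stereoisomers = n_compounds * 2 ** 2     # two stereocenters

def n_sequences(alphabet_size, length):
    """Number of linear sequences of a given length over an alphabet."""
    return alphabet_size ** length

print("building blocks:", n_building_blocks)
print("compounds:", n_compounds)
print("including stereoisomers:", n_with_stereoisomers)

for length in (2, 3, 4, 6):
    print(f"{length}-mers from 20 amino acids:", n_sequences(20, length))
print("hexapeptides from 100 modified amino acids:", n_sequences(100, 6))

# Python integers are exact, so the digit counts can be read off directly.
print("digits in 4^600:", len(str(n_sequences(4, 600))))
print("digits in 20^200:", len(str(n_sequences(20, 200))))
```

The sketch makes the asymmetry visible: 68 building blocks suffice for 20 million compounds, because the library size is the product of the group counts, whereas the synthetic effort in building blocks grows only as their sum.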
Peptides play an important role in biological systems. They are found as
protein ligands in the free form or as simple derivatives. Peptide sequences exposed
on the surface of a protein determine the recognition properties of the protein at
a receptor. Nature exhausts the full combinatorial diversity of the variable
sequences on the surface regions (epitopes) of proteins for their selective recogni-
tion. This principle of Nature can be adopted to generate huge compound libraries
with highly variable composition.
11.2 Protein Biosynthesis as a Tool to Build Compound Libraries

How can the biochemical synthesis machinery be used as a vehicle to generate
a multiplicity of peptide sequences? It is possible to connect short sequences to a
carrier protein so that they are exposed on the surface and can interact with the
target protein in a molecular test system. The test system is constructed in a way
that the binding to the target protein is monitored with an easily registered signal,
for instance, a fluorescence signal or a colorimetric reaction (▶ Sect. 7.2).
To exploit protein synthesis for the construction of such a library, information
about the randomly constructed peptide must be translated into the “genetic make-
up” of the DNA molecule. This codes the sequence of the protein on the surface of
which the library will be presented. Randomly assembled double-stranded DNA
sequences must be introduced in the correct position of the DNA. After production
of a large number of identical copies (clones), the gene can be expressed. A large
population of proteins is produced that carry the randomly composed peptide
sequence in a particular position, usually at the beginning or the end of the polymer
chain. These proteins are then investigated in a test system. The distribution of the
20 proteinogenic amino acids over the variable sequence section is not entirely
homogeneous. That is because some amino acids are coded by a single triplet
sequence (codon), and others are represented by up to six different codons
(▶ Sect. 32.7). Because of this, biased libraries are inevitably formed.
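
The source of this bias can be made concrete by counting the codons per amino acid in the standard genetic code, where the count ranges from one to six. The sketch below assumes that every one of the 64 triplets is equally likely in a randomly assembled DNA section, which is an idealization; the compact table string follows the usual T, C, A, G ordering of the bases.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...; '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codon_table = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid, stop codons excluded.
codon_counts = Counter(aa for aa in codon_table.values() if aa != "*")
n_sense_codons = sum(codon_counts.values())

# Expected share of each amino acid at one fully randomized position.
expected_share = {aa: n / n_sense_codons for aa, n in codon_counts.items()}

for aa, n in sorted(codon_counts.items(), key=lambda item: -item[1]):
    print(f"{aa}: {n} codons, expected share {expected_share[aa]:.3f}")
```

Under this assumption, an amino acid with six codons is expected six times as often as one with a single codon at every randomized position, and the imbalance compounds over the length of the peptide.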
The bacteriophage M13 is an extremely popular expression system. M13 is
a virus that infects Escherichia coli strains well. The virus carries six proteins on its
coat. Two of these coat proteins allow randomly assembled protein sections to be
added to their ends. A library of 20 million modified 15-residue peptides was
constructed with this M13 system. Their binding to the protein streptavidin was
tested. Fifty-eight candidates were identified as binding partners. They all carried
the ─His─Pro─Gln─ segment in common. A crystal structure of the streptavidin in
complex with this oligopeptide was successfully determined. The ─His─Pro─Gln─
segment of the peptide occupies the binding pocket that is normally used by biotin.
This proves that selectively binding peptide sequences can be found with this method.
The biochemical approach to generating and presenting compound libraries has
the overwhelming advantage that the high-capacity protein biosynthesis is
exploited. Furthermore the sophisticated protein and DNA synthesis techniques
and analytical methods that have been developed for such substances (Sect. 11.7)
can be used to characterize screening hits. But it also has disadvantages. The
molecular diversity is limited to the 20 proteinogenic L-amino acids, and only
peptides result as lead structures. Often these represent the starting point for
the development of an active substance. However, we want to get away from the
metabolically unstable, poorly bioavailable peptides. Therefore structures are
searched for by using classic organic molecular scaffolds. At the very least,
peptidomimetics or peptides with metabolically stable non-natural amino acids are
desired. Unfortunately the step away from peptides toward alternative scaffolds, with
retention of the biological activity, is not trivial (▶ Chap. 10, “Peptidomimetics”).
11.3 Organic Chemistry from a Different Angle: Random-Guided Synthesis of Compound Mixtures

Organic preparative methods were devised as an alternative to the biological
approaches to generate compound libraries. Simple access to a compound library
is gained by starting with reactive molecular building blocks, such as oligofunctional
acid chlorides (11.2–11.4, Fig. 11.2). These components are simultaneously reacted
with numerous reagents, for example, amines or amino acids. A mixture of many
products is formed in an uncontrolled manner. Contrary to the general academic
opinion that organic reactions should only deliver homogeneous products, in this
case as much product diversity as possible is desired. The advantage of this method
is that it is easily carried out, and automation is readily realized. This synthesis
strategy also has disadvantages. The coupling partners have different reactivities. As
a result, the products are not evenly distributed. The transformation of a particular
functional group on the central building block can depend upon which components
the central molecule has already reacted with, and how this influences the other
functional groups.

[Figure 11.2: structures of the acid chlorides 11.2–11.4, of the general xanthene library member bearing the amino acid residues AA1–AA4, and of the isomers 11.5 and 11.6, which carry the residues Lys, Ile, Val, and Pro]

Fig. 11.2 The oligofunctional acid chlorides of the central building blocks cubane 11.2, xanthene
11.3, and benzene 11.4 are treated with protected amino acids. A xanthene-containing library
inhibits the digestive enzyme trypsin. The active component of the library was deconvoluted and
characterized by targeted resynthesis. In the end the isomers 11.5 and 11.6 remained as the most
potent compounds. The derivative 11.5 inhibits trypsin with a Ki of 9.4 µM.

The library generated in this way is then tested. If binding to the target protein is found,
the active substance in the mixture must be characterized, a task that is not particularly
simple. On the one hand, sophisticated analytical techniques such as liquid chro-
matography coupled with NMR spectroscopy and mass spectrometry can be used.
On the other hand, an attempt can be made to “deconvolute” the library. For this, a targeted
resynthesis of the library is carried out in which a partial library is prepared by using
a defined selection of building blocks. This smaller library is then tested and the
composition of the active mixture is determined. This strategy must be followed
back to the level of single, defined reaction products.

11.4 What Is Contained in Chemical Space?

At this point the fundamental question must be asked: how many organic molecules
are possible in principle from which medicinal chemists can create their candidates?
What is the content of this, at first virtual, chemical space? Much has been speculated
about this question. Numbers between 10²⁰ and 10²⁰⁰ possible molecules have been
named. The latter figure encompasses so many molecules that the entire mass of the
universe would not be enough to synthesize even one molecule of every compound!
We have Tobias Fink and Jean-Louis Reymond of the University of
Bern to thank for a concrete idea of how this chemical space is occupied in principle.
Beginning with mathematical graphs that describe simple hydrocarbon scaffolds,
molecules with up to 11 C, N, O, and F atoms were generated on the computer.
Heteroatoms and unsaturated bonds were scattered throughout the generated molec-
ular graphs in a combinatorial fashion. Different filters that consider the chemical
stability of the functional groups, the strain of the ring systems, and the formation of
tautomers produced a database of 26.4 million structures. If all possible stereoiso-
mers are generated, an average of 4.2 isomers per entry is formed. The database
finally encompassed 110 million molecules. It is interesting to see that the number of
entries increases exponentially with the square of the number of atoms. Therefore
already 90% of the database is composed of molecules with 11 non-hydrogen atoms.
If the number of molecules that can be generated with 25 non-hydrogen atoms is
estimated, the result is 10²⁷ imaginable products. Twenty-five atoms represent
approximately the average size of a typical drug molecule.
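
The database figures quoted above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the rounded values given in the text; the variable names are illustrative.

```python
n_structures = 26_400_000        # entries after filtering, without stereoisomers
isomers_per_entry = 4.2          # average number of stereoisomers per entry
share_with_11_atoms = 0.90       # fraction of entries with 11 non-hydrogen atoms

# Size of the database once all stereoisomers are enumerated.
n_with_stereoisomers = n_structures * isomers_per_entry

# Entries made up of 11 non-hydrogen atoms.
n_largest = n_structures * share_with_11_atoms

# Ratio of the estimate for 25 atoms to the enumerated database.
ratio_25_atoms = 1e27 / n_structures

print(f"with stereoisomers: {n_with_stereoisomers:.3g}")
print(f"entries with 11 atoms: {n_largest:.3g}")
print(f"10^27 is about {ratio_25_atoms:.1e} times the enumerated database")
```

The product of 26.4 million entries and 4.2 isomers per entry comes to about 1.1 × 10⁸, in line with the roughly 110 million molecules quoted for the complete database.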
It is worthwhile, however, to look at the database with entries of 11 non-
hydrogen atoms more closely. The average molecular mass in this database is
153 ± 7 Da. Molecules of this size fall into the range of typical fragments or
“lead-like” molecules (▶ Sect. 7.9). Exclusion criteria were proposed that emphasize
promising candidates for drug development. The so-called “rule of three” leans on
the “rule of five,” which was established by Chris Lipinski at Pfizer (▶ Sect. 19.7).
If the database is filtered with these rules, approximately half of the entries remain. Of
these, ca. 15% are acyclic compounds, and about 43% contain one ring. It is very
enlightening to see that only about 55% of the ring systems in the virtual database
have been described in Chemical Abstracts or Beilstein. Comparison with a data
collection of already-synthesized molecules of the same size makes clear where the
chemical space has been only sketchily explored. It seems that very big gaps still
exist! Over 99.8% of the entries in the virtual database are waiting to be synthesized.
A comparison of the physicochemical properties of the molecules in both databases
suggests that very broad areas still remain that until now have not been explored.
If the chemical space is limited to compounds with 7, 8, or 9 atoms, it seems that the
chemical space is well covered with already prepared molecules. Approximately 2/3
of the molecules with 10 or 11 atoms in the virtual database are chiral. In this group
particularly, there are many candidates that meet the “lead-like” criteria. This is a real
challenge for synthetic chemists. Chiral fused carbo- and heterocycles are difficult to
make. Nevertheless, Nature has led the way: many biologically active natural prod-
ucts contain just these building blocks.
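
Filtering a database with the “rule of three” amounts to a simple property check per entry. The sketch below is an illustration under stated assumptions: the thresholds are the commonly quoted ones (molecular weight up to 300 Da, calculated logP up to 3, and at most three hydrogen-bond donors and three acceptors) and are not taken from this chapter, and the two example entries with their property values are invented.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    name: str
    mol_weight: float   # Da
    clogp: float        # calculated octanol/water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def passes_rule_of_three(entry: Entry) -> bool:
    """Commonly quoted fragment criteria; thresholds are an assumption here."""
    return (
        entry.mol_weight <= 300
        and entry.clogp <= 3
        and entry.h_donors <= 3
        and entry.h_acceptors <= 3
    )

# Hypothetical database entries, for illustration only.
database = [
    Entry("entry_a", mol_weight=153.0, clogp=1.2, h_donors=1, h_acceptors=2),
    Entry("entry_b", mol_weight=420.0, clogp=4.5, h_donors=2, h_acceptors=6),
]

lead_like = [e.name for e in database if passes_rule_of_three(e)]
print(lead_like)
```

In practice the property values would come from a cheminformatics toolkit rather than being typed in by hand.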

11.5 Compound Libraries on Solid Support: Complete Yield and Easy Purification

An interesting variation on classical chemistry in solution is found in the synthesis
of compound libraries on solid supports. Organic polymers, usually cross-linked
polystyrenes, are used as carriers. This material is chemically modified so that
it carries numerous reactive functional groups of a particular sort, for example,
chloromethyl, carboxylate, or amino groups. Through these groups the reaction
product remains covalently attached to the insoluble polymer during the synthetic
steps. Stepwise growth of the product is accomplished by coupling with appropri-
ately protected building blocks (e.g., amino acids) and subsequent cleavage of these
protecting groups. Large excess of reagents causes fast and nearly complete trans-
formations. Unreacted starting materials can be removed by simple washing. After
assembly of the target molecule, all protecting groups are removed. At the end of
the synthesis, the product is either tested directly on the support or it is cleaved and
its biological activity is tested in solution (Sect. 11.7).
The technique can be easily automated. In the beginning of the 1960s, Robert
Bruce Merrifield developed solid-phase synthesis for peptides and small proteins
(Fig. 11.3). In the beginning of the 1980s the idea to use synthetic combinatorial
principles for peptide synthesis emerged for the first time. H. Mario Geysen
devised a multipin synthesis of peptides. By using a conventional Merrifield
solid-phase synthesis, 96 different peptides or defined peptide mixtures were
prepared in an 8 × 12 format on polymer pins. This concept was so revolutionary
that the originally submitted manuscript was rejected for publication in 1984. The
referees were too severely restricted by their traditional thinking. For Geysen, the absolute
control of stoichiometry and yield was less in the foreground; the creation of
combinatorial diversity with minimal effort was more important.

[Figure 11.3: reaction scheme of the Merrifield synthesis. Chloromethylation of the resin (ClCH2OCH3 + SnCl4 or ZnCl2 + HCHO); attachment of Boc-NHCHR1COO−; removal of Boc with HCl/CH3COOH; coupling with Boc-NHCHR2COOH using DCCI in DMF; cleavage from the resin with strong acid (HBr/CF3COOH)]

Fig. 11.3 The Merrifield peptide synthesis is carried out on a polymeric resin that is
functionalized in an appropriate way. The first N-terminal-protected amino acid is coupled to
the chloromethyl group (Boc = tert-butoxycarbonyl protecting group). Then the amino group
is released and coupled with a second amino acid, which is activated with
dicyclohexylcarbodiimide (DCCI). The N terminus of the resulting dipeptide can be deprotected
and elongated. The peptide can also be cleaved from the resin under strongly acidic conditions.

In this way, thousands of different peptides could be prepared weekly. Entire
libraries of compounds could be prepared and tested. The new methods were
originally used for “epitope mapping,” that is, the structural probing of the surface
of a protein with different antibodies (▶ Sect. 32.1). This technique allows the
recognition of areas in a polypeptide chain that are exposed to the surface of
a protein. Later it served the search for optimal sequences of protease substrates
(▶ Sect. 14.6) and for the synthesis of biologically active peptides. In addition to
the multipin method, high-efficiency methods have been established, for instance,
the teabag method. Support beads are filled into teabags and dipped into solutions
of protected amino acids with which their peptide sequence is to be elongated.

11.6 Compound Libraries on Solid Support Require Contrived Synthetic Strategies

An especially sophisticated synthetic strategy is needed for the construction of
compound libraries. Hexapeptides are considered as an example. In principle, all
20 proteinogenic amino acids could be used and 20⁶ = 64 million hexapeptides
prepared and individually tested, an impossible undertaking. Therefore intelligent
strategies are needed to quickly identify the biologically active sequences. As
a consequence, an attempt is made to group the 64 million peptides into partial
libraries that contain constant amino acids in fixed positions. For example,
400 partial libraries of the form XXABXX are prepared, one for each combination of
the predefined amino acids A and B (X is a mixture of all natural amino
acids). After testing these 400 mixtures, the most biologically active one is the
starting point for the second round of synthesis. Another 400 libraries are generated,
this time with the form XA(Aa1)(Aa2)BX. Aa1 and Aa2 are the amino acids from the
most active mixture from the first testing. These too are fed into the testing. The
“best” amino acids for positions 2 and 5 are found. This strategy is pursued step by
step until the most active sequence is identified.
In a simpler procedure the amino acids are varied in one position at a time. By
starting with 20 libraries AXXXXX the most active amino acid is determined in the
first position. The starting point for the next synthetic cycle is the most active
mixture Aa1XXXXX. By varying the adjacent position, the second amino acid is
ascertained. This is repeated and 6 × 20 = 120 hexapeptide libraries are prepared in
the form of AXXXXX, Aa1AXXXX, . . . Aa1Aa2Aa3Aa4Aa5A until the “best”
amino acids in all positions are determined.
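
The position-by-position procedure can be simulated with a toy model. The sketch assumes, purely for illustration, that each position contributes independently to the activity and that a mixture position X contributes the average over all 20 amino acids; real binding data need not behave additively. The “hidden” best sequence and all names are hypothetical, and the candidate amino acid is written with its one-letter code rather than the generic symbol A used in the text.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # one-letter codes of the 20 amino acids
HIDDEN_BEST = "HPQFWK"                 # hypothetical optimum, unknown to the chemist

def contribution(position, aa):
    """Toy additive model: a position scores 1 only with the optimal residue."""
    return 1.0 if aa == HIDDEN_BEST[position] else 0.0

def mixture_activity(pattern):
    """Mean activity of a library such as 'HPAXXX'.

    'X' stands for an equimolar mixture of all 20 amino acids, so it
    contributes the average over the alphabet at that position.
    """
    total = 0.0
    for position, symbol in enumerate(pattern):
        if symbol == "X":
            total += sum(contribution(position, aa) for aa in AMINO_ACIDS) / len(AMINO_ACIDS)
        else:
            total += contribution(position, symbol)
    return total

def deconvolute(length=6):
    """Fix one position per round, always keeping the most active library."""
    prefix, n_libraries = "", 0
    for position in range(length):
        libraries = [prefix + aa + "X" * (length - position - 1) for aa in AMINO_ACIDS]
        n_libraries += len(libraries)
        best = max(libraries, key=mixture_activity)
        prefix = best[: position + 1]
    return prefix, n_libraries

best_sequence, libraries_made = deconvolute()
print(best_sequence, libraries_made)
```

In this idealized model the 120 libraries are enough to recover the optimum out of 64 million sequences. If the positions influence each other, fixing them one at a time can lead to a sequence that is good but not the best.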
Another method allows the targeted construction of a library in few working
steps. The conceptual design of the synthesis ensures that a defined compound is
produced on each polymeric support bead. This is achieved by using the so-called
“split-and-combine” technique (Fig. 11.4). For example, it is possible to synthesize
all 8,000 possible tripeptides from the 20 proteinogenic amino acids in only
60 reaction steps. They are produced as 20 mixtures of 400 substances each. In the
end, one definite compound is located on each polymer bead. The individual beads
are available as a batch that is easily separated mechanically and individually tested.

[Figure 11.4: scheme of the split-and-combine procedure with three reagents A, B, and C. After the first cycle the beads carry A, B, or C; after the second cycle the nine dipeptides; after the third cycle the 27 tripeptides]

Fig. 11.4 The construction of a compound library according to the split-and-combine technique
starts with a certain amount of resin beads. These are evenly distributed among n reaction vessels.
Only three are considered here for the sake of simplicity. In the first flask reagent A (e.g., amino
acid A) is coupled to the resin. Reagents B and C are analogously added to flasks 2 and 3. In the next
step a dipeptide is constructed. To solve the problem of different reaction rates between the
different amino acids A, B, and C, only one soluble reaction partner is added in excess to the
mixture of solid-phase-bound starting materials. After the first reaction step, the resin, which is
now loaded with an amino acid, is combined and mixed. It is again distributed between 3 (or more)
reaction flasks. The next reaction is carried out. In the case of a peptide synthesis, amino acid A is
added to flask 1, B to flask 2, and C to flask 3. The resin is combined and mixed thoroughly. At
this point all nine possible dipeptides are on the beads. After separating the beads again, the third
step follows. In case the peptide chain is to be extended by another amino acid, amino acid A is
added to flask 1, B to flask 2, and C to flask 3. Now all 27 imaginable sequential tripeptides are on
the resin after three parallel reaction steps. A clearly identifiable compound is found on each resin
bead. The library can be tested directly on the polymer or it can be tested in solution after cleavage
from the support.
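
The split-and-combine procedure lends itself to a small simulation. The sketch below is an illustration, not a laboratory protocol: each bead is modeled as a string that grows by one building block per cycle, complete coupling is assumed, and the function name, bead numbers, and random seed are arbitrary choices.

```python
import random

def split_and_combine(building_blocks, n_cycles, n_beads, seed=0):
    """Simulate split-and-combine synthesis.

    Returns the sequence carried by each bead and the number of coupling
    reactions performed (one per vessel and cycle).
    """
    rng = random.Random(seed)
    beads = [""] * n_beads              # every bead starts without a residue
    n_reaction_steps = 0
    n_vessels = len(building_blocks)
    for _ in range(n_cycles):
        rng.shuffle(beads)              # combine and mix
        portions = [beads[i::n_vessels] for i in range(n_vessels)]  # split evenly
        beads = []
        for block, portion in zip(building_blocks, portions):
            # One vessel, one reagent: all beads in it receive the same block.
            beads.extend(sequence + block for sequence in portion)
            n_reaction_steps += 1
    return beads, n_reaction_steps

# Three reagents and three cycles, as in Fig. 11.4.
beads, steps = split_and_combine("ABC", n_cycles=3, n_beads=2700)
print(len(set(beads)), "different tripeptides in", steps, "reaction steps")

# Twenty amino acids and three cycles: count the reaction steps only.
_, steps_20 = split_and_combine("ACDEFGHIKLMNPQRSTVWY", n_cycles=3, n_beads=200)
print(steps_20, "reaction steps for the tripeptide library from 20 amino acids")
```

Each bead passes through exactly one vessel per cycle, so it carries a single sequence, while the number of reactions grows only with the product of cycles and vessels (3 × 20 = 60 for the tripeptide library). To represent all 8,000 tripeptides, far more beads than sequences are needed, because the distribution over the vessels is random.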
11.7 Which Compound in the Solid Support Combinatorial Library Is Biologically Active?

The libraries that were generated on the solid support are biologically tested. This
can be done directly on the polymer-immobilized compounds. As with testing
the libraries from bacteriophages, there is a danger that the support material
influences the test, for example, through steric hindrance or unspecific interactions.
Furthermore, it is important that the test protein is in a soluble form. Membrane-
bound receptors therefore elude testing. Alternatively, the compound library can be
cleaved from the resin. For this, the coupling between the resin and the library
component must be made by using an appropriate “linker,” which allows the library
to be selectively released. This linker is cleaved off, for instance, at a low pH or
photochemically with UV light. It must not interfere, however, with the synthetic
assembly of the library, and must not be cleaved during the synthesis.
The final cleavage from the resin must not destroy the products. Testing the
cleaved products certainly correlates better with physiological conditions. Spread-
ing the cleaved compounds onto a large area or embedding them in a gel
achieves a spatial separation so that compounds interacting with the test protein
occur in local high concentrations. This way, the binding to insoluble (e.g., mem-
brane-bound receptors) proteins can be tested. However, the advantage of
the mechanical manipulation of a polymer-bound compound library is lost
upon release.
If biological activity is found in the test, it remains to be determined which
compound from the library is responsible. If the library is precisely defined through
the synthetic program, then it is known which compounds were tested. Active
components are narrowed down by deconvolution and the resynthesis of partial
libraries. Only one defined compound is produced on each resin bead with the one-
bead-one-compound technique. It is not known, however, which one. It is only
after activity is found that the compound characterization is attempted. There are
many ways to do this. For example, the relevant resin beads can be separated and the
compounds on them analyzed. If the library consists of peptides or oligonucleotides,
peptide sequencing by Edman degradation (works even on 0.1 picomolar!) is
carried out, or the polymerase chain reaction (▶ Sect. 12.1) allows amplification and
enrichment of oligonucleotides.
Even more elaborate techniques are also used. During synthesis, the library is
allowed to “grow” on multiple different linkers. The single library compounds can
be released from these linkers under different conditions (e.g., different pH values,
or photochemically at different wavelengths). First the compound is cleaved from
the first linker to carry out testing. The cleavage from the second linker is performed
after mechanical separation of the desired resin beads. This method serves to
practically “label” the resin beads. The technique is therefore an elegant variation
on library testing in a detached state. The different linker-bound compounds on the
resin bead need not be identical. A test library of peptides can thus be combined on
the resin bead with oligonucleotides, which are used as labels. Halogenated aromatics
were also proposed as labels because they can be easily identified by mass
spectrometry even in the smallest quantities. The labels can even be encoded based
on their sequence or the number of monomer building blocks with an appropriate
binary code.
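How a binary code can record the synthetic history of a bead may be sketched as follows. The text only states that labels can be encoded with a binary code; the concrete scheme used here (a fixed number of tag molecules per synthesis step, each tag being either present or absent) and the names `encode_bead` and `decode_bead` are assumptions chosen for illustration.

```python
def encode_bead(block_indices, bits_per_step):
    """Return the set of tag numbers attached to one bead.

    block_indices[i] is the 1-based index of the building block used in
    step i. Tag number step * bits_per_step + bit is attached whenever
    that bit of the index is 1 (hypothetical numbering).
    """
    tags = set()
    for step, index in enumerate(block_indices):
        if not 1 <= index < 2 ** bits_per_step:
            raise ValueError("index does not fit into the bit field")
        for bit in range(bits_per_step):
            if (index >> bit) & 1:
                tags.add(step * bits_per_step + bit)
    return tags

def decode_bead(tags, steps, bits_per_step):
    """Read the building-block index of every step back from the tags."""
    indices = []
    for step in range(steps):
        index = 0
        for bit in range(bits_per_step):
            if step * bits_per_step + bit in tags:
                index |= 1 << bit
        indices.append(index)
    return indices

history = [5, 2, 7]                       # blocks used in steps 1 to 3
tags = encode_bead(history, bits_per_step=3)
print(sorted(tags), decode_bead(tags, steps=3, bits_per_step=3))
```

In this sketch, nine different tag molecules suffice to distinguish up to seven building blocks in each of three steps, because only the presence or absence of each tag has to be detected.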
The techniques of labeling the resin bead require considerable synthetic effort
for the library preparation. The transformation steps for the assembly of the library
and the labeling must not disturb one another. Even the final reading of the labels
can require multiple working steps. The alternative route using the programmed
synthesis concept with deconvolution and resynthesis also means increased effort
for the repeated construction of the library components. However, the same working
steps are always used; they are just carried out with different reagent compositions.
With respect to automation, this is certainly an advantage.

11.8 Combinatorial Libraries with Large Diversity: A Challenge for Synthetic Chemistry

Another aspect speaks in favor of the last-mentioned concept. In the meantime
a large number of organic reactions have been transferred to solid-phase synthesis.
For each solid-phase synthesis, a special strategy, a specific linker, and a suitable
cleavage method must be developed. Each single synthetic step must be compatible
with the protecting groups, the polymer support, and the linker. In return, a whole
new dimension of chemical diversity becomes available, beyond what is possible with
peptides and nucleotides.
Careful design of the target molecules to be synthesized is indispensable for
combinatorial chemistry. Limitations arise from the accessibility, that is, the
development of an appropriate synthetic scheme, and furthermore from the desired
structural diversity of the resulting library. Computer methods help to find
a “reasonable” selection of synthetic components. How is the optimal composition
obtained? This highly depends on what the constructed library should be tested for.
A library can be developed for general-purpose screening. It should then be
“optimally diverse.” Its composition is outlined according to generally accepted
criteria such as molecular weight, total lipophilicity, an even distribution of H-bond
donors and acceptors, as well as the size of the hydrophobic surface area.
These characteristics are important for the similarity or diversity of active com-
pounds in the library (▶ Chap. 17, “Pharmacophore Hypotheses and Molecular
Comparisons”). The desired library diversity can also be considered in relation to
the biological properties of a receptor (target oriented). Criteria that make
a molecule “similar” or “diverse” for one receptor are not necessarily the same
for another receptor (▶ Sect. 17.7). In view of the broad palette of proteins for
which combinatorial libraries should be tested, there is no absolute measure of
diversity. Therefore, combinatorial chemistry plays an important role in the
establishment of structure–activity relationships for a target protein. For this,
chemical variations at different positions of a suitable lead structure, once it has
been discovered, must be carried out very quickly. The design and synthesis of such
targeted compound libraries open this route.
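One common way in which a computer can propose a "diverse" selection of building blocks is greedy max-min picking: the next candidate chosen is always the one that lies farthest from everything already selected. The following Python sketch illustrates the idea. It is not the procedure of any particular program; the function name `maxmin_pick` and the descriptor values (two hypothetical, already scaled properties per candidate) are assumptions made for this example.

```python
from math import dist

def maxmin_pick(descriptors, k, start=0):
    """Greedy max-min selection of k mutually dissimilar candidates.

    descriptors: list of equally long tuples of scaled property values.
    Returns the indices of the picked candidates in picking order.
    """
    if k > len(descriptors):
        raise ValueError("cannot pick more candidates than are available")
    picked = [start]
    while len(picked) < k:
        remaining = (i for i in range(len(descriptors)) if i not in picked)
        # Choose the candidate whose nearest picked neighbor is farthest away.
        best = max(
            remaining,
            key=lambda i: min(dist(descriptors[i], descriptors[j]) for j in picked),
        )
        picked.append(best)
    return picked

# Hypothetical scaled descriptors, e.g., (lipophilicity, H-bond donor count).
candidates = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 0.9), (0.9, 0.1)]
selection = maxmin_pick(candidates, k=3)
print(selection)
```

Candidates 1 and 4 are skipped in this example because each lies close to a candidate that has already been chosen. Which descriptors are used, and how they are scaled, decides what counts as "diverse"; as the text stresses, this choice depends on the target.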
11.9 Nanomolar Ligands for G Protein-Coupled Receptors

Chemists at the company Chiron synthesized a library of trimeric N-substituted
oligoglycines (peptoids) by using the split-and-combine method (Fig. 11.5). In their
design of the nitrogen substituents the scientists had G protein-coupled receptors in
mind. These receptors are the targets of many neurotransmitters and hormones. In
the construction of their peptoids they combined at least one aromatic group and
a side chain with an H-bond donor in the form of a hydroxyl group (Fig. 11.5,
Groups A and O). Furthermore, a basic nitrogen atom is present in the molecules
with X = H. These groups match those also found in neurotransmitters and
hormones. For the remaining third substituent (Group D), they chose as diverse
a composition as possible. A peptoid library with ca. 5,000 di- and
tripeptoids was prepared from these groups.
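The size of the tripeptoid part of this library can be checked with simple arithmetic, using the group counts given in the caption of Fig. 11.5 (3 groups O, 4 groups A, 17 groups D, 3 end groups X). The sketch below assumes that every mixture contains each combination of one O, one A, and one D group exactly once; the variable names are chosen for this example only.

```python
from math import factorial

n_O, n_A, n_D = 3, 4, 17       # side-chain groups per class (Fig. 11.5)
n_end_groups = 3               # X groups at the N terminus

per_mixture = n_O * n_A * n_D              # compounds in one mixture
orderings = factorial(3)                   # permutations of O, D, and A
n_mixtures = orderings * n_end_groups      # mixtures that were tested
n_tripeptoids = per_mixture * n_mixtures   # tripeptoids in all mixtures

print(per_mixture, n_mixtures, n_tripeptoids)
```

Under this assumption each of the 18 mixtures holds 204 tripeptoids, 3,672 in total. The dipeptoids are not counted in this sketch; they account for the remainder of the ca. 5,000 compounds quoted in the text.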
Different mixtures were tested on the adrenergic receptors. The H-ODA-NH2
partial library was identified as the most active. It served as a starting point for the
stepwise deconvolution of the library. Partial libraries were resynthesized, first by
keeping the hydroxy side chain O constant, then the members of the diverse group D,
and finally the aromatic substituent A. In the end, 11.7 remained as a nanomolar
ligand (Fig. 11.6).
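The logic of such a stepwise deconvolution can be sketched in Python. The activity model below is purely hypothetical (one arbitrarily chosen "best" group per position, and a pool activity equal to the average over its members); it only serves to show that fixing one position per round needs far fewer syntheses than preparing every compound separately. Unlike the actual study, in which two D groups were carried forward, the sketch keeps a single best group per round.

```python
from itertools import product

def deconvolute(positions, pool_activity):
    """Fix one position per round, keeping the most active group.

    positions: dict mapping a position name to its candidate groups,
    listed in the order in which the positions are deconvoluted.
    pool_activity: function returning the activity of a partial library.
    """
    fixed = {name: list(groups) for name, groups in positions.items()}
    n_partial_libraries = 0
    for name, groups in positions.items():
        scores = {}
        for group in groups:
            trial = dict(fixed, **{name: [group]})   # this position held constant
            scores[group] = pool_activity(trial)
            n_partial_libraries += 1
        fixed[name] = [max(scores, key=scores.get)]
    hit = {name: groups[0] for name, groups in fixed.items()}
    return hit, n_partial_libraries

groups = {
    "O": [f"O{i}" for i in range(1, 4)],
    "D": [f"D{i}" for i in range(1, 18)],
    "A": [f"A{i}" for i in range(1, 5)],
}
BEST = {"O": "O2", "D": "D5", "A": "A3"}   # hypothetical, for illustration only

def member_activity(o, d, a):
    # Toy model: one activity unit for every position carrying its best group.
    return sum(g == BEST[p] for p, g in zip("ODA", (o, d, a)))

def pool_activity(pool):
    members = list(product(pool["O"], pool["D"], pool["A"]))
    return sum(member_activity(*m) for m in members) / len(members)

hit, n_partial_libraries = deconvolute(groups, pool_activity)
print(hit, n_partial_libraries)
```

In this toy model, 3 + 17 + 4 = 24 partial libraries are enough to single out one compound from the 204 members of the mixture. Real pools can behave less well, for example when several weak binders mask one strong binder.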
The same peptoid library was tested on another GPCR, the opiate receptor. In
this case the most active partial library H-ADO-NH2 was found in the first step. The
relevant deconvolution through resynthesis delivered 11.8 as a nanomolar ligand.
The molecule has a p-hydroxyphenylethyl moiety and a diphenylmethane group on
both ends of the tripeptoid. It is known from detailed studies on Met-enkephalin
11.9 that the amino acids tyrosine and phenylalanine are essential for the activity.
There are analogous groups for both moieties on the tripeptoid (Fig. 11.6).

11.10 More Potent than Captopril: A Hit from a Combinatorial Library of Substituted Pyrrolidines

The Affymax company prepared a library of ca. 500 differently substituted
pyrrolidines by 1,3-dipolar cycloaddition. In the first step, the resin was loaded
with protected amino acids (Gly, Ala, Leu, and Phe; Fig. 11.7). Then the
transformation to an imine was made with four different aromatic aldehydes. Cycloaddition
with five different alkenes led to five-membered-ring heterocycles. In the last step,
the pyrrolidines were N-substituted with three different thiols.
This last step was done in view of testing these ligands on the angiotensin-converting
enzyme (ACE, ▶ Sect. 25.4). Inhibitors of this enzyme contain a
functionalized proline residue at their C terminus. The iterative deconvolution of
the library afforded 11.10 as a potent ACE inhibitor (Fig. 11.7; Ki = 160 pM). It is
a distinctly stronger binder than the marketed product captopril and is among the
most potent thiol-containing ACE inhibitors.

Fig. 11.5 Peptoids are oligoglycines that are substituted at nitrogen. A library of di- and
tripeptoids was constructed according to the split-and-combine technique. Three X groups
were added to the N terminus. Three groups O with a hydroxy function, 4 groups A with an
aromatic ring, and 17 groups D with diverse groups were used as nitrogen side chains. Eighteen
mixtures (6 permutations of A, O, and D with 3 end groups) gave ca. 5,000 di- and tripeptoids.
The H-ODA-NH2 library showed activity on the α-adrenergic receptor. First, the hydroxy
groups O were deconvoluted. The compounds with p-hydroxyphenethyl groups were the most
active ones. In the next synthesis round, 17 partial libraries were composed with this O group
held constant, and defined groups were used from the diverse D group. Compounds with
a diphenyl or diphenyl ether group were particularly active. With these groups in the
D position, the work was continued. The division of the aromatic side chains A in the last
position led to eight individual compounds.

Fig. 11.6 The derivative 11.7 is the most potent compound from the H-ODA-NH2 library with
a Ki = 5 nM on the α-adrenergic receptor. Testing on the opiate receptor gave compound 11.8 as the
candidate with highest affinity (Ki = 6 nM) from the H-ADO-NH2 library after deconvolution.
Met-enkephalin 11.9 is a potent opiate receptor ligand. The relationship between the p-hydroxyphenyl
group in 11.8 and the tyrosine side chain in 11.9, and between a phenyl portion in the diphenylmethane
group of 11.8 and the benzyl group of phenylalanine in 11.9, is obvious. Tyr and Phe are essential
for the activity of Met-enkephalin.

Fig. 11.7 The amino acids Aa = Gly, Ala, Leu, or Phe are coupled to the support resin (a). Next,
they are transformed to imines with four different aromatic aldehydes (Ar-CHO; b), which react
with alkenes under 1,3-dipolar cycloaddition conditions to give pyrrolidines (c). In the last step the
free NH group on the heterocycle is treated with different thiol compounds (thio-COCl; d). The
library, built with the help of the split-and-combine technique, is cleaved from the polymer with release of
an acid function. Its ability to inhibit the angiotensin-converting enzyme was tested. By
resynthesis and renewed testing, the library was deconvoluted to the active compound. In doing
so, compound 11.10 was identified as a high-affinity inhibitor.

11.11 Parallel or Combinatorial, in Solution or on Solid Support?

Combinatorial chemistry on solid support has enabled the automated synthesis of
numerous molecules, but it also faces problems. The difficulties associated with
testing on resins or the deconvolution and resynthesis of libraries have already
been mentioned. Labeling is an elegant but laborious alternative. Another way to
avoid deconvolution of a library while still using the advantages of combinatorial
chemistry is parallel synthesis in spatially separated reaction vessels. It remains
clear along the entire reaction sequence which reactant and product is in each
vessel. A laborious deconvolution is omitted. At first this strategy seems to be
impractical. How should a thousand reaction components be reasonably
transformed in a thousand reaction flasks? For this purpose, the reaction flasks
should not be thought of in the classical organic chemistry sense. Rather,
miniaturized reaction “automats” are developed in which all reaction steps are
carried out in parallel. Alternatively, methods have been developed in which the
resin beads are filled into many small reaction capsules (e.g., so-called teabags or
“Kans™”). These are permeable to the solution phase so that dissolved compounds can pass, but the
beads are mechanically enclosed. Each capsule is fitted with a label that can be read
with a radio transmitter. All of the capsules are then placed in a classical round-
bottomed flask and the usual chemistry is carried out. The capsules can be mechan-
ically separated and brought into contact with different reagents. Which reaction
sequence is performed on which capsule is followed by the registration system with
the radio transmitter. In this way, one molecule can clearly be prepared by combi-
natorial principles per reaction capsule, practically as it is in parallel synthesis. The
single compounds are then available for testing.
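The bookkeeping behind such labeled capsules can be sketched as follows. The registration system is modeled here simply as a dictionary from a capsule label to its planned reaction sequence; the function names `plan_directed_sorting` and `flask_contents` and the one-letter building blocks are assumptions made for this illustration.

```python
from itertools import product

def plan_directed_sorting(building_blocks, steps):
    """Assign every labeled capsule its own reaction sequence."""
    sequences = product(building_blocks, repeat=steps)
    return {label: seq for label, seq in enumerate(sequences)}

def flask_contents(plan, step):
    """Group the capsule labels by the flask they enter in a given step."""
    flasks = {}
    for label, seq in plan.items():
        flasks.setdefault(seq[step], []).append(label)
    return flasks

plan = plan_directed_sorting("ABC", steps=3)
for step in range(3):
    sizes = {block: len(labels) for block, labels in flask_contents(plan, step).items()}
    print(step, sizes)
```

In this sketch, 27 capsules are distributed over only three flasks in every step, and each capsule nevertheless ends up with one known compound, because its label records which flask it passed through in each round.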
Synthesis on a solid support material has disadvantages compared to chemistry
in the solution phase. Usually transformations are slower and the analysis to follow
the reactions is considerably more laborious to carry out. Coupling to the solid
support requires a suitable linker. Such an anchoring group should be removed from
the library before testing as tracelessly as possible. Above all else, upon removal of
the linker (“traceless linkers”) no functional groups should remain in the library that
might unintentionally be part of the pharmacophore. The chemistry to attach
and remove the linker must be compatible with all of the other reactions in the
synthesis of the solid-supported library. This can lead to limitations in the usable
chemistry. In preparative chemistry, molecules are preferably constructed by using
a convergent synthesis strategy. For this, an approach is developed in which the
components of the final product are prepared in separate steps, each in parallel. In
the subsequent reaction steps, the previously prepared components are brought
together and coupled with one another to produce the final product. Such a strategy
is more efficient and leads to a higher yield than a linear synthetic route.
A convergent strategy, however, cannot be carried out by sequential construction
on a resin. Therefore, the tables have been turned for some syntheses. The prepared
libraries are not bound to the solid support, but rather the reagents with which they
are treated. The advantage of carrying out reactions on the solid support is retained.
Good mechanical separation of reaction components, working simply with large
excesses of reagent, and automated reactions belong to this technique. An advan-
tage is that it is now possible to carry out convergent syntheses. Even toxic reagents
can be used as their separation is ensured by their firm adhesion to a solid support.
The usual analytical methods that are typically applied for the solution phase can
also be used.
Some reactions, especially ring-closure reactions or condensations, are in
competition with intermolecular transformations. To avoid these, highly diluted
solution conditions are used. If a solid-supported reactant is used, the local con-
centration of the reactant will be reduced as it is fixed to the solid support and
spatially separated. Reactions that occur over a trapped reaction product can be
simplified if the trapping reagent is coupled to a solid support. Mechanical filtering
is enough to separate the trapped components. Similarly, the products can be
separated and purified by trapping them on a solid support. Acids and bases can be
separated for purification by treatment with an immobilized amine or a sulfonic
acid. In the meantime, metal-complexing groups or hydrophobic
adhesion groups are also used for the purification of combinatorially produced
compound libraries.
How will combinatorial chemistry develop further? The miniaturization
of reaction vessels and synthetic automats seems to be a seminal perspective. The
“lab-on-a-chip” concept is already intensively used for bioanalytical methods. Small
reaction volumes, integrated separation columns, miniaturized valves, and pumps
that are controlled by piezo elements are integrated on small chip cards. We can only
wait and see whether such serial reaction automats are the laboratories of the future.

11.12 The Protein Finds Its Own Optimal Ligand: Click Chemistry and Dynamic Combinatorial Chemistry

Could a protein simply produce its own best inhibitor itself? With the ideal
geometry, it should be able to form the optimal interactions directly in the binding
pocket of the target enzyme. Which chemical reactions might be best suited for
such a concept? It would have to be a reaction that can be conducted in aqueous
medium, is reliably enthalpically driven, is fast, and that gives complete turnover.
Such a reaction type, named “click chemistry,” was investigated in detail in the group of
Barry Sharpless in La Jolla, California, in recent years. Cycloadditions of unsaturated
compounds (1,3-dipolar cycloadditions, Diels–Alder reactions); nucleophilic
substitutions, particularly ring-opening reactions; non-aldol-like carbonyl reactions;
and additions to C─C multiple bonds fulfill these requirements. These
can be applied by using combinatorial principles. The 1,3-dipolar cycloaddition
(Huisgen reaction) can be used particularly well to build five-membered triazole
and tetrazole heterocycles (Fig. 11.8). 1,4-Disubstituted 1,2,3-triazoles can be
regiospecifically produced by the reaction of an azide and alkyne in the presence
of Cu(I) salts at room temperature. 1,5-Disubstituted triazoles are formed when
copper ions are excluded or other ions such as ruthenium are added. The reaction
runs in a broad pH range between 4 and 12. The reaction type can be extended to
tetrazoles. For this, nitriles are needed as dipolarophile reaction partners in the
presence of zinc ions.
The research group of Jean-Marie Lehn in Strasbourg chose another way. They
developed “dynamic combinatorial chemistry” through the spontaneous construction
of molecules from suitable starting materials and reversible reactions
(Fig. 11.9). All imaginable combinatorial products form from a mixture of different
building blocks. A dynamic exchange equilibrium is established between them. The
target receptor (e.g., a protein) is added to such an equilibrium system. This way
the mixture components with the best protein-binding characteristic have an advan-
tage, as the protein captures the best binders and shifts the equilibrium. It leads to
a self-perpetuating choice of the ligands that fit best into the binding pocket. In this
way the added protein practically seeks its own best inhibitor.
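How the protein shifts the exchange equilibrium can be made plausible with a strongly simplified numerical model. The sketch below is not taken from the work of Lehn's group. It assumes that all library members are equally stable when free in solution (so that their free concentrations are identical), that each member binds the protein 1:1 with its own association constant, and that all equilibria are fully established. The function name `dcc_distribution` and all numbers are hypothetical.

```python
def dcc_distribution(assoc_constants, c_total, p_total, iterations=200):
    """Total concentration (free plus bound) of each library member.

    assoc_constants: association constants K_i in 1/M.
    c_total: summed concentration of all library members in M.
    p_total: total protein concentration in M.
    """
    n = len(assoc_constants)
    k_sum = sum(assoc_constants)

    def excess(free):
        # Mass balance of the library: n free members plus all bound ones.
        bound = p_total * free * k_sum / (1.0 + free * k_sum)
        return n * free + bound - c_total

    lo, hi = 0.0, c_total / n          # the free concentration lies in this range
    for _ in range(iterations):        # bisection on a monotonic function
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    free = 0.5 * (lo + hi)
    p_free = p_total / (1.0 + free * k_sum)
    return [free + k * free * p_free for k in assoc_constants]

constants = [1e4, 1e6, 1e8]            # hypothetical K_i in 1/M
c_total, p_total = 3e-5, 1e-5          # hypothetical concentrations in M
totals = dcc_distribution(constants, c_total, p_total)
amplification = [t / (c_total / len(constants)) for t in totals]
print([round(a, 2) for a in amplification])
```

Without protein, each member would make up one third of the mixture. In the model, the best binder is enriched at the expense of the weaker ones, which is the amplification effect described in the text.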

Fig. 11.8 The 1,3-dipolar cycloaddition (Huisgen reaction) is a typical click chemistry reaction
and leads to five-membered triazole and tetrazole heterocycles. In the presence of Cu(I) salts,
azides and alkynes react regiospecifically at room temperature to form 1,4-disubstituted 1,2,3-
triazoles; in the absence of copper but with ruthenium ions, 1,5-disubstituted products are formed.
If a nitrile is used instead of the alkyne component, and the reaction is catalyzed by zinc ions, 1,5-
disubstituted tetrazoles are obtained as products.

Even click chemistry can be directed toward such a self-selecting synthetic
process. Acetylcholinesterase (AChE, ▶ Sect. 23.7) was added as a target protein to
a mixture of potential azides and alkynes as starting materials for the Huisgen
reaction partners. From the multiplicity of imaginable reaction products
a femtomolar inhibitor was selected! When decorated on one end by
a phenylphenanthridine group, and a tacrine head group suited for the shallow
entrance, the azide and alkyne react in the middle of the hose-shaped binding
pocket to afford a triazole (Fig. 11.10). Very few products form. They are
predetermined by the possible position of the starting compounds. The crystal
structures could be determined with two potent products. The newly produced
triazole ring forms an H-bond that is mediated by a water molecule with Ser203
in the catalytic center of the protein. The triazole ring does not form preferentially
as the entropically favored product of a simple linker; rather, the polar interaction with
Ser203 appears to be the driving force.
In a similar way, carbonic anhydrase II (▶ Sect. 25.7) was used as a target protein
for the selection of suitable reactants in a Huisgen reaction. In this case the alkyne
component was initially coupled to the catalytic zinc ion via a benzylsulfonamide
anchor. Later, structurally fitting azide components could be brought to react in the
funnel-shaped binding pocket. Nanomolar inhibitors were produced. Analogously,
success has been achieved with HIV protease (▶ Sect. 24.3) and ACh-binding
protein (▶ Sect. 30.5). Goal-oriented combinatorial libraries are used as starting
materials for these reactions. Time will tell what significance this in situ inhibitor
synthesis will gain for practical drug research.

Fig. 11.9 In dynamic combinatorial chemistry, a mixture of different library components is
provided that interact under equilibrium conditions. Numerous products can form in the equilibria.
They represent potential “keys” that can fit in the “lock” of the target protein. The added receptor
protein binds to the best-fitting ligands from the compound mixture and shifts the equilibrium in
the direction of increased formation of this product. It is then removed from the equilibrium by the
protein binding (according to O. Ramström and J.-M. Lehn).

Fig. 11.10 The library produced from alkynes bearing an AChE-suitable tacrine side chain
and AChE custom-made phenylphenanthridine-substituted azides. In the presence of
acetylcholinesterase (AChE) the products 11.11 (green) and 11.12 (gray) are formed, which proved to
be potent enzyme inhibitors. They differ in the topology on the five-membered ring. Crystal
structure determinations were accomplished with both inhibitors. The surface around the
protein is shown with the bound ligand 11.12. Both ligands occupy the tube-shaped binding
pocket of AChE. Compound 11.11 binds via a water molecule (red sphere) to the hydroxy function
of Ser203.

11.13 Synopsis

• As a consequence of the tremendous acceleration of automated compound
screening for biological activity, the number of compounds required for testing
increased significantly and stimulated the development of automated parallel
synthesis and combinatorial chemistry.
• Nature produces an enormous chemical multiplicity by combining either amino
acids or nucleotides to give polymers that fold into 3D arrangements.
• The chemical space of organic molecules with up to 25 non-hydrogen atoms
that satisfy the requirements of drug-likeness has been estimated to host about
10²⁷ imaginable candidates.
• Chemical reactions on solid support, usually organic polymer resins such as
cross-linked polystyrene, follow a stepwise synthesis strategy to build up molecules
sequentially on the solid phase. Complete yields and easy purification can
be achieved, and product release from the solid phase is accomplished as the
final step.
• Sophisticated synthetic strategies have been developed to generate multiple
products on the solid support from reagent mixtures in a limited number of
reaction steps. Elaborate protocols, including chemical labeling techniques, have
been established to keep track of product formation.
• The biological activity testing of compound libraries generated by combinatorial
chemistry on a solid support requires sophisticated protocols to detach and
deconvolute the library.
• The design and selection of building blocks used for library synthesis are
purpose-oriented and consider the properties of the target(s) at which the library
is subsequently screened.
• Multiple protocols have been developed for combinatorial chemistry and
parallel synthesis. Either the library substrates are immobilized on the solid phase,
or the reagents are immobilized and the library is developed in the solution
phase.
• The target protein can be added to a mixture of reagents in click chemistry and
dynamic combinatorial chemistry. From a large variety of possible reaction
products, the protein binding pocket selects the best binder as a potent inhibitor
or antagonist of the target protein.

Bibliography

General Literature

Balkenhohl F, von dem Bussche-Hünnefeld C, Lansky A, Zechel C (1996) Combinatorial
synthesis of small organic molecules. Angew Chem Int Ed Engl 35:2288–2337
Bannwarth W, Hinzen B (2006) Combinatorial chemistry. From theory to application. In:
Mannhold R, Kubinyi H (eds) Methods and principles in medicinal chemistry, 26th edn.
Wiley-VCH, Weinheim
Baum RM (1994) Combinatorial approaches provide fresh leads for medicinal chemistry. Chem
Eng News 72:20–26
Beck-Sickinger AG, Weber P (2002) Combinatorial strategies in biology and chemistry. Wiley,
Weinheim
Bunin BA (1998) The combinatorial index. Academic, San Diego
Gallop MA, Barrett RW, Dower WJ, Fodor SPA, Gordon EM (1994) Applications of combina-
torial technologies to drug discovery. 1. Background and peptide combinatorial libraries.
J Med Chem 37:1233–1251
Gordon EM, Barrett RW, Dower WJ, Fodor SPA, Gallop MA (1994) Applications of combina-
torial technologies to drug discovery. 2. Combinatorial organic synthesis, library screening
strategies, and future directions. J Med Chem 37:1385–1401
Jung G (1999) Combinatorial chemistry. Wiley-VCH, Weinheim
Jung G, Beck-Sickinger AG (1992) Multiple peptide synthesis methods and their applications.
New synthetic methods. Angew Chem Int Ed Engl 31:367–383
Kay BK (1994) Biologically displayed random peptides as reagents in mapping protein–protein
interactions. Persp Drug Discov Design 2:251–268
Kolb HC, Finn MG, Sharpless KB (2001) Click chemistry: diverse chemical function from
a few good reactions. Angew Chem Int Ed Engl 40:2004–2021
Ley SV, Baxendale IR (2002) New tools and concepts for modern organic synthesis. Nat Rev Drug
Discov 1:573–586
Madden D, Krchnak V, Lebl M (1994) Synthetic combinatorial libraries: views on techniques and
their application. Persp Drug Discov Design 2:269–285
Moos WH, Green GD, Pavia MR (1993) Recent advances in the generation of molecular diversity.
Annu Rep Med Chem 28:315–324
Nicolaou KC, Hanko R, Hartwig W (2002) Handbook of combinatorial chemistry. Drugs, cata-
lysts, materials. Wiley-VCH, Weinheim
Ramström O, Lehn J-M (2002) Drug discovery by dynamic combinatorial libraries. Nat Rev Drug
Discov 1:27–36
Seneci P (2000) Solid-phase synthesis and combinatorial technologies. Wiley-Interscience,
New York

Special Literature
Bourne Y, Kolb HC, Radic Z, Sharpless KB, Taylor P, Marchot P (2004) Freeze-frame inhibitor
captures acetylcholinesterase in a unique conformation. Proc Natl Acad Sci USA 101:1449–1454
Carell T, Wintner EA, Sutherland AJ, Rebek J, Dunayevskiy YM, Vouros P (1995) New promise
in combinatorial chemistry: synthesis, characterization, and screening of small-molecule
libraries in solution. Chem Biol 2:171–183
Dooley CT, Chung NN, Schiller PW, Houghten RA (1993) Acetalins: opioid receptor antagonists
determined through the use of synthetic peptide combinatorial libraries. Proc Natl Acad Sci
USA 90:10811–10815
Fink T, Reymond J-L (2007) Virtual exploration of the chemical universe up to 11 atoms of C, N,
O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new
ring systems, stereochemistry, physicochemical properties, compound classes, and drug dis-
covery. J Chem Inf Model 47:342–353
Geysen HM, Meloen R, Barteling S (1984) Use of peptide synthesis to probe viral antigens for
epitopes to a resolution of a single amino acid. Proc Natl Acad Sci USA 81:3998–4002
Murphy MM, Schullek JR, Gordon EM, Gallop MA (1995) Combinatorial organic synthesis of
highly functionalized pyrrolidines: identification of a potent angiotensin converting enzyme
inhibitor from a mercaptoacyl proline library. J Am Chem Soc 117:7029–7030
Zuckermann RN, Martin EJ, Spellmeyer DC et al (1994) Discovery of nanomolar ligands for
7-transmembrane G-protein-coupled receptors from a diverse N-(substituted)glycine peptoid
library. J Med Chem 37:2678–2685
12 Gene Technology in Drug Research

Engineers and writers have predicted many developments in science and technol-
ogy. In addition to other sophisticated machines, Leonardo da Vinci described the
principle of the helicopter. In the early 1820s, Charles Babbage designed an
automatic calculator long ahead of its time. Over 160 years later, the mechanical
precursor of a programmable computer was in fact built, and it worked! Jules Verne
described submarines and a journey to the moon, and Hans Dominik described
obtaining energy by splitting the atom. All of these visions have become reality.
Only a single application was preconceived for gene technology, the most seminal
invention of our time: the cloning of two genetically identical individuals in Aldous
Huxley’s Brave New World. It remains a hope that researchers will respect ethical
boundaries, and despite the feasibility, never actually use Huxley’s idea.
With the methods of gene technology it is possible to bring new genes into the cell,
multiply them, and exchange or remove them. If they are removed or changed, the cell
can no longer produce the original protein derived from that gene. By introducing a new
gene and using a clever choice of method, the cell manufactures a foreign product,
either a purposefully modified protein, or an entirely new one. For many diseases, the
molecular cause is known to be the absence of a protein, or a genetically caused
mutation in a protein. Only a few generally known examples are mentioned here:
• Diabetes as a result of insulin deficiency,
• Particular hereditary forms of cancer (e.g., familial colon cancer, malignant
melanoma),
• Huntington's chorea, a chronic form of brain atrophy,
• Sickle cell anemia, a genetic disease producing malformed red blood cells
(Sect. 12.14), and
• Bleeding disorders that are caused by the absence of particular coagulation
factors (see Sect. 12.14).
The possibility of purposefully producing arbitrary proteins has yielded the
following main applications of gene technology.
• The identification of genes and proteins that could play a role in the treatment of
a disease,
• The development of animal models to test a therapeutic principle,

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_12, 233


# Springer-Verlag Berlin Heidelberg 2013
• The production of proteins for therapies in which a particular protein is missing,
• The manufacture of monoclonal antibodies and vaccines,
• The manufacture of proteins for molecular test systems, and the determination of
the 3D structures of enzymes and other soluble proteins,
• The generation of proteins in which one or more amino acids have been exchanged
by targeted mutagenesis, for the elucidation of the mode of action of
enzymes and for the characterization of receptor binding sites,
• Somatic individual gene therapy for specific patients.
Other application possibilities, for example, manipulation of the human
germline, or genetic changes in crops to achieve herbicide resistance, or to prolong
the shelf life of fruits, are only briefly mentioned here.

12.1 The History and Basics of Gene Technology

The foundations of gene technology were first established in the middle of the
twentieth century. The starting signal came in 1953, when James Watson
and Francis Crick elucidated the three-dimensional structure of the hereditary
substance of all living things, deoxyribonucleic acid (DNA). The structure gave
immediate indications of the mechanism of hereditary transfer
and of the genetic code for the biosynthesis of proteins. A few years later,
Werner Arber discovered restriction enzymes, which attack a very specific
position on the double helix and cleave the DNA sequence-specifically. What was
initially seen as a curiosity proved to be an exceedingly important discovery for
gene technology. With these enzymes it is possible to cleave DNA selectively and to
introduce new fragments. Next, the merging of new information with the original
DNA, the recombination of the genetic constitution, is accomplished with ligases
from special viruses called bacteriophages. The techniques for DNA sequencing
have also made decisive progress. Soon the amino acid sequence of
a protein was no longer determined directly, but rather deduced from the analysis of
the corresponding DNA. Today, sequencing takes the detour via cDNA,
which is complementary to the RNA (Sect. 12.6).
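The sequence-specific cleavage described above can be illustrated with a minimal sketch. It assumes the recognition site GAATTC of the restriction enzyme EcoRI, which cuts after the first G; the choice of enzyme, the function name digest, and the example sequence are illustrative assumptions that do not come from the text, and only one strand of the double helix is modeled.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a single DNA strand at every occurrence of a recognition site.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for the assumed EcoRI-like cut G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


# Hypothetical sequence containing two recognition sites
example = "AAGAATTCTTGAATTCGG"
for fragment in digest(example):
    print(fragment)
```

Joining the fragments again restores the original strand, which is, in a very simplified form, what a ligase accomplishes when a foreign fragment with matching ends is inserted between two cut sites.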
In 1973 Stanley Cohen and Herbert Boyer managed to recombine the genome of
a bacterium for the first time (Fig. 12.1). Events then followed in quick succession:
two years later the bacterial strain Escherichia coli K12, which is still used today,
was developed. A part of its genetic constitution is missing so that it is only viable
under laboratory conditions. This bacterium can be genetically manipulated at will
without the worry that it could be harmful. The British scientists H. Williams-Smith
and E. S. Anderson independently carried out self-experiments in which
they orally ingested Escherichia coli K12. They proved that these
bacteria only survive in the gastrointestinal tract for a short period of time, and that the K12
gene, which confers antibiotic resistance for the selection of transformed cells,
cannot be transferred to the normal Escherichia coli that is found in the intestinal
flora. Experts discussed the possible dangers of gene experiments at a conference in
Asilomar, California, and defined different risk and safety classes. In 1976 the
company Genentech was established. Its founder, Herbert Boyer, had to borrow
US $500 as start-up capital! When the company was first traded on the
stock market in 1980, he became a millionaire within a few minutes because of the value of
his stock. As early as 1982 Genentech introduced the first medication manufactured
by using gene technology to the market: human insulin (Humulin®).

[Figure 12.1 (schematic): bacterial cell with DNA and plasmid vector; the plasmid is cut into a linear, double-stranded DNA; the target DNA is added; ligases fuse the vector and the target DNA to give recombinant DNA; the plasmid is introduced into the cell, giving a transformed cell]

Fig. 12.1 The principle of gene technological recombination of hereditary information. Bacteria
often contain genetic material in addition to their “chromosome” in the form of ring-shaped
plasmids; these are used in gene technology as vectors to introduce foreign genes. Plasmids
are removed from the cell and cut sequence-specifically with so-called restriction enzymes,
which come from bacteria. The target DNA that carries the desired genes, which has typically
also been treated with the same restriction enzyme, is bound in vitro to the overlapping single-stranded
DNA ends. The DNA ends are coupled with the enzyme DNA ligase, and the modified, recombinant
plasmid is brought into the bacterial cell. Plasmid vectors that are used in gene technology
carry, in addition to the DNA segment that is necessary for replication, information that
allows the recognition and selection of the transformed cells (usually an antibiotic-resistance
gene). In the presence of the selecting agent, only plasmid-containing cells grow.

In 1983 Kary Mullis made a decisive contribution to gene technology when
he developed the polymerase chain reaction (PCR) while working at the
California company Cetus, which was founded in 1971. Heating melts double-stranded
DNA into its single strands. The four DNA nucleotides are then added, as
are two short single-stranded DNA pieces, the so-called primers, which are
complementary to the regions at the two ends of the DNA segment to be copied.
A polymerase can then be used to
synthesize new DNA in a test tube: starting from the primers,
a new double strand is formed (Fig. 12.2). A heat-stable DNA polymerase from the
bacterium Thermus aquaticus, which lives in the hot springs of Yellowstone
National Park, is used for the DNA synthesis. Each repetition of this cycle doubles the
amount of DNA. Within a few hours, billions of DNA molecules
can be manufactured from a single starting molecule. This amount is enough for
sequencing the relevant DNA segment.
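The doubling per cycle can be made concrete with a short calculation. The following is a minimal sketch under idealized assumptions: the function name pcr_copies and the efficiency parameter are illustrative and not part of the text, and an efficiency of 1.0 means that every molecule is copied in every cycle, which real reactions only approximate.

```python
def pcr_copies(cycles, start_molecules=1, efficiency=1.0):
    """Idealized number of DNA molecules after a number of PCR cycles.

    With efficiency=1.0 the amount doubles in every cycle, so the
    result is start_molecules * 2**cycles.
    """
    return start_molecules * (1 + efficiency) ** cycles


for n in (10, 20, 30):
    print(f"{n} cycles: {pcr_copies(n):,.0f} molecules")
```

Lowering the assumed efficiency below 1.0 shows how quickly the yield falls behind the ideal exponential growth when not every strand is copied in each cycle.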

[Figure 12.2 (schematic): double-stranded DNA molecule; heating at 95 °C gives two single strands; addition of two primers, excess nucleotides, and heat-stable polymerase gives DNA + copy (arrows: direction of the DNA synthesis); heating at 95 °C gives four single strands; repeating the PCR cycle gives DNA + 3 copies; multiple repetitions of the PCR cycle]

Fig. 12.2 The polymerase chain reaction allows practically unlimited identical copies of a DNA
molecule to be manufactured. For this, the DNA is heated to separate the double-stranded DNA into
complementary single strands. Synthetic oligonucleotides of approximately 20 bases, so-called primers,
which are complementary to these DNA strands, hybridize with the corresponding strand. One
primer must bind at the end of each of the two DNA strands. The primers set the boundaries of the
amplified DNA. Furthermore, an excess of primer must be used because in each cycle one primer
pair is needed for each DNA double strand, and the primers themselves are not produced in later cycles. The
primers are necessary to start the new synthesis of DNA in the presence of the DNA
polymerase and an excess of the four different nucleotides. On the second strand, synthesis runs in the reverse direction
(dashed arrow) because the DNA strands run in opposite directions and the
polymerase works in one direction only. The newly synthesized DNA segment can be a few hundred to several thousand base
pairs long. The result is two identical double-stranded DNA molecules. After heating, single
strands are obtained and the above-described procedure is repeated. Because the DNA polymerase
is heat stable, it does not need to be repeatedly added. Each repeat of the above-described steps
doubles the number of DNA molecules, which therefore grows exponentially. Ten cycles lead to
about 1,000 DNA molecules, 20 to about a million, and 30 to about a billion. In this way, a single DNA
molecule can be multiplied into a quantity that is biochemically analyzable.

PCR methods are applied in many ways. The entire genetic information of an individual
can be derived from a single DNA molecule. In medical diagnostics, this
serves to detect genetic disorders, cancer, infectious diseases, and risk factors.
PCR methods are also used to establish a genetic fingerprint in paternity tests and in
forensic science.
New genetic information can be brought not only into bacterial cells, but also
into yeasts, virus-infected insect cells, and even mammalian cells. As a first
approximation, though, the more complex the organism is, from
bacteria all the way to mammalian cells, the more difficult it is to produce proteins
in these cells. On the other hand, insect and mammalian cells have the advantage
that they produce not only small proteins but also more complex ones (e.g.,
glycosylated proteins) in a functional form. In many cases, one therefore has to
rely on such organisms.

12.2 Gene Technology: A Key Technology in Drug Design

The 1970s and 1980s were the grand age of receptor-binding tests with membrane
preparations. Radioactively labeled ligands were used to determine the specific
binding of new substances. The most important receptors for hormones and
neurotransmitters were known and in some cases, the difference between pre-
and postsynaptic receptors as well. The different subtypes and their amino acid
sequences were not known. Correspondingly, the results of the investigations
were inaccurate.
Gene-technology methods allow the preparation of homogeneous recombinant
proteins in practically unlimited quantities. They play an important role at the very
first step of drug design: the identification of a target protein. Progress with the
methodology led to the discovery of new receptors with partially unknown function
or specificity. The next step is the testing of the therapeutic concept
on genetically altered animals. Another important contribution is the preparation
of proteins for molecular test systems and the isolation of adequate material for
the elucidation of the 3D protein structure (▶ Chap. 13, “Experimental Methods of
Structure Determination”). With the possible exception of a very few proteins that
can be isolated from blood or other natural sources, the production of large
quantities of protein is dependent on gene technology. Nowadays proteins are
purified from animal or human blood only reluctantly because the risk of
transmitting viruses or infections is deemed to be too high.
Gene technology offers the possibility to selectively produce structural variants
of proteins. The generation of point mutations (site-directed mutagenesis) allows
particular properties in proteins to be improved, and the binding and catalytic
properties of enzymes to be purposefully changed. Membrane-bound receptors
can be probed position by position to establish which amino acids are responsible
for the maintenance of the 3D structure, the adoption of a particular conformation,
or are of critical importance with respect to binding of a ligand. Three-dimensional
structural models of receptors can be generated in this way, or their relevance can
be appraised.
In many cases, it has also proven worthwhile to introduce point mutations that
change the surface properties of proteins and help to elucidate the 3D structure of
proteins. Sometimes the charge on individual amino acids must be changed for the
sake of the protein crystallization. In the case of proteins in which a part of their
sequence is anchored in the membrane, the membrane anchor, which would impede
crystallization, is removed before the experiment. With soluble receptors it has
proven worthwhile to remove individual domains, crystallize them, and determine
their structure. Of course such modified proteins must still fulfill their particular
functions, that is, ligand binding or DNA docking. Once the difficult crystallization
step is accomplished, the actual structural elucidation is nowadays usually only
a matter of a few weeks (▶ Chap. 13, “Experimental Methods of
Structure Determination”).
Considering the contributions to humanity that come from all of this
progress, the question inevitably arises: why are such broad segments of society
so afraid of gene technology? It takes a little effort to understand these prejudices.
With the use of gene technology, almost everything that is theoretically imaginable
is possible in the field of genetics. The trust that people have in science is, however,
not as unshakable as it was before the atom bomb. Now, when the chances
significantly outweigh the risks, the sins of our forefathers have come back to haunt us.
Scientists have all too often underestimated possible risks in the past and put their
ethical concerns on hold. Scientists have still not managed to assuage public fears.
We must take these fears seriously and build new trust by behaving responsibly.

12.3 Genome Projects Decipher Biological Constructions

The entire human genome is organized on 23 chromosomes. In 1990 the Human
Genome Organization (HUGO), equipped with a budget of US $3 billion, started
with the then exceedingly ambitious goal of sequencing the entire human genetic
code within 15 years. By the end of 1993, the first annotated genomic
maps became available, which were later refined. By 2001 the work had advanced so far
that the genome was published in parallel in Science and Nature by two
consortia.
The two competing consortia followed different strategies. The publicly funded
international consortium chose a stepwise approach: the genome was digested into
progressively smaller pieces, and their sequences were elucidated systematically
for the complete genomic analysis. In humans, this means that in
addition to the 5% of DNA that corresponds to genes, the other 95% of the DNA
was also sequenced. Because its function was unknown, it was classified with the somewhat
derogatory term “junk DNA.” Today it is known that these areas take on important
tasks in the regulation of gene expression (Sect. 12.7). The second strategy, which
was pursued by the privately financed consortium, made use of the so-called shotgun
method. For this, a longer DNA strand was amplified, and then cleaved into many
arbitrary small segments. After these segments were sequenced, the sequences were
reassembled into the original long DNA strand by using a powerful computer program.
This can only work, of course, when the sequences of the cleaved segments
display adequate overlap. This technique proved to be significantly faster than the
usual systematic sequencing methods. Above all, it benefited from the development
of faster and faster sequencing machines and powerful bioinformatic programs. In the end
it was no disadvantage that, because of the high redundancy of the method,
the genome had to be sequenced multiple times with the shotgun method. Interestingly,
the shotgun method was also used at the end by the international consortium
that followed the systematic approach to elucidate local sequence areas. Because the
initial intent of the private enterprise was to patent the sequenced genome, the
competition between the two initiatives was great. In March 2000, the American
President Bill Clinton declared the human genome to be not patentable, and spoke in favor of
its use by everyone for the common good.
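The reconstruction step of the shotgun method, in which overlapping fragments are merged back into one long strand, can be sketched with a toy example. This is a simple greedy overlap merge chosen for illustration; it is not the program that was actually used for the genome projects, and the function names, the minimum overlap of three bases, and the fragments are assumptions. Sequencing errors, repeats, and the second DNA strand are ignored.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments that are fully contained in another fragment
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no adequate overlap left, assembly stays in pieces
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical reads taken from one short strand, given in arbitrary order
reads = ["TTAGCCTAGGA", "ATGGCGTACG", "GTACGTTAGC"]
print(greedy_assemble(reads))
```

If two reads share no adequate overlap, the function returns several separate pieces, which mirrors the remark above that the method only works when the cleaved segments overlap sufficiently.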
How did it come about that a competing private initiative started to sequence the
genome? In spring 1995 Craig Venter and his group sequenced the entire genome of
the bacterium Haemophilus influenzae by using the shotgun method. The enormous
number of 1,830,121 base pairs that code for 1,749 genes was sequenced. The
complete genomes of individual viruses were already known, but this was the first
decoding of the genetic information of a free-living organism. The subsequent
decoding of the sequence of 580,067 base pairs of the Mycoplasma genitalium
genome by Venter’s wife, Claire Fraser, took only four months.
Venter and his group worked with the shotgun method on the entire genome, the
so-called “whole-genome shotgun sequencing.” The statistical approach that was
followed by Venter initially seemed so unusual and utopian that his application for
a research grant from the American National Institutes of Health (NIH) was
rejected. This brought about the founding of The Institute for Genomic Research
(TIGR) and the Celera Genomics company. There, Venter could pursue his
research according to his ideas and plans. Finally, the success proved the feasibility
of the proposed strategy.
Whose genome was actually sequenced? In both initiatives the DNA of multiple
individuals was mixed and the individual differences were deliberately averaged
out. In this way the “consensus sequence” of the human genome was determined.
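How a consensus sequence averages out individual differences can be shown with a minimal sketch. It assumes that the sequences are already aligned and of equal length, and it simply takes the most frequent base at each position; the function name and the three short example sequences are hypothetical and only serve as an illustration.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Most frequent base at each position of equally long, aligned sequences.

    In case of a tie, the base that occurs first in the input wins.
    """
    if len({len(s) for s in aligned_sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_sequences)
    )


# Hypothetical aligned segments from three individuals
individuals = ["ATGCCGTA", "ATGACGTA", "ATGCCGTT"]
print(consensus(individuals))
```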
But it did not stop with the human genome. The complete genome sequences of baker’s
yeast Saccharomyces cerevisiae, the common thale cress Arabidopsis thaliana,
the rice plant Oryza sativa, the roundworm Caenorhabditis elegans, the fruit fly
Drosophila melanogaster, the chimpanzee Pan troglodytes, the mouse Mus
musculus, and many other organisms (Table 12.1) have been determined. In the
meantime new ones emerge weekly. This raises new questions: how should this
plethora of information be managed? How can the genetic information be trans-
lated into useful knowledge? The field of bioinformatics has been challenged.
Computer programs for the intelligent comparison of sequences and the analysis of
metabolic pathways and signaling cascades already exist. New initiatives were
founded that have the goal of determining the spatial structure of all or at least
many sequences. The structural space of all real, naturally occurring proteins is
filling slowly. The crystal structures of all representatives of some protein families
of the human genome have been determined. Therefore it is only a question of time
until we can lay spatial blueprints next to the catalogue of all sequences in our
genome.

Table 12.1 Examples of the sequenced genomes of different organisms

Organism                                        Genome size (a)    Genes
HI virus                                        9.2 × 10^3 (b)     9
Phage λ                                         4.85 × 10^4        70
Intestinal bacterium, Escherichia coli          4.6 × 10^6         4,800
Baker’s yeast, Saccharomyces cerevisiae         2 × 10^7           6,275
Roundworm, Caenorhabditis elegans               8 × 10^7           19,000
Thale cress, Arabidopsis thaliana               1 × 10^8           25,500
Fruit fly, Drosophila melanogaster              2 × 10^8           13,600
Green pufferfish, Tetraodon nigroviridis        3.85 × 10^8
Human, Homo sapiens                             3.2 × 10^9         25,000
Common newt, Triturus vulgaris                  2.5 × 10^10
Ethiopian lungfish, Protopterus aethiopicus     1.3 × 10^11
Amoeba, Amoeba dubia                            6.70 × 10^11
(a) Number of base pairs
(b) Single-stranded RNA

12.4 What Is Contained in the Biological Space of the Human Proteome?

After the human genome was sequenced, the exciting question arose as to which
gene products all of these DNA sequences code for. First it must be remarked
that the genome is not static; it is constantly changing. It is only in this way that the
genetic variations can occur that make up the diversity of all creatures. In the course
of evolution, the genetic constitution has expanded. Simple single-cell organisms
without cell nuclei (prokaryotes) have a circular genome that consists almost entirely of coding
genes. Single-celled organisms with a cell nucleus (eukaryotes), such as yeast, have
a larger genome, of which about 20% represents coding genes. Multicellular
organisms such as humans have a genome that is 200-times larger than that of
yeast (Table 12.1). The number of coding genes, however, is not correspondingly larger. There are
even organisms such as the amoeba that have a genome that is 200-times larger than
that of humans. Even the minuscule water flea numerically overshadows us with its
31,000 genes. So the alleged masterpiece of creation does not necessarily also have
the largest genome. Obviously only a small number of additional DNA sequences
have accrued during the course of evolution that in fact code for additional gene
products. Many genes in higher organisms are similar to those in simpler species. If
the number of coding genes has hardly grown from the single-cell organisms to
humans, and even the gene products that are coded for are similar, what is the
explanation for the massive increase in complexity of the genome in higher-
developed organisms? The answer is not in the diversity of the needed gene
products, but much more in the finely tuned regulation of gene expression
(Sect. 12.13). In higher organisms, it is of decisive importance where and at what
time particular genes are expressed and gene products are synthesized. The 95% of
human DNA that does not code for proteins contains numerous sequences and
signals that control this regulation. Therefore the total number of genes in higher-
developed creatures does not seem to increase, but rather the gene density
decreases. On average, 12 genes per one million base pairs are found in the
human genome, whereas this number is 118 in the fruit fly, 197 in the roundworm,
and 221 in the common thale cress. Furthermore, the genes in the human genome are
widely scattered. It seems that it is not the number of genes but rather how they are used
and how their activation is regulated that is decisive for the developmental state of
the organism. It must also be considered that multicellular organisms need
a great deal of cell differentiation into different organs. These processes must be
reliably regulated and controlled. Moreover, higher organisms achieve a much
larger diversity in their protein composition by so-called alternative splicing.
Posttranslational modification after the biosynthesis also plays a role. This is
observed to a much smaller extent in, for instance, prokaryotes. The splicing
process cuts the portions that do not code for proteins out of the RNA after the
DNA has been transcribed into RNA. During alternative splicing, it is decided what
is cut out and what is retained and translated. In this way, one DNA sequence can code for
multiple different proteins.
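How one DNA sequence can give rise to several products through alternative splicing can be sketched by enumerating which segments are kept. In this minimal illustration the non-coding portions are assumed to be removed already, and the gene, its four coding segments (exons), and the choice of which of them may be skipped are hypothetical; in the cell this choice is tightly regulated and not every combination occurs.

```python
from itertools import combinations


def splice_variants(exons, optional):
    """All transcripts obtained by keeping or skipping the optional exons.

    exons    -- list of (name, sequence) tuples in their genomic order
    optional -- set of exon names that may be skipped
    """
    skippable = [name for name, _ in exons if name in optional]
    variants = {}
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            kept = [(name, seq) for name, seq in exons if name not in skipped]
            label = "-".join(name for name, _ in kept)
            variants[label] = "".join(seq for _, seq in kept)
    return variants


# Hypothetical gene with four exons, two of which can be skipped
gene = [("E1", "AUGGCU"), ("E2", "GAAUUC"), ("E3", "CCGGGA"), ("E4", "UAA")]
for label, mrna in splice_variants(gene, optional={"E2", "E3"}).items():
    print(label, mrna)
```

With two optional exons, four different transcripts result from a single gene, which shows how the number of proteins can exceed the number of genes.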
One of the largest genomes found to date in a single-celled parasite belongs to the
pathogenic protozoan Trichomonas vaginalis. It consists of 160 million base pairs.
This pathogen is usually transmitted in humans by sexual intercourse and causes
urinary tract infections. Its enormous genome goes along with a disproportionately
large cell. This could create an advantage for the pathogen because its large
surface area adheres to the vaginal mucosa better. Furthermore, the immune system
has trouble attacking and destroying such an oversized parasite. The genome of the soil
bacterium Sorangium cellulosum, with 13 million bases and 10,000 genes, is four times
as large as the average genome of other bacteria. This might have something to do
with the fact that this soil bacterium is able to carry out special tasks that make its
therapeutic use interesting. It is a versatile producer of complex natural products
such as the epothilones, which are potent chemotherapeutics that have great potential
in the treatment of cancer.
According to an analysis carried out in 2007, the human genome encompasses
3.25 billion bases. It contains around 25,000 genes, a few thousand of which are
recognized as RNA genes (even today the exact number cannot be given because
only 92% has been fully sequenced). The earlier textbook knowledge that one gene
product is behind each DNA sequence must be expanded. It must not be
overlooked that our genome contains many thousands of genes that are transcribed
into non-coding RNA segments. The resulting RNA molecules accomplish important
functions in our bodies. The large group of tRNAs that serve as adapter molecules for
the reading and translation of base triplets into the correct amino
acid sequence deserves special mention. Furthermore it has been shown that the
ribosome itself, which is the molecular machinery for protein synthesis, consists
largely of RNA. The spliceosome, the complex machinery for the removal of
non-coding segments, contains RNA molecules, so-called snRNAs.
There are even more small RNA molecules (snoRNAs) that are responsible for the
processing and modification of other RNA molecules.
It is now known that over 21,500 genes in our genome are translated into
proteins. It is not known, however, what functions all of these proteins fulfill.
Bioinformatics has contributed a great deal to the classification of their biochemical
function, that is, whether the protein is an enzyme (e.g., a protease, kinase, or
oxidoreductase) or whether it is a receptor, ion channel, or transporter. The function
of a new sequence, or the protein class to which it belongs, can be discovered by sequence
comparisons with already annotated proteins. Often a significant similarity can be
recognized by making so-called multiple sequence comparisons within a protein family.
Information about the spatial architecture and folding (▶ Sect. 14.2) can
also be inferred from such relationships because the spatial geometry of proteins has
been much more strongly conserved than the sequential composition of the folded
protein chain. Often, individual motifs or characteristic sequence segments
disclose a particular biochemical function of a protein. Protein sequence comparisons
between the genomes of other species have proven to be another tool in this detective
tour de force of functional annotation.
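The idea that a characteristic sequence motif can hint at the function of a protein can be illustrated with a simple pattern search. The sketch below assumes the simplified textbook consensus of the C2H2 zinc finger, C-x(2,4)-C-x(12)-H-x(3,5)-H, written as a regular expression; real annotation pipelines use curated patterns and statistical profiles, and both the function name and the synthetic test sequence are illustrative assumptions.

```python
import re

# Simplified C2H2 zinc finger consensus (an assumption for illustration):
# Cys, 2-4 residues, Cys, 12 residues, His, 3-5 residues, His
C2H2 = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_motif(protein, pattern=C2H2):
    """Return (start index, matched segment) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein)]


# Synthetic sequence constructed so that it contains one such motif
synthetic = "MK" + "C" + "PE" + "C" + "GKAFSRSDELTR" + "H" + "IRT" + "H" + "TGEK"
print(find_motif(synthetic))
```

A hit of this kind is only a first indication; whether the sequence really folds into a functional domain has to be confirmed by comparison with annotated family members or by experiment.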
The assignment of a biochemical function to a protein sequence affords a
glimpse into its molecular function. It shows whether, for example, it cleaves
a peptide sequence as a catalyst, carries out a metabolic reduction, or transduces
a signal to the cell as a receptor. What this regulation and control mean for
the organism remains to be resolved. Whether a particular protein causes
a disease by either a defective function or by dysregulation is just as unclear.
The correction of such a defect could lead to a successful pharmaceutical
therapy.
In the Science publication from the Venter group in 2001, it was assumed that the
genome coded for more than 26,500 proteins. At that time, a definitive function
could not be assigned to 40% of the sequences. In the remaining part, about 10%
were detected to be enzymes. Another 12% proved to be involved in signal
transduction, and 13.5% are nucleic acid binding proteins. The large remaining
group was scattered across many different functions such as proteins of the cyto-
skeleton, surface receptors, ion channels, transporters, extracellular matrix proteins,
immune system proteins, or chaperones. Seven years later this picture could be
refined. The largest protein family with more than 7,000 members contains the zinc
finger domain (▶ Sect. 28.2). These proteins assume an important role in transcrib-
ing sequence segments of the DNA into RNA. Most zinc finger proteins belong
to the group of transcription factors. Another large protein family contains the
immunoglobulins. These domains (▶ Sect. 32.1), which are constructed from
β-pleated sheets, occur in antibodies. A few protein families are listed in Table 12.2
and are presented in more detail in ▶ Chaps. 23, “Inhibitors of Hydrolases with
an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”;
▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”; ▶ 26, “Transferase Inhibitors”;
▶ 27, “Oxidoreductase Inhibitors”; ▶ 28, “Agonists and Antagonists of Nuclear
Receptors”; ▶ 29, “Agonists and Antagonists of Membrane-Bound Receptors”;
▶ 31, “Ligands for Surface Receptors”; and ▶ 32, “Biologicals: Peptides, Proteins,
Nucleotides, and Macrolides as Drugs” of this book.

Table 12.2 Examples of protein families in the human genome and the number of their members

Protein superfamily                                          Number
Zinc finger (C2H2 and C2HC)                                  7,707
Protein kinase-like                                          876
G protein-coupled receptor-like                              784
α/β-Hydrolases                                               151
Cysteine proteases                                           164
Trypsin-like serine proteases                                155
Metalloprotease (“Zincins”), catalytic domains               132
FAD/NAD(P)-binding domains                                   79
Cytochrome P450                                              79
Integrin α, N-terminal domains                               51
Cytokines                                                    52
Cyclic nucleotide phosphodiesterase, catalytic domains       50
Caspase-like                                                 39
Carbonic anhydrases                                          23
Aquaporin-like                                               20
Integrin domains                                             18
Aspartic proteases                                           16
ClC-chloride channel                                         16
Subtilisin-like                                              14
http://hodgkin.mbu.iisc.ernet.in/human/

The compilation of which
protein family is frequently associated with which disease (Fig. 12.3) is interesting.
This list is led by the protein kinases (▶ Chap. 26, “Transferase Inhibitors”).
Therefore it is not surprising that current basic research in the pharmaceutical
industry is intensively concentrated on the control and inhibition of protein kinases.
The cadherins follow this group. These proteins are important for the stabilization
of cell–cell contacts. They play a role in embryonic morphogenesis and signal
transduction, and they intervene in the construction of the cytoskeleton in cells. The
G protein-coupled receptors, ion channels, trypsin-like serine proteases, or RAS
proteins also belong to this list of proteins that are potentially associated with
disease, especially when genetically altered.
Finally, how the human genome is different from other eukaryotes should also
be considered. Of the more than 2,200 protein families that have been discovered in
organisms with a cell nucleus, over 1,000 are missing from the human genome.
Most of these families assume specific tasks in the relevant organisms or are
explained phylogenetically. Among these are, for example, venoms such as found
in snakes, scorpions, or insects. Proteins occur in plants that assume a very specific
function for the plant, for example nutrient storage in the seeds, or defense against
disease. As a rule, the proteins that are absent in humans assume biochemical
functions that are irrelevant for our organism, or they assume a highly specific
task in the lower eukaryotes.

[Figure 12.3 (bar chart): frequency axis from 0 to 300; protein families shown: protein kinases, cadherins, GPCR, fibronectin III, homeobox, spectrin, MHC I, ion transport, myosin, RRM, trypsin-like, laminin EGF, Ras, SH2]

Fig. 12.3 The composition of protein families that are particularly often associated with human
diseases (GPCR: G-protein-coupled receptor; fibronectin: extracellular glycoproteins in tissue
construction; homeobox: proteins that influence the morphogenetic development; spectrin:
cytoskeletal proteins; MHC I: major histocompatibility complex proteins that are involved in
immune-recognition processes; myosin: motor protein in muscle control; RRM: transcription
factors with an RNA-recognition motif; trypsin-like: serine proteases; laminin EGF: a growth factor in the
extracellular matrix; Ras: oncoprotein in tumorigenesis; SH2: protein domains in the phosphorylation
signal cascade).

12.5 Knock In, Knock Out: Validation of Therapeutic Concepts

Molecular biology delivers a plethora of information about how diseases
develop and how their course can be influenced. The long route of searching
for and developing a new drug is based on this. In the end it might be
determined that the result, even though it was so well planned, did not lead to
the desired clinical success. It is therefore important to have an animal model
available upon which the therapeutic concept can be validated early on. Classical
test models are often not available because the corresponding disease does
not occur in animals.
Since the 1980s, transgenic animals have been increasingly used in pharmacological
research. These are animals in which a particular gene is fully or partially
turned off or is replaced by a human gene. An animal in which the gene is
completely turned off corresponds to an animal in which the relevant protein is
absent or non-functional. A heterozygous animal, which has inherited the intact gene
from only one parent, corresponds to an animal in which the protein is partially blocked. If
the gene for an enzyme or receptor is affected, the influence of an inhibitor or an
antagonist can be simulated. The development and progress of a disease or the
influence of protein inhibition on a disease can be observed in such an animal. In
this way some assurance about the relevance of a therapeutic concept is established
before an exceedingly long research and development process begins. The increased
production of a particular protein can be induced by multiplying a gene. If the
absence of a gene causes the overexpression of another gene, this will also become
apparent. The gene product that is then produced in increased quantities can take
over the missing function of the turned-off product. In such a case the planned
therapeutic principle would only work if the other gene product’s function were
also blocked. This question plays an important role in the inhibition of kinases
(▶ Sect. 26.2).
A very specific gene is turned off in the so-called knock-out method. The
technique was developed in 1987 by Mario Capecchi at the University of Utah.
The sequence of the gene to be turned off must be known. A structurally homologous
gene that is not functional, for example because of the insertion of a stop signal,
is generated. This gene is introduced into an animal, where it replaces the intact gene at
exactly the same position. The process is called homologous recombination or
gene targeting. Mice are particularly well suited because the technology to
manipulate their embryonic stem cells is especially advanced. A foreign gene, for
example a human gene, can also be introduced. Mice are also well suited for this
because their genome and the human genome are surprisingly similar.
To generate a transgenic mouse, female mice are treated so that they produce
a large number of egg cells. After fertilization, stem cells are extracted from the
embryos at a very early stage, the blastocyst stage. They are cultured in vitro, and
the desired gene is injected into the cells. This procedure works only in low yield.
A technique was therefore developed with which transfected cells can be distinguished
from non-transfected cells. For this, the gene that is to be transferred is coupled beforehand
with a gene that confers resistance to the cell toxin neomycin. When the cells are
treated with neomycin, only the transformed cells survive. The surviving cells are
combined with blastocysts of other mice, and the altered embryos are carried to term
by surrogate mothers. The offspring of the surrogate mothers are chimeric, that is, they
carry the genetic information from the donor as well as the acceptor mice. Mice with
differently colored fur are chosen so that the transformed mice can be easily
recognized by their spotted fur.
In another method, foreign DNA is injected directly into an embryo at an early
stage. Disadvantages of this random incorporation of a gene are the
possibility of destroying another gene, a lack of expression of the new gene, and
multiple incorporations. Animals from the first litter are bred to generate both
genetically mixed, heterozygous animals and genetically homogeneous, homozygous
animals. Particularly sophisticated techniques even allow the new genes to be
selectively turned on and off.
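The breeding step can be illustrated with a simple Mendelian cross. The following Python sketch is not part of the original text. It assumes a single gene with a wild-type allele written as "+" and a turned-off allele written as "-" (both labels are hypothetical) and ideal Mendelian inheritance, which real breeding experiments only approximate.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> Counter:
    """Return expected genotype counts for a one-gene Mendelian cross.

    Each parent is a two-character string of alleles, e.g. "+-".
    """
    offspring = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Sort so that "+-" and "-+" are counted as the same genotype.
        offspring["".join(sorted(allele_a + allele_b))] += 1
    return offspring


# Crossing two heterozygous animals: one "++", two "+-", one "--" out of four.
print(cross("+-", "+-"))
```

Under these idealized assumptions, about one quarter of the offspring of two heterozygous animals are homozygous for the turned-off gene.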
In this way transgenic animals are generated in which hereditary diseases, for
instance, cystic fibrosis, Crohn’s disease, phenylketonuria, and others, can be
studied. Relevant animal models also exist nowadays for diseases that have
different or multiple causes such as cancer, diabetes, rheumatoid arthritis, and
Alzheimer’s disease. Ever since the American Patent Office first granted
a patent for a transgenically altered mouse in 1988, there has been controversy as to whether
a living creature can be patented at all. By now there is a whole series of
patents for genetically engineered animals, including European and German
patents, and the conflict about whether these patents are ethically or legally valid
continues.

12.6 Recombinant Proteins for Molecular Test Systems

Early on, pure or enriched enzymes were available for in vitro tests, but only in
those cases in which the material was easily obtained, for instance, human thrombin
from blood. In other cases, animal material had to be used, with all the attendant risks
regarding its relevance for rational design (see ▶ Sect. 19.11).
There are many proteins that cannot be isolated in adequate amounts or in
a homogeneous form. Today, determining the sequence of such proteins and producing
them is straightforward. The unbelievably small amount of a few picomoles
(1 pmol = 10⁻¹² mol) is enough to determine the primary structure of a short
sequence. From the amino acid sequence determined in this way, the corresponding
gene sequence can be reconstructed by applying the genetic code in reverse. In doing
so, it must be considered that multiple base triplets can stand for a particular amino
acid (the so-called degeneracy of the genetic code, ▶ Sect. 32.7). A set of single-stranded
oligonucleotides is therefore synthesized that covers all theoretically possible coding
sequences of the original peptide segment. These molecules can be used to find
a complementary sequence in a cDNA library. cDNA (complementary DNA) is the DNA that is
complementary to the mRNA (messenger RNA). It is obtained from the mRNA, which contains
only the sequence that is needed for the biosynthesis of proteins, by reverse transcription
with a reverse transcriptase (▶ Sect. 32.5). Finally the gene is produced in larger
quantities by using PCR techniques, and the amino acid sequence is determined via
its base sequence, simply because nucleic acids are much easier to sequence than proteins.
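The effect of the degenerate genetic code on the number of oligonucleotides that have to be considered can be sketched in a few lines of Python. The example below is an illustration only and is not taken from the original text. It uses the standard genetic code in the DNA alphabet; the function names `degeneracy` and `candidate_oligos` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code: codon -> one-letter amino acid ("*" marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse mapping: amino acid -> all codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    count = 1
    for aa in peptide:
        count *= len(CODONS_FOR[aa])
    return count


def candidate_oligos(peptide: str) -> list[str]:
    """All DNA sequences that could encode the peptide (short peptides only)."""
    return ["".join(codons) for codons in product(*(CODONS_FOR[aa] for aa in peptide))]


print(candidate_oligos("MWK"))  # Met and Trp have one codon each, Lys has two
print(degeneracy("LSR"))        # Leu, Ser, and Arg have six codons each
```

Because the count is a product over all positions, peptide segments rich in amino acids with only one or two codons keep the oligonucleotide set small, which is one reason why such segments are a natural choice in this kind of search.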
Next, the gene is brought into cells that are allowed to reproduce. There can be
difficulties in a few cases with this step. In bacteria, such as the intestinal bacterium
Escherichia coli, or in yeast cells, only soluble proteins can be produced. Some
proteins accumulate in inclusion bodies. They must be extracted, dissolved, and
refolded under specific conditions. The gene segment for a small protein is often
coupled with the information for another protein and both are then expressed
together. The large protein conjugate that forms in the cell is better protected from
metabolic degradation than small proteins. In the preparation, the non-essential part is
cleaved from the protein conjugate. There can be problems if the protein
does not fold correctly, or if multiple chains (as in insulin) must be
linked via disulfide bridges. Larger proteins that must be furnished with sugar
groups to accomplish their function (glycosylation) must be produced in cells from
higher organisms, for example in mammalian cells. The manufacture of complex
proteins in insect cells has become particularly attractive. These cells are infected
with a so-called baculovirus into whose genome the desired information has been
incorporated. The virus codes for the protein, and the insect cells provide the
production and subsequent glycosylation abilities. Not only enzymes, but also receptors,
ion channels, and entire signaling cascades can be produced in cells in this way.

12.7 Silencing Genes by RNA Interference

Section 12.5 introduced how the germline of an organism can be altered so that
a particular gene, and therefore its gene product, is absent, defective, or
replaced. The function of particular genes for an
organism can be studied in this way. The consequences for the organism of, for
example, blocking a particular gene product are revealed prior to the
development of a potent active substance. In the late 1990s, another technique was
discovered that allows genes to be silenced without intervening in the genome
of an organism. This work was carried out by Andrew Fire and
Craig Mello. For their accomplishments, which are only slowly being validated,
they were awarded the Nobel Prize in 2006.
Genes are archived in DNA. For gene expression the coding part of the genome
is transcribed into mRNA. Based on this copied information, the ribosome translates
the base sequence into a peptide sequence. In the early 1980s the idea
emerged to trap the transcribed information on the single-stranded mRNA by adding
an inversely arranged complementary RNA strand, the so-called antisense strand. The
two strands can hybridize, that is, they can bind to form a matching double strand.
Double-stranded RNA is then the result. This antisense principle (▶ Sect. 32.4) did
not deliver the hoped-for breakthrough. Genes were only partially or weakly
suppressed; moreover, even the addition of a normal RNA strand could achieve
suppression. Fire and Mello suspected that neither the normal nor the antisense
strand caused the gene blockade, but rather the double-stranded form that had been
added inadvertently as an impurity. Renewed experiments confirmed this assumption.
Interestingly, even small amounts of double-stranded RNA are enough to take
many mRNA molecules out of action. When using antisense strands, on the
other hand, stoichiometric amounts are necessary. It was also shown that short, ca.
20-nucleotide-long double-stranded RNA fragments are enough to silence an entire
mRNA gene sequence. Fire and Mello named the phenomenon RNA interference.
What had happened? An enzyme with the name dicer cleaves the double-stranded
RNA into 21–23 base-long pieces that then cause the blockade. For this, the double-stranded
RNA pieces are incorporated into an enzyme complex called RISC (RNA-induced
silencing complex) and separated into single strands. One strand breaks
away from the complex while the other remains there to act as a template to capture
mRNA molecules.
The sequence of the captured strand allows the RISC complex to recognize all
mRNA with a complementary base sequence and to cleave them sequentially.
Finally, they are digested by enzymes in the cytoplasm. The cell selectively
eliminates only the mRNAs that contain the sequence pattern that is complementary
to the short RNA strands in the RISC complex. In practice, this gene blockade has
proven to be simpler and more reliable than the antisense technique. RNA interfer-
ence even allows discovered genes to be systematically blocked to draw conclu-
sions about the resulting consequences for the organism. RNA interference serves
not only analytical purposes. There are already biotech companies that want to turn
off disease-causing genes with small RNA fragments.
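The sequence-based recognition described above can be sketched as a simple string search. The following Python example is an illustration under strong simplifying assumptions and is not part of the original text: it requires perfect complementarity over the whole strand, and all sequences and names (`find_target_sites`, `silenced`, the transcript labels) are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(guide: str, mrna: str) -> list[int]:
    """0-based positions in the mRNA that are fully complementary to the guide strand."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


def silenced(guide: str, transcripts: dict[str, str]) -> list[str]:
    """Names of the transcripts that contain at least one site matching the guide."""
    return [name for name, seq in transcripts.items() if find_target_sites(guide, seq)]


# Hypothetical 21-nucleotide target site and the guide strand that pairs with it.
site = "AUGGCUAGCUAGGAUCCGAUA"
guide = reverse_complement(site)

transcripts = {
    "target": "GGGA" + site + "CCCU",      # contains the complementary site
    "bystander": "GGGACCCUUUAAAGGGCCC",    # does not contain it
}
print(silenced(guide, transcripts))
```

Only the transcript that carries the complementary site is selected, which mirrors the selectivity of the blockade. Mismatch tolerance, which this sketch deliberately ignores, would have to be modeled separately.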
There is, however, a bigger problem: how is a 22-base RNA molecule to be
transported into the cell to the place where it should act? Strongly charged molecules
cannot cross the cell membrane. A special delivery system is needed for
this task. Intensive research is taking place on the development of such
systems, but the problem is a long way from being solved. A reliable and highly
efficient system that can selectively transfer such polar and nuclease-sensitive
molecules into the cell interior is likely to open a totally new and presently
unforeseeable perspective for the therapy of disease.
The goal is to construct delivery systems that can pack the fragile and polar
freight of RNA molecules and dock onto the cell. There, the coat of these carriers
must merge with the cell membrane or selectively achieve penetration to arrive
in the interior of the cell. One concept pursues the packaging and compartmentalization
of RNA in polymers such as polyethylenimine. The positive charges on
the polymer backbone can bind and encapsulate a negatively charged polymer
molecule such as RNA or DNA building blocks. Other systems try to make the
RNA or DNA molecules bioavailable for the cell by encapsulating them in
a membrane-like coat. This packaging in liposomes leads to a selective adhesion
of the artificial cell to the membrane of the target cell, and then the liposome fuses
with the target cell in an endocytosis-like process.
A further problem is the danger that small, silencing RNA molecules (siRNAs)
could cause an immune response. One solution to this dilemma is the
chemical modification of the siRNAs. The RNA molecules are modified so
that they can still optimally hybridize to the addressed segment in the mRNA, but
have better properties in terms of transport, immunogenicity, and stability. To this end,
the OH groups of the ribose building blocks of the nucleotides have been exchanged for
fluorine, methoxy, or hydrogen.
siRNA research certainly is still in its infancy. The potential of the methods seems
to be impressive, as it uses the principles applied in Nature for gene regulation. As
previously described, we have genes in our genome that code for microRNAs and
that show sequence complementarity over long stretches. Structurally, they exist as
double strands. They are cut to size by the dicer protein and can serve to interfere with
RNA: as a result, this leads to an alternative means of gene regulation. For a broad
therapeutic application of externally administered RNA fragments there are certainly
important prerequisites to fulfill such as transport into the host cell and the prevention
of an immune response. Currently the technology is used to construct model
organisms in which the consequences of turning off genes can be studied. Nonetheless,
the validation of the method for use under in vivo conditions has long since begun.

12.8 Proteomics and Metabolomics

The approaches that were described in Sects. 12.5 and 12.7 pursue the goal of
turning off a disease-causing gene or a gene that plays a role in a disease. But how
is it to be recognized whether a particular gene or gene product is involved in
a disease process at all? Decisive indicators to answer this question can be extracted
from the protein composition in the cell. This composition changes dynamically. It
is termed the proteome and reflects the totality of all proteins in a cell, or indeed in the
entire organism, at a given time under precisely defined conditions. If we concentrate
on the protein pattern of a cell from a particular organ, important variables are the
metabolic state, the developmental stage of the organism, the time point in the cell
cycle, or the surrounding temperature. Disease processes and pharmaceutical therapy
also change this pattern. In the genome, all theoretically expressible
proteins are coded as static hereditary information. In contrast, the proteome
reflects the protein composition at a particular time point. The difference between
a butterfly in its caterpillar and adult phases serves as an impressive example of the
difference between the genome and the proteome. The genome is the same for both,
but the proteome is significantly changed, which is expressed in the form of
a completely different phenotype.
With regard to disease processes and their pharmaceutical therapy, the proteome
can be used to compare the state of cells that are healthy, diseased, or under
the influence of a drug therapy. Initially this seems like an extremely complex,
barely solvable task. A cell contains thousands of proteins, of which many are
modified after their expression. For example, the first amino acids in a sequence are
cleaved (▶ Sect. 25.9), phosphate groups are transferred (▶ Sect. 26.3), sugar
building blocks are attached, disulfide bridges are formed, prosthetic groups are
incorporated, and ubiquitin or prenyl groups are added (▶ Sect. 26.10). In addition,
alternative RNA splicing occurs, which serves as a mechanism of gene
regulation and further increases the diversity of the proteome on the basis of
a comparatively small number of genes. All of this dramatically increases the
diversity of the protein composition, probably by a factor of 5–10 compared to
the genome composition. Nevertheless, a sophisticated analytical method has been
developed with which it is possible to analyze the proteome of a cell at a particular
time point. First a cell must be denatured in such a way that all modifying processes are
abruptly stopped so that conclusions can be drawn about the cell contents. The cell
lysate is then subjected to separation. Proteins contain many acidic and basic amino
acids, so that an exactly defined pH value exists for each protein at which
protonation and deprotonation balance and the protein appears to be
electrically neutral overall. This pH value, the isoelectric point, is specific for each
protein and depends on its amino acid composition. The protein mixture is applied to
a solid support (a polyacrylamide gel) as would typically be used for chromatography
purposes. Then a voltage is applied. If the proteins carry a charge, they migrate
over this solid support in the direction of the oppositely charged pole. The gel is
constructed in such a way that a continuous pH gradient is established from one end
to the other. At some point in their migration over the gel, the applied
proteins therefore reach a position where they appear to be uncharged overall. Once this
position is reached on the solid support, the proteins no longer migrate. By using
this so-called isoelectric focusing, proteins are thus separated according to their isoelectric
point. All proteins with the same isoelectric point migrate the same distance and
occur as a mixture. Then the gel is turned by 90°, and the proteins
are separated again, now however according to a different principle. For this, the

Fig. 12.4 2D-Gel electrophoresis for cellular proteome analysis: (a) proteome of a normal cell.
(b) Proteome for a pathologically altered cell. (c) Proteome of a pathologically altered cell after
treatment with a drug. Changes in the protein concentration are indicated by red circles. Above all,
the proteins at positions 3, 6, and 7 are clearly up-regulated in the diseased state. A few of the
pathological changes are corrected by the drug therapy, but new changes in the proteome (e.g.,
2, 8, and 10) might be induced by side effects (Figure taken from Lottspeich F (1999) Angew
Chem Intl Ed Engl 38:2476–2492).

proteins are thermally denatured and their charges are masked with sodium
dodecyl sulfate, a strongly charged anionic surfactant, so that all carry virtually
the same charge on the exterior (SDS-PAGE). The denatured proteins migrate
again upon application of an electrical field. Now, however, the migration speed
depends on the mass of the proteins. Because the direction of this migration is
perpendicular to the first, isoelectric separation, the originally applied proteome
ends up broadly distributed and well separated on the solid support.
By using this 2D electrophoresis, it is possible to separate many thousands of
proteins. The quantity and sequence of the separated proteins must then be
characterized. Many different staining and fluorescence techniques have been
developed for the quantitative analysis. They allow quantitative determinations,
especially in comparison to the proteomes of analogous cells that are in a different
state. This is how the quantitative comparison of the protein composition in
a diseased and healthy state is accomplished. How the proteome changes under
the influence of a drug (Fig. 12.4) can also be determined. But how does one
recognize what is hidden in each individual protein spot on such a 2D gel? For
this the proteins are extracted from the gel and digested with trypsin. This
protease (▶ Sect. 23.3) cleaves the denatured proteins into small peptide fragments,
which are finally analyzed by mass spectrometry. Sophisticated technologies
together with computer analyses of precalculated fragmentation patterns of proteins
allow the proteins to be reconstructed and characterized with regard to their
sequence. Proteins in the proteome that have been either up- or down-regulated
due to a disease process can be detected in this way. Whether the altered
expression pattern causes the pathological state or is a consequence of it is, however,
a question that remains to be answered by independent experiments.
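Two steps of this workflow lend themselves to a small computational sketch: estimating the isoelectric point from the amino acid composition, and predicting the peptide fragments of a tryptic digest. The Python example below is not part of the original text. The pKa values are assumed, approximate textbook-style numbers (published scales differ), the net charge follows the Henderson-Hasselbalch relation, and the cleavage rule (after Lys or Arg, but not before Pro) is a commonly used simplification.

```python
# Assumed approximate pKa values; different published scales give different pI values.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}           # basic side chains
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}  # acidic side chains
N_TERM_PKA, C_TERM_PKA = 9.0, 3.0


def net_charge(sequence: str, ph: float) -> float:
    """Estimated net charge of a protein at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # deprotonated C-terminus
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """pH at which the estimated net charge is zero, found by bisection."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        # The net charge falls monotonically with pH, so bisection is sufficient.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave after K or R unless the next residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


print(round(isoelectric_point("ACDEFGHIKLMNPQRSTVWY"), 2))
print(tryptic_digest("AKRPGKC"))  # ['AK', 'RPGK', 'C']
```

In a real analysis, the masses of such predicted peptides would be compared with the measured mass spectrum. That step is omitted here because it requires a residue mass table and instrument-specific tolerances.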
As described, the proteome of a cell can change upon therapy with a drug. What
are the interaction partners for a given drug? Are the induced effects always the
same if drugs from the same compound class are used? The properties of three
kinase inhibitors that were developed for the treatment of chronic myeloid leukemia
(▶ Sect. 26.5) were investigated in detail in the research group of Giulio Superti-Furga
in Vienna, Austria. For this, each drug first had to be equipped with
a chemically inert anchor group. It is certainly quite a sophisticated challenge to
find the correct position to place an anchor on such an active substance so that the
mode of action is not significantly perturbed. As a rule, multiple positions along the
molecular scaffold must be tried for this purpose. Finally the drug is irreversibly
and covalently coupled via the attached anchor group to a chromatography column.
Once the column is equipped with these “baits,” the proteome from the lysate of a cell
is added to it. Proteins that have affinity to the immobilized drug stick to the column.
Finally the binding partners that were captured in this pull-down experiment must be
released from the column, separated, and characterized analogously to the above-described
technique. The composition of all proteins that have affinity to the active
substance is obtained. Initially, it is difficult to draw quantitative conclusions about
the affinity of the binding partners, above all because the protein quantities and their
composition in the lysate are highly variable. It is possible, however, to construct
a profile for each active substance according to its protein interaction partners. This
led to the surprising result that even drugs that belong to the same or similar substance
classes and were developed for the same therapeutic indication can well display
significantly different interaction profiles in the cell. This is an impressive observa-
tion, the evaluation and application of which will require great research effort. We
will see in the next section that the different efficacy, therapeutic deviations, and
variable side-effect profiles in patients can be explained by this.
Proteome analytical techniques (proteomics) can also be used in clinical diag-
nostics. Without exactly resolving the analyte, significant changes in the form of
a mass fingerprint can be recognized. Tumor diseases are revealed by changes in
their protein composition. These can be recognized at a very early point, which
should hopefully still allow a curative treatment for the tumor.
Another technique that is analogous to proteomics is the analysis of metabolites
that are produced in an organism. The term metabolome comprises all metabolites
(e.g., metabolic degradation products) that are present at a specific time point. The
techniques of metabolomics try to quantify the metabolite composition and to
draw conclusions about the condition of the cell based on this information. This
is particularly valid when the cell is exposed to foreign substances. If the metabolite
profile at a particular time point is studied, especially in pathophysiologically
or genetically changed conditions, the term metabonomics is used. The goal of this
technique is to draw conclusions about the molecular composition in cells from
body fluids such as urine, serum, or cerebrospinal fluid. This can lead to an
improved and more sophisticated diagnostic procedure, and therefore an easier
early detection of diseases. These techniques also serve to characterize proteins
for drug therapy or to analyze the greater influence of an event in the cell that is
being treated with a drug. The hope remains that these techniques will allow for
a better understanding of the total effects of the use of pharmaceuticals, and finally
achieve a higher safety standard for therapy.

12.9 Expression Patterns on a Chip: Microarray Technology

Thousands of molecules are found in the analysis of the genome, transcriptome,
proteome, or metabolome, the occurrence of which must be characterized. This
flood of data requires immense measurement capacity. Therefore in the late 1980s
the development of microarray technology was initiated. Thousands of molecules
that are to be analyzed in parallel in an automated fashion are attached to a support
that measures only a few centimeters and is made of glass, silicon, gold, or nylon
(Fig. 12.5). Only very small quantities of the biomolecules are needed. By now,
this technique has achieved a maturity that allows its use in routine
analytical procedures. In addition to the appropriate preparation of the surface, it is
the art of reliably immobilizing the molecules needed for the precise analysis
in a standardized way that guarantees the success of the method. In addition to
proteins and protein domains, antibodies, antigens, and especially DNA, oligonucleotides,
and RNA can be immobilized. Proteins are often anchored by co-expressing the
protein of interest coupled to an anchoring protein such as
streptavidin as a so-called fusion protein. The streptavidin anchor is attached to
the surface via biotin. Chemistry with thiol groups is also used: the coupling to
the surface, which was previously equipped with appropriate reactive groups, is
accomplished with disulfide bridges. Other strategies use amino groups, for example
those of lysine, which are coupled to a reactive aldehyde group on the solid support
material. To test the composition of an analyte, a soluble mixture is added to
a premanufactured chip. If binding partners are found in this process, the
components from the analyte solution remain adhered to the surface.
Such binding must be simple to detect on the chip in a spatially resolved
manner. Initially, staining and fluorescence were the methods of choice (Fig. 12.5).
Fluorescent dyes, for example, green and red dyes, are used for this because
they can be excited and detected easily and in a spatially resolved way. If mixed
signals resulting from simultaneous red and green fluorescence occur, a yellow
signal is obtained. More recently, surface plasmon resonance has gained
greater significance (▶ Sect. 7.7) as an alternative technique for
the detection of binding. Moreover, techniques that function similarly to ELISA
methods are also used (▶ Sect. 7.3).
Frequently, microarrays are used to analyze the expression pattern of biological
systems. For this the transcriptome of a cell is investigated under different
conditions, for example in a diseased and healthy state. The first molecules to be
successfully anchored onto chips were single-stranded DNA oligonucleotides. To
study the coding mRNA of a cell in a particular state, these molecules are transcribed
into a complementary DNA segment, the so-called cDNA, by using reverse transcriptase
(Fig. 12.5). These cDNA molecules, or the fragmented sections of cDNA
that are obtained, are immobilized on a chip and separated into single strands. The
cell lysate with the single-stranded mRNA (transcriptome analysis), or the cDNA
that was prepared from it, is added to such a chip, and the complementary
mRNA strand hybridizes with the oligonucleotide fragments that are
anchored there. It is important in this process that the samples to be analyzed are

[Fig. 12.5 schematic; labels: Gene 1, Gene 2, Gene 3 → PCR → Construction of the Microarray Chip; Cells from Healthy Tissue / Cells from Diseased Tissue → RNA Isolation → mRNA → Reverse Transcriptase → cDNA → Fluorescence Labeling → Hybridization]

Fig. 12.5 Manufacture and testing of an expression pattern with microarray technology. Individ-
ual gene segments from an organism are cut out and amplified by using PCR (above left). Next
they are immobilized on a microchip support as single-stranded oligonucleotides (below left). In
addition to the isolated and amplified DNA, synthetically manufactured DNA building blocks or
cDNA molecules, which are obtained from reverse transcription, can also be brought onto the
support. One type of bait molecule is located at each point on the support. mRNA molecules
are isolated from the cells of healthy (green) and diseased tissue (red) and reverse-transcribed
into cDNA. The cDNA is provided with a colored fluorescence marker. Then the
test molecule is added in a single-stranded form to the microarray plate, and if it is complementary,
a hybridization (below middle) results. Finally the binding is analyzed under fluorescent light
(below right). Yellow areas indicate that mRNA molecules from the healthy as well as the diseased
cells have bound. The mRNA that binds there is expressed in healthy as well as diseased states.
Areas that remain dark indicate that the mRNA is expressed neither in a healthy nor in
a diseased state. Areas that are either only green or only red fluorescent indicate a difference in
the expression pattern between cells from healthy and diseased tissue.

equipped with different fluorescent dyes according to their origin. For example,
the mRNA from a healthy cell is labeled green and that from a diseased cell red.
After the hybridization on the chip, there will be areas that fluoresce green, red, or
yellow upon excitation, and others that remain without fluorescence. Areas that
glow yellow under the fluorescent light indicate that mRNA molecules from
healthy as well as diseased tissue have been bound. Evidently the mRNA that
binds there is equally available in the diseased and healthy states. Areas where no
fluorescence is seen indicate that neither healthy nor diseased cells produced
mRNA that bound there. Areas that fluoresce either green or red are interesting
because they indicate differences in the expression pattern between healthy and
diseased cells. In this way, gene products can be discovered that are involved in
a disease process. If a misregulation is present, an attempt can be made to correct
this state with a pharmaceutical therapy.
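The color readout described above can be expressed as a small decision rule on the two fluorescence intensities of a spot. The Python sketch below is an illustration and not part of the original text. The background level and the two-fold threshold are assumed, hypothetical parameters; real analyses calibrate and normalize the two channels first.

```python
import math


def classify_spot(green: float, red: float, background: float = 100.0,
                  fold_change: float = 2.0) -> str:
    """Classify one microarray spot from its two fluorescence intensities.

    green: signal of the sample from healthy cells
    red:   signal of the sample from diseased cells
    """
    if green <= background and red <= background:
        return "dark: expressed in neither state"
    # Clamp to the background so that the ratio stays finite for absent signals.
    log_ratio = math.log2(max(red, background) / max(green, background))
    if log_ratio >= math.log2(fold_change):
        return "red: higher in diseased cells"
    if log_ratio <= -math.log2(fold_change):
        return "green: higher in healthy cells"
    return "yellow: similar in both states"


# Hypothetical intensities (green, red) for four spots.
for intensities in [(50, 60), (1500, 1400), (2000, 150), (120, 1800)]:
    print(intensities, "->", classify_spot(*intensities))
```

Working with the logarithm of the intensity ratio treats a two-fold increase and a two-fold decrease symmetrically, which is why it is a common choice for this kind of comparison.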

12.10 SNPs and Polymorphism: What Makes us Different

What makes a single organism of a particular species different and leads to the
enriching diversity of a population? We speak of the human genome, but many
interesting deviations must be present so that we all look different and have
different features. Polymorphisms, that is, variations in the composition of the
genome, cause the observed diversity in or form the different phenotypes of
a species. The most obvious phenotypic difference is the division into male and
female individuals. Of course this is not the only difference that we recognize for
the human species. Many sequence variations occur within a population at the
genome level. If they occur in more than 1% of the population, they are referred to as
different alleles; otherwise they are attributed to mutations that have not yet become
established in the course of evolution. Genetic polymorphisms are, for instance, observed as
insertions or deletions in which at least one nucleotide has been
incorporated or lost. However, single nucleotide exchanges are
the most common sequence variation. Here the term SNP (pronounced “snip”) is
used, which is an abbreviation of single nucleotide polymorphism. Compared to the
entire genome, polymorphisms encompass only a very small portion. They are
estimated to make up about 0.1% of the entire genome, that is, about three million bases.
Of these, SNPs make up the overwhelming portion, with a share of about 90%. Therefore the
largest part of our genome is identical across the entire human species, even though
enormous diversity in the phenotype is observed between us.
Within the SNPs, coding and non-coding changes are differentiated according to
whether these observed exchanges are translated into proteins or not. In the coding
regions of the genome the single exchange of a nucleotide can lead to an altered
protein sequence. In ▶ Sect. 32.7 the translation procedure of a base triplet into
a protein sequence is introduced. If a base in a coding triplet is changed, the triplet
can either still be translated into the same amino acid, or it leads to the incorporation
of a different one. The former is possible because multiple triplets often code for the
same amino acid. The incorporation of a different amino acid into a protein can change its
properties. For example, the amino acid composition of a glycosyltransferase is
decisive for the blood group that we have. ▶ Sect. 29.7 introduces an example
of how the altered incorporation of a few amino acids in a G protein-coupled
receptor can influence our sense of smell. Humans carry different alleles that
determine their ability to perceive smells of different intensities and qualities.
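The distinction between an exchange that leaves the amino acid unchanged and one that alters it can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard genetic code written in the DNA alphabet; the function name and the category labels are chosen here for illustration only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one-letter amino acids in TCAG codon order ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base exchange at position 0, 1, or 2 of a DNA triplet."""
    codon = codon.upper()
    new_base = new_base.upper()
    if len(codon) != 3 or any(base not in BASES for base in codon):
        raise ValueError("codon must be a DNA triplet of A, C, G, T")
    if position not in (0, 1, 2) or new_base not in BASES:
        raise ValueError("position must be 0-2 and new_base one of A, C, G, T")
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"   # same amino acid (or the codon was not changed)
    if after == "*":
        return "nonsense"     # exchange creates a stop codon
    if before == "*":
        return "stop-lost"    # exchange removes a stop codon
    return "missense"         # a different amino acid is incorporated
```

Changing the middle base of GAG to T gives GTG, a glutamate-to-valine exchange and therefore a missense change, whereas changing the last base to A gives GAA, which still codes for glutamate.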
However, not only SNPs in coding regions lead to differences in our species.
SNPs in noncoding segments of the genome can lead to changes in gene regulation.
In the context of drug research and therapy, SNPs can also be relevant where they
have no immediate effect on the phenotype. It is assumed that some SNPs confer
susceptibility to diseases or influence the cellular response to a drug. It must be
considered at this point that SNPs can also occur in the region of the binding site of
a drug molecule, which may not necessarily be identical with that of the natural
substrate. Then they exert a direct influence on the affinity and the binding profile of
the active substance. As a result, an active substance can exert a stronger or weaker
inhibition of protein function in patients with an observed SNP than it would in
patients in which this SNP is not present.

12.11 The Personal Genome: Access to an Individual Therapy?

Genome sequencing and the analysis of SNPs and polymorphisms have impressively
uncovered sources of disease predisposition and shown why drugs can have reduced
tolerability and different side-effect profiles. They have offered an explanation for
why undesirably high variations in the efficacy of drugs can occur in different
patients. All the more reason to ask whether sequencing the individual
genome of each person would provide options for a tailored, individual, and
personalized therapy. It is by no means a utopian idea that in a few years the full
sequencing of each individual person will be possible at manageable prices and
within an acceptable time frame.
It has long been known in medicine that the blood groups of donor and recipient must
match for blood transfusions. A genome analysis would make the search for a
matching donor organ easier for transplantations. A particularly high density of
SNPs has been discovered in the genome, especially in regions coding for proteins
that present antigens in the immune system on their surface to stimulate an immune
response (▶ Sects. 31.7 and ▶ 32.2). An SNP analysis of each individual could
indicate the probability of developing a particular disease. Here, early detection of
this risk and possible lifestyle modification could be better than any therapy.
Already today high-resolution DNA chips (Sect. 12.9) allow the simultaneous
determination of more than 500,000 genetic SNP markers. Discovered SNPs can
indicate an elevated predisposition to developing, for instance, Alzheimer’s
disease in old age. A simple screening of the individual DNA sequence would allow
a predisposition for a particular disease pattern to be recognized.
Craig Venter, whose company determined the human genome by the
shotgun method, had his own genome analyzed and published. From the gene
analyses of these data, a tendency for obesity and cardiovascular disease was
identified. His own father died of a heart attack at the age of 59. Based on this
analysis, Venter decided to take a lipid-lowering agent from the statin class
preventatively. A doctor could simply read from a personal genome whether the
patient displays an SNP pattern that would lead one to expect an intolerance for
a particular drug therapy. Moreover the doctor could see what type of metabolizer
category (▶ Sect. 27.7) the patient belongs to. This could reduce intolerance upon
the simultaneous treatment with multiple drugs, and would allow a safe adjustment
of individual dosing. It can also help to choose the right drug for a therapy,
particularly if multiple drugs are available for one indication.
The dream of developing “personalized medicines” for individual therapy
will be difficult to realize for cost reasons. Just the addition of one more methyl
group in a drug requires a full toxicological and pharmacological testing program to
achieve approval. It would devour millions in development costs. Moreover, the
determination of the individual genome and the elucidation of all imaginable
predispositions for possible diseases has its downside. In the hands of the
treating physician, this information is a blessing. But what would a future employer
read from these data about the prospect of hiring an employee? Insurance compa-
nies could accept only risk-free clients based on their genomic data—a chilling idea
that the individual genomic composition would decide an insurance premium!
For all our genetic differences and their imaginable consequences for
drug therapy, it must not be forgotten that our gastrointestinal tract is home to
millions of microorganisms. This flora exerts a decisive influence on our wellbeing,
the stability of our health, our metabolism, and also on our response to drug therapy.
The individual gastrointestinal flora begins to build up at birth, and is influenced to
a critical degree by the mother. It varies considerably with lifestyle, food culture,
and exposure to the regional microorganism landscape. In India, China, or Europe
a different microbe culture is found than in, for instance, America. Interestingly, it
changes if a person moves between continents. Other microorganisms
produce a different set of secondary metabolites and contribute to a shifted
health equilibrium. Presumably these differences between individuals are just as
important as the genetic diversity that makes us different.

12.12 When Genetic Difference Becomes Disease

Genetic diseases have a molecular origin. A gene is altered (allele), sometimes the
two genes originating from both parents. Each of us carries a large number of such
altered genes, which are a result of arbitrary base exchanges: the SNPs. The
principle of evolution is based on these random mutations. If a mutation causes
a better adaptability of an individual in the environment, the chances of survival and
reproduction increase. Those genes are then reproduced with increased probability.
So-called horizontal gene transfer exerts an accelerating effect on evolution in
asexually reproducing species. There, entire DNA fragments are exchanged between
individuals or even species. Crossover plays a comparably important role in
sexual reproduction. In this case, neighboring gene sequences of both parents
cross over at random and form new couplings. Without mutations and crossover,
all species would remain absolutely constant. As a side effect of this mechanism of
evolution, many errors are produced in individual cases. Some of these errors are the cause of genetic
disease. In sickle cell anemia a single amino acid in hemoglobin, which gives
blood its red color, is exchanged: a glutamic acid in position 6 of the β chain of
hemoglobin A (HbA) is replaced by a valine. The altered hemoglobin aggregates: it
“sticks” together in the red blood cells. The cells collapse and take on
a characteristic sickle form. Homozygous carriers, that is, individuals in whom
the “sick” gene is inherited from the father and the mother, are not able to survive.
Heterozygous carriers who carry one “sick” and one “healthy” gene produce
normal and altered hemoglobin alongside one another. These people indeed have
a shorter life expectancy, but usually achieve reproductive maturity. In areas in
which malaria is endemic, there is a selection pressure for the genetic disease.
Heterozygous carriers of sickle cell anemia are more resistant to malaria than
healthy people (▶ Sect. 3.2). Here we are witnesses to Nature’s great experiment.
How will it end? Humans intervene as well. If malaria is successfully treated, wild-type
HbA carriers are no longer disadvantaged, and the evolutionary advantage of sickle cell
anemia and the consequent selection pressure in the direction of this disease
disappear. This genetic disease could become “extinct” after a few generations.
On the other hand, if sickle cell anemia is treated either conventionally or gene
therapeutically, then these people would have entirely normal “healthy” red cells.
The malaria pathogen could reproduce well in them again. The protection from this
disease would disappear, and the susceptibility of these people to malaria would
rise to a normal risk level.
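The balance between the two selection pressures described above can be made concrete with a textbook one-locus selection model. The Python sketch below is an illustration under stated assumptions, not a calculation from the text: the fitness disadvantage `s` of wild-type homozygotes in a malaria region and the disadvantage `t` of sickle cell homozygotes are hypothetical values, mating is assumed to be random, and the function names are chosen here.

```python
def next_frequency(q: float, s: float, t: float) -> float:
    """Frequency of the sickle allele after one generation of selection.

    Relative fitness values (assumptions of this sketch):
      wild-type homozygote 1 - s, heterozygote 1, sickle homozygote 1 - t.
    """
    p = 1.0 - q
    w_wild, w_het, w_sickle = 1.0 - s, 1.0, 1.0 - t
    mean_fitness = p * p * w_wild + 2 * p * q * w_het + q * q * w_sickle
    # Sickle alleles come from half of the heterozygotes and all sickle homozygotes.
    return (p * q * w_het + q * q * w_sickle) / mean_fitness


def simulate(q0: float, s: float, t: float, generations: int) -> float:
    """Iterate the selection step and return the final sickle allele frequency."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


# With malaria present (s > 0) the allele settles at q* = s / (s + t).
equilibrium = simulate(q0=0.01, s=0.1, t=1.0, generations=500)
# With malaria treated (s = 0) the heterozygote advantage is gone and q declines.
after_treatment = simulate(q0=equilibrium, s=0.0, t=1.0, generations=50)
```

In this simplified model the allele is maintained as long as heterozygotes have an advantage and declines once that advantage disappears. How quickly it declines depends entirely on the assumed parameters.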
In addition to sickle cell anemia, around four thousand other diseases and their
molecular causes are known. Some, for example cystic fibrosis, phenylketonuria,
and inherited coagulopathies occur relatively frequently. Many others are rare and
are sometimes only described once. In recent years a multifactorial genetic cause
has been established for an increasing number of diseases, for example for diabetes,
rheumatoid arthritis, some cancers, asthma, and Alzheimer’s disease. The occurrence
of these diseases is brought about by the coincidence of
multiple genetic alterations, or is at least fostered by them.
The mechanisms of evolution are also responsible for the development of
resistance (▶ Sect. 4.8). Here, the selection pressure is exerted by a drug or an
insecticide (e.g., to exterminate malaria-carrying mosquitoes). Because of their
rapid reproduction, bacteria and viruses adapt quickly to a “hostile” environment.
The true masters are the retroviruses, which can develop resistance particularly
quickly because of their high mutation rates, and can therefore annihilate the
success of a drug with one stroke (▶ Sect. 24.5).

12.13 Epigenetics: Lifestyle and Environment Influence Gene Activity as a Pen Would Make a Mark in the Book of Life

For the development of an organism, it is not only the kind of hereditary information
stored in the DNA that can be translated into gene products that is critical;
it is just as important that particular genes are only read in particular cells at
particular times. Even social factors and the environment influence the genes and
change their behavior. Scientists observed the following example with zebra
finches. If a male zebra finch hears the song of another male, the gene EGR-1 is
read more strongly. The unknown song of a potential rival leads to a much
stronger activity of EGR-1 than background bird song that the finch has already
heard. EGR-1 is itself a key gene in gene regulation, so that a change in the social
surroundings of the finch leads to many shifts in the protein expression pattern of
the bird. This response helps the bird to adapt to the new changes because the
intrusion of a potential competitor into his own territory can be of essential
importance to him.
Pluripotent embryonic stem cells can differentiate into very different cell types.
For example, liver, brain, and muscle cells have the same chromosome set, yet they
are fundamentally different in their function. Many different phenotypes arise
from the identical genotype. This is true for the different cell types of an organism
at the same time as well as for successive developmental stages of an
organism. Research on twins has produced remarkable results in this regard.
Comparative studies on identical twins, who are genetically identical, show that
with increasing age, and above all with different lifestyles, progressively larger
differences in the phenotype occur. There must therefore be mechanisms that lead
to changes in the phenotype that are passed along without changes in the genotype.
They regulate the transcription process and pass along this property to daughter
cells. This process is summarized under the term epigenetics. It leads to the
situation where an additional level of information is formed that regulates the
reading of the genes from the DNA.
The surroundings exert their effect on the genes through the epigenome.
Upbringing, childhood experiences, the effects of chemicals or intoxicants, and
stress are all epigenetic regulatory influences through which gene activity is
temporarily or even permanently changed. As the following example of the Agouti
mice shows, such information can even be passed along to subsequent generations.
Normally, these rodents are small, brown, thin, and very agile animals. Their genome
contains the so-called Agouti gene, which after activation causes the
animals to become ill: their coat turns yellow, and they become ravenous and fat.
The offspring of these ill mice are colored in exactly the same way and are just as frail as
their parents. The American molecular biologist Randy Jirtle at Duke University in
Durham, NC, fed pregnant Agouti females a special diet that was rich in dietary
supplements such as vitamin B12, folic acid, choline, and betaine. As a result, the
majority of the offspring of these females were brown, thin, and in the best of
health. The Agouti gene was turned off by the enriched diet, without requiring any
changes to the genome sequence of the rodents.
On the molecular level, it is in particular methylation and acetylation that
transmit the additional epigenetic information. In contrast to genetic changes
that cause mutations in the translated gene products, epigenetic changes have
a strong dynamic component and are, above all, reversible. In the stretched-out
state, there are more than two meters of DNA in the cell; this is wound into a highly
compact form onto small basic proteins: the histones. Lined up like pearls along
a string, they collectively make up the chromatin, which forms the chromosomes
in its maximally packed form. Histones are among the most strongly conserved
proteins in existence; for example, the 102-residue histone protein H4 from the pea
and from the cow differ in only two positions.
As one option, epigenetic changes modify the DNA itself: methyl groups are
transferred to cytosine by methyltransferases (see ▶ Sect. 26.9) to give
5-methylcytosine. The base pairing with guanine in the DNA is not affected by
this modification, and the genetic code remains unchanged. If a methylation occurs
in a promoter region of the DNA, this leads to a silencing of the corresponding
gene. The methylation makes the DNA inaccessible to the reading apparatus, which
is somewhat similar to password-protected computer data. If the promoters in these
gene segments are demethylated again by demethylases, expression of the
corresponding protein is possible once more. As a second epigenetic change,
histone proteins can be modified. Methyl, acetyl, and phosphate groups can be
enzymatically transferred to lysine and arginine residues of these basic proteins
by, for example, histone acetyltransferases (HATs). The added acetyl groups
neutralize the positive charge on the Lys and Arg residues of the so-called “histone
tails.” These can then no longer interact as efficiently with the negatively charged
phosphate groups of DNA. Added phosphate groups have an even more repulsive
effect. These changes lead to less densely packed chromatin, which makes the DNA
reading in particular regions easier. Transcription and gene expression are
regulated in this way. Conversely, the cleavage of acetyl groups by histone
deacetylases (HDACs) or the methylation of the Lys and Arg residues of the
histone causes the packing density of the chromatin to increase, and this diminishes
the probability that the DNA is read in the affected areas.
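The charge argument above can be illustrated with simple bookkeeping. The following Python sketch counts formal side-chain charges in a short peptide before and after acetylation. It rests on several assumptions made here: the peptide `TAIL` is an illustrative lysine- and arginine-rich sequence and is not taken from the text, histidine and the chain termini are ignored, and acetylation is modeled on lysine side chains only.

```python
# Illustrative basic peptide in one-letter code (an assumption of this sketch).
TAIL = "SGRGKGGKGLGKGGAKRHRK"


def net_side_chain_charge(sequence: str, acetylated=frozenset()) -> int:
    """Sum formal side-chain charges; acetylated lysines count as neutral.

    `acetylated` holds 1-based positions of acetylated lysine residues.
    """
    charge = 0
    for position, residue in enumerate(sequence, start=1):
        if position in acetylated and residue != "K":
            raise ValueError(f"position {position} is not a lysine")
        if residue == "K" and position not in acetylated:
            charge += 1          # free lysine: positively charged
        elif residue == "R":
            charge += 1          # arginine: positively charged
        elif residue in "DE":
            charge -= 1          # aspartate, glutamate: negatively charged
    return charge


unmodified = net_side_chain_charge(TAIL)
acetylated = net_side_chain_charge(TAIL, acetylated={5, 8, 12, 16})
```

Each acetylated lysine removes one positive charge from the tally, which is the simplest way to picture why the modified tail binds the negatively charged DNA backbone less tightly.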
Misregulation of the described enzymes is associated with the development of
diverse cancers. Because epigenetic processes are fundamentally reversible, there is
a chance that a drug therapy could intervene in the misregulated function of these
transferases. For this reason, intensive research efforts are underway for inhibitors
of different methyltransferases and histone deacetylases, the latter of which are
mechanistically comparable to metalloproteinases (▶ Chap. 25, “Inhibitors of
Hydrolyzing Metalloenzymes”). The hope remains that these inhibitors can suppress
disease-causing epigenetic changes and become potent drugs for cancer
therapy in humans.

12.14 The Scope and Limitations of Gene Therapy

In September 1990 the 4-year-old Ashanti DeSilva was the first patient to be treated
with a gene therapy. The alleles of both parents for the enzyme adenosine deam-
inase were defective. Because this enzyme is critical for the function of the immune
system, the little girl suffered from severe immune insufficiency that could no
longer be classically treated. As a therapy, the white cells of the patient were
repeatedly infected with a virus that carried the correct information for the missing
enzyme. The patient, who previously was hospitalized and in constant danger of
infection, has developed into a person with entirely normal health.
The term gene therapy refers to any technology with which a gene is introduced
into a cell of a patient to replace a defective or missing gene. In principle it is very
simple. Viruses demonstrate it for us daily: they bring their own genetic informa-
tion into a foreign cell and use it to code for a few key enzymes that are necessary
for their own reproduction. For the rest they use the biosynthesis machinery of the
infected cell. The retroviruses, the genetic information of which is coded in RNA,
translate this information into DNA and integrate it into the host’s DNA. In gene
therapy, a nucleic acid segment is inserted into the genome of a virus that codes for
the protein that is to be substituted in the patient. The construct, which is what
these modified viral genes are called, is surrounded by the virus capsid and is
introduced into the cells of the patient. This can take place either outside of the
body, that is, in bone marrow or white blood cells that have already been
aspirated, or within the body, such as by injection into tumor tissue or into
a particular organ. Adenoviruses, herpes viruses, or retroviruses are all well suited
as carriers of the genes because these viruses incorporate their own genetic infor-
mation into mammalian DNA. Although retroviruses only transfer their genes
during cell division, adenoviruses can cause non-dividing cells to incorporate and
use foreign genetic information. Plasmids, complexes of DNA and liposomes, and pure DNA
constructs are also being experimented with. The rates of transfer of the new
information into cellular DNA are, however, significantly lower here than for the viruses.
Meanwhile, over 1,000 gene therapy clinical studies are underway, most in the
USA and overwhelmingly for tumor therapy. Cancer is admittedly not a hereditary
disease, but the genetic information that is inherited from cell to cell creates
a “local” genetic disease. Oncogenes are a large group of genes whose protein products
are responsible for the occurrence of cancer. Tumor-suppressor genes code for proteins that
interfere in the cell cycle and stop the division of cells. The quickly increasing
knowledge of the molecular structure of these proteins has afforded many
approaches for the gene therapy of tumors.
Other diseases can also be approached with gene therapy. The standard therapy
for cardiovascular diseases that are characterized by an excessive growth of endo-
thelial cells and consequent narrowing of the blood vessels is widening with
a balloon catheter. That helps, but only temporarily. After a few months the cells
proliferate anew and the blood flow in the downstream areas decreases threaten-
ingly. Here a gene therapy could be employed. Adenoviruses can be released
locally during the balloon catheter treatment. These carry the genetic information
for a protein that inhibits cell division, the so-called retinoblastoma protein. The
cells can then no longer proliferate.
AIDS patients die from infections because their immune systems are damaged.
The so-called T cells die. Bone marrow transplantation is a possible therapy. For this
it is decisive that the immunological properties of the donor and patient are as close as
possible. Many people are eliminated as possible donors, not to mention animals. Or
are they suited? A new approach for bone marrow transplantation and perhaps even
organ transplantation is the humanization of animals. For this, immature human
bone marrow cells (stem cells) are transplanted into an animal, for example, a baboon.
The rejection of the foreign cells is prevented by treatment with immunosuppressants.
It is not the human recipient who bears the risk of an immune reaction, but
rather the animal donor. After the proliferation of the human cells in the animal, the
cells can be safely transplanted into the human “pro-donor.”
Will gene therapy replace classical drug therapy? The answer is absolutely
certain: no. The technique is very laborious and each patient needs an individually
adapted therapy. Moreover, the results to date have been a bit disappointing and
sometimes devastating. Fatalities have been observed in the gene therapy of
pediatric leukemias. Gene therapy will conquer a place in the therapy of special
diseases because it is a curative and not a symptomatic therapy. With increasing
experience and better appraisal of the possible risks, interventions into the human
genome will become acceptable for such diseases because it would make it possible
to eliminate the genetic disease for the individual and his or her offspring once and
for all and eradicate it from the world.
Gene technology not only solves problems, it creates new ones too. The techni-
cal barrier to the creation of a Homo perfectus is as low as it has ever been in the
history of humanity. The door to possible misuse has been opened wide. We can
only hope that ethics and common sense prevent this from happening. Draconian
legal regulations damage the beneficial use of gene technology more than they
contribute to the prevention of misuse. Those in positions of responsibility have recognized
this and have established a framework in which gene technology can further
develop for the good of humanity.

12.15 Synopsis

• Gene technology has developed into a key technology in modern drug research
because it allows the production of pure proteins and targeted mutagenesis to
elucidate functional and mechanistic properties of proteins or to confirm or
disprove binding modes, produces animal models by knocking particular genes in
or out, allows genes to be activated or silenced, and enables somatic
individual gene therapy.
• The elucidation of the genetic code, the recombinant production of genes and
gene products, and the polymerase chain reaction were milestones in the estab-
lishment of gene technology.
• Sequencing of the human genome revealed the constitution of our genes, the
number of gene products, and many functional insights. Meanwhile hundreds of
genomes of other species have been sequenced, and the genome analysis of
individuals is on the horizon.
• The human genome contains about 25,000 genes of which about 22,000–23,000
are translated into proteins. Some sequence segments are transcribed into non-coding
RNAs that accomplish important functions in the organism (e.g., in the ribosome or
spliceosome). About 95% of the genome contains numerous sequences and
signals that control the regulation of the genome. A functional classification of
the gene products has been accomplished for a significant portion of the genome.
• To study the relevance of blocking the function of a gene product, that is,
a protein in a disease situation, a particular gene can be knocked out in an
animal model, mostly in mice. Genes can also be knocked in. Such turning on
and off of genes is of utmost importance in drug research because it provides
decisive information about the relevance of a planned therapeutic intervention.
• In vitro models for drug screening could only be developed once proteins could
be produced in pure form and high yield. Various expression systems from
bacterial to mammalian cells can be used for the production of foreign
proteins, which are brought into cells via the corresponding coding DNA.
• Genes can be silenced by RNA interference. For this, small amounts of double-stranded
RNA, usually produced by the enzyme dicer, are incorporated into the
enzyme complex RISC. RISC uses one strand of the RNA dimer segments as
a template to capture mRNA molecules with a complementary sequence and
cleaves them sequentially. By doing this, mRNAs with particular sequences are
eliminated.
• To copy this principle for therapy, one needs about 22-base RNA molecules that
have to be transported across the membrane into cells, a difficult task with fragile
and highly polar species. Furthermore, these molecules can cause an unwanted
immune response. Chemical modifications of the RNA molecules are aimed at
improvements in the transportation, immunogenicity, and stability properties.
• The proteome reflects the totality of all proteins in a cell at a given time under
precisely defined conditions. Its composition changes dynamically and differs
between healthy and diseased states or under the influence of therapeutic
treatments.
• The proteome can be analyzed at any given time by 2D gel electrophoresis; this
combines a separation by isoelectric focusing and SDS-PAGE analysis. Differ-
ences in the expression patterns indicate the involvement of proteins in a disease
situation. Back regulation under drug administration can indicate a possible
therapeutic strategy.
• Pull-down experiments with immobilized drug molecules on a chromatographic
solid support allow trapping of proteins that show interaction with the studied
drug molecules. Interaction profiles for drug molecules in the cell can be
determined.
• Biomolecules can be immobilized on microarray chips. In particular RNA, DNA,
and oligonucleotides thereof are anchored on these chips to extract the
complementary RNA or DNA sequences from large mixtures. By appropriate
fluorescence labeling of the anchored bait sequences and the target sequences
to be “fished,” binding can be easily detected and recorded in automated fashion.
With this, expression patterns of cells can be studied.
• Polymorphisms, particularly single nucleotide polymorphisms (SNPs), are vari-
ations in the composition of the genome of a species. These changes make
individuals different, and some SNPs confer susceptibility or resistance to
diseases or influence the cellular response to a drug.
• Differences in the individual genomes might be the key to a tailored individual
and personalized drug therapy and can allow a susceptibility to a particular
disease pattern to be recognized. Intolerance to a given drug therapy could
become transparent or classification of an individual into different metabolizer
classes could be achieved.
• Genetic differences can be a reason for the development of diseases. In some
cases, they are caused by single amino acid exchanges in one gene product (e.g.,
sickle cell anemia), in other cases multifactorial genetic causes are responsible
for the disease development.
• Epigenetics regulates the transcription process not by altering the genetic
sequence of DNA but by regulating the reading of the genes from the DNA.
Lifestyle, experience, and environment exert their effect on the genes through
the epigenome.
• Methylations and acetylations transmit additional epigenetic information in
a reversible manner. Either the bases of DNA are directly methylated or the
packing density of stored DNA on the histone proteins is altered, making it more
or less accessible to the reading apparatus. The latter process modifies the
charges of positively charged Lys and Arg residues involved in packing via the
transfer of acetyl groups.
• Gene therapy tries to replace a defective or missing gene in the cells of
a patient. This would make it possible to eliminate the genetic disease for the
individual and his or her offspring. A nucleic acid segment is inserted into the
genome via viral carriers, and it codes for the protein that is to be substituted in
the patient. Gene therapy opens opportunities in special disease situations but it
also has its risks.
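The RNA interference entry in the synopsis above, in which one strand serves as a template to capture mRNA with a complementary sequence, can be pictured as a string search. The Python sketch below is a deliberately simplified illustration: it requires perfect complementarity over the full guide length, which is an assumption of this sketch, and the function names and any example sequences are hypothetical.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs with `rna`, written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(guide: str, mrna: str) -> list[int]:
    """Return 0-based start positions in `mrna` that pair fully with `guide`."""
    target = reverse_complement(guide)
    mrna = mrna.upper()
    sites = []
    start = mrna.find(target)
    while start != -1:
        sites.append(start)
        start = mrna.find(target, start + 1)  # allow overlapping matches
    return sites
```

An mRNA that contains the reverse complement of the guide strand is reported with the position of that site, and an mRNA without such a site yields an empty list, which corresponds to a transcript that would be left untouched in this simplified picture.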

Bibliography

General Literature

Cooper NG (ed) (1994) The human genome project. Deciphering the blueprint of heredity.
University Science, Mill Valley
Kiely JS (1994) Recent advances in antisense technology. Ann Rep Med Chem 29:297–306
Lander ES et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921
Monastersky GM, Robel JM (eds) (1995) Strategies in transgenic animal science. Blackwell
Science, Oxford
Mullis KB, Ferré F, Gibbs RA (eds) (1994) The polymerase chain reaction. Birkhäuser, Boston
Pandit SB, Balaji S, Srinivasan N (2004) Structural and functional characterization of gene
products encoded in the human genome by homology detection. IUBMB Life 56:317–331
Post LE (1995) Gene therapy: progress, new directions, and issues. Ann Rep Med Chem 30:219–226
Slagboom PE, Meulenbelt I (2002) Organisation of the human genome and our tools for identi-
fying disease genes. Biol Psychol 61:11–31
Venter JC et al (2001) The sequence of the human genome. Science 291:1304–1351
Wolff JA (1994) Gene therapeutics. Methods and applications of direct gene transfer. Birkhäuser,
Boston

Special Literature
Adams MD et al (1995) Initial assessment of human gene diversity and expression patterns based
upon 83 million nucleotides of cDNA sequence. Nature 377(Suppl 6547):3–174 (85 authors
including JC Venter)
Carlton JM et al (2007) Draft genome sequence of the sexually transmitted pathogen Trichomonas
vaginalis. Science 315:207–212
Chang MW, Barr E, Seltzer J, Jiang Y-Q, Nabel GJ, Nabel EG, Parmacek MS, Leiden JM
(1995) Cytostatic gene therapy for vascular proliferative disorders with a constitutively active
form of the retinoblastoma gene product. Science 267:518–522
Craig C (1995) Bristol-Myers to Pay $2.7M for transgenic goats that make human antibodies.
BioWorld Today 6:1
Explore the Homo sapiens genome. http://www.ensembl.org/Homo_sapiens/index.html
Fleischmann RD et al (JC Venter et al) (1995) Whole genome random sequencing and assembly of
Haemophilus influenzae Rd. Science 269:496–512
Human genome database with functional predictions
Schneiker S et al (2007) Complete genome sequence of the Myxobacterium Sorangium
cellulosum. Nat Biotech 25:1281–1289
Seide RK, Giaccio A (1995) Patenting animals. Chem Ind 16:656–659
Sippl W, Jung M (eds) (2009) Epigenetic targets in drug discovery. In: Mannhold R,
Kubinyi H, Folkers G (eds) Methods and principles in medicinal
chemistry, vol 42. Wiley-VCH, Weinheim
13 Experimental Methods of Structure Determination

In this chapter we want to turn to the experimental structure determination methods of
ligands and proteins. There are two techniques in particular that deliver information
about the three-dimensional structure of small organic molecules all the way to
proteins: crystal structure analysis and high-resolution NMR spectroscopy. The
first technique is the older method. It goes back to an experiment of Max von Laue in
1912. It was just 17 years earlier that Wilhelm Röntgen had discovered an electro-
magnetic radiation, which was later named X-rays, or “Roentgen rays” in German in
honor of him. Together with his collaborators Walter Friedrich and Paul Knipping,
Laue was able to demonstrate the wave nature of X-rays with a copper sulfate crystal.
At the same time they proved the lattice structure of crystals. Only one year later
William Lawrence Bragg and his father William Henry Bragg reaped the rewards of
these experiments. They determined the crystal structure of sodium chloride. The
technique has grown over the years. Today the structures of proteins with 4,000
amino acids have been determined. In recent years electron microscopy has proven
to be a very powerful tool for the structure elucidation of
membrane-bound proteins and viruses. NMR spectroscopy is likewise a relatively
young technique. In 1945 the research groups of Felix Bloch and Edward Purcell in the
USA independently observed the resonance absorption of hydrogen nuclei in a magnetic field
for the first time. From this experiment, the technique has grown, mostly due to
progress with the instrumentation, to the extent that the structure determination of
proteins with more than 800 amino acids has been accomplished. For this purpose,
however, the protein must be extensively labeled with different isotopes.

13.1 Crystals: Aesthetic on the Outside, Periodic on the Inside

The term “crystal” immediately brings to mind well-formed minerals or
sparkling gemstones with a magnificent cut. The association of crystals with the
structures of the molecules that determine our lives only occurs to us as an
afterthought. Crystals are typically associated with “dead” material. When Jack Dunitz
took over his chair as professor of organic chemistry at the ETH in Zurich at the end

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_13, 265


© Springer-Verlag Berlin Heidelberg 2013

of the 1950s, the famous natural product chemist Leopold Ruzicka dismissively
told him that crystals are a “chemical graveyard.” Nonetheless, Dunitz and his
research group showed over many years that a crystal in no way belongs in
a “graveyard,” but rather is the key to understanding the structure, dynamics, and
reactivity of molecules.
If a mineral is considered, the regular construction of the single crystals stands
out. Even organic materials have the ability to form shapely crystals. One must only
think of the fascinating crystals of candied sugar. Is this external regularity
a representation of the inner structure? Before this question is answered, the way
that crystals are obtained should be clarified. A mineralogist has it easy. Nature has
already provided well-formed crystals over thousands or millions of years. Organic
molecules and proteins rarely occur in Nature in a crystalline state. Conditions must
be found under which they crystallize.
In general, crystals are grown from a solution. For simple organic substances this
can also be accomplished from liquid material or by sublimation. Both crystalliza-
tion methods are known from water when a lake freezes to ice, or from beautiful
crystals of frost. For crystallization from solution a solvent is sought in which the
compound is adequately soluble. By changing the conditions, the saturation point
of the solution is exceeded. If this occurs slowly, small crystal nuclei form that can
grow into large crystals. As a rule the solubility of a compound decreases with
falling temperature, so the saturation point of the solution can be exceeded by
lowering the temperature. The solution can also be concentrated, that is, some of
the solvent is removed. Another possibility is the addition of a second solvent in
which the compound is less soluble. If the ratio of the two solvents is correctly
chosen, the saturation point can be slowly approached. For compounds with acidic
or basic groups, pH conditions can be found under which the compound exists as
a salt. Because of strong ionic interactions the salts often form better crystals. Compounds
can also be “salted out.” For this, a salt, for example, sodium chloride, is added to an
aqueous solution of the compound. The salt “uses up” water molecules as it goes
into solution because its ions become surrounded by a solvation shell of water molecules. In
doing so, water is withdrawn from the organic compound, which is itself surrounded by
a shell of water molecules. The saturation point of the compound is
exceeded, and crystallization begins.
Proteins are complex entities that, as a general rule, are only soluble in water.
Because of their amino acid composition, they carry charged ionic groups on their
surfaces. Even with proteins it holds true that conditions must be found under which
they associate in a periodic array. This is accomplished by slowly changing the
amount of water in which the protein is dissolved. This can work in both directions.
Hydrophobic proteins begin to aggregate when the amount of water increases.
Proteins that have stronger polar groups on their surfaces aggregate when the
water molecules are removed from their surfaces. Adjusting pH to find the right
value, the choice of suitable salt for salting out, and different temperatures are the
conditions that must be optimized. In addition to salts, surface-active substances
(detergents) can also influence the solvent shell and support the crystallization.
Despite this, crystallization is a kind of fine art. The search for suitable conditions

Fig. 13.1 Paving stones cover a surface without leaving holes (a). This is only possible if they are
derived from a particular basic geometric pattern, for instance a parallelogram, rectangle, square,
triangle, or hexagon. This basic pattern can be modulated by complementary bulges and recesses.
A path cannot be covered without holes if regular pentagons or octagons are used. If an
octagonal stone is combined with a square stone, however, the surface is completely covered. It
is immediately clear that if a square stone is cut along its two diagonals, four triangles result.
By adding four such pieces, an octagon can be completed to a square (b).

requires creativity and diligence. Today, however, the crystallization methods are
so elaborate that the tedious work of setting up thousands of different test condi-
tions is carried out by robots.
Sometimes considerable effort must be invested in structure determination. In 1995,
the crystallization and structure determination of HIV integrase, one of the key
enzymes in the replication cycle of the virus, was accomplished only after the
40th point mutation of the original protein. These point mutations were made with the
goal of changing the surface properties of the protein so that an orderly aggregation
into a crystal could occur.
Let us return to the original question of whether the orderly outward appearance of
a crystal is a reflection of the internal construction. Chemically, a crystal is
homogeneously composed. The organic molecule or the protein represents the basic
building block. It is only when these building blocks are spatially neatly organized
that a periodic array occurs that optimally fills the space. In daily life, many solutions
to these packing problems are easily seen, for example, sugar cubes that only fit into
the box if they are layered in the right direction, or paving stones that must be neatly
laid in a periodic fashion to completely cover the path without gaps (Fig. 13.1).
A single paving stone, when correctly fitted to the next, represents a repeating unit
in the lattice. A crystallographer refers to this unit as the elementary cell or unit cell, and to the
orderly placement of one unit next to another as periodic translation. In the
simplest organic crystal structure, the elementary cell contains one molecule (Fig. 13.2).

13.2 Just Like Wallpaper: Symmetries Govern Crystal Packings

The contents of an elementary cell can also be more complexly composed, for
example, like a wallpaper pattern. A basic motif is repeated so that it fills the surface
area. Crystallographers call the basic motif the asymmetric unit. In Fig. 13.3 this
motif is a flower branch. Not all of the motifs can be generated simply by shifting

Fig. 13.2 In the simplest case, the molecular packing is generated purely by
shifting the molecule in all three spatial directions. The resulting unit, the elementary cell, is
an irregularly angled body, a parallelepiped (above right, violet). If a point near the
molecule is picked out and all of the molecules in the crystal packing are connected by this point,
a three-dimensional lattice results.

Fig. 13.3 An area can be covered not only by purely shifting an object, the asymmetric unit.
Additional symmetry operations such as reflection and rotation can also be used. This way
multiple copies of the object are generated. In the presented case, the flower branch along with
its mirror image makes up the unit (the elementary cell is outlined in red) that can be used to
cover the surface simply by shifting it regularly.

the branch, some must be additionally reflected. A pair of image and mirror-image
branches represent the elementary cell. The surface can now be filled with this
building block by simply shifting it. In addition to reflecting, basic motifs can also
be rotated. By using reflections and rotations, both so-called symmetry operations,
the contents of the elementary cell are generated from the asymmetric unit. This cell
is stacked in all three spatial directions to form an ordered crystal lattice.
Even as a three-dimensional entity, the elementary cell must take on a particular

form to completely fill all of the space. If the basic types of elementary cells are
combined with all of the possible symmetry operations, 230 possibilities result for
the basic motif to fill the space. The crystallographer calls them the 230 space
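groups. Enantiomerically pure chiral molecules, and proteins belong to this group, cannot
crystallize with mirror reflections or any other symmetry operation that would convert a molecule
into its mirror image. Therefore proteins only crystallize in 65 of these space groups.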
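
The way in which symmetry operations and lattice translations build up a crystal from the asymmetric unit can be sketched in a few lines of Python. The atom positions are invented fractional coordinates, and the twofold screw axis (the operation of the space group P2₁, which contains no mirror operation and is therefore available to proteins) is chosen here only as an illustrative assumption; the text itself does not single out a particular space group.

```python
from itertools import product

# Hypothetical asymmetric unit: fractional coordinates (x, y, z) of three atoms.
asymmetric_unit = [(0.10, 0.20, 0.30), (0.15, 0.25, 0.35), (0.40, 0.05, 0.60)]

def identity(p):
    x, y, z = p
    return (x, y, z)

def screw_axis_2_1_along_b(p):
    """Twofold screw axis along b: rotate by 180 degrees, then shift by half a cell edge."""
    x, y, z = p
    return (-x, y + 0.5, -z)

def wrap(p):
    """Bring fractional coordinates back into the cell, i.e. into the range [0, 1)."""
    return tuple(round(c % 1.0, 6) for c in p)

def unit_cell_contents(atoms, operations):
    """Apply every symmetry operation to every atom of the asymmetric unit."""
    return [wrap(op(a)) for op in operations for a in atoms]

def crystal_block(cell_atoms, n):
    """Stack n x n x n copies of the unit cell by pure lattice translation."""
    return [(x + i, y + j, z + k)
            for i, j, k in product(range(n), repeat=3)
            for x, y, z in cell_atoms]

cell = unit_cell_contents(asymmetric_unit, [identity, screw_axis_2_1_along_b])
block = crystal_block(cell, 2)
print(len(cell), "atoms per unit cell,", len(block), "atoms in a 2 x 2 x 2 block")
```

Applying the screw operation twice moves an atom by exactly one cell edge, which is why two operations suffice to describe the contents of the cell in this example.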

13.3 Crystal Lattices Diffract X-Rays

Max von Laue used crystals to prove the wave nature of X-rays (Roentgen rays) by
diffracting them. For illustration, we shall consider a water wave. When a drop of
rain strikes a puddle, circular waves form that propagate from the center outward.
The drop generates a so-called elementary wave upon submersion. If two drops that
are separated by a particular distance simultaneously strike the water’s surface,
circular waves propagate outwardly from both submersion points. The experiment is
easier to observe if the water’s surface is constantly being “excited,” for
instance, with a constantly dripping tap. The circular outwardly spreading wave
fronts meet each other at some point. What happens? A lamellar pattern forms: parts
of the water’s surface remain at rest while other parts move vigorously
(Fig. 13.4). In cross section the water surface moves sinusoidally (Fig. 13.5).
How do two waves behave that collide and superimpose with one another? If the
wave peak and another wave peak or the wave trough and another wave trough
meet, the wave is amplified. If, on the other hand, a wave peak meets a trough, they
cancel one another out. The water surface remains calm. The lamellar pattern of
moving and still water surface between waves that are moving outwardly and
inwardly is caused by this superimposition. It is called interference. The band
density depends on the distance between the submersion points of the drops. The
ensuing interference pattern therefore contains information about the relative posi-
tion of the points from which the elementary waves were generated.
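
The amplification and cancellation described above can be checked numerically. The following minimal sketch adds two sine waves of equal amplitude and wavelength that differ only in their phase; the function name and the sampling are illustrative choices, not part of the original text.

```python
import math

def superpose(amplitude, phase_shift, n_samples=1000):
    """Add two sine waves of equal amplitude and wavelength that differ only in phase.

    Returns the peak height of the summed wave, sampled over one wavelength.
    """
    peak = 0.0
    for i in range(n_samples):
        x = 2 * math.pi * i / n_samples
        total = amplitude * math.sin(x) + amplitude * math.sin(x + phase_shift)
        peak = max(peak, abs(total))
    return peak

cases = [("in phase", 0.0),
         ("shifted by half a wavelength", math.pi),
         ("shifted by a quarter wavelength", math.pi / 2)]
for label, shift in cases:
    print(f"{label}: resulting amplitude = {superpose(1.0, shift):.3f}")
```

Waves in phase double the amplitude, a shift of half a wavelength cancels it, and any other shift gives a value in between.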

Fig. 13.4 Two raindrops strike the surface of the water and form circular, outwardly moving
water waves. These superimpose on one another to give a band-formed interference pattern. There
are areas along these bands where the water surface is quiet. In other areas it moves that much
more strongly.

Fig. 13.5 The waves run in a sinusoidal manner in cross section. The distance between two wave
peaks is called the wavelength. The height of the water wave at the summit is called the amplitude.
The position at which the wave crosses the resting position determines the phase. (a) If two wave
trains with the same phase meet, they add to one another and the amplitude doubles. This situation
occurs at the places in Fig. 13.4 where the water’s surface moves more strongly. (b) If there is a phase
difference of exactly one half of a wavelength, the wave peaks meet with the troughs. Both waves
cancel one another out. This represents the parts of Fig. 13.4 where the water surface is very still.
(c) Any other superimposed phase shift causes a wave, the amplitude of which is somewhere
between the extremes in (a) and (b).

If parallel water waves (e.g., a wave front at the coast) collide with a barrier that
has a small opening (e.g., a harbor entrance) semicircular waves spread outward
from the backside. If this barrier has two neighboring openings (double slit),
a semicircular wave develops behind each opening. The same picture as with the
two raindrops is achieved (Fig. 13.4). The waves interfere with one another behind
the double-slit barrier, and a diffraction pattern forms. The density of this pattern,
that is, the progression of the bands, depends on the geometry of the double slit.
Formally, the diffraction process at a crystal lattice is analogous. The same
principles are valid, but the superimposition is more complex. A very simple lattice
shall be considered that only has one type of atom. An X-ray beam approaches this crystal
as a plane wave. It collides with an array of atoms and initiates an interaction that
is comparable to that between the raindrop and the puddle. Each atom generates
a spherical wave because of the interaction between the atom’s electrons and the
X-ray. The circular wave on the water’s surface represents therefore the spherical
wave in space. The spreading spherical waves superimpose on one another and
form a wave that leaves the crystal in a changed direction (Fig. 13.6). Formally
seen, the incoming and outgoing waves have an angular relationship to one another
that is equivalent to the reflection of the wave in a plane perpendicular to the

Fig. 13.6 If a wave front (blue) in one plane meets with a row of atoms (black points on the dotted
lines), each atom in this row becomes the starting point for a circular wave. This is analogous to
those created when the raindrop hits the surface of a puddle. The circular waves that formed from
the back row of atoms superimpose upon one another just as in the case with the water waves
(Fig. 13.4). All circular waves are generated with the same phase in the indicated direction of the
incoming wave (a). As a result of this superimposition, a new wave front forms (red) that leaves
the crystal in an altered direction. Relative to the direction of the incoming wave, it makes an
angle that is formally a reflection of the incoming wave front at the atom row that is marked with
the green line. If a different incoming direction is taken, the circular waves are not generated
with the same phase (b), that is, there is a phase difference between them. Their superimposition
does not lead to a new wave front.

considered atom row. Therefore, the diffraction of the three-dimensional crystal
lattice can be treated formally as a reflection at a plane in the lattice.
Many sets of parallel lattice planes can be laid through a crystal. They differ in
their spacing and in how densely they are occupied by
atoms (Fig. 13.7). The reflected waves contain the information about the geometry
(spacing) and the relative occupancy (scattering power) of these planes. To record the
diffraction properties of a crystal, each set of parallel planes of the crystal must be
oriented in the X-ray beam so that a reflection is possible. This laborious work is
taken over by a computer-controlled diffractometer.
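
The condition under which a set of lattice planes “reflects” is commonly written as Bragg's law, nλ = 2d sin θ, where d is the spacing of the planes, λ the wavelength, and θ the angle between the beam and the planes. The law is not spelled out in the text above, so the following sketch is an addition for illustration; the wavelength of 1.5418 Å (copper Kα radiation) is an assumed value.

```python
import math

def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Glancing angle theta in degrees from n * lambda = 2 * d * sin(theta).

    Returns None when no reflection is possible (n * lambda > 2 * d).
    """
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

CU_K_ALPHA = 1.5418  # Å; assumed wavelength (copper K-alpha), not taken from the text

for d in (10.0, 3.0, 1.5, 0.7):
    theta = bragg_angle_deg(d, CU_K_ALPHA)
    if theta is None:
        print(f"d = {d:4.1f} Å: no reflection possible at this wavelength")
    else:
        print(f"d = {d:4.1f} Å: theta = {theta:5.2f} degrees")
```

Closely spaced planes reflect at large angles, which is why the reflections far from the center of a diffraction pattern carry the fine detail of the structure.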

13.4 Crystal Structure Analysis: Evaluating the Spatial Arrangement and Intensity of Diffraction Patterns

To demonstrate that different lattices indeed generate different diffraction patterns,
a simple experiment should be considered. For this purpose a laser pointer and
different pinhole masks are needed. The pinhole masks can easily be made. A black-and-white
printout of the periodic patterns shown in Fig. 13.8 can be photographically reduced
and transferred to high-resolution photographic film. This homemade mask
represents a two-dimensional periodic lattice. The laser beam is diffracted by the pinhole
mask and generates the diffraction patterns on a screen that are shown in Fig. 13.8.

Fig. 13.7 A cluster of parallel planes can be laid through the atoms of a crystal lattice (a, b, c).
Their relative distance from one another and their atomic occupation density varies. Each one can
give rise to “reflections” in an X-ray diffraction experiment. For this the crystal must be brought
into the correct orientation for the incoming beam each time. The X-ray counter is positioned so
that it captures the out-going X-ray beam. It is from this geometry that the spatial orientation of the
cluster of planes in the crystal is determined. The occupation density of the atoms decides how
“well” a particular plane cluster reflects. This information is contained in the intensity (amplitude)
of the outgoing wave. (d) Different types of atoms in a molecular crystal have different spatial
relationships to one another. A parallel cluster of planes can be placed through each atom in the
molecule (here a three-atom molecule). The amplitude of the outgoing beam results from the
superimposition of the wave trains that are reflected at these planes.

In the first two masks the distance and symmetry of the holes are changed.
In the third and fourth masks the repeating motif of three or five differently
sized holes represents a molecule that has two types of atoms. These motifs
produce a periodic lattice when lined up next to each other. The lattice has the same
dimensions as that in the first image on the left. If the diffraction pictures are
compared, the distribution of the intensity of the light points is different. That is

Fig. 13.8 A perforated mask can be used for a diffraction experiment with a laser pointer. For this the displayed hole patterns (above) must be brought to the
size of the wavelength of laser light. The diffraction patterns below were generated from the masks. The holes in the two left masks are all the same size, which
is comparable to having only one type of atom. The hole pattern changes from wide-meshed squares to an angular orientation. The diffraction patterns reflect
the symmetry and distance of the holes to one another. In the two masks on the right, the distance between the repeating units is identical to the first masks. The
composition of the motif in the repeating unit, however, varies. It is made up of multiple holes and can be compared to the different atoms in a molecule. The
distance between the diffracted light reflections (lower row) is identical for the first, third, and fourth masks. The intensity of the diffracted radiation, however,
varies from reflection to reflection. It contains the information about the composition and the geometry of the original motif.

what contains the information about the construction of the motif that generated the
lattice. It is just this information that is used to determine the crystal structure.
The reflections, that is, the intensity of the individual light points in the
diffraction pattern, contain the information about the form of the molecule. There
is a mathematical technique, the Fourier transform, which can be used to translate
the diffraction pattern back to the generating motif. A Fourier transform is the
superimposition of many sine and cosine functions. The intensity of the diffraction
reflections determines the contribution of the functions, as does the phasing. The
importance of these aspects was already underscored in the interference of the waves
(Fig. 13.5). Unfortunately just this information about the relative phasing is lost in the
diffraction experiment. The diffractometer only registers the intensity of the reflec-
tions. The missing information is referred to as the phase problem of crystal
structural determination. It must be reconstructed for the individual reflections
by computational methods and by using appropriate measuring conditions. Fre-
quently large electron-rich elements (e.g., heavy-metal ions) are embedded in the
protein (e.g., by coordination to histidine or cysteine). These heavy atoms dominate
the diffraction pattern, and in doing so, they betray their position in the crystal lattice.
Another method takes advantage of the so-called anomalous scattering. This effect is
based on the interaction of X-rays with the electrons of heavy atoms in particular.
This leads to the situation that a spherical wave that is propagating toward an atom is
reflected with a phase shift. Simply stated, it is returned with a delay. The effect is
dependent on the wavelength and can be exploited to determine the phasing. The
crystal is measured at a synchrotron (a particle accelerator that also produces electromagnetic
radiation in a broad wavelength range, including X-rays), and the diffraction
experiment is carried out at several different wavelengths. Anomalous scattering
requires that a heavy atom is contained in the protein structure. This is already
the case for metalloproteins. Often another approach is taken. Proteins that are
produced in a special expression system (▶ Sect. 12.6) can be generated with
selenomethionine instead of methionine. The heavier selenium serves as an anoma-
lous scatterer in the diffraction experiment. There are methods for small molecules
that allow a straightforward reconstruction of the phase information from the inten-
sity distribution, the so-called “direct methods.” The development of such methods is
being worked on for protein structural determination. Often an already-solved,
related protein structure can be utilized as a starting model for a structure determi-
nation (molecular replacement method). The model is translated and rotated in the
elementary cell by computer simulations until a calculated diffraction pattern is
produced that matches the diffraction pattern of the unknown protein.
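
The role of the phases can be made tangible with a one-dimensional toy crystal. The sketch below assumes two invented point atoms in a cell of unit length and uses the standard structure factor sum F(h) = Σ f·exp(2πi·h·x); the density is obtained up to a scale factor by summing the corresponding waves. None of the names or numbers come from the text. The synthesis is carried out once with the correct phases and once with the amplitudes alone, which is all that a detector records.

```python
import cmath
import math

# Hypothetical 1D crystal: (fractional position, number of electrons) per atom.
atoms = [(0.20, 6.0), (0.55, 8.0)]
H_MAX = 10  # highest diffraction order included in the synthesis

def structure_factor(h):
    """Sum of the waves scattered by all atoms for diffraction order h."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)

def density(x, factors):
    """Fourier synthesis at fractional position x (electron density up to a scale factor)."""
    return sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items()).real

true_F = {h: structure_factor(h) for h in range(-H_MAX, H_MAX + 1)}

# What the experiment delivers: amplitudes only, the phases are lost (set to zero here).
amplitudes_only = {h: complex(abs(F), 0.0) for h, F in true_F.items()}

grid = [i / 100 for i in range(100)]

def highest_peak(factors):
    """Grid position with the highest density."""
    return max(grid, key=lambda x: density(x, factors))

print("highest peak with correct phases:", highest_peak(true_F))
print("highest peak without phases:     ", highest_peak(amplitudes_only))
```

With the correct phases the highest maximum sits on the heavier atom. Without them the same amplitudes produce a map whose highest peak lies at the origin, where no atom is located.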
The phasing obtained at the beginning of the structural analysis with this method
is only approximate. Altogether the recovery of the phase information is not
trivial. As late as the 1960s, phasing calculations could keep a scientist busy for several
years. Methodological progress and the increased performance of computers now
allow this to be accomplished in a few minutes. Even today, however, this step can
still be very challenging for proteins. Nevertheless, the
structure determination of medium-sized proteins is becoming routine. Historically,
the time span from crystallization to structure determination could be quite long.

Urease is certainly a curiosity. It was the first enzyme to be successfully crystallized.
James B. Sumner accomplished this back in 1926. Its 3D structure, however, was
first elucidated in 1995, that is, almost 70 years later!

13.5 Diffraction Power and Resolution Determine the Accuracy of a Crystal Structure

A picture of the contents of the unit cell is the result of the Fourier transform. It
is portrayed in terms of the electron density in space (Fig. 13.9). The detail with
which the electron density can be determined depends on the spatial resolution
with which the diffraction pattern was measured. In relation to the Fourier trans-
form, this is a question of the number of different wave fronts that were
superimposed upon one another in the correct amplitude and phase. It can be seen
in the diffraction pattern created with the laser beam (Fig. 13.8) that the intensity
clearly weakens toward the edges. The extent to which the diffraction pattern is

Fig. 13.9 View of a crystal structure of aldose reductase (▶ Sect. 27.4). The electron density (the
so-called 2Fo–Fc density at the 1σ level) is displayed as a blue mesh at the predefined contour level
around a tryptophan residue. In (a) the diffraction data were obtained at a resolution of 4 Å, and
a Fourier transform was used to calculate the electron density. The resolution increases from (a)
4 Å to (b) 3 Å, to (c) 2 Å, and to (d) 0.66 Å. The resolution in the last-shown density is so
high that hydrogen atoms can be recognized as single density peaks in the difference density map
(Fo–Fc difference density at the 2σ level; positive is yellow, negative is violet). The electron density is so
clearly structured at 2 Å that it is simple to fit the indole building block in place. At 4-Å resolution
this assignment is problematic and can easily lead to errors.

perceivable toward the edges limits the accuracy with which the generating motif can be
spatially resolved. For small organic molecules, a resolution is easily achieved at
which the atoms are visible as distinct maxima in the electron density. If the crystal’s
quality is diminished due to lattice defects or disorder, the resolution is poorer. The
resolution of protein crystals is usually between 1.5 and 3 Å. In the best case,
a resolution is achieved that is in the order of magnitude of a bond length. The
upper limit falls into the range of the cross section of a benzene ring. Resolutions
better than 1 Å, however, have been achieved (Fig. 13.9). In those cases many details are
recognizable, such as single hydrogen atoms or multiple arrangements of side chains.
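
How the number of superimposed waves limits the visible detail can be illustrated with another one-dimensional toy model. It assumes a hypothetical cell edge of 30 Å containing two identical point atoms 1.5 Å apart and includes only those waves whose spacing is not finer than the chosen resolution. Real atoms are smeared out by their electron cloud and by thermal motion, so this idealized sketch is more optimistic than a real electron density map.

```python
import math

CELL = 30.0             # Å, hypothetical one-dimensional cell edge
ATOMS = [0.475, 0.525]  # fractional positions of two identical point atoms, 1.5 Å apart
MIDPOINT = 0.5          # fractional position halfway between the two atoms

def density(x, h_max):
    """Fourier synthesis of the two point atoms using the waves h = -h_max..h_max."""
    total = 0.0
    for h in range(-h_max, h_max + 1):
        for xa in ATOMS:
            total += math.cos(2 * math.pi * h * (x - xa))
    return total

def atoms_separated(d_min):
    """True if the density at an atom exceeds the density midway between the atoms."""
    h_max = int(CELL / d_min)  # the finest wave included has spacing CELL / h_max >= d_min
    return density(ATOMS[0], h_max) > density(MIDPOINT, h_max)

for d_min in (4.0, 3.0, 1.0):
    verdict = "separate maxima" if atoms_separated(d_min) else "one merged maximum"
    print(f"resolution {d_min:.1f} Å: {verdict}")
```

At low resolution the two atoms merge into a single maximum between them; only when waves of sufficiently fine spacing are included does the density dip between the atoms.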
At high resolution the electron density maxima are directly assigned to the atoms
in the molecule (Fig. 13.10). In the beginning this assignment is crude because the phases used
in the Fourier transform are only approximate. The positions of the detected maxima
must still be optimized. This is called “refinement of the structure.” For this the
experimentally observed diffraction pattern is compared with the diffraction pattern
that is calculated from the atomic positions of the preliminary model. If the measure-
ment is very accurate, the density of a “pseudomolecule” with spherical atoms can be
subtracted from the observed electron densities at the end of the structure determina-
tion. What remains is the electron distribution of the bonds between the atoms in the
molecule (Fig. 13.10). This is, however, only possible with very high-resolution
measurements. At lower resolution, as is the case in moderately resolved protein
structure determinations, a direct assignment of the atoms of the protein to the
electron density maxima cannot be made (Fig. 13.11). More commonly the course
of the chains is fitted to the electron density. Because proteins are constructed from
20 different amino acids that prefer to take on typical geometries, the interpretation
of the electron density is simplified (Fig. 13.11). As with low-molecular-weight
structures the model is iteratively refined, and the structural data improved.
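
The comparison between observed and calculated diffraction data during refinement is usually condensed into a single number. One common measure, which is not named in the text above, is the crystallographic R-factor, R = Σ| |Fo| − |Fc| | / Σ|Fo|, where |Fo| and |Fc| are the observed and calculated amplitudes. The amplitudes in this sketch are invented for illustration.

```python
def r_factor(observed, calculated):
    """Crystallographic R-factor: summed amplitude disagreement relative to the summed observations."""
    disagreement = sum(abs(fo - fc) for fo, fc in zip(observed, calculated))
    return disagreement / sum(observed)

# Hypothetical amplitudes for four reflections.
f_observed = [100.0, 50.0, 80.0, 20.0]
f_preliminary = [80.0, 65.0, 70.0, 30.0]  # calculated from a crude first model
f_refined = [97.0, 52.0, 78.0, 21.0]      # calculated after refinement

print(f"R (preliminary model) = {r_factor(f_observed, f_preliminary):.3f}")
print(f"R (refined model)     = {r_factor(f_observed, f_refined):.3f}")
```

A falling R-factor indicates that the atomic positions of the model reproduce the measured diffraction pattern increasingly well.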
Electrons scatter X-rays. Therefore, the number of electrons around an atom
determines how well it is detected in the resulting density. Hydrogen atoms have
only one electron in their shell. As a consequence, they are often not located or are
located with poor accuracy in the electron density. Hydrogen atoms can be recog-
nized as densities in the structure determination of small molecules, but this is only
possible in protein structures if the resolution is better than 1 Å. It is unproblematic as
long as it only concerns hydrogen atoms at positions that correspond to spatially
fixed positions at a rigid molecular scaffold, for instance, hydrogen atoms on phenyl
rings. It is more difficult if the hydrogen atom is on a conformationally flexible
group or groups that can be protonated or deprotonated. It is good to know if
a carboxyl group is ionized, or if it exists as the free acid, and in which direction
the hydrogen atom is oriented. This information can only be indirectly gleaned from
the protein structure through an exact analysis of the spatial orientation of the
surrounding hydrogen-bonding partners.
The accuracy of the structure determination depends on the resolution of the
data that was obtained from a crystal. Even if the structure of the protein is
displayed on the computer screen like that of an organic molecule, its geometry is
much less accurately determined. The error margins in small-molecule determinations
are approximately 0.01 Å for bond lengths, 0.1° for bond angles, and 1°–2° for

Fig. 13.10 Crystals with an edge length of 0.1–0.3 mm are needed for the structure determination
of small organic molecules. (a) A diffraction pattern is obtained in an X-ray beam (compare
Fig. 13.8) that is recorded on a photographic plate or (b) is registered with a diffractometer
counter. The molecule that generated this diffraction pattern, which is periodically arranged
in the crystal, is back-calculated from the reflections. (c) A Fourier transform is carried out
with approximate phasing, and a map of the electron density in space is obtained that is
contoured according to its height. The maxima are assigned to the atoms in the molecule
(here oxalic acid). (d) The spatial blurring of the electron density is associated with thermal
motion of the atoms. It is displayed with ellipsoids that represent the 50% probability of the
occupancy of each atom. (e) Crystals that scatter well allow the determination of the electron
density in the bonds between atoms. (f) The application of symmetry operations generates the
molecular packing in the crystal lattice. It delivers information about noncovalent interactions
between molecules.

Fig. 13.11 (a) The diffraction pattern of a protein crystal shows clearly more reflections. Because the
crystals are made up of larger molecules, the unit cells comprise a bigger volume and exhibit more lattice
planes and therefore more reflections. However, due to the high solvent content and inherent flexibility of
the more complicated macromolecules, the crystals give rise to poorer diffraction quality and the
data are registered to a lower resolution. (b) The enormous data flood is registered with an area
detector on a diffractometer. This allows the simultaneous registration of many diffracted inten-
sities. (c) A Fourier transform performed with phases from the first model delivers the distribution
of the electron density in space (blue mesh). Because no atomic centers are resolved in this density,
the trace of the protein chain (here a segment from a β-sheet of tumor necrosis factor, TNF) is fitted
to the electron density distribution. (d) Similarly to small molecules, the obtained model is refined
until all of the atoms of the protein fit optimally into the density. (e) The color-coded thermal
motion of the molecule is shown over the entire molecule. Blue to yellow to red color changes
show the transition from mild to severe movement. (f) Symmetry operations generate the molec-
ular packing in the crystal lattice. There are “empty” areas that are occupied by numerous water
molecules. Because of the strong thermal motion and the disorder that it causes, they are not found
in the electron density map.

dihedral angles (▶ Chap. 16, “Conformational Analysis”). For protein structures,
significantly larger errors must be assumed, and they are difficult to quantify.
They depend on how the structure was refined. The electron density does not
allow individual atoms to be resolved. Therefore amino acids are placed with
idealized bond lengths and angles in the electron density. Their geometry is kept at
the predefined knowledge-based values during the subsequent refinement. The
assignment of atom types for the placement of the side chains is partially based
on assumptions. Knowledge-based values are used, or attempts are made to keep
the hydrogen-bonding network consistent. These aspects must be considered
when judging the accuracy of a protein structure. The result of a crystal
structure determination is a spatially and time-averaged picture of one
“mean” molecule that represents the whole crystal. Often the
electron density in some areas indicates only a reduced occupancy of a side chain
or of a part of a bound ligand. Furthermore, alternative orientations (conformations)
may be recognizable. Sometimes the electron density for entire areas is missing.
This is indicative of “disorder” and argues for a distribution over multiple
orientations in the crystal. This disorder can be dynamic, that is, the relevant
groups jump back and forth between two or more orientations. Or the disorder is
static, which means that several orientations are present side by side in the crystal.
Because the structure is an averaged picture, these arrangements are scattered
throughout the crystal with different orientations. If a part of the molecule is
entirely disordered, that is, scattered over numerous orientations, its electron
density is usually not visible. Today, to reduce radiation damage, structures
are measured at 100 K in a stream of cold nitrogen gas. At
this temperature many movements in the crystal are frozen and static disorder can
be observed. Despite this, it has been shown that the determined structure
corresponds well to the situation at room or body temperature. These conclusions can
be drawn by comparing the results with the analogous determinations by NMR
spectroscopy (Sect. 13.7) and molecular dynamics simulations (▶ Sect. 15.8).

13.6 Electron Microscopy: Using Two-Dimensional Crystals to Trace Membrane Proteins

Cryoelectron microscopy represents an ideal complement to X-ray structure
determination because it makes the structure of very large membrane-bound proteins
accessible. Electrons are used as the radiation source. They penetrate the
crystalline sample only slightly and are absorbed more strongly than X-rays. Molecules
also scatter electrons much more strongly than X-rays. Therefore much smaller crystals
can be used. Even crystals that are razor-blade thin and made up of only a few
molecular layers are sufficient. Even single molecules can be imaged, but their
molecular mass must exceed several million Daltons. Smaller molecular masses
require periodically organized arrays of multiple molecules. By now,
membrane protein crystals have been successfully grown as two-dimensional
periodic arrays. The attempt to grow crystals of such proteins that
are large enough for an X-ray structure analysis has only worked a few times and
requires very special additives for the crystallization.
In recent times crystallization of membrane proteins has been successful in
lipidic cubic phases. Sophisticated mixtures of lipid, water, and protein can form
structured three-dimensional lipidic arrays that are pervaded by water channels.
Protein molecules diffuse into this structured yet flexible matrix, which facilitates
crystal nucleation and growth.
In addition to the work with readily obtained crystals, electron radiation has
another advantage over X-rays. It can be used for a diffraction experiment as well as
for the direct visualization of an object. Microscopic visualization is unfortunately
not possible with X-rays because a convergent lens cannot be built for
X-rays. It is possible for electrons because they can be focused by using
magnetic fields. Why not use an electron microscope to visualize molecules in
general? Despite the reduced radiation, electrons still damage the samples
considerably. Furthermore the crystals that are used represent about a millionth of the
sample size that is used for X-ray structure analysis. The data for an X-ray
structure can be collected on one single crystal. In contrast, several hundred to
several thousand tiny crystals, often only 5 μm in size, are needed for electron microscopy.
They are shock-frozen under high vacuum and directly exposed to the electron
beam. Proteins can only withstand these conditions after special preparation.
A very low radiation dose is used. Because of this, the images are very
noisy and must be averaged over many observations. To obtain detailed resolution
in the direction perpendicular to the crystal’s plane, the crystal must be
measured in many orientations. Fine structural details are lost in doing this. The
analogous patterns in the electron diffraction diagram, as would be obtained in an
X-ray experiment, can be corrected by computational methods. With the help of
the Fourier transform, an electron density map of the molecule is obtained. Its
interpretation and refinement are accomplished in the same way as for the X-ray
experiment. The phases that are necessary for the transform can be determined
from the images in electron microscopy.
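The benefit of averaging many noisy low-dose observations can be illustrated with a small
numerical sketch. It is not an image-processing routine from electron microscopy; the signal
value, the noise level, and all function names are assumptions chosen for illustration. For
independent noise, averaging N observations is expected to reduce the scatter by roughly
a factor of √N.

```python
import random
import statistics


def noisy_observation(true_value, noise_sigma, rng):
    """One low-dose measurement: the true signal plus Gaussian noise (assumed model)."""
    return true_value + rng.gauss(0.0, noise_sigma)


def averaged_observation(true_value, noise_sigma, n_obs, rng):
    """Average of n_obs independent noisy measurements of the same signal."""
    total = sum(noisy_observation(true_value, noise_sigma, rng) for _ in range(n_obs))
    return total / n_obs


def scatter(true_value, noise_sigma, n_obs, n_trials=2000, seed=0):
    """Standard deviation of the averaged value over many repeated trials."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = [averaged_observation(true_value, noise_sigma, n_obs, rng)
              for _ in range(n_trials)]
    return statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical pixel value 1.0 with noise five times larger than the signal.
    for n in (1, 100):
        print(n, "observations, scatter:", round(scatter(1.0, 5.0, n), 3))
```

With these assumed numbers, a single observation is dominated by noise, whereas the average
of 100 observations scatters about ten times less.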
The technique is relatively young and the methods are developing further.
There is more work to be done. Structural determination still takes several years,
and only a few laboratories have adequately powerful microscopes. Nonetheless,
the knowledge that we have about the structure of membrane-bound receptors
today is often based on the results that were achieved with this method
(▶ Chap. 30, “Ligands for Channels, Pores, and Transporters”).

13.7 Structures in Solution: The Resonance Experiment in NMR Spectroscopy

Many atomic nuclei have an angular momentum, or spin. The nuclei with a nuclear spin that
occur in biological systems are the hydrogen isotope ¹H, the carbon
isotope ¹³C, the nitrogen isotope ¹⁵N, the fluorine isotope ¹⁹F, and the phosphorus

Fig. 13.12 Atomic nuclei with an angular momentum behave like a spinning top. In the absence
of an external magnetic field, they orient randomly in all possible directions (a). Upon application
of a magnetic field, they orient their rotation axes parallel or antiparallel to the direction of the field
(b). The rotation axis precesses on a circle around the applied field direction. The two
orientations, parallel or antiparallel with respect to the direction of the field, are energetically
different. Because of this, there is a small difference in occupancy between the two states. By
applying an electromagnetic field with a frequency that corresponds to the precession frequency
of the top’s axis, the occupancy can be inverted. This resonance absorption, the exact frequency
of which depends on the type of nucleus and its immediate chemical environment, is registered with
a spectrometer.

isotope ³¹P. Just as a top would, these nuclei rotate about their axes. As long as no
magnetic field is applied, the tops orient in all possible spatial directions. In
a magnetic field they are forced into alignment (Fig. 13.12). If a toy top is spun,
it moves in the gravitational field. This field has one preferred direction. If the
rotation axis of the top and the direction of the gravitational field,
which points toward the center of the Earth, are not exactly aligned, the top
wobbles. The end of the rotation axis performs a circular movement with
a very precise rotational speed. This speed depends on the mass and geometry of the top. In
physics this movement is known as precession.
Atomic nuclei with a spin behave in a very similar way. In contrast to the
macroscopic top, they obey the laws of quantum mechanics. This means that the
rotation axes of their precession movement can only adopt very specific
angles with respect to the applied field direction. The result for the ¹H, ¹³C, ¹⁵N, ¹⁹F,
and ³¹P nuclei is that the rotation axis for the precession can only be parallel
or antiparallel to the direction of the field. The orientation in the direction of the
field is energetically somewhat more favorable than the orientation antiparallel to
the direction of the field. Statistically, therefore, more nuclear spins in the substance
sample will align with the direction of the field. If an additional electromagnetic field
is applied on top of the external magnetic field, and its frequency corresponds to the
precession frequency of the nuclear spin, the occupancy of “parallel” and “antiparallel”
spinning nuclei can be reversed and a resonance absorption for the sample can
be registered. After a particular time span, the original situation is restored
(relaxation).
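How small the difference in occupancy between the two orientations is can be estimated with
a short calculation. The following is a minimal sketch, assuming a spin-1/2 nucleus, a Boltzmann
distribution over the two states, and rounded literature values for the gyromagnetic ratios; the
function names and the example field strength are chosen for illustration only.

```python
import math

PLANCK_H = 6.62607015e-34    # Planck constant in J s
BOLTZMANN_K = 1.380649e-23   # Boltzmann constant in J/K

# Gyromagnetic ratios divided by 2*pi in MHz per tesla (rounded literature values).
GAMMA_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": -4.316, "31P": 17.235}


def larmor_frequency_hz(nucleus, field_tesla):
    """Magnitude of the precession (resonance) frequency in Hz."""
    return abs(GAMMA_MHZ_PER_T[nucleus]) * 1e6 * field_tesla


def population_excess(nucleus, field_tesla, temperature_k=298.0):
    """Fractional excess of spins in the energetically favorable orientation.

    Assumes two states separated by h*nu and a Boltzmann ratio
    N_upper / N_lower = exp(-h*nu / (k*T)).
    """
    delta_e = PLANCK_H * larmor_frequency_hz(nucleus, field_tesla)
    ratio = math.exp(-delta_e / (BOLTZMANN_K * temperature_k))
    return (1.0 - ratio) / (1.0 + ratio)


if __name__ == "__main__":
    field = 14.1  # tesla; hypothetical example magnet
    for nucleus in ("1H", "13C"):
        print(nucleus,
              round(larmor_frequency_hz(nucleus, field) / 1e6, 1), "MHz,",
              "excess:", population_excess(nucleus, field))
```

Under these assumptions the excess for ¹H at room temperature amounts to only a few spins per
100,000, and it is smaller still for ¹³C.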
The precession frequency is characteristic for each type of nucleus. It depends
further on the composition of the
chemical environment in which the nucleus resides. A carbon atom of a phenyl
ring has a different resonance frequency than that of an aliphatic chain. The
relative position of the resonance absorption in relation to a standard reference is
called the chemical shift. Furthermore, the individual nuclei can perceive the
spin orientation of the neighboring nuclei. An alignment in the same direction as
a neighboring nucleus is energetically different from an antiparallel
orientation. This influence also modulates the precession frequency of
the observed nucleus. The information about the orientation or the
magnetic state of the nuclei in the vicinity can be transmitted over several bonds.
This transfer can even occur through space without any direct covalent
connection.
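The chemical shift described above is conventionally quoted in parts per million (ppm) of the
reference frequency, which makes it independent of the field strength of the spectrometer.
A minimal sketch of this conversion follows; the function names and the example frequencies
are hypothetical.

```python
def chemical_shift_ppm(resonance_hz, reference_hz):
    """Relative position of a resonance with respect to the reference, in ppm."""
    return (resonance_hz - reference_hz) / reference_hz * 1e6


def offset_hz(shift_ppm, reference_hz):
    """Inverse conversion: frequency offset in Hz for a given shift in ppm."""
    return shift_ppm * 1e-6 * reference_hz


if __name__ == "__main__":
    # Hypothetical example: a signal 4200 Hz above a reference at 600 MHz.
    print(chemical_shift_ppm(600_004_200.0, 600_000_000.0))
```

In this example an offset of 4200 Hz at a reference frequency of 600 MHz corresponds to a shift
of about 7 ppm.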
To measure an NMR (nuclear magnetic resonance) spectrum, a solution of the
substance has to be placed in a strong magnetic field. In addition, a variable
electromagnetic field is applied to the sample. The frequencies at which the nuclei
in the sample are in resonance, that is, at which they flip from parallel to antiparallel,
are recorded. The resulting spectrum discloses information about the composition
and the chemical environment around the studied nuclei. It also contains information
about the spatial structure of the molecules under investigation. Based on the work
of Richard Ernst, multidimensional NMR techniques have been developed in the
last 30 years. By using suitable measuring conditions and selectively irradiating
electromagnetic fields, information about the mutual influence of resonance
frequencies among individual nuclei is separated and analyzed. This mutual
transfer of information about the magnetic state of neighboring nuclei is
apparent from the signal form of multidimensional spectra, in which it is registered
as cross peaks. Only the hydrogen isotope ¹H occurs in nearly 100%
natural abundance. Therefore, it can be assumed that for statistical reasons, two ¹H
nuclei will always be adjacent to each other in a molecule. In contrast, the ¹³C and
¹⁵N isotopes are scarce. As a result, statistically they are only very rarely found in
the direct vicinity of one another. Data on the mutual influence of the magnetization
of these nuclei are required for the spectra. Therefore it is necessary to enrich the
proteins with the appropriate isotopes. For this, bacteria are fed with isotopically
labeled substrates such as glucose or ammonium chloride and then produce
proteins that are isotopically enriched. For the structural investigation of very large
proteins, it is even necessary to produce deuterated proteins. Today, by using
numerous spectroscopic techniques, spectra from proteins of more than 800 amino
acids have been successfully interpreted. The following questions can be addressed
by NMR analysis:
• Which atomic nuclei occur in which chemical environment?
• What is in the immediate, covalently connected neighborhood of these nuclei?
Information about the spatial orientation of atoms in the vicinity is also
contained within these spectral parameters.
• Which geometric relationships are given between different segments of the
polypeptide chain? This results from information transfer about magnetic states
of nuclei that are not directly connected by covalent bonds.
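The statistical argument for isotopic enrichment made above can be checked with a few lines of
arithmetic. The sketch below assumes that isotopes are distributed independently over the
atomic positions and uses rounded natural abundances; the enrichment level of 99% is
a hypothetical value, as are the function and variable names.

```python
# Approximate natural abundances as fractions (rounded literature values).
NATURAL = {"1H": 0.9999, "13C": 0.011, "15N": 0.0037}

# Hypothetical abundances after isotopic labeling of the protein.
LABELED = {"1H": 0.9999, "13C": 0.99, "15N": 0.99}


def pair_probability(isotope_a, isotope_b, abundance):
    """Probability that two neighboring positions carry both isotopes.

    Assumes that the isotopes are distributed independently of one another.
    """
    return abundance[isotope_a] * abundance[isotope_b]


if __name__ == "__main__":
    for a, b in (("1H", "1H"), ("13C", "13C"), ("13C", "15N")):
        print(a, b,
              "natural:", pair_probability(a, b, NATURAL),
              "labeled:", pair_probability(a, b, LABELED))
```

At natural abundance, only about one in ten thousand pairs of neighboring carbon positions
carries two ¹³C nuclei, whereas nearly every pair does after labeling under the assumed
enrichment.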

13.8 From Spectra to Structure: Distance Maps Evolve into Spatial Geometries

This last-mentioned observation, which results from the nuclear Overhauser
effect (NOE), yields intramolecular distances of spatially neighboring but not
directly covalently bound atoms. The entire connectivity, that is, the list of all
covalent bonds within a molecule, and a list of the recorded intramolecular
noncovalent distances are applied to generate the structure for the molecule
(Fig. 13.13). For this purpose, so-called distance–geometry calculations are
used to create the spatial coordinates of the atoms.
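How a cross-peak intensity might be turned into a distance condition can be sketched as
follows. The sketch rests on the commonly used approximation that the NOE intensity falls off
with the sixth power of the distance, which is not stated in the text above, and on calibration
against a cross peak of known distance. The reference values, the tolerance, and the function
names are hypothetical; this is not a distance-geometry program.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Distance estimate from a cross-peak intensity, assuming intensity ~ r**-6."""
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def distance_bounds(distance, tolerance=0.2):
    """Loose lower and upper bounds around the estimate (tolerance as a fraction)."""
    return distance * (1.0 - tolerance), distance * (1.0 + tolerance)


if __name__ == "__main__":
    # Hypothetical calibration: a reference pair at 2.0 Å gives intensity 64.
    d = noe_distance(intensity=1.0, ref_intensity=64.0, ref_distance=2.0)
    print(round(d, 2), distance_bounds(d))
```

Because of the steep distance dependence, a cross peak that is 64 times weaker than the
reference corresponds to only twice the distance. In practice such estimates are usually used
as bounds rather than as exact values, which is why the sketch returns a range.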
Often multiple equally good structural models fulfill the experimentally determined
distance conditions in complex molecules. If the spectral parameters for a
section of the structure are too sparsely distributed, with distances that are too large, it is
very difficult to achieve a unique spatial configuration of the atoms. Therefore, the
generation of a structural model is coupled with molecular dynamics simulations
(▶ Sect. 15.7). These calculations deliver geometries of molecules that represent
energetically favorable 3D structures consistent with the spectral parameters.
Multiple slightly divergent models are obtained in areas with few spectral conditions.
Therefore, NMR spectroscopists always report a bundle of structural solutions
(Fig. 13.14).
Attempts are often made to compare the quality of X-ray and NMR structures.
Both methods measure different properties, and the structures are derived from
different measured variables. This must be considered when making a direct
comparison. The accuracy of an NMR structure fluctuates with the density and
frequency of spectral distance constraints, while that of an X-ray structure mainly
depends on the resolution of the diffraction experiment.

13.9 How Relevant Are Structures in a Crystal or NMR Tube to a Biological System?

The discussed structure determination techniques investigate molecules in a crystal
assembly or in solution in an NMR tube. Are these conditions at all relevant for the
[Figure: multidimensional NMR spectrum with both axes in ppm (10.0 to 0.0); the cross peak between nuclei A and B is marked]

Fig. 13.13 A multidimensional NMR spectrum contains information about the spatial vicinity of
atomic nuclei in a molecule (here, the trypsin inhibitor from bovine pancreas). It is expressed in so-
called cross peaks. Information can be extracted about the distance between non-covalently bound
atoms in a molecule. The individual signals of the spectra are assigned to atoms in the molecule
(e.g., A and B). The positions that these atoms have in the polypeptide chain are known from the
sequence of the protein (above left). The intensity of the cross peak indicates which spatial distance
is found between nuclei A and B in the folded polypeptide chain (above right). Just as was done for
A and B, the many other cross peaks are evaluated and translated into distance conditions.

biological conditions in an organism? Small flexible molecules change their geometry
depending on the environment. They will adopt a different shape in a crystal, in
solution, or in the binding pocket of a protein. Therefore the question can be asked
whether the data from a small-molecule crystal structure are suitable to deliver
information about the molecular geometry in a binding pocket. From the numerous
known crystal structures, by now more than 500,000, some general
principles about the molecular architecture of organic compounds can be deduced.
All of the published crystal structures are electronically archived at the Cambridge
Crystallographic Data Centre in England. They can be retrieved and compared with
one another. It will be shown in ▶ Chaps. 14, “Three-Dimensional Structure of
Fig. 13.14 The accuracy of an NMR structure depends on the density of the experimentally
determined atomic distances. These come from experiments that deliver information about the
exchange of the magnetic state of spatially adjacent, but not directly connected atoms (so-called
nuclear Overhauser effect, NOE). With the connectivity list and the NOE conditions, multiple
structural models are generated. These models represent the low-energy geometries that agree with
the spectral parameters. In the left part of the figure (a) the experimentally measured NOEs (black
dashed lines) are distributed over the 3D structure of a domain of the guanine nucleotide exchange
factor. For the sake of clarity, only the long-range NOEs are shown. Most of the amino acid side
chains are also suppressed; many of these NOEs therefore indicate the positions of atoms that are
not shown. In areas in which very few distances could be determined (e.g., in the green loop areas
or at the termini), the model is ambiguously defined. Multiple models are consistent with the
experimental data (b). The main chain of the protein fans out. In areas where a large number of
NOE conditions are found (e.g., the helices and the central β strand), the structural models diverge
only slightly from one another.

Biomolecules” and ▶ 16, “Conformational Analysis” that valuable information
about possible molecular and interaction geometries is available through a
statistical evaluation of these data, which also provides insights relevant for the
conditions in a protein binding pocket.
Nevertheless, are the structures in a protein crystal too remote from the
conditions in a biological system, much further than, for instance, the
solution-phase state? A good many structure determinations that were carried out
in parallel in solution and in the crystal are available. Experience has shown that the
agreement is usually very good. Deviations are found mainly at the surface of
proteins. There, the amino acid side chains form interactions with the environment.
Therefore, these deviations are not surprising. The crystal packing of tumor necrosis
factor (TNF) is presented in Fig. 13.11. Large holes are conspicuous in the
crystal packing. These areas are filled with water molecules that are so loosely
incorporated into the crystal that they can move freely to a large extent. Therefore,
they cannot be located in the electron density. Channels filled with water in protein
crystals can make up to 70% of the crystal’s mass! Therefore, the crystal can also be
considered as a highly concentrated, ordered solution. NMR measurements also
require high concentrations. They are considerably higher than in biological
systems, but are still 10–100 times lower than in protein crystals.
The high water content of protein crystals offers the possibility to allow small
molecules to diffuse into the crystals. In the water channels, they move as they
would in an aqueous solution. In favorable cases, the binding pocket of the protein
is directly accessible from one of these channels. By placing the protein crystal
directly in a solution of the active substance (soaking), the latter can penetrate the
crystal through the channels, diffuse into the binding pockets, and dock there. Then
a new diffraction experiment is carried out with the loaded crystal. The reflections
are measured, and, based on the known structure of the protein, the electron density
map is generated. The density of the uncomplexed protein is subtracted from that
map. The difference density of the incorporated ligand remains. This information is
of essential importance for understanding the interactions between small molecules
and proteins. The question of whether the experimental structure is really relevant
for the biological conditions has still not been answered. Crystalline hemoglobin is
able to reversibly take up and release oxygen. It could be shown on crystals of
purine nucleoside phosphorylase (PNP) that the enzyme is still catalytically active
in the crystal (Fig. 13.15).
The research group of Malcolm Walkinshaw at the University of Edinburgh
was even able to show, using the enzyme Cyp3, a peptidyl-prolyl isomerase, as an
example, that there is quantitative agreement between the crystalline and solution
states. Different concentrations of an inhibiting prolyldipeptide were allowed to
diffuse into the crystal. Afterward, the occupancy of this inhibitor obtained from
the differently concentrated soaking solutions was determined in a crystallographic
experiment. The binding constants were then ascertained from these
occupancy data. They agreed quantitatively with the inhibition constants that
were determined in a functional assay in solution.
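The step from occupancies to a binding constant can be illustrated with a simple model. The
sketch below assumes a single binding site that follows a simple saturation curve and a free
ligand concentration equal to that of the soaking solution. It is not the evaluation procedure of
the study described above, and all concentrations, occupancies, and function names are
hypothetical.

```python
def occupancy(ligand_conc, kd):
    """Fraction of occupied binding sites for a single-site binding model."""
    return ligand_conc / (ligand_conc + kd)


def kd_from_occupancy(ligand_conc, occ):
    """Dissociation constant implied by one observed occupancy (same units as conc)."""
    if not 0.0 < occ < 1.0:
        raise ValueError("occupancy must lie strictly between 0 and 1")
    return ligand_conc * (1.0 - occ) / occ


def estimate_kd(measurements):
    """Mean of the per-measurement estimates from (concentration, occupancy) pairs."""
    estimates = [kd_from_occupancy(conc, occ) for conc, occ in measurements]
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Hypothetical soaking series (concentration in mM, refined occupancy).
    series = [(0.5, 0.21), (2.0, 0.49), (8.0, 0.80)]
    print(round(estimate_kd(series), 2), "mM")
```

In this model half occupancy is reached when the ligand concentration equals the dissociation
constant, so a soaking series that brackets this point constrains the estimate best.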
Diffraction data can be collected very quickly with even more intense,
so-called white X-rays from a synchrotron source (the so-called Laue technique).
With this experiment, it was possible to observe stable intermediates of enzyme
reactions. Structural changes of the two-dimensional crystals of the acetylcholine
receptor (▶ Sect. 30.4) could be observed with electron microscopy after loading
with the natural ligand. These and other experiments have shown that proteins exist
in the crystal lattice in a form that must be, at the very least, very similar to the
biologically active form.

13.10 Synopsis

• The most powerful methods to determine the spatial structure of molecules are
X-ray crystallography and NMR spectroscopy. The former requires the bio-
molecules to be arranged in periodic arrays in a crystal, and the latter studies
them in solution, usually in an isotopically labeled form.

[Figure: reaction scheme of the PNP-catalyzed conversion of guanosine and phosphate; below, a plot of reaction rate against time, marking when the crystal was soaked and when it was removed]

Fig. 13.15 The enzyme purine nucleoside phosphorylase (PNP) transforms guanosine and
phosphate to guanine and ribose-1-phosphate. If a protein crystal is placed in a solution of the
substrate, the reaction begins. This could also have been caused by a partial dissolution of the
enzyme crystal. If the crystal is removed from the solution, the reaction stops. If the crystal is
brought back into the solution, the reaction carries on. This experiment demonstrates that even
crystalline enzymes are catalytically active. Therefore, a geometry must be present in the crystal
that corresponds to the biologically active form.

• Crystals need special conditions to grow from saturated solutions. The molecules
arrange spatially in periodic arrays and pack through translational symmetry
in three dimensions. In addition to the pure shifting of basic motifs,
usually one molecule that represents the asymmetric unit, symmetry operations
such as mirror reflection, two-, three-, four-, and six-fold rotation, or inversion
can be applied.
• Crystal lattices diffract X-rays and the diffraction experiment can be understood
as a three-dimensional interference of elementary spherical waves generated
at the positions of the atoms in the lattice. The diffraction phenomenon
at a 3D lattice can be treated formally as multiple reflections at crystal planes
in the lattice.
• Because the relative phases of the generated elementary spherical waves,
superimposed in the various reflections, are not accessible by experiment, they
must be regenerated by sophisticated phasing methods. Only then can a Fourier
transform be calculated from the measured reflections that represents the spatial
distribution of the electron density in the crystal. A model of the crystallized
molecules is assigned to this electron density.
• The diffraction power and resolution of the crystals determine the accuracy of the
resolved structure. For proteins, a resolution of 1.5–3 Å is usually achieved. At the
lower end, molecular building blocks such as phenyl rings are well resolved, and
individual water molecules are visible. At the upper limit, only the overall
topology is determined, and the water molecules usually cannot be assigned.
• The crystal structure is an average structure over space and time. Enhanced
B-factors give an estimate of the residual mobility of molecular portions in
a molecule.
• Cryoelectron microscopy is an alternative method to determine the structure of
membrane-bound proteins in particular by diffraction experiments. Data are
collected from thousands of tiny razor blade-thin crystals.
• NMR spectroscopy records the resonance of magnetic nuclei such as ¹H, ¹³C, or
¹⁵N oriented in a strong magnetic field. The transition between parallel and
antiparallel orientation of the nuclear spins can be induced by additional fields.
Because the frequency at which these transitions take place depends on the
chemical environment in a molecule, the spectral parameters contain informa-
tion about the 3D structure of the molecules in solution.
• The multiplicity of the recorded spectral parameters can be transformed into
distance maps. They can be translated into the spatial structure of the protein by
using a distance geometry approach coupled with molecular dynamics simulations.
• It could be shown for many cases that the NMR structure of a protein in solution
and the X-ray structure in a crystal largely coincide with one another. Differ-
ences are observed for the surface-exposed residues.
• Protein crystals contain up to 70% water and exhibit large water channels that
pass through the crystal. Small molecules can diffuse and access binding sites
through these channels, particularly if these sites are accessible from one of these
channels. The binding modes of small-molecule ligands can be easily deter-
mined by using soaking techniques.
• The significance of the architecture of proteins determined in a crystalline envi-
ronment for biologically relevant conditions has been demonstrated. For example,
enzyme reactions also take place when the protein is arranged in a crystalline state.

Bibliography

General Literature

Blundell TL, Johnson LN (1976) Protein crystallography. Academic, London
Drenth J (1994) Principles of protein X-ray crystallography. Springer, Berlin
Dunitz JD (1979) X-ray analysis and the structure of organic molecules. Cornell University Press,
Ithaca
Friebolin H (2010) Basic one- and two-dimensional NMR spectroscopy. Wiley-VCH, Weinheim
Glusker JP, Trueblood KN (1985) Crystal structure analysis, a primer, 2nd edn. Oxford University
Press, New York
Glusker JP, Lewis M, Rossi M (1994) Crystal structure analysis for chemists and biologists. VCH
Publishers, New York
Pellecchia M, Bertini I, Cowburn D et al (2008) Perspectives on NMR in drug discovery:
a technique comes of age. Nature Rev Drug Discov 7:738–745
Wüthrich K (1986) NMR of proteins and nucleic acids. Wiley, New York

Special Literature

DeRosier DJ (1993) Turn-of-the-century electron microscopy. Curr Biol 3:690–692
Wear MA, Kan D, Rabu A, Walkinshaw MD (2007) Experimental determination of van der Waals
energies in a biological system. Angew Chem Int Ed 46:6453–6456

14 Three-Dimensional Structure of Biomolecules

In drug design the focus is on the ligand, which is generally a small organic molecule with
a molecular weight of under 500 Da. It undergoes interactions with
a macromolecular receptor and exerts an influence on the receptor’s characteristics.
On the other hand, the surrounding receptor can also determine the properties of the
bound active ligand. Selective interference in these interactions requires an
understanding not only of the ligand but also of the receptor. After the methods for the
structural determination of biomolecules were introduced in the last chapter, we
now want to take a look at what can be learned about the construction principles and
characteristics of these molecules. Proteins are made up of 20 basic building
blocks, the amino acids (see Appendix 1). A dipeptide is formed by coupling two
amino acids through an amide bond. Larger peptides and proteins are formed by
attaching further amino acids through additional amide bonds.

14.1 The Amide Bond: Backbone of Proteins

The simplest molecule with an amide bond is formamide 14.1. Its structure is
shown in Fig. 14.1. This bond occurs many hundreds of times in proteins,
for instance, over 50,000 times in the shell of the rhinovirus. The bond lengths
between the carbon, oxygen, and nitrogen atoms can be obtained from the crystal
structure of formamide. The microwave spectrum of gas-phase formamide also
affords bond lengths, but different values are obtained. In the gas phase, formamide
is “isolated,” that is, it does not “perceive” any neighbors in its immediate vicinity.
The C═O double bond is shorter, and the C–N single bond is longer than in
crystalline formamide. In the crystal assembly, the individual formamide molecules
are not “alone.” They are connected to neighboring molecules by hydrogen bonds.
A hydrogen bond is a non-covalent interaction. It couples a functional group
carrying a hydrogen atom (e.g., NH or OH) with an electronegative heteroatom
(e.g., N, O; ▶ Sect. 4.2). Obviously, incorporating a molecule into a network of

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_14

Bond lengths in formamide 14.1 (in Å)

                    C=O      C–N
Crystal assembly    1.241    1.318
Gas phase           1.219    1.352

Fig. 14.1 Formamide 14.1 is the smallest molecule that has an amide group. Its molecular
structure is shown in the lower part. Because of thermal motion in the solid state, the molecule
carries out vibrational movements. Its electron density is therefore distributed over a larger area.
This is described by using ellipsoids that encompass the 50% probability of finding the atom.
Two hydrogen bonds are formed between the carbonyl group and the amide group of
a neighboring molecule in the crystal packing. An extended H-bond network stabilizes the crystal
structure and polarizes the amide group. The bond lengths (in Å) are different in the crystal
assembly and in the gas phase (upper part).

hydrogen bonds causes a change in its geometry. The electron density between the
atoms is shifted so that the C═O double bond becomes longer and consequently weaker.
Simultaneously, the C–N single bond becomes shorter and stronger. Twisting the
molecule around this bond away from planarity is therefore made difficult.
The amide bond is the fundamental building block of proteins. Every third bond
in the polymer chain is an amide bond. As we have seen in formamide, amide bonds have
a planar geometry, that is, a plane can be defined through their atoms. The folding of
the polymer chain and the concomitant spatial construction of the protein are
determined by the torsion angles of the planes of the amide bonds against one another
(Fig. 14.2). The rigidity and planarity of the amide bond are decisive for the stability
of the spatially folded protein. In proteins, the amide bonds are almost exclusively in
the trans configuration. Only the rotation of the amide planes against one another remains as
a degree of freedom for the polymer chain. These torsions (▶ Chap. 16, “Conformational
Analysis”) occur around the bonds at the Cα carbon atoms. As
was shown in the bond-length comparison between gaseous and crystalline
formamide, the decisive additional stiffening of the amide bond is caused by its
incorporation into a hydrogen-bonding network.
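A torsion angle of this kind can be computed from the coordinates of four consecutive atoms.
The following is a minimal sketch using plain vector algebra. The function names are chosen for
illustration, and the atom selections named in the comments follow the customary definition of
the backbone angles φ and ψ, which is an addition to the text above.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees (-180 to 180) around the bond p1-p2.

    For phi, pass C(i-1), N(i), CA(i), C(i); for psi, pass N(i), CA(i), C(i), N(i+1).
    """
    u1 = _sub(p1, p0)          # first bond vector
    u2 = _sub(p2, p1)          # central bond vector (rotation axis)
    u3 = _sub(p3, p2)          # last bond vector
    n1 = _cross(u1, u2)        # normal of the plane p0, p1, p2
    n2 = _cross(u2, u3)        # normal of the plane p1, p2, p3
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Planar zigzag arrangement of four points (trans).
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))
```

A planar zigzag arrangement such as the one in the example corresponds to the trans geometry,
whereas a value near 0° would indicate a cis arrangement.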

[Fig. 14.2 graphic: peptide backbone with the dihedral angles φ and ψ marked (above); Ramachandran plot of ψ against φ, both axes running from −180 to 180 (below)]
Fig. 14.2 The spatial course of a polypeptide chain is determined by the relative orientation of the
planar peptide bonds (above). The twist of these planes against one another is measured by
the two torsion or dihedral angles φ and ψ. These do not assume arbitrary values around
the bond axes, but rather are limited to a few combinations of value ranges. In the diagram, a
so-called Ramachandran plot, the values for both angles (below) along the peptide chain are
plotted. The angle combinations for an α helix are found in the middle left (Fig. 14.3), and those for
a β-pleated sheet in the top left (Fig. 14.4).

Fig. 14.3 The α helix is a commonly found secondary structure. The polypeptide chain forms
a right-handed spiral with a pitch of 5.4 Å and 3.6 amino acids per turn (a). All carbonyl groups
(oxygen is red) are oriented parallel to the helix axis in the same direction. The NH functionalities
(nitrogen is blue, hydrogen is light blue) are oriented in the opposite direction. The groups form
a pronounced hydrogen-bond network (violet dashed lines) among themselves (b). The side chains
(R) on the Cα atoms are on the outside, pointing away from the helix axis. This forms a typical
furrow pattern that runs in a spiral over the surface. This “ridge and groove” pattern determines the
mutual packing of α helices in proteins.
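
The helix geometry can be made concrete with a little arithmetic. The sketch below is an added illustration, not taken from the original text; the function names and the choice of an 11-residue helix are assumptions. It uses the 3.6 residues per turn quoted in the caption and the hydrogen bonds from the C═O of residue i to the NH of residue i + 4.

```python
RESIDUES_PER_TURN = 3.6  # value quoted in the caption of Fig. 14.3

def rotation_per_residue(residues_per_turn=RESIDUES_PER_TURN):
    """Angle in degrees by which each residue is rotated about the helix axis."""
    return 360.0 / residues_per_turn

def helix_hbond_pairs(n_residues, offset=4):
    """Residue pairs (i, i + offset): C=O of residue i bonded to N-H of residue i + offset."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]

print(rotation_per_residue())   # rotation per residue in degrees
print(helix_hbond_pairs(11))    # hypothetical helix of 11 residues
```

The first and last four residues of such a segment each lack one hydrogen-bonding partner within the helix, which is why the list contains fewer pairs than residues.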

14.2 Proteins Fold in Space to Form α Helices and β Strands

Typically, the two dihedral angles around the Cα carbon atom are named φ and ψ,
and these angles usually take on value pairs from two
ranges. These ranges are related to a helical or sheet-like course of the polymer
chain (Fig. 14.2). In an α helix with a right-handed turn, all C═O groups point in one
direction along the helix axis and all NH groups point in the opposite direction (Fig. 14.3). They form a network of H-bonds among
themselves. Each amino acid in the helix is hydrogen-bonded to the amino acid four positions
further along the sequence. This unidirectional orientation of the polar groups of the

[Fig. 14.4 graphic: hydrogen-bonded β strands drawn as structural formulas; left panel: Antiparallel, right panel: Parallel; a distance of 7 Å is marked]

Fig. 14.4 A second important secondary structure, the β-pleated sheet, is composed of multiple sections
of the polymer chain, the β strands, that exist in a stretched conformation (top). The strands can run parallel or
antiparallel. They are crosslinked to each other via hydrogen bonds (violet). The sheet-like
structure displays a zigzag pleat, hence the name β-pleated sheet. The side chains (R) of the
amino acids point away from the sheet and alternate above and below it.

amide bonds in an α helix has consequences for the electrostatic characteristics
(▶ Sect. 15.4). Whereas a helix is made up of amino acids from a single segment of
the peptide, amino acids from at least two sequence sections must come together to
form a β-pleated sheet. The strands can be linked to each other in either
a parallel or an antiparallel orientation relative to the direction of the polymer chain (Fig. 14.4). The
H-bond network follows a different pattern in each of the two orientations. The side

Fig. 14.5 Within a β-pleated sheet of multiple strands, here shown with a parallel orientation,
a right-handed twist occurs. For simplification, the single β strands are indicated with an arrow.
The twist can be seen by the internal rotation of the arrow. The pleated sheet here is shown in two
perpendicular views.

chains alternate above and below the pleated sheet. The entire strand is slightly
twisted upon itself. Because of this a pleated sheet of multiple strands has a twist to
it when viewed from the side (Fig. 14.5).
Aside from these two common secondary structures, other typical combinations
of torsion angles occur. A polymer chain that folds to a globular structure in space
must reverse its direction. This is achieved in the so-called turn or loop region.
Turns can be classified according to the number of involved amino acids and the
type of interaction that closes the turn. Loops that form a C═O···H–N hydrogen
bond in the direction of the polymer chain, inverse turns with hydrogen bonds in the
reversed orientation, and open turns in the chain that are held together by van der
Waals interactions and polar interactions can be distinguished from one another
(Fig. 14.6). A total of 158 turn classes were summarized in a recent evaluation by
Oliver Koch.
What force drives the organization of a protein? Amino acids possess
hydrophilic and hydrophobic side chains. Hydrophobic groups avoid aqueous
environments (▶ Sect. 4.2). During the folding of the polymer chain in an aqueous
medium the hydrophobic amino acids aggregate to diminish their common hydro-
phobic surface. That is why the hydrophobic amino acids are predominantly
found in the inside of a folded protein. The polar groups of the amide bonds of
the main chain become saturated in the secondary structure by hydrogen bonds.
The side chains of polar amino acids are only found on the inside of a protein if
they can form a polar interaction with another amino acid in the vicinity. Other-
wise they orient themselves on the outside of a protein; they protrude into the
surrounding water. Proteins can also span a cell membrane. In those areas where
they have contact with the membrane they have a large, cohesive hydrophobic
surface (Sect. 14.7). If the packing density in the interior of the protein is

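[Fig. 14.6 graphic: normal turns, CO(i)–NH(i+n), 3–6 amino acids (left); inverse turns, NH(i)–CO(i+n), 2–6 amino acids (middle); open turns, Cα(i)–Cα(i+n), 4–6 amino acids (right)]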

Fig. 14.6 The polymer chain of a globular protein reverses its direction in the loop or turn area.
Numerous turn patterns have been found. They are made up of 2–6 amino acids. Normal turns
(left) form a C═O···H–N hydrogen bond (violet) in the direction of the polymer chain. This
hydrogen bond has the reverse orientation in inverse turns (middle). Another group of open turns
(right) is held together by van der Waals contacts and polar interactions.

considered, it is on the same scale as is found in crystals of small organic
molecules. The interactions that determine the molecular packing are identical
in both cases.

14.3 From Secondary Structure Via Motifs and Domains to Tertiary and Quaternary Structure

Proteins organize their secondary structural segments in motifs. As an example,
the sequence of an α helix, a β strand, and another α helix makes up one motif.
Multiple motifs fold into domains to yield the tertiary structure of a protein.
Domains can be constructed predominantly from helices, from pleated sheets, or from
a combination of both building blocks. Often the domain has a particular function.
Many proteins are made up of a single domain. Complex proteins can be built
from multiple domains. If multiple separate polymer
chains form a complex assembly (e.g., as in hemoglobin), this is referred to as the quaternary
structure.
Despite the enormous multiplicity that can be achieved by combining the 20
amino acids into sequences, there seems to be a rather limited number of folding
possibilities for the domains. How many folding patterns exist in total can only be speculated
upon. Among all the crystal structures that are known today, 1,150 different folding
patterns have been found. Because no new examples have been found in recent years
despite intense efforts, it can be assumed that there are perhaps 1,200 stable patterns.
This number is essentially based on data from globular enzymes and transport pro-
teins. Approximately 30% belong to one of the classes shown in Fig. 14.7. To date
perhaps only 100 structures are known from the group of membrane-bound proteins.
On the basis of these examples it seems difficult to make an estimate about possible
additional folding classes to be found in membrane proteins.
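
The contrast between the size of sequence space and the roughly 1,200 assumed folding patterns can be illustrated with simple arithmetic. The following sketch is an added illustration; the chain lengths are arbitrary choices and the function name is hypothetical.

```python
N_AMINO_ACIDS = 20  # number of proteinogenic amino acids named in the text

def sequence_count(length, alphabet=N_AMINO_ACIDS):
    """Number of distinct sequences of the given length."""
    return alphabet ** length

# Chain lengths chosen arbitrarily for illustration
for length in (10, 50, 100):
    n = sequence_count(length)
    print(f"length {length:>3}: about 10^{len(str(n)) - 1} sequences")
```

Even for short chains the number of possible sequences exceeds the estimated number of stable folds by many orders of magnitude, so very many different sequences must share the same folding pattern.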
Drug design concentrates on the interaction of a ligand with a protein. Therefore
the structural considerations of chemists are usually limited to the amino acid

Fig. 14.7 The course of the polypeptide chains is symbolized with spirals for α helices, with
arrows for β-pleated sheets, and with threads for different turn segments. Approximately 30% of
the structurally known proteins can be assigned to one of the nine shown folding classes. The first
folding pattern (bottom left) is a “TIM barrel,” and the one above is an open pleated-sheet structure.

groups that protrude into the binding pocket. The folding pattern in the vicinity
of the binding pocket, however, exerts an influence on the properties that are
found there. For example, a helix that is arranged toward the binding pocket
decisively determines the local electrostatic potential. Even this can be exploited
for the design of selective ligands that bind only to proteins of a particular
folding class.
Despite progress in structure determination techniques, it can
happen that the structure analysis of an important protein fails, but the structure of,
for example, a related protein can be solved. A model of the desired protein can be
built on this basis (▶ Sect. 20.5). Information about the construction and folding
principles of proteins is needed for this purpose. It allows an understanding of
which parts of the protein stabilize the scaffold, which parts determine function, and
which parts make up the differences between homologues.
An in-depth discussion of these principles would go too far here. As an example,
the folding pattern of the β barrel will be examined. A stretched-out sheet of
multiple β strands has an internal twist (cf. Fig. 14.5). If, as an example, eight such
strands are lined up next to one another, a cylinder is formed. This barrel-like
folding pattern of eight or more strands is often observed. Several variations of
this folding pattern are displayed in Fig. 14.8; they show how, and according to
which principles, a polypeptide can fold in space.
A loop acts as a connecting element between the pleated-sheet strands of the β
barrel in the example in Fig. 14.8. α Helices can also serve as connecting elements
(Fig. 14.7). The strands then form a barrel-like structure in the interior, and the bridging α helices
align on its surface. This folding pattern was first discovered in triosephosphate
isomerase. It is therefore called a TIM barrel (Fig. 14.7). Another important
folding class that is made up of α-helical and β-pleated-sheet segments comprises the
open-sheet structures (Fig. 14.7). In this class the pleated sheet does not close to
a cylinder but rather remains open. Helices group above and below the sheet.

14.4 Are the Fold Structure and Biological Function of Proteins Correlated?

How is the structure of a protein coupled to its function? Do all proteases, for example,
display the same folding pattern? A large number of enzymes that have distinctly
different functions all belong to the TIM barrel type, or the open-sheet structure.
There are many oxidases, isomerases, kinases, aldolases, synthases, dehydrogenases,
or proteases that can be assigned to these two classes. Here, Nature started from
a common origin and developed divergently. Consequently, the function of a protein
is not necessarily coupled to a particular folding pattern. If the construction of the
enzyme is analyzed further, it turns out that the catalytic sites of the proteins of
a folding class are at the same position. In the TIM-barrel structure this is the C-terminal end of the
barrel; in the open-sheet structure it is the topological switch point at which the connecting
helices change from the upper to the lower side of the sheet (Fig. 14.9).

Fig. 14.8 The folding pattern of different β-barrel structures can be thought of as
a polymer chain with eight separate β strands (arrows). These are separated by loop areas.
(a) An up-and-down barrel forms when the folding of the polymer chain of eight β strands follows
a zigzag pattern. The antiparallel sections form hydrogen bonds between themselves that close
up to form a cylinder. (b) The four-β-strand polypeptide chains lie next to one another so that the
first chain interacts with the fourth, and the second interacts with the fifth. Then the double
strand folds and the first pair comes to lie next to the second. Because the course of the polymer
chain is reminiscent of the engravings on Greek vases, the pattern is called a Greek key. Two
such patterns can come together into a cylinder-like orientation and form a Greek-key barrel.
(c) Another folding pattern is formed from a double strand that is placed together with
an internal twist. The double strand wraps itself into a cylinder-like structure that is called a
jelly roll.

Fig. 14.9 The folding-pattern-determining and function-carrying amino acid groups are found in
different regions of proteins. (a) In a TIM-barrel-type structure (α helices: red cylinders, β strands: light-blue arrows),
the catalytic site (yellow spheres), which binds and transforms substrates, lies
at the end of the barrel where one would expect to find a lid. The loops of the polymer chain that
surround this “lid” (gray and green threads) carry the function-determining amino acids. (b) In the
open-pleated-sheet structure, the function-determining amino acids occur in the loop area at the point
where the attached helices change from the top to the bottom of the pleated sheet.

The function-determining amino acids occur in the loop area between neighboring
pleated sheets and helices. Why would Nature follow this principle of separating the
folding structure from the function? The amino acids that enable the stable folding of
a domain are separated from those that induce a specific function. This approach is
a very efficient evolutionary strategy. Two areas were simultaneously optimized:
• The stability of the protein scaffold in special folding patterns
• The layout of the amino acid sequence to serve a special function.
Spatially separating and displacing the function-carrying groups in the structur-
ally less-committed loop areas allowed the two tasks to be optimized in parallel.
Exchanging a single amino acid in a secondary structure element could destabilize
the entire folding pattern and stop the folding. This is avoided if the amino acid
sequence that is to be functionally optimized is placed on a stable scaffold that does
not interfere with the optimization.
A protein class that implements this principle to perfection is that of the immunoglobulins.
As antibodies they recognize and bind to xenobiotics, the antigens. To remove
an antigen, immunoglobulins with highly specific binding pockets and high affinity
must be available within a few days. The recognized substances could be anything
from small organic molecules to large proteins. It is estimated that
about 10¹² different variable sequences are formed, although the human genome contains only about 25,000
genes. The difficult task of achieving such high diversity is solved by
immune-system cells by using a combination of different variable gene segments
and extensive amino acid exchange in these segments during lymphocyte maturation.
In this way, variable loop areas are formed that are set upon a stable scaffold of

Fig. 14.10 The immunoglobulins form a highly specific binding pocket in which they recognize
antigens, which are exogenous substances. The enormously large structural variety of these binding
pockets is achieved by variations in the amino acids in the loop areas. The immunoglobulins
have a Y-like form that is divided into a trunk (constant Fc domain) and two identical Fab branches
(a). The course of the polymer chain in these branches corresponds to the barrel type. The antigen-
binding site is indicated by an arrow. Picture (b) is an enlargement of the circled branch in
(a). Loops are found at the right end (colored) that are responsible for the recognition of exogenous
substances. They grasp the antigen (here dark red) like the fingers of two hands.

barrel-like pleated sheet structures (Fig. 14.10). The therapeutic value of such bio-
molecules (so-called biologicals) has been recognized. Many humanized antibodies
can be found in development as therapeutics (▶ Sect. 32.3).

14.5 Proteases Recognize and Cleave Substrates in Well-Tailored Pockets

Proteases cleave polypeptide chains during enzymatic degradation or upon the
release of an active protein or peptide from an inactive precursor form. For this,
the enzymes possess a catalytic site in which the cleavage takes place (Sect. 14.6
and ▶ Chaps. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”;
▶ 24, “Aspartic Protease Inhibitors”; and ▶ 25, “Inhibitors of Hydrolyzing
Metalloenzymes”). To recognize a particular substrate specifically, multiple
binding pockets are present on their surface. These are structurally complementary to
the side chains of the substrate that orient themselves around the catalytic site.
In 1967 Israel Schechter and Arieh Berger proposed a system of nomenclature to
describe these pockets (Fig. 14.11). The positions of the amino acids of the
peptide substrate are designated P3, P2, P1, P1′, P2′, P3′, and so forth. Reading from
the N terminus, position P1 is immediately before and position P1′ is
immediately after the cleavage site. The binding pocket of the enzyme for the
side chain of the amino acid P1 is called S1, and the same goes for the other side
chains. This very useful nomenclature is initially purely formal. Assigning
these labels to a particular enzyme does not mean that the named binding
pocket really exists. Two binding pockets can appear as one large binding pocket

[Fig. 14.11 graphic: peptide backbone with the side chains R3, R2, R1, R1′, R2′, R3′ at the positions P3, P2, P1, P1′, P2′, P3′ and the corresponding binding pockets S3, S2, S1, S1′, S2′, S3′]

Fig. 14.11 The side chains of a peptide substrate and the binding pockets to which they belong
are classified on the N-terminal side of the peptide as P3, P2, P1, ... or S3, S2, S1, ... (left); on the
C-terminal side they are classified as P1′, P2′, P3′, ... or S1′, S2′, S3′, ... (right).

in the 3D structure. The S3 and S4 binding pockets in the serine protease thrombin
are really only one large pocket (▶ Sect. 23.3). It can also happen that a substrate
amino acid has no complementary binding pocket in the enzyme. It then pro-
trudes into the water.
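
The Schechter–Berger labels follow mechanically from the position of the cleaved bond. The sketch below is an added illustration of that bookkeeping; the function name, the argument `cleavage_index`, and the hexapeptide are hypothetical and not taken from the original text.

```python
def schechter_berger_labels(peptide, cleavage_index):
    """Label residues relative to the scissile bond.

    `cleavage_index` is the number of residues on the N-terminal side of the
    cleaved bond, so peptide[cleavage_index - 1] is P1 and
    peptide[cleavage_index] is P1'.
    """
    if not 0 < cleavage_index < len(peptide):
        raise ValueError("the scissile bond must lie inside the peptide")
    labels = []
    for pos, residue in enumerate(peptide):
        if pos < cleavage_index:
            # N-terminal side: numbers increase away from the cleavage site
            labels.append((f"P{cleavage_index - pos}", residue))
        else:
            # C-terminal side: primed positions
            labels.append((f"P{pos - cleavage_index + 1}'", residue))
    return labels

# Hypothetical hexapeptide (one-letter code), cleaved between the third and fourth residue
print(schechter_berger_labels("AGRSLV", 3))
```

The pocket names are obtained by replacing P with S in each label. As noted in the text, such a label does not guarantee that a distinct pocket exists in a given enzyme.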

14.6 From Substrate to Inhibitor: Screening of Substrate Libraries

Peptides are easily synthesized with enormous diversity (▶ Sect. 11.5). If the
peptide is attached to a probe that changes its color or fluorescence upon release
(▶ Sect. 7.2), the labeled peptide can be used to ascertain the substrate profile of
the protease. For this purpose a large library (▶ Sect. 11.1) of these peptides is
offered to the protease, and the members that are well cleaved are identified. In
Fig. 14.12 the amino acid composition of a labeled tetrapeptide is given that is
preferably cleaved by the proteases trypsin, factor Xa, plasmin, and chymotryp-
sin. Peptides with basic groups such as arginine or lysine are preferably cleaved
by trypsin, plasmin, and factor Xa. Factor Xa converts peptides with arginine in
the P1 position almost exclusively. Chymotrypsin behaves entirely differently. It
prefers to have aromatic amino acids such as tyrosine, phenylalanine, and tryp-
tophan in the P1 position. The selectivity at the positions P2 to P4 is not nearly as
pronounced. Trypsin transforms tetrapeptides that have branched groups at P2
such as Phe, Tyr, Trp, Ile, or Val much more poorly if an arginine is at the P1
position. Basic groups are also less preferred. Trypsin shows virtually no selec-
tivity at the P3 and P4 positions. Factor Xa has a particular preference for the small
glycine at position P2, but hardly any difference at all is seen for the groups in the

[Fig. 14.12 graphic: (a) cleavage profiles at position P1 for trypsin, factor Xa, plasmin, and chymotrypsin; (b) profiles at positions P4, P3, and P2 for trypsin and factor Xa with P1 held constant; the horizontal axes list the amino acids in one-letter notation]

Fig. 14.12 A tetrapeptide library, held constant in positions P2 to P4, was varied at position P1 with
19 amino acids (one-letter notation; n = norleucine). It is cleaved by trypsin after arginine and lysine,
by factor Xa after arginine, and by plasmin after lysine (a). If arginine is held in position P1 and the
remaining three positions are varied, trypsin shows practically no selectivity for the amino acids at
P2, P3, and P4. On the other hand, factor Xa prefers a glycine in position P2 (b).

P3 position for this enzyme. On the other hand, different groups in the P4 position
are more strongly selected. The substrate-binding profile helps to expose the
selectivity characteristics of enzymes. Such profiles reflect the complementary properties
of the binding pocket and help to inspire first ideas for the design of
conceivable inhibitors.
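
The P1 preferences described above can be written down as a small lookup table. The following sketch is an added, deliberately simplified illustration: it considers only the P1 residue, ignores the influence of P2 to P4, and takes the plasmin entry from the caption of Fig. 14.12. The names `P1_PREFERENCE` and `predicted_cleavage_sites` and the example peptide are hypothetical.

```python
# P1 preferences as summarized in the text and in Fig. 14.12 (one-letter code)
P1_PREFERENCE = {
    "trypsin": set("RK"),
    "factor Xa": set("R"),
    "plasmin": set("K"),
    "chymotrypsin": set("YFW"),
}

def predicted_cleavage_sites(sequence, protease):
    """Indices i for which sequence[i] is a preferred P1 residue (bond after i is a candidate)."""
    preferred = P1_PREFERENCE[protease]
    # The last residue has no following peptide bond, so it is skipped
    return [i for i, aa in enumerate(sequence[:-1]) if aa in preferred]

peptide = "GAKFLRSW"  # hypothetical sequence
for name in P1_PREFERENCE:
    print(name, predicted_cleavage_sites(peptide, name))
```

A realistic substrate model would weight all positions, since the text shows that, for example, factor Xa also discriminates at P2 and P4.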
This concept was applied to cysteine proteases in the research group of
Jonathan Ellman at the University of California at Berkeley. Substrate molecules
were synthesized that carried a fluorescence marker at the end of an amide bond
that was to be cleaved. Different organic building blocks were placed on the other
side. If such a substrate molecule is cleaved by the protease, the organic part
must be bound in the binding pocket of the enzyme. Therefore, the transformation
indicates the binding of a test molecule. The method can be optimally used
for screening. A hit that is discovered in this way can easily be chemically
transformed from a substrate molecule to an inhibitor. If the cleaved amide bond
is replaced with, for instance, an aldehyde function, a cysteine-protease inhibitor
(▶ Sect. 23.9) can be developed that has very little in common with the peptide
substrate.

14.7 When Crystals Learn to Walk: From Static Crystal Structures to Dynamics and Reactivity

What kind of information about the dynamics and reactivity of molecules can be
extracted from a crystal structure? Molecular vibrations are visible even in the solid
state. This is reflected in the blurriness of the electron density. If a molecule takes
part in a reaction, bonds are broken and new ones are formed. The formation and
cleavage of amide bonds is a central task in biochemical processes. The molecule
14.2 contains an amide and an ester group (Fig. 14.13). If a crystal of this compound
is exposed to thermal energy, a reaction takes place in the solid state to form 14.3.
In the starting crystal structure the molecule adopts a geometry that is conducive to
entry into the reaction pathway.
Having information about changes in the geometric orientation of functional
groups in the chemical reaction is decisive for understanding the concomitant
structural changes that occur. This knowledge is a prerequisite for the design of
transition-state-analogue inhibitors (▶ Sects. 6.6 and ▶ 22.3). In view of the for-
mation or cleavage of an amide bond, the question is posed: from which direction
does the amino group attack the carbonyl carbon in the course of the nucleophilic
addition to form a new bond?
In the early 1970s Hans-Beat Bürgi and Jack Dunitz began to extract information
about the geometric changes along such reaction steps from crystal structures. Before
there were movies and television, people developed creative ideas to set pictures in
motion, for example, with flip-books (Fig. 14.14). These impart the impression of
the dynamic sequence of a story. Let us imagine that because of frequent use, the
pages of the little book have fallen apart and are now in disarray. You must bring
them into the correct order again. Ordering criteria are needed in this case. A similar

[Fig. 14.13 graphic: (a) reaction scheme from 14.2 to 14.3; (b) molecular structure with the atom labels N(1), O(1) to O(3), and C(1) to C(9)]

Fig. 14.13 If thermal energy is applied to a crystal of 14.2, the carbonyl group of the ester
function reacts with the amide NH2 group and an imide bond is formed between N(1) and C(8) to give
14.3 (a). A vibrational motion (b) must be assumed that leads into the reaction. Simultaneously
the ester bond between C(8) and O(2) is cleaved during the reaction steps.

task is posed for the organization of structural data to describe a reaction. Particular
crystal structures are sought from databases of known crystal structures (▶ Sect. 13.9)
in which an amino group is in the vicinity of a carbonyl group, as in the structure of
14.2. Finally they are brought into a logical order (Fig. 14.15).
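
The ordering step can be sketched in a few lines of code. The example below is an added illustration with invented coordinates; it does not reproduce the original analysis or any real crystal structure. It assumes that the distance between the nucleophilic nitrogen and the carbonyl carbon serves as the ordering criterion, and it reports how far the carbon atom lies outside the plane of its three neighbors.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(a, b):
    d = _sub(a, b)
    return math.sqrt(_dot(d, d))

def out_of_plane(center, neighbors):
    """Distance of `center` from the plane through its three neighbors."""
    n1, n2, n3 = neighbors
    normal = _cross(_sub(n2, n1), _sub(n3, n1))
    return abs(_dot(_sub(center, n1), normal)) / math.sqrt(_dot(normal, normal))

# Invented coordinates in Å, for illustration only (not from any crystal structure)
NEIGHBORS = [(1.2, 0.0, 0.0), (-0.75, 1.3, 0.0), (-0.75, -1.3, 0.0)]
snapshots = [
    {"id": "B", "N": (0.0, 0.0, 2.6), "C": (0.0, 0.0, 0.05)},
    {"id": "C", "N": (0.0, 0.0, 2.0), "C": (0.0, 0.0, 0.15)},
    {"id": "A", "N": (0.0, 0.0, 3.2), "C": (0.0, 0.0, 0.00)},
]

# Ordering criterion: decreasing N...C distance, i.e., progress along the approach
ordered = sorted(snapshots, key=lambda s: distance(s["N"], s["C"]), reverse=True)
for s in ordered:
    print(s["id"],
          round(distance(s["N"], s["C"]), 2),
          round(out_of_plane(s["C"], NEIGHBORS), 2))
```

In this toy data set the displacement of the carbon atom grows as the nitrogen comes closer, which is the kind of correlation such an ordering is meant to reveal.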
The systematic comparison of crystal structure data affords a first understanding
of structural molecular properties, for instance, about the preferred conformation
(▶ Sect. 16.4). The geometry of non-covalent interactions can also be evaluated this
way. The side chain of the amino acid histidine contains an imidazole ring with its
two nitrogen atoms. In the neutral state one of these nitrogen atoms is a hydrogen-
bond acceptor, and the other is a donor. There are hundreds of molecules with an
imidazole ring in the database of low-molecular-weight crystal structures. In these
structures the imidazole ring has, in fact, acceptor and donor interactions, usually
with neighboring molecules. All these structures are superimposed upon one another
based on their common imidazole ring (Fig. 14.16). It shows in which spatial
direction the imidazole nitrogen atom’s hydrogen-bonding partner is found. The
task of estimating the possible interaction positions in the binding site of the protein
for the functional groups of a ligand is undertaken in the course of de novo drug
design (▶ Chap. 20, “Protein Modeling and Structure-Based Drug Design”). Fur-
ther, this information is needed for comparing the binding properties of molecules
(▶ Chap. 17, “Pharmacophore Hypotheses and Molecular Comparisons”) or for the
exploration of binding pockets for their preferred ligand-binding sites (hot spots).

Fig. 14.14 A story is shown in static pictures in flip-books. If the different pages of this story flip past the eyes quickly enough, the impression of a dynamic
process is given.

Fig. 14.15 The formation or cleavage of an amide bond occurs by nucleophilic addition.
A nucleophile, for instance, an oxygen or nitrogen atom, approaches the planar carbonyl carbon
atom. During the reaction the carbon atom rises out of the plane of its three neighbors and adopts a tetrahedral
configuration. Examples were sought from low-molecular-weight crystal structures in which
a nitrogen atom approaches a carbonyl group at a distance between that of a single bond and that of a van der Waals
contact in the crystal packing. Superimposing these data shows that the approach
of the nucleophilic nitrogen toward the carbonyl group is “performed from back and behind.” With
this approach the carbon migrates out of the plane in the direction of the nucleophile. The
geometry of this reaction step also determines the structural composition of the catalytic center
of a variety of hydrolases (▶ Sect. 22.3).

The database IsoStar, assembled at the Cambridge Crystallographic Data Centre,
makes numerous such contact geometries and spatial distributions available.

14.8 Solutions to the Same Problem: Serine Proteases with Differing Folds Have Identical Function

It was shown in Sect. 14.4 that the amino acids that determine the folding
and function of a protein occur in separate parts of the structure. For enzymes
with the same function, however, Nature has arrived at the same solution by way of different
folds.
The function and therapeutic relevance of serine proteases will be discussed in
more detail in ▶ Chap. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme Inter-
mediate”. A unit of three amino acids, the so-called catalytic triad, plays a key role
in accelerating the hydrolysis of amide bonds by these enzymes. The two amino
acids serine and histidine, and an acidic amino acid, such as aspartic or glutamic
acid, are found in a characteristic spatial orientation. They are defined by the narrow

Fig. 14.16 The crystal packing of low-molecular-weight compounds affords an overview of
possible interaction geometries of hydrogen-bond donors (left) and acceptors (right) around the
nitrogen atoms of an imidazole ring. Accordingly all structures with an imidazole ring were sought
in which at least one of the two nitrogen atoms participates in a hydrogen bond. The superposition
of the structures shows where the positions of the interacting partners can be expected.

boundaries that are established by the reaction geometry required for a nucleophilic
addition (Sects. 14.6 and ▶ 23.2). Their composition is ideally suited for the
cleavage of amide bonds.
The enzyme trypsin is constructed from two barrel-like subunits (Fig. 14.17a).
The catalytic site is located at the interface of these two subunits. Subtilisin is
another serine protease that belongs to the class of open-pleated-sheet structures.
The catalytic triad occurs in a loop area at the edge of the pleated sheet
(Fig. 14.17b). If the amino acids that are involved in the catalysis are removed
from the protein and superimposed in space, the identical geometry of the triad is
obvious. In addition to the mentioned enzymes, this catalytic triad is also encoun-
tered in lipases and esterases (▶ Sect. 23.7), which also cleave peptide or ester
bonds. Although they display divergent scaffold folding, the geometric orientation
of their triads is once again identical.

14.9 DNA as a Target Structure of Drugs

Our genetic information is encoded on the DNA molecule. It is a thread-like
molecule approximately 20 Å in diameter and reaches a length of up to 2 m in
the extended form. It is constructed as a double helix (Fig. 14.18). On the outside,
two polymer chains of sugar and phosphate building blocks wind
like a guardrail around the base pairs. The bases form a complementary pair
on each step. The two bases of a pair are coupled to one another by a hydrogen-bond
pattern. In doing so, a purine base (adenine A and guanine G) always interacts with

Fig. 14.17 Trypsin (a, red) and subtilisin (b, green) are serine proteases. They have the same
catalytic triad of serine, histidine, and aspartic acid. These function-determining amino acids are,
however, placed upon entirely different folding patterns. In the above-right picture, the course of
the chain of both proteins is superimposed upon one another (c). Despite this, the side chains of the
amino acids of the catalytic triad are in the same spatial position (d). The course of the polymer
chains is shown with colored ribbons, and the side chains of the
three catalytic amino acids are displayed in their spatial orientation.

a pyrimidine base (cytosine C and thymine T; in the related RNA molecule,
thymine is replaced by uracil U; Figs. 14.18 and 14.19). The spiral staircase that
is formed has a pitch of 34 Å and reaches a full turn after ten steps. The two
mutually wound polymer strands form two grooves of different sizes on their
surfaces (Fig. 14.18). If the DNA is examined from the side along the steps
at the major and minor groove, the characteristics of the base pairs will be visible.
There are three functionalities in the minor groove that determine the interaction
with other molecules. In the major groove there are four. Interestingly, the pattern
that is read in the major groove is unambiguous because of the exposed properties
for each base pair on a step. Only the difference between either AT/TA or GC/CG
can be distinguished in the minor groove (Fig. 14.19).
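The readout argument can be made concrete with a small sketch. The pattern strings below are an assumed, textbook-style encoding of the groups exposed in each groove (D = hydrogen-bond donor, A = acceptor, H = hydrophobic group), read in one fixed direction across the step. They are not taken from Fig. 14.19 itself, and all names are hypothetical.

```python
# Assumed encoding of the groups exposed in each groove, read in one fixed
# direction across the base pair:
#   "D" = H-bond donor, "A" = H-bond acceptor, "H" = hydrophobic group.
MAJOR_GROOVE = {
    "AT": "ADAH",
    "TA": "HADA",
    "GC": "AADH",
    "CG": "HDAA",
}
MINOR_GROOVE = {
    "AT": "AHA",
    "TA": "AHA",
    "GC": "ADA",
    "CG": "ADA",
}


def distinguishable_groups(groove):
    """Group the base pairs that expose the same interaction pattern."""
    by_pattern = {}
    for pair, pattern in groove.items():
        by_pattern.setdefault(pattern, []).append(pair)
    return sorted(sorted(pairs) for pairs in by_pattern.values())


if __name__ == "__main__":
    # Four singleton groups: every step can be read unambiguously.
    print("major groove:", distinguishable_groups(MAJOR_GROOVE))
    # Two groups of two: only AT/TA versus GC/CG can be told apart.
    print("minor groove:", distinguishable_groups(MINOR_GROOVE))
```

Under this encoding the four-position major-groove patterns are all different, whereas the three-position minor-groove patterns are palindromic and collapse into two classes.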
The base pairs on every three neighboring steps code for an amino acid
(▶ Sect. 32.6). To read this information unambiguously from the DNA, proteins
that regulate gene expression (so-called transcription factors) read the information

Fig. 14.18 The DNA molecule is built of single stair steps. A base pair forms each step.
The sugar phosphate chain suspends the steps like a double banister. It forms a major and
a minor groove on the surface. (a) A segment of DNA with 14 base pairs, (b) a schematic
representation with the sugar phosphate backbone as a gray arrow, thymine (light blue), adenine
(red), cytosine (violet), and guanine (light green). (c) A model of a DNA surface in which the
size difference between the minor and major grooves is emphasized. The individual bases
align according to their interaction properties (blue: H-bond donor, red: H-bond acceptor, gray:
hydrophobic contact).

from the major groove, from the side (cf., ▶ Sect. 28.2). Only there is it possible to
read the prescribed code (AT, TA, GC, CG) unambiguously. Due to the many
outwardly oriented phosphate groups, the DNA molecule is heavily charged.
This charge is neutralized by the formation of ion pairs, mostly with magnesium.
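The figures quoted in this section (a pitch of 34 Å, ten steps per turn, an extended length of up to 2 m) can be combined into a rough estimate of how many base pairs and phosphate charges are involved. The following is a minimal sketch; the assumption of one negative charge per phosphate group, and therefore two per base-pair step, is not stated in the text, and the function names are illustrative.

```python
PITCH_ANGSTROM = 34.0        # helical pitch quoted in the text
BASE_PAIRS_PER_TURN = 10     # steps per full turn quoted in the text
EXTENDED_LENGTH_M = 2.0      # extended length quoted in the text
ANGSTROM_PER_METER = 1.0e10

# Assumption: every nucleotide carries one negatively charged phosphate,
# so each base-pair step contributes two negative charges.
CHARGES_PER_BASE_PAIR = 2


def rise_per_base_pair():
    """Height gained along the helix axis per step, in angstroms."""
    return PITCH_ANGSTROM / BASE_PAIRS_PER_TURN


def base_pairs_in_length(length_m):
    """Number of base-pair steps needed to span the given length."""
    return length_m * ANGSTROM_PER_METER / rise_per_base_pair()


def phosphate_charges(n_base_pairs):
    """Total number of negative backbone charges for n base pairs."""
    return CHARGES_PER_BASE_PAIR * n_base_pairs


if __name__ == "__main__":
    n_bp = base_pairs_in_length(EXTENDED_LENGTH_M)
    print(f"rise per step: {rise_per_base_pair():.1f} angstrom")
    print(f"base pairs in {EXTENDED_LENGTH_M} m: {n_bp:.2e}")
    print(f"negative charges: {phosphate_charges(n_bp):.2e}")
```

With these inputs the rise is 3.4 Å per step, and 2 m of extended DNA corresponds to roughly 5.9 × 10^9 base pairs, each of which has to be charge-balanced by counterions.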
Because of its important role in the mediation of genetic information, several
important drugs act on DNA. Two examples are briefly mentioned here. Cisplatin
14.4 is a reactive metal complex that can react with the nitrogen atoms of
two nucleobases on two adjacent steps of the DNA by exchanging both chlorine
substituents (Fig. 14.20). This crosslinking distorts the DNA in such a way that the
sequence information is no longer readable. Cisplatin and analogous derivatives such
as carboplatin are used in cancer therapy as potent chemotherapeutics. Daunorubicin
14.5 is a representative with a somewhat different mode of action, but it also prevents
the reading of the DNA base pairs. By slightly spreading the DNA along the chain, the
planar part of 14.5 slips largely between two adjacent base pairs and causes
a structural distortion of the DNA (intercalation). This intravenously administered
cytostatic is used in combination regimens for the treatment of acute
leukemias. Many natural products also use this so-called intercalation mechanism for

Fig. 14.19 The DNA base pairs of cytosine (C) with guanine (G) and thymine (T) with adenine
(A) on the individual steps are formed by complementary hydrogen bonds. Each base carries a sugar
phosphate group that is coupled with the polymer chain. It affords a double-helical construction
with a minor (green) and major (yellow) groove (cf. Fig. 14.18). If viewed parallel to the steps,
four groups can be seen in the major groove that possess either hydrogen bond donors (blue),
acceptors (red), or hydrophobic properties (gray). Three such groups are aligned in the minor
groove. If an attempt is made to read the interaction pattern from this side, a GC and a CG pair, or an AT
and a TA pair, are recognized as identical. Here the orientation of the interaction pattern cannot be
distinguished. In the major groove, on the other hand, the pattern of exposed interactions is
unambiguous. Therefore, proteins read information about the DNA from the major groove.

their antibacterial activity spectrum. Other pharmaceutical research approaches try to
use segments of DNA themselves for therapy. Such modified-oligonucleotide therapeutics
are discussed in ▶ Sect. 32.4.

14.10 Synopsis

• Every third bond in the polymer chain of a protein is an amide bond. It is the
fundamental building block in the protein backbone and the mutual spatial
arrangement of the sequential planar amide bonds determines the overall archi-
tecture of a protein.

Fig. 14.20 Crystal structure of an oligomeric DNA segment after a reaction with cisplatin 14.4
(a) or intercalation with daunorubicin 14.5 (b). In both cases the DNA molecule is severely
distorted and the genetic information on the DNA cannot be read for cell division. Cisplatin reacts
with the nitrogen atoms of two nucleobases (here guanine) of the DNA on neighboring steps with
substitution of both chlorine atoms. With its planar tetracyclic ring system, daunorubicin interca-
lates between two neighboring base pairs by spreading the DNA along the helix axis. The
compound’s amino sugar is accommodated in the DNA minor groove.

• Typical arrangements involving the amide NH and C═O groups in hydrogen
bonds lead to α-helical and β-pleated sheet structures. Reversal of the polymer
chain in space is achieved in turns that can adopt a variety of distinct geometries.
• Helices, sheets, and turns, the secondary structure elements, assemble into
motifs and domains to form the tertiary and quaternary structure of proteins.
• The function of a protein is not necessarily coupled to a particular folding
pattern; however, the catalytic and ligand-functional sites within a folding
class are found at the same position.
• Nature separates fold-stabilizing residues from function-carrying amino acids to
keep the dual optimization problem separated.
• Proteases recognize peptide sequences specifically via the binding in well-
tailored pockets on both sides of the cleavage site.
• Peptide libraries with an attached photometric or fluorescent label that can be
cleaved by the protease reaction help to elucidate the substrate profile of different
proteases.

• Structural arrangements of molecular portions found in multiple crystal structures
can be arranged sequentially in a kinematic order to provide an idea of
a dynamic process.
• The spatial arrangement of amino acid residues exerting a particular chemical
transformation is highly conserved and can reside on protein architectures with
similar geometry that are constructed from deviating folds.
• The DNA molecule encodes our inheritance and forms a double helix of two
banister-like sugar-phosphate polymer chains wrapping around complementary
pairs of bases on successive steps. Through the H-bonding pattern of the central
bases each single DNA strand is complementary to the second strand. A minor
and a major groove are formed between the sugar-phosphate banisters. A unique
reading of the coding base pairs can be accomplished from the major
groove only.

Bibliography

General Literature
Branden C, Tooze J (1999) Introduction to protein structure, 2nd edn. Garland, New York
Bürgi HB, Dunitz JD (1994) Structure correlation, vol 1. VCH, Weinheim
Jeffrey GA, Saenger W (1991) Hydrogen bonding in biological structures. Springer, Berlin
Schulz GE, Schirmer RH (1978) Principles of protein structure. Springer, New York

Special Literature
Allen FH, Kennard O, Taylor R (1983) Systematic analysis of structural data as a research
technique in organic chemistry. Acc Chem Res 16:146–153
CSD Database: www.ccdc.cam.ac.uk/products/csd/
Klebe G (1994) The use of composite crystal-field environments in molecular recognition and the
de novo design of protein ligands. J Mol Biol 237:212–235
Koch O, Klebe G (2008) Turns revisited: a uniform and comprehensive classification of normal,
open, and reverse turn families minimizing unassigned random chain portions. Proteins: Struct
Funct Bioinform 74:353–367
Lario PI, Vrielink A (2003) Atomic resolution density maps reveal secondary structure dependent
differences in electronic distribution. J Am Chem Soc 125:12787–12794
Orengo CA, Jones DT, Thornton JM (1994) Protein superfamilies and domain superfolds. Nature
372:631–634
PDB Database: http://www.rcsb.org/pdb/home/home.do
Vyas K, Manohar H, Venkatesan K (1990) Thermally induced O to N acyl migration in
salicylamides. Thermal motion analysis of the reactants. J Phys Chem 94:6069–6073
Wood WJL, Patterson AW et al (2005) Substrate activity screening: a fragment-based method for
the rapid identification of nonpeptidic protease inhibitors. J Am Chem Soc 127:15521–15527
15 Molecular Modeling

Molecules are most commonly communicated in chemistry as two-dimensional
molecular representations. This formalism is tried and true and has proven to be
enormously fruitful. The ability of a chemist to quickly comprehend and intellectually
process structures should not be underestimated. The notation nonetheless
has its limitations. In particular, the three-dimensional shape of a molecule is not
directly apparent from the chemical formula. The geometry, however, is of great
importance for the physical, chemical, and biological properties of drugs and
consequently for drug design as well. Structure determination (▶ Chap.
13, “Experimental Methods of Structure Determination”) is therefore of special importance.
Whenever possible, the experimentally determined 3D structures of the active
substance and the target protein are consulted to explain the structure–activity
relationship. Often, however, these structures
are not available. In these cases, the explanation of the experimental results
has to rely on structural considerations based on generated models.

15.1 3D Structural Models as Well-Established Tools in Chemistry

Three-dimensional structure models have been used since Jacobus H. van’t Hoff
and Joseph Le Bel. Emil Fischer reported in his book Aus meinem Leben about
a vacation in Italy:
In the previous winter 1890/91 I was busy with the task of clarifying the configuration of
sugar, without entirely achieving my goal. Then the thought came to me in Bordighera that
the decision about the configuration of pentose has to do with its relation to trioxyglutaric
acid. Unfortunately for lack of a model I could not tell to what extent such acids are
possible according to theory and I therefore posed the question to Baeyer. He picks up such
things with great enthusiasm, and directly constructed carbon atoms from balls of bread
and toothpicks. But after many attempts he gave the cause up, ostensibly because it was too
hard. Later in Würzburg after considering good models at length, I managed to find the
conclusive solution.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_15, 315



Linus Pauling was the first to propose the α helix as a secondary structure in
proteins.
The key to Linus’s success was his reliance on the simple laws of structural chemistry. The
α-helix had not been found by only staring at X-ray pictures. The essential trick, instead,
was to ask which atoms like to sit next to each other. In place of pencil and paper, the
main working tools for this work were a set of molecular models superficially resembling
the toys of pre-school children.

With these sentences the Nobel Prize winner James Watson described the
approach of Pauling in his book The Double Helix. Pauling’s success was also
based upon well-founded proficiency in theoretical chemistry. That is how Pauling
knew that an amide bond is stiff and flat, whereas his rivals, William Bragg, Max
Perutz, and John Kendrew, mistakenly assumed that it was flexible.
James Watson and Francis Crick followed the same path as Pauling in the search for the
DNA structure:
We could thus see no reason why we should not solve the DNA problem in the same way [as
Pauling]. All we had to do was build a set of molecular models and begin to play—with luck
the structure would be a helix.

Working with molecular models cannot have been pure pleasure back then.
In one place in the book, for example, Watson writes:
Our first minutes with the models, though, were not joyous. Even though only about fifteen
atoms were involved, they kept falling out of the awkward pincers set up to hold them the
correct distance apart.

Later, other problems are described:
No serious models were built, however, for several days. Not only did we lack the purine
and pyrimidine components, but we had never had the shop put together any phosphorus
atoms. Our machinist needed at least three days merely to turn out the simplest phosphorus
atoms. . .

Against this background, the achievement of Watson and Crick seems even
more impressive. They were awarded the Nobel Prize in 1962 for the elucidation of
the double-helix structure of DNA. This example should underscore the importance
of models in science. To end with a word from Francis Crick: “A good model is
worth its weight in gold.”

15.2 Strategies in Molecular Modeling

In contrast to the 1950s and 1960s, computers are available today with impressive
graphical performance and high computing speed. Accordingly, programs are
available for working with molecular models. The new field of molecular model-
ing has been established. This term encompasses the display and manipulation of
realistic three-dimensional molecular structures along with the calculation of their
physicochemical properties. The most important methods that are employed in the
context of molecular modeling are summarized in Table 15.1.

Table 15.1 Overview of the most important molecular modeling approaches in
pharmaceutical research

Technique                                 Objective
Interactive computer graphics             Display of 3D structures
Modeling small molecules                  3D structure generation (CONCORD, CORINA)
                                          Molecular mechanics (force fields)
                                          Molecular dynamics
                                          Quantum mechanical techniques
                                          Conformational analysis
                                          Calculation of physicochemical properties
Comparing molecules                       Superimposition of molecules according to their similarity
                                          Volume comparisons
                                          3D-QSAR (e.g., CoMFA methods)
Protein modeling                          Sequence comparisons
                                          Protein homology modeling
                                          Protein-folding simulations
Modeling of protein–ligand interactions   Binding constant calculations
                                          Ligand docking
Ligand design                             Searches in 3D databases
                                          Structure-based ligand design
                                          De novo design
                                          Virtual screening

In principle, molecular modeling can be approached from two sides. One
possibility is to extrapolate the geometry and physicochemical properties to the
investigated structure from known experimental data. In the other approach an
attempt is made to obtain as accurate a computed prediction as possible by starting
from first principles. Quantum chemical methods and force-field calculations
belong to this strategy. In practice both approaches are used in parallel and are
increasingly coupled to one another. When relevant experimentally determined
structures are available, it would be silly not to use these for the model construction.
On the other hand, the quantum chemical and molecular mechanical approaches are
broadly applicable and deliver reliable results.
The construction of a structural model is achieved in three steps:
• Generation of a starting model
• Optimization and analysis
• Work with the model.
It is advisable to stay as close to experimental structures as possible when
generating the starting model. For this, the crystal structure of an active
substance can be consulted. The Cambridge Crystallographic Database, in which
experimentally determined structures of small molecules are stored, is searched,
and the geometry of the hit that most closely resembles the query molecule
is used. In the next step the molecule is optimized by a force-field calculation.
There are also standard programs for the generation of starting models that
translate a 2D structure formula into a 3D spatial structure according to the
principle of a molecular model kit. These “electronic molecule-construction kits”
have lists of bond lengths and angles as well as preferred fragment geometries
stored, and build molecules according to a sophisticated system of rules. In frac-
tions of seconds they determine the 3D spatial structure for the 2D structural
formula. The program CONCORD from Robert Pearlman in Austin, Texas, and
CORINA from Johann Gasteiger and Jens Sadowski at the University of Erlangen
are among the most important. Both programs are used to generate 3D structures of
small molecules. The 3D structure of a protein, however, cannot be built with these
programs. More sophisticated techniques are necessary for proteins (▶ Sect. 20.1).
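The construction-kit idea can be illustrated with a deliberately small sketch: a stored bond length and bond angle are turned into coordinates for an extended (zig-zag) carbon chain. This is not the rule system of CONCORD or CORINA, which is far more elaborate; the numerical values and function names are illustrative assumptions.

```python
import math

# Illustrative "standard" geometry values (assumptions, not taken from the
# text or from the parameter tables of CONCORD or CORINA).
CC_BOND_LENGTH = 1.53    # angstrom
CCC_BOND_ANGLE = 112.0   # degrees


def build_zigzag_chain(n_atoms, bond=CC_BOND_LENGTH, angle_deg=CCC_BOND_ANGLE):
    """Place n_atoms carbon atoms as a planar, fully extended chain.

    Consecutive atoms are shifted by the same step along x and alternate
    between two heights in y, which fixes both bond length and bond angle.
    """
    half_angle = math.radians(angle_deg) / 2.0
    dx = bond * math.sin(half_angle)
    dy = bond * math.cos(half_angle)
    return [(i * dx, (i % 2) * dy, 0.0) for i in range(n_atoms)]


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.dist(a, b) * math.dist(c, b)
    return math.degrees(math.acos(dot / norm))


if __name__ == "__main__":
    chain = build_zigzag_chain(4)  # carbon skeleton of an extended butane
    for atom in chain:
        print("C  %8.3f %8.3f %8.3f" % atom)
    print("bond length:", round(math.dist(chain[0], chain[1]), 3))
    print("bond angle:", round(bond_angle(chain[0], chain[1], chain[2]), 1))
```

A real 3D builder additionally has to handle rings, stereocenters, torsion preferences, and steric clashes, which is where the sophisticated rule systems mentioned above come in.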

15.3 Knowledge-Based Approaches

Perhaps the most often used technique for molecular modeling is the so-called
knowledge-based approach. Here an attempt is made to exploit the enormous
accumulated knowledge from experimentally determined molecular structures,
crystal packings, protein structures, protein sequences, and structure–activity rela-
tionships from protein–ligand complexes, etc., to efficiently solve the relevant
problem. Basically nothing more is done here than to imitate the approach that
a conscientious scientist would take with a computer program. Initially as much
experimental data as possible is collected and analyzed. Important information
sources are the Cambridge Crystallographic Database with over 500,000 crystal
structures of small molecules as well as the protein databank (PDB) with more than
80,000 protein and DNA structures. Physicochemical properties are also available
in databases. The Beilstein database, with almost 10 million chemical structures,
contains, for example, pKa values for more than 20,000 compounds. The challenge
lies in the extraction of the necessary data for the question at hand from the
enormous plethora of electronically available information. Furthermore, it must
be considered that the data comes from different sources and could be partially
erroneous.
The largest growth in electronically available data recently has occurred in the
area of DNA sequences. Hundreds of genomes have been sequenced, and new ones
are added weekly. The nearly endless number of sequences can only be conquered
with intelligent searching protocols. Knowledge-based approaches play a central
role in this area and in the modeling of protein structures.

15.4 Force-Field Methods

Force-field methods, also known as molecular mechanics, are empirical techniques
for the calculation of molecular geometries. The goal of a force-field
calculation is the determination of an energetically favorable three-dimensional
structure of a molecule, or of a complex of several molecules. The forces that act
between the atoms are described in an analytical form with the appropriate param-
eters. Covalent and non-covalent forces are considered. The central idea of molec-
ular mechanics is the assumption that the bond lengths and angles adopt values that
are close to standard values in molecules. Steric interactions, that is, the repulsion
of two atoms that are not directly connected to one another, can lead to the situation
that some bond lengths and angles cannot adopt their ideal values. These repulsive
interactions are also called van der Waals interactions. For the first time in 1946,
three terms, van der Waals interaction, bond stretching, and angle deformation,
were proposed that should be enough to calculate the structure and energy of
molecules. However, at that time the execution of such calculations was extremely
difficult. It was only after the availability of computers increased that molecular
mechanics calculations gained importance. In addition to the three originally
proposed terms, a typical force-field that is used today contains at least one
additional contribution that considers rotations around the dihedral angles
(Fig. 15.1). Furthermore, many force-fields use terms for electrostatic interactions.
For this, a partial charge must be assigned to each atom. The sum of these charges
results in the formal charge of the entire molecule. In most cases, this is set to zero.
Coulomb’s law is used to describe the forces that occur between charges. This
law states that the force between two charges is proportional to the product of the
charges and inversely proportional to the square of the distance between them; the
corresponding potential energy is inversely proportional to the
distance. The assignment of the charges and the correct choice of dielectric constant
are critical for the correct treatment of electrostatic energy contributions. The
dielectric constant appears in the denominator of Coulomb’s law and can adopt values between
ε = 80 for water and ε = 1 for vacuum. Accordingly, the electrostatic interactions in
water are very quickly damped, whereas in a vacuum they reach further. The
choice of the correct dielectric constant for force-field calculations in proteins is
very difficult. Many values between ε = 4 and ε = 20 have been tried. The constant
is sometimes assumed to depend on the environment, so that larger values are
chosen near the surface than for the protein interior. The van der Waals
interactions are described by the Lennard–Jones potential. This potential has an
attractive term that falls off with 1/r^6 and a repulsive term that falls off with
1/r^12 (Fig. 15.1). The combination of these terms gives an energy curve that rises
very steeply at short distances and approaches zero as the distance becomes larger.
In between, it passes through a potential energy minimum (▶ Fig. 18.5). In
addition to the A/r^12 − C/r^6 form, force fields also use other distance dependencies,
for example potentials with other exponents or exponential terms.
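The two non-covalent contributions can be explored numerically with a short sketch. The conversion constant (about 332 kcal Å mol^-1 e^-2, for charges in elementary charges and distances in Å) and the Lennard–Jones parameters are assumptions chosen for illustration and do not come from a particular force field.

```python
# Coulomb conversion factor for charges in units of e, distances in angstrom,
# and energies in kcal/mol (an assumed unit convention for this sketch).
COULOMB_CONSTANT = 332.0637


def coulomb_energy(q_i, q_j, r, dielectric):
    """Point-charge interaction energy q_i*q_j/(eps*r) in kcal/mol."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)


def lennard_jones_energy(a, c, r):
    """Repulsive A/r^12 minus attractive C/r^6 contribution."""
    return a / r**12 - c / r**6


def lennard_jones_minimum(a, c):
    """Distance and energy of the minimum, from dE/dr = 0."""
    r_min = (2.0 * a / c) ** (1.0 / 6.0)
    return r_min, -c * c / (4.0 * a)


if __name__ == "__main__":
    # Opposite unit charges 4 angstrom apart in different environments.
    for eps in (1.0, 4.0, 20.0, 80.0):
        energy = coulomb_energy(1.0, -1.0, 4.0, eps)
        print(f"eps = {eps:5.1f}: {energy:8.2f} kcal/mol")

    # Hypothetical atom pair: well depth 0.1 kcal/mol at 3.8 angstrom.
    well_depth, r_opt = 0.1, 3.8
    a = well_depth * r_opt**12
    c = 2.0 * well_depth * r_opt**6
    print(lennard_jones_minimum(a, c))
```

The loop makes the damping explicit: the same ion pair that is stabilized by roughly 83 kcal/mol at ε = 1 contributes only about 1 kcal/mol at ε = 80, which is why the choice of dielectric constant matters so much.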
A force-field is derived by calibrating upon the experimental data and upon the
results of high-level quantum mechanical calculations. For this the 3D structures of
small molecules as well as infrared and Raman spectroscopy-derived force con-
stants are used. It is clear that different parameters must be used for a single bond
between two carbon atoms than for a double bond. Therefore multiple different
atom types per element are used in a force-field. The crystal packing of small
organic molecules can be consulted for nonbonding interactions. Amino acids and

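$$
E = E_{\text{bond length}} + E_{\text{bond angle}} + E_{\text{torsion}} + E_{\text{non-covalent}}
$$

$$
E = \frac{1}{2}\sum_{\text{bonds}} K_b\,(b - b_0)^2
  + \frac{1}{2}\sum_{\text{bond angles}} K_\Theta\,(\Theta - \Theta_0)^2
  + \frac{1}{2}\sum_{\text{torsion angles}} K_\Phi\,\bigl[1 + \cos(n\Phi - \delta)\bigr]
  + \sum_{\text{nonbonding atom pairs}} \left(\frac{A_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^{6}} + \frac{q_i q_j}{\varepsilon\, r_{ij}}\right)
$$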

Fig. 15.1 E is the total energy of a molecule or a complex of several molecules. It is composed of
various contributions. The first term describes the energy change upon stretching or compressing
a chemical bond. In the example at hand, a harmonic potential is used with the
force constant K_b and the equilibrium bond length b_0 as parameters. The energy as a function of
the bond angle Θ is described in the second term. Here too, a harmonic potential is used, with the
force constant K_Θ and an equilibrium angle Θ_0. The third contribution describes the change in
the energy upon changing the dihedral angle Φ, and the last term stands for non-covalent interactions.
The sum of three terms is used for this last contribution. The first term A_ij/r_ij^12 is always positive
and rises quickly with decreasing distance. It describes the repulsion between atoms that come too
close together. The contribution from −C_ij/r_ij^6 is always negative and approaches zero with
increasing distance r_ij, though not as fast as the repulsive term. It describes attractive interactions,
which are also called dispersion interactions. Other attractive interactions exist between polar
molecules that are also proportional to 1/r_ij^6 (for a description of the potentials see ▶ Sect. 18.12,
▶ Fig. 18.5). The last term q_i q_j/(ε r_ij) describes the electrostatic interactions according to Coulomb’s law,
based on a point-charge model. The dielectric constant is ε. The non-covalent contribution
to the total energy, without the electrostatic term, is called van der Waals energy.

many functional groups of active compounds can occur in a protonated or
deprotonated state according to the applied pH conditions (so-called titratable
groups). The strength of the interactions strongly depends on the charge state of the
involved functional groups. The acidity or basicity of a given functional group is
determined by its pKa value. This indicates how easily a group accepts or releases
a proton. This property, in turn, depends heavily upon the partial charge that the
group carries and what other charges are in the immediate vicinity of the group.
Thus, the pKa value shifts if a functional group comes into an altered environment.
For example, carboxylic acids become more acidic when they are brought near
a positive charge. They become less acidic, on the other hand, if a partially
negatively charged group is nearby. This effect must be considered in a reliable
force-field calculation. An attempt can be made to predict the protonation state in
protein–ligand complexes with such calculations. For this, the contribution to the
energy content of the complex is determined by evaluating all possible combinations
of states of titratable groups. In this way the shift in the pKa values of
functional groups can be estimated.
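How strongly a pKa shift changes the charge state can be estimated with the Henderson–Hasselbalch relation, which is not spelled out in the text and is used here as a simple stand-in for the full enumeration of titration states. The pKa values below are hypothetical and serve only to illustrate the trend.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an acid HA present as A- at a given pH.

    Follows from [A-]/[HA] = 10**(pH - pKa) (Henderson-Hasselbalch).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    ph = 5.0
    # Hypothetical pKa values of one carboxylic acid in three environments.
    environments = {
        "near a positive charge": 3.5,
        "unperturbed": 4.5,
        "near a negative charge": 6.5,
    }
    for label, pka in environments.items():
        share = 100.0 * fraction_deprotonated(pka, ph)
        print(f"{label:24s} pKa {pka:.1f}: {share:5.1f} % deprotonated")
```

In this example a shift of one or two pKa units moves the group from almost fully charged to largely neutral at the chosen pH, which in turn changes the electrostatic term of the force field.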
The importance of water as a binding partner in the formation of protein–ligand
complexes was emphasized in ▶ Chap. 4, “Protein–Ligand Interactions as the Basis
for Drug Action”. Complex formation causes a change in the solvation conditions
for the involved molecules. This must be considered in the force-field calculations.
For this, a force-field is combined with estimations for the contribution from
solvation. Newer methods such as the MM-PBSA or MM-GBSA methods try to
sum up these contributions over the local environment in a surface-dependent way.
The choice of a relevant starting geometry is important for any force-field
calculation. A force-field calculation leads to an energy minimization. By starting
from an energetically unfavorable geometry, the force field drives “downhill” to the
next local minimum on the multidimensional energy surface (▶ Sect. 16.2). If one
starts with two different geometries, the resultant minimized structure can also
be different. Many molecules and especially protein–ligand complexes can adopt
numerous energetically favorable conformations. It is therefore recommended that
multiple force-field calculations are performed by starting from different
geometries.
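The dependence on the starting geometry can be demonstrated on a one-dimensional example. The sketch below applies a plain steepest-descent minimizer to a single threefold cosine torsion term of the kind used in Fig. 15.1; the force constant, step size, and function names are assumptions chosen for illustration, and real programs use more efficient minimizers on many coordinates at once.

```python
import math

# Hypothetical parameters for E(phi) = 0.5 * K * (1 + cos(n * phi - delta)).
K_PHI = 3.0    # barrier height in arbitrary energy units
N_FOLD = 3     # periodicity of the torsion term
DELTA = 0.0    # phase shift in radians


def torsion_energy(phi):
    """Torsion energy for a dihedral angle phi given in radians."""
    return 0.5 * K_PHI * (1.0 + math.cos(N_FOLD * phi - DELTA))


def torsion_gradient(phi):
    """Derivative dE/dphi of the torsion energy."""
    return -0.5 * K_PHI * N_FOLD * math.sin(N_FOLD * phi - DELTA)


def minimize(start_deg, step=0.05, n_steps=500):
    """Steepest descent: always move downhill along the negative gradient."""
    phi = math.radians(start_deg)
    for _ in range(n_steps):
        phi -= step * torsion_gradient(phi)
    return math.degrees(phi)


if __name__ == "__main__":
    for start in (10.0, 130.0):
        print(f"start {start:6.1f} deg -> minimum {minimize(start):6.1f} deg")
```

The two starting angles lie on opposite sides of the barrier at 120° and therefore end in different minima (60° and 180°). Both minima have the same energy here, but in a real molecule they would generally differ, which is why several starting geometries should be tried.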

15.5 Quantum Chemical Methods

In quantum mechanical approaches, the electronic structure of molecules is calculated
by using the Schrödinger equation. Its closed-form mathematical solution is,
however, only possible for simple cases such as the hydrogen atom or the hydrogen
molecular ion, H2+. For molecules with multiple electrons, approximate methods
must be used for the solution of the quantum mechanical “many-body problem.”
The most commonly used approximation is the so-called Hartree–Fock method.
Here, the many-body problem is reduced to multiple single-body problems. The
sum of the electron–electron interactions within a molecule is replaced with an
effective field that can be iteratively refined. It is from this that the commonly used
name, SCF (self-consistent field), is derived. Each electron in this model “sees”, in
addition to the potential of the nuclei, the averaged potential of the remaining
electrons. The state of each electron is described by a single-particle
function, the so-called atomic orbital (AO) in an atom or molecular orbital
(MO) in a molecule. The wave function of the entire molecule is constructed as the antisymmetrized
product of these orbitals. The Hartree–Fock equation is obtained from the
condition that optimally chosen orbitals lead to minimal energy. The main deficiency
of the Hartree–Fock approach, namely, neglecting the electron correlation,
can be corrected with more elaborate methods, whereby the calculation time,
however, severely increases.
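In the usual textbook notation, which is an addition here and not taken from the text, the antisymmetric product of N orbitals φ_1 … φ_N is written as a Slater determinant, and the energy-minimizing orbitals satisfy one-electron eigenvalue equations with the Fock operator F:

$$
\Psi(1,\dots,N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\phi_1(1) & \phi_2(1) & \cdots & \phi_N(1) \\
\phi_1(2) & \phi_2(2) & \cdots & \phi_N(2) \\
\vdots    & \vdots    & \ddots & \vdots    \\
\phi_1(N) & \phi_2(N) & \cdots & \phi_N(N)
\end{vmatrix},
\qquad
\hat{F}\,\phi_i = \varepsilon_i\,\phi_i
$$

Exchanging two electrons swaps two rows of the determinant and therefore changes the sign of Ψ, which is the required antisymmetry. Because the Fock operator itself depends on the occupied orbitals through the averaged electron field, the equations have to be solved iteratively until the orbitals no longer change, hence the term self-consistent field.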
Quantum mechanical ab initio calculations allow the calculation of the molec-
ular structure and electron density distribution as well as molecular properties
without the assumptions that are necessary for force-field calculations. In many
cases it is difficult to predict the hybridization state of
the atoms a priori. In the case of amines and sulfonamides, it is often impossible to predict
whether the atoms that are bound to nitrogen are in the same plane or whether
nitrogen is in a pyramidal environment. In a force-field calculation one must specify
from the very beginning what atom type is to be assigned to which atom (i.e., for
the above case, whether it should be a planar or a pyramidal nitrogen atom). If
the wrong atom type is chosen, the resulting structure is, of course, meaningless.
Quantum mechanical calculations require no such assumptions.
The majority of currently applied force-fields use a point-charge model to
describe the electrostatic interactions. One possibility to derive the atomic charges
is to calculate the electrostatic potential of a small molecule that contains the group
in question by using quantum-mechanical methods. Subsequently, a set of partial
charges is assigned to the various nuclei so that the quantum mechanically calcu-
lated potential is depicted as accurately as possible. These charges can then be
transferred to force-field calculations to be used in a large system.
A further important application of quantum-mechanical calculations in drug
design is found in the calculation of conformational energies of small molecules
to calibrate force-fields. The force-fields that have been developed for proteins
and peptides are based on conformational energies that have been quantum-
mechanically calculated for small peptides.
In contrast to force-field methods, quantum-mechanical techniques are able to
consider the polarization of the electron density caused by the influence of neighboring
groups. For example, the amide bond dipoles in an α helix are all oriented in the same
direction so that they sum up to a significant total dipole moment. As a consequence,
such large accumulated dipoles can polarize other groups that are located at the end of
the helix. Such induced dipoles are only incompletely described by force-field
methods. For quantum-mechanical methods, this is not a problem. A further important
application area is chemical reactions, for which force-fields are hardly parameterized
at all, with the exception of a few special cases. Here quantum mechanical methods
are the only possibility for theoretical description.
Quantum-mechanical methods are considerably more elaborate than force-field
methods. The most accurate methods, which also consume the most calculation time,
are the so-called ab initio methods. These techniques reach their limits, however,
with very large systems. Therefore other, less computationally demanding methods
were developed. In these so-called semiempirical methods, certain integrals, the
determination of which represents the rate-determining step in ab initio methods,
are replaced with adequate approximations that are quickly calculated. The drastically
reduced calculation time that results, which is nevertheless accompanied by
reduced accuracy, allows the routine application of semiempirical calculations to
active molecules and proteins. Density functional theory represents another, faster
ab initio technique. With this method, the position-dependent electron density
distribution is calculated in the ground state for a many-body system; the complete
solution to the Schrödinger equation for a many-body system is avoided. All of the
interesting properties are then derived from the electron density. Techniques have
been developed for large protein–ligand systems that treat the interesting areas, for
example, the binding site or the catalytic reaction center, quantum mechanically.
The surrounding areas are approximated with a faster force-field method (QM/MM
methods).

15.6 Computing Molecular Properties

The result of a molecular mechanics or quantum chemical calculation is at first a set
of atomic coordinates that define the three-dimensional shape of the molecule.
What can be done with this? An important application of the calculations is the
determination of conformational energies: this is the relative energy of a molecular
conformation in comparison to another (▶ Sect. 16.1).
Two further molecular properties can be calculated: the form and size of
a molecule along with its electronic characteristics. All of the currently used
graphics programs have multiple options for the display of the spatial structure of
molecules. The most important are summarized in Fig. 15.2.
The most often used representation is a line or stick representation (Dreiding
models); sometimes atoms are displayed as little spheres. As a general rule, color
coding is used to denote the atoms: nitrogen is blue, oxygen is red, sulfur is yellow,
fluorine is turquoise, chlorine is green, bromine is brown, and iodine is violet.
Hydrogen atoms are shown in white, but usually they are omitted for the sake of
clarity. Carbon atoms are generally shown in black or gray. In the majority of
figures in this book, carbon atoms that belong to protein are shown in orange, and
carbon atoms that belong to the ligand are shown in gray. Another display option is
the space-filling model, with which van der Waals surfaces are shown. For this
representation each atomic nucleus is shown with a sphere, the size of which
corresponds to the van der Waals radius. Values for these radii come from the
crystal packing or from very exact ab initio calculations. Such representations are
also known as CPK models (named after the scientists Corey, Pauling, and Koltun).

Fig. 15.2 Different computer graphics representations of dopamine (▶ Sect. 1.4, Formula 1.13).
Carbon atoms are colored gray, hydrogen atoms are white, nitrogen atoms are blue, and oxygen
atoms are red. (a) Dreiding models. (b) Ball-and-stick models. (c) Space-filling models (CPK
representation). (d) Solvent-accessible surface. (e) Electrostatic potential projected on the surface
(positively charged areas are blue, negatively charged areas are red). (f) Highest occupied
molecular orbital (HOMO), calculated for the uncharged dopamine molecule. The blue and red
areas indicate opposite signs of the wave function.

Furthermore there are other options for displaying surfaces (Fig. 15.3). The
solvent-accessible surface has proven particularly valuable for proteins. The
most-used protein-display form in this book is transparent-opaque white surfaces.
The van der Waals surface in Fig. 15.3a gives the impression that a crack is present
at the position that is marked with the arrow. This crevice, however, is so narrow
that no other atom fits inside. Therefore the solvent-accessible surface (Fig. 15.3b)
is less misleading. It is generated by rolling a sphere with a radius of 1.4 Å, which
corresponds to the size of a water molecule, over the surface of the molecule. This
surface appears much smoother. Depressions that are still present mean that
small molecules – at least a water molecule – can really fit in there. The Lee–
Richards surface is less frequently used but very helpful. It is chosen so that ligand
atoms that come into contact with the examined surface lie directly on this surface
(Fig. 15.3c).
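
The rolling-probe idea can be illustrated with a minimal sketch. It only tests whether a water-sized probe sphere placed at a given point overlaps any atom; it does not construct a surface. The atom positions and the van der Waals radius of 1.7 Å are hypothetical values chosen for illustration, and the function name `probe_fits` is not taken from any modeling package.

```python
import math

PROBE_RADIUS = 1.4  # Å, probe radius quoted in the text for a water molecule


def probe_fits(center, atoms, probe_radius=PROBE_RADIUS):
    """Return True if a probe sphere centred at `center` overlaps no atom.

    `atoms` is a list of ((x, y, z), vdw_radius) tuples, all lengths in Å.
    """
    for position, vdw_radius in atoms:
        if math.dist(center, position) < vdw_radius + probe_radius:
            return False
    return True


# Two atoms with a hypothetical van der Waals radius of 1.7 Å on the x axis.
narrow_crevice = [((0.0, 0.0, 0.0), 1.7), ((5.0, 0.0, 0.0), 1.7)]  # 1.6 Å gap
wide_pocket = [((0.0, 0.0, 0.0), 1.7), ((7.0, 0.0, 0.0), 1.7)]     # 3.6 Å gap

# Place the probe halfway between the two atoms in each case.
print(probe_fits((2.5, 0.0, 0.0), narrow_crevice))  # gap narrower than the probe
print(probe_fits((3.5, 0.0, 0.0), wide_pocket))     # gap wider than the probe
```

In the first case the gap between the two van der Waals spheres is narrower than the probe diameter of 2.8 Å, so the crevice would not appear on a solvent-accessible surface.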
The surface can be colored too. For example, a color can be assigned to each
atom type, and then the color of the next-closest atom can be used for the surface.
A representation in which the molecule’s surface is colored according to other
properties, for example, electrostatic or hydrophobic potential, is very instructive.
Fig. 15.3 Definitions of molecular surfaces (a) van der Waals surface. The arrow marks a place
where a crevice is found, but it is too small to accommodate a water molecule. (b) Solvent-
accessible area. (c) Lee–Richards surface.

15.7 Molecular Dynamics: Simulation of Molecular Motion

None of the processes that are interesting to us run at 0 Kelvin, but rather at body
temperature, which is approximately 310 Kelvin. It is therefore clear that not only
the potential energy but also the kinetic energy must be considered. Molecules
move at room temperature. They diffuse and change their shape in that they adopt
different conformations. The flexibility and adaptability of both partners play a big
role in protein–ligand interactions. A prerequisite for protein binding is that the
ligand can take on a conformation that corresponds to the shape of the binding
pocket. On the other hand, the protein is flexible to a certain extent. For example,
side chains on the surface can adopt different conformations or entire domains can
move relative to one another. The mutual adaptation of protein and ligand shapes
plays an important role in the formation of protein–ligand complexes in particular.
The molecular dynamics simulation (MD) is a theoretical method to describe
these effects. In molecular dynamics simulations the movement of atoms and
molecules is followed under the influence of the chosen force-fields. It is assumed
in these calculations that the interactions between particles obey the laws of
classical mechanics. For this, the Newtonian equations of motion are solved in
parallel and stepwise for all particles simultaneously. Usually it is assumed that the
force between two particles is not influenced by other particles.
In practical applications, a starting geometry is generated first (Fig. 15.4). If an
experimentally determined structure, for instance, the crystal structure of a protein–
ligand complex, is available, then that is the starting point. To take the surrounding
water shell into consideration, the complex is dipped into a “water bath,” that is,
a large number of water molecules enclose it. Further, an adequate number of ions is
added to keep the whole system in an electrically neutral state. To prevent boundary
effects on the “walls,” a trick called “periodic boundary conditions” is used on the
water bath. If the simulated protein complex approaches such a wall and wants to
leave the water bath, the process is handled on the computer as though the complex
had again entered from the opposite side. Formally, the boundary areas of the water
bath are eliminated.
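
The re-entry from the opposite side can be sketched in one dimension. The box length of 50 Å is a hypothetical value, and the function names are chosen for illustration only. The second function shows the related minimum-image convention, which is not described in the text but is commonly used together with periodic boundaries to find the shortest distance between two particles.

```python
def wrap(coordinate, box_length):
    """Map a coordinate back into the interval [0, box_length)."""
    return coordinate % box_length


def minimum_image(delta, box_length):
    """Shortest separation along one axis when periodic copies are considered."""
    return delta - box_length * round(delta / box_length)


BOX = 50.0  # Å, hypothetical edge length of the water bath

print(wrap(51.5, BOX))           # left through the right wall, re-enters at 1.5
print(wrap(-0.5, BOX))           # left through the left wall, re-enters at 49.5
print(minimum_image(48.0, BOX))  # a neighbor 48 Å away is only 2 Å away via the wall
```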
In the beginning of the actual simulation each atom is assigned a random starting
velocity with an arbitrary orientation. The velocities are chosen so that on average
they correspond to the desired temperature (Boltzmann distribution). Then all
forces from all surrounding atoms acting on a particular atom are calculated.
At set time intervals the next position is calculated with the Newtonian equations
of motion, and so forth. The step width is typically a femtosecond (1 fs = 10⁻¹⁵ s). This
small step width is necessary because there are many extremely fast processes that
occur on the molecular level. The development of the movement is followed for
multiple nanoseconds, and is shown in terms of a trajectory. Ten nanoseconds are
enough to follow the movement of side chains and sometimes even of protein
domains. It is not enough, however, to describe the diffusion of an active compound
into the binding pocket. For this, longer simulation times are necessary. The folding
of a protein is also difficult to simulate with this technique. The necessary time for
protein folding is on the actual time scale between 20 ms and 1 h. The calculation of
one time step (1 fs) still requires seconds of processing time on even the fastest
computers. Nonetheless new algorithms and computers with more specific archi-
tectures are being developed that will make such simulations possible in the
foreseeable future.
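
The stepwise solution of the equations of motion can be sketched for a single particle on a spring. The text does not name an integration scheme; the velocity Verlet scheme used here is one common choice and is an assumption, as are the reduced units (mass, spring constant, and time step without physical dimensions) and all function names.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation of motion for one particle in one dimension."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # new position
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append(x)
    return trajectory, v


K = 1.0  # spring constant (reduced units, illustrative)
M = 1.0  # particle mass (reduced units, illustrative)


def spring_force(x):
    return -K * x


def total_energy(x, v):
    return 0.5 * M * v * v + 0.5 * K * x * x


traj, v_end = velocity_verlet(1.0, 0.0, spring_force, M, dt=0.01, n_steps=1000)
e_start = total_energy(1.0, 0.0)
e_end = total_energy(traj[-1], v_end)
print(traj[-1], math.cos(10.0))  # numerical and analytical position at t = 10
print(e_start, e_end)            # total energy should stay nearly constant
```

The same loop applied to a real system shows why simulation time is expensive: with a step width of 1 fs, a trajectory of 10 ns requires 10⁷ force evaluations for every atom.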

Fig. 15.4 Schematic course of a molecular dynamics simulation. The starting geometry is either
an experimentally determined structure or a geometry that was optimized with force-fields. Each
atom is assigned an appropriate starting velocity. Then the equations of motion are solved
stepwise with these starting conditions and the coordinates are periodically saved. Flowchart:
Generate Start Coordinates → Choose Starting Velocity → Calculate Forces (Pair Approximation) →
Calculate Velocity and New Coordinates → Save Coordinates → Another Step? (Yes: repeat; No: End).

Another application of MD simulations, the calculation of binding affinity,
should be mentioned here. In principle the free energy ΔG for a given system can be
calculated. From the point of view of statistical thermodynamics the so-called
partition function (German: Zustandssumme) is determined for this, in which the
energetic contributions of all possible configurations of a system are considered.
The entropic component of the system is automatically calculated by determining
the distribution and relative population of the many states. Differences in the
binding free energy of different ligands are of particular interest in the context of
protein–ligand interactions. Experience has shown, however, that only differences in the
binding free energy between two similar ligands can be reliably calculated. In
modern applications (e.g., for screening purposes, ▶ Sect. 7.4), particularly large
amounts of data are evaluated. Therefore, the effort associated with MD
calculations to estimate the binding affinities can hardly be afforded. Furthermore,
many simple empirical methods allow a good affinity estimation to be made that is
of similar quality. Therefore these faster methods are more readily used.
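
The link between the partition function, the population of states, and the free energy can be sketched for a small, discrete set of states. This is a strong simplification: a real calculation samples a very large number of configurations from a simulation. The three energies used below are hypothetical, and the function names are chosen for illustration.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def boltzmann_populations(energies, temperature):
    """Relative populations of states with the given energies (kJ/mol)."""
    rt = R * temperature
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)       # partition function relative to e_min
    return [w / z for w in weights]


def free_energy(energies, temperature):
    """Free energy G = -RT ln Z of the set of states, in kJ/mol."""
    rt = R * temperature
    e_min = min(energies)
    z = sum(math.exp(-(e - e_min) / rt) for e in energies)
    return e_min - rt * math.log(z)


states = [0.0, 3.8, 3.8]  # hypothetical state energies in kJ/mol
print(boltzmann_populations(states, 310.0))
print(free_energy(states, 310.0))
```

Because several accessible states lower the free energy below the energy of the lowest state, the entropic contribution is contained in the result without being computed separately.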

15.8 Dynamics of a Flexible Protein in Water

The most important application of molecular dynamics simulations is undoubtedly
the ability to follow the motion of one or more molecules in solution. For example,
it can be investigated which parts of a protein’s binding pocket or of a ligand remain
rigid upon protein–ligand complex formation and which are flexible.
The enzyme aldose reductase has proven to be a very flexible protein. It is
capable of adapting its binding pocket to the shape of a complexed ligand in
versatile ways. This property is related to the biological function of this protein.
It reduces a very broad palette of aldehyde substrates. Its exact function and role as
a target structure for a drug therapy is discussed in ▶ Sect. 27.5. Highly flexible and
adaptive proteins pose a special challenge to drug design. From the many crystal
structure determinations it became apparent that there are several parent confor-
mations for aldose reductase that are most likely in a dynamic equilibrium with one
another. A binding ligand picks out a conformation from this equilibrium that fits,
and the conformation becomes stabilized upon binding. These considerations are
applied to GPCRs in particular, which are introduced in ▶ Chap. 29, “Agonists and
Antagonists of Membrane-Bound Receptors”.
Matthias Zentgraf carried out extensive molecular dynamics simulations on
aldose reductase. The profile that resulted was consistent with the crystallographic
structure determinations. Amino acids that are repeatedly found in many protein–
ligand complexes with modified geometries were shown to be very flexible in MD
simulations as well. If the trajectory of such simulations is evaluated, it is apparent
that the protein flips between the above-mentioned parent conformations. Addi-
tionally, many geometries occur that have only small but structurally critical
variations to these parent conformations. Small areas in the binding pocket are
thus opened that are able to accommodate, for example, an additional methyl group
or a phenyl ring on a ligand. Such information can be directly used for the design of
new inhibitors.
To provide an overview of the flexibility of a protein, the variation of the atom
positions is calculated from one simulation state to the next along a trajectory. Just
as with photographic film, these momentary pictures of complexes are called
“snapshots.” Above all, it becomes apparent whether a protein fluctuates for
a particular time in one conformation before it flips into another geometry.
Subsequently it can either return to the original geometry or flip into another
basis geometry. Such a map is shown in Fig. 15.5. From this map,
it can be seen that the protein spends time in multiple parent conformations.
If representative snapshots from these clusters of basis conformations are
superimposed upon one another, a very good picture of which groups in the binding
pocket show enhanced flexibility is obtained. In the example at hand, the side
chains from two neighboring phenylalanines (Phe121 and Phe122, Fig. 15.6) are
particularly implicated. These can swing out of the way to open a new, previously
closed cavity in the binding pocket. In the context of drug design, such information
can be translated into the design of new inhibitors that can occupy new binding
pockets. In this way an improved affinity or selectivity for the target protein can be
achieved. A ligand is shown in Fig. 15.7 that has been furnished with an additional
benzyl group (red), that optimally fills the newly opened cavity in the snapshot (in
light blue) in Fig. 15.6.
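
A map such as the one in Fig. 15.5 can be thought of as a table of root-mean-square deviations (rmsd) between pairs of snapshots. The sketch below assumes that the snapshots have already been superimposed and contain the same atoms in the same order; the three two-atom "snapshots" are invented purely for illustration, and the function names are hypothetical.

```python
import math


def rmsd(snapshot_a, snapshot_b):
    """Root-mean-square deviation between two pre-superimposed snapshots.

    Each snapshot is a list of (x, y, z) tuples with identical atom order.
    """
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(snapshot_a, snapshot_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(snapshot_a))


def rmsd_matrix(snapshots):
    """Pairwise rmsd table; entry [i][j] compares snapshot i with snapshot j."""
    n = len(snapshots)
    return [[rmsd(snapshots[i], snapshots[j]) for j in range(n)] for i in range(n)]


# Invented toy trajectory: two atoms, three snapshots (coordinates in Å).
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)],
]

for row in rmsd_matrix(toy_trajectory):
    print([round(value, 2) for value in row])
```

Blocks of small values along the diagonal of such a table correspond to periods in which the system stays near one parent conformation; small values far from the diagonal indicate a return to a geometry visited earlier.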

15.9 Model and Simulation: Where Are the Differences?

To conclude this chapter, the terms “model” and “simulation” should be briefly
compared and contrasted. Molecular models are used to approach questions that are
experimentally difficult or impossible to address. What different conformations can
a molecule adopt? This question is currently difficult to answer experimentally.
Does a possible drug candidate fit into a protein’s binding pocket? Even this
question is only answerable with laborious experiments. The use of models is
an elementary component of every scientific discipline. Models have always played
a central role in chemistry. It is shown in ▶ Chaps. 23, “Inhibitors of Hydrolases
with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”;
▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”; ▶ 26, “Transferase Inhibitors”;
▶ 27, “Oxidoreductase Inhibitors”; ▶ 28, “Agonists and Antagonists of Nuclear
Receptors”; ▶ 29, “Agonists and Antagonists of Membrane-Bound Receptors”;
▶ 30, “Ligands for Channels, Pores, and Transporters”; ▶ 31, “Ligands for Surface
Receptors”; and ▶ 32, “Biologicals: Peptides, Proteins, Nucleotides, and
Macrolides as Drugs” how models, built on the basis of crystal structures of
protein–ligand complexes, afford important contributions to drug design, especially
in the preselection of possible molecular candidates for synthesis.

Fig. 15.5 2D RMS diagram (both axes: number of snapshots, 0 to 600; color scale: rmsd, 0 to 1.8 Å).
The development with time of the spatial deviations of various snapshots along the
simulation trajectory is visualized on this map. Large deviations are color-coded with red,
medium-sized deviations with green, and small deviations are colored blue. Green delineated square
areas are recognizable along the main diagonal. There the complex spends time near a parent
conformation. The transition to the next square represents a flip to a new geometry. If sectors
outside the main diagonal are colored increasingly red, the geometry deviates strongly from the
previously adopted conformation. If an area outside the diagonal is reached that is green, the newly
adopted geometry is not very different from a state that the system had reached before. With such
a map it is possible to see which of the many parent conformations a complex swings between.

The term “simulation” describes the calculations with models. Multiple options
or variable combinations can be quickly evaluated on the computer for a given
mathematical model. Such investigations can contribute considerably to a better
understanding of the system. Next to theory and experiment, computer simulations
have been called the third pillar of exact science.
Fig. 15.6 Representative snapshots were taken from the different square areas along the main
diagonal in Fig. 15.5 and superimposed upon one another. It can be seen that above all else, the
side chains of the phenylalanines Phe121 and Phe122 can undergo large movements in the
binding pocket. In doing so, they can also adopt conformations (e.g., the light-blue geometry) that
open a new hydrophobic cavity in the binding pocket.

Beware of too-high expectations in the area of drug design! It should not be
overlooked that the performance of a reasonable simulation requires that the
fundamental model is accurate and its limitations are well understood. This prerequisite
is indeed well met in many areas of engineering science so that a simulation
plays an important role in the design of automobiles or computer chips. Unfortu-
nately, in chemistry things are more complicated. The currently available molecular
models allow the assembly and ranking of compounds that are to be synthesized.
They can also be used to design ligands with improved binding properties. None-
theless, the present models are often not exact enough to allow detailed simulations
of protein–ligand complexes with sufficient accuracy to determine a binding
energy. In view of the importance of this field, this can only mean that more effort
must be exerted for the collection of experimental data for the development of
improved models.
Fig. 15.7 Conformations occur along the trajectory of the protein that open a new hydrophobic
pocket when the side chain of a phenylalanine swings away (Fig. 15.6, i.e., light-blue geometry).
This pocket can be occupied by a ligand. For this a benzyl group was added to the scaffold of the
shown benzodiazepine-like inhibitor, which can occupy the opened pocket during the simulation.

15.10 Synopsis

• Models have been and still are used in chemistry in general, but in particular in
modern drug design. Computer graphics is a versatile tool to display structures
and models along with various properties assigned and/or geometrically
superimposed onto these molecules.
• Structures can be calculated by starting from first principles and by trying to
follow the physics as closely as possible. This is done with quantum mechanical
calculations. Because these methods easily become elaborate and computationally
intractable, empirical approaches are an alternative. They are based on
much simpler physics, normally classical mechanics, and treat molecules as a set
of point charges in space interconnected by springs following harmonic
potentials.
• Empirical approaches can only be used if enough experimental data are available
to parameterize and calibrate the empirical concepts. Therefore large databases
assembling knowledge about molecular properties have been developed.
• Molecular mechanics calculations of the geometry of molecules are based on
empirical force-fields. They comprise multiple energy terms that describe
mutual interactions in the molecule either through bonds or through space.
Particular potentials are used to describe the torsional barrier to rotations around
single bonds. Furthermore nonbonded interactions are handled by special
potentials.
• The accuracy and required computational capacity of quantum chemical
approaches depend on the sophistication of the basis sets of atomic or molecular
orbitals used for the calculations. Parameterization of some parts of the calculations
with empirical data can significantly reduce the computational requirements.
Density functional theory is a faster approach and works with electron
density distributions instead of orbitals. Combinations of quantum chemical
methods and force-field approaches have been developed to handle large sys-
tems such as protein–ligand complexes.
• Properties such as charges can be displayed on the surface of molecules.
Different types of surfaces have been defined such as the van der Waals surface
or the solvent-accessible surface.
• Molecular dynamics simulations are normally based on potentials derived from
empirical force-fields. They consider the properties of a molecule under dynamic
conditions by solving Newtonian equations of motion. As a result, the motion of
a molecule can be evaluated with time by analyzing the so-called molecular
trajectory.
• Molecular dynamics simulations can be used to study the flexibility of a protein
next to its ligand-binding site. Such simulations can show multiple conforma-
tions of the protein that are competent to accommodate different ligands.
• Computer simulations allow the possible properties of molecules under differ-
ent test conditions to be enumerated. They help to interpret results from
experiments or help to predict properties of molecules to better plan the next
experiments.

Bibliography

General Literature
Barnickel G (1995) Molecular modelling – von der Theorie zur Wirklichkeit. Chemie in unserer
Zeit 29:176–185
Birner P, Hofmann HJ, Weis C (1979) MO-theoretische Methoden in der organischen Chemie.
Akademie-Verlag, Berlin
Burkert U, Allinger NL (1982) Molecular mechanics, ACS monograph 177. American Chemical
Society, Washington, DC
Goodfellow JM (ed) (1995) Computer modelling in molecular biology. VCH, Weinheim
Kunz RW (1991) Molecular modelling für Anwender. Teubner Studienbücher
Leach A (2001) Molecular modelling: principles and applications, 2nd edn. Prentice Hall, New York
Lipkowitz KB, Boyd DB (eds) (1990) Reviews in computational chemistry. VCH, Weinheim
Special Literature
Cornell WD et al (1995) A second generation force field for the simulation of proteins, nucleic
acids, and organic molecules. J Am Chem Soc 117:5179–5197
Cram DJ (1988) The design of molecular hosts, guests, and their complexes. Angew Chem Int Ed
Eng 27:1009–1020
Fischer E (1922) Aus meinem Leben. Springer, Berlin, p 134
Pullman B (1990) Molecular modelling, with or without quantum chemistry. In: Rivail JL (ed)
Modelling of molecular structures and properties, vol 71, Studies in physical and theoretical
chemistry. Elsevier, Amsterdam, pp 1–15
van Gunsteren WF, Weiner PK (1989) Computer simulations of biomolecular systems. ESCOM,
Leiden
Watson JD (2010) The double helix, Phoenix, London; originally published by Weidenfeld &
Nicholson 1968
16 Conformational Analysis

Assembling a molecule with a modelling kit already makes it clear that rotations
around single bonds can be easily carried out. The molecule will achieve a different
shape, or as the chemists say, it is transformed into a different conformation. In a real
molecule, rotations around these bonds are not fully free. They are subject to
a potential, and during the rotation the molecule adopts particular, energetically
favorable arrangements. n-Butane represents the simplest case (Fig. 16.1). The central
torsion or dihedral angle determines the relative orientation of the two bonds to the
methyl groups to one another. If n-butane is rotated out of the arrangement with the two
bonds to the methyl groups in 180° orientation (trans), the methyl group at the “front”
carbon and the hydrogen atom at the “back” carbon will directly coincide with each
other at rotation angles of 120° and 240°; this arrangement is called “eclipsed”. In this
geometry, they come closer to one another, therefore this arrangement is unfavorable for
steric reasons. At rotation angles of 60° and 300° the groups are again in a staggered
geometry, which is an energetically more favorable situation. This arrangement is
somewhat less favorable than the staggered trans orientation because of the spatial
vicinity of the methyl groups, which are now said to be “gauche” to one another. Finally
along the rotation path an orientation is adopted at 0° and 360° in which both methyl
groups are exactly behind one another. This is the least favorable orientation.
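
The torsion angle described above can be computed from the coordinates of the four atoms that define it. The following sketch uses a standard vector formula; the idealized coordinates are invented for illustration (central bond along the z axis) and are not real butane coordinates. The result is reported in the range from 0° to 360°, matching the convention used in the text.

```python
import math


def subtract(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle about the p1-p2 bond in degrees, in the range [0, 360)."""
    b1 = subtract(p1, p0)
    b2 = subtract(p2, p1)
    b3 = subtract(p3, p2)
    n1 = cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0


def back_atom(angle_deg):
    """Invented position of the 'back' substituent for a given torsion angle."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a), 1.0)


front = (1.0, 0.0, 0.0)                       # 'front' substituent
c2, c3 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)     # central bond along z

for angle in (60.0, 120.0, 180.0, 300.0):
    print(angle, round(torsion_angle(front, c2, c3, back_atom(angle)), 1))
```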

16.1 Many Rotatable Bonds Create Large Conformational Multiplicity

Multiple energy maxima and minima can be passed through during the course of
a full rotation of 360°, depending on which atoms and groups are attached to the
rotatable bond. They are at different energy levels relative to one another. The
lowest minimum is called the global minimum, and the energetically higher minima
are called local minima. Knowledge about these minima is important because
molecules adopt geometries that correspond to such energy minima. Calculations
are necessary to find these minima. One possible method is the systematic rotation
of all rotatable bonds, for instance in 10° steps. At each step the energy of the
molecule is calculated by using a force-field. All detected minima correspond to
possible conformations of the molecule.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_16, © Springer-Verlag Berlin Heidelberg 2013

Fig. 16.1 (Energy profile; x axis: torsion angle τ from 0° to 360°; y axis: energy in kJ/mol, with
levels marked at 3.8, 14.6, and 25.5 kJ/mol; the trans minimum lies between the two gauche minima.)
Butane, CH3CH2CH2CH3, is made up of a linear chain of carbon atoms. If the terminal
methyl groups are covering one another after rotation around the central C–C bond, the torsion
angle about the central bond is 0°. At a 60° angle the “back” methyl group is halfway between the
“front” methyl group and a hydrogen atom. This situation is called a “gauche” orientation. At 120°
a methyl group and a hydrogen atom are eclipsed to one another. At 180° the terminal methyl
groups are exactly opposite one another. Here the energetically most favorable situation, the trans
orientation, is achieved. From here on, the course of the rotation is mirror symmetrical, and ends in
the starting position at 360°. The orientations at 120° and 240° are energetically less favorable than
the 180° orientation by 14.6 kJ/mol. The gauche orientations at 60° and 300° are 3.8 kJ/mol higher
in energy than the trans orientation. The orientation at 0° (360°), in which the two methyl groups
cover one another, is the least favorable one and is 25.5 kJ/mol higher in energy. If a minimization
method is applied that can only run “downhill,” the three minima on the potential curve can be
reached by starting at the 110°, 130°, and 250° points.

Most drug-like molecules have many single bonds and therefore exhibit more
than one rotatable bond. For these bonds, multiple values of the torsion angle can be
adopted. These values have to be combined for all rotatable bonds in the molecule.
The number of possible combinations increases multiplicatively. The molecule
n-hexane has three rotatable bonds. If, analogous to n-butane, three local minima
are assumed for each rotatable bond (60°, 180°, and 300°), we can expect 3 × 3 ×
3 = 27 minima. To perform a systematic search for these minima in 10° steps,
however, the evaluation of 36 × 36 × 36 = 46,656 positions would be necessary. In
principle, the energy must be calculated for each of these positions. Not all angle
positions will, however, lead to reasonable geometries. It can happen that parts of
the molecule fold back upon itself, and parts will mutually superimpose. Such
collisions can be recognized by computer programs, and the geometry is discarded
from consideration. It is also easily imaginable that with an increasing number of
rotatable bonds, the number of local minima and adoptable geometries can dramat-
ically increase in a systematic search.
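
The multiplicative growth can be made concrete with a short sketch that enumerates the grid of a systematic search. The function names are hypothetical; the step size of 10° and the three rotatable bonds of n-hexane are taken from the text.

```python
import itertools


def torsion_grid(n_bonds, step_degrees):
    """Yield every combination of torsion angles on a regular grid."""
    angles = range(0, 360, step_degrees)
    return itertools.product(angles, repeat=n_bonds)


def count_grid_points(n_bonds, step_degrees):
    """Number of grid points without enumerating them."""
    return (360 // step_degrees) ** n_bonds


# n-Hexane: three rotatable bonds scanned in 10° steps.
print(sum(1 for _ in torsion_grid(3, 10)))

# Growth of the grid with the number of rotatable bonds.
for n_bonds in range(1, 8):
    print(n_bonds, count_grid_points(n_bonds, 10))
```

With seven rotatable bonds, a number that is not unusual for a drug-like molecule, the same grid already contains more than 78 billion points.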
16.2 Conformations Are the Local Energy Minima of a Molecule

It was shown in ▶ Chap. 15, “Molecular Modeling” that the energy and geometry
of a molecule can be calculated with the help of a force field or a quantum
mechanical method. In this way those combinations of torsion angle values about the
rotatable bonds in a molecule can be found that correspond to energetically
favorable states. The mathematical method that is used to search for such
a minimum geometry can only move downhill on the potential energy surface
(▶ Sect. 15.5). For this, the potential of n-butane should be considered again
(Fig. 16.1). If an angle of 130° is used as a starting value, the minimization ends
with a trans geometry. If the starting angle is 110°, which is only 20° away, the
optimization will lead to a gauche orientation. By doing this, two of the three
possibilities are detected. The third minimum, which mirrors the gauche conformation, is
reached if the starting angle is 350°. In this way, all three conformations are
found for the simplest possible case.
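
The downhill behavior can be reproduced with a simple model potential. The cosine series below is an assumption: its coefficients were chosen so that the curve has energies of about 0, 3.8, 14.6, and 25.5 kJ/mol at 180°, 60°, 120°, and 0°, which are the levels marked in Fig. 16.1. It is not a force-field parameterization of butane, and the gauche minima of this model lie a few degrees away from 60° and 300°. The minimizer is a plain steepest-descent loop with a fixed, illustrative step factor.

```python
import math

# Illustrative cosine-series coefficients in kJ/mol (an assumption, see above).
COEFFS = (10.38, 4.90, 2.37, 7.85)


def energy(phi_deg):
    """Model torsion energy in kJ/mol for a torsion angle in degrees."""
    phi = math.radians(phi_deg)
    a0, a1, a2, a3 = COEFFS
    return a0 + a1 * math.cos(phi) + a2 * math.cos(2 * phi) + a3 * math.cos(3 * phi)


def gradient(phi_deg):
    """Slope dE/dphi of the model potential in kJ/mol per radian."""
    phi = math.radians(phi_deg)
    _, a1, a2, a3 = COEFFS
    return (-a1 * math.sin(phi)
            - 2 * a2 * math.sin(2 * phi)
            - 3 * a3 * math.sin(3 * phi))


def minimize(phi_deg, step=0.5, n_iter=500):
    """Steepest descent: always move against the slope, that is, downhill."""
    for _ in range(n_iter):
        phi_deg -= step * gradient(phi_deg)
    return phi_deg % 360.0


for start in (110.0, 130.0, 350.0):
    end = minimize(start)
    print(start, round(end, 1), round(energy(end), 1))
```

Starting points on either side of the barrier near 120° end in different minima, which is why the choice of starting geometries decides which conformations are found.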
How are complex molecules to be approached? In principle, in exactly the same
way. Because it is not known which torsion angles of the individual single bonds
will give access to potential minima, that is, stable conformations, the minimization
must be started from numerous angles for each of the single bonds. From these
values the minimization always goes “downhill”. The minima on the potential
surface are found in this way. The art is to efficiently define the starting points
from which a given geometry is minimized. This is a very laborious task, particu-
larly with large molecules. It is akin to a hiker in the mountains searching for the
deepest valley.
Adenosine monophosphate 16.1 serves as an example (Fig. 16.2). The analysis
concentrates on the five-membered ribose ring, the bond to nitrogen in the adenine,
and the three bonds of the sugar phosphate side chain. What conformations can this
molecule adopt? Rotations are performed about the open-chain bonds in 10° steps.
In the systematic search for the ribose ring only those orientations are considered
that allow the ring to close. To get a rough overview of the hypothetically obtained
geometries, the distance between the center of the adenine scaffold and the phos-
phorus atom is measured in each generated geometry. This falls between 4.5 and
9.3 Å for the more than 300,000 generated geometries. To estimate the energy content
of a molecule in an arbitrary geometry, its van der Waals energy (▶ Chap. 15,
“Molecular Modeling”) is calculated. Such a calculation is quickly accomplished.
The energies of the 300,000 geometries are between 0 and 64 kJ/mol. The
so-generated structures are not yet in local potential minima. To achieve this, each
starting geometry must be minimized (cf., the potential energy curve of n-butane in
Fig. 16.1). The subsequently obtained conformations are compared to determine
whether the same local minima have been reached by starting from different points.
This is a rather laborious endeavor for 300,000 starting geometries! It is akin to letting
our hiker walk downhill from each level square to find the deepest valley. Hopefully
he is granted great longevity so that he lives long enough to see the results of the
search! Can this search be structured more effectively?

Fig. 16.2 Adenosine monophosphate 16.1 exhibits the conformationally flexible ribose ring and
four open-chain torsion angles, τ1–τ4. Rotations are performed around these
torsion angles during the conformational analysis. To get a rough description of the attained
geometry, the distance between the phosphorus atom in the side chain and the center of the
adenine scaffold is measured.

16.3 How to Scan Conformational Space Efficiently?

Sometimes rolling the dice is better than systematic probing! The hiker could
choose random places in the mountains from which to descend into the next valley.
With a little luck he will find the deepest valley with significantly less effort. Such
Monte Carlo methods are very popular in conformational analysis. For this the
starting angles for the conformation search are chosen purely randomly. Molecular
dynamics serves as another approach. The hiker would have to climb into an
airplane that flies at high speed between the mountains and changes its direction
with each obstruction. After set time intervals, the hiker jumps from the airplane
and hikes to the base of the valley upon landing. The higher the airplane flies,
the fewer mountain peaks are encountered and the faster the mountains can
be crisscrossed. In the course of molecular dynamics a molecular trajectory
(▶ Sect. 15.8) is followed, and geometries are saved at predefined time intervals
to serve as starting points for the energy minimizations of a conformational
analysis. By increasing the temperature (i.e., flying higher) a larger area of
conformational space can be searched in a shorter period of time.
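The Monte Carlo idea (random starting angles followed by minimization) can be sketched as follows. This is a minimal illustration under stated assumptions: the torsion potential and its parameters are hypothetical, and the two torsion angles are treated as independent, so each can be minimized on its own. Real molecules couple their torsions through non-bonded interactions.

```python
import math
import random

def gradient(phi_deg, v1=6.0, v3=14.0):
    """dE/dphi (kJ/mol per radian) of a hypothetical butane-like torsion potential."""
    phi = math.radians(phi_deg)
    return -0.5 * v1 * math.sin(phi) - 1.5 * v3 * math.sin(3.0 * phi)

def minimize(phi_deg, rate=0.01, tol=1e-8, max_iter=10_000):
    """Steepest descent to the nearest local minimum, returned in whole degrees."""
    phi = math.radians(phi_deg)
    for _ in range(max_iter):
        grad = gradient(math.degrees(phi))
        if abs(grad) < tol:
            break
        phi -= rate * grad
    return round(math.degrees(phi))

def monte_carlo_search(n_torsions, n_starts, seed=0):
    """Draw random starting angles, minimize each start, and collect the distinct minima."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n_starts):
        start = [rng.uniform(0.0, 360.0) for _ in range(n_torsions)]
        found.add(tuple(minimize(angle) for angle in start))
    return found

found = monte_carlo_search(n_torsions=2, n_starts=30)
print(len(found), "distinct minima from 30 random starts")
print("a systematic 10-degree grid would need", 36 ** 2, "starts")
```

Whether all minima are found depends on luck and on the number of starts, which is exactly the trade-off described in the text.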

16.4 Is It Necessary to Search the Entire Conformational Space?

Until now molecules have been considered in an isolated state. How does their
flexibility change when they are brought into an environment like the binding
pocket of a protein? In principle nothing changes in their conformational flexibility.
It could be that minima are found at different positions that have different relative
energies because of electrostatic and steric interactions in the binding pocket. This
raises the question of whether the full range of every torsion angle must be searched for
a ligand that is in a binding pocket. If energy minima occur preferentially at
particular torsion angles, it is reasonable to limit the search to these angles. The
hiker could, for example, get the impression that the villages are predominantly
found in valleys and hardly ever on peaks or slopes. Because of this, all of the
villages would be worthwhile as starting points for his minimum search.
Ligands in the binding pocket of a protein are under the influence of directional
interactions from the amino acids that are located there. Similar conditions are
found for molecules in a crystal lattice. There, the environment is built of identical
copies of neighboring molecules (▶ Chap. 13, “Experimental Methods of Structure
Determination”). These undergo directional interactions with the molecule, analogously
to the amino acids in the binding pocket. Interestingly, the molecular
packing density in the interior of a protein is similar to that of organic molecules in
a crystal lattice. As was already mentioned in ▶ Sect. 13.9, the crystal structures
of numerous organic molecules are known and stored in a database. Experience has
unfortunately shown that the conformation of a flexible molecule in a crystal
structure is often not identical, or even similar, to the geometry of the molecule in
the binding pocket of a protein. The same is true for conformations that have been
found in solution.
The receptor-bound conformation of a molecule cannot be unambiguously
derived from its small-molecule crystal structure or from that in solution. Nonetheless,
much can be learned from crystal structures. For this purpose, it is not the entire
molecule that should be considered, but rather individual torsion angles. The potential
energy for the central torsion angle of n-butane is shown in Fig. 16.1. If the
angles for multiple C-CH2-CH2-C fragments are extracted from a database
of small-molecule crystal structures, they gather overwhelmingly in areas where the
potential energy curve shows local minima. Adenosine monophosphate 16.1 has
four open-chain torsion angles t1–t4 (Fig. 16.2). The bond between the ribose
ring and the adenine scaffold forms the torsion angle t4. A further fragment is the
phosphate group with the oxygen and the attached carbon in the chain (t3).
This fragment occurs in the database in a large variety of different structures.
A representative picture can be expected because this fragment occurs in very
many different environments when enough crystal structures are considered. The
results of such searches for the four torsion angles t1–t4 are shown in Fig. 16.4 as
frequency distributions, so-called histograms. Experience has shown that clearly
preferred values occur for many torsion angles. That is the case here for t1, t2, and
t3. The question can be raised as to why this statistical evaluation is not better
performed on ligands that are taking part in crystallographically studied protein–
ligand complexes. Unfortunately the diversity of these data is still limited, and the
data are usually not accurate enough for the desired evaluation. Nevertheless,
comparative studies have shown that the same torsion angles are preferentially
found in protein–ligand complexes and small-molecule crystal structures
(Fig. 16.3).
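How such a torsion-angle histogram is built can be sketched in a few lines. The sketch below computes a torsion angle from four atom positions and bins a list of angles into relative frequencies. The helper names, the bin width of 30°, and the sample angles are assumptions for illustration; the sample values are invented and are not taken from any crystal-structure database.

```python
import math
from collections import Counter

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, mapped to the range 0 to 360."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0

def histogram(angles, width=30):
    """Relative frequency in percent; bins are centered on multiples of `width`."""
    n_bins = 360 // width
    counts = Counter(int((a + width / 2) // width) % n_bins * width for a in angles)
    return {center: 100.0 * c / len(angles) for center, c in sorted(counts.items())}

# Trans arrangement of four atoms: the torsion angle is 180 degrees.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))

# Invented sample of torsion angles for one fragment type.
sample = [58, 62, 65, 175, 178, 180, 182, 185, 179, 301]
hist = histogram(sample)
print(hist)
```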
The experience that torsion angles prefer particular values can be used for the
conformational search. The angle t4 between the ribose ring and the adenine
scaffold shows a broad distribution over many possible values (Fig. 16.4).

Fig. 16.3 A value distribution for the torsion angles with clusters at 60°, 180°, and 300° is derived
from a database of small-molecule crystal structures for the C-CH2-CH2-C fragment. Most
values are found at 180°. Torsion angles between 0° and 360° are entered as the relative frequency
in percent. The maxima of the distribution are at the points where the potential curve of n-butane
(Fig. 16.1) shows its energy minima.

Unfortunately, the search cannot be narrowed here. The situation is better for the other
angles t1–t3. There, only specific values occur. If the systematic search is limited to
these areas, and a search in 10° steps is carried out around the average value, it would
only be necessary to generate 6,340 geometries. With 5.9–9.3 Å, almost the same range of
distances between phosphorus and adenine is covered as in the unrestricted search.
If a van der Waals energy calculation is carried out on these geometries, values
between 0 and 16.3 kJ/mol are obtained. In contrast to the results from Sect. 16.2, all
the geometries that correspond to the energetically unfavorable areas are discarded.
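The effect of restricting the search ranges on the number of starting geometries can be illustrated with a small counting sketch. The windows below are hypothetical and only loosely patterned on histograms with clear maxima; the ribose ring and any clash filtering are ignored. The counts therefore do not reproduce the figures quoted in the text.

```python
from itertools import islice, product
from math import prod

STEP = 10  # grid spacing in degrees

def grid_values(windows, step=STEP):
    """Grid angles (multiples of `step` below 360) lying inside any (low, high) window."""
    return [a for a in range(0, 360, step) if any(lo <= a <= hi for lo, hi in windows)]

FULL = [(0, 359)]

# Hypothetical preferred ranges for t1-t3; t4 stays unrestricted.
restricted = {
    "t1": [(50, 70), (170, 190), (290, 310)],
    "t2": [(170, 190)],
    "t3": [(50, 70), (170, 190), (290, 310)],
    "t4": FULL,
}
unrestricted = {name: FULL for name in restricted}

def count_geometries(ranges):
    """Number of torsion-angle combinations on the grid."""
    return prod(len(grid_values(w)) for w in ranges.values())

n_full = count_geometries(unrestricted)
n_restricted = count_geometries(restricted)
print(n_full, n_restricted)

# The geometries themselves can be generated lazily, one tuple (t1, t2, t3, t4) at a time.
first_three = list(islice(product(*(grid_values(w) for w in restricted.values())), 3))
print(first_three)
```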
How can it be confirmed that this restricted search also covers that part of the
conformational space that includes the receptor-bound conformations? Adenosine
monophosphate 16.1 often occurs as a substructure of cofactors in protein com-
plexes so that there is enough information about receptor-bound conformations for
this particular example. They come from crystal structures of proteins with these
bound cofactors. The distance range of 5.9–9.2 Å between the adenine scaffold and
the phosphorus in the receptor-bound structures covers the same range that was
detected in the enhanced systematic search. It can therefore be assumed that enough
geometries were generated that satisfactorily populate the local minima of the
bound state of adenosine monophosphate. Reflecting back to the initial butane
example (Fig. 16.1), this means that the starting points were well distributed so
that all minima were reached.

16.5 The Difficulty in Finding Local Minima Corresponding to the Receptor-Bound State

As already described, the local minima in a systematic conformational search are
obtained by subjecting all of the generated geometries to a force-field optimization.
There can be problems with this approach. To explain this, a different molecule,
citric acid 16.2, can be considered in the binding pocket of citrate synthase. Seven

Fig. 16.4 The frequency distribution of the torsion angles of the open-chain bonds of adenosine
monophosphate as found in the crystal structures of small organic molecules. The torsion-angle
histograms are constructed for fragments that are representative for corresponding portions of the
test molecule. There are clearly preferred values for the angles t1–t3, but a broad distribution of all
possible angles is found for t4. This knowledge is used in the conformational analyses and limits
the search for t1–t3 to the preferred value ranges.

hydrogen bonds are formed by its three carboxylate groups and the hydroxyl group
to three histidine and two arginine residues of the protein (Fig. 16.5). If the free
citrate molecule, not bound to the protein, is considered and its geometry is minimized in
an isolated state, it takes on a conformation with internally saturated hydrogen
bonds (▶ Sect. 15.5). Of course, the minimization can be started from a different geometry, but in
all cases, conformations with intramolecular hydrogen bonds will result upon
minimization. Such hydrogen bonds rarely occur in the protein-bound state.
Therefore the conformation that was obtained after minimization in the isolated
state has no relevance for the conditions in the protein.
As a general rule, ligands rarely bind to proteins in a conformation exhibiting
intramolecular hydrogen bonds. The H-bond-forming groups are generally
involved in interactions with the protein.
To circumvent the problem of intramolecular H-bond formation, the minimization
of the generated starting structures can be omitted, and all geometries from the
systematic search can be used for further comparison (▶ Chap. 17, “Pharmacophore
Hypotheses and Molecular Comparisons”). Then, however, very many geometries
must be examined. This would severely limit the scope of such comparisons for

Fig. 16.5 Interactions between citric acid 16.2 and the enzyme citrate synthase. The molecule is
bound by seven hydrogen bonds to three histidine and two arginine residues (His 238, His 274,
His 320, Arg 329, and Arg 401).

computational reasons. Furthermore, the geometries generated in this way would likely be
rather distorted. Alternatively, the force-field terms responsible for the
formation of intramolecular H-bonds could be neglected. But how reliable would such a reduced
force field be? An attempt can be made to summarize the geometries that were
generated in a systematic search so that groups with similar conformations are
described by one representative member.
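Summarizing similar geometries by one representative can be sketched with a simple greedy ("leader") clustering in torsion-angle space. This is only one possible scheme, chosen here for brevity. The distance measure (largest circular difference over all torsion angles), the threshold of 30°, and the sample conformers are assumptions for illustration.

```python
def angular_difference(a, b):
    """Smallest difference between two angles in degrees, respecting the 0/360 wrap-around."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def torsion_distance(conf_a, conf_b):
    """Largest torsion-angle deviation between two conformers."""
    return max(angular_difference(a, b) for a, b in zip(conf_a, conf_b))

def pick_representatives(conformers, threshold=30.0):
    """Keep a conformer only if it differs from every representative chosen so far."""
    representatives = []
    for conf in conformers:
        if all(torsion_distance(conf, rep) > threshold for rep in representatives):
            representatives.append(conf)
    return representatives

# Invented conformers, each described by two torsion angles in degrees.
conformers = [(60, 180), (65, 175), (300, 180), (355, 180), (5, 185), (62, 178)]
representatives = pick_representatives(conformers)
print(representatives)
```

The result depends on the order of the input and on the threshold, so the representatives are a convenient summary rather than a unique answer.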

16.6 An Effective Search for Relevant Conformations by Using a Knowledge-Based Approach

A knowledge-based approach first analyzes the experimentally determined conformations
and generates for new molecules only those conformations that are consistent
with the experimental knowledge base. In this way, many geometries are
never generated from the very beginning. The example of adenosine
monophosphate 16.1 is once again invoked. The approach recognizes a flexible
five-membered ring and four open-chain rotatable bonds. Energetically favorable
conformations of the ring are chosen from a database. This database contains many
different ring systems as they are found in, for example, crystal structures of
organic molecules. In the case at hand, the approach suggests the five energetically
most favorable ring conformations, of which two are in fact found in the
protein-bound cofactors. For the open-chain part of the molecule the method is guided by
the above-mentioned frequency distributions of the dihedral angles (Fig. 16.4). The
starting geometries are only generated in areas in which these distributions show
significant frequencies. The distribution is still rather crude. In a final step, the
generated geometries are optimized by readjusting the torsion angles. Clashes
between non-covalently bound atoms are avoided. At the same time the adjusted
dihedral angles are kept as close as possible to the preferred values. This approach
gets by with relatively few conformations. They are rather evenly distributed in the
part of the conformational space that is relevant for receptor-bound conformations
(Fig. 16.6).

Fig. 16.6 Eighty-one conformers (upper part) from experimentally determined protein–ligand
complexes are superimposed upon one another to illustrate the areas in space that adenosine
monophosphate 16.1 can adopt in a protein-bound state. The ribose ring is located in the center,
for which two ring conformations occur. The possible orientations of the adenine ring are shown
on the top, and the conformations of the flexible phosphate chain are on the bottom. Similar
coverage of the conformational space is achieved with a manageable number of 14 conformations
(lower part), which were generated by a knowledge-based approach.

16.7 What Is the Outcome of a Conformational Search?

Many drug-like molecules are flexible. They can adopt markedly different confor-
mations depending on the surrounding environment. Usually the receptor-bound
geometry is not in the energetically most favorable conformation found for the
isolated state, but will fall in an energetically favorable area. For the conformational
analysis, this means that it is not necessarily the deepest minimum that is sought.
Rather, it should be the “relevant” minimum that corresponds to the bound state.
There is only a chance of finding it when the criteria for the search are known.
Finding the energetically most favorable conformation and finding the one that best
“fits” the binding site are equally difficult. An important
tool in the search for novel lead structures is the docking of candidate molecules
into the binding pocket of a given protein. Programs that are able to use this
approach must be able to handle the conformation problem. Meanwhile, a large
variety of methods have been developed that allow efficient docking searches on
computer clusters, particularly for molecules of drug-like size.

16.8 Synopsis

• Drug-like molecules exhibit multiple rotatable bonds. Rotations around these
bonds drive the molecules into different conformations that correspond to local
minima on the energy surface of the molecule.
• The receptor-bound conformation of a drug-like molecule is the starting
point for any drug-design considerations. Therefore, many methods have been
developed to perform conformational analyses. Systematic searches by incremental
rotations about each single-bond torsion angle will produce a huge
number of geometries that need to be optimized to the local minima on the
energy surface.
• The conformation of a drug-like molecule frequently changes with the environ-
ment. Usually the conformation in the protein-bound state differs from that in
solution, in the gas phase, or in the small-molecule structure.
• Considering torsional fragments in small molecules and analyzing them across
databases of crystal structures by statistical means reveals clear-cut torsional
preferences for many examples. Such knowledge can be exploited to perform
a conformational search more efficiently. Not all values around a rotatable bond
have to be tested, and the search can be limited to the ranges that are known to be
preferred.
• A further obstacle in the conformational search of the protein-bound conforma-
tion of a drug-like molecule is that the molecule will interact with its environ-
ment. This environment, the protein’s binding pocket, is often polar and will
involve the bound ligand in multiple hydrogen bonds.
• Using a knowledge base on torsional preferences of small organic molecules
can significantly enhance the conformational search, particularly during
docking, in molecular comparisons, or in database searches based on predefined
pharmacophores.

Bibliography

General Literature
Leach A (2001) Molecular modelling: principles and applications, 2nd edn. Prentice Hall,
Englewood Cliffs

Special Literature
Böhm HJ, Klebe G (1996) What can we learn from molecular recognition in protein–ligand
complexes for the design of new drugs? Angew Chem Intl Ed Eng 35:2588–2614
Klebe G, Mietzner T (1994) A fast and efficient method to generate biologically relevant
conformations. J Comput Aided Mol Design 8:583–606
Klebe G (1994) Structure correlation and ligand/receptor interactions. In: Bürgi HB, Dunitz JD
(eds) Structure correlation. VCH, Weinheim, pp 543–603

Klebe G (1995) Toward a more efficient handling of conformational flexibility in
computer-assisted modelling of drug molecules. Persp Drug Des Discov 3:85–105
Marshall GR, Naylor CB (1990) Use of molecular graphics for structural analysis of small
molecules. In: Hansch C, Sammes PG, Taylor JB (eds) Comprehensive medicinal chemistry,
4. Pergamon, Oxford, pp 431–458
Stegemann B, Klebe G (2011) Cofactor-binding sites in proteins of deviating sequence: Compar-
ative analysis and clustering in torsion angle, cavity, and fold space. Proteins 80:626–648
Part IV
Structure–Activity Relationships and
Design Approaches

Today drug design is supported by numerous computational approaches that, like
the pieces of a puzzle, all provide contributions from the development of a first
design hypothesis to a clinical candidate. (Announcement poster from the research
group of the author on the occasion of a conference in 2005 in Rauischholzhausen,
Marburg, Germany.)
17 Pharmacophore Hypotheses and Molecular Comparisons

Emil Fischer’s lock-and-key principle (▶ Sect. 4.1) demonstrates the specific
interaction of an active compound with its receptor. With a key, it is the grooves
on the blade that interact with the wards in the keyway to open the lock. With active
substances it is a particular part of the molecule that undergoes an interaction with
the amino acids in the binding pocket. Similar molecules are frequently compared
in drug design to derive ideas for new structures. In this chapter, the criteria that
make such comparisons possible are compiled. Furthermore, these criteria can be
used to search in databases for alternative molecules that can bind in the same way
with the protein.

17.1 The Pharmacophore Anchors a Drug Molecule in the Binding Pocket

The structure of the binding pocket determines which functional groups are neces-
sary for the ligand to bind. The spatial orientation of these functional groups in
ligands is referred to as the pharmacophore (▶ Sect. 8.7, Fig. 8.9). Because of its
importance for drug design and model hypotheses in medicinal chemistry, an
official IUPAC definition has been established by Camille G. Wermuth
(Table 17.1). The set of interacting groups that a ligand must possess to be able to
successfully interact with a protein defines the pharmacophore in space and is
independent of the particular molecular scaffold to which the groups are attached. The
hydrogen-bond-forming groups or hydrophobic parts are considered for this.
A more detailed examination differentiates between positively and negatively
charged groups in a molecule. When derived from a set of similarly binding ligands,
this generalized description is referred to as the ligand-based pharmacophore. On
the other hand, the protein structure can be the starting point. For this, an analysis is
made as to which amino acid functional groups are in the binding pocket. They
define the properties with which a ligand can bind to them. In this sense, the protein
structure determines how the pharmacophore of a ligand must be shaped to be
able to successfully bind to the protein. This description is referred to as the

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_17, 349


# Springer-Verlag Berlin Heidelberg 2013

Table 17.1 Official IUPAC definition of a pharmacophore by Wermuth CG et al (1998) Pure
Appl Chem 70:1129–1143
A pharmacophore is the ensemble of steric and electronic features that is necessary to ensure the
optimal supramolecular interactions with a specific biological target structure and to trigger (or
to block) its biological response.
A pharmacophore does not represent a real molecule or a real association of functional groups,
but a purely abstract concept that accounts for the common molecular interaction capacities of
a group of compounds toward their target structure.
A pharmacophore can be considered as the largest common denominator shared by a set of active
molecules. This definition discards a misuse often found in the medicinal chemistry literature,
which consists of naming as pharmacophores simple chemical functionalities such as guanidines,
sulfonamides, or dihydroimidazoles (formerly imidazolines), or typical structural skeletons such
as flavones, phenothiazines, prostaglandins, or steroids.
A pharmacophore is defined by pharmacophoric descriptors, including H-bonding, hydrophobic,
and electrostatic interaction sites, defined by atoms, ring centers, and virtual points.

protein-based pharmacophore. In contrast to the lock-and-key picture, ligands
and proteins are flexible. In ligands the functional groups of the pharmacophore
must be oriented in the direction of the corresponding counter groups in the protein.
Therefore, detailed knowledge about the conformational properties of the ligand is
essential. Only then can it be predicted whether a ligand can potentially adopt
a geometry that satisfies the interactions with the protein. On the receptor side, the
geometry of the binding pocket can adapt to the shape of the ligand, similar to how
a glove fits on the hand of its wearer (induced fit, ▶ Sect. 4.1). Binding pockets are
indeed found in the interior or in buried grooves on the surface of proteins, and it is
there that the small but decisive conformational changes of the protein occur. An
example for the adaptability of a protein is presented in ▶ Sect. 15.8. An attempt is
made to describe these induced-fit adaptations by using molecular dynamics
simulations.

17.2 Structural Superposition of Drug Molecules

For the moment, we want to limit ourselves to the case of an unknown
receptor structure. All of the effects that ligand binding induces in the protein are
therefore neglected. An example should clarify this. The fruit of the shrub Anamirta
cocculus, the fishberry, contains the terpene picrotoxinin 17.1, which causes con-
vulsions. This compound affects chloride channels (▶ Sect. 30.5). Because of its
central stimulatory effect, it has been used in the past as an antidote to sleeping pill
overdoses. Due to its high toxicity, it has no therapeutic importance today. The
structure of picrotoxinin was determined by crystallography (Fig. 17.1).
Synthetic modifications of the cyclic core structure have led to active and inactive
derivatives (Fig. 17.2). The spatial structure of the individual derivatives can be
constructed on a computer from the crystal structure of the parent compound and
superimposed upon one another to recognize structural differences. The parts of the
molecule that are seen as equivalent in a ligand-based pharmacophore model are

Fig. 17.1 Picrotoxinin 17.1 is responsible for the centrally stimulating effect of the extracts of
fishberries. Its structure and spatial architecture were proven by X-ray structure analysis.

placed upon one another for this superimposition. The superposition of all active and
inactive derivatives, along with the combined volumes of both classes, is shown in
Fig. 17.3. The difference between both volumes is computed. It describes those areas
in space that are only occupied by the inactive molecules.

17.3 Logical Operations with Molecular Volumes

What information can be extracted from such comparative volumes? It is assumed
as a working hypothesis that a molecule can only be bound when its size does not
exceed the maximum available space. How large is the maximum available space?
To get an idea of this, the combined volume of all active derivatives is considered
and compared with the combined volume of all inactive derivatives. A possible explanation
for the lack of activity of a molecule could then be that the area in the binding
pocket that the molecule would likely occupy is already taken by the protein.
Volume comparisons between active and inactive derivatives deliver information
about the possible shape of the receptor pocket. Such comparisons can be very
helpful in drug design. If the “forbidden” volume area for a compound class is
found, it can be checked before synthesis whether a compound really leaves the
“forbidden” area unoccupied.
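The logical operations on molecular volumes can be sketched with sets of grid cells. In the sketch below each molecule is a list of atom spheres, a volume is the set of grid cells inside any sphere, and the "forbidden" region is the set difference between the combined inactive and the combined active volume. The grid spacing, the uniform atom radius, and the coordinates are invented for illustration and assume that the molecules have already been superimposed.

```python
import math

def occupied_voxels(atoms, spacing=0.5):
    """Set of grid indices (i, j, k) whose centers lie inside any atom sphere.
    `atoms` is a list of (x, y, z, radius) tuples in angstroms."""
    voxels = set()
    for x, y, z, r in atoms:
        lo = [math.floor((c - r) / spacing) for c in (x, y, z)]
        hi = [math.ceil((c + r) / spacing) for c in (x, y, z)]
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    d2 = ((i * spacing - x) ** 2 + (j * spacing - y) ** 2
                          + (k * spacing - z) ** 2)
                    if d2 <= r * r:
                        voxels.add((i, j, k))
    return voxels

def union_volume(molecules, spacing=0.5):
    """Combined volume of several superimposed molecules."""
    result = set()
    for atoms in molecules:
        result |= occupied_voxels(atoms, spacing)
    return result

# Invented, already superimposed molecules (x, y, z, radius).
actives = [[(0, 0, 0, 1.7), (1.5, 0, 0, 1.7)],
           [(0, 0, 0, 1.7), (0, 1.5, 0, 1.7)]]
inactives = [[(0, 0, 0, 1.7), (1.5, 0, 0, 1.7), (3.0, 0, 0, 1.7)]]

active_volume = union_volume(actives)
inactive_volume = union_volume(inactives)

# Space occupied only by inactive molecules: candidate region claimed by the receptor.
forbidden = inactive_volume - active_volume
print(len(active_volume), len(inactive_volume), len(forbidden))
```

A proposed compound could then be checked before synthesis by testing whether its own voxel set is disjoint from `forbidden`.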

Fig. 17.2 By starting with picrotoxinin 17.1, active and inactive derivatives were synthesized.

Because of the rigidity of the molecule, it is simple to superimpose picrotoxinin
analogues on one another. For flexible molecules, however, it cannot be
expected that the transition from a 2D molecular representation to a 3D structure
(▶ Chaps. 15, “Molecular Modeling” and ▶ 16, “Conformational Analysis”)

Fig. 17.3 Superposition of the spatial structure of active (yellow) and inactive (blue) derivatives
of picrotoxinin. The united volumes around the active derivatives are shown by the red mesh. The
total volume around all inactive derivatives is shown in blue. A difference is formed between the
two volumes. The remaining volume (green) shows areas that are only occupied by inactive
derivatives. An explanation for the lack of activity of these derivatives can be that they try to
occupy volume areas that are already occupied by the receptor protein. This spatial clash does not
occur with the active derivatives.

delivers molecules in conformations in which all of the functional groups of the
pharmacophore are already analogously placed in space. Therefore, there are two
problems to solve:
• The groups that correspond to one another in different molecules and define the
pharmacophore must be determined.
• Techniques are needed that bring the molecules into conformations in which the
equivalent groups of the pharmacophores are analogously oriented in space.

17.4 The Pharmacophore Is Modified by Conformational Transitions

To resolve the first problem, the role of the functional groups of the active
substance that form the contact with the receptor must be considered. They must
form hydrogen-bonding and hydrophobic interactions with the protein. In this
context, similarity of the functional groups means that they can form analogous
interactions with the protein. To define a pharmacophore in space, at least three
interacting groups are needed. This is immediately clear if one considers how many
fingers on a hand are needed to hold an irregularly shaped object (e.g., a potato) in
space. With only two fingers, the object can still rotate about an axis. In contrast, if
three anchor points are taken, its position is fixed in space. Practical experience
with a compound class is often helpful when assigning pharmacophoric groups.
For example, inhibitors of the angiotensin-converting enzyme (Fig. 17.4 and
▶ Sect. 25.5) need a terminal carboxylate group, a carbonyl group, and a group
that coordinates to the catalytic zinc ion.

Fig. 17.4 Inhibitors of the angiotensin-converting enzyme. A pharmacophore that consists of
a terminal carboxylate group, a carbonyl group, and a group that coordinates to the catalytic zinc is
necessary for binding to the enzyme. The latter function is assumed by a thiol, a phosphoric,
phosphonic, or carboxylic acid. The individual derivatives possess conformational flexibility in
different areas.

Fig. 17.5 “Virtual” springs are coupled to the atoms that are marked with numbers around the
steroid 17.2 and the three derivatives 17.3–17.5. The structural superposition (bottom) that is
shown is determined by the force of these springs and the simultaneous consideration of molecular
force-fields.

How can it be determined whether a common orientation for the assumed
equivalent groups in different molecules exists? In a computational method
these groups are assigned “virtual” springs that are coupled to one another. The
spatial overlap is reinforced by pulling these springs together. To avoid arriving at
an entirely distorted molecular geometry, a force-field is simultaneously taken into
consideration for each molecule (▶ Chap. 15, “Molecular Modeling”). The steroid
17.2 and three different inhibitors 17.3–17.5 (Fig. 17.5) are considered as an
example. They are ligands of an enzyme in the ergosterol biosynthesis. Spring
forces are applied between the marked atoms with the same numbers. The minimi-
zation of these forces along with the individual force-fields of the four molecules
leads to the superposition that is shown in Fig. 17.5.
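The idea of "virtual" springs can be illustrated with a deliberately reduced sketch. It couples matched atoms of two molecules by harmonic springs and moves one molecule downhill on the spring energy. As a strong simplification, only a rigid translation is optimized here; rotation, conformational changes, and the molecular force fields that the method in the text considers simultaneously are left out. The spring constant, step size, and coordinates are assumptions.

```python
def spring_energy(atoms_a, atoms_b, k=1.0):
    """Harmonic spring energy 0.5 * k * d^2 summed over all matched atom pairs."""
    return 0.5 * k * sum((a - b) ** 2
                         for pa, pb in zip(atoms_a, atoms_b)
                         for a, b in zip(pa, pb))

def pull_together(atoms_a, atoms_b, k=1.0, rate=0.5, n_steps=200):
    """Translate molecule B rigidly downhill on the spring energy; returns the shifted copy."""
    shifted = [list(p) for p in atoms_b]
    n = len(shifted)
    for _ in range(n_steps):
        # Mean spring force on B along x, y, z: k * (a_i - b_i) averaged over the pairs.
        force = [k * sum(pa[d] - pb[d] for pa, pb in zip(atoms_a, shifted)) / n
                 for d in range(3)]
        for p in shifted:
            for d in range(3):
                p[d] += rate * force[d]
    return [tuple(p) for p in shifted]

# Invented coordinates: molecule B is molecule A displaced by (2, -1, 0.5).
mol_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
mol_b = [(2.0, -1.0, 0.5), (3.0, -1.0, 0.5), (2.0, 0.0, 0.5)]

energy_before = spring_energy(mol_a, mol_b)
mol_b_fitted = pull_together(mol_a, mol_b)
energy_after = spring_energy(mol_a, mol_b_fitted)
print(energy_before, energy_after)
```

Because this toy energy surface has a single minimum, the result does not depend on the starting position. The starting-point dependence discussed in the text arises only once the molecular force fields, with their many minima, are added.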

Unfortunately the resulting solution depends on the starting conditions. If the
molecules are differently oriented in space at the beginning of the calculation, or if
they start from different conformations, different superpositions can result. At first
glance, this argument appears perhaps somewhat implausible. It should be kept in
mind that molecules are not only considered under the influence of “virtual” spring
forces but also under their own force fields. The multiple-minima problem of molecular
force-field calculations was already mentioned in ▶ Chap. 16, “Conformational
Analysis”. It plays an important role here too. The hiker from the last chapter should
help to explain this problem. He stands on a mountain peak and wants to descend into
the deepest valley possible. At the same time, he feels an “additional force” as he has
severe thirst. He wants to meet his friends in a pub. The friends are coming from
different peaks in the mountains. He sees a pub in all valleys. But which is his choice?
For a common meeting point he would also accept a less deep valley. In the beginning
of his hike he looks for the steepest descent to come down quickly. After a while, the
other valleys fall from view. If he arrives at a different pub in the end, he no longer has
the energy to look for another one. If he had started from a different mountain
top, he might have found a comparably deep valley, but would have found the pub of his
choice and met his friends at the same time. The problem with the choice of starting
conditions for molecular comparisons with “virtual” spring forces is similar. How
should it be checked whether the best possible solution was found? Here only an
experiment can help. For this it is necessary to synthesize molecules that are
conformationally rigid in particular parts because of the incorporation of rings. They
confer a fixed spatial arrangement to the pharmacophore. If they also possess activity,
their rigidified geometry indicates the correct pharmacophore (see Sect. 17.9).
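
The dependence on the starting point can be illustrated with a deliberately small toy model. The sketch below is not the procedure used for Fig. 17.5: it assumes a single torsion angle, a hypothetical threefold torsion term standing in for the molecule's own force field, and a hypothetical periodic "spring" term that pulls the angle toward a target value. All function names and parameter values are illustrative assumptions.

```python
import math

def total_energy(phi, phi_target=math.pi, k_spring=0.5):
    """Toy energy: threefold torsion "force field" plus a periodic "spring" term."""
    force_field = 1.0 + math.cos(3.0 * phi)                 # minima near 60, 180, 300 degrees
    spring = k_spring * (1.0 - math.cos(phi - phi_target))  # pulls phi toward phi_target
    return force_field + spring

def gradient(phi, phi_target=math.pi, k_spring=0.5):
    """Analytical derivative of total_energy with respect to phi."""
    return -3.0 * math.sin(3.0 * phi) + k_spring * math.sin(phi - phi_target)

def minimize(phi_start, step=0.01, n_steps=5000):
    """Plain steepest descent: it only ever walks downhill from its starting point."""
    phi = phi_start
    for _ in range(n_steps):
        phi -= step * gradient(phi)
    return phi

if __name__ == "__main__":
    for start_deg in (50.0, 170.0, 290.0):
        end = minimize(math.radians(start_deg))
        print(f"start {start_deg:5.1f} deg -> end {math.degrees(end):6.1f} deg, "
              f"E = {total_energy(end):.3f}")
```

Each start ends in the nearest valley of the combined energy surface. Only the start near 180 degrees reaches the lowest one, which is the hiker's dilemma in numerical form.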

17.5 Systematic Conformational Search and Pharmacophore Hypothesis: The “Active-Analogue Approach”

In the last chapter, conformational analysis was the central topic. Could the
techniques described there, for example, the systematic rotation around particular
bonds, be used in the search for the pharmacophore? Garland Marshall developed
such a technique, called the active-analogue approach, at the end of the 1970s.
First a pharmacophore must be assigned to all molecules in a data set. Then the
equivalence of groups must be defined, that is, which groups correspond to
which other groups. Then a systematic conformational search is carried out for the
first compound in the data set. For each geometry generated during the search, the
distances between the functional groups of the pharmacophore are determined and
saved. Because molecules cannot adopt any arbitrary geometry, the distances
fall within particular intervals. An analogous approach is taken for the second
molecule in the set. In principle, only the distance ranges of the first molecule must
be searched. It could be that all of the distances found with the second molecule
were already found with the first. It could also be that particular ranges are
excluded, and the “allowed” distance ranges are therefore narrowed. All of the
molecules in the data set are analyzed in this way.

If the conformational flexibility of the molecule is limited in one part of the
scaffold, there is a chance that the functional groups of the pharmacophore remain
in only one or a few different spatial patterns. The possible binding geometries of
the pharmacophoric groups of the ligand are derived from this. Afterward,
a geometry optimization can be carried out, for which the “virtual-springs”
approach is now ideally suited because the systematic search has already
approximated the final solution rather closely. It is easy to imagine that the order
in which the molecules are investigated is decisive for the efficiency of the
technique. Ideally, the most rigid molecule from the data set is the first to be
studied. With a little luck, this excludes a large part of the possible conformational
space. The resulting list of possible distances will remain small. By consistently
using such limitations, in 1987 Garland Marshall and his research group were able to
propose a model for the receptor-bound conformation of the ACE inhibitors shown in
Fig. 17.4. What could be more rewarding than to see, years later, one’s own model
validated and proven correct within an astonishingly small error margin! The
validation became possible once the crystal structures of the enzyme with bound
inhibitors from this data set had been solved (see ▶ Sect. 25.5).
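
The bookkeeping of distance ranges can be sketched in a few lines. The following is a minimal illustration under strong simplifying assumptions, not Marshall's implementation: each "molecule" is a hypothetical four-atom chain A-B-C-D with one rotatable bond, only the single distance between the pharmacophoric groups A and D is tracked, and distances are collected in bins of 0.1 Å. The geometry values and all names are invented for the example.

```python
import math

def end_to_end_distance(a, b, c, alpha_deg, beta_deg, phi_deg):
    """Distance A...D in a chain A-B-C-D for a torsion angle phi about the B-C bond."""
    alpha, beta, phi = (math.radians(v) for v in (alpha_deg, beta_deg, phi_deg))
    # B is placed at the origin and C on the x axis; phi = 0 is the cis arrangement.
    atom_a = (a * math.cos(alpha), a * math.sin(alpha), 0.0)
    atom_d = (b - c * math.cos(beta),
              c * math.sin(beta) * math.cos(phi),
              c * math.sin(beta) * math.sin(phi))
    return math.dist(atom_a, atom_d)

def distance_bins(geometry, torsions_deg, bin_width=0.1):
    """Set of occupied distance bins (in units of bin_width) over a torsion scan."""
    return {round(end_to_end_distance(*geometry, phi) / bin_width)
            for phi in torsions_deg}

def common_bins(bin_sets):
    """Distance bins that every molecule in the data set can reach."""
    common = set(bin_sets[0])
    for bins in bin_sets[1:]:
        common &= bins
    return common

# Hypothetical geometry: bond lengths a, b, c in angstroms and two bond angles in degrees.
GEOMETRY = (1.5, 1.5, 1.5, 110.0, 110.0)
flexible = distance_bins(GEOMETRY, range(0, 360, 10))  # freely rotating chain
rigid = distance_bins(GEOMETRY, (50, 60, 70))          # torsion restricted, e.g. by a ring
allowed = common_bins([rigid, flexible])               # most rigid molecule first
```

Starting with the rigid molecule leaves only a handful of allowed bins, so every subsequent molecule needs to be searched only within that narrow range.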

17.6 Molecular Recognition Properties and the Similarity of Molecules

It is fair to ask whether the representations of molecular properties used in the
previous sections were really appropriate for the attempted comparisons. Deciding
which functional groups belong to the individual “teeth” of a pharmacophore is not
easy. Analogous functional groups must be oriented in a similar spatial direction in
all molecules. In the case of the ACE inhibitors (Fig. 17.4), a conflict already
arises during the assignment of the functional groups. Some analogues carry two
carboxylate groups, which must be unambiguously assigned to the pharmacophore prior
to comparison with other inhibitors.
The binding of low molecular weight ligands to a protein is a mutual, targeted
recognition process. Both partners must fit together so that a strong interaction can
be formed. Parts of the ligand that have complementary recognition properties
determine the binding to the receptor. The term “recognition properties” refers to all
qualities that contribute to the specific interaction between molecules. Until now,
only properties and similarities have been considered that could be directly read
from the molecular scaffold. But is that sufficient? How would the world look if we
recognized ourselves only by our “scaffolds,” that is, only by the skeletons? Male
and female could not even be differentiated straightaway on these grounds! All of
the allure of interpersonal relationships that rests on personal appearance and
charisma would be lost. Until now, molecules have been considered on the grounds
of their “skeleton”. Why should ligand–receptor interactions be described at this
level? Molecules, too, recognize one another by the properties of their shapes and
of the surfaces they expose to their immediate vicinity to form contacts. The following
example should clarify this point. Methotrexate 17.6 (MTX) and dihydrofolate
17.7 (DHF) bind to the enzyme dihydrofolate reductase (Fig. 17.6 and ▶ Sect.
27.2). The side chains of both molecules are nearly identical, but the heterocycles
are different. It is known from NMR spectroscopic investigations that the proton-
ated form of MTX binds to the protein. When considering the chemical formulae, it
is tempting to overlay the two heterocycles directly upon one another. Good
scaffold equivalence is achieved, and the heteroatoms in both molecules fall on
top of one another. The receptor, however, does not care about the apparent
equivalence of molecular skeletons. The interaction with the molecular surface is
much more important. Polar molecules such as MTX or DHF are bound to the
protein through hydrogen bonds. The arrows in Fig. 17.6 characterize the H-bond
donor and acceptor groups. The arrows point toward the molecule when an
acceptor property is exposed, and away from it in the case of donor groups. At the start, the
molecules are oriented in space so that they correspond in terms of a direct atom–
atom matching. For the moment, the basic molecular skeleton should be ignored,
and only the distribution of H-bond donor and acceptor groups is considered. The
equivalence achieved is not very convincing. Another variant is taken into consid-
eration in which the heterocycle of DHF is flipped over along the bond between the
heterocycle and the side chain. The spatial overlap of both molecules is no longer
optimal, but the pattern of exposed donor and acceptor groups for both molecules
shows much better agreement (Fig. 17.6). If transformed into another conformation,
the molecule now has entirely different molecular recognition properties. This
difference can hardly be read from chemical formulae, even by a trained eye in
cases such as this one.
Models are nice, but are they also correct? Here only an experiment can provide an
answer. Luckily, in the present case, crystal structures are available for both ligands in
complex with DHFR. The observed binding geometries are shown in Fig. 17.7. One
aspartate and two carbonyl groups in the main chain and two water molecules are
responsible for recognition in the binding pocket. The water molecules mediate the H-
bonds between ligand and protein. The experimentally determined binding geometries
show that the considerations about the similarity of the hydrogen-bond properties led
to the correct conclusions. The orientation of the two ligands in the binding pocket,
which at first glance seems surprising and “non-equivalent,” is easily explained. The
properties that are responsible for the mutual recognition process must be compared
to one another. Only these count in the comparison! It is notable that this
experimental confirmation of the above-described ideas came eight years after the
working hypothesis was proposed. This is a nice example of the predictive power of
model hypotheses.
Other properties, apart from hydrogen bonds, can serve as additional criteria to
define similarities in the molecular-recognition process. The electrostatic poten-
tial (▶ Chap. 15, “Molecular Modeling”) computed for the heterocyclic ring
systems of DHF and MTX (Fig. 17.7) suggests very similar conclusions. In addition
to the previously mentioned H-bonding properties and electrostatic potential, steric
space filling and the distribution of hydrophobic properties on the surface of both
ligands play an important role. When molecules are superimposed to predict their
putative geometries in the binding pocket, their conformational flexibility must also
be considered.

[Fig. 17.6, panels a–c: structural formulae of 17.6 and 17.7]
Fig. 17.6 Methotrexate 17.6 and dihydrofolate 17.7 are ligands of dihydrofolate reductase. The
side chain R (see ▶ Sect. 27.2, Fig. 27.9) is identical for both except for a methyl group on the
nitrogen atom. The heterocycles are different. (a) Intuitively, a direct superposition of both
heterocycles upon one another appears reasonable when comparing the structures. The heteroatoms
match one another pairwise. (b) Arrows are distributed around the molecules to compare the
hydrogen-bonding properties. They point toward the molecule when an acceptor is present and away
from it for donor groups. If the molecular skeletons are masked out and attention is focused on
the distribution of H-bond donor and acceptor groups, the atom–atom overlap obtained via the
direct superposition of the rings shows rather unconvincing equivalence. (c) If instead the
heterocycle in 17.7 is flipped about the bond between the heterocycle and the side chain R, the
pattern of donor and acceptor groups that is obtained exhibits convincing equivalence.

17.7 Automated Molecular Comparisons and Superpositioning Based on Recognition Properties

Is it possible to consider all of the properties that were mentioned in the last section
in a method that superimposes molecules for a relative comparison? For this,
a measure of similarity must be calculated for all properties. This measure must
be related to a spatial distance function. Subsequently, the spatial superposition
can be optimized while the maximum similarity of the chosen properties is sought.
The program SEAL from Simon Kearsley and Graham Smith determines the spatial
similarity of different properties distributed over the molecular scaffold. It
simultaneously scores the similarity with respect to the overlap volume of the
molecules determined during superposition. In this way the superposition of MTX and
DHF is predicted correctly, in agreement with experiment. The conformational
flexibility is also considered in this analysis.

Fig. 17.7 Experimentally determined binding geometries of methotrexate (green carbon atoms)
and dihydrofolate (gray carbon atoms) in dihydrofolate reductase. The heterocycles of the ligands
are bound through H-bonds to the carboxylate or carbonyl group of an amino acid that is oriented
into the binding pocket. Two water molecules (red spheres) mediate additional H-bonds between
the ligands and the protein. The difference in the binding mode that is discussed in Fig. 17.6 is
clearly recognized. On the right-hand side the electrostatic potentials around methotrexate (top)
and dihydrofolate are shown. The molecules are shown in the spatial orientation that was determined
by crystal structure analysis. Considered qualitatively, the electrostatic potentials of both
molecules in this orientation have a very similar form.

For this, precalculated conformers can be taken and compared successively to
one another. This is realized in the program ROCS from Anthony Nicholls at
OpenEye. A different approach was taken by Christian Lemmen at
GMD in St. Augustin, Germany, in the program FlexS. First, a reference ligand
is represented by a series of property-bound Gaussian functions, so that the molecule
is described as a density distribution of pharmacophore properties in space.
Then the molecule to be superimposed onto the reference ligand
is decomposed into fragments. A central base fragment is laid upon the
reference in such a way that its description with Gaussian functions overlaps
with that of the reference as well as possible. Then the other fragments are
attached to the base fragment until the complete ligand is reconstructed. During
this attachment, care is taken to fit each fragment just as well to the Gaussian
functions. At the same time, the conformational flexibility of the ligand is
taken into account.
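
The idea of coupling a similarity measure to a spatial distance function can be made concrete with Gaussian-weighted feature overlaps. The sketch below is only in the spirit of such methods and does not reproduce the scoring functions of SEAL, ROCS, or FlexS. The feature types, the width parameter `alpha`, and the 2D coordinates are hypothetical and are not taken from MTX or DHF.

```python
import math

def gaussian_similarity(features_a, features_b, alpha=0.5):
    """Sum of Gaussian overlaps between features of the same type (higher = more similar)."""
    score = 0.0
    for kind_a, pos_a in features_a:
        for kind_b, pos_b in features_b:
            if kind_a == kind_b:
                score += math.exp(-alpha * math.dist(pos_a, pos_b) ** 2)
    return score

def flip_about_x_axis(features):
    """Rotate planar feature positions by 180 degrees about the x axis: (x, y) -> (x, -y)."""
    return [(kind, (x, -y)) for kind, (x, y) in features]

# Hypothetical planar feature positions in angstroms.
reference = [("donor", (0.0, 1.0)), ("acceptor", (0.0, -1.0)), ("donor", (2.0, 0.0))]
candidate = [("acceptor", (0.0, 1.0)), ("donor", (0.0, -1.0)), ("donor", (2.0, 0.0))]

direct = gaussian_similarity(reference, candidate)
flipped = gaussian_similarity(reference, flip_about_x_axis(candidate))
```

In this toy case the direct overlay places a donor on an acceptor and scores poorly, whereas the flipped orientation aligns the recognition properties and scores higher, analogous to the flipped heterocycle in Fig. 17.6c. A real program would optimize the orientation and the conformation rather than test two fixed poses.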
One complication occurs during the similarity analysis of the molecules in this
method. Assuming that the relevant properties defining the similarity were found at
all, the question arises as to what is accepted as “sufficient” similarity to induce
a comparable effect on the receptor. There is a toy with which children try to push
differently shaped pieces through preformed holes into a box, a so-called “shape
sorter.” For each block form, be it a cube, cuboid, round cylinder, or elliptical
cylinder, there is one preformed hole that it fits. In similarity considerations there
is a tendency to group cube and cuboid, or round and elliptical cylinder, into related
categories because of their similar form. If an attempt is made to push these parts
through the holes of the shape sorter, it is easily discovered that the cuboid will
fit not only through the square hole but also, with a bit of force, through the hole
for the elliptical cylinder. The cube is only slightly too big to fit through the hole
for the circular cylinder in addition to the square hole. Are the cuboid and the
elliptical cylinder, or the cube and the circular cylinder, therefore not more similar
to one another? The measure of similarity that is to be used for a molecule is
calibrated with respect to the receptor to which the molecule should fit. It is
therefore always a relative measure!
Thiorphan and retro-thiorphan (▶ Sect. 5.5, formulae 5.23 and 5.24) differ only
in the direction of the amide bond. They bind with almost identical affinity
to the zinc protease thermolysin and to NEP 24.11. Therefore, one would classify
them as very similar. The zinc protease ACE, however, binds thiorphan at least
100 times more strongly than retro-thiorphan (▶ Sect. 5.5, Fig. 5.10). Relative
to this enzyme, both substances must be called dissimilar. Another extreme is
seen in the oligopeptide-binding protein A (▶ Sect. 4.1). It binds every tri- to
pentapeptide comprising a central Lys-Xxx-Lys moiety with almost equal
affinity. In principle, information about the shape of the binding site is
needed for a similarity analysis. Only then can the requirements be adequately
defined. However, the structure of the receptor is still not known in many
drug-design projects. Here there is no choice: it is only through hypotheses and
their experimental testing in gradual steps that the structural requirements of the
receptor can be approximated.

17.8 Rigid Analogues Trace the Biologically Active Conformation

The concepts in ▶ Chap. 16, “Conformational Analysis” showed that an enormously
large number of conformers can easily be generated for many drug-like
molecules. If all conformers are to be compared, the undertaking quickly
becomes computationally very intensive. When is there a chance of obtaining
an idea of the bound conformations? Either one compound in the data set is highly
rigid and constrains the putative arrangements of the pharmacophore in space, or
the considered molecules are rigid in different areas of their molecular scaffold. In
Fig. 17.8 the structural superposition of the steroid 17.2 with the above-described
inhibitors 17.3–17.5 is shown. This result was obtained from a similarity analysis
with multiple conformers. It is very similar to the result of the calculation with
the “virtual” spring forces. It has, however, a decisive advantage: no preconceived
definitions of equivalent centers, between which the spring forces are applied, are
necessary. These equivalences arise automatically through a similarity comparison
of the properties that are distributed over the molecules.

Fig. 17.8 Superposition of the steroid 17.2 and three inhibitors 17.3–17.5 according to a spatial
comparison of their molecular properties. In contrast to methods with “virtual” spring forces, this
method does not require a predefined equivalence of molecular groups. It is automatically
generated by the similarity comparison of many different conformations.

17.9 If Rigid Analogues are Lacking: Model Compounds Elucidate the Active Conformation

In the last example a largely rigid reference compound was available. How should
one proceed when no such reference compound is known? Only experiment can
help here. Rigidified analogues must be synthesized and tested for biological
activity. If they still exhibit affinity for the receptor, it can be assumed that the
active conformation has been frozen.
An example should demonstrate how the receptor-bound conformation can be
probed by synthesizing rigid model compounds. The calcium channel blocker
nifedipine 17.8 (▶ Sect. 2.5) contains multiple rotatable bonds (Fig. 17.9). It can
therefore adopt numerous conformations. Which orientation does the phenyl ring,
for instance, take relative to the dihydropyridine ring? This question was very
elegantly clarified by Wolfgang Seidel at Bayer through the synthesis and crystal
structure determination of cyclized derivatives 17.9. An additional lactone ring
changes the biological activity of the derivative depending on the ring size. In
compounds with a six-membered lactone the phenyl and dihydropyridine rings lie
virtually in the same plane. Conversely, the phenyl ring stands perpendicular to the
dihydropyridine ring in the derivative with the twelve-membered ring. The affinity
of this compound is about five orders of magnitude higher than for the derivative
with the six-membered lactone. Therefore it must be assumed that nifedipine exerts
its effect in a conformation in which the phenyl and dihydropyridine rings are
perpendicular to one another.
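
To put the affinity difference into energetic terms, a rough worked estimate may help. It assumes a temperature of 298 K and treats the inhibition constant K_i as a dissociation constant, so that ΔG° = RT ln K_i. Both assumptions, and the subscripts 6 and 12 for the two lactone ring sizes, are introduced here for illustration only.

```latex
\Delta\Delta G^{\circ}
  = \Delta G^{\circ}_{12} - \Delta G^{\circ}_{6}
  = RT \ln\frac{K_{i,12}}{K_{i,6}}
  \approx RT \ln 10^{-5}
  = -5 \times 2.303 \times 8.314\ \mathrm{J\,mol^{-1}\,K^{-1}} \times 298\ \mathrm{K}
  \approx -28.5\ \mathrm{kJ\,mol^{-1}}
```

Under these assumptions, each order of magnitude in K_i corresponds to about 5.7 kJ/mol, so locking the rings in the perpendicular arrangement is worth nearly 30 kJ/mol relative to the coplanar six-membered lactone.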
Once this question has been answered, more compounds can be designed.
A relevant superposition that corresponds to the conditions in the protein’s binding
pocket becomes possible. Such superpositions have gained decisive importance in the
context of 3D structure–activity relationships. ▶ Sect. 29.4 gives an example
of how the structural fixation of the biologically active conformation of a ligand can
support the design process.

Fig. 17.9 The calcium channel blocker nifedipine 17.8 contains multiple rotatable bonds. The phenyl
ring can coincide with a plane of the dihydropyridine ring or they orient perpendicular to one
another. To distinguish between these possibilities, lactones with different ring size 17.9 were
synthesized and their crystal structures were determined. The phenyl ring lies almost parallel to
the dihydropyridine ring (α ≈ 0°) in the compound with the six-membered-ring lactone (orange).
Upon increasing the ring size, the angle between the two rings grows so that a perpendicular
orientation (α ≈ 80°) is achieved in the twelve-membered-ring derivative (green). The biological
activity increases from virtually inactive, as in the six-membered ring, to almost five orders of
magnitude higher for the twelve-membered-ring derivative. The bioactive conformation of nifedipine
(gray) therefore requires a perpendicular orientation of the two rings.
[Structural formulae of 17.8 (nifedipine) and 17.9; plot of 90°−α against Ki (nM, logarithmic
axis from 1 to 100,000) with data points labeled by lactone ring size 6 to 12]

17.10 The Protein Defines the Pharmacophore: “Hot Spot” Analysis of the Binding Pocket

It was described in Sect. 17.1 that a pharmacophore can also be derived from the
protein structure. The computer program GRID from Peter Goodford is a tool that
is often used for this purpose. It calculates favorable positions for functional groups
on a putative ligand in the protein’s binding pocket. These could be, for instance,
a carboxylate group, a hydroxyl group, or an aliphatic carbon atom. The potential
function implemented in GRID has been calibrated on numerous functional
groups from crystal structures of organic molecules. The result of a GRID calculation
is a set of interaction energies assigned to the intersections of a regularly spaced grid
that is inscribed into the binding pocket. The energies are graphically displayed, for
instance, by contouring the spatial area at which the interaction energy reaches or
exceeds a certain predefined threshold. They indicate hot spots for the placement of
functional groups of a potential ligand. The areas in which the interactions with an
aromatic carbon atom or a hydroxyl oxygen atom are favorable are shown for the
enzyme thermolysin in Fig. 17.10. Such calculations are carried out with a set of
different probes, for instance, a water molecule, an aromatic carbon, a hydrogen-bond
acceptor or donor, or a positively or negatively charged group. The results provide
valuable information about the shape and electrostatic properties of the binding pocket.
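
The principle of such a grid calculation can be sketched as follows. This is not the GRID potential function: the sketch assumes a generic Lennard-Jones 12-6 term plus a Coulomb term with a constant dielectric of 4, and every parameter value, the single-atom "pocket", and the function names are hypothetical choices for illustration.

```python
import math
from itertools import product

COULOMB = 332.0  # kcal*angstrom/(mol*e^2), the usual force-field conversion factor

def probe_energy(point, atoms, probe_charge, epsilon=0.2, r_min=3.5, dielectric=4.0):
    """Lennard-Jones 12-6 plus Coulomb energy (kcal/mol) of a probe placed at `point`."""
    energy = 0.0
    for position, charge in atoms:
        r = max(math.dist(point, position), 0.5)  # clamp to avoid dividing by zero
        ratio6 = (r_min / r) ** 6
        energy += epsilon * (ratio6 * ratio6 - 2.0 * ratio6)  # minimum of -epsilon at r_min
        energy += COULOMB * probe_charge * charge / (dielectric * r)
    return energy

def hot_spots(atoms, probe_charge, lower, upper, spacing=1.0, threshold=-1.0):
    """Grid points whose probe energy is at or below `threshold`, with their energies."""
    axes = [[lo + i * spacing for i in range(int(round((hi - lo) / spacing)) + 1)]
            for lo, hi in zip(lower, upper)]
    favorable = {}
    for point in product(*axes):
        energy = probe_energy(point, atoms, probe_charge)
        if energy <= threshold:
            favorable[point] = energy
    return favorable

# Hypothetical "pocket": one partially negative atom at the origin.
pocket = [((0.0, 0.0, 0.0), -0.5)]
positive_probe_spots = hot_spots(pocket, +0.5, (-5, -5, -5), (5, 5, 5))
negative_probe_spots = hot_spots(pocket, -0.5, (-5, -5, -5), (5, 5, 5))
```

Contouring the favorable grid points for each probe type would give the kind of map shown in Fig. 17.10. In this toy case only the positively charged probe finds favorable positions around the negative atom.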
Another way of analyzing protein structures is based on the idea that the physical
nature of non-bonding interactions in protein–ligand complexes and in the crystal
packing of small organic molecules is identical. The latter are particularly
interesting for this purpose because the crystal structures of small organic molecules
are regularly determined with great precision. There are over 500,000 crystal
structures stored in the Cambridge Database (▶ Sect. 13.9). This collection is ideal
for obtaining relevant and reliable data for ligand-design purposes via a statistical
analysis (▶ Sect. 14.7). Let us assume that there is a carboxylate group –COO⁻ on the
protein that protrudes into the binding pocket. Where must a partner group be
positioned to form a favorable interaction? To answer this question, the Cambridge
Database was first searched for compounds with carboxylate groups, and then, for
each of the retrieved groups, the position of the counter group that forms an H-bond
to the carboxylate was saved. Finally, all of the H-bonds found were superimposed by
placing the carboxylate groups of all examples exactly on top of one another. The
distribution of H-bond-donor groups (Fig. 17.11) offers a valuable picture of the
allowed range of the H-bond geometry. Subsequently, such a distribution can be
superimposed onto the protein structure by matching it with the carboxylate group of
the protein. Areas in which the distribution overlaps with other atoms of the protein
are discarded. In this way the energetically most favorable areas for a counter group
in the binding pocket are found. In Fig. 17.12 these distributions are compared with
a protein–ligand complex. As expected, the hydrogen-bond geometries found in the
complex coincide nicely with the range that was found in the crystal packings of
organic molecules. A system of rules for non-bonding interactions in protein–ligand
complexes was obtained from the statistical evaluations of all groups that are found
in proteins. These rules are compiled at the Cambridge Crystallographic Data Centre
in the IsoStar database. Once superimposed with the protein, they can be contoured to
map hot spots of binding with the program SuperStar.

[Fig. 17.10 image: thermolysin binding pocket with labels Phe114, Asn112, Zn2+, and Arg203;
probe molecules shown: water, acetonitrile, acetone, isopropanol, phenol, and benzylsuccinic acid]
Fig. 17.10 An analysis of the binding pocket of thermolysin. Areas of favorable interactions were
calculated for an aromatic carbon probe (white) and a hydroxyl oxygen atom (red). Also shown are
the fragments mentioned in Fig. 7.8, whose positions could be determined by allowing the probe
molecules to diffuse into the protein crystals. The calculated hot spots correspond well with the
positions that were crystallographically determined with molecular probes.


[Fig. 17.11, panels a–d: distributions of OH donor groups around the four central groups]
Fig. 17.11 Hydrogen-bonding geometries (carbon is green, oxygen is red, and hydrogen is white)
around a carboxylate group (a), ester group (b), carbonyl group (c), and ether group (d). Structures
with these central groups that form hydrogen bonds with OH donor groups were extracted from the
Cambridge Database. These examples were superimposed based on the geometry of the central
group. It is obvious that there is considerable variability in the interaction geometry, but also that
preferred orientations are to be found. It is also shown that, for instance, the interaction pattern
around an ester group (b) is not simply a superimposition of the distribution around a carbonyl
group (c) and an ether group (d).

Knowledge-based potentials represent another approach for the display of
a protein-based pharmacophore. For this, the contact geometries in protein–ligand
complexes are evaluated. A histogram is compiled that shows how
often a particular contact occurs between a group of a ligand and an amino acid in
the protein. If such a statistical frequency distribution is related to a mean reference
state, an energy function can be calculated from it. In this function it is assumed that
contacts that occur more frequently than in the average distribution are energetically
favorable. If they occur rarely, they are considered unfavorable. These statistical
potentials have been integrated into the scoring function DrugScore. They can also
be used for the analysis of binding pockets and help to indicate hot spots of
ligand binding.
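
The step from a frequency distribution to an energy-like score can be sketched with an inverse-Boltzmann-type expression, here the negative logarithm of the ratio between the observed frequency of a contact and its frequency in the reference state. This is a generic illustration and not the DrugScore implementation. The distance bins and all counts are invented for the example.

```python
import math

def statistical_potential(pair_counts, reference_counts):
    """Score per distance bin: -ln(f_pair / f_ref); None where a bin has no data."""
    pair_total = sum(pair_counts)
    reference_total = sum(reference_counts)
    scores = []
    for n_pair, n_ref in zip(pair_counts, reference_counts):
        if n_pair == 0 or n_ref == 0:
            scores.append(None)  # the logarithm is undefined without observations
            continue
        f_pair = n_pair / pair_total
        f_ref = n_ref / reference_total
        scores.append(-math.log(f_pair / f_ref))
    return scores

# Hypothetical contact counts in four distance bins for one atom-type pair.
pair_counts = [5, 60, 25, 10]
reference_counts = [25, 25, 25, 25]  # hypothetical mean reference state
scores = statistical_potential(pair_counts, reference_counts)
```

Bins in which the contact is more frequent than in the reference state receive a negative (favorable) score, and rarely observed contacts receive a positive (unfavorable) one.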
The MCSS method was developed in the group of Martin Karplus. For this, several
thousand copies of probe molecules such as acetone, water, methanol, or benzene
are placed at random in a binding pocket. A computer simulation is started in
which the individual probe molecules are moved into optimal positions. They are driven
by a calculation according to the underlying force-field. The probe molecules
experience the interaction with the protein, but they do not “see” one another. At
the end of the calculation a frequency distribution for the probe molecules is
obtained. If this distribution is evaluated, hot spots for interactions with the
protein are highlighted. If the hot spots obtained in this way are compiled into
a composite picture, a protein-based pharmacophore is obtained.

[Fig. 17.12 image: residue labels Ala97, Leu4, Asp26, and Tyr155]
Fig. 17.12 The distributions of H-bond-donor groups (carbon is white, oxygen is red, and nitrogen
is blue) around a carboxylate group or a carbonyl group are superimposed with the 3D structure of
the complex of methotrexate with dihydrofolate reductase (Fig. 17.7). The distributions are
imposed onto the acid group of Asp26 and the carbonyl groups of Leu4 and Ala97. The hydrogen
bonds formed between protein and ligand coincide geometrically with ranges often found in
the crystal structures of small organic molecules.

17.11 The Search for Pharmacophore Patterns in Databases Generates Ideas for Novel Lead Compounds

A pharmacophore can be used to search a database for promising candidates that
can be accommodated in a protein’s binding pocket. The reference
pharmacophore can either be derived from a set of superimposed ligands, or
a reference protein can define its properties. How such a database search is carried
out and what is discovered in the process depend on how much information is
stored in the database itself. If only 2D structures are collected, all examples can be
retrieved that possess a particular functional group or substructure. Based on the
topology, different criteria are defined to determine the degree of similarity
between molecules. If the pharmacophore is defined very generally, for instance,
as an aromatic compound with an acid group and a basic nitrogen atom, then
numerous hits will be found. However, the relative spatial distances between these
groups are important. Such information is not taken into account in searching
a 2D database. Matthias Rarey and Scott Dixon developed the Feature-Trees method,
which can screen large databases according to topological criteria. However, the
connectivities of the chemical formulae are not compared. Rather, the database
entries are initially classified by the topological sequences of particular
characteristics, for instance, the presence of an H-bond-donor group or a hydrophobic
cyclic molecular portion. Such a method can compare molecules extremely quickly and
find candidates that have pharmacophore properties in a comparable topological
sequence.
Databases that contain 3D molecular geometries allow the search for the spatial
pattern of the pharmacophore. For example, the Cambridge Database of crystal
structures of small organic molecules (▶ Sect. 13.9) can be used for such a search.
Molecules are found with experimental geometries that satisfy the pharmacophore.
In the search for ligands for HIV protease (▶ Sect. 24.3) a pharmacophore pattern
was derived from the known crystal structure of the enzyme, and the Cambridge
Database was searched for molecules that match this pattern. The result of this
search is presented in ▶ Sect. 24.4 (Fig. 24.16) in detail. It inspired the researchers
at Dupont–Merck with the first ideas that led to the development of an entirely new
class of non-peptidic HIV-protease inhibitors.
These days, databases containing 3D structures of molecules generated from 2D
structural formulae are commonly used alongside experimental structural databases.
In other approaches, the molecule’s spatial structure is generated on the fly during the
search (▶ Sect. 15.2). Here, as with most entries in the Cambridge Database, each
molecule is present in only one conformation. Molecules can, however, adopt many
different conformations (▶ Chap. 16, “Conformational Analysis”). It is therefore
usually the exception that a flexible molecule exists in the “right” conformation
required for the search. Therefore conformational flexibility must be considered
during the search. An elaborate search, for example, the active-analogue approach,
would demand too much computational time. Therefore fast algorithms have been
developed to figure out whether particular pharmacophoric groups on the molecules
could fall within predefined distances. It is enough to estimate the minimum or
maximum achievable distances. This concept has been realized, for example, in the program
UNITY from the company Tripos. One can start from a database holding multiple
precalculated conformers. Here it is critical that the stored conformers are distributed
as representatively as possible throughout the conformational space (▶ Sect. 16.6).
The single conformers are then checked to see whether they fit to the defined
pharmacophore. This concept is followed by the program Catalyst from the company
Accelrys.
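The conformer-based check described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions and not the algorithm used by UNITY or Catalyst: the query format, the group names, the distance ranges, and the two conformers are all invented for the example.

```python
from math import dist

# Hypothetical pharmacophore query: allowed distance range (in Å) between
# pairs of named pharmacophoric groups. Names and numbers are invented.
QUERY = {
    ("donor", "acceptor"): (4.5, 5.5),
    ("donor", "hydrophobe"): (6.0, 7.5),
    ("acceptor", "hydrophobe"): (3.0, 4.5),
}

def conformer_matches(conformer, query=QUERY):
    """Return True if every queried group pair lies within its distance range.

    `conformer` maps a group name to its (x, y, z) position in Å.
    """
    for (a, b), (lower, upper) in query.items():
        d = dist(conformer[a], conformer[b])
        if not lower <= d <= upper:
            return False
    return True

def matching_conformers(conformers, query=QUERY):
    """Return the indices of all stored conformers that satisfy the query."""
    return [i for i, c in enumerate(conformers) if conformer_matches(c, query)]

# Two invented conformers of the same molecule.
extended = {"donor": (0.0, 0.0, 0.0), "acceptor": (5.0, 0.0, 0.0),
            "hydrophobe": (5.0, 4.0, 0.0)}
folded = {"donor": (0.0, 0.0, 0.0), "acceptor": (3.0, 0.0, 0.0),
          "hydrophobe": (3.0, 2.0, 0.0)}

hits = matching_conformers([extended, folded])
```

Only the extended conformer satisfies all three distance ranges, so the molecule would be reported as a hit through that conformer. A real search would additionally have to consider tolerances on the group definitions and the coverage of the stored conformers.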
It is not to be expected that such database searches directly deliver candidates for
clinical trials. As an idea generator, however, they can guide the drug researcher to
novel lead structures and can drive synthetic plans down entirely different path-
ways. Today database searches are carried out on a large scale during the course of
virtual screenings (▶ Sect. 7.6). For this, proprietary compound libraries are
screened, or collections of commercially available compounds are searched.
John Irwin and Brian Shoichet at UCSF in San Francisco have taken the initiative
with the database ZINC, which collects current commercially available compounds
and makes the collection available for database searches. Preset filters help to sieve out
the desired subsets for the search at hand from the millions of compounds in the
databases. As a major advantage, the found hits can be purchased and experimentally
tested in an assay. Many candidates for new lead structures have already been
discovered by using this “lead discovery by shopping” strategy (see ▶ Sect. 21.7).

17.12 Synopsis

• The structure of the binding pocket determines which functional groups are
necessary on the ligand side for successful protein binding. Either the ligand
or the protein structure can be used as the starting point from which
a pharmacophore is derived.
• The superposition of active and inactive small molecule ligands from a series of
related compounds upon one another can be used to define the allowed and
forbidden areas in a hypothetical binding pocket. Logical operations on these volumes,
such as volume differences, guide the design of optimized ligands.
• Flexible molecules that can adopt different conformations present a special
challenge in superpositions. The molecules must be energy-minimized as part
of the superposition procedure or, alternatively, multiple conformations must be
evaluated.
• Alternatively, a set of molecules can be superimposed by assigning
pharmacophoric groups, and through systematic rotations about all open-
chain single bonds a common alignment is found in the active-analogue
approach.
• Care must be taken to not be deceived by molecules that look similar with
respect to their chemical formulae. Instead, the interacting functional groups are
important for the molecular recognition at the binding pocket and not the
scaffold itself. The role of water in the binding must not be underestimated.
• Molecular recognition properties can also be considered to mutually superim-
pose molecules.
• The synthesis of a structurally rigid analogue (or analogues) can help to define
and validate the pharmacophore assignment and the determination of the bio-
logically active conformation.
• Binding “hot spots” can be found by mapping the
binding pocket with small molecules and probes with different properties. These
give some ideas as to what sort of molecule might show successful binding to the
target protein.
• The Cambridge Database of crystal structures provides valuable insights into
preferred interaction geometries and motifs. Such information is of high rele-
vance for protein–ligand complexes because the forces that are responsible for
crystal packing are the same as for non-bonding interactions between active
substances and proteins.
• A variety of databases are available that can be screened by using a 3D
pharmacophore as a search query. Usually, commercially available compounds
are screened first. If they are predicted to be active on a protein of interest, they can be
purchased and tested, and will hopefully provide a starting point for lead discovery.

Bibliography

General Literature

Klebe G (1993) Structural alignment of molecules. In: Kubinyi H (ed) 3D-QSAR in drug design,
Theory, methods and application. ESCOM, Leiden, pp 173–199
Langer T, Hoffmann RD (2006) Methods and principles in medicinal chemistry. In: Mannhold R,
Kubinyi H, Folkers G (eds) Pharmacophores and pharmacophore searches, vol 32.
Wiley-VCH, Weinheim
Marshall GR (1989) Computer-aided drug design. In: Richards WG (ed) Computer-aided
molecular design. IBC Technical Services, London, pp 91–104

Special Literature
Bolin JT, Filman DJ, Matthews DA, Hamlin RC, Kraut J (1982) Crystal structure of Escherichia
coli and Lactobacillus casei dihydrofolate reductase refined at 1.7 Å resolution. J Biol Chem
257(13):13650–13662
Kearsley SK, Smith GM (1990) An alternative method for the alignment of molecular structures:
maximizing electrostatic and steric overlap. Tetrahedron Comput Methodol 3:615–633
Klebe G, Mietzner T, Weber F (1995) Different approaches toward an automatic structural
alignment of drug molecules: applications to sterol mimics, thrombin and thermolysin inhib-
itors. J Comput-Aided Mol Des 8:751–778
Klunk WE, Kalman BL, Ferrendelli JA, Covey DF (1983) Computer-assisted modeling of the
picrotoxinin and γ-butyrolactone receptor site. Mol Pharmacol 23:511–518
Kuster DJ, Marshall GR (2005) Validated ligand mapping of ACE active site. J Comput-Aided
Mol Des 19:609–615
Mackay MF, Sadek M (1983) The crystal and molecular structure of picrotoxinin. Aust J Chem
36:2111–2117
Marshall GR, Barry CD, Bossard HE, Dammkoehler RA, Dunn DA (1979) The conformational
parameter in drug design: the active analog approach. In: Olson EC, Christoffersen RE (eds)
Computer-assisted drug design, vol 112, ACS symposium series. American Chemical Society,
Washington, DC, pp 205–226
Martin YC (1992) 3D database searching in drug design. J Med Chem 35:2145–2154
Mayer D, Naylor CB, Motoc I, Marshall GR (1987) A unique geometry of the active site of
angiotensin-converting enzyme consistent with structure-activity studies. J Comput-Aided Mol
Des 1:3–16
Seidel W, Meyer H, Born L, Kazda S, Dompert W (1984) Rigid calcium antagonists of the
Nifedipine-type: geometric requirements for the dihydropyridine receptor. In: Seydel JK (ed)
QSAR as strategies in the design of bioactive compounds. VCH, Weinheim, pp 366–369
18 Quantitative Structure–Activity Relationships

Quantitative structure–activity relationships, QSAR (usually pronounced [ˈkyü: sar]),
attempt to describe and quantify the correlation between chemical structure
and biological activity. The investigated substances should come from a chemically
uniform series and must interact with the same biological target. They should also
display the same mode of action. For example, structurally analogous inhibitors of
a particular protein can be compared among themselves, but not different blood
pressure lowering drugs that have diverse modes of action on different target proteins.
The correlation of biological activity with the physicochemical properties is always
related to relative potency in a test model, but not to different effect qualities.
The foundation of quantitative correlations between chemical structure and
biological effect is the entirely reasonable assumption that the differences in the
physicochemical properties are responsible for the relative potency of the interactions
of the drug with biological macromolecules. It is assumed as a first approximation
that these properties contribute additively to the affinity of an active substance for its
receptor. The concept of describing the biological activity of substances with
mathematical models is derived from this approach.
For the system under investigation, it can be assumed that the simpler it is, the
more likely it will be that a quantitative structure–activity relationship can be
derived. To a certain extent this is valid for in vitro systems, such as the inhibition
of an enzyme or the binding to a receptor, where the assay records only the binding
of a compound to a protein. The more complex the system is, for example, central
nervous system effects on an animal after oral administration, the more different
processes must be considered. In this case the absorption, distribution, blood–brain
barrier penetration, further transport to the target tissue, metabolism, and elimina-
tion overlap with one another and with the actual effect on the receptor. In principle,
an individual structure–activity relationship is required for each of these events.
Establishing valid and relevant models for each of these steps requires
corresponding test systems that examine the different steps separately. In favorable
cases it might be possible to characterize a complex multistep process by one single
equation. This is only feasible if one step, for instance, the penetration through the
blood–brain barrier, dominates the entire structure–activity relationship.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_18, © Springer-Verlag Berlin Heidelberg 2013

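[Figure 18.1: reaction scheme. Left: the tertiary amine R1R2R3N is present in the positively charged, protonated form at pH < 9 and as the neutral alkaloid at pH > 9. Right: alkylation, e.g., with CH3I, gives the quaternary, permanently charged form R1R2R3N+–CH3.]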

Fig. 18.1 The protonation of a tertiary amine depends on the pH value of the medium (left). On
the other hand, the quaternization of a nitrogen atom leads to a permanently positively charged
compound (right).

18.1 Structure–Activity Relationships of Alkaloids

The South American dart poison tubocurare (▶ Sect. 7.1) was the first therapeutic
principle for which the exact mode of action was elucidated. In 1852, Claude
Bernard recognized that this quaternary alkaloid causes muscle paralysis, but that
the nerve as well as the muscle remain independently excitable. Curare must
therefore act on the coupling between nerve and muscle. Scottish pharmacologists
Alexander Crum-Brown and Thomas Fraser occupied themselves somewhat more
exhaustively with the question of whether the quaternization of the nitrogen atom of
different alkaloids (Fig. 18.1) has an influence on their biological effects. In 1868,
from the entirely different effects observed before and after this transformation of
the alkaloids, they formulated a general equation to describe structure–activity
relationships (Eq. 18.1).

$$\Phi = f(C) \tag{18.1}$$

This equation is ingeniously simple, but it says only that Φ (Greek letter Phi), the
biological activity, is a function of C, the chemical structure. At that time, the
tetrahedral structure of the carbon atom had not been clarified, and the constitution
of many organic compounds, above all complex natural products, was entirely
unknown.

18.2 From Richet, Meyer, and Overton to Hammett and Hansch

In 1893, Charles Richet published an investigation on the toxicity of organic
compounds. From the comparison of the water solubility of ethanol, diethyl ether,
urethane, paraldehyde, amyl alcohol, and absinthe extract(!) to the lethal dose in the
dog, he concluded “plus ils sont solubles, moins ils sont toxiques,” that is, the better the
solubility, the lower the toxicity. This was the first evidence of a linear inverse
relationship between water solubility and biological activity.
Around the turn of the twentieth century, pharmacologist Hans Horst Meyer and
botanist Charles Ernest Overton independently founded the lipid theory of anesthesia,
which unifies three important statements:
• All chemically unreactive substances that are lipophilic and can be distributed in
biological systems have anesthetic effects.
• The biological effect occurs in nerve cells because fat plays an important role in
their function.
• The relative potency of anesthetics depends on their partition coefficient
(▶ Sect. 19.2) in a mixture of fat and water.
The work of Crum-Brown, Fraser, and Richet, or the contribution of Meyer and
Overton can be seen as the origin of quantitative structure–activity relationships. In
fact after the formulation of the anesthesia theory, numerous other linear, and later
non-linear, dependencies on the lipophilicity, the “fat affinity” of active substances,
were found. But all of these activities were relatively unspecific “membrane”
effects.
In the middle of the 1930s Louis P. Hammett formulated a relationship between
the electronic properties of the substituents and the reactivity of aromatic compounds.
Accordingly, the relative contribution of electron-withdrawing and electron-donating
substituents to the electron density of the aromatic ring is always
constant. It is determined by the electronic parameter of the substituent, the
Hammett constant, σ. Electron-accepting substituents with positive σ values are,
among others, the nitro group, the cyano group, and the halogens. Electron-donating
substituents with negative σ values are hydroxyl and amino groups, the
methoxy group, and alkyl substituents. Acceptor substituents enhance the acidity of
benzoic acids and phenols, they reduce the basicity of anilines, and they accelerate
the basic hydrolysis of benzoic esters. Electron-donating substituents exert the
opposite influence.
However, an individual reaction constant ρ must be applied for each reaction
type of aromatic compounds. By using Eq. 18.2, later generally called the Hammett
equation, the equilibrium constant K for an arbitrary reaction can be calculated from
ρ and σ. R–X and R–H represent the relevant aromatic compounds substituted with
the group X, or unsubstituted, respectively.

$$\rho\sigma = \log K_{RX} - \log K_{RH} \tag{18.2}$$
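As a worked illustration of the Hammett equation, the following Python sketch rearranges Eq. 18.2 to estimate the acidity of para-substituted benzoic acids. The σ values and the pKa of benzoic acid are assumed, commonly tabulated numbers that do not come from this chapter, and the function names are hypothetical; the ionization of benzoic acids in water is taken as the reference reaction with ρ = 1.

```python
# Assumed para-substituent constants (sigma); illustrative literature-style
# values that are not taken from this chapter.
SIGMA_PARA = {"NO2": 0.78, "CN": 0.66, "Cl": 0.23, "H": 0.00,
              "CH3": -0.17, "OCH3": -0.27}

def log_k_substituted(log_k_parent, rho, sigma):
    """Eq. 18.2 rearranged: log K_RX = log K_RH + rho * sigma."""
    return log_k_parent + rho * sigma

def pka_substituted(pka_parent, rho, sigma):
    """The same relation for acid dissociation, where pKa = -log Ka."""
    return pka_parent - rho * sigma

# Assumed pKa of unsubstituted benzoic acid in water.
PKA_BENZOIC = 4.20
RHO_BENZOIC = 1.0  # reference reaction of the Hammett scale

estimates = {
    substituent: pka_substituted(PKA_BENZOIC, RHO_BENZOIC, sigma)
    for substituent, sigma in SIGMA_PARA.items()
}
```

With these assumed inputs the acceptor-substituted acids come out more acidic than benzoic acid and the donor-substituted ones less acidic, in line with the qualitative statement in the text. For another reaction type only ρ would change, whereas the σ values stay the same.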

Acceptor and donor substituents influence the electron density on the heteroatoms
and reduce or increase the ability to form hydrogen bonds. This, among other things,
explains the electronic influence of aromatic substituents on the biological activity of
drug molecules. The Hammett equation was therefore seen as a challenge to pharmaceutical
chemists and biologists to derive quantitative structure–activity relationships
from this concept. Many groups have made efforts to find relationships between
biological activity and the Hammett constants σ, or to derive σ- and/or ρ-analogous
substituent and test parameters for biological systems. Despite individually
interesting results, no generally valid concept could be established.
It was Corwin Hansch and Toshio Fujita who in 1964 published a work that
established the fundamentals for quantitative structure–activity relationships. In
this, they describe:
• The definition of a lipophilicity parameter π, analogous to the electronic term σ
in the Hammett equation.
• The combination of different parameters in a model.
• The formulation of a parabolic model for the description of non-linear
lipophilicity–activity relationships.

18.3 The Determination and Calculation of Lipophilicity

Corwin Hansch had previously investigated the structure–activity relationship of
phenoxyacetic acids, which show growth-stimulatory effects in plants. In addition
to their biological activity, he was particularly interested in their lipophilicity,
which can be measured by the partition coefficient in an octanol/water system
(▶ Sect. 19.1). It occurred to him while analyzing the data that the lipophilicity is
an additive molecular parameter. The logarithm of the octanol/water partition
coefficient P is given by the sum of the group contributions of the individual parts
of the molecule. Hansch defined a lipophilicity parameter π (Eq. 18.3), analogously
to the Hammett equation. R–X and R–H have the same meaning here as in
Eq. 18.2. The absence of a reaction-specific ρ term in Eq. 18.3 is because the π
value is based on a single distribution system: n-octanol and water.

$$\pi = \log P_{RX} - \log P_{RH} \tag{18.3}$$

n-Octanol was chosen for theoretical and practical reasons. It has a long aliphatic
chain and a hydroxyl group that is an H-bond donor as well as an acceptor. Its
structure therefore resembles the membrane lipids to some extent. It dissolves a
large number of organic compounds, and it has a low vapor pressure but can nonetheless
be easily removed. Its UV transparency over an extremely wide range is
particularly advantageous.
With the help of the lipophilicity parameter π, the log P values of new compounds,
and therefore their lipophilicity, can be calculated. For this the lipophilicity
of the basic scaffold and the π values of the substituents must be known. In this way
the biological activity can be correlated with lipophilicity without the tedious experimental
measurement of each individual partition coefficient. In addition to the π values of all
important substituents, a very large number of experimentally determined octanol/
water partition coefficients are available in the literature.
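The additive estimate of log P can be sketched as follows. The π values and the log P of benzene are assumed, typical tabulated numbers used only for illustration; they are not taken from this chapter, and the function name is hypothetical.

```python
# Assumed aromatic substituent constants (pi) and an assumed log P of the
# unsubstituted scaffold benzene; typical tabulated values for illustration.
PI = {"H": 0.00, "F": 0.14, "Cl": 0.71, "Br": 0.86,
      "CH3": 0.56, "NO2": -0.28, "OH": -0.67}
LOG_P_BENZENE = 2.13

def estimate_log_p(log_p_scaffold, substituents, pi=PI):
    """Eq. 18.3 rearranged and summed: log P = log P(scaffold) + sum of pi."""
    return log_p_scaffold + sum(pi[x] for x in substituents)

chlorobenzene = estimate_log_p(LOG_P_BENZENE, ["Cl"])
chloronitrobenzene = estimate_log_p(LOG_P_BENZENE, ["Cl", "NO2"])
```

The estimate treats every substituent as independent of the others. This is the additivity assumption stated in the text, and it can be expected to become less reliable when substituents interact electronically or sterically.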

18.4 Lipophilicity and Biological Activity

Lipophilicity has an overwhelming role in describing the dependence of biological
effects on chemical structure and therefore accounts for many quantitative
structure–activity relationships. This is easily understood because biological
systems consist of aqueous phases that are separated by lipid membranes. The
transport and the distribution of small molecules in such systems must therefore
depend on the lipophilicity. For polar substances the lipid membrane represents
a barrier that they cannot surmount. Only substances with moderate lipophilicity
have a good chance to “migrate” into the aqueous as well as the lipid phases to
arrive in adequate concentrations in the target tissue (▶ Chap. 19, “From In Vitro to
In Vivo: Optimization of ADME and Toxicology Properties”). Although soluble
proteins carry overwhelmingly polar amino acid residues on their surfaces, the more-
or-less buried binding site for ligands is constructed from polar and non-polar areas.
The hydrophobic parts of the ligand bind to the hydrophobic parts of the pocket.
The size of these hydrophobic surface areas is always limited. The size and form of
the lipophilic portion of the ligand must fit to the hydrophobic surfaces in the binding
pocket. Because the natural ligands that are normally bound in these pockets have
adequate water solubility themselves, the lipophilic areas in the binding pockets are of
limited size. Another reason for the complex, generally non-linear lipophilicity–activity
relationships results from this fact. Many linear and non-linear lipophilicity–activity
relationships describe relatively unspecific biological effects such as anesthetic,
bactericidal, fungicidal, and hemolytic effects. They shall not be further discussed
here. Other relationships describe the transport and distribution in a biological system.
Such structure–activity relationships are discussed in ▶ Chap. 19, “From In Vitro to
In Vivo: Optimization of ADME and Toxicology Properties”.

18.5 The Hansch Analysis and the Free–Wilson Model

In 1964 Corwin Hansch and Toshio Fujita derived, more intuitively than
theoretically, a mathematical model that can quantitatively describe structure–activity
relationships, the Hansch analysis (Eq. 18.4).

$$\log \frac{1}{C} = k_1 (\log P)^2 + k_2 \log P + k_3 \sigma + \dots + k \tag{18.4}$$

In Eq. 18.4, C is a molar concentration that induces a particular biological effect.
When related to a series of substances, it is the equieffective molar dose. Log P is the
logarithm of the octanol/water partition coefficient P, and σ is the Hammett constant.
The square of the log P term allows the quantitative description of non-linear
lipophilicity–activity relationships. This term is omitted when the dependence is
linear. Other terms such as polarizability and steric parameters can additionally occur.
The coefficients k1, k2, ... and k are determined with the method of regression
analysis. The Hansch analysis therefore establishes a hypothetical model for quan-
titative relationships between biological activity and physicochemical parameters.
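As a short worked step, the lipophilicity optimum implied by the parabolic term of Eq. 18.4 can be written out. The sketch assumes that only log P varies while all other terms stay constant and that the fitted k1 is negative, so that the parabola has a maximum; the symbol log P_opt is introduced here for illustration only.

```latex
\frac{\partial \log(1/C)}{\partial \log P} = 2 k_1 \log P + k_2 = 0
\quad\Longrightarrow\quad
\log P_{\mathrm{opt}} = -\frac{k_2}{2 k_1}, \qquad k_1 < 0
```

Compounds that are more or less lipophilic than this optimum would then be predicted to be less active, which is the non-linear behavior the squared term is meant to capture.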
Biological data are flawed, and the same is true for physicochemical properties.
Despite this, the reliability of the latter parameters is usually greater than that of
the biological data. The result of a calculation is judged by the squared differences
between the measured biological data and the values that were calculated from the
model. Their sum must show the smallest possible value over all of the investigated
compounds. It represents an important criterion for the judgment of the quality of
a model, or for the comparison of different models with different qualities.
The quantitative structure–activity relationship of the antiadrenergic effect of
N,N-dimethyl-β-bromophenethylamines 18.1 (Table 18.1) is considered as an
example. Depending on their structure, these compounds more or less reverse the
agonistic effect of an adrenaline dose. The value C is the dose of an antagonist that
blocks the adrenaline effect by 50%. The data can be described with the Hansch
model, which is illustrated in Fig. 18.2.

Table 18.1 The biological activity of meta- and para-substituted phenethylamines 18.1 (i.v.
application in the rat; C in mol/kg rat)
[Structure 18.1: N,N-dimethyl-β-bromophenethylamine hydrochloride (× HCl) with meta substituent X and para substituent Y]

meta (X)   para (Y)   log 1/C
H          H          7.46
H          F          8.16
H          Cl         8.68
H          Br         8.89
H          I          9.25
H          Me         9.30
F          H          7.52
Cl         H          8.16
Br         H          8.30
I          H          8.40
Me         H          8.46
Cl         F          8.19
Br         F          8.57
Me         F          8.82
Cl         Cl         8.89
Br         Cl         8.92
Me         Cl         8.96
Cl         Br         9.00
Br         Br         9.35
Me         Br         9.22
Me         Me         9.30
Br         Me         9.52

$$\log \frac{1}{C} = 1.15\,(\pm 0.2)\,\pi - 1.46\,(\pm 0.4)\,\sigma^{+} + 7.82\,(\pm 0.2)$$
(n = 22; r = 0.945; s = 0.196; F = 78.6)

• C is the molar concentration that invokes a particular biological effect. The logarithm of the reciprocal value gives the correct scaling.
• 1.15 and −1.46 are the regression coefficients of the lipophilicity parameter π and the electronic parameter σ⁺; 7.82 is the constant term.
• The values in parentheses are the 95% confidence intervals for the coefficients and the constant.
• n is the number of compounds.
• The correlation coefficient, r, is a measure for the quality of the model.
• The standard deviation, s, is a measure of the absolute quality of the model.
• The Fisher value F is a measure of the significance; it is often not reported.

Fig. 18.2 A QSAR equation delivers individual parameters for a quantitative model for the
prediction of biological activity, in this case from substituted N,N-dimethyl-β-
bromophenethylamines (Table 18.1).

First, the entire data set can be described with a mathematical model by
using the derived equation. A carbocation is formed upon cleavage of the bromide,
and the substances bind irreversibly to the adrenergic receptor. Accordingly, the σ⁺
term is found in the Hansch equation (Fig. 18.2), which describes such reaction
types particularly well. Lipophilic substituents increase the biological activity
(positive π term) and electron-withdrawing substituents decrease it (negative σ⁺
term). Therefore lipophilic electron-donating substituents, for example, large alkyl
substituents, should be optimal for the activity. Second, within certain limits, the
effect of further compounds can be predicted. Interpolations, that is, conclusions
that are drawn based upon very similar substituents, have a better reliability than
extrapolations, which are predictions made outside of the parameter space, for
instance, for considerably more lipophilic, more polar, or larger substituents. As
a first approximation, it can be said of the statistical parameters r, s, and F
(Fig. 18.2) that the correlation coefficient r should have values that are close to
1.00, the standard deviation, s, should be as small as possible, and the F value
should be as large as possible. The better the criteria are fulfilled, the better the
quantitative model will be, in other words, the experimental and calculated values
agree better with one another.
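How the three statistical parameters hang together can be sketched with the numbers reported in Fig. 18.2. The formulas below are the usual textbook definitions for a regression with k descriptors and n compounds; that the author used exactly these definitions is an assumption, and the function names are hypothetical.

```python
def f_value(r, n, k):
    """F statistic of a regression with k descriptors fitted to n compounds."""
    r2 = r * r
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

def residual_sum_of_squares(s, n, k):
    """Sum of squared residuals implied by the standard deviation s,
    assuming s = sqrt(SSR / (n - k - 1))."""
    return s * s * (n - k - 1)

# Values reported in Fig. 18.2; k = 2 descriptors (pi and sigma+).
n, k, r, s = 22, 2, 0.945, 0.196

f_from_r = f_value(r, n, k)
ssr = residual_sum_of_squares(s, n, k)
```

Recomputed from the rounded correlation coefficient, F comes out at roughly 79, close to the reported 78.6; the small difference is plausibly due to the rounding of r. The implied sum of squared residuals of about 0.73 is the quantity that the regression minimizes.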
Also in 1964 and independently of Hansch and Fujita, S. M. Free and J. W. Wilson
developed an entirely different model for structure–activity analysis. Because the
original approach is confusingly formulated and awkward to use, here only a variant
shall be discussed that was later proposed by Fujita and T. Ban, the Free–Wilson
analysis. The Free–Wilson analysis assumes that within a set of chemically related
substances, a reference compound, usually the unsubstituted starting compound,
makes per se a specific contribution μ to the biological effect. Each substituent on
this scaffold delivers an “additive and constitutive” contribution ai to the biological
activity (Fig. 18.3). Additive, because the contribution does not depend on structural variation
in other positions of the molecule, and constitutive, because it does matter at which
position of the molecule the specific structural change is undertaken. Despite these
relatively simple assumptions, the Free–Wilson analysis delivers good quantitative
models for many structure–activity relationships.

[Figure 18.3: schematic of an active substance made up of a basic scaffold (contribution μ) and substituents X1, X2, …, Xn (contributions a1, a2, …, an).]

Free–Wilson model: $$\log \frac{1}{C} = \sum_i a_i + \mu$$

Fig. 18.3 The Free–Wilson analysis uses the additive nature of the group contributions to
describe the biological activity. Accordingly, the biological activity in the displayed equation is
made up of the activity of the basic scaffold, μ, and the constant group contributions ai of the
substituents Xi.

Table 18.2 Free–Wilson group contributions for phenethylamines

Position   H      F       Cl     Br     I      Me
meta       0.00   −0.30   0.21   0.43   0.58   0.45
para       0.00   0.34    0.77   1.02   1.43   1.26
μ = 7.82 (n = 22; r = 0.97; s = 0.19)a
a For an explanation of these values see Fig. 18.2

In contrast to the Hansch analysis, which compares properties, the Free–Wilson
analysis is a real “structure–activity analysis,” because the parameter that codes for
the structural information (1 for present, 0 for absent) correlates with biological
effects. It is easily carried out, but the structures and the biological data must be
known. Unfortunately, the Free–Wilson analysis also has disadvantages:
• The structural variation must be present on at least two different substitution
sites, because otherwise there will not be enough degrees of freedom to use
statistical methods.
• The usually large number of variables diminishes the predictive value and
reliability of the analyses.
• Predictions are only possible for combinations of substituents that have already
been considered in the analysis, and not for new substituents.
If the Free–Wilson analysis is applied to the above-mentioned antiadrenergic
phenethylamine example, the values in Table 18.2 are obtained for the scaffold and
the substituent contributions. Even at a quick glance, an increase in the values from
F through Cl and Br to I, that is, the influence of the lipophilicity, is obvious. Despite having
almost the same lipophilicity, the methyl and chloro substituents contribute differently. This is
explained by their different electronic properties. Differences in the electronic influence
of the meta and para positions can also be followed. Therefore the Free–Wilson
analysis indeed has advantages for the analysis of substituent effects.
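The additive model can be checked against the measured data with a short Python sketch that sums the contributions of Table 18.2 and compares the result with Table 18.1. Two assumptions are made here: the meta-F contribution is entered as −0.30 (with a negative sign, which reproduces the measured value of 7.52 for X = F, Y = H), and r and s follow the usual definitions with k = 10 fitted group contributions. All names in the code are hypothetical.

```python
from math import sqrt

MU = 7.82  # scaffold contribution (Table 18.2)
# Group contributions from Table 18.2; meta-F is assumed to be -0.30.
META = {"H": 0.00, "F": -0.30, "Cl": 0.21, "Br": 0.43, "I": 0.58, "Me": 0.45}
PARA = {"H": 0.00, "F": 0.34, "Cl": 0.77, "Br": 1.02, "I": 1.43, "Me": 1.26}

# (meta X, para Y, measured log 1/C) from Table 18.1.
DATA = [
    ("H", "H", 7.46), ("H", "F", 8.16), ("H", "Cl", 8.68), ("H", "Br", 8.89),
    ("H", "I", 9.25), ("H", "Me", 9.30), ("F", "H", 7.52), ("Cl", "H", 8.16),
    ("Br", "H", 8.30), ("I", "H", 8.40), ("Me", "H", 8.46), ("Cl", "F", 8.19),
    ("Br", "F", 8.57), ("Me", "F", 8.82), ("Cl", "Cl", 8.89), ("Br", "Cl", 8.92),
    ("Me", "Cl", 8.96), ("Cl", "Br", 9.00), ("Br", "Br", 9.35), ("Me", "Br", 9.22),
    ("Me", "Me", 9.30), ("Br", "Me", 9.52),
]

def predict(meta, para):
    """Free-Wilson estimate: scaffold value plus both group contributions."""
    return MU + META[meta] + PARA[para]

observed = [value for _, _, value in DATA]
calculated = [predict(meta, para) for meta, para, _ in DATA]

# Sum of squared residuals and total sum of squares around the mean.
ssr = sum((o - c) ** 2 for o, c in zip(observed, calculated))
mean_observed = sum(observed) / len(observed)
sst = sum((o - mean_observed) ** 2 for o in observed)

n, k = len(DATA), 10  # 5 meta + 5 para contributions besides the constant
r = sqrt(1.0 - ssr / sst)
s = sqrt(ssr / (n - k - 1))
```

With these inputs, r and s should come out close to the values reported in Table 18.2 (r = 0.97; s = 0.19). The sketch also makes the third disadvantage listed above concrete: `predict` can only be called for substituents that already have an entry in the two contribution tables.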

18.6 Structure–Activity Relationships of Molecules in Space

As was shown in the previous section, an attempt is made to correlate structure–activity
relationships with substance-specific parameters. These parameters, for
example, volume, polarizability, or lipophilicity are properties that are calculated
or measured for the entire molecule or for specific groups of substituents. The 3D
structure of the molecules is only conditionally considered by these descriptors.
Therefore in the context of increasing knowledge of the spatial structure of protein–
ligand complexes, the QSAR methods focus on parameters that can be derived from
the 3D structure. As a general rule the goal of these approaches is to calculate
binding affinity. The techniques can also be applied for the description of
other biological properties such as the bioavailability or the metabolic reactivity
(▶ Chap. 19, “From In Vitro to In Vivo: Optimization of ADME and Toxicology
Properties”). To distinguish them from the above-described classical QSAR tech-
niques, these are referred to as 3D-QSAR methods.
Ideally, parameters are desired that can be read directly from the 3D structure of
an active substance and that can be used to draw conclusions about their binding
affinity. The interplay between these parameters and the activity is, however, very
complex and even today is by no means fully understood. Furthermore there are
still many other biological systems on which one would like to apply 3D-QSAR
methods, but the structures of the relevant target proteins are unknown. Many
pharmacologically relevant receptors are membrane bound, and their structure
determination has proven to be extremely difficult. The knowledge of their struc-
tures is, however, a prerequisite for a reasonable estimation of the binding affinity
of a ligand from the geometry of the formed complex (▶ Chap. 4, “Protein–Ligand
Interactions as the Basis for Drug Action”). As a consequence, no attempt is made
to calculate the absolute values of the binding affinities from these incomplete
data. Instead, the focus is on the relative affinity differences between active substances
in a data set. The gradual changes in the substance-specific parameters are
then correlated with the biological data.

18.7 Structural Alignment as a Prerequisite for the Relative Comparison of Molecules

Assumptions about the spatial structure of molecules are already considered in
classical QSAR techniques. Different positions of substituents, for example, in the
meta or para position of an aromatic ring, are often described by individual
parameters. In this form they are regarded in the Hansch equation as well as the
Free–Wilson analysis (Sect. 18.5). Moreover, indicator variables for different
configurations of substituents, for example, the configuration of stereoisomers,
are defined in classical QSAR models. An analogous orientation of the molecule
in a hypothetical binding pocket is assumed for the use of these parameters. For
example, it is assumed that all ortho substituents are oriented toward the “same
side” in a series of ortho-substituted derivatives. Structure–activity
relationships that correlate the biological activity with properties of the 3D structure
require, as a prerequisite, a spatial superposition of the active substances. This superposition should
approximate the relative orientation in the binding pocket as accurately as possible.
A technique was discussed in ▶ Chap. 17, “Pharmacophore Hypotheses and Molec-
ular Comparisons” that can be used for the calculation of these spatial
superpositions.

18.8 Binding Affinities as Compound Properties

Which substance-specific characteristics can be used to correlate the properties of
the 3D structure with the binding affinity? As was discussed in ▶ Chap. 4, “Protein–
Ligand Interactions as the Basis for Drug Action,” the binding affinity is composed
of enthalpic and entropic components. The first contribution comprises everything
that depends on the direct energetic interaction. These are predominantly of a steric
(van der Waals potentials, ▶ Sect. 15.4) or electrostatic (Coulomb potentials)
nature. The second contribution concentrates on the degree of ordering and the
distribution of the energy over the different degrees of freedom of the studied
system. The ligands as well as the binding pockets of a protein are solvated by water
molecules in the uncomplexed state. Upon complex formation, the enthalpic inter-
actions to these water molecules are lost. They are replaced by direct interactions
between ligand and protein. Because only the relative differences between mole-
cules of a data set are of interest, any effects that are the same for all derivatives are
ignored. Among these effects are practically all influences that affect the protein.
This omission is certainly a rough simplification because the protein changes its
solvation state upon ligand binding. Water molecules are displaced from the
binding site. Ligand-induced adaptations of side chains in the binding pocket or
changes in the rotational degrees of freedom of methyl groups and side chains
(▶ Sect. 4.10) are imaginable. These effects are either not considered or are
accepted as being the same for all molecules in the data set. Presumably this
assumption is valid for many cases. Nonetheless, many new investigations clearly
show that changes affecting the protein or the dynamics of the ligand are often not
constant within a series of compounds. Here the methods will fail.
In the beginning only the steric and electrostatic interactions of an active
substance in the binding pocket should be taken into consideration. How can these
properties be compared for a series of ligands? A first approach to this was the
hypothetical interaction models from Hans-Dieter Höltje and Lemont B. Kier. The
decisive prerequisite of the latter models was the choice of and spatial positioning
for amino acid side chains around the ligands. These assumptions can be dropped
once the molecules are embedded in a lattice and can be explored with an interac-
tion probe. Richard Cramer and M. Milne proposed such a model in 1978. It took
another 10 years until the generally applicable CoMFA method (Comparative
Molecular Field Analysis) was established. Despite many theoretical and practical
deficiencies in its application, the method was quickly accepted. Today it is
applied in many different variations.
Before such an analysis can be practically carried out, a few basic consider-
ations should be made. Do steric and electrostatic interactions consider all
contributions to ligand binding that lead to a correct relative ranking of binding
affinity? As already mentioned, the binding affinity is composed of enthalpic and
entropic contributions. A sampling of the properties via probes to map interac-
tions certainly affords a measure for how well a molecule can undergo energet-
ically favorable interactions. How well are the entropic contributions considered?
A considerable portion is made up of solvation and desolvation processes
(▶ Sect. 4.6). These processes change the local water structure around the ligand
and in the binding pocket. The water structure in the immediate vicinity of the
hydrophobic surfaces of the ligand is more ordered in the solvated state than it is
in bulk water. The transition of such ligands out of the bulk water into the
protein’s binding pocket immediately causes a certain number of water molecules
to adopt a less-ordered state. This increases the entropy of the system and pro-
motes spontaneity in the binding process. The number of water molecules that are
involved in this process depends on the size of the hydrophobic surface of
the ligand. Furthermore the displacement of the water molecules from the
binding pocket upon ligand binding increases the disorder of the examined
system and also increases its entropy. In the above-mentioned approximation it
is assumed that this water-related effect is the same for all molecules in the data
set. Therefore, it is not considered in a relative comparison. Additionally,
a molecule can move “freely” in an aqueous solution and adopt different confor-
mations. In the binding pocket, however, it is fixed predominantly in one partic-
ular conformation. Rotational, translational, and conformational degrees of
freedom are lost, and the system loses entropy. All of these influences are to be
taken into consideration for the correct treatment of affinities.
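The enthalpic and entropic contributions discussed above add up to the free energy of binding, which is linked to a measured inhibition constant by the standard thermodynamic relation ΔG = ΔH − TΔS = RT ln Ki. The following minimal sketch converts pKi values into free energies. It assumes a temperature of 298.15 K and Ki given in mol/L relative to a 1 M standard state; the function name is purely illustrative.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)

def delta_g_from_pki(pki, temperature=298.15):
    """Free energy of binding in kJ/mol from pKi = -lg(Ki), Ki in mol/L.

    Uses dG = R*T*ln(Ki) = -R*T*ln(10)*pKi. The temperature is an
    assumption of this sketch, not a value taken from the text.
    """
    return -R * temperature * math.log(10.0) * pki / 1000.0

# Free-energy equivalent of one order of magnitude in Ki.
per_decade = -delta_g_from_pki(1.0)
# Free-energy range covered by a spread of three orders of magnitude.
span_three_decades = 3 * per_decade
print(round(per_decade, 2), round(span_three_decades, 2))
```

At room temperature one order of magnitude in Ki corresponds to somewhat less than 6 kJ/mol, so even large affinity differences within a data set represent fairly small free-energy differences.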

18.9 How Is a CoMFA Analysis Performed?

The most important and most often used method for 3D structure–activity analysis
is the CoMFA method. The execution of a CoMFA study first requires the choice of
a data set of suitable compounds. This data set should encompass around 50–100
compounds with related overall geometry. It should also be ensured that all sub-
stances bind to the same protein at the same site, and that a binding affinity is
known for all of them. The ligands must possess a given diversity with regard to
their structural variation. Their binding affinities should scatter over at least three
orders of magnitude. Conformations are generated for all of the molecules
(▶ Chap. 16, “Conformational Analysis”) and are superimposed by using one of
the techniques discussed in ▶ Chap. 17, “Pharmacophore Hypotheses and Molec-
ular Comparisons”. As a general rule, the spatial structure of the protein, if known,
is taken, and the ligands are mutually aligned in the binding pocket. Finally the
superimposed molecules are embedded in a lattice (Fig. 18.4) that encloses them
by a broad margin. The intersections of the lattice should show a grid spacing of 1 or
2 Å. A probe, that is, an atom with the properties of hydrogen, carbon, or oxygen, or a
particle with a formal charge, is placed at each of the grid points. The interaction
energies are calculated between this probe and each molecule in the data set. The
collective interaction contributions on the grid are referred to as the interaction field
of the molecule. This also gave rise to the name of the method. Finally the fields of the
molecules in the data set are compared with one another. If the box size is 10–20 Å,
and a grid spacing of 1–2 Å is applied, there are many thousands of field values per
molecule of the data set to be handled. This huge amount of data means that the
evaluation of the fields can be computationally very intensive.
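The number of field values mentioned above follows directly from the box size and the grid spacing. The sketch below builds a regular lattice around a set of atom coordinates and counts its points. The function name, the margin of 4 Å, and the two atom positions are assumptions chosen only for illustration.

```python
import math

def build_grid(coords, margin=4.0, spacing=2.0):
    """Return all lattice points of a box that encloses the atoms by `margin` (in Å)."""
    axes = []
    for dim in range(3):
        low = min(c[dim] for c in coords) - margin
        high = max(c[dim] for c in coords) + margin
        n_steps = math.ceil((high - low) / spacing)
        axes.append([low + i * spacing for i in range(n_steps + 1)])
    return [(x, y, z) for x in axes[0] for y in axes[1] for z in axes[2]]

# Hypothetical superimposed ligands spanning a 12 Å cube; with the margin
# the box edge becomes 20 Å.
atoms = [(0.0, 0.0, 0.0), (12.0, 12.0, 12.0)]
coarse = build_grid(atoms, spacing=2.0)  # 11 points per axis
fine = build_grid(atoms, spacing=1.0)    # 21 points per axis
print(len(coarse), len(fine))
```

Each grid point contributes one value per field, so a steric and an electrostatic field together double these counts for every molecule in the data set.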

18.10 Molecular Fields as Criteria of a Comparative Analysis

Steric and electrostatic interactions are described by a Lennard-Jones or a Coulomb
potential (Fig. 18.5) in force fields (▶ Sect. 15.4). If the distance between a probe
and an atom of the molecule approaches zero, the Lennard-Jones and Coulomb
potentials diverge. For like-charged particles the Coulomb potential approaches
positive infinity; for oppositely charged particles, negative infinity. These
values reach extremely high field contributions at the grid points that fall near the
surface or lie inside a molecule. They must be avoided in a CoMFA analysis.
Therefore the field contributions above and below a particular threshold are set
to a predefined cut-off value. According to these procedures, a Lennard-Jones or
a Coulomb potential can be calculated. Aliphatic carbon atoms, for example, can be
used as probes. These probes are given a positive or negative charge to study the
electrostatic properties of the molecules. The program GRID of Peter Goodford was
introduced in ▶ Sect. 17.10. Molecular fields can be calculated with this program
for numerous probes that describe different functional groups. For each predefined
probe there are areas in space at which favorable or unfavorable interactions
between the probe and the examined molecule are to be expected.
Moreover, other fields can also be defined aside from fields that probe the steric
and electrostatic properties of molecules. Further above in ▶ Sect. 18.8, it was
discussed that the hydrophobic surface of a molecule represents a measure for the
entropic contribution, particularly upon transfer from the bulk water phase. Molec-
ular fields were developed in the group of Donald Abraham that allow the hydro-
phobic properties of molecules to be explored (program HINT). These are
calculated by using a very similar distance-dependent function. The resulting
molecular field describes the lipophilicity distribution on the surface of a molecule.
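The truncation of the steric and electrostatic field values described at the beginning of this section can be sketched as follows. This is a minimal illustration, not the implementation of any particular program. The well depth, the optimal distance, the cut-off of 125 kJ/mol, and the example atom are assumed values; the Coulomb prefactor converts charges in units of e and distances in Å into kJ/mol.

```python
import math

COULOMB_SCALE = 1389.35  # kJ*Å/(mol*e^2), vacuum, no distance-dependent dielectric

def lennard_jones(r, epsilon=0.4, r_min=3.4):
    """12-6 Lennard-Jones energy; epsilon and r_min are illustrative parameters."""
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)

def coulomb(r, q_probe, q_atom):
    """Coulomb energy between two point charges."""
    return COULOMB_SCALE * q_probe * q_atom / r

def truncate(energy, cutoff=125.0):
    """Clamp a field value to the interval [-cutoff, +cutoff]."""
    return max(-cutoff, min(cutoff, energy))

def fields_at(point, atoms, q_probe=1.0, cutoff=125.0):
    """Return (steric, electrostatic) field values at one grid point.

    atoms is a list of (x, y, z, partial_charge) tuples.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge in atoms:
        # Avoid division by zero when a grid point coincides with an atom.
        r = max(math.dist(point, (x, y, z)), 1e-6)
        steric += lennard_jones(r)
        electrostatic += coulomb(r, q_probe, charge)
    return truncate(steric, cutoff), truncate(electrostatic, cutoff)

atoms = [(0.0, 0.0, 0.0, -0.5)]  # one hypothetical atom with a partial charge
print(fields_at((0.0, 0.0, 0.0), atoms))   # both values hit the cut-off
print(fields_at((10.0, 0.0, 0.0), atoms))  # far away: small steric, sizeable Coulomb
```

The second call illustrates the different range of the two potentials: at 10 Å the steric term has practically vanished, whereas the electrostatic term is still substantial.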
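[Figure 18.4: a ligand embedded in a color-coded grid, together with a data table and the resulting model equation.]

Table layout: one row per compound, with the columns Comp., –lg(Ki), S1, S2, S3, ..., Sn, E1, E2, E3, ..., En. Example entries in the –lg(Ki) column: 4.15, 5.74, ..., 3.89, 8.83, 6.74.

–lg(Ki) = y + a S1 + b S2 + c S3 + ... + h Sn + k E1 + m E2 + n E3 + ... + z En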

Fig. 18.4 A grid is generated for the calculation of molecular fields that broadly encompasses
a molecule. The grid points are color-coded, with increasing distance from the ligand (red <
yellow < green < blue < gray). The contributions from the chosen fields are calculated at points of
the lattice, which have a grid spacing of 1–2 Å. The field contributions at each point in the grid (S1,
S2,. . .Sn, E1, E2, . . . En) are written into a table. The analysis is carried out for all molecules in the
data set. The binding affinities are incorporated into the table as, for instance, –log (Ki). The field
contributions are weighted with appropriate coefficients (a, b, . . .z) and using a special statistical
method, the PLS analysis, they are related to the affinity. A model is obtained in the form of an
equation that indicates at which grid points and with what weight the different field contributions
explain the biological activity.
[Figure 18.5: plot of the potential energy E(r) against the distance r, showing the Lennard-Jones potential, the Coulomb potential for like and for opposite charges, the cut-off values, and a Gaussian curve.]

Fig. 18.5 The Lennard-Jones potential (green) is a model for describing the intermolecular
interactions of two atoms without considering their charge. Negative potential values correspond
to mutual attraction, positive values to repulsion of the particles. As the mutual distance
becomes infinite, the potential approaches zero. Upon approach it goes through a shallow minimum
due to mutual polarization. At even shorter distances it rises very steeply toward positive infinity
because of atom–atom repulsion. The Coulomb potential (blue) considers only electrostatic inter-
actions between charges that formally reside as point charges on the atomic nuclei. For like-charged
particles it also approaches infinity as the distance approaches zero. For oppositely charged atoms,
negatively infinite values result. The hyperbolic form of the Coulomb potential is considerably less
steep, so that the particles can still “feel” one another at larger distances. Cut-off values are set
for the potentials in a CoMFA analysis. A Gaussian function, which takes the course of a bell-shaped
curve (here only the right half of the “bell” is shown), describes the distance dependence of the
interaction potential between the particles in the context of the CoMSIA model. As the distance
between the particles approaches zero, the curve reaches its maximum value, which remains finite.

18.11 3D-QSAR: Correlation of Molecular Fields with Biological Properties

Let us assume that multiple molecular fields for each molecule in a data set have
been calculated, and a correlation of their differences with the binding affinity is
attempted. How are these differences expressed? For this we want to consider three
hypothetical examples of substituted phenyl derivatives.
• First, suppose that the substituents on the phenyl ring in a compound series are
varied so that increasingly large field contributions result in the vicinity of the
substituent when it is scanned with a positively charged probe. If the binding
affinities increase in the same way as the field contributions become larger, this will
be reflected in the quantitative analysis. It means that derivatives with increasingly
positively charged groups in this molecular region are more potent substances.
• The second example is set up somewhat differently. Now the phenyl
ring substituents are given positive or negative partial charges. Their variation
has no influence on the potency of the substances. The quantitative analysis
shows that the changes in the electrostatic field contributions have no correlation
with the biological activity. A possible explanation might be that this effect and
another property, for example, the size of the substituents, mutually cancel their
influences. It could also be that the biological activity is influenced through other
qualities of the substituents, for instance, their hydrophobic character.
• In the third case, the electrostatic properties of the substituents that are important
for binding to the receptor are hardly varied at all at the
examined position. There might be different substituents present; however,
they all have comparable partial charges. The model that analyzes the field
contributions in the vicinity of these groups does not recognize differences and
therefore also does not correlate with the binding affinity. It can indeed be that
a class of substituents at a particular position on a molecular scaffold is actually
very important for binding but nonetheless remains insignificant in the anal-
ysis. This has to do with the fact that a QSAR analysis only performs relative
comparisons within a data set.
These examples are still easily manageable. The question can be posed whether
a tedious correlation method with the “detour” via molecular fields is really needed.
The situation is more complicated in practice, above all if molecules with different
scaffolds are considered. The substituents do not fall exactly on top of one another
in the molecular superposition. Their contribution must be described as a field in
space, and only as such can they be evaluated. At any rate, these examples under-
score the importance of careful planning of the analysis. The structures in the data
set must be chosen so that they have the largest possible variation of substituents
and their properties.
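The three hypothetical cases above can be mimicked with a single field column and a simple correlation coefficient. The sketch below uses the Pearson coefficient as a stand-in for the full statistical analysis, and all field values and affinities are invented for illustration.

```python
import math

def pearson(values, affinities):
    """Pearson correlation, or None if one of the series does not vary."""
    n = len(values)
    mean_v = sum(values) / n
    mean_a = sum(affinities) / n
    cov = sum((v - mean_v) * (a - mean_a) for v, a in zip(values, affinities))
    var_v = sum((v - mean_v) ** 2 for v in values)
    var_a = sum((a - mean_a) ** 2 for a in affinities)
    if var_v < 1e-12 or var_a < 1e-12:
        return None  # no variation, hence nothing to correlate
    return cov / math.sqrt(var_v * var_a)

pki = [5.0, 6.0, 7.0, 8.0]                      # hypothetical affinities
case1 = pearson([0.1, 0.2, 0.3, 0.4], pki)      # field grows with affinity
case2 = pearson([0.3, -0.3, -0.3, 0.3], pki)    # field varies, affinity unrelated
case3 = pearson([0.25, 0.25, 0.25, 0.25], pki)  # field is not varied at all
print(case1, case2, case3)
```

In the third case the coefficient is undefined rather than small: a property that is never varied in the data set cannot be recognized by the model, however important it may be for binding.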

18.12 Graphical Interpretation of the Results of a Comparative Molecular Field Analysis

If the full complexity of the field contributions is considered in terms of
a multidimensional matrix, a straightforward regression analysis cannot be applied
to extract the interdependence of the variables, for example, the binding affinity.
PLS analysis (partial least squares) is a statistical method that extracts relevant
and explanatory factors, so-called PLS vectors, out of the large quantities of data. In
CoMFA analysis these vectors describe the area of the fields that correlate best with
the experimentally determined affinity. The result is an equation that is analogous to
the results of the classical QSAR methods. It shows to what extent particular grid
points in the individual fields contribute to the binding affinities. Depending on how
many field points there are to be evaluated in the analysis, a strict monitoring of the
statistical significance of the derived results must be undertaken. This significance
is checked by a particular test: the crossvalidation.
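How PLS vectors are extracted from a data table can be illustrated with a compact version of the NIPALS algorithm for a single dependent variable (often called PLS1). This is a minimal sketch in plain Python under simplifying assumptions: no scaling of the columns, no missing values, and a tiny invented data table in place of thousands of field values. The function names are illustrative.

```python
def pls1_fit(X, y, n_components):
    """Fit a PLS1 model; X is a list of descriptor rows, y the affinities."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered table
    f = [v - y_mean for v in y]                                # centered affinities
    components = []
    for _ in range(n_components):
        # Weight vector: direction in descriptor space that covaries most with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflation: remove what this component already explains.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the affinity of one new descriptor row."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat

# Invented table in which the affinity is exactly 2*x1 - x2 + 3.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [5.0, 2.0, 4.0, 6.0]
model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [3.0, 2.0])
print(prediction)
```

In a real field analysis the number of components is kept far smaller than the number of columns, and it is chosen by monitoring the statistical significance of the model.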
One or more compounds are randomly extracted from the data set. A model is
constructed with the remaining derivatives and the affinities of the removed com-
pounds are predicted with this model. The removal of compounds is repeated
several times, in the simplest case, so often until all substances have been removed
one time. The quality of the prediction represents a measure for the reliability and
significance of the model. The achieved result is expressed with the q² value, which
is calculated from the squared deviations between the predicted and the measured values. It takes
on values from −1 to +1. A value of +1 indicates that a perfect model was
achieved: all predictions agree exactly with the measured binding affinities and
there is no deviation. A value of q² = 0 indicates that the predictions of the
model are no better than no model at all; it is just as good as the average of all
affinities. If q² takes on negative values, the model is worse than the average, that is,
worse than no model. A model is therefore only to be trusted when the q² value lies
above 0.4–0.5.
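The leave-one-out variant of this test can be written down in a few lines. The sketch below uses the common definition q² = 1 − PRESS/SS, where PRESS is the sum of squared prediction errors and SS the sum of squared deviations of the measured values from their mean. To stay self-contained it replaces the PLS model with a one-descriptor straight-line fit, and the data are invented.

```python
def fit_line(x, y):
    """Least-squares straight line through one descriptor; returns a predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return lambda value: my + slope * (value - mx)

def fit_mean(x, y):
    """'No model': always predict the mean affinity of the training compounds."""
    mean = sum(y) / len(y)
    return lambda value: mean

def q2_leave_one_out(x, y, fit):
    """Remove each compound once, refit, and predict the removed compound."""
    press = 0.0
    for i in range(len(y)):
        model = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - model(x[i])) ** 2
    y_mean = sum(y) / len(y)
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss

descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical field value
pki = [4.1, 5.2, 5.9, 7.1, 7.9, 9.0]         # hypothetical affinities
q2_line = q2_leave_one_out(descriptor, pki, fit_line)
q2_mean = q2_leave_one_out(descriptor, pki, fit_mean)
print(round(q2_line, 3), round(q2_mean, 3))
```

The mean-only predictor ends up slightly below zero because the removed compound is also missing from the mean that predicts it, which illustrates what a negative q² signifies.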
Another step must be performed to check the predictive value of a trained model.
For this, a test data set of molecules is needed that are similar to the molecules in the
training data set, but that were not used for the training. The binding affinities are
predicted for these molecules. It is only if the correlation coefficient for this set is of
similar size to that of the training set that the model possesses adequate predictive
power.
The derived model can be used to estimate the affinity of new compounds
that have not yet been synthesized. The conformations of these compounds
are calculated and superimposed on the other structures. They must fall within
the grid that was defined in the training set. Next their field contributions are
calculated. By using the correlation derived by CoMFA for the training set, it is
possible to compute which grid points are predictive with respect to the binding
affinity of new compounds.
CoMFA techniques establish a correlation between activity data and molecular
properties. A model can be derived that encompasses the properties of new mole-
cules, from the relative comparison within a training set. Relevant predictions are
only to be expected when the structural variations in the new molecule remain
within the scope of the model. In other words, the model cannot make predictions
about the influence of substituents that occur in areas in which there were no
structural variations in the training set. CoMFA models interpolate between field
contributions from molecules. An extrapolation to areas that were not covered by
the data set is not possible.
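Whether a new molecule stays within the scope of a model can be checked in a crude way by comparing its field values with the range covered by the training set. The following sketch is only a simple heuristic added for illustration; it is not part of the CoMFA method itself, and the field values are invented.

```python
def outside_training_range(training_fields, new_fields, tolerance=0.0):
    """Return the indices of grid points at which the new molecule's field
    value lies outside the interval spanned by the training molecules."""
    flagged = []
    for k, value in enumerate(new_fields):
        column = [row[k] for row in training_fields]
        if value < min(column) - tolerance or value > max(column) + tolerance:
            flagged.append(k)
    return flagged

# Three training molecules and three grid points. The first grid point was
# never occupied in the training set (all field values are zero).
training = [[0.0, 1.2, -0.5],
            [0.0, 2.0, -0.1],
            [0.0, 0.4, -0.9]]
candidate = [3.5, 1.0, -0.3]  # places a substituent at the unexplored point
flagged = outside_training_range(training, candidate)
print(flagged)
```

A prediction for the candidate would have to extrapolate at the flagged grid point, so it should be treated with caution.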
The results of a CoMFA analysis can be graphically evaluated. From the model
it is known at which grid points field contributions are obtained that contribute
significantly to explain the binding affinity. These contributions can be contoured
for the different fields according to their importance. They indicate volume areas
around the molecules in which changes in the field contributions run parallel or
opposite to the affinity changes in the data set. These contour maps significantly
support the design of new active substances (Sect. 18.14). They indicate the
position at which the properties of a lead structure have to be varied so that an
increase in affinity can be achieved.
18.13 Scope, Limitations, and Possible Expansions of the CoMFA Analysis

Usually only steric and electrostatic field contributions are evaluated in CoMFA
analyses. A hydrophobic field can quantify the size of the hydrophobic surfaces and
therefore partially considers the entropic contribution to affinity. Because CoMFA
evaluations yield relevant models without the explicit use of hydrophobic fields,
these field contributions must be at least partially contained in Lennard-Jones and
Coulomb fields. The lipophilicity of a molecule increases upon enlarging an
uncharged, sterically demanding group, for instance, from methyl to butyl. Here
the changes in the steric field contributions can correctly reflect the lipophilic
surface. A correlation with electrostatic properties is also imaginable. Hydrophobic
molecular portions carry, as a general rule, only minor partial charges. Positively or
negatively charged groups represent hydrophilic regions. In this way the lipophilic
and hydrophilic surface regions can be quantified via differences in the charge.
The deviation that is not explained by a CoMFA model comprises, apart from
experimental errors, also all inadequately described binding contributions. These
include structural adaptations of the protein that are not identical for all compounds
in the data set. Entropic contributions that come from the conformational fixation of
the active substance in the binding pocket or the residual mobility of the ligand in
the binding pocket are also not considered in any of the fields.
In addition to these inadequacies, the fields themselves cause a few problems. Owing
to their functional form, very large positive or negative values are
reached at the surface or in the interior of the molecule (Fig. 18.5). Because the
Lennard-Jones potential increases faster upon approaching the atoms than the Cou-
lomb potential does, the two reach their arbitrarily set cut-off values (Sect. 18.10) at different
distances from the molecule. Within a distance of 2 Å, which is the commonly chosen
grid spacing, the extremely steep Lennard-Jones potential can change from practically
zero to the cut-off value. These discontinuities and the neglected areas near the surface
can cause significant problems for the interpretation. Furthermore, they often cause
fragmented contour maps in the individual fields that are difficult to interpret.
The deficits in these fields have stimulated the search for other solutions. In one
method the similarity of molecules is investigated by use of their steric and
physicochemical properties in space and correlated to the binding affinity
(CoMSIA methods; Comparative Molecular Similarity Indices Analysis). The
molecules are superimposed just as they are in the CoMFA methods. Then their
relative similarity is determined through their relationship to a probe, a carbon atom
for instance, in that the similarity of each molecule is sampled with a probe at the
intersections of a surrounding grid. The measure of similarity between the probe
and the molecule is defined in a distance-dependent way. A Gaussian function
(Fig. 18.5) is chosen for this purpose. In contrast to the hyperbolic form of the
above-described potentials, the bell-shaped Gaussian curve approaches a finite value
instead of infinity as the distance decreases. Cut-off values need not be set. For many
different properties a similarity is determined at all grid points. The prerequisite is
that the properties must be described by atom-based values, for example, partial
charges or atomic volumes. The same distance dependency is used for all proper-
ties. Property-specific similarity fields are obtained. These are correlated with the
binding affinity. The interpretation of the field contributions is achieved analo-
gously to the CoMFA method. The advantage of this method lies, above all, in the
interpretability and the preserved contour maps. If a particular property in an area of
the superimposed molecules correlates significantly with binding affinity, this area
is enhanced. In contrast, the CoMFA method contours areas outside of the mole-
cules, where a property reveals changes in the field contributions that affect the
affinity positively or negatively. The setting of cut-off values, however, masks
entire areas of these field contributions near the surface (Fig. 18.5).
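The Gaussian distance dependence of the similarity fields can be sketched as a sum over atom-based property values. The attenuation factor alpha, the probe value, and the sign convention (a plain sum) are assumptions of this illustration, as are the example atoms.

```python
import math

def similarity_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Gaussian-weighted sum of atomic property values at one grid point.

    atoms is a list of ((x, y, z), property_value) pairs, for example
    partial charges or atomic volumes.
    """
    total = 0.0
    for position, value in atoms:
        r_squared = sum((g - a) ** 2 for g, a in zip(grid_point, position))
        total += probe_value * value * math.exp(-alpha * r_squared)
    return total

atoms = [((0.0, 0.0, 0.0), 0.5)]  # one hypothetical atom with property value 0.5
on_atom = similarity_index((0.0, 0.0, 0.0), atoms)  # distance zero: finite maximum
nearby = similarity_index((2.0, 0.0, 0.0), atoms)   # 2 Å away: smoothly attenuated
print(on_atom, nearby)
```

Because the function stays finite at zero distance, grid points inside or at the surface of the molecules can be evaluated without any cut-off.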
3D-QSAR analyses were first meant to establish structure–activity relationships in
cases when the target protein’s structure was unavailable as a reference. Nowadays,
more and more crystal structures of target proteins are becoming available, so the
technique is increasingly used in cases in which this reference is actually known. The
protein structure then serves to generate reasonable and relevant superpositions of the
substances to be compared in their biologically active conformations. It seems all the
more paradoxical to use the information about the surrounding protein environment
only to superimpose the molecules and then to relinquish this valuable data in the
comparative field analysis. Methods have been developed that consider this informa-
tion. The group of Rebecca Wade at EMBL in Heidelberg has developed the
COMBINE method. For this, a set of modeled protein–ligand complexes is used
to calculate a data table. It contains the interaction energies between individual ligand
atoms in the test molecules of the data set and the amino acid residues and water
molecules in the surrounding protein. The interpretation of this enormous data table is
achieved by using a technique that is similar to the CoMFA methods. The graphical
interpretation of the correlation model obtained by COMBINE indicates which
regions of the protein account for decisive contributions to explain the affinity
differences in the ligand data set. These are very valuable details, but they only
help a little for the design of better molecules that achieve higher affinity.
Holger Gohlke in Marburg developed the variation AFMoC (Adaptation of
Fields for Molecular Comparison), with which it is possible to transfer information
about the protein environment into the field-based model. The advantage of the
intuitive interpretation of the field contributions with regard to the structural
optimization of the ligands is not lost. For this, atomic probes are placed at each
point of a CoMFA-like grid and scored with the empirical scoring function DrugScore
(▶ Sect. 17.10). The resulting values reflect the protein environment; the grid is
thereby “prepolarized.” By using a docking and
superposition technique, the ligands of the training set are then placed onto this
grid. Only when an atom of the ligand falls upon an area of the grid for which
the protein environment has predicted this atom type as advantageous is the field
contribution enhanced. In other cases the interaction contribution on the grid is
reduced. In this way a data table is generated for the entire training set analogously
to a CoMFA method. This table is accordingly evaluated and affords a QSAR
equation. The individual contributions can be shown on a grid. They indicate where
particular atom types increase or reduce affinity.
A similar field analysis is also used for the correlation and prediction of
selectivity differences between ligands. Many enzymes occur as isoforms. They
therefore have similarities in their binding pockets. As a consequence ligands show
graduated affinities or “selectivity profiles” to these isoforms. If a ligand is to be
optimized to improve selectivity, the positions at which a change in a property
results in an improved profile must be known. A 3D-QSAR model is constructed for
each isoenzyme. Either the difference in the affinity values can be calculated and
used for the model as values to be predicted, or alternatively, two correlation
models can be constructed and at each grid point the field contributions are
subtracted from one another. The models that are obtained with both approaches
can be graphically interpreted. Contour diagrams show where and how the mole-
cules are to be changed to improve their selectivity with regard to the one or other
isoenzyme.

18.14 A Glimpse Behind the Scenes: Comparative Molecular Field Analysis of Carbonic Anhydrase Inhibitors

Today comparative field analyses belong to the standard repertoire in drug research.
As an example, the binding of inhibitors to carbonic anhydrase I and II shall be
examined. The biological function of this enzyme is described in detail in ▶ Sect.
25.7. The sequence identity of the isoforms is 60%. The ligands in the training data
set are derived from the parent structures shown in Fig. 18.6. First, a superposition
model is generated by docking the ligands into the protein (Fig. 18.7). The
enzyme’s funnel-shaped binding pocket is occupied by ligands in a large variety
of ways. A good correlation model is obtained with the three methods, CoMFA,
CoMSIA, and AFMoC. The models also achieve a convincing predictive power on
a test data set that was independent from the training set.

[Figure 18.6: parent structures of the inhibitors with the variable substituents R1 and R2: thiadiazolsulfonamide, thienothiopyransulfonamide, benzothiazolsulfonamide, phenylsulfonamide, hydroxamate, and hydroxysulfonamide.]

Fig. 18.6 The scaffolds of inhibitors that were used in different field analyses to establish affinity
(pKi[CAII]) and selectivity models (pKi[CAII] – pKi[CAI] = ΔpKi[CAII – CAI]) to describe the
inhibition of the carbonic anhydrases CAI and CAII. Different substituents were varied at the positions
that are marked as R1 and R2.
Fig. 18.7 The superposition of inhibitors from the data set in the funnel-shaped binding pocket of
CAII; the zinc ion is shown as the blue-gray sphere, carbon is light-yellow, oxygen is red, nitrogen
is blue, sulfur is orange, and hydrogen is white.

The contours for the acceptor properties with regard to the inhibition of carbonic
anhydrase II are shown in Fig. 18.8. Molecules in the data set that exhibit an
acceptor function in the areas marked in red have lower potency. On the other
hand, an acceptor function in the blue area improves potency. Compound 18.2,
which has both acceptor functions of an SO2 group oriented in the detrimental red
area, is a weak CAII inhibitor. Moreover its NH group is in the blue region, which
should be occupied by an acceptor. Compound 18.3, which is about four orders of
magnitude more potent, leaves the area that was occupied by an oxygen atom in
18.2 empty, and orients its thiadiazole ring in the direction of the desirable acceptor
function. It achieves considerably better inhibition of the target enzyme.
Just as for the acceptor properties, contour maps can be generated for steric,
electrostatic, hydrophobic, and hydrogen-bond-donor properties. Their evaluation
[Figure 18.8: contour map for H-bond acceptor properties around the zinc ion (Zn2+), shown with the structures of the inhibitors 18.2 (CA II pKi = 4.7) and 18.3 (CA II pKi = 8.7).]

Fig. 18.8 Contour map for the description of the binding contributions of H-bond acceptor
properties. Inhibitors that occupy the red contour areas with H-bond acceptor groups do not inhibit
CAII well; the occupancy of the blue areas with acceptor groups, however, leads to increasing
pKi values. Both oxygen atoms of the sulfonamide group of 18.2 occupy the red-contoured area, which
is unfavorable for acceptor properties. On the other hand, 18.3 leaves these areas unoccupied and
places its basic nitrogen in the vicinity of the blue-contoured region, which is favorable for
occupancy by acceptor groups. This explains the markedly better inhibition of CAII by 18.3.

helps to make evident where particular properties improve or lower the binding
affinity. Such correlation analyses help the synthetic chemist to plan the optimiza-
tion of lead structures in a tailored way.
Contour maps for steric properties that cause a selectivity difference between
CAI and CAII are shown in Fig. 18.9. Occupancy of the green areas with an
inhibitor improves the selectivity for CAI. On the other hand, spatially filling the
yellow-colored regions improves the selectivity for CAII. Compound 18.4 binds
unselectively with the same affinity to both isoforms, but 18.5 can clearly discrim-
inate between the two. The shown model is purely derived from the correlation of
ligand binding data. The relative alignment of the molecules in the data set is
accomplished in the binding pocket of the protein. Therefore the protein environ-
ment around this binding pocket should be examined more closely, to see if the
derived contours are reasonable. If the amino acid replacement between the two
isoforms is compared, it is apparent that CAI has two large residues Phe91 and
Leu131 that constrain the lower left portion of the binding pocket. The inhibitors
have less room in CAI than they do in CAII. In fact the comparative field analysis
[Figure 18.9: steric selectivity contour maps in the binding pocket (panels “CAI selective” and “CAII selective”), with the residue pairs His200/Thr200, Phe91/Ile91, Tyr204/Leu204, and Leu131/Phe131 labeled. Compound 18.4: CAI pKi = 8.15, CAII pKi = 8.10. Compound 18.5: CAI pKi = 6.70, CAII pKi = 9.40.]

Fig. 18.9 The selectivity can be improved with regard to CAII inhibition by sterically filling
the yellow-contoured area. Filling the green area with a sterically demanding group causes an
increase in selectivity with regard to CAI (top left). Compound 18.4 occupies virtually no area
that is particularly selectivity discriminating; the compound is not isoenzyme specific (top left and
top right). On the other hand, 18.5 occupies a yellow-contoured area neighboring position 204,
which causes a selectivity enhancement for CAII. Compound 18.5 inhibits CAII decidedly more
potently than CAI.

in this region generates a yellow contour (near position 91), the occupancy of which
should be favorable for the inhibition of CAII. CAII also makes a large amount of
space available for inhibitors next to position 204, which is occupied by the less-
crowding Leu204 instead of Tyr204. A yellow contour is seen that indicates
a favorable occupancy of this area. Inhibitor 18.5, which is considerably more
potent on CAII, orients its pentafluorophenyl group exactly in this region (Fig. 18.9,
right). In the vicinity of position 131 (Leu131/Phe131) a yellow and a green area
occur directly next to one another but spatially separated, the occupancy of which is
favorable for either CAI or CAII inhibitors, respectively. Compound 18.4, which
can hardly distinguish between the two isoforms, occupies the upper edge of both
areas equally well. Moreover it leaves virtually all regions unoccupied that should
lead to a better inhibition of either CAI or CAII for steric reasons. Therefore it is
evident why this compound shows no particular selectivity.

[Figure 18.10: panels "CAI" and "CAI selective" (top), "CAII" and "CAII selective" (bottom).
Compound 18.6: CAI pKi = 4.30, CAII pKi = 8.05.]

Fig. 18.10 Compound 18.6 inhibits CAII significantly more potently than CAI. Its sulfone
oxygen atom lies near a red-contoured area, the filling of which causes an increase in the
selectivity for CAII binding. Interestingly, Gln92 is found in this region in both isoforms.
However, it is only in CAII that this group is available to donate an H-bond to the inhibitor
that will contribute to binding affinity. The comparable residue in CAI is involved in a network of
H-bonds to neighboring amino acids. Therefore it is not available as a binding partner, and
a decrease in the affinity for CAI is the consequence.

Finally, the binding of the well-discriminating compound 18.6 should be considered
(Fig. 18.10). The evaluation of the acceptor properties of the ligands in the
training data set shows that the occupancy of the red regions with H-bond-acceptor
groups shifts the selectivity to the benefit of CAII. Filling the blue contours with
this property achieves an increase in potency regarding CAI. Compound 18.6 places
the oxygen atoms of its endocyclic SO2 group in the vicinity of the red CAII-
selective areas. Furthermore, a glutamine occupies the neighboring position 92 in both
CAI and CAII. This amino acid can donate an H-bond to the inhibitor via the NH2
group of its carboxamide group. However, only CAII provides the structural conditions
for this. In CAI, Gln92 neighbors Asn69 and Glu58. The carboxamide group of Gln92
forms a continuous H-bond network with these residues and with His94. Therefore
its NH2 group is no longer available for interactions with a bound inhibitor. This is
reflected in the poorer binding affinity of inhibitors that place an acceptor function
at this position, as 18.6 does. The situation is entirely different in CAII. The
neighboring residues Glu69 and Arg58 form an internal salt bridge with each
other. Therefore they are not available as H-bond partners for Gln92. Gln92
engages His94 in an H-bond via its carboxamide CO group, and its NH2 group is
now available as a donor functionality to interact
with a bound ligand. This results in considerably enhanced binding to CAII and is
expressed as a selectivity advantage.
Alexander Hillebrecht at the University of Marburg has performed yet another
evaluation of the data set of carbonic anhydrase inhibitors that underscores the
difference between 3D, 2D, and 1D QSAR analyses. First, 32 so-called one-
dimensional descriptors were calculated with the MOE program for all molecules
in the data set. These are surface-based descriptors that describe the lipophilicity (log
P), the molar refraction (and therefore the polarizability), and partial charges distributed
over the molecules. These 32 descriptors were correlated with the binding affinity
to CAII or the selectivity difference between CAI and CAII to establish a QSAR
model. In another model the connectivities in the chemical formulae (so-called
molecular graphs) were used as descriptors. For this a topological connectivity tree
of all bonds in a molecular formula was generated, and by “walking” along the bond
connections it was counted how often a particular connectivity, for instance, an N–S–
C–C–N or C–N–C–C–C sequence occurs (so-called MACCS keys). In all, the
frequency of 166 different connectivity fragments was evaluated.
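
The idea of "walking" along the bonds and counting connectivity sequences can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the molecule is a hypothetical heavy-atom skeleton given as an element list and a bond list, the function names are invented for this example, and the code does not reproduce the actual key definitions used in the study.

```python
from collections import Counter

def canonical(labels):
    """Return one label for a fragment regardless of reading direction."""
    forward = "-".join(labels)
    backward = "-".join(reversed(labels))
    return min(forward, backward)

def count_linear_fragments(atoms, bonds, length):
    """Count element sequences along all simple bond paths with `length` atoms."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    seen_paths = set()
    counts = Counter()

    def walk(path):
        if len(path) == length:
            # A path and its reverse describe the same fragment: count it once.
            key = min(tuple(path), tuple(reversed(path)))
            if key not in seen_paths:
                seen_paths.add(key)
                counts[canonical([atoms[i] for i in path])] += 1
            return
        for nxt in neighbors[path[-1]]:
            if nxt not in path:  # never visit an atom twice on one walk
                walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return counts

# Hypothetical skeleton N-S(=O)(=O)-C-C-N, heavy atoms only.
atoms = ["N", "S", "C", "C", "N", "O", "O"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)]
fragment_counts = count_linear_fragments(atoms, bonds, length=5)
```

Each molecule in a data set would yield one such count vector, and the vectors form the descriptor table that is correlated with the affinity data.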
Such descriptors code indirectly for the molecular composition of the individual
inhibitors in the data set, as was introduced above in the Free–Wilson analysis (Sect.
18.5). These topological 2D descriptors were then related to the binding affinity or
selectivity data as described above. Models that correlate well with the training data
can be derived using 1D as well as 2D descriptors. However, the models based on the
1D descriptors proved not to be predictive: when an attempt was made to predict
a molecule that was not in the data set, the model failed. The topological
descriptors gave better results. They possess
a certain degree of predictive power, but they perform less well than the above-
described 3D descriptors in the comparative field analysis. This comparison makes
evident that increasing the complexity of the model and the structural validity of
the descriptors increases the predictive power with regard to the binding properties
of new molecules that were not part of the training data set. But it is especially this
predictive power and the straightforward translation of the obtained correlation
model into the design of new or the modification of existing chemical structures
during the optimization that make QSAR models valuable for drug design.
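
The difference between a model that merely correlates and one that predicts can be made concrete with a leave-one-out test. The sketch below is an assumption-laden illustration: it uses a single descriptor and ordinary least squares instead of the multivariate models discussed above, the data values are hypothetical, and the text does not state which validation scheme was actually applied.

```python
def fit_line(x, y):
    """Ordinary least-squares line through (x, y); returns slope, intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y):
    """Fit quality on the training data itself."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def q_squared_loo(x, y):
    """Leave-one-out q2: each compound is predicted by a model built without it."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot

# Hypothetical descriptor values and affinities (not from the data set in the text).
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
affinity = [0.1, 0.9, 2.2, 2.8, 4.1]
r2 = r_squared(descriptor, affinity)
q2 = q_squared_loo(descriptor, affinity)
```

With the same total sum of squares in both expressions, q2 can never exceed r2; a large gap between the two is a warning that a model fits the training set without being predictive.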

18.15 Synopsis

• The concept of quantitative structure–activity relationships is not new. It was
first described qualitatively in the nineteenth century, and later more quantitatively
by Hansch and Fujita. It is an attempt to describe structure–activity
relationships with mathematical models.
• Across a series of structurally closely related test compounds, the equieffective
dose that induces a particular biological effect is related, in a linear or quadratic
dependence, to the logarithm of the octanol/water partition coefficient and the
Hammett constant, which describes the electronic properties of substituents at a
given scaffold. A mathematical correlation model is computed by regression
analysis.
• 3D QSAR methods have been developed to consider and correlate the spatial
structure of active substances beyond molecular topology.
• The mutually aligned test molecules are embedded in a regularly spaced lattice
and their properties are explored with an interaction probe. This is placed
systematically at all grid points and a molecular interaction field is computed
around the aligned molecules by using a distance-dependent property
potential.
• Usually, Lennard-Jones and Coulomb potentials are evaluated, and the gener-
ated data table for all molecules of the training data set is correlated by a partial
least-squares technique.
• The derived CoMFA correlation model can be used to predict the biological
properties of novel ligands not included in the training data set. Strict
criteria to monitor the statistical significance of the derived correlations must
be defined.
• Other property fields beyond Lennard-Jones and Coulomb potentials with
mathematically different functional forms can be applied. With respect to
the prediction of binding affinity, it must be kept in mind that this property
comprises an entropic contribution that is particularly difficult to reflect in
property fields.
• QSAR analysis only performs a relative comparison of molecules with regard to
the considered biological property. Any dependence on a particular descriptor
across a compound series can only be expected if the property related to this
descriptor is varied in the series. QSAR methods only interpolate and never
extrapolate beyond the scope of molecular properties reflected by the training
set.
• Comparative molecular field analyses can be evaluated graphically. Results are
displayed as contours around the molecules and indicate where the change of
a particular property runs either parallel or opposite to the changes in the
biological property in the data set.
• The graphical information can be directly translated into the design of modified
molecules and thus support the medicinal chemist in optimizing a given lead
structure in a systematic fashion.
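
The field calculation summarized in the synopsis (a probe placed at every point of a regular lattice, with Lennard-Jones and Coulomb terms evaluated against the atoms of an aligned molecule) can be sketched as follows. This is a minimal sketch under assumptions that are not taken from the text: one shared Lennard-Jones parameter pair for all atoms, a constant dielectric of 1, an energy cutoff of 30 kcal/mol, and hypothetical coordinates and charges. The subsequent partial least-squares step is not shown.

```python
import math

COULOMB_KCAL = 332.0  # approx. conversion factor for charges in e and distances in angstrom

def make_grid(lower, upper, spacing):
    """Regularly spaced lattice points between two corner coordinates."""
    def axis(lo, hi):
        n = int(round((hi - lo) / spacing)) + 1
        return [lo + i * spacing for i in range(n)]
    return [(x, y, z)
            for x in axis(lower[0], upper[0])
            for y in axis(lower[1], upper[1])
            for z in axis(lower[2], upper[2])]

def probe_field(atoms, grid_points, probe_charge=1.0,
                epsilon=0.1, r_min=3.5, cutoff=30.0):
    """Return (steric, electrostatic) probe energies at each grid point.

    atoms: list of (x, y, z, partial_charge); epsilon, r_min and cutoff
    are assumed illustration values, identical for every atom.
    """
    fields = []
    for point in grid_points:
        steric = 0.0
        electrostatic = 0.0
        for x, y, z, charge in atoms:
            r = max(math.dist(point, (x, y, z)), 1e-6)  # avoid division by zero
            ratio6 = (r_min / r) ** 6
            steric += epsilon * (ratio6 * ratio6 - 2.0 * ratio6)  # 12-6 Lennard-Jones
            electrostatic += COULOMB_KCAL * probe_charge * charge / r
        # Cap the values so that points inside the molecule do not dominate.
        fields.append((min(steric, cutoff),
                       max(-cutoff, min(electrostatic, cutoff))))
    return fields

# Hypothetical single atom with a partial charge of -0.5 at the origin.
atoms = [(0.0, 0.0, 0.0, -0.5)]
grid = make_grid((-2.0, -2.0, -2.0), (2.0, 2.0, 2.0), spacing=2.0)
fields = probe_field(atoms, grid)
```

Flattening the field values of one molecule gives one row of the data table; the rows of all training molecules are then correlated with the biological data.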

Bibliography

General Literature

Hansch C, Leo A (1995) Exploring QSAR. Fundamentals and applications in chemistry and
biology, vol 2. American Chemical Society, Washington, DC
Kubinyi H (1993a) QSAR: Hansch analysis and related approaches. VCH, Weinheim
Kubinyi H (ed) (1993b) 3D-QSAR in drug design: theory, methods, and applications. ESCOM, Leiden
Kubinyi H, Folkers G, Martin YC (1998) 3D QSAR in drug design, vol 2 and 3. Kluwer/ESCOM,
Dordrecht/Boston/London
Ramsden CA (1990) Quantitative drug design. In: Hansch C, Sammes PG, Taylor JB (eds)
Comprehensive medicinal chemistry, vol 4. Pergamon Press, Oxford
van de Waterbeemd H (1995a) Chemometric methods in molecular design. VCH, Weinheim
van de Waterbeemd H (1995b) Advanced computer-assisted techniques in drug discovery. VCH,
Weinheim

Special Literature

Blaney JM, Hansch C, Silipo C, Vittoria A (1984) Structure–activity relationships of dihydrofolate
reductase inhibitors. Chem Rev 84:333–407
Cramer RD, Patterson DE, Bunce JD (1988) Comparative molecular field analysis (CoMFA). 1.
Effect of shape on binding of steroids to carrier proteins. J Am Chem Soc 110:5959–5967
DePriest SA, Mayer D, Naylor CB, Marshall GR (1993) 3DQSAR of angiotensin-converting
enzyme and thermolysin inhibitors: a comparison of CoMFA models based on deduced and
experimentally determined active site geometries. J Am Chem Soc 115:5372–5384
Gohlke H, Klebe G (2002) DrugScore meets CoMFA: adaptation of fields for molecular compar-
ison (AFMoC) or how to tailor knowledge-based pair-potentials to a particular protein. J Med
Chem 45:4153–4170
Goodford PJ (1985) A computational procedure of determining energetically favorable binding
sites on biologically important macromolecules. J Med Chem 28:849–857
Hansch C, Klein TE (1991) Quantitative structure–activity relationships and molecular graphics in
evaluation of enzyme–ligand interactions. Methods Enzymol 202:512–543
Hillebrecht A, Klebe G (2008) The use of 3D QSAR models for database screening: a feasibility
study. J Chem Inf Model 48:384–396
Hillebrecht A, Supuran CT, Klebe G (2006) Integrated approach using protein and ligand
information to analyze affinity and selectivity determining features of carbonic anhydrase
isozymes. ChemMedChem 1:839–853
Kellogg GE, Abraham DJ (1992) Key, lock and locksmith: complementary hydropathic map
predictions of drug structure from a known receptor-receptor structure from known drugs.
J Mol Graph 10:212–217
Klebe G, Abraham U, Mietzner T (1994) Molecular similarity indices in a comparative analysis
(CoMSIA) of drug molecules to correlate and predict their biological activity. J Med Chem
37:4130–4146
Ortiz AR, Pisabarro MT, Gago F, Wade RC (1995) Prediction of drug binding affinities by
comparative binding energy analysis. J Med Chem 38:2681–2691
Unger SH, Hansch C (1973) On model building in structure–activity relationships.
A reexamination of adrenergic blocking activity of β-Halo-β-arylalkylamines. J Med Chem
16:745–749
Weber A, Böhm M, Supuran CT, Scozzafava A, Sotriffer CA, Klebe G (2006) 3D QSAR
selectivity analyses of carbonic anhydrase inhibitors: insights for the design of isozyme
selective inhibitors. J Chem Inf Model 46:2737–2760
19 From In Vitro to In Vivo: Optimization of ADME and Toxicology Properties

The interaction between a substance and the binding site of a therapeutically
relevant biological macromolecule is the decisive prerequisite for suitability as
a drug. Another, no less important, prerequisite is the ability of the substance to
manage to get from the site of application, through an often rather tortuous path, to
the target tissue. The substance must penetrate aqueous phases and lipid membranes
for this to occur. According to its water and lipid solubility, it will arrive in different
compartments of the biological system. It is also changed by metabolic enzymes.
After conjugation or degradation it is finally eliminated via the kidneys, the bile, and/
or the intestines (▶ Sect. 9.1).
In contrast to the biological activity of a drug, which is called pharmacodynam-
ics, the sum of all processes that affect the absorption, distribution, metabolism,
and excretion, so-called ADME parameters, is covered by the term pharmaco-
kinetics. Roughly simplified, pharmacodynamics can be thought of as “the effect of
the substance on the organism” and pharmacokinetics as “the effect of the organism
on the substance.” In recent years this clear separation of definitions has begun to
disappear. The term pharmacodynamics has increasingly expanded to include processes
of pharmacokinetics. Above all, this has to do with increasing knowledge that
transporters or enzyme systems are responsible for properties such as absorption,
distribution, or metabolism. More and more structures are being solved for these
enzymes, and structure–activity relationships have been established (▶ Sects. 27.6
and ▶ 30.7).
The pharmacokinetics of an arbitrary biological system and the dependence of
the absorption, distribution, and excretion processes on time are described with
mathematical models. The pharmacokinetics of every pharmaceutical is scrupulously
investigated, and a dosing scheme is determined, before entry into clinical
trials and especially during clinical phases I and II, which evaluate the tolerability
and efficacy in humans. The isolation and structural elucidation of metabolic
products in humans help to find the animal model that is most similar to humans
in its metabolic properties. These species are then used for toxicology studies,
which are chosen to investigate possible teratogenic effects, and long-term studies
to investigate possible carcinogenic effects. In parallel, individual metabolites of
a pharmaceutical are investigated for their toxic side effects.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_19, © Springer-Verlag Berlin Heidelberg 2013

In the context of the rational design of new active substances, a substantial
problem arises from the pharmacokinetic parameters and the toxicity: these inves-
tigations are only carried out for very few compounds because of the enormous
experimental effort and the high costs, and only for those compounds that are
intended for clinical development. This approach comes with a serious danger:
poor pharmacokinetic properties are recognized only at very late development
stages, after considerable sums have already been invested in the
development of a new pharmaceutical. In the mid-1990s a study appeared
that tellingly showed that numerous development campaigns failed
because of unsatisfactory pharmacokinetics and intolerable toxicity. For these
reasons, an intensified search for in vitro models to predict ADME-tox properties
has taken place in the last 15 years. In this search it is not the pharmacokinetics of
individual substances that is investigated in detail, but rather the dependence of
different pharmacokinetic parameters on the properties of many different sub-
stances. This allows a better comprehension of the interrelationship between chem-
ical structure and pharmacokinetics. At the same time, it leads to the derivation of
general rules and numerous computer models that are today applied early on in the
design of new drugs.

19.1 Rate Constants of Compound Transport

The distribution of a substance between phases of different lipophilicity is measured as
the partition coefficient P (▶ Sect. 18.3). This definition is valid for systems at
equilibrium. The distribution between the water and octanol phases is considered
as a model system. The ratio of the concentrations of the non-ionized form of an
investigated compound in the two phases is considered. In addition, the pH value is
adjusted during the measurement so that the investigated compound overwhelmingly
occurs in its non-ionized form. As a general rule, the logarithm of this
value, log P, is used.

$$\log P_{\text{octanol/water}} = \log \frac{[\text{dissolved compound}]_{\text{octanol}}}{[\text{dissolved compound}]_{\text{non-ionized, in water}}}$$

Biological systems are open systems that are kinetically controlled. They can be
temporarily found in a dynamic equilibrium. This condition can be compared to
a chromatographic process in which a substance is in a constant exchange between
the solid support and the mobile phase. Locally, equilibria occur that are disrupted
by the continuous progression of the mobile phase. In contrast to the relatively
simple conditions in chromatography, there is a plethora of different phases in
biological systems. A drug is distributed throughout all of these phases. Furthermore,
metabolic processes run in parallel and lead to different metabolites.

Fig. 19.1 Three-compartment system for the determination of the rate constants k1 and k2. At the
beginning of the experiment the substance is dissolved in aqueous phase A. Next the substance
concentration is measured in phases A, B, and C after different times until an equilibrium is
established between the individual phases.

[Figure 19.1 scheme: aqueous phase A and aqueous phase C are each in contact with octanol
phase B; both interfaces are labeled with the rate constants k1 and k2.]

To analyze these dynamic equilibria, the rate constants of the
substance transport from the aqueous phases into the lipid phases and in the
reverse direction must be known. It is astonishing that such fundamental experimental
investigations on organic substances were first carried out only in the mid-1970s
by Bernard Lippold, and later also by Han van de Waterbeemd. Lippold
used a three-phase system: water/n-octanol/water (Fig. 19.1). After the addition of
the substance to one of the two aqueous phases, the time dependence of the
substance concentration in the different phases was measured. From this the
rate constant k1 for the transport from the water into the octanol phase
and the rate constant k2 in the opposite direction were calculated.
In addition to the partition coefficient P, which is given by Eq. 19.1, a very
simple linear relationship has been shown between k1 and k2 (Eq. 19.2); β and
c are constants that depend on the system and not on the structures of the
substances.

$$P = \frac{k_1}{k_2} \tag{19.1}$$

$$k_2 = -\beta k_1 + c \tag{19.2}$$

The dependence of the rate constants k1 and k2 on the partition coefficient P
results from the combination of both equations: substituting k1 = P k2 into
Eq. 19.2 gives k2 = c/(βP + 1) and k1 = cP/(βP + 1), which in logarithmic form
leads to Eqs. 19.3 and 19.4.

$$\log k_1 = \log P - \log(\beta P + 1) + \text{constant} \tag{19.3}$$

$$\log k_2 = -\log(\beta P + 1) + \text{constant} \tag{19.4}$$
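
The limiting behavior implied by Eqs. 19.3 and 19.4 can be checked numerically. The following minimal sketch assumes that both equations share the same additive constant (which is required for Eq. 19.1 to hold) and uses a hypothetical value for the system constant inside the logarithm, here called `beta`; neither value comes from the measurements discussed in the text.

```python
import math

def log_k1(log_p, beta, const=0.0):
    """Eq. 19.3: rate constant for transfer from water into the organic phase."""
    p = 10.0 ** log_p
    return log_p - math.log10(beta * p + 1.0) + const

def log_k2(log_p, beta, const=0.0):
    """Eq. 19.4: rate constant for transfer from the organic phase into water."""
    p = 10.0 ** log_p
    return -math.log10(beta * p + 1.0) + const

beta = 1.0  # hypothetical system constant
profile = [(lp, log_k1(lp, beta), log_k2(lp, beta)) for lp in range(-3, 5)]
```

For very polar compounds (βP much smaller than 1) log k1 rises linearly with log P while log k2 stays on a plateau; for very lipophilic compounds (βP much larger than 1) the roles are reversed. At every log P the difference log k1 − log k2 returns log P, as Eq. 19.1 demands.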

The experimental k values for 20 different sulfonamides and 15 further substances
that were determined by Han van de Waterbeemd are shown

Fig. 19.2 Experimentally determined rate constants k1 and k2 for the transport of 20
sulfonamides and 15 further chemically different substances with molecular weights between 100
and 500 Da. The curves and correlation coefficients r correspond to the fitting of the data with
Eqs. 19.3 and 19.4 (log k1: r = 0.997; log k2: r = 0.998).

[Figure 19.2 plot: log k (−7 to −3) versus log P (−3 to 4).]

in Figure 19.2. Among the latter are neutral, acidic, basic, and even quaternary
charged compounds with very different molecular weights. The characteristic
course of the curve shows that, for relatively polar substances, the rate constant k1
for the transfer from the aqueous phase into the organic phase depends on the
partition coefficient P. It is thermodynamically controlled, that is, it increases with
increasing lipophilicity. A point is reached, however, at which k1 is limited by
the diffusion of the substance to a maximally achievable value. More lipophilic
substances simply cannot penetrate the organic phase any faster. The same holds
analogously for the opposite direction, the transfer from the organic
phase into the aqueous phase, which is described by k2. The chemical structure plays
a role in both cases in that it determines the value of the partition coefficient P.
Because the rate constants are limited by diffusion, there must be an apparent
dependence on the molecular size in this area. According to Fick’s law of
diffusion, the diffusion rate should, as a first approximation, be inversely
proportional to the radius of the particle, that is, to the cube root of its volume.
Because of the relatively low variability of the molecular size of organic drugs and their
conformational flexibility, this effect is probably lost in the noise of
experimental error. Moreover, it must not be forgotten that the discussed
octanol/water system is very simple and only roughly approximates the
complex structural relationships of real membrane systems. Therefore more
relevant models for collecting experimental distribution data, such as the so-
called PAMPA or Caco-2 models, are increasingly being used today (Sect. 19.6).
Here more complex correlations are indicated. Obviously, how a compound is
distributed and structurally oriented in the vicinity of membrane structures is
important. These properties simultaneously influence how the penetration, and
therefore the distribution, is to be described.

19.2 Absorption of Organic Molecules: Model and Experimental Data

The rate constant, k, for the penetration through the lipid membrane from the
aqueous phase is described by another equation, Eq. 19.5. Here, the rate constants
k1 and k2 also describe the entry into the organic phase and the transport in the
opposite direction, respectively.

$$\log k = \log k_1 + \log k_2 + \text{constant} \tag{19.5}$$

In the first approximation, this equation should also describe transport
processes in multicompartment systems. Model calculations on arbitrary, complex
systems show that this is indeed the case. They confirm that there is a bilinear
dependence of the transport in different phases on the total lipophilicity of
a substance. For multiple groups of drugs, for example, barbiturates, this was
demonstrated experimentally in simple in vitro model systems (Fig. 19.3, bottom).
The log k values for penetration through an organic membrane first increase linearly,
which correlates with the increase of k1 at constant k2. After passing through
a maximum, they decrease at a constant k1 value and decreasing k2 value. This
dependence was quantitatively summarized by Hugo Kubinyi in the so-called
bilinear model (Eq. 19.6); a, b, β, and c are constants that are determined by
nonlinear regression analysis.

$$\log k = a \log P - b \log(\beta P + 1) + c \tag{19.6}$$
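
A short numerical sketch may help to see how the bilinear model of Eq. 19.6 produces a maximum. Setting the derivative with respect to log P to zero gives βP = a/(b − a), so that an optimum exists for b > a > 0. The parameter values below are hypothetical and chosen only so that the optimum falls at log P = 2; they are not fitted to any of the data shown in the figures.

```python
import math

def bilinear_log_k(log_p, a, b, beta, c):
    """Eq. 19.6: log k = a*log P - b*log(beta*P + 1) + c."""
    return a * log_p - b * math.log10(beta * 10.0 ** log_p + 1.0) + c

def optimal_log_p(a, b, beta):
    """Position of the maximum, from d(log k)/d(log P) = 0."""
    if not b > a > 0:
        raise ValueError("a maximum exists only for b > a > 0")
    return math.log10(a / (beta * (b - a)))

# Hypothetical coefficients for illustration.
a, b, beta, c = 1.0, 2.0, 0.01, 0.0
log_p_opt = optimal_log_p(a, b, beta)
curve = [(lp / 2.0, bilinear_log_k(lp / 2.0, a, b, beta, c)) for lp in range(-6, 11)]
```

On the ascending side the slope approaches a, on the descending side it approaches a − b, which is why the two linear branches need not be symmetric.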

Entirely analogous dependencies are observed for the absorption of compounds,
for example, from the stomach or intestines (Fig. 19.3, middle). Active
substances that are meant to be orally available should be neither very polar nor very
non-polar. Substances with intermediate lipophilicity can cross the blood–placenta
barrier more easily than very polar or very non-polar compounds (Fig. 19.3, top).
The nonlinear dependence on the lipophilicity is particularly pronounced for substance
penetration through the blood–brain barrier (Fig. 19.4). The optimum for this
barrier is in the range of log P = 1.5–2.5. For CNS-active substances, an optimal
lipophilicity around log P = 2 should therefore be aimed for in order to facilitate penetration
across the blood–brain barrier.

19.3 The Role of Hydrogen Bonds

The simple concept of the dependence of absorption on the octanol/water
partition coefficient that was outlined above has been questioned in the
last few years. Octanol is indeed a relevant model for lipid membranes in many
respects (▶ Sect. 4.2), but it can only incompletely model the influence of
hydrogen bonding. Upon establishing equilibrium in the octanol/water system,

[Figure 19.3 plot: log k (−5 to 1) versus log P (−3 to 5). Curves from top to bottom:
blood–placenta penetration, intestinal resorption, gastric resorption, penetration through
an organic membrane.]

Fig. 19.3 The rate constant k for the transport of drugs depends nonlinearly on lipophilicity. This
is valid for simple in vitro models as well as for biological systems. The bottom curve describes the
log k values of the transport of barbiturates in an in vitro absorption model from an aqueous phase,
through an organic membrane into another aqueous phase. Both curves in the middle (gray points)
describe the dependence of the absorption rate constants k on the lipophilicity for the absorption of
homologous carbamates from the stomach (gastric absorption) or the gut (intestinal absorption) of
rats. The top curve was determined for the entry of different drugs into the placenta from the
circulation. In all cases an increase in log k dependent upon log P is seen, until a more-or-less-
pronounced maximum for substances with moderate lipophilicity. For very non-polar substances,
this curve falls, and in rare cases a plateau is reached. The curves for gastric and intestinal
absorption and for the penetration into the placenta run flatter than the curve for the in vitro
transport of barbiturates (below), because here no lipid barrier is present.

the organic phase contains considerable amounts of water, so that the molar ratio of
octanol to water is 4:1. Substances with polar, solvated groups therefore do not need
to fully release their water solvation shell upon entry into the octanol phase.
Entry into a biological membrane is obviously different. Aside from the dependence
on lipophilicity, membrane penetration becomes increasingly worse the more
hydrogen bonds a substance can form. Similarly,
a ligand must release its water shell before it can be accommodated in the binding
site of a protein.
The system water/cyclohexane is more suitable for the description of such
processes. Because of the non-polar character of this hydrocarbon, upon transition
from water into cyclohexane the molecule cannot take its water shell with it. Many
years ago P. Seiler derived an increment IH (Eq. 19.7) from the differences in the
partition coefficients in cyclohexane/water (loss of water shell) and octanol/water

[Figure 19.4 plot: log 1/C (0 to 3) versus log P (−2 to 5) for homologous primary alcohols;
labeled points: MeOH, EtOH, AmOH, DecOH.]

Fig. 19.4 The neurotoxicity (C = molar dose that induces a particular toxic effect) of homologous
primary alcohols in the rat is a measure of their ability to cross the blood–brain barrier. Polar
substances remain predominantly in the blood circulation. In contrast, substances with moderate
lipophilicity reach the central nervous system easily. Accordingly, neither methanol (MeOH) nor
ethanol (EtOH) shows a pronounced neurotoxicity. The high general toxicity of methanol (blindness)
is not because of its own effect but rather the severely toxic metabolic products formaldehyde
and formic acid (acidosis). Medium-chain alcohols such as amyl alcohol (AmOH) are
considerably more neurotoxic. The highly lipophilic decanol (DecOH) shows low toxicity.

(no loss of water shell) for different functional groups. These IH values characterize
the tendency of groups to form hydrogen bonds.
$$\log P_{\text{cyclohexane}} + \sum I_H = 1.00 \log P_{\text{octanol}} + 0.16 \tag{19.7}$$

The concept of Seiler remained largely ignored. In 1988 Robin Ganellin and co-
workers described the CNS bioavailability of different substances, that is, their
ability to cross the blood–brain barrier, as a linear function of a Δlog P value. This
Δlog P value is the difference between the log P values in the systems cyclohexane/
water and octanol/water. The bioavailability of peptides also runs, in first
approximation, parallel to the Δlog P value, or the number of groups that potentially
participate in hydrogen bonds. The methylation of all NH groups of a peptide
scaffold can, in fact, deliver substances with good bioavailability. The prerequisites
for good membrane penetration are similar to those for high affinity at the binding
site (▶ Chap. 4, “Protein–Ligand Interactions as the Basis for Drug Action”). Here
too, the requirement to release relatively strongly bound water molecules can
have a detrimental influence on binding affinity.
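
The two hydrogen-bonding measures discussed above can be written as a small worked example. It is a sketch under assumptions: the sign convention (octanol/water minus cyclohexane/water) is assumed here because the text only speaks of a difference, the function names are invented for this illustration, and the log P values are hypothetical.

```python
def delta_log_p(log_p_octanol, log_p_cyclohexane):
    """Difference of the two partition coefficients (assumed sign convention)."""
    return log_p_octanol - log_p_cyclohexane

def seiler_increment_sum(log_p_octanol, log_p_cyclohexane):
    """Eq. 19.7 solved for the summed hydrogen-bonding increments."""
    return 1.00 * log_p_octanol + 0.16 - log_p_cyclohexane

# Hypothetical compound: log P(octanol/water) = 2.0, log P(cyclohexane/water) = 0.5.
d_log_p = delta_log_p(2.0, 0.5)
sum_i_h = seiler_increment_sum(2.0, 0.5)
```

With this convention the two quantities differ only by the constant 0.16, so both grow with the number and strength of the hydrogen-bonding groups that must be desolvated.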
Several other distribution systems, for instance, heptane/ethylene glycol, have
been proposed as alternatives to the octanol/water or cyclohexane/water systems
with regard to the simulation of penetration through a lipid membrane. But even
these systems cannot correctly reflect the architecture of membranes with an
interior lipophilic zone and a polar, negatively charged outer rim. Another option
is the determination of the membrane/water partition coefficient, which is, however,
experimentally rather laborious. For this, artificial membranes or liposomes
are used as models.

19.4 Distribution Equilibria of Acids and Bases

Many drugs are acids (HA) or bases (B). They exist in two forms through dissoci-
ation (Eq. 19.8) or protonation (Eq. 19.9); one is usually a non-polar neutral form
and the other is a polar ionic form. The partition coefficients of the
ionic species are generally three to five orders of magnitude lower than those of the
corresponding neutral molecules.

$$\mathrm{HA + H_2O \rightleftharpoons A^- + H_3O^+} \tag{19.8}$$

$$\mathrm{B + H_3O^+ \rightleftharpoons BH^+ + H_2O} \tag{19.9}$$

The distribution equilibrium of an acid and its anion in a two-phase system
depends on the pKa value and the pH value of the aqueous phase, as well as the
partition coefficients Pu and Pi of the substance (Fig. 19.5). All components in each
phase must be in equilibrium with one another to establish equilibrium of the total
system. The dependence of the partition coefficient P on the pH value, the pH–
partition profile, usually takes on a sigmoidal (i.e., S-shaped) course. Plateaus are
observed at pH values at which the uncharged neutral form predominates and at pH
values at which so little of the neutral form exists that solely the transfer of the
charged species into the
organic phase determines the measured partition coefficient (Fig. 19.6). The
charged species goes into the organic phase with a counterion as an ion pair. Either
the corresponding ion of the salt or the excess of ions in the aqueous buffer comes
into play as the counterion. The partition coefficient of the ion pair depends decisively
on the lipophilicity of the counterion. The tetrabutylammonium salt of salicylic acid

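[Figure 19.5 scheme: octanol phase containing HA and A−; aqueous buffer with the equilibrium
HA + H2O ⇌ A− + H3O+ (Ka); HA and A− partition between the two phases with Pu and Pi,
respectively.]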

Fig. 19.5 Two-phase system with partition and dissociation equilibria for an acid HA (Eq. 19.8).
Ka is the dissociation constant, Pu and Pi are the partition coefficients of the non-dissociated and
ionic forms, that is, neutral and charged species, respectively. Because there is usually a difference
of several orders of magnitude between the Pu and Pi values, in many cases the Pi value can be
neglected. This leads to considerable simplification of the corresponding mathematical models.

[Figure 19.6 plot: schematic log P versus pH (0 to 14) profiles for an acid AH and its anion A−,
a base B and its protonated form, a dibasic acid, an acid/ion pair (A− × N(R4)+), and an amino
acid (species +H3NCH(R)COOH, +H3NCH(R)COO−, and H2NCH(R)COO−).]

Fig. 19.6 The pH dependence of the distribution equilibrium of acids and bases, the so-called pH
distribution profile, follows simple rules. Typically when an acid or a base is present, sigmoidal,
that is, S-shaped, curves are observed. For a dibasic acid, for example, oxalic acid, the decrease
in the partition coefficient continues with increasing pH values. In the presence of lipophilic
counterions, for example, the tetrabutylammonium salt of salicylic acid, the ion pair displays
a very high partition coefficient. Amino acids with neutral side chains carry one basic amino group
and an acidic carboxyl group. Accordingly, they go through a maximum in their partition
coefficient at the neutral point. Here the majority of the substance indeed exists as a zwitterion;
aside from that, however, a larger part is in the neutral form than is at lower or higher pH values.

has only a slightly lower partition coefficient than the neutral form of salicylic
acid. In contrast, the sodium salt of salicylic acid has virtually no tendency to
cross over into the organic phase. Amino acids and other mixed acidic and basic
compounds afford pH–partition profiles with a maximum between the pKa values of
the two ionizable groups (Fig. 19.6), that is, when the zwitterionic form is present.
Knowledge about the log P value of the neutral form and the pKa value allows
the partition coefficient of a substance to be calculated at neutral pH. These
principles allow the estimation of absorption and distribution properties of new
substances. Of course, these considerations are only valid for drugs for which no
transporter exists that facilitates their membrane penetration (▶ Sects. 22.7 and
▶ 30.7).
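The calculation mentioned above can be sketched in a few lines of Python. This is a minimal sketch, assuming a monoprotic acid or base and assuming that only the neutral form enters the organic phase, so that the ion-pair contribution Pi is neglected (the simplification noted in the caption of Fig. 19.5). The function name `apparent_log_p` and the example values are hypothetical and serve only for illustration.

```python
import math


def apparent_log_p(log_p_neutral: float, pka: float, ph: float, kind: str) -> float:
    """Apparent log P of a monoprotic acid or base at a given pH.

    Assumption: only the neutral form partitions into the organic phase
    (the partition coefficient Pi of the ionic form is neglected).
    """
    if kind == "acid":
        ionized_over_neutral = 10.0 ** (ph - pka)  # [A-]/[HA]
    elif kind == "base":
        ionized_over_neutral = 10.0 ** (pka - ph)  # [BH+]/[B]
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p_neutral - math.log10(1.0 + ionized_over_neutral)


# Hypothetical acid with log P = 2.0 and pKa = 4.0, evaluated at pH 7.0
print(round(apparent_log_p(2.0, 4.0, 7.0, "acid"), 2))  # -1.0
```

Three pH units above the pKa, the apparent partition coefficient of this hypothetical acid is about three log units lower than that of its neutral form. For lipophilic counterions the ion-pair term can no longer be neglected and the sketch does not apply.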
Because of their importance, pKa values are today routinely measured by
potentiometric titration in pharmaceutical research. However, it is often overlooked
that pKa values of acids and bases are defined only for aqueous
solutions. The addition of an organic solvent, which changes the dielectric constant,
shifts this value (▶ Sect. 4.4). This applies even more to the binding site of
a protein or the interior of a membrane. In individual cases, experimental values
have been determined by NMR spectroscopy and isothermal titration calorimetry.
406 19 From In Vitro to In Vivo: Optimization of ADME and Toxicology Properties

19.5 Absorption Profiles of Acids and Bases

The absorption of an active substance, for example, out of the intestines into blood,
should depend on the pH of the surrounding medium and the pKa of the
substance, just as the distribution between an aqueous buffer system and an organic
phase does. The absorption should follow profiles very similar to those of the distribution. In
the 1950s, Brodie, Hogben, and Schanker formulated the pH–partition theory to
this effect. It says that the dependence of the absorption on the pH value, the
pH–absorption profile, is identical to the pH–partition profile (Sect. 19.4). This
theory was confirmed by, among other things, the investigation of the rate constants
of absorption of a few acids and phenols from the colon of the rat at pH 6.8. The
neutral forms of the comparatively strong acids 5-nitrosalicylic acid (pKa = 2.3), salicylic acid
(pKa = 3.0), m-nitrobenzoic acid (pKa = 3.4), and benzoic acid (pKa = 4.2) display
comparable lipophilicity with log P values between 1.8 and 2.3. Under experimental
conditions near neutral pH, they are largely dissociated. Well under 1% is in
the neutral form. Therefore they are absorbed distinctly more slowly than the
comparably lipophilic, weakly acidic phenols p-hydroxypropiophenone
(pKa = 7.8) and m-nitrophenol (pKa = 8.2), which are more than 90% in their
neutral form at pH 6.8.
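The neutral fractions quoted for the rat colon experiment follow from the dissociation equilibrium of a monoprotic acid. The following minimal sketch uses the pKa values given in the text; the function name `neutral_fraction_acid` is hypothetical, and activity effects are ignored.

```python
def neutral_fraction_acid(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid present as neutral HA at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PH_COLON = 6.8  # pH of the absorption experiment described in the text

# pKa values as quoted in the text
PKA_VALUES = {
    "5-nitrosalicylic acid": 2.3,
    "salicylic acid": 3.0,
    "m-nitrobenzoic acid": 3.4,
    "benzoic acid": 4.2,
    "p-hydroxypropiophenone": 7.8,
    "m-nitrophenol": 8.2,
}

for name, pka in PKA_VALUES.items():
    fraction = neutral_fraction_acid(pka, PH_COLON)
    print(f"{name:24s} pKa {pka:3.1f}  neutral form: {100.0 * fraction:.3g}%")
```

The carboxylic acids come out far below 1% neutral form, whereas both phenols exceed 90%, which mirrors the difference in absorption rate described above.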
Neutral forms can diffuse through membranes; charged forms are readily soluble in
water. An equilibrium is quickly established between the two forms in an aqueous
medium and also at the phase boundaries. If the pKa values of the
substances are not more than 2–3 units from the neutral value of pH 7, the neutral
form is present in the aqueous phase at an entirely adequate concentration of about
0.1–1%. The neutral form penetrates into the membrane, and in the aqueous phase it is
immediately regenerated by the dissociation equilibrium. In a biological system
the distribution of such substances is accomplished quickly and effectively
(Fig. 19.7), and indeed even better the closer the pKa value is to the neutral pH 7.
This also explains why so many drugs are organic acids or bases. Because of the
strongly deviating pH values in the stomach and intestines, at some place along the
gastrointestinal tract the conditions are right that a neutral substance, an acid, or
a base can be well absorbed. If the pKa values are too far from the physiological pH
values, as is the case for amidines or guanidines with extremely high pKa values, the
absorption can become problematic. This is also true for zwitterionic compounds,
for example, amino acids, and for compounds with multiple acidic or basic groups
in the molecule. Because of the large volume available for the distribution, the
diffusion occurs overwhelmingly from the gastrointestinal tract into blood or tissue
and only to a negligible extent in the opposite direction (Fig. 19.7).
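The influence of the pH difference between the compartments shown in Fig. 19.7 can be put into numbers with an idealized calculation. The sketch below assumes that only the neutral form crosses the membrane and that it reaches the same concentration on both sides; the ratio of the total concentrations then follows from the dissociation equilibria alone. This closed two-compartment equilibrium is an assumption made for illustration; in vivo it is never reached because the blood continuously carries the substance away. The function names are hypothetical.

```python
def ionized_to_neutral(pka: float, ph: float, kind: str) -> float:
    """Ratio of ionized to neutral form for a monoprotic acid or base."""
    if kind == "acid":
        return 10.0 ** (ph - pka)  # [A-]/[HA]
    if kind == "base":
        return 10.0 ** (pka - ph)  # [BH+]/[B]
    raise ValueError("kind must be 'acid' or 'base'")


def equilibrium_total_ratio(pka: float, ph_from: float, ph_to: float, kind: str) -> float:
    """Total concentration in the 'to' compartment relative to the 'from'
    compartment, assuming equal neutral-form concentration on both sides."""
    return (1.0 + ionized_to_neutral(pka, ph_to, kind)) / (
        1.0 + ionized_to_neutral(pka, ph_from, kind)
    )


# Compartment pH values and pKa values as in Fig. 19.7
cases = [("acid, pKa 4", 4.0, "acid"),
         ("weak base, pKa 5", 5.0, "base"),
         ("strong base, pKa 9", 9.0, "base")]
for label, pka, kind in cases:
    ratio = equilibrium_total_ratio(pka, ph_from=1.0, ph_to=7.4, kind=kind)
    print(f"{label}: blood/stomach = {ratio:.3g}")
```

Under these assumptions the acid is strongly favored on the blood side of the stomach wall, whereas both bases are held back in the stomach in their protonated form.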
The absorption of strongly acidic compounds outside the range in which the
compound exists as a neutral molecule runs, in first approximation, parallel to the
difference pH − pKa; for bases the relevant difference is pKa − pH. There are exceptions
to this approximation. Highly lipophilic compounds require a more detailed description
of the pH–absorption profile. The neutral forms of these substances enter the
lipid phase as soon as they come near the membranes. The neutral molecule is
constantly removed from the dissociation equilibrium, which is established in the

[Figure 19.7: four panels, each showing the equilibria between stomach (pH = 1), blood circulation (pH = 7.4), and intestines (pH = 6–8). (a) Neutral substance N; (b) acid HA/A−, pKa = 4; (c) weak base B/BH+, pKa = 5; (d) strong base B/BH+, pKa = 9.]

Fig. 19.7 (a) A moderately polar neutral substance N is absorbed very well from the stomach as
well as from the intestines. It is quickly distributed in the circulation so that back-transport does
not play a notable role. (b) An organic acid HA (pKa = 4) is absorbed well from the stomach, as
long as it is not too polar, because it exists there overwhelmingly in the neutral form. The
absorption is facilitated by the fact that the free acid is in considerably lower concentration in
the blood than in the stomach. The formation of an anion shifts the concentration gradient in this
direction. The absorption is slower from the gut because there the equilibrium lies overwhelmingly
on the side of the ionized form. (c) A weak base (pKa = 5) is absorbed relatively poorly from the
stomach because it exists overwhelmingly in its polar, protonated form. It is well absorbed in the
intestines because it exists in its neutral form there. (d) A strong base with pKa = 9 cannot be
absorbed from the stomach. In the intestines the equilibrium does lie heavily on the side of the
protonated form, but the non-polar form is supplied in adequate quantities. Therefore the substance
can be absorbed. When a substance reaches a pKa value of >11, the concentration of the
neutral, bioavailable form is too low for good absorption.

[Figure 19.8: plot of the amount of an acid AH distributed or absorbed against the pH value. Curves: pH–distribution diagram and pH–absorption diagram; their horizontal offset is marked ΔpH = pH shift.]

Fig. 19.8 The dependence of the absorption of lipophilic acids on the pH value, the absorption
profile (red curve), deviates markedly from the pH distribution curve (black curve, see Fig. 19.6).
Whereas the pH-distribution profile is valid for an equilibrium system, a steady state
is established during absorption. Even at relatively high pH values, that is, when only small
concentrations of the neutral species are present, a fast absorption of these few molecules is achieved.
Because of the high anion concentrations and the continuous adjustment of the dissociation
equilibrium, a minimally necessary concentration of the neutral species is maintained. The shift
in the pH-absorption profile is referred to as a pH shift. Analogous shifts are observed in the
opposite direction for lipophilic bases.

aqueous phase. However, it is very quickly replenished by this equilibrium. On
balance, a continuous transport of substance from the aqueous phase into the
membrane is achieved. The small amount of uncharged neutral form is the door
through which the entire process takes place. The rate of the transition into the lipid
layer does not depend on the (often very low) concentration of the neutral form, but
rather on:
• The total concentration of the compound,
• The rate constants of the dissociation equilibrium,
• The diffusion constant of the compound.
Accordingly, a shift in the pH–absorption profile is observed in biological
systems for lipophilic acids and bases relative to the pH–partition profile, which is
referred to as the pH shift. This always occurs in the direction toward the neutral point,
that is, with acids to higher and with bases to lower pH values (Fig. 19.8). The
larger the lipophilicity of an acid or a base, the larger the observed shift in
the absorption profile. To judge how well a substance is absorbed,
the log P value and the pKa value must not be considered separately. Their
interplay is decisive. For the design of new drugs, this means that a substance
with unfavorable partition behavior, that is, with too high or too low a pKa
value, can be beneficially modified in the desired direction by increasing its
lipophilicity. To describe the pH dependency of the distribution equilibrium,
a distribution coefficient D was introduced as a supplement to the partition
coefficient P. For this, the ratio of the sums of the concentrations of all ionized and
non-ionized forms of an investigated compound in the two phases is considered. For the
measurement, the pH value is set with a buffer solution so that the addition of the
investigated compound does not shift the pH. Usually log D, the logarithm of the
distribution coefficient, is used instead.
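For the monoprotic acid HA of Fig. 19.5, this definition can be written out with the symbols used there (Pu, Pi, Ka). The following is a sketch of the standard derivation, assuming ideal behavior and a single dissociation step; the subscripts "oct" and "aq" for the two phases are introduced here for illustration.

```latex
D = \frac{[\mathrm{HA}]_{\mathrm{oct}} + [\mathrm{A}^-]_{\mathrm{oct}}}
         {[\mathrm{HA}]_{\mathrm{aq}} + [\mathrm{A}^-]_{\mathrm{aq}}},
\qquad
P_u = \frac{[\mathrm{HA}]_{\mathrm{oct}}}{[\mathrm{HA}]_{\mathrm{aq}}},
\quad
P_i = \frac{[\mathrm{A}^-]_{\mathrm{oct}}}{[\mathrm{A}^-]_{\mathrm{aq}}},
\quad
\frac{[\mathrm{A}^-]_{\mathrm{aq}}}{[\mathrm{HA}]_{\mathrm{aq}}} = 10^{\,\mathrm{pH}-\mathrm{p}K_a}

\Rightarrow\quad
D = \frac{P_u + P_i \cdot 10^{\,\mathrm{pH}-\mathrm{p}K_a}}{1 + 10^{\,\mathrm{pH}-\mathrm{p}K_a}},
\qquad
D \to P_u \ \ (\mathrm{pH} \ll \mathrm{p}K_a),
\qquad
D \to P_i \ \ (\mathrm{pH} \gg \mathrm{p}K_a)
```

The two limiting values correspond to the two plateaus of the sigmoidal pH–partition profile of an acid. For a base, the exponent pH − pKa is replaced by pKa − pH.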

19.6 What Is the Lipophilicity Optimum of Drugs?

Lipophilicity plays an important role in the appraisal of the therapeutic suitability of
a pharmaceutical. This holds for absorption, distribution, and metabolism as well
as excretion. With the exception of substances that are taken up via
a transporter, the absorption is usually better the more lipophilic the
compounds are. This advantage is limited by the solubility in aqueous phases, which
decreases sharply as the lipophilicity increases. The solvation enthalpy and the rate
at which the solid of an active substance dissolves in the gastrointestinal tract are
also decisive for the bioavailability. These factors depend on the intermolecular
interactions in the crystalline solid and can vary considerably from one polymorphic
crystal modification to another. Therefore correlations to predict bioavailability
include the melting point as an additional parameter apart from the lipophilicity and
the solubility. In addition to the solubility, the kinetics of dissolution is important
for galenic formulations, that is, the final drug preparation. It determines the amount
of substance that goes into solution during the gastrointestinal passage. This amount
can be increased by different measures such as:
• Increasing the surface area by grinding the crystals into minuscule particles
(micronization),
• Growing a modified crystal with better solubility properties,
• Crystallization under special conditions to afford a more uniform (usually
smaller) size, or crystals with lattice defects,
• Changing the salt form,
• Adding solubility-mediating additives,
• Embedding the drug as an amorphous solid solution in easily dissolvable polymers.
Because of its importance, techniques to measure the solubility on a high-throughput
scale have been established in recent years.
Cell cultures are also increasingly used as in vitro models to record substance
absorption. A thin layer of cells from human colon carcinomas (so-called Caco-2,
HT29, or MFCH cell lines) is grown in a two-chamber system. The transport of
active substance can be followed from both sides, that is, from either the so-called apical
or the basolateral side. Because these cells also express transporters, the involvement of
specific transportation mechanisms of substances can also be studied. These models
are less suitable for the study of the possible consequences of substance metabolism
because the metabolizing enzymes (▶ Sect. 27.6) are only expressed in diminished
quantities by these cells.
Relevant in vitro test models have also been developed to study blood–brain
barrier penetration. These models are experimentally relatively laborious, and the
results often can only be compared within a series of structurally related substances.
Assay systems with artificial membranes (PAMPA, from parallel artificial
membrane-permeability assay) can be constructed that allow high-throughput
screening. Moreover, the penetration behavior in liposomes can be evaluated by
surface plasmon resonance.
When experimentally determining the absorption of different substances, results
obtained from saturated solutions of the substances should not be compared with
results from solutions with concentrations well below the saturation limit. In the
first case, the lower solubility of the lipophilic compounds means that their solutions
have lower concentrations, which feigns worse absorption. In the second case, with
comparable concentrations for all test compounds, improved or good absorption is also
found for the lipophilic substances. A comparison across such different experimental
conditions will lead to incorrect conclusions. Further confusion occurs when the
terms absorption and bioavailability are incorrectly applied (▶ Sect. 9.1). The
absorption of a substance can be excellent, but the bioavailability is nonetheless
poor. Lipophilic compounds and substances with a molecular weight of more than
500–600 Da are often well absorbed, but suffer from very fast biliary (via the bile)
elimination. This usually happens during the first liver passage (first pass effect,
▶ Sect. 9.1) directly after absorption from the intestines. To achieve good bioavail-
ability, the lipophilicity must not be too high. The excretion path also depends on
the lipophilicity. In general, extremely lipophilic substances are more quickly
metabolized, but are also toxicologically worrisome. Hydrophilic substances and
polar metabolites, including those after conjugation with polar groups, are excreted
via the kidneys. The excretion of lipophilic substances is usually accomplished
hepatically, and subsequently via the intestines. Such substances often undergo
oxidative metabolism, with the concomitant possibility of toxic metabolites being
produced.
Substances that interact with membrane-bound receptors or ion channels can
often access their targets more easily if they are enriched in the surrounding
membrane. For this, the substances should be lipophilic, or should carry a large
lipophilic group with which they can be anchored in the membrane (▶ Sect. 4.2,
▶ Fig. 4.2).

19.7 Computer Models and Rules to Predict ADME Parameters

Aside from the set-up of suitable test systems to systematically record parameters
that determine the pharmacokinetic properties, major effort has been spent to
establish rules and computer models to predict favorable ADME properties. First
and foremost, the rule of five must be mentioned, which was developed by Chris
Lipinski at Pfizer. Accordingly, an active substance should not violate more than
two of the criteria of the rule of five listed in Table 19.1. These simple rules were derived from
experience and are almost exclusively used to preselect compounds for screening.
Tudor Oprea refined these rules further and extended them to cover the occurrence
of particular structural building blocks such as, for instance, the maximum number
Table 19.1 Criteria for the rule of five.
Molecular weight ≤ 500 Da
Partition coefficient log P ≤ 5
No more than 5 H-bond donor groups
No more than 10 H-bond acceptor groups

of rings of a particular size. Programs such as CLOGP, ACD/pKa, and Pallas/pKa
have been developed to estimate lipophilicity and pKa values. To predict solubility,
attempts are made to calculate solvation enthalpies. Permeability, absorption, and
bioavailability predictions are based on empirical correlation models. For this,
experimental observations are related to the chemical structure of the investigated
molecules. The applied methods are derived from QSAR models presented
in ▶ Chap. 18, “Quantitative Structure–Activity Relationships.” The properties to
be predicted are described by models based on intuitively selected descriptors.
Usually molecular parameters are consulted that are frequently derived from the
molecular surface and are assumed to be decisive for the considered properties.
In addition to routine regression analyses, more recent mathematical models such as
neural networks, nearest-neighbor classifiers, decision trees, or machine-learning
techniques such as support vector machines are applied. In addition to the
easily evaluated rule of five, the following criteria should also be considered for
rational design: substances that are meant to act in the periphery, for instance,
cardiovascular drugs, should be relatively polar. Of course, a certain
minimum lipophilicity is necessary for their absorption. Because of the risk of central
side effects or the generation of toxic metabolites, this minimum should not
be exceeded by too much. Here the following motto is valid: better to be a little
less potent than to have all the other problems! A good therapeutic window is much
more valuable for therapy than picomolar affinity to a protein. Substances
that act upon membrane-bound proteins and substances that act in the central
nervous system should have a moderate to high log P value of >1. To avoid the
development of toxic metabolites, the incorporation of the following is
recommended:
• Easily conjugated groups, for example, hydroxyl, amino, or carboxyl groups,
• Predetermined metabolic cleavage points such as ester or amide bonds,
• Oxidizable groups that lead to nontoxic and easily excretable metabolites, for
example, methyl groups.
Of course, this strategy should not be exaggerated, otherwise the substances are
excreted too quickly. The biological half-life is then reduced to a value that makes
a therapeutic administration in humans impossible.
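The rule of five discussed earlier in this section (Table 19.1) lends itself to a simple programmatic filter for preselecting compounds. The sketch below is one possible way to count violations, assuming that the four descriptors have already been computed by some other program; the class `Descriptors`, the function names, and the example compound are hypothetical. The number of tolerated violations is left as an explicit argument rather than being fixed in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    molecular_weight: float  # in Da
    log_p: float
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list[str]:
    """Return the list of rule-of-five criteria that the compound violates."""
    violations = []
    if d.molecular_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.log_p > 5:
        violations.append("log P > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


def passes(d: Descriptors, max_violations: int) -> bool:
    """True if the compound violates at most max_violations criteria."""
    return len(rule_of_five_violations(d)) <= max_violations


# Hypothetical compound that is too large and too lipophilic
candidate = Descriptors(molecular_weight=650.0, log_p=6.0,
                        h_bond_donors=2, h_bond_acceptors=5)
print(rule_of_five_violations(candidate))
```

Such a filter only flags compounds; as stated above, the rules are derived from experience and say nothing about transporter-mediated uptake or metabolism.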
The structural consideration of properties that lead to optimal bioavailability,
adequate biological half-life, and non-toxic metabolites represents a problem in the
search for new active substances. Structure-based design of active substances
initially concentrates on the fitting of a ligand to its binding site. Often, aspects
that have to do with the pharmacokinetics and metabolism are not adequately
considered in this phase. Disappointments at the end of a successful optimization
in the preclinical phase, or at the very latest in the clinic, punish such a one-sided
approach. Because the spatial structures of transporters, channels, and metabolic
enzymes are increasingly becoming available, structure-based design can be used to
test for cross-reactivity of proposed or developed ligands with these target structures.
Binding to the potassium-ion-transporting hERG ion channels leads to their blockage.
A consequence could be life-threatening cardiac arrhythmias (▶ Sect. 30.3).
For this reason, QSAR models were developed that can examine molecules for
a possible hERG channel binding. Methods for the direct docking of ligands in
structural models of the channel have also been developed. Another system that was
recently structurally characterized is the membrane-bound glycoprotein GP170. It is
a transporter that can expel drugs from the cell (▶ Sect. 30.8). It is desirable to avoid
interactions with this protein as much as possible. Another large family of enzymes
worthy of attention are the cytochrome P450 metabolic enzymes (▶ Sect. 27.6). Here
an attempt is made to estimate how drugs interact with these proteins and how they are
metabolized. A wide field is opened here for structure-based design.

19.8 From In Vitro to In Vivo Activity

Active substances are initially investigated in simple in vitro test models, for
instance, with respect to enzyme inhibition, receptor binding, in cell cultures, and
later in organs and animal models. As a general rule, the simplest model is chosen
for which the results are predictive of the effect that can be expected in an animal or
in humans. For this it is necessary to derive quantitative relationships between the
different test models, so-called activity–activity relationships. These describe the
relationship between biological activities measured in different models, for instance,
between in vitro and in vivo data. In the best case, they even allow the extrapolation
from binding affinities in an inhibition assay to the therapeutic effect in humans.
The confirmation of a correlation between a simple test model and a therapeutic
effect is often more important than the derivation of a structure–activity relation-
ship. After finding the relevant, quantitative relationship, inexpensive and quickly
performed tests can be used instead of laborious animal experiments. The number
of animal tests is reduced in this way considerably. But that is not the only
advantage. The use of automated molecular testing systems allows the profile of
active substances to be reliably characterized.

19.9 Natural Ligands Are Often Unspecific

Prior to the biological testing of an active substance, the following questions must
be clarified. What therapeutic goal should be achieved? How is this goal to be
realized? Therapeutic concepts are derived from the pathophysiology of the disease
mechanism. Regulatory intervention with drugs should restore the original physi-
ological condition as far as possible. Problems can occur in the process: to imitate
natural ligands of enzymes and receptors, the active substance must demonstrate
adequate specificity and must distinctly access the target site.
Nature works with two orthogonal principles with respect to endogenous
substances: the specificity of the effect and a usually very pronounced spatial
compartmentalization. Hormones act overwhelmingly systemically, that is, they
are released at one site in the organism and transported through the circulation
to another, entirely different site. There they exert their action. Other substances,
for instance, neurotransmitters, act strictly locally. In the context of the picture
of the lock and key (▶ Sect. 4.1), Nature prefers to have a master key that can act
on different locks. It acts only at the site of its production and is removed as soon
as it has fulfilled its tasks. The neurotransmitters are synthesized in nerve cells,
stored, and released upon stimulation of the cell at the synaptic gap (▶ Sect. 22.5).
There they bind to specific receptors and exert a stimulation of the neighboring
nerve cell. The effect quickly subsides after reuptake in the cell or after decompo-
sition, for instance, by monoamine oxidases (amines), esterases (acetylcholine), or
peptidases.
The efficiency of Nature is documented especially impressively in the variety
with which small molecules, for example, adrenaline and noradrenaline (▶ Sect. 1.4),
can be used as hormones as well as neurotransmitters. A plethora of different
receptors and receptor subtypes are available for these substances, with which
entirely different effects can be induced with the same molecule. The recoding of
the amino acid sequence of a particular receptor, and therefore an alteration in its
binding site, is relatively easy to do on the gene level. The evolution of complex
biosynthetic pathways of non-peptide ligands, which often occur over multiple
enzyme-catalyzed steps, is considerably more intricate. Accordingly, almost all
neurotransmitters and many hormones are derived in a simple way from the central
intermediates of metabolism of, for instance, amino acids. On the other hand, the
steroid hormones (▶ Sect. 28.3) prove that Nature can achieve very different effects
with a set of chemically similar structures and evolutionarily and structurally related
receptors, for instance, as with the estrogens, gestagens, androgens, glucocorticoid
steroids, and mineralocorticoid steroids.
Frequently, the spatial distribution of biosynthesis or the release of a receptor
ligand or the distribution of membrane-bound receptors or enzymes plays
a decisive role for the specificity of an effect. Different effects are achieved by
the same ligand through locally restricted substance release or through the presence
of different receptors. In doing so, there is not only a differentiation between
particular organs or areas, but also between individual cells and cell compartments.
For example, the dopamine concentration has been determined in different regions of the rat
brain. Whereas in some regions, for example, the caudate nucleus
(Lat.: Nucleus caudatus), an important synaptic site for the motor system and the
olfactory system, concentrations of up to 100 ng dopamine per mg protein are
reached, most of the other areas of the brain contain only between 0.2 and 10 ng/mg.
Even in the Substantia nigra of the mesencephalon, the dopamine level is only
5–6 ng/mg. The degeneration of dopaminergic neurons in this area leads to
Parkinson’s disease in humans. It is known from labeling experiments that the
distribution and population density of receptor subtypes in diverse areas of the brain
and other tissues can be very different.
19.10 Specificity and Selectivity of Drug Interactions

How specifically should a drug act? There is no absolute answer to this question.
Because active substances are almost always administered orally or intravenously,
they act systemically, that is, on the entire organism. The lack of limitation to
a particular organ or a particular compartment must be compensated for with
a higher specificity. At any rate the drug must act as specifically as necessary to
achieve a successful therapy with tolerable side effects.
In the case of enzyme inhibitors, substances are preferred that act so specifically
that only one particular enzyme is inhibited. Unspecific inhibitors that simultaneously
inhibit multiple serine proteases or metalloproteases would wreak havoc in an organism.
A thrombin inhibitor, which should reduce an increased thrombosis risk, must
not act simultaneously as an inhibitor of the closely related plasmin, which causes
fibrinolysis, that is, the dissolution of blood clots that have already formed. The
situation with kinase inhibitors (▶ Sect. 26.3) is a bit different. Because of the
similarity among kinases, one member of the family can take over the task of another,
related kinase that has been blocked. In doing so it reduces the therapeutic effect
to nothing. Here, a broad-spectrum kinase inhibitor that can simultaneously shut off
an entire protein family might be desirable. A broad-spectrum action that
inhibits multiple isoenzymes of a parasite equally well can also be beneficial for
antibacterial or antiparasitic compounds (e.g., plasmepsins, ▶ Sect. 24.7).
Receptor agonists and antagonists should also display a high selectivity.
β-Agonists that are used to treat asthma (▶ Sect. 29.3) must be β2-specific so that
they do not induce an undesirable increase in the heart rate or blood pressure. Often
the necessary therapeutic effect cannot be achieved with only one drug. The simultaneous
use of multiple drugs is often indicated for the treatment of arterial
hypertension (▶ Sect. 22.10). More complex, multifactor-induced disease processes
must be treated by addressing multiple mechanisms. Because of the low
dosing of the different components, the unspecific side effects of the individual
components fade into the background.
The specificity is critical for the effect of CNS-acting drugs. Progress in gene
technology has provided us with an explosion of knowledge about receptors, but
also a dilemma. We know the exact receptor profile of established substances. We
know what specificity must be achieved to imitate a particular type of action. But in
many cases, we do not know which profile should be present to achieve a better
therapeutic effect. An example should clarify this point. Neuroleptics and many
antidepressants (▶ Sect. 1.6) act on neuroreceptors. The classic neuroleptics
chlorpromazine 19.1 and haloperidol 19.2 (Fig. 19.9), which are used in the treatment
of schizophrenia, are relatively unspecific dopamine receptor antagonists
(Table 19.2). The mixed-type neuroleptic/antidepressant sulpiride 19.3 acts on the
D2 and D3 receptors simultaneously. All of these substances cause side effects on
the musculoskeletal system like those observed in Parkinson’s disease (Sect. 19.9),
which is caused by a dopamine deficiency. Because of their mode of action, it was
assumed that the side effects of the neuroleptics were inevitable consequences of
antagonism of the dopamine receptors. Then an atypical neuroleptic, clozapine 19.4,
Table 19.2 The natural neurotransmitter dopamine binds with higher affinity to dopamine
receptors of the D1-type. The classic neuroleptics chlorpromazine 19.1, haloperidol 19.2, and
(S)-sulpiride 19.3 are different from clozapine 19.4 (Fig. 19.9) in one point: they have no
comparable selectivity for the D4 receptor.

Binding to the dopamine receptors, Ki in nM

                        D1-type              D2-type
Substance               D1        D5         D2      D3      D4
Dopamine                0.9       <0.9       7       4       30
Chlorpromazine 19.1     30        130        3       4       35
Haloperidol 19.2        80        100        1.2     7       2.3
(S)-Sulpiride 19.3      45,000    77,000     25      13      1,000
Clozapine 19.4          170       30         230     170     21

[Figure 19.9: chemical structures of 19.1 chlorpromazine, 19.2 haloperidol, 19.3 sulpiride, and 19.4 clozapine.]

Fig. 19.9 Chlorpromazine 19.1, haloperidol 19.2, and sulpiride 19.3 are neuroleptics with typical
side effects that are associated with dopamine antagonists. Clozapine 19.4 is different from these
substances in its binding profile on the dopamine receptors (Table 19.2) as well as in its side
effects.

came along (Fig. 19.9). It does not have the described side effects. Today we know
that clozapine, in contrast to the other neuroleptics, acts much more potently on the
D4 receptor than on the D2 and D3 receptors (Table 19.2). The concentration at
which clozapine acts on the D4 receptor, which was also measured in the
cerebrospinal fluid of treated patients, is sufficient for clozapine to also
bind to particular serotonin and muscarinic receptors, in part with even higher
affinity. It could therefore also be that the antagonistic effects of clozapine
on these receptors are responsible for the atypical effects.
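The difference in the binding profiles can be expressed as selectivity ratios calculated from the Ki values of Table 19.2. The following minimal sketch divides the Ki value at a reference receptor by the Ki value at the D4 receptor, so that values above 1 indicate a preference for D4. The dictionary layout and the function name `selectivity` are hypothetical; the numbers are those of Table 19.2.

```python
# Ki values in nM for the D2-type receptors, taken from Table 19.2
KI_NM = {
    "chlorpromazine": {"D2": 3.0, "D3": 4.0, "D4": 35.0},
    "haloperidol": {"D2": 1.2, "D3": 7.0, "D4": 2.3},
    "(S)-sulpiride": {"D2": 25.0, "D3": 13.0, "D4": 1000.0},
    "clozapine": {"D2": 230.0, "D3": 170.0, "D4": 21.0},
}


def selectivity(ki: dict[str, float], preferred: str, reference: str) -> float:
    """Ki(reference) / Ki(preferred); values > 1 mean tighter binding
    to the preferred receptor than to the reference receptor."""
    return ki[reference] / ki[preferred]


for name, ki in KI_NM.items():
    d4_vs_d2 = selectivity(ki, preferred="D4", reference="D2")
    d4_vs_d3 = selectivity(ki, preferred="D4", reference="D3")
    print(f"{name:15s} D4 vs D2: {d4_vs_d2:6.2f}   D4 vs D3: {d4_vs_d3:6.2f}")
```

Only clozapine yields ratios above 1 for both comparisons; the three classic neuroleptics bind to D2 more tightly than to D4. A Ki ratio describes binding only and does not by itself say anything about the therapeutic effect.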
[Figure 19.10: plot of log (Ki receptor binding, in mM) against log (average clinical dose, in mmol/kg), with separate correlations for dopamine (r = 0.27) and haloperidol (r = 0.87).]

Fig. 19.10 The agonist dopamine preferably binds to the D1-type of dopamine
receptors (Table 19.2). It was clear very early, however, from binding studies on
membrane homogenates that the potency of clinically used neuroleptics correlated with
the displacement of haloperidol (r = 0.87) rather than with dopamine binding
(r = 0.27).

Many drugs are classified as “dirty drugs” because of their multifaceted action
on many, totally different receptors. From the pharmacologists’ point of view, such
a characterization is appropriate. A general statement about the therapeutic value
cannot be derived from that. It may well be that many dirty drugs are optimal for
therapy because of their balanced action on multiple receptors. Recently, these
compounds have been termed “rich in pharmacology” and they define a “polyphar-
macology.” The suitability or unsuitability of a drug is only decided in the clinical
testing and later by the experience from broad application in patients.
The differences between enzymes and receptors in different species also offer
a chance to achieve the desired selectivity therapeutically. Species differences play
a role if an undesired organism is to be killed, that is, with antibiotics,
antimycotics, antivirals, and antiparasitic drugs. To avoid side effects in humans,
the metabolic pathways of the bacteria, fungi, viruses, or parasites are purposefully
attacked either by adequate selectivity or by selecting a point of action that is
not present in higher organisms (see ▶ Sects. 23.7, ▶ 24.3, ▶ 27.2, or ▶ 30.8).

19.11 Of Mice and Men: The Value of Animal Models

Quantitative activity–activity relationships serve to draw conclusions about humans
from animals, but they are also valuable for comparing different biological models
with one another. From the plethora of examples that are described in the literature,
only a few typical relationships will be mentioned here.
Even before the characterization of the different dopamine receptors
(Sect. 19.10, Table 19.2), 25 clinically used neuroleptics were investigated to

Table 19.3 Correlation of the clinical efficacy (Fig. 19.10) of 25 different neuroleptics and their
potency in different animal models that are typically used for the evaluation of neuroleptic effects
with the displacement of dopamine or haloperidol 19.2. The clinical data and the results of the
animal models correlate conspicuously better with the displacement of the D2-type ligand
haloperidol than with the displacement of the D1-type ligand dopamine (r = correlation coefficient).

Model                                                                            Correlation with   Correlation with
                                                                                 dopamine           haloperidol
                                                                                 displacement (r)   displacement (r)
Mean clinical dose in humans                                                     0.27               0.87
Inhibition of the stereotypical behavior after application of apomorphine (rat)  0.46               0.94
Inhibition of the stereotypical behavior after application of amphetamine (rat)  0.41               0.92
Protection from apomorphine-induced emesis (dog)                                 0.22               0.93

unravel correlations between the results of in vitro models, animal experiments, and
the potency of these substances in humans. Two radioactively labeled ligands,
dopamine and haloperidol 19.2 (Sect. 19.10, Fig. 19.9), one of which prefers the
D1-type and the other prefers the D2-type dopamine receptor, were used to charac-
terize binding. It was demonstrated that the average clinical dose significantly
correlated with the displacement of the D2-type ligand haloperidol 19.2. Signifi-
cantly higher concentrations were needed to displace the D1-type ligand dopamine.
A correlation with these data is virtually non-existent. Not only the clinical efficacy,
but also the data from animal models that are used to test for neuroleptic effects
correlate better with the displacement of haloperidol than with dopamine
(Table 19.3). In hindsight, the results suffer from a lack of ligand specificity for a
single receptor, and the preparations are affected by receptor heterogeneity because
the presence of the different receptor subtypes was not standardized in the calf brain
homogenates that were used. All substances were investigated with dirty ligands in
dirty test models. The profile of active substances can only now be unambiguously
assigned by using uniform receptor subtypes, which are produced by using gene
technology (see Table 19.2).
There are many cases in which the relationship between different test models
depends strongly on the species used. Investigations on isolated arteries and veins
from the lungs of rabbits, sheep, pigs, and humans show that the vascular
preparations from rabbits and humans react to noradrenaline in a comparable
way. Sheep and pig arteries are significantly less sensitive. Isolated pig veins
cannot be stimulated at all at comparable doses of noradrenaline. The experimental
results are even more inhomogeneous and difficult to interpret upon stimulation
with acetylcholine. It must not be forgotten that the metabolism in humans and in
animal species is also different and exerts an influence on the test results.
Tachykinins are short peptides that trigger a wealth of physiological and pathological
processes. Their central role in pain and asthma is well established. They act via the
NK1, NK2, and NK3 receptor subtypes, which also bind specifically to the three

Table 19.4 Binding of substance P and displacement by the antagonist CP 96 345 19.5 (tested as
a racemate) on cells of different origins.

[Chemical structure: 19.5 CP 96 345]

System                  Binding of substance P;   Displacement of substance P
                        IC50 in nM                by 19.5; IC50 in nM
Human cell line U373    0.13                      0.40
Human cell line IM9     0.22                      0.35
Guinea pig brain        0.07                      0.32
Guinea pig lung         0.04                      0.34
Rabbit brain            0.16                      0.54
Mouse brain             0.19                      32
Rat brain               0.20                      35
Chicken brain           0.26                      156

peptide agonists substance P, neurokinin A, and neurokinin B (▶ Sect. 10.7). CP 96
345 19.5, a non-peptide NK1 antagonist, displaces substance P with high affinity in
two human cell culture models and in guinea pig and rabbit membrane preparations.
In membrane preparations from mouse, rat, and chicken brains, to which substance P
binds with entirely comparable affinities, 19.5 shows IC50 values
that are 60–500 times higher (Table 19.4). It is known from sequence-specific point
mutations that the agonist substance P and the antagonist CP 96 345 bind to
different regions of the receptor (see ▶ Sect. 29.7).
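
The factor of 60–500 can be retraced from the displacement column of Table 19.4. The sketch below is one way to do so: it divides each low-affinity IC50 by the largest and by the smallest IC50 of the high-affinity group. The grouping follows the text, whereas the helper name fold_range and the choice of reference values are assumptions made for illustration.

```python
# IC50 values (nM) for displacement of substance P by CP 96 345, from Table 19.4.
IC50_NM = {
    "Human cell line U373": 0.40,
    "Human cell line IM9": 0.35,
    "Guinea pig brain": 0.32,
    "Guinea pig lung": 0.34,
    "Rabbit brain": 0.54,
    "Mouse brain": 32.0,
    "Rat brain": 35.0,
    "Chicken brain": 156.0,
}

# Grouping taken from the text: high affinity in human, guinea pig, and rabbit
# preparations, low affinity in mouse, rat, and chicken brain.
LOW_AFFINITY = ("Mouse brain", "Rat brain", "Chicken brain")
high_affinity = [v for k, v in IC50_NM.items() if k not in LOW_AFFINITY]


def fold_range(ic50, reference_values):
    """Return (smallest, largest) fold increase of ic50 over the reference values."""
    return ic50 / max(reference_values), ic50 / min(reference_values)


for system in LOW_AFFINITY:
    low, high = fold_range(IC50_NM[system], high_affinity)
    print(f"{system:14s} IC50 is {low:.0f}- to {high:.0f}-fold higher")
```

Across the three low-affinity preparations the ratios run from just under 60 (mouse versus rabbit) to just under 500 (chicken versus guinea pig brain), whereas the binding of substance P itself varies by less than a factor of 10 over all eight systems.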
The differences between humans and individual animal species are not surprising
considering that the amino acid sequence of the receptor proteins usually
differs in multiple positions. The use of human proteins in molecular test systems
is just as critical for the relevance of the achieved results as it is for the
determination of the 3D structures (▶ Chaps. 13, “Experimental Methods of Structure
Determination” and ▶ 14, “Three-Dimensional Structure of Biomolecules”). This
can be seen very clearly in the results for the aspartic protease renin (▶ Sect. 24.2).
The inhibitors remikiren 19.6 and aliskiren 19.7 were tested on renins from
different species. The renins of two primate species and humans were inhibited at
very low concentrations. On the other hand, the renins from the rat and the dog,
which are the two species most commonly used in cardiovascular pharmacology,
were inhibited only at conspicuously higher concentrations (Table 19.5). Remikiren
would have indeed been found in a classical test for blood-pressure-lowering

Table 19.5 Inhibition of the renins of humans and other animal species by remikiren 19.6 and
aliskiren 19.7.

[Chemical structures: 19.6 Remikiren, 19.7 Aliskiren]

Renin from:    IC50 in nM, Remikiren    IC50 in nM, Aliskiren
Human          0.8                      0.6
Monkey         1.0                      1.72
Dog            107                      7
Rat            3,600                    80

effects, but it would have been judged to be much too weak. A comparison of the
X-ray structure analysis of the renins from the mouse and human also shows
a conserved binding mode in the main chain of the peptide inhibitors that is
common to other aspartic proteases. Subtle differences are found at the rim of the
binding pocket arising from sequence differences between the species.
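
Species differences of this size are easier to compare on a logarithmic scale. As a minimal sketch, the IC50 values of Table 19.5 can be converted to pIC50 values and the loss of potency relative to human renin expressed in log units. The helper name pic50 is an illustrative assumption; the definition pIC50 = −log10(IC50 in mol/L) is the customary one.

```python
import math

# IC50 values in nM, copied from Table 19.5.
IC50_NM = {
    "Remikiren": {"Human": 0.8, "Monkey": 1.0, "Dog": 107.0, "Rat": 3600.0},
    "Aliskiren": {"Human": 0.6, "Monkey": 1.72, "Dog": 7.0, "Rat": 80.0},
}


def pic50(ic50_nm):
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - math.log10(ic50_nm)


for inhibitor, by_species in IC50_NM.items():
    human = pic50(by_species["Human"])
    for species, ic50 in by_species.items():
        loss = human - pic50(ic50)  # potency loss relative to human renin, log units
        print(f"{inhibitor:10s} {species:7s} pIC50 = {pic50(ic50):5.2f}  loss = {loss:5.2f}")
```

For rat renin the loss amounts to about 3.7 log units for remikiren and about 2.1 log units for aliskiren, which is why a screen based on rat renin alone would have rated remikiren as much too weak.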
The amino acid sequences of 5-HT1B and 5-HT1Dβ subtypes of the serotonin
receptors of humans and rats show more than 90% identity. If the relationships
between the individual amino acids are considered, a homology of 95% is obtained.
Despite these similarities, a series of active substances bind with very different
affinities to these two receptors. The difference is traceable to a single amino acid:
the exchange of threonine 355 for an asparagine (Fig. 19.11). The human receptor
is, from the point of view of the affinity, converted to the rat receptor by this
mutation! After the exchange of this amino acid, the β-blockers propranolol and
pindolol bind with approximately three orders of magnitude higher affinity.
The affinities of many other ligands, on the other hand, are significantly reduced.

[Plot: log 1/Ki (rat) versus log 1/Ki (human); labeled ligands: 5-carboxamidotryptamine (5-CT),
5-hydroxytryptamine, (−)-propranolol, pindolol, metergoline, sumatriptan, methysergide,
5-OMe-diMe-tryptamine, rauwolscine, N,N′-dipropyl-5-CT]

Fig. 19.11 Different serotonin receptor ligands and the β-blockers propranolol and pindolol show
very different binding affinities on very similar 5-HT receptors from rats and humans. The open
circles refer to the wild-type human receptor. They are irregularly distributed over the diagram
(correlation coefficient r = 0.27). If one amino acid in the human receptor is exchanged for the
corresponding amino acid in the rat receptor, the binding profile changes. Relative to the affinity of
the ligands, the human receptor becomes a rat receptor. The black-filled circles refer to this
Asn355 mutant (correlation coefficient r = 0.98).

This may indeed only be a weak indication, but it can be speculated that the two
β-blockers bind to the mutated 5-HT receptor as they do to the β-receptor.

19.12 Toxicity and Adverse Effects

One of the most difficult chapters in preclinical research is the estimation of the
toxicity of a substance, above all the human toxicity, from data that were obtained
from other species. Such considerations must be made to be able to estimate the
potential danger of the substance before it is introduced to the clinic. Are there any
drugs without toxicity and without side effects? Paracelsus recognized in the
sixteenth century:
Everything is poison and nothing is without poison, it is the dose alone that makes a thing
non-poisonous.

Friedrich Schiller had his Fiesko say:


A desperate evil needs a bold medicine.

Table 19.6 Acute toxicity of lysergic acid diethylamide (LSD, 2.21, ▶ Sect. 2.5, ▶ Fig. 2.8) in
different species and in humans (LD50 = dose that was lethal for 50% of the animals).

Species     Toxicity; LD50 (in mg/kg)
Mouse       50–60
Rat         16.5
Rabbit      0.3
Elephant    <<0.06
Human       >>0.003

And the pharmacologist Gustav Kuschinski formulated:


Whenever it is proclaimed that a substance has no side effects, the urgent suspicion ensues
that there is also no main effect.

The determination of the acute toxicity in multiple animal species and the
determination of the chronic toxicity in at least two animal species are routine before
entry into phase I clinical trials, which test tolerability in healthy volunteers.
It is considered standard that the species for the chronic toxicity investigations
are selected according to which animal species display the most similarity to
humans in their pharmacokinetics and metabolism.
Cats and guinea pigs react extremely sensitively to cardiac glycosides. Therefore
they were previously used as models for the effect on humans. Rats react consid-
erably less sensitively. The hallucinogen lysergic acid diethylamide (LSD 2.21,
▶ Sect. 2.5) shows decidedly different toxicity in multiple animal species. An
experiment to test the hallucinogenic effects of LSD on an elephant led to
a disaster. A hallucinogenic, but non-toxic dose was desired. Despite carefully
estimating this dose, the elephant died within minutes after 0.3 g of LSD
(corresponding to 0.06 mg/kg) was administered. Relative to the mouse, which is
relatively insensitive (Table 19.6), the elephant reacted at least 1,000 times more
sensitively. This experiment was not repeated! The discoverer of LSD, Albert
Hofmann, took 0.25 mg of LSD in his first controlled self-experiment. With
about 0.0035 mg/kg he was significantly below the dose that cost the elephant its
life. Despite this, it can be assumed that LSD is less toxic for humans than it is for
elephants. Direct fatality through LSD is not known, only mortality that occurs as
a result of accidents or from suicide while in the psychotic state.
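
The dose figures quoted above can be cross-checked with a few lines of arithmetic. The following sketch uses only the numbers given in the text and in Table 19.6; the variable names are illustrative, and the body masses are merely what the quoted doses imply, not independently reported values.

```python
# Figures quoted in the text and in Table 19.6.
ELEPHANT_DOSE_MG = 300.0            # 0.3 g of LSD
ELEPHANT_DOSE_MG_PER_KG = 0.06
HOFMANN_DOSE_MG = 0.25
HOFMANN_DOSE_MG_PER_KG = 0.0035
MOUSE_LD50_MG_PER_KG = (50.0, 60.0)

# Body masses implied by the absolute and the weight-normalized doses.
elephant_mass_kg = ELEPHANT_DOSE_MG / ELEPHANT_DOSE_MG_PER_KG
hofmann_mass_kg = HOFMANN_DOSE_MG / HOFMANN_DOSE_MG_PER_KG

# Ratio of the mouse LD50 to the dose that killed the elephant.
mouse_vs_elephant = tuple(ld50 / ELEPHANT_DOSE_MG_PER_KG for ld50 in MOUSE_LD50_MG_PER_KG)

# Ratio of the elephant dose to Hofmann's dose, both per kg of body weight.
elephant_vs_hofmann = ELEPHANT_DOSE_MG_PER_KG / HOFMANN_DOSE_MG_PER_KG

print(f"Implied elephant mass: {elephant_mass_kg:.0f} kg")
print(f"Implied body mass for Hofmann: {hofmann_mass_kg:.0f} kg")
print(f"Mouse LD50 / elephant dose: {mouse_vs_elephant[0]:.0f} to {mouse_vs_elephant[1]:.0f}")
print(f"Elephant dose / Hofmann dose: {elephant_vs_hofmann:.0f}")
```

The quoted doses imply an elephant of about 5,000 kg and a body mass of about 71 kg for Hofmann. The mouse LD50 is roughly 800 to 1,000 times the per-kilogram dose that killed the elephant, and that dose was in turn about 17 times Hofmann's.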
The toxicities of poisons that end up in our environment are investigated very
thoroughly. Chlorinated dibenzodioxins and furans form during the uncontrolled
chemical decomposition of the correspondingly substituted chlorophenols. The
Seveso accident is attributed to such an incident. Toxic chlorinated dioxins and
furans also occur during many burning processes. Tetrachlorodibenzodioxin 19.8
(TCDD, “Seveso dioxin”) is one of the best-investigated substances
regarding its toxicity. Even here, different species react differently (Table 19.7).
Three orders of magnitude difference is found between the two relatively closely
related species of hamster and guinea pig. Accordingly, it is difficult to draw
conclusions about the toxicity in humans. If an extrapolation is made from
primates to humans, TCDD would be classified as relatively non-toxic. In connection
with humans, the definition of an acute LD50 is absolutely inappropriate.

Table 19.7 Acute toxicity of tetrachlorodibenzodioxin 19.8 in different animal species.

[Chemical structure: 19.8 2,3,7,8-Tetrachlorodibenzodioxin]

Species       Toxicity (LD50 in µg/kg)
Mouse         114–280
Rat           22–320
Hamster       1,150–5,000
Guinea pig    0.5–2.5
Mink          4
Rabbit        115–275
Dog           >100, <300
Monkey        <70
Human         ?

To be able to exclude one fatality per one million people, an “LD0.00001” must be
determined or calculated. Because of its pronounced mutagenic effects, the
long-term damage stands in the foreground with TCDD. It is questionable in this
case whether an absolute no-effect level, that is, the lowest ineffective dose, can
be defined. The estimation of the potential danger of environmentally
relevant chemicals looks entirely different if considered relative to toxic natural
products, natural radioactivity, cosmic radiation, etc., or even when compared
to socially tolerated substances of abuse such as alcohol and nicotine. This
puts some things into perspective that are very contentiously discussed in public
forums.
A difficult problem must be mentioned when discussing structure–activity
relationships from in vitro investigations in order to estimate the mutagenic and
carcinogenic potential. Such tests do afford valuable information, but it must be
carefully checked. In individual cases they constitute proof in neither the positive
nor the negative sense.
Developing theoretical models for the estimation of toxicity and carcinogenicity that
have adequate reliability and predictive power has proven to be extremely difficult.
The mechanisms that are responsible for the activity are too diverse and multifaceted,
and the chemical structures and structure–activity relationships, which are
only valid for one substance class, are too different. Today, testing for toxic,
carcinogenic, and teratogenic adverse effects has reached a high standard. The
pharmaceutical catastrophes of earlier decades such as the following would be
almost impossible with today’s standards:
• Early childhood brain damage and death of many premature and mature new-
borns by the sulfonamides in the late 1930s,

• Over 100 fatalities in the USA because of the use of diethylene glycol as
a solvent for sulfanilamide (this incident led to the Federal Food, Drug, and
Cosmetic Act of 1938, which greatly extended the authority of the Food and
Drug Administration, FDA),
• The SMON (subacute myelo-optic-neuropathy) illness of thousands of Japanese,
caused by the prolonged and too-frequent use of an antidiarrheal medicine,
• The severe birth defects of approximately 10,000 children worldwide that were
caused by thalidomide (Contergan ®) in the late 1950s.
Nonetheless, criminal intrigue and the uncontrolled distribution of faked drugs
from internet-based providers or the unscrupulous pursuit of economic advantages
can still cause such catastrophes today. The melamine-contaminated baby formula
(melamine makes the protein content of inferior or diluted milk seem higher) in
September 2008 in China, from which many thousand toddlers and babies were
sickened and several even died, serves as an example.
Moreover, in addition to the markedly stricter testing guidelines for medicines
that exist today in most countries, there is a reporting system that registers and
investigates adverse drug effect incidents. The slightest suspicion of a causal
relationship results in anything from public announcement or warning all the way
to the withdrawal of the marketing license.
A complication for the estimation of the toxicity is the formation of toxic, and
particularly reactive metabolites, even in small amounts. As was already discussed
in ▶ Sects. 9.1 and 19.6, an ideal drug should contain predetermined cleavage and/
or conjugation sites in addition to finely tuned pharmacodynamics and pharmaco-
kinetics. The more these requirements are fulfilled, the lower the risk that the
substance will exert toxic effects.
Some toxicity studies suffer from the fact that the extrapolated results to humans
reflect a higher toxicity than is actually the case because of the unphysiologically
high doses that are used in the studies. On the other hand, even the most compre-
hensive investigation cannot eliminate the risk of serious adverse effects occurring
in extraordinarily rare cases once the drug is used broadly. An adverse effect ratio
of 1:10,000 or less can remain undiscovered in even the most careful preclinical and
clinical trial.
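
Why an adverse effect ratio of 1:10,000 can go unnoticed follows from simple probability. The sketch below assumes that cases occur independently and with the same probability in every patient; the trial size of 3,000 patients is a hypothetical value chosen only for illustration, and the function names are likewise assumptions.

```python
import math


def prob_no_event(incidence, n_patients):
    """Probability that an effect with the given incidence occurs in none of n patients."""
    return (1.0 - incidence) ** n_patients


def patients_needed(incidence, detection_prob=0.95):
    """Smallest number of patients giving at least one case with probability detection_prob."""
    return math.ceil(math.log(1.0 - detection_prob) / math.log(1.0 - incidence))


INCIDENCE = 1 / 10_000   # adverse effect ratio quoted in the text
TRIAL_SIZE = 3_000       # hypothetical trial size, chosen only for illustration

print(f"P(no case among {TRIAL_SIZE} patients) = {prob_no_event(INCIDENCE, TRIAL_SIZE):.2f}")
print(f"Patients for a 95% chance of at least one case: {patients_needed(INCIDENCE)}")
```

Under these assumptions a trial with 3,000 patients would show no case at all in about three out of four instances, and roughly 30,000 patients would be needed for a 95% chance of seeing even a single case.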
Toxic side effects in humans are seen particularly after chronic pharmaceutical
misuse. The life-long consumption of large amounts of pain medication adds up to
kilogram amounts. In the case of phenacetin (▶ Sect. 2.1), the consequence was
that an effective and, in principle, well-tolerated drug had to be withdrawn
from the market because of the kidney damage that resulted from inappropriate
(abusive) use.

19.13 Animal Protection and Alternative Test Models

Back in 1780 the philosopher Jeremy Bentham discussed the rights of animals. The
first mass protests against animal experiments were over 100 years ago. In 1875 the
dedicated animal protectionist Frances Power Cobbe founded the first Society
Against Vivisection in England, and the demand that anesthesia be administered

during animal experiments led to the first animal protection law a year later. In
Germany in 1879 the Internationale Gesellschaft zur Bekämpfung der
Wissenschaftlichen Thierfolter (International Society for the Abatement of Scien-
tific Animal Torture) was founded, and in 1883 the American Antivivisection
Society followed. A new militant form of protest against animal experiments,
complete with the violent freeing of experimental animals and attacks on scientists,
came in the 1970s. The book Animal Liberation by Peter Singer was published in
1975 and became the bible of all animal rights activists. The often-cited story of
animal trappers who hawk their prey to the pharmaceutical industry belonged to the
realm of fantasy, even in the early days of drug research. Any pharmacologist
knows that any results obtained from such diverse animals, without any knowledge
of their health history, would be entirely useless.
More in parallel to the development of the animal protection movement than
inspired by it, in the 1960s alternative methods for pharmaceutical research were
implemented that consisted mostly of binding studies on membrane homogenates
and cell culture investigations. The number of animal experiments has been signif-
icantly reduced in the last decades because of the economic motivation arising from
the enormously high costs of breeding and keeping experimental animals, but also
because of the rapid progress of gene technology. As already explained in
▶ Sect. 7.5, models with lower animals such as pinworms, fruit flies, or zebra fish
have been increasingly used in screening. Here the ethical acceptance limit for
animal experiments is certainly lower.
Over 50% of the experimental animals are used for the testing of drugs, 12–15%
in basic research, for investigating medical methods, and for the recognition of
environmental danger. About half of all experimental animals are mice. The rest are
rats and other rodents, and a small part are fish and birds. Only about 1.5% of the
total number are cats, dogs, pigs, and other animals. A large part of the investiga-
tions on the latter species are chronic toxicity studies that are required by law.
The reduction in these numbers is notable because pharmaceutical companies
are investigating the biological activity of more substances than ever before. Each
year, tens or even hundreds of thousands of substances are meticulously character-
ized, usually in automated in vitro tests. Only a few of these substances reach
animal experiments. The number of experimental animals is also to be seen in the
context of the legal requirements of proof of efficacy and safety of new drugs,
which are increasing rather than diminishing. Such experiments must still be
overwhelmingly carried out on animals.

19.14 Synopsis

• Apart from potent and selective binding to a target protein, a successful drug
candidate must exhibit favorable pharmacokinetics. This comprises all processes
that affect the absorption, distribution, metabolism, and excretion along with
minor toxic side effects.

• Due to high costs and enormous experimental effort, full pharmacokinetics and
toxicity studies can only be carried out on a few drug development candidates.
A plethora of test methods have been developed to relate chemical structure with
ADME and toxicology properties in the lead-optimization phase to reduce the
chances of failure due to insufficient pharmacokinetics at a late stage.
• An active substance has to penetrate multiple lipid membrane barriers and aque-
ous compartments on its way from the site of application to the locus of the target
protein. To achieve sufficient distribution, adequate lipophilicity must be present.
This is described by the partition coefficient between the lipid and aqueous phases.
In the simplest model, the distribution between octanol and water is measured.
• Rather sophisticated models have been established to relate chemical structure
with penetration properties. Considerations about the release of the water solva-
tion shell around a drug molecule and its potential to form hydrogen bonds upon
crossing lipid membranes are particularly important.
• Many drugs are either weak acids or bases. Depending on the applied pH, they
exist through dissociation equilibria in either a more lipophilic neutral or more
polar ionized form. Membrane penetration of such species will therefore depend
very much on the local pH conditions.
• Because of the progressively changing pH conditions in the stomach and intes-
tines, at some place along the gastrointestinal tract appropriate pH conditions
exist that allow sufficient penetration of the neutral form of weakly acidic or
basic drug molecules.
• Because of established equilibria, small amounts of the neutral form of an acidic
or basic drug molecule are the intermediate over which the membrane penetra-
tion occurs. Constant removal of the neutral species from the water phase into
the membrane is quickly replenished by the dissociation equilibrium.
• The adjustment of the lipophilicity of a drug is crucial for pharmacokinetics.
Usually the more lipophilic a compound is, the better it will be absorbed;
however, limited solubility in the aqueous phase restricts lipophilicity. Relevant
test models have been developed by using thin layers of human colon cells.
These also allow the absorption by transporters to be studied.
• Active substances are initially tested in simple in vitro test models. Testing is
gradually moved into animal models via cellular assays. Relevant activity–
activity relationships must be established to correlate response in animal models.
At best, results from appropriate in vivo testing allow the therapeutic effects in
humans to be predicted. They standardize test data and reduce the amount of
animal experiments that are required.
• Nature works with two orthogonal principles upon release of its native sub-
stances: the specificity of the biological effect and a pronounced spatial com-
partmentalization. Some compounds are highly specific and travel long ways
through the organism to exert their action. Others are locally synthesized and
stimulate their target protein in the immediate vicinity. Here, high specificity and
selectivity are not required.
• Drugs administered orally or intravenously act systemically on the entire organ-
ism; no organ or cell-specific compartmentalization can be achieved. This has to
be compensated for by sufficient specificity and selectivity. Whether high
isoform selectivity or protein-family-wide promiscuity is required depends
very much on the mode of action and biological function of the target protein.
• Prior to administration in humans, clinical candidates are tested in animals. To
draw conclusions about humans from animals, it must be considered that test
models strongly depend on the species used. Even metabolism can be very
different in humans and various animal species.
• Deviating therapeutic responses in animals and humans are also related to small
differences in the amino acid composition of the target proteins in various
species.
• Estimations of human toxicity have to be made from data that were obtained
from other animal species. Chronic toxicity is routinely determined in two
animal species and must be evaluated in animals that display the most similarity
to humans in their pharmacokinetics and metabolism.
• Different species, however, react differently to active substances and frequently
show several orders of magnitude differences in toxicity.
• Today, testing for toxic, carcinogenic, and teratogenic adverse effects has
reached a high standard. The pharmaceutical catastrophes of earlier decades,
which were mostly caused by these adverse effects, will hopefully be almost
impossible nowadays.
• Even the most comprehensive toxicity studies cannot eliminate the risk of severe
adverse effects occurring in extremely rare cases once a new drug is adminis-
tered broadly.
• The animal experiments of the past have been replaced by more conclusive
binding studies on membrane homogenates and cell cultures. Whole-animal
testing has been shifted toward lower animals such as pinworms, fruit flies, or
zebra fish. The toxicity studies required by law make up a large part of the
animal experiments that are done today.
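
The pH dependence of the neutral fraction summarized in the synopsis can be illustrated with the Henderson–Hasselbalch relationship. The sketch below assumes a monoprotic weak acid or base; the pKa values and the pH values are hypothetical and serve only as an illustration, as does the function name neutral_fraction.

```python
def neutral_fraction(pka, ph, kind):
    """Fraction of a monoprotic weak acid or base present in its neutral form."""
    if kind == "acid":      # HA <-> A- + H+ ; the neutral form is HA
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    if kind == "base":      # BH+ <-> B + H+ ; the neutral form is B
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical compounds and pH values, chosen only for illustration.
EXAMPLES = [("weak acid, pKa 4.5", 4.5, "acid"), ("weak base, pKa 9.0", 9.0, "base")]

for label, pka, kind in EXAMPLES:
    for ph in (2.0, 6.0, 7.4):
        fraction = neutral_fraction(pka, ph, kind)
        print(f"{label:20s} pH {ph:3.1f}: neutral fraction = {fraction:.2e}")
```

At pH = pKa exactly half of the compound is neutral, and each further pH unit changes the ratio of neutral to ionized form by a factor of 10. Even a small neutral fraction can suffice for absorption because, as stated above, the dissociation equilibrium quickly replenishes the neutral species removed into the membrane.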

20 Protein Modeling and Structure-Based Drug Design

Structure-based drug design focuses on the search, design, and optimization of
a small molecule that fits well into the binding pocket of a target protein to form
energetically favorable interactions. Initially, a detailed analysis of the protein is
performed. All information about its structure and that of related proteins is
evaluated. Next, the properties of the binding pocket are thoroughly explored,
and areas are sought where an optimal binding is to be expected. Experimental
techniques as well as computer methods are employed to discover a lead structure
from a screening library (▶ Chap. 7, “Screening Technologies for Lead Structure
Discovery”). Alternatively, approaches are also applied that begin with a small
molecular “seed” in the binding pocket, which is then allowed to “grow” to a potent
ligand by using a stepwise iterative design. This approach uses fast docking
techniques that propose relevant binding geometries. The geometries are evaluated
with a scoring function, which estimates whether they are energetically favorable.
A decisive prerequisite for the use of structure-based drug design is, however,
knowledge of the spatial structure of the target protein. Impressive progress
in the area of structure determination (▶ Chaps. 13, “Experimental Methods of
Structure Determination” and ▶ 14, “Three-Dimensional Structure of Biomolecules”)
has led to the situation that the 3D structures are known for many
therapeutically relevant proteins, or they can be determined early at the beginning
of a project. Nonetheless, it must not be overlooked that experimentally determined
spatial structures are still unavailable for many interesting target proteins.
Thanks to the sequencing of the human genome, the blueprints for all of the
proteins of our species are now known at the sequence level. The genomes of many
pathogens have also been determined, and new ones are reported weekly. How can
this enormous progress in information delivery be exploited for the design of new
drugs? Unfortunately, the way from the primary structure, that is, the amino acid
sequence, to the 3D structure is still very difficult, and even today is only reliably
possible by using experimental structure-determination methods (▶ Chap. 13,
“Experimental Methods of Structure Determination”). Methods to predict the
spatial structure ab initio are the subject of intensive basic research. They are still
a long way away from reliable, routine use, and do not afford the structural accuracy

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_20, 429


© Springer-Verlag Berlin Heidelberg 2013

that is needed for structure-based design. The “folding problem”, that is, the
prediction of 3D protein structure from the amino acid sequence alone, is still
unsolved. The situation occurs increasingly often, however, that the structure of the
protein of interest is unknown, but the structure of another related protein has been
solved. In such a situation, a model of the unknown protein can be constructed on
the basis of the spatial coordinates of the already characterized biopolymer.
Because of this, at least from the point of view of basic research, it could come to
pass that the extremely exciting question of finding Nature’s rules to the folding
problem will increasingly lose its significance.

20.1 Pioneering Studies in Structure-Based Drug Design

If one considers that most of the structures of therapeutically relevant proteins were
determined in the last 15 years, it is even more impressive that the first work in
structure-based drug design was carried out in the 1970s. The pioneers in this area
were Chris Beddell and Peter Goodford, who began to develop methods for ligand
design in 1973 at the Wellcome Research Laboratories. Hemoglobin was chosen as
a protein because it was the only known protein structure at the time that had some
relevance for pathophysiology. The goal of this work was to find a ligand that would
exert an allosteric modulating effect, analogous to diphosphoglyceric acid 20.1
(DPG; Fig. 20.1). The hope was to find a therapeutic approach that could help
homozygous patients with lethal sickle cell anemia (▶ Sect. 12.13). DPG is
synthesized in red blood cells. It binds to hemoglobin and reduces its affinity for oxygen.
In this way, oxygen that is absorbed in the lungs can be released in other tissues.
The part of hemoglobin that binds to DPG contains a large number of positively
charged amino acids (Fig. 20.1). An optimal ligand should therefore contain
a negatively charged group to form multiple salt bridges to hemoglobin just
as DPG does. However, such compounds cannot penetrate the membrane of
a red blood cell. Therefore structures that interact with hemoglobin in other ways
were considered in the Wellcome research group. Compounds were chosen
containing reactive groups that could react with the amino groups of the lysines

Fig. 20.1 Schematic binding mode of diphosphoglyceric acid 20.1 (DPG) to the allosteric
binding site of hemoglobin. The ligand is bound through multiple charge-assisted hydrogen
bonds (N-terminal amino groups, His2, Lys82, and His143) from the β1 and β2 subunits.

Fig. 20.2 Structures of the diphosphoglyceric acid competitive hemoglobin ligands 20.2–20.5
that were developed by Beddell and Goodford: the dialdehydes 20.2 (R = H) and 20.3
(R = OCH2COOH), and their bisulfite adducts 20.4 (R = H) and 20.5 (R = OCH2COOH).

Fig. 20.3 Postulated binding mode of the hemoglobin ligands 20.2 and 20.5 after chemical
reaction to the Schiff base or the bisulfite addition product. It is assumed for both compounds
that they bind covalently to the β1 and β2 subunits of hemoglobin through their N-terminal amino
acids. Compound 20.5 should also be able to form hydrogen bonds with its charged groups to the
side chains of amino acids His2 and His143 of the β1 and β2 subunits, as well as Lys82 of the β1
subunit.

in the binding pocket, or with the terminal valine. The idea was to design
a compound containing two correctly spaced reactive groups that could form Schiff
bases with two of these amino groups. Dibenzyl-4,4′-dialdehyde 20.2 (Fig. 20.2)
was chosen as a parent structure. The assumed binding mode of this compound is
shown in Fig. 20.3. Compound 20.2 was synthesized but proved to be too insoluble
for testing. Adequate solubility was achieved by introducing an additional carboxyl

group in 20.3. Moreover, this compound with its carboxyl group should form an
additional favorable interaction with the lysine side chain of the protein. Compounds
20.4 and 20.5 are the bisulfite adducts of the corresponding aldehydes. These
compounds were tested and indeed showed the desired allosteric effect. However,
they bind to the oxy form of the protein and increase its oxygen affinity. They proved
to be potent inhibitors of the erythrocyte deformation that occurs in sickle cell disease
because they stabilize the oxy form. The deformation begins with the aggregation of
the deoxy form. The targeted design of these dibenzyldialdehydes is the first
example of a rational, structure-guided protein-ligand design.

20.2 Strategies in Structure-Based Drug Design

The first step in the design of ligands for a protein with a known 3D structure is the
precise analysis of this structure. What does the protein’s binding pocket look like?
Where are the hot spots, that is, where can the ligand functional groups bind
particularly well? Today, computer programs are available for just such an analysis.
They search the surface of a protein for suitable binding sites for different func-
tional groups. These methods have been introduced in ▶ Sect. 17.10.
Experimental techniques can also help in the search for hot spots. X-ray structure
analysis and NMR spectroscopy are particularly suitable methods (▶ Sects. 7.8 and
▶ 7.9). Alexander Klibanov and Dagmar Ringe first described this strategy. They
initially grew crystals of the enzyme elastase in water and determined its X-ray crystal
structure. Then the crystals were soaked in the organic solvent acetonitrile, and the 3D
structure was determined again. It was shown that the protein on the whole maintains its
structure. On the other hand, significant changes were found in the solvent structure.
Water molecules from the previously determined structure were displaced by acetoni-
trile. Other water molecules remained in their original positions, just as before. This
experiment allowed the differentiation of displaceable versus non-displaceable water
molecules, which presumably relates to their stronger or weaker binding. Furthermore,
additional preferred binding sites of the organic solvent molecule were identified. They
helped to identify and experimentally map out the energetically favorable binding sites
in the protein pocket. At Abbott, NMR spectroscopy was used analogously to elucidate
binding sites by using small molecular probes.
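The comparison of solvent positions between the aqueous and the acetonitrile-soaked crystal structure can be sketched as a simple distance check. The following Python sketch is only an illustration: the function name `classify_waters`, the 1.0 Å tolerance, and all coordinates are assumptions and are not taken from the original study. It presumes that both structures have already been superimposed.

```python
import math

def classify_waters(waters_aqueous, waters_soaked, tolerance=1.0):
    """Split the water sites of the aqueous structure into conserved and displaced ones.

    A water counts as conserved if the soaked structure contains a water
    within `tolerance` (in angstroms, an assumed cutoff) of the same position.
    """
    conserved, displaced = [], []
    for site in waters_aqueous:
        retained = any(math.dist(site, other) <= tolerance for other in waters_soaked)
        (conserved if retained else displaced).append(site)
    return conserved, displaced

# Hypothetical water oxygen coordinates (angstroms) after superposition.
aqueous = [(1.0, 2.0, 3.0), (5.0, 5.0, 5.0), (9.0, 1.0, 4.0)]
soaked = [(1.2, 2.1, 3.0), (9.0, 1.3, 4.1)]

conserved, displaced = classify_waters(aqueous, soaked)
print("conserved:", conserved)
print("displaced:", displaced)
```

In a real analysis the displaced sites would additionally be checked for an organic solvent molecule at the same position, which is what marks them as candidate hot spots.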
Often, binding affinities for some initial ligands are already available at the time
of the structure elucidation of the target protein. Based on the protein structure, first
attempts are made to derive a crude structure–activity relationship for these first
ligands. Conclusions can be drawn about the essential interactions between the
protein and ligands. If further compounds are discovered by high-throughput
screening (▶ Sect. 7.3), they are docked into the binding pocket to generate ideas
for their subsequent structural optimization. This way, additional areas in the
binding pocket can be identified that are not yet occupied by known ligands. The
indicated additional interactions can be exploited with appropriately modified
compounds to arrive at more potent and selective binders. Ideas for simplifying
a ligand can also be extracted from the information provided by the 3D structure:

a substituent of the inhibitor that is not involved in a favorable contact with the
protein can often be removed. On the other hand, such a substituent can be
purposefully changed to improve the solubility, lipophilicity, or transport and
distribution (ADME) properties of the drug candidate (see ▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”).
In principle, there are two approaches for the design of new active substances.
Either an attempt can be made to find an entirely new structure (Sect. 20.10), or
a known lead structure that was discovered with the techniques discussed in
▶ Chap. 7, “Screening Technologies for Lead Structure Discovery” can be modi-
fied. The modification of a known structure has the advantage that potent and
selective protein ligands can be arrived at relatively quickly. Furthermore, proteins
with known 3D structure, in general, afford more meaningful structure–activity
relationships. The proposed structures, however, will remain very close to the initial
lead structure. Often the 3D structure of an enzyme in complex with a peptide
inhibitor is determined at the beginning. The initially modified lead structures are
usually also peptides. The way to an orally available drug is then rather protracted
under these conditions (▶ Chap. 10, “Peptidomimetics”). The second approach is
represented by de novo design. The corresponding techniques are introduced at the
end of this chapter. De novo design can lead to entirely new, non-peptide structures.
The problem is, however, that this method often leads to an enormous variety of
possible structures that are difficult to prioritize and rank reasonably.
The most important prerequisite for successful structure-based drug design is an
iterative approach. The 3D protein structure is the starting point for the initial
design of a ligand, which is then to be synthesized and tested. In the case of good
binding, an attempt is made to determine the 3D structure of the protein–ligand
complex with the new compound. This structure is the starting point of the next
design cycle. The approach is summarized schematically in Fig. 20.4. The great
advantage of this technique is that all steps of the design hypotheses can be
validated with each cycle. Surprising binding modes that do not correspond to the
original design and that might mask a proper structure–activity relationship also
become immediately apparent. At this point it is also worthwhile to determine the
3D structure of a protein–ligand complex with a ligand that binds poorly with the
protein. This 3D structure usually provides an explanation for the poor binding, and
information is gained that can be translated into proposals for new structures.

20.3 Search Tools for Databases of Experimentally Determined Protein Complexes

The number of experimentally determined protein structures has grown exponentially
in recent years. In 1988, 200 3D structures were found in the Protein Data Bank
(PDB); in the meantime there are more than 89,000 entries, mainly of proteins
and protein–ligand complexes. This rapid growth of known spatial protein structures
stimulates the development of methods to use this structural information for
the design of new active compounds. Most of the available examples are still

[Figure: design cycle. 3D protein structure → ligand design → proposed ligand → synthesis → synthesized compound → testing → biologically active ligand → 3D structure determination → 3D protein structure]

Fig. 20.4 The starting point of a design cycle is the determination of the 3D structure of the target
protein. This information is used for the design of new protein ligands that are then synthesized
and tested. If they show activity, their 3D structures in complex with the protein are determined.
On the basis of these structures, ligands with improved binding properties are designed in the next
design cycle.

overwhelmingly globular, water-soluble enzymes. The number of novel membrane-bound
proteins, however, is increasing steadily. To really exploit this plethora
of structures, database tools are needed that can retrieve, correlate, and analyze
structures and structural motifs. Numerous programs are known that can compare
the sequence and folded structure of proteins. The database Relibase was developed
for the analysis of protein–ligand complexes. Sequence patterns in proteins can be
searched, and the connectivities of bound ligands can be compared with this
program. The database automatically superimposes proteins, and an optimal over-
lap of the binding pockets is sought. The aligned structures can be systematically
evaluated. Which amino acids are involved in the interactions with ligands? Which
functional groups do the ligands use to interact with protein amino acids? Which
side chains in the binding pocket are recurrently in an identical geometry, or
which ones show a high degree of flexibility? The water structure at the interface
between the protein and ligand can be studied in detail. A statistical evaluation
surprisingly showed that at least one water molecule is involved in ligand binding in
two thirds of all protein–ligand complexes. This underscores the importance of
considering water molecules in the modeling. Unfortunately, our concepts to
adequately treat water molecules are still rudimentary, even today.

20.4 Comparison of Protein Binding Pockets

Another important question addresses the shape and composition of binding
pockets. Are there other proteins with similar amino acid compositions in which
the pocket exhibits an analogous shape? Here the actual amino acids are less

important. Rather, the analogous physicochemical properties of exposed groups,
such as hydrogen-bond donors or acceptors that are oriented toward the binding
pocket, are crucial. Programs that make these comparisons possible describe the
shape and surface of protein pockets along with the exposed properties. The
function of proteins is frequently coupled to the recognition and binding of small-
molecule ligands or segments of peptide sequences (e.g., proteases). Once bound,
these molecules are chemically modified in the case of enzymes. With receptors,
the ligands are able to induce an effect within the receptor that, for example,
stabilizes an active or inactive conformation of the protein. In this way a signal is
transduced. The discovery of similarities in binding pockets can lead to the detec-
tion of functional similarities between the proteins. This is independent of whether
a sequence or folding homology exists between these proteins. There is also a chance
to find an unexpected cross-reactivity through similarities in the shape and properties
of binding pockets. Frequently such unexpected binding is the reason for adverse
effects. By evaluating similarities and differences in such pockets, it can also be
recognized how ligands are to be changed to achieve the desired selectivity for the
target protein. Valuable ideas for the design of new or altered protein ligands can be
generated if bound ligands or ligand building blocks in similar pockets are examined
and compared. Such comparisons also suggest isosteric replacements for the
structure-based optimization of the first lead structures. The search engine Cavbase,
implemented in the database Relibase, makes such pocket comparisons possible.
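The idea of comparing pockets by their exposed properties rather than by their amino acids can be illustrated with a small Python sketch. It is emphatically not the algorithm used by Cavbase; it is a simplified stand-in that describes each pocket by a histogram of distances between typed property centers. The property labels, the 2 Å bin width, the function names, and the coordinates are all assumptions made for this illustration.

```python
import math
from collections import Counter
from itertools import combinations

def pocket_fingerprint(centers, bin_width=2.0):
    """Count pairs of property centers by (type, type, distance bin).

    `centers` is a list of (property, (x, y, z)). Only distances between
    centers enter, so the result does not depend on the pocket orientation.
    """
    fingerprint = Counter()
    for (type_a, pos_a), (type_b, pos_b) in combinations(centers, 2):
        pair = tuple(sorted((type_a, type_b)))
        fingerprint[pair + (int(math.dist(pos_a, pos_b) // bin_width),)] += 1
    return fingerprint

def pocket_similarity(fp_a, fp_b):
    """Tanimoto-style overlap of two fingerprints, between 0 and 1."""
    keys = set(fp_a) | set(fp_b)
    shared = sum(min(fp_a[k], fp_b[k]) for k in keys)
    total = sum(max(fp_a[k], fp_b[k]) for k in keys)
    return shared / total if total else 0.0

# Hypothetical pockets: pocket_2 is pocket_1 in another orientation,
# pocket_3 presents a donor where pocket_1 presents an acceptor.
pocket_1 = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobic", (0, 5, 0))]
pocket_2 = [("donor", (10, 10, 10)), ("acceptor", (10, 13, 10)), ("hydrophobic", (10, 10, 15))]
pocket_3 = [("donor", (0, 0, 0)), ("donor", (3, 0, 0)), ("hydrophobic", (0, 5, 0))]

fp_1 = pocket_fingerprint(pocket_1)
print(pocket_similarity(fp_1, pocket_fingerprint(pocket_2)))
print(pocket_similarity(fp_1, pocket_fingerprint(pocket_3)))
```

Because no sequence or fold information is used, two unrelated proteins can score as similar, which is exactly the situation in which unexpected cross-reactivity would be suspected.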

20.5 High Sequence Identity Facilitates Model Generation

An indispensable prerequisite for applying the arsenal of structure-based
drug design methods is the existence of a spatial structure. A crystal structure cannot always
be obtained. Under what conditions can the model of an unknown protein be
constructed from a given sequence?
Proteins with similar function originating from different species differ in their
amino acid sequences. With increasing distance in the phylogenetic tree, these
differences increase. Take the example of cytochrome C (Fig. 20.5). This broadly
distributed protein in mitochondria plays a central role in the respiratory chain. It
consists of a polypeptide chain with about 100 ± 20 amino acids. Three cytochromes
are shown in Fig. 20.6, which, despite their different peptide-chain lengths
and compositions, possess a very similar folding pattern. The proteins from the
phylogenetically related species human and chimpanzee have 100% sequence
identity. In contrast, the enzyme from yeast has only 45% identity with that of
these mammals. If the homology is very high, and only a few mutations are present,
model construction is relatively simple. If the sequence identity is
more than 90%, models can be constructed with uncertainties that approach
the error margins of experimental structure determinations (▶ Sect. 13.5). If the
sequence identity falls, model construction becomes less accurate. At 50%, the
mean error of the coordinates can amount to a few ångströms. Below an identity
of 25–30% the recognition of structural relationships becomes very problematic.
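The sequence identity referred to above can be computed directly from an alignment. The Python sketch below is a minimal illustration; the function name `percent_identity` is hypothetical, and the choice to count only columns without gaps in the denominator is one common convention among several, so it is an assumption here. The two example rows are the first alignment block of sequences (b) and (c) as printed in Fig. 20.5, so the printed value describes only this short block, not the full proteins.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues among aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns with an insertion or deletion are skipped
        compared += 1
        identical += res_a == res_b
    return 100.0 * identical / compared if compared else 0.0

# First alignment block of Fig. 20.5, rows (b) and (c).
row_b = "-EGDAAAGEKVS-KKCLACHTFDQGGAN-----KVGPNLF"
row_c = "--GDVAKGKKTFVQKCAQCHTVENGGKH-----KVGPNLW"
print(f"{percent_identity(row_b, row_c):.0f}% identity in this block")
```

The result depends on the alignment itself; a different placement of the gaps changes the value, which is one reason why identities near the 25–30% limit are hard to interpret.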

(a) NEGDAAKGEKEF-NKCKACHMIQAPDGTDIKGGKTGPNLY
(b) -EGDAAAGEKVS-KKCLACHTFDQGGAN-----KVGPNLF
(c) --GDVAKGKKTFVQKCAQCHTVENGGKH-----KVGPNLW

(a) GVVGRKIASEEGFKYGEGILEVAEKNPDLTWTEANLIEYV
(b) GVFENTAAHKDNYAYSESYTEMKAK--GLTWTEANLAAYV
(c) GLFGRKTGQAEGYSYTDA-----NKSKGIVWNNDTIMEYI

(a) TDPKPLYKKMTDDKGAKTKMTFKMGKNQADVVAFLAQBBP
(b) KDPKAFVLEKSGDPKAKSKMTFKLTKDD--------EIEN
(c) ENPKKYI--------PGTKMIFAGIKKKGER-------QD

(a) BAGZGZAAGAGSBSZ
(b) VIAYLK------TLK
(c) LVAYLKSATS

Fig. 20.5 The primary sequences of three cytochrome C proteins arranged using the typical one-
letter code are shown from (a) the denitrifying bacterium Paracoccus denitrificans (134 amino
acids), (b) the proteobacterium Rhodospirillum rubrum (112 amino acids), and (c) the mitochon-
dria of a tuna fish (103 amino acids). The proteins vary in their length and composition. The
sequence comparison shows the alignment with the best agreement. Invariable or conserved
positions in the sequence are marked in bold. The abbreviations stand for A Ala, C Cys, D Asp,
E Glu, F Phe, G Gly, H His, I Ile, K Lys, L Leu, M Met, N Asn, P Pro, Q Gln, R Arg, S Ser, T Thr, V
Val, W Trp, Y Tyr. Dashes stand for areas in which other proteins carry additional amino acids
(insertions). The red bars underscoring the sequences show preferred helical areas.

The overwhelming portion of sequence differences between homologous proteins is
found at the protein surface in loop regions, which are not critical for protein folding
(▶ Sect. 14.4). Exchanges in the interior of the protein have a considerably larger effect
on its architecture. They are usually limited to amino acids with similar volumes and
with very similar physicochemical properties, for instance, the exchange of a leucine
for an isoleucine. Frequently the exchange of an amino acid is coupled with the
complementary exchange of one or more other amino acids in the direct vicinity.
This is particularly true when polar amino acids are exchanged in the protein interior
that are internally saturated by salt bridges. In the new mutated protein variants, these
amino acids establish a stable alignment. Because the spatial vicinity of the amino acid
residues in the fold need not be accompanied by sequential vicinity in the protein chain,
the recognition of such structural relationships is considerably complicated. Mutations
in the protein core can lead to an expansion, spatial shifts, or twisting of structural
building blocks of the protein.
If the identity is very high, only a few amino acid side chains must be exchanged.
The conformations of the involved side chains can be derived from a comparison
with the structurally resolved proteins showing these amino acids in a similar
environment. With decreasing identity, insertions and deletions in loop areas, that
is, an expansion or contraction in the polypeptide chain, must be considered.
Libraries of known protein structures have been compiled for the prediction of
the conformations of these loops during model construction. Based on the length

Fig. 20.6 Superposition of the folded structures of the three cytochrome C proteins from Fig. 20.5
based on a ribbon model: Paracoccus denitrificans in blue, Rhodospirillum rubrum in red, and
tuna fish in yellow (left). The cytochromes bind via a histidine and a methionine to an iron–heme
center. The structures were determined by X-ray crystallography. Structural deviations occur
particularly in the loop regions. On the right side the same superposition is shown, only here the
individual amino acids are color-coded. The same colors in all three ribbon models show identical
amino acids at different positions (color coding: Ala: light gray, Val: chartreuse, Gly: white, Ile:
bright green, Leu: olive green, Pro: pink, Phe: violet, Tyr: dark purple, Trp: light violet, Asp: dark
red, Glu: wine red, Asn: turquoise, Gln: cyan, Lys: blue, His: light blue, Arg: medium blue, Ser:
light orange, Thr: dark orange, Cys: light yellow, Met: dark yellow).

and sequence, these loops are classified into conformational families. They can be
retrieved with the computer and support the construction of the spatial arrangement
of a modified loop. The validation of the relevance of these protein models follows
empirical rules. They check whether their constructed geometry is in agreement
with experimental evidence. For example, it must be ensured that hydrophobic
groups are oriented in the interior, and hydrophilic groups are oriented largely on
the exterior. The contacts between amino acid groups are checked, and the chosen
torsion angles are compared with the typically observed values.

20.6 Secondary Structure Prediction and Amino Acid Replacement Propensities Support Model Building at Low Sequence Identity

If the sequence identity between the known and modeled protein falls below 30%,
the determination of structural homology becomes difficult. All additional infor-
mation must be employed as a resource. An attempt is made to estimate in which
sections of the polymer chain of the modeled protein particular secondary structure
elements are to be expected (▶ Sect. 14.2). If the frequency with which individual

amino acids occur in helices, pleated sheets, or loops is evaluated, significant
differences are found. For example, proline is considered to be a “helix breaker.” It
occurs at most in the first turn of a helix; in other positions it disrupts the geometry
and induces a kink. To determine whether a particular sequence segment folds as
a helix, pleated sheet, or loop, the information about positional preferences is
evaluated for multiple neighboring amino acids in an overlapping fashion.
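The evaluation of positional preferences over overlapping neighbors can be sketched as a sliding-window average. The following Python example is a minimal illustration under stated assumptions: the propensity values are invented placeholders that only reflect the qualitative trend (proline and glycine disfavor a helix), and the window length, threshold, and function name are likewise assumptions. Real prediction methods derive such numbers from statistics over known structures and use more elaborate rules.

```python
# Illustrative helix propensities only (assumed values; > 1 favors a helix).
HELIX_PROPENSITY = {"A": 1.4, "E": 1.5, "L": 1.2, "K": 1.1, "G": 0.6, "P": 0.6, "S": 0.8}

def helix_windows(sequence, window=5, threshold=1.05, default=1.0):
    """Return the start indices of overlapping windows with a helix-favoring mean.

    Each window of `window` residues is averaged; residues missing from the
    table are treated as neutral (`default`).
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        mean = sum(HELIX_PROPENSITY.get(res, default) for res in segment) / window
        if mean > threshold:
            hits.append(start)
    return hits

# Hypothetical peptide: a helix-favoring start followed by a Gly/Pro-rich stretch.
print(helix_windows("AELKAGPGS"))
```

Because neighboring windows overlap, a single unfavorable residue lowers the score of several windows at once, which mimics the way a proline interrupts a helical segment.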
Once analyzed in this way, the primary sequence is then compared with a reference
protein of known geometry. Because the 3D structure is given, the assignment of the
sequence to the secondary structural elements is straightforward. If not only one but
multiple 3D structures from homologous protein families are known, an attempt is
made to construct a representative profile of the expected secondary structure through
multiple sequence alignments. This profile serves as a reference for the mutual
comparison of sequences from structurally known or unknown proteins.
The reliability of this comparison can be improved. In the 1980s, the group of Tom
Blundell in London and Cambridge compiled a set of rules for the probability of the
mutual exchange of individual amino acids. In addition to the physicochemical prop-
erties of the amino acids, the local conformational properties, the main and side-chain
orientations, the accessibility to solvent molecules, and the involvement of hydrogen
bonds are analyzed. At the same time, the probability with which such a mutation could
occur at the DNA sequence level is considered for a particular amino acid exchange.
These properties can easily be determined for proteins with known spatial structures.
By comparing the structures within a set of homologous proteins, the probability of
the mutual replacement of amino acids is determined. For example, in contrast to other
amino acids, glycine has no side chain (Figure in the Appendix). It can therefore form
conformations in the polymer chains that are inaccessible for other amino acids for
steric reasons. The polymer chain may adopt such conformations in areas near the
protein surface, where the course of the chain changes direction. There the conforma-
tional flexibility of glycine plays an important role. It is just these solvent-exposed
glycines with unusual torsion angles that have proven to be largely conserved between
homologously folded proteins. Such conserved glycines can be searched for in
a sequence comparison for protein modeling. They represent anchor points to match
the sequence. Many other similar rules can be established. They serve as criteria for the
recognition of the structure-determining sequence regions. Then they are transferred to
the sequence to be modeled. Even with little sequence identity, structural homologies
between the primary sequence and a protein with known 3D structure can be recog-
nized. They are incorporated as criteria into the homology modeling. When modeling
G protein-coupled receptors, the most important group of membrane-bound receptors,
additional criteria are considered. The modeling must ensure that the hydrophobic
amino acids in the helical areas that are embedded in the membrane are oriented toward
the membrane environment. Meanwhile, the homology modeling programs have
achieved a high degree of automation. A server was established at the Biozentrum in
Basel that translates submitted sequences into 3D structures automatically. The pro-
gram Modeller from the group of Andrej Šali in San Francisco is able to assemble
protein models from the sequences of entire genomes in silico. Despite the certainly
rough structures, among which many may be incorrect, such an approach allows

a search for similarities among the recognition determinants of proteins. In this way possible
interactions between proteins can be discovered, or commonalities in metabolic
pathways become transparent.
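The exchange statistics discussed in this section, and the search for conserved glycines as anchor points, can be illustrated on a toy alignment. The Python sketch below only counts which residues share an alignment column; the structural environment (solvent accessibility, hydrogen bonding, local conformation) that the rules described above also take into account is deliberately left out. The function names and the three aligned sequences are hypothetical.

```python
from collections import Counter

def exchange_counts(aligned_sequences, gap="-"):
    """Count how often each unordered pair of residues shares an alignment column."""
    counts = Counter()
    for column in zip(*aligned_sequences):
        residues = [res for res in column if res != gap]
        for i, res_a in enumerate(residues):
            for res_b in residues[i + 1:]:
                counts[tuple(sorted((res_a, res_b)))] += 1
    return counts

def conserved_columns(aligned_sequences, residue="G"):
    """Columns in which every sequence carries the given residue."""
    return [i for i, column in enumerate(zip(*aligned_sequences))
            if all(res == residue for res in column)]

# Hypothetical alignment of three homologous segments.
alignment = ["GLKA", "GIKA", "GL-S"]
counts = exchange_counts(alignment)
print(counts.most_common())
print("conserved glycine columns:", conserved_columns(alignment))
```

Dividing such counts by the frequencies expected from the overall amino acid composition would turn them into replacement probabilities; that normalization is omitted here to keep the sketch short.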
The modeling of proteins gives good results above all when the proteins show high
homology. This usually holds for the areas that determine the folding scaffold. The binding
pockets, however, fall in the loop regions (▶ Sect. 14.4). It is particularly there that
even homologous proteins differ considerably. Therefore the model constructions do not
achieve the desired accuracy in these regions. An improvement can be achieved
here if a ligand is already placed in the assumed binding region during model
construction. Model and placement must be finally optimized in an iterative process
by using appropriate energy functions. In this way, models for G protein-coupled
receptors have been constructed (▶ Sect. 29.2) that are sufficiently accurate for
successful virtual screening.

20.7 Ligand Design: Seeding, Expanding, and Linking

The next step after the analysis of the binding pocket of the experimentally
determined or modeled protein is the actual ligand design. Here different
approaches to computer-aided design are available to suggest new protein ligands.
A docking program can be used to place preselected ligands from a database,
one after another, into the binding pocket (Fig. 20.7). Typically the
database is assembled from candidate molecules that largely resemble customary
drug-like molecules (▶ Sect. 7.6). Another approach starts with a “seed” in the
binding pocket. By starting from this point, the ligand grows stepwise in the binding
pocket. This principle is followed by most de novo design programs. The placement
of the first “seed” is critical. Such approaches are especially successful when there
is a particular hot spot in the binding pocket from which the further optimization
starts. Salt bridges to charged amino acids or the coordination of metal ion centers
are especially well suited for this approach. This concept has been successfully used
on, for example, the serine proteases trypsin and thrombin (▶ Sect. 23.4) and the
zinc-containing carbonic anhydrases (▶ Sect. 25.7). Another approach starts with
multiple small fragments that are placed in the binding pocket. Next an attempt is
made to link the fitted molecular fragments to one another with appropriate spacers.
This strategy has been successfully applied several times by using the SAR-by-NMR
method (▶ Sect. 7.8).
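The linking step can be reduced, in its simplest form, to a geometric question: which spacer bridges the distance between the attachment points of two placed fragments? The Python sketch below illustrates only this distance criterion. The spacer library, its approximate spans, the tolerance, and the function name are assumptions chosen for illustration; a real linking procedure must also respect bond angles, torsions, and clashes with the protein.

```python
import math

# Hypothetical spacer library: name -> rough end-to-end span in angstroms (assumed values).
SPACERS = {"direct bond": 1.5, "methylene": 2.5, "ethylene": 3.8, "amide": 3.7, "propylene": 5.0}

def suggest_spacers(anchor_a, anchor_b, tolerance=0.4):
    """List spacers whose span matches the distance between two attachment points.

    The result is ordered by how closely the span fits the gap.
    """
    gap = math.dist(anchor_a, anchor_b)
    matches = [(abs(span - gap), name) for name, span in SPACERS.items()
               if abs(span - gap) <= tolerance]
    return [name for _, name in sorted(matches)]

# Hypothetical attachment atoms of two fragments placed 3.6 angstroms apart.
print(suggest_spacers((0.0, 0.0, 0.0), (3.6, 0.0, 0.0)))
```

An empty list signals that no spacer in the library fits, in which case one of the fragments would have to be repositioned or a larger library consulted.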

20.8 Docking Ligands into Binding Pockets

Docking tries to fit potential protein ligands into a binding pocket using the
computer. For this, a docking program takes one candidate after the other
from a precompiled library of molecules. A 3D structure is generated for
each entry. If a flexible molecule is encountered, either multiple conformations are
saved or they are generated on the fly during docking. In the next step, each

[Figure: ligand structures in a binding pocket; panel labels Placement, Construction, and Linking]

Fig. 20.7 Possible strategies for ligand design. The complete 3D structures of possible ligands are
fitted into the binding pocket during docking (left picture, insert). The construction of new
molecules is sketched in the middle and right part of the picture. In principle there are two
possibilities. A fragment can be placed as a seed, and step-by-step other groups can be attached
(middle). Alternatively, multiple small molecular fragments can be placed in the binding pocket
independently of one another and later linked with one another (right).

molecule is fitted into the binding pocket. First, the structures that cannot bind to the
protein are discarded. In addition, other structures are eliminated that cause obvious
problems, for example, due to electrostatic repulsions with the protein in the
assumed docking mode. Typically a docking program generates multiple solutions.
These are scored on the basis of the generated binding geometries, and their affinity
is estimated.
Irwin Kuntz is a pioneer in the field of docking programs; the program DOCK
was developed in his group at UCSF in San Francisco. In the original version in
1982, only the steric complementarity of ligands and proteins was evaluated. For
this the shape of the binding pocket was approximated by a set of different spheres
so that the pocket was completely filled. Next a mathematical method was used to
place the test ligands on this distribution of spheres. The complementarity,
a measure of direct protein–ligand contacts, served as a scoring function. Since
the first version, DOCK has developed much further. The program now uses a force
field for scoring and calculates the contributions for desolvation. Even the place-
ment of the ligands is conducted flexibly by considering rotatable bonds.
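The steric-complementarity idea behind the original scoring can be illustrated with a minimal sketch. It is not the DOCK algorithm itself: the function name `contact_score`, the two distance cutoffs, and the toy coordinates are assumptions chosen purely for illustration. A pose is rejected as soon as a ligand atom overlaps with a protein atom; otherwise the close protein–ligand contacts are counted.

```python
import math

def contact_score(ligand, protein, clash=2.5, contact=4.5):
    """Count protein-ligand contacts; return None if the pose clashes.

    ligand, protein: lists of (x, y, z) atom coordinates in angstroms.
    clash, contact: hypothetical distance cutoffs in angstroms.
    """
    score = 0
    for la in ligand:
        for pa in protein:
            d = math.dist(la, pa)
            if d < clash:        # atoms overlap: pose is sterically impossible
                return None
            if d <= contact:     # close, non-overlapping pair counts as a contact
                score += 1
    return score

# Toy pocket and two candidate poses (invented coordinates).
pocket = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (2.0, 3.5, 0.0)]
good_pose = [(2.0, 0.0, 3.0)]
bad_pose = [(0.5, 0.0, 0.0)]
```

A pure contact count of this kind ignores the chemical nature of the atoms, which is why later versions moved to force-field terms and desolvation contributions.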
A different docking prototype was developed at GMD in Bonn by Matthias
Rarey. The program FlexX represented the first program that could quickly handle
ligand flexibility during docking. It disassembles the test ligands into individual
fragments and subsequently uses an algorithm that works very similarly to the
positioning in the program LUDI (Sect. 20.10). After placement of the first building
block, the ligand is successively reconstructed in the binding pocket. Different
conformers along the rotatable bonds are considered for this purpose. The program
maintains stored tables of preferred torsion angles, similar to those described in
▶ Sect. 16.6. The energetic evaluation of the placement is carried out at this step.
The program AutoDock from the group of Art Olson at Scripps in La Jolla, San
Diego, uses a lattice-based algorithm for the placement. By using a force-field
function similar to that in the program GRID (▶ Sect. 17.10), potential values are
placed on a lattice that is embedded into the binding pocket. By starting from
a randomly chosen starting orientation, the ligand is shifted across the lattice until
an optimum is found. In doing so it “feels” the interaction potential with the protein.
Because the potential was already precalculated on the lattice, this evaluation runs
particularly fast. At the same time, twisting around rotatable bonds is performed.
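The benefit of a precalculated lattice can be sketched in a few lines. The following is a simplified illustration and not AutoDock's actual procedure: the ligand is kept rigid (no twisting around rotatable bonds), it moves only in whole lattice units, and a greedy random search stands in for the real optimizer. The toy potential and all function names are assumptions.

```python
import random

def build_grid(n, potential):
    """Precompute potential(i, j, k) once on an n x n x n lattice."""
    return [[[potential(i, j, k) for k in range(n)]
             for j in range(n)] for i in range(n)]

def ligand_energy(grid, atoms, shift):
    """Sum the precalculated lattice values at the shifted atom positions."""
    n = len(grid)
    total = 0.0
    for (x, y, z) in atoms:
        i, j, k = x + shift[0], y + shift[1], z + shift[2]
        if not (0 <= i < n and 0 <= j < n and 0 <= k < n):
            return float("inf")      # ligand left the lattice
        total += grid[i][j][k]       # table lookup, no pairwise calculation
    return total

def random_search(grid, atoms, start, steps=2000, seed=0):
    """Shift the ligand by random lattice steps and keep only improvements."""
    rng = random.Random(seed)
    best, best_e = start, ligand_energy(grid, atoms, start)
    for _ in range(steps):
        trial = tuple(c + rng.choice((-1, 0, 1)) for c in best)
        e = ligand_energy(grid, atoms, trial)
        if e < best_e:
            best, best_e = trial, e
    return best, best_e

# Toy potential with a single minimum at lattice point (6, 6, 6).
toy = lambda i, j, k: (i - 6) ** 2 + (j - 6) ** 2 + (k - 6) ** 2
grid = build_grid(10, toy)
ligand = [(0, 0, 0), (1, 0, 0)]      # rigid two-atom "ligand" in lattice units
pose, energy = random_search(grid, ligand, start=(1, 1, 1))
```

Each energy evaluation is reduced to a handful of table lookups, which is what makes thousands of trial placements affordable.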
The program GOLD, which was developed by Gareth Jones in the group of Peter
Willett in Sheffield, England, also uses a lattice for placement. Interaction potentials
are, however, parameterized on crystal data. GOLD uses a genetic algorithm to
optimize the geometry. Since then, a plethora of docking programs has been
developed. All follow a slightly different strategy but are based on the concepts
described here. Some follow the idea that it is better to generate a well-distributed
set of rigid ligand conformers and then to dock these quickly as rigid bodies.
Today there are three main problems that impose limits on docking. One is the
energetic evaluation of the generated geometries. This will be specially addressed
in the next section. Another is that water plays a decisive role in ligand binding
(Sect. 20.3). Even today no really convincing solution to the handling of water
during docking has been found. The third problem is the flexible adaptation of the
protein (▶ Sect. 15.8). Usually there are small adaptations on the side of the protein
that only slightly change the shape of the binding pocket. Nevertheless, they are large enough
to send the docking programs after proverbial red herrings.

20.9 Scoring Functions: Ranking of Constructed Binding Geometries

A meaningful scoring of the generated binding geometries is essential for all
de novo design approaches in structure-based drug design. From the numerous
geometrically plausible placements, only those that approximate the experimentally
found situation reasonably well should be highlighted. The enthalpic and entropic
contributions that, according to today’s state of knowledge, determine the
affinity of a ligand to its target protein were described in ▶ Chap. 4, “Protein–Ligand
Interactions as the Basis for Drug Action”. The goal of a scoring function is to quickly
read the expected binding affinity from a given interaction geometry. Theory
already states that one single geometry is not enough to resolve this issue. Molar
energies are determined by a finite set of conformations of a molecule. They are
partitioned over a so-called “ensemble” of multiple states. These states are differently
populated. One group of methods tries to consider this fact in the calculations.
This is the theoretically most appropriate approach. The energy contributions of the
ensemble (usually taken from the trajectories of a molecular dynamics simulation,
▶ Sect. 15.7) are summed. The free enthalpy ΔG (▶ Sect. 4.3) can be estimated
from the resulting partition function. The necessary calculations for such an
evaluation are, however, very time consuming, which practically excludes them
from present structure-based design.
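How a set of differently populated states enters a free-energy estimate can be sketched as follows. This is a minimal illustration assuming a small, discrete list of state energies in kJ/mol; real applications sample far larger ensembles, and the function names are hypothetical. The free energy follows from the partition function as G = −RT ln Σ exp(−E_i/RT), and the population of each state is its Boltzmann weight divided by that sum.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def ensemble_free_energy(energies, temperature=298.15):
    """Free energy G = -RT ln(sum_i exp(-E_i / RT)) of a set of states (kJ/mol)."""
    rt = R * temperature
    e_min = min(energies)  # shift by the minimum for numerical stability
    z = sum(math.exp(-(e - e_min) / rt) for e in energies)
    return e_min - rt * math.log(z)

def populations(energies, temperature=298.15):
    """Boltzmann populations of the individual states."""
    rt = R * temperature
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]
```

With a single state the result reduces to that state's energy; every additional accessible state lowers G, which is exactly the contribution that a single-geometry score cannot capture.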
Instead, regression-based scoring functions are used as an alternative
approach. Assuming that a particular state is populated to an overwhelming extent,
it may be justified to consider only one state in the scoring function. The enthalpy
and entropy contributions that most likely determine the binding affinity are
considered. The approach is reminiscent of the setting up of a QSAR equation
(▶ Sect. 18.2). Terms are composed in an energy function. Molecular descriptors
are sought that correctly reflect the contributions to these terms. In doing so, the
admittedly false assumption is made that the individual contributions to the
free enthalpy are additive (▶ Sect. 4.10). The individual
terms of the equation are each furnished with an adjustable weighting factor. As
with QSAR equations, by using a mathematical technique, an optimal fitting of
these weighting factors is conducted on a training data set. This set is composed of
crystallographically determined protein–ligand complexes for which experimental
binding affinities are available.
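The fitting of weighting factors described above can be sketched as an ordinary least-squares problem. Everything specific in this example is an assumption: the three descriptors, the invented weights used to generate the toy training set, and the function names. No real binding data are used. The sketch solves the normal equations with plain Gaussian elimination so that it needs no external library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_weights(descriptors, affinities):
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + list(d) for d in descriptors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, affinities)) for i in range(k)]
    return solve(xtx, xty)

def score(weights, descriptor):
    """Additive scoring function: intercept plus weighted descriptor terms."""
    return weights[0] + sum(w * d for w, d in zip(weights[1:], descriptor))

# Synthetic training set: (H-bond count, lipophilic contact area, rotatable bonds).
training_x = [(1, 50.0, 2), (3, 120.0, 5), (2, 80.0, 1),
              (4, 200.0, 7), (0, 30.0, 0), (2, 150.0, 3)]
true_w = [5.0, -4.0, -0.2, 1.5]  # invented weights, used only to generate toy data
training_y = [score(true_w, d) for d in training_x]
weights = fit_weights(training_x, training_y)
```

Because the toy affinities were generated from an exactly additive model, the fit recovers the generating weights; with experimental data the residuals would show how far the additivity assumption is from reality.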
A third approach follows a so-called knowledge-based concept. As already
discussed in ▶ Sect. 17.10, the frequencies of the individual contact geometries in
the crystal structures of protein–ligand complexes are evaluated. A kind of
“normal distribution” is defined as a reference state. Then all contacts that
occur more frequently than average are classified as energetically favorable.
All contacts that rarely occur are classified as unfavorable. Next, a relative
energetic ranking of a set of ligands to the same reference protein can be
performed with a thus-derived function. If trained by using a data set of known
geometries and binding affinities, analogously to the regression-based scoring
functions, an affinity prediction can be achieved. The evaluation with the regres-
sion-based or knowledge-based function is very fast. In the meantime, a vast
number of scoring functions have been developed. None are ideal. In each case it
must be checked which function affords the best performance for the protein
under investigation.
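The step from contact frequencies to an energy-like score can be sketched with an inverse-Boltzmann-style formula. This is an illustration under assumptions: the frequencies below are invented, a uniform distribution stands in for the reference state, and every bin is assumed to have a non-zero count. Contacts seen more often than in the reference obtain a negative (favorable) value, rare contacts a positive one.

```python
import math

def contact_potential(observed, reference):
    """Score per distance bin: -ln(observed / reference).

    observed, reference: relative frequencies per bin (all assumed non-zero).
    Negative values mark contacts seen more often than in the reference state.
    """
    return [-math.log(o / r) for o, r in zip(observed, reference)]

def score_pose(potential, bin_indices):
    """Sum the bin potentials over all contacts found in one binding geometry."""
    return sum(potential[b] for b in bin_indices)

# Invented frequencies for one atom-pair type in four distance bins.
observed = [0.10, 0.40, 0.30, 0.20]
reference = [0.25, 0.25, 0.25, 0.25]
potential = contact_potential(observed, reference)
```

A pose whose contacts fall into the frequently observed bins, for example `score_pose(potential, [1, 2])`, ranks better than one that uses the rarely observed bins. The values are dimensionless and only allow a relative ranking until they are calibrated against measured affinities.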

20.10 De Novo Design: From LUDI to the Automated Assembly of Novel Ligands

The first program for stepwise de novo design was GROW from Jeffrey Howe and
Joseph Moon at the company Upjohn. It concentrates on peptides as lead structures.
An amide group is positioned in a favorable orientation in the binding pocket.
Next, amino acids are added stepwise to the starting amide group. At each step,
a large variety of different conformations of all 20 proteinogenic amino acids is
attached to the seed on the fly. In each round, only the “best” solutions are followed further.
In this way, GROW constructs a peptide ligand in the binding pocket with increas-
ing length.
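The stepwise growth strategy resembles what is nowadays called a beam search. The sketch below is an analogy and not a reconstruction of GROW: conformations are ignored, building blocks are one-letter amino acid codes, and the scoring function is a hypothetical stand-in in which each pocket position simply prefers one residue type.

```python
def grow(building_blocks, score, length, keep=3):
    """Extend every kept chain by every building block and keep the best ones."""
    chains = [()]                        # start from the empty seed
    for _ in range(length):
        candidates = [c + (b,) for c in chains for b in building_blocks]
        candidates.sort(key=score)       # lower score = better fit
        chains = candidates[:keep]       # follow only the best solutions further
    return chains

# Hypothetical scoring: each pocket position prefers one residue type.
preferred = ("F", "D", "L")

def toy_score(chain):
    """Count the positions at which the chain deviates from the preferred residue."""
    return sum(0 if aa == preferred[i] else 1 for i, aa in enumerate(chain))

best = grow("ADFKL", toy_score, length=3)[0]
```

Keeping only a few solutions per step limits the combinatorial explosion, at the price that a chain which scores poorly early on can never be recovered later.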
In the beginning of the 1990s, Hans-Joachim Böhm developed the program
LUDI at BASF. The underlying idea was to read small molecules or molecular
fragments from a database with precalculated spatial geometries, and to position
them in the binding pocket so that hydrogen bonds are formed with the protein, and
hydrophobic pockets are filled with non-polar groups. The program needs the
coordinates of the protein as well as a library of 3D structures of fragments or
drug-like molecules as input.
The precalculation of so-called interaction sites is decisive. These are placed in
the binding pocket around the amino acid groups in terms of fitting points or
directional vectors (Fig. 20.8). The program uses rules that are derived from the
non-bonded interactions found in the crystal packing of small organic molecules
(▶ Sects. 14.7 and ▶ 17.10). Then LUDI extracts small molecules or molecular
fragments from a 3D library. For each entry an attempt is made to position them
into the binding pocket of the protein so that as many of these interaction sites are
satisfied as possible (Fig. 20.8 and ▶ Fig. 17.12). Next, all of the successfully
placed fragments are ranked. The scoring function that is used for this considers
the number and quality of the formed H-bonds and ionic interactions, hydropho-
bic contact surfaces shared between the protein and ligand, as well as unfavorable
contributions arising from the number of rotatable bonds in the ligands.
An example of the successful application of this program is described in
▶ Sect. 21.5.
As the first prototype, LUDI was the gold standard for many de novo design
programs that were developed later. In these approaches improved scoring
functions were implemented, and the fragment libraries were improved. The
programs were also taught synthesis rules, so that the chemical accessibility of
the generated molecules was not completely neglected. The search space for the
programs was also enlarged in that multiple conformations and configurations
could be screened.

Fig. 20.8 Concept of the program LUDI for de novo design of protein ligands. In
the first step, the interaction sites are determined (top). Donor sites are represented by
blue lines, acceptor sites by red lines. The green points symbolize lipophilic sites.
Subsequently, small molecules from a database are fitted into the binding pocket
in that they are matched with the interaction sites (middle). Finally LUDI can chemically
link groups of molecules to larger structures to match the remaining interaction sites
and to fill the entire binding pocket (bottom).

A de novo design program represents an idea generator. Its value is, of course,
determined by the concepts that went into its development. On the other hand, its
value also strongly depends on the user and how the suggestions of such a program
are interpreted and used for further design.

20.11 The Feasibility of Designing Ligands In Silico

Certainly many examples proving the scope of de novo design, virtual screening,
and docking have been provided. The example described in ▶ Chap. 21, “A Case
Study: Structure-Based Inhibitor Design for tRNA-Guanine Transglycosylase” was
successful because of the massive deployment of such methods. It is decisive,
however, that the computer methods are tightly embedded into an iterative process
with synthesis and experimental structure determination. It must be kept in mind
that not all hits from the computer screening are found based on correct assump-
tions and picked up for the right reasons.
The predictive power of the available methods is still limited. The synthetic
accessibility of a suggested molecule is not sufficiently considered, the flexibility of
the protein is neglected, and the methods to estimate the binding affinity are still too
inaccurate. This is because the processes and factors responsible for molecular
recognition and ligand binding are still too poorly understood. The correct description
of solvation effects and the incorporation of water molecules in the binding
process represent large problems. The contribution of a hydrogen bond to binding
affinity can, despite all efforts, still only be estimated. As for
lipophilic interactions, it can at least be assumed that the filling of an unoccupied
lipophilic pocket with additional non-polar substituents is in most cases accompa-
nied by an increase in binding affinity.
How the changes in binding entropy that contribute to the free energy of ligand
binding can be taken into account is largely unclear. At least evidence is accumulating
that the oversimplified assumption that entropic contributions are constant within
a set of congeneric ligands does not hold.
There are still further fundamental limitations of this approach. The most
important is certainly that the technique is limited to the optimization of direct
interactions with the protein. Successful binding to a target protein is of central
importance for any active substance. However, to be suitable as a drug, additional
prerequisites must be fulfilled. Among these are good selectivity, metabolic stabil-
ity, adequate duration of action, low addictive potential, and negligible toxicity.
Today, at the very least the selectivity of a compound to members of a structurally
related protein family can be estimated with some certainty.
Fully automated molecular design on a computer is indeed not possible, even in
the long term. The methods of structure-based design are of value as idea genera-
tors. The obtained proposals must be checked and, when necessary, modified. Time
will tell whether the methods will gradually approximate the “holy grail” of drug
design: the design of drug molecules from scratch.

20.12 Synopsis

• In structure-based drug design, attempts are made to design small-molecule
ligands by docking them directly into the binding pocket of a target protein.
This requires a 3D structure of the reference protein. The goal is to optimally fill
the binding pocket by satisfying non-bonded interactions with the functional
groups of the binding-site residues.
• Structure-based design starts with a detailed analysis of the binding pocket to
elucidate hot spots for putative interactions with the protein. Either experimental
methods or computational tools can be used to perform an active site mapping
with molecular probes or small solvent-like molecules.
• In an iterative process of structure determination, modeling of modified ligands,
docking and screening, synthesis, and biological testing, the properties of small-
molecule ligands are improved to optimize binding to the target protein.
• Databases have been developed to retrieve and compare structural information
about the exponentially growing body of structural data on protein–ligand
complexes. They allow comparison of binding poses, active-site interaction
geometries, protein–ligand binding motifs, and the original solvation structures
in the protein’s binding pocket.
• Proteins can be compared in terms of their exposed binding pockets. The shape
and the exposure of groups which have particular physicochemical properties in
binding pockets are compared and help to design small-molecule ligands with
the desired selectivity. Ideas for isosteric replacements on the ligand scaffold can
also be generated in this way.
• If an experimentally determined structure of the target protein is unavailable,
a homology model can be constructed by using a related protein of known archi-
tecture as a template. The accuracy and the success of such homology modeling
depend strongly on the sequence homology with the template structure.
• Tools for secondary-structure prediction and amino acid replacement propensi-
ties have been developed to improve the reliability of the sequence assignment
of proteins to the 3D structure of the reference template.
• An alternative approach starts with a small molecule seed fragment and grows
putative ligands from this starting point into the binding pocket. Two non-
overlapping fragments can also be linked and turned into a larger ligand with
improved binding properties.
• The geometry of a constructed protein–ligand complex must be evaluated in
terms of the expected binding affinity in all structure-based design strategies.
A large variety of scoring functions are used to predict the binding affinity based
on the geometry of the formed complex.

21 A Case Study: Structure-Based Inhibitor Design for tRNA-Guanine Transglycosylase

Before presenting numerous examples of applications (▶ Chaps. 22, “How Drugs
Act: Concepts for Therapy”; ▶ 23, “Inhibitors of Hydrolases with an Acyl–Enzyme
Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing
Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; ▶ 27, “Oxidoreductase
Inhibitors”; ▶ 28, “Agonists and Antagonists of Nuclear Receptors”; ▶ 29, “Agonists
and Antagonists of Membrane-Bound Receptors”; ▶ 30, “Ligands for Channels,
Pores, and Transporters”; ▶ 31, “Ligands for Surface Receptors”; ▶ 32, “Biologicals:
Peptides, Proteins, Nucleotides, and Macrolides as Drugs”) in the last part of this
book, a case study will be considered. The opportunities for inhibitor development
that were introduced in the previous chapter (▶ Chap. 20, “Protein Modeling and
Structure-Based Drug Design”) will be applied to the example of the tRNA-
modifying enzyme, tRNA-guanine transglycosylase. In particular, the advantage of
iteratively using multiple cycles of the structure-based design techniques is
highlighted. The example references work from the research group of François
Diederich at the ETH in Zurich, as well as work that was performed by the research
group of the author at the University of Marburg. Because the work was carried out
in an academic environment, it was possible to use different tools of structure-
based design and to pursue some of the more fundamental problems in the context
of the project.

21.1 Shigellosis: Disease and Therapeutic Options

Shigella dysentery is a severe diarrheal illness that is caused by Shigella bacteria.
The bacteria are ingested with contaminated water or food and adhere to epithelial
cells in the intestinal mucosa. It is extremely contagious: 10–100 bacteria are
enough to cause an infection. Worldwide, shigellosis represents a serious problem.
Almost 170 million cases are reported annually, of which over a million are fatal.
The disease is widespread in developing countries, but over 1.5 million cases are
also annually reported in industrialized countries. Above all, the disease flourishes
under conditions of inadequate hygiene and poor water quality as is found in war,
natural catastrophes, famine, and in refugee camps. Dysentery is a particular
problem in Africa where it can occur concomitantly with AIDS.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_21, © Springer-Verlag Berlin Heidelberg 2013

As with any bacterial infectious disease, shigellosis can be treated with
antibiotics. The infections that occur in industrialized countries are cured in this
way. Unfortunately Shigella, which is very similar to the Escherichia coli that
naturally occurs in the intestinal flora, has a tendency to become resistant to
antibiotics very quickly. Moreover, antibiotic therapy also kills the naturally
occurring bacteria of the intestinal flora, and this also produces diarrheal
symptoms and severe dehydration in the patients. This can lead to a life-threatening
disruption of the electrolyte homeostasis, particularly in small children. Therefore,
specific therapeutic approaches are sought that suppress the pathogenesis of
shigellosis.

21.2 Blocking Pathogenesis on the Molecular Level

Shigella attack the epithelial cells in the intestines. To gain entrance to these cells, the
bacteria produce their own virulence factors, so-called invasins. These are proteins
that form a sophisticated apparatus with the proteins on the epithelial cells, which
allows the penetration and proliferation of the bacteria in the infected cells. The genes
for the virulence factors are located on a plasmid. Their expression in cases of infection is
regulated by different transcription factors. The factor VirF is particularly respon-
sible for the pathogenesis of the bacteria, and altered tRNA molecules are needed so
that it can be efficiently synthesized in the ribosome. tRNA is a ribonucleic acid that
is made of about 80 nucleotides (Fig. 32.15, ▶ Sect. 32.6). It is loaded at its end with the
amino acid that corresponds to the base triplet in the middle loop, the so-called
anticodon loop. The genetic information encoded in the base triplet of the
mRNA is transferred when the mRNA binds to the corresponding tRNA in
the ribosome during translation. This tRNA carries the right amino acid so that the
growing peptide chain of the nascent protein is correctly constructed. The changes in
the required tRNA affect the base in position 34 of the so-called wobble region.
A modified base must be incorporated at this site. If these changes do not occur,
the translation remains inefficient. Shigella could then barely produce enough of the
needed invasins to infect the epithelial cells. Their pathogenic potential is therefore
severely reduced.
The bacteria have enzymes that can carry out these changes in the tRNA.
In the first step, a guanine 21.1 is cut out of the tRNA molecule at position 34
and replaced with an altered base, preQ1 21.2 (Fig. 21.1). This step is catalyzed by
tRNA-guanine transglycosylase (TGT). The exchanged base in the tRNA is
further modified in the next step of the enzyme cascade so that the base queuine
is obtained as the final product. TGT inhibitors therefore represent a specific
therapeutic principle to selectively attack the pathogenicity of Shigella.
In contrast to a therapy with broad-spectrum antibiotics, the bacteria are not
killed but rather the disease-causing infection of the epithelial cells is prevented.

Fig. 21.1 The enzyme tRNA-guanine transglycosylase (TGT) catalyzes the exchange of guanine
21.1 for preQ1 21.2 in tRNA (a). Next, this base, now incorporated in the tRNA, is further
modified to queuine by other enzymes (the scheme labels QueA with SAM and a vitamin
B12-dependent step). The exchange of the base takes place in
the wobble position of the anticodon loop of the tRNA (b).

Higher-developed eukaryotic organisms also have such an enzyme. In contrast to
the bacteria, which use a homodimer, the eukaryotic enzyme is a heterodimer.
Moreover, the higher-developed organisms do not transform preQ1 to the end
product queuine but incorporate queuine directly into the tRNA.

21.3 The Crystal Structure of tRNA-Guanine Transglycosylase as a Starting Point

First, the crystal structure of TGT in complex with preQ1 was
determined in a related species. It shows an exchange of a Phe for a Tyr in the
active site, which is immaterial for substrate or ligand binding. Later, the structure
in complex with a part of the tRNA was elucidated (Fig. 21.2). According to these
structures, the base exchange occurs along the following reaction pathway
(Fig. 21.3). Initially the tRNA binds with its covalently attached guanine. The base with
its ribose moiety is pulled out of the tRNA molecule and is specifically recognized by
Asp102, Asp156, Gln203, Gly230, and Leu231. The reaction starts with a nucleophilic
attack of Asp280 at carbon C1 of the ribose ring. The C1–N bond is cleaved, and guanine is
released. The base leaves the binding pocket with a water molecule, and preQ1 is taken
up into the same binding site. For this, the peptide bond between Leu231 and Ala232
must flip over. The basic nitrogen atom of preQ1 then releases a proton and carries out
a nucleophilic attack on the ribose, which is covalently bound to Asp280. Once the new
bond to the tRNA is formed, the altered tRNA leaves the enzyme. Asp102 is critically
involved in the recognition process of the bound base. Furthermore this amino acid
probably provides the proton that is required for the mechanism, or picks it up again in
the other step.

Fig. 21.2 The crystal structure of TGT with a portion of the tRNA. The
protein adopts a TIM-barrel fold. The tRNA binds to the protein near the catalytic
center with the bases U33, G34, and U35, and the base (gray) to be exchanged at
position 34 is completely rotated out from the tRNA molecule (a). A view in the
binding site is below (b). The already-incorporated, modified base preQ1 is held in
place in the guanine-recognition pocket (orange) by Asp156, Asp102, Gly230,
and Leu231. The ribose moiety is arranged in a small hydrophobic pocket (blue).
Uracil33, which comes earlier in the sequence, lies in the green-marked part of the
binding pocket, and the uracil35 residue, which comes later in the sequence, lies in
the red-colored binding areas.

21.4 A Functional Assay to Determine Binding Constants

The base-exchange reaction is accomplished in two steps. In principle, both steps
can be blocked by inhibitors. This must be considered in a functional assay. In the
first step the unmodified tRNA is bound (Fig. 21.4). Sufficiently large inhibitors
could competitively prevent this step. After the tRNA is covalently attached to the
enzyme, the guanine base is released and leaves the protein. Next preQ1 binds.
A potential inhibitor can also compete with this uptake into the binding site, but
must not be much larger than guanine or preQ1. In this way small inhibitors display
a different inhibition profile than structurally larger inhibitors.

Fig. 21.3 Mechanism of the base-exchange reaction in the transglycosylase. The tRNA with guanine 34
binds, and a water molecule makes contact with the nitrogen atom in the 3-position. Asp280
nucleophilically attacks the C1 carbon atom of the ribose ring (a). The C1–N bond is broken and
guanine is released (b). It leaves the binding pocket together with a water molecule. PreQ1 is taken
into the same binding site where the peptide bond between Leu231 and Ala232 is flipped over (c).
After deprotonation, the basic nitrogen atom of preQ1 carries out a nucleophilic attack on the
ribose, which is covalently bound to Asp280, and a new bond to tRNA is formed. The altered
tRNA leaves the enzyme (d).

Radioactively labeled guanine is used to measure the inhibition. If this guanine is
added to the tRNA, the TGT catalyzes its incorporation, and the tRNA molecule
becomes radioactively labeled. If the tRNA is separated at fixed intervals, and the
incorporated radioactivity is measured, the reaction kinetics of the incorporation
process and therefore the catalytic rate of the enzyme can be followed. If potential
inhibitors are added, fewer TGT molecules are available for the transformation, and
the incorporation rate is reduced. This can be seen in the observed enzyme kinetics.
The inhibition constants can be determined by detailed analysis of the kinetics.
Whether the inhibitors interfere competitively with the binding of the entire tRNA, or
whether they compete with the exchange of the small base can also be differentiated.

Fig. 21.4 The base-exchange reaction takes place in two steps. Inhibitors can compete with the
binding of the complete tRNA (left, dark gray; tRNA-competitive inhibitor) as well as the exchange
of the small nucleobase (middle, light gray; base-competitive inhibitor).
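The kinetic analysis described in this section can be illustrated with a minimal sketch. It rests on assumptions that go beyond the text: a simple Michaelis–Menten model with a purely competitive inhibitor is used, whereas the two-step base exchange of TGT requires a more detailed treatment, and all function names and numbers are hypothetical. The incorporation rate is taken as the slope of a straight line through the measured time points, and the inhibition constant follows from the ratio of the uninhibited and inhibited rates.

```python
def rate(s, vmax, km, i=0.0, ki=float("inf")):
    """Michaelis-Menten rate with a competitive inhibitor at concentration i."""
    return vmax * s / (km * (1.0 + i / ki) + s)

def initial_rate(times, counts):
    """Slope of a least-squares line through (time, incorporated radioactivity)."""
    n = len(times)
    mt, mc = sum(times) / n, sum(counts) / n
    sxy = sum((t - mt) * (c - mc) for t, c in zip(times, counts))
    sxx = sum((t - mt) ** 2 for t in times)
    return sxy / sxx

def ki_from_rates(v0, vi, s, km, i):
    """Competitive Ki from uninhibited (v0) and inhibited (vi) rates at fixed s.

    Follows from v0 / vi = 1 + (km * i / ki) / (km + s).
    """
    return i / ((v0 / vi - 1.0) * (1.0 + s / km))
```

As a usage example, `ki_from_rates(rate(10, 100, 5), rate(10, 100, 5, i=2, ki=4), s=10, km=5, i=2)` returns the inhibition constant that was used to generate the inhibited rate. Distinguishing tRNA-competitive from base-competitive inhibitors would require repeating such measurements while varying each substrate separately.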

21.5 LUDI Discovers the First Leads

In the beginning of the project, only the structure of the binary complex of TGT
with preQ1 was known. The two-step inhibition mechanism explained in the last
section was also unknown at the time. During the course of the project Bernhard
Stengl managed to clarify the details of this process. Ulrich Grädler used the binary
TGT•preQ1 structure as a reference and initiated a search for potential inhibitors
with LUDI (▶ Sect. 20.10). He was able to find hits in a chemical catalog. The
compounds listed in Fig. 21.5 were proposed. Among them, 21.3 proved to be
a micromolar inhibitor. A crystal structure could be determined with this hit
(Fig. 21.6). There was great delight when 4-aminophthalic acid hydrazide 21.3
was shown to bind to the enzyme exactly as LUDI had predicted.
Next, LUDI was consulted to predict further groups for the inhibitor that would
fill in the as-yet unoccupied areas in the binding pocket. On the one hand, an
expansion of the ring system by an additional aromatic ring was proposed. On the
other hand, the placement of a nitrogen-containing heterocycle at the unoccupied
interaction site near Asp102 and Asp280 was considered. Hans-Dieter Gerber
synthesized derivatives 21.4–21.6 (Fig. 21.7). Compounds 21.4 and 21.5 achieved
tenfold better inhibition of the enzyme in the assay than 21.3. The results were
quite different with the heterocyclic derivative 21.6. It was significantly worse than
the initial lead structure. Ulrich Grädler was able to solve the crystal structures with
these inhibitors, which exhibited the expected binding mode. It was shown in the
structure with 21.6 that the heterocycle falls very near the terminal amide group of
Asn79. It was then obvious that 21.6 needed an additional amino group to build an

Fig. 21.5 Proposals for the first lead structures by LUDI. Among them, 21.3 proved to
be a two-digit micromolar inhibitor.

[Figure 21.6: binding-pocket view with the labeled residues Leu231, Asp156, and Asp102]

Fig. 21.6 Crystal structure of TGT with 21.3, the first hit from LUDI. The agreement between the
predicted (above right) and the final experiment is almost perfect. LUDI indicated additional
interaction centers in the lower part of the binding pocket that had not yet been used.

additional contact with the protein. This synthesis was accomplished, and the
crystal structure with 21.7 in fact did show the expected binding mode with the
additional H-bond. However, even this derivative was less potent than the original
lead structure 21.3. A more detailed analysis of the structural data showed that the
appended heterocycle in 21.6 and 21.7 is disordered, and a hydrogen bond between
the exocyclic amino group and the carbonyl group of Leu231 is very long. The
heterocycle was incorporated based on the idea that it would be beneficial to have
a charged group that can also form hydrogen bonds to the two neighboring aspartate

[Figure 21.7: structures of 21.4, 21.5, 21.6, and 21.7; binding-pocket view with the labeled residues Asp102 and Asp280]

Fig. 21.7 By starting from 21.3, inhibitors 21.4 and 21.5 were developed, which showed an
improved inhibition by a factor of 10 and could better fill the unoccupied area in the binding pocket
that was indicated in Fig. 21.6. Heterocycles 21.6 and 21.7 were introduced to the scaffold to
exploit the additional unoccupied interaction sites (right). Both derivatives showed diminishing
binding affinities, probably because of repulsive interactions with the two neighboring Asp102 and
Asp280 residues. The heterocycles do not display the desired partial-positive charge.

groups. These two groups were assumed to adopt a deprotonated state. Then a positive
charge on the triazole group would be ideal for an interaction. But in what protonation
state are these groups? A pKa measurement was carried out on a related model
compound. A small-molecule crystal structure determination was undertaken on
a crystal grown under the same buffer conditions as the protein complex was crystal-
lized. Both experiments showed that the heterocycle exists without a charge, that is,
both of the neighboring nitrogen atoms are deprotonated. Although it is not obligatory
that the same protonation state is found in the protein’s binding pocket, this model
appears to be plausible to explain the decreasing binding affinity of 21.6 and 21.7: An
uncharged triazole ring between the two negatively charged aspartate groups must
experience a repulsive interaction with at least one of the two acidic groups. This could
reconcile the decreasing binding affinity, the observed disorder, and the elongated
H-bond to the carbonyl group of Leu231.
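
How strongly a basic group is protonated at a given pH can be estimated with the Henderson-Hasselbalch relationship. The following sketch assumes ideal solution behavior and uses hypothetical pKa values. The measured pKa of the model compound is not given here, and the value inside the binding pocket may differ from the one in solution.

```python
def fraction_protonated(pka, ph):
    """Fraction of a basic group present in its protonated (charged) form.

    Follows from the Henderson-Hasselbalch equation:
    pH = pKa + log10([B] / [BH+]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa values of the conjugate acid, evaluated at an assumed pH of 7.
for pka in (5.0, 7.0, 9.0):
    share = fraction_protonated(pka, ph=7.0)
    print(f"pKa {pka:.1f}: {share:.1%} protonated")
```

A group whose conjugate acid has a pKa two units below the buffer pH is protonated to only about one percent, which would be in line with the uncharged heterocycle found in both experiments.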

21.6 Surprise: A Flipped Amide Bond and a Water Molecule

Novo Nordisk kindly provided an additional compound, 21.8, that emulates
the original interaction pattern of the initial lead structure (Fig. 21.8). Upon docking
this derivative, however, it was shown that the distance between the polar nitrogen
atom in the central pyridazinone ring and the carbonyl group in Leu231 was too large.
Nevertheless, the compound was a micromolar hit. The crystal structure that was
determined with derivative 21.9 delivered an explanation. The peptide bond, which,
for mechanistic reasons, acts as a switch between two conformations, takes on

[Figure 21.8: panels (a) and (b) with the labeled residues Gly230, Leu231, Gln203, and Asp156; structures of 21.5, 21.8, and 21.9]

Fig. 21.8 Analogue 21.8 should also emulate the interaction pattern of the original lead structure
(a). If this derivative is placed in the binding pocket (purple), the distance between the polar
nitrogen atoms in the central pyridazinone ring and the carbonyl group on Leu231 seems to be too
large for an H-bond. Nonetheless, 21.8 binds to the protein with micromolar affinity. (b) The
crystal structure that was determined with the very similar inhibitor 21.9 (orange) shows two
surprises: The peptide bond rotates its orientation and now directs its NH group toward the binding
pocket, and a water molecule (red sphere) mediates the interaction with the ligand!

a different orientation! When flipped, the NH functional group is found in the
binding pocket. The contact between this NH group and the polar nitrogen atom in
the ligand is mediated by an interstitial water molecule. Because the details of the
above-described enzymatic mechanism were not known at that time, the flipping of
the peptide switch could not have been predicted. Furthermore, the incorporation of
a water molecule was a big surprise. It underscores the importance of repeatedly
determining crystal structures with newly found lead structures.

21.7 Hot Spot Analysis and Virtual Screening Open the Floodgate
to New Ideas for Synthesis

How can a virtue be made of necessity when multiple binding modes occur? Ruth
Brenk used the protein conformers in the structure with 21.3 as well as the geometry
in the complex with 21.9 to carry out a hot-spot analysis (▶ Sect. 17.10). The result

[Figure 21.9: panels (a) donor, (b) acceptor, and (c) hydrophobic, each with the labeled residues Gly230, Leu231, and Asp156]

Fig. 21.9 Hot-spot analysis shows preferred binding areas for a hydrogen-bond donor
(a), acceptor (b), and a hydrophobic group (c). In addition, it was shown that the polar groups
of 21.9 (cf. Fig. 21.8) fall into the preferred binding area. In the bottom left corner of the binding
pocket (near the binding site of the ribose moiety; Fig. 21.2, blue) other binding areas are indicated
that were addressed in subsequent design steps.

of this analysis is shown in Fig. 21.9. A virtual screening was performed with the
generated pharmacophore and this produced a plethora of alternative molecular
scaffolds (Fig. 21.10) to occupy the guanine-binding site (Fig. 21.2). Many of the
hits that were discovered in this way proved to be micromolar inhibitors. They
afforded many new ideas for synthetic entry points to develop new inhibitors. Three
scaffolds were chosen for the following work. They are derived from a
pyridazinone (trione, 21.10), pteridine (21.11), and 6-aminoquinazolinone (21.12;
Fig. 21.11). It is conceivable that the last scaffold can be put together by combining
the right half of the natural substrate guanine 21.1 and the left half of the first hit
from LUDI, 21.3.
Let us turn to the distribution of the hot spots in the binding pocket. The new lead
structures all interact at sites in the ‘upper part’ of the binding pocket. However, an
additional favorable binding area, competent to interact with donor properties as well
as hydrophobic moieties, is indicated in the ‘lower left part’ of the binding site next to
the two aspartic acid residues Asp102 and Asp280. These binding sites had not been
exploited in the previous design. Considering the binding mode of the bound tRNA
(Fig. 21.2), the ribose sugar moiety in position 34 is accommodated in this region.
The hot-spot analysis suggests occupancy with a hydrophobic molecular fragment.
A favorable place for an H-bond donor lies a little bit further above. This region
corresponds to the binding site between the two aspartic acids, in which placement
of the two heterocyclic derivatives 21.6 and 21.7 was already attempted. Another
favorable area for an acceptor group is indicated at the rim of this pocket. The 2′- and
3′-hydroxyl groups of the tRNA ribose moiety are placed in this area.

21.8 The Filling of Hydrophobic Pockets and Interference with
a Water Network

A golden rule in drug design is that the occupancy of an empty hydrophobic pocket
with a lipophilic group leads to an increase in affinity (▶ Sects. 4.9 and ▶ 20.11).

[Figure 21.10: chemical structures of the hits proposed by the virtual screening]

Fig. 21.10 Proposals from a virtual screening of which a few examples were experimentally
tested and proved to be micromolar inhibitors.

[Figure 21.11: scaffolds 21.10, 21.11, and 21.12 with variable R groups]

Fig. 21.11 The pyridazinone (trione 21.10), pteridine (21.11), and 6-aminoquinazolinone (21.12)
scaffolds were sought as possible lead structures for further synthesis and optimization. By adding
appropriate R groups, the synthesis of numerous derivatives was achieved.

[Figure 21.12: 6-aminoquinazolinone scaffold 21.12 (Ki = 5.6 µM) and lin-benzoguanine scaffold 21.13 (Ki = 4.1 µM) with the tested R substituents]

Fig. 21.12 The displayed derivatives were synthesized by adding different R substituents to the
6-aminoquinazolinone scaffold 21.12, and subsequently tested. Surprisingly, even the best
compounds from this series remained in the single-digit micromolar region. lin-Benzoguanine 21.13
served as an alternative inhibitor scaffold and hydrophobic groups were attached at the 4-position.
Despite the good inhibition by the basic scaffold, the substituted derivatives could not achieve
a significant improvement in binding affinity.

Accordingly, inhibitors with such side chains were designed and led to the
derivatives displayed in Fig. 21.12. Disappointingly, these showed only a modest
improvement. In addition to the pteridines and aminoquinazolinones developed in
Marburg, the lin-benzoguanines 21.13 were advanced by the synthesis program in
Zurich. Thanks to the contributions from Emanuel Meyer and Simone Hörtner at the
ETH in Zurich, an entire series of inhibitors was readily prepared for detailed
crystallography studies, which led to the establishment of a structure–activity rela-
tionship. Interestingly, none of the derivatives shown in Fig. 21.12 led to a break-
through improvement in the affinity. As planned, they occupy the small hydrophobic
pocket between Val45, Leu68, and Asn70. A comparison of the individual crystal
structures of these inhibitors and also of the natural substrate tRNA showed that the
amino acid groups of the protein underwent massive induced-fit adaptations upon
binding (Fig. 21.13). These adaptations are very similar to those induced upon binding
to tRNA. Therefore it seemed unlikely that they were energetically very demanding.
The enzyme would otherwise have trouble binding its substrate adequately. The
straightforward explanation of a possibly too-high energetic cost for this adaptation
was thus excluded as a reason for the lack of improvement in affinity. Bernhard Stengl
and Tina Ritschel re-examined the individual derivatives precisely. It was surprising

[Figure 21.13: structure of 21.14; binding-pocket view with the labeled residues Asp156, Asp102, Val45, and Asn70]

Fig. 21.13 A comparison of the crystal structures of the uncomplexed enzyme (gray) and of the
complex with 21.14 (brown) shows that the amino acid residues undergo a massive ligand-induced
adaptation upon inhibitor binding, similar to that seen with the natural tRNA substrate. This leads
to the opening of a small hydrophobic pocket that is enclosed by Val45, Leu68, and Asn70.

that the small parent scaffold already achieved single-digit micromolar binding.
The addition of small substituents that orient in the direction of the hydrophobic
pocket led only to a loss in the binding affinity. Only the occupancy of the small
hydrophobic pocket with an attached aromatic substituent could compensate for this
initial loss of affinity and recover single-digit micromolar binding. A comparison
of the arrangement of the water molecules in different inhibitor structures was very
informative. In the unsubstituted parent structure, multiple water molecules form
a network between the two putatively charged aspartate residues, Asp102 and
Asp280. This network represents a decisive contribution to the solvation of the two
polar acid groups. All of the derivatives listed in Fig. 21.12 create a hydrophobic linker
to cross the area of the water network and to place their hydrophobic substituents in the
small hydrophobic pocket. In doing so, they necessarily destroy the water network.
This has its price!

[Figure 21.14: structures of 21.15 (Ki = 31 ± 10 µM) and 21.16 (Ki = 7.6 ± 3.7 µM)]

Fig. 21.14 The quinazolinone derivative with the 7-dimethylamino group, 21.15, loses a factor of
10 in its binding affinity compared to the unsubstituted derivative. If one of the methyl groups is replaced
with a benzyl group (i.e., 21.16), an increase in potency is obtained. The crystal structure with this
derivative shows that the benzyl group is not oriented in the direction of the small hydrophobic
pocket, but rather is in the uracil33 pocket.

An affinity comparison between the compounds 21.15 and 21.16 (Fig. 21.14)
was striking. The derivative with a 7-dimethylamino group on the
quinazolinone scaffold 21.15 lost binding affinity compared to the unsubstituted
derivatives by a factor of more than 10. If one of the methyl groups is replaced with
a benzyl group (21.16) the lost affinity is partially recovered. The crystal structure
of this derivative shows that the benzyl group is not oriented in the direction of the
small hydrophobic pocket, but is rather placed in the direction of a pocket that is
occupied by uracil33 in the natural substrate (Fig. 21.2). With this result, a new
concept for further design was obvious. Under no circumstances should the water
network between Asp102 and Asp280 be traversed by a hydrophobic bridge. On the
other hand, a hydrophobic group that orients in the direction of the uracil33 pocket
should be added to the scaffold of the ligand.

21.9 With a Salt Bridge: Finally Nanomolar!

Synthetically, the desired modifications of the substituent were easier to achieve on
the lin-benzoguanine template. Unsubstituted lin-benzoguanine 21.13 displays
a water network containing five distinct water molecules in the crystal structure
with the enzyme (Fig. 21.15). If a methyl group is added to the 2-position, the
affinity improves by a factor of 2.7 (Fig. 21.16). If the methyl group is then
exchanged for an amino group (i.e., 21.18), the binding constant improves dramat-
ically by a factor of 20. In other words, the introduction of an amino group in the
2-position of the lin-benzoguanine scaffold improves the affinity by a factor of 50!
How can this surprise be explained? The hydrogen bond to the carbonyl group of
the main chain in Leu231 was discussed in Sect. 21.5. This functional group is part
of the peptide bond that was later shown to be a molecular switch for the protein
function. During the course of the derivative synthesis, it became apparent that this
hydrogen bond between protein and ligand, which was initially assumed to be very
important, did not make a decisive contribution (cf. 21.21/21.22 and 21.23/21.24,
Fig. 21.16). The lin-benzoguanine scaffold also forms a hydrogen bond to the

Fig. 21.15 The basic scaffold of lin-benzoguanine binds to the protein with Ki = 4.1 µM and
leaves the water network (red spheres) between the two putatively negatively charged aspartates,
Asp102 and Asp280, intact.

carbonyl group of Leu231. The introduction of the amino group in the 2-position,
however, changes the imidazole system into a guanidinium-like group. Such a change
significantly increases the basicity of the scaffold. pKa measurements confirmed
this jump of more than one pKa unit. Calculations to simulate pKa shifts upon
complex formation (▶ Sect. 15.4) indicated an additional shift into the basic area.
The compounds should therefore bind to the protein in their protonated form. As
a consequence, they carry a positive charge on the substituted imidazole moiety. As
a result, the hydrogen bond to Leu231, which is further polarized by the adjacent
Glu235, is converted into a salt bridge. It therefore contributes an important part to the
binding affinity.
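
The affinity gains quoted above can be translated into free-energy differences. The sketch below uses the Ki values listed in Fig. 21.16 together with the relationship ΔΔG = RT ln(Ki,new / Ki,ref). The temperature of 298.15 K and the treatment of Ki as a thermodynamic dissociation constant are assumptions made for illustration.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def fold_change(ki_ref, ki_new):
    """Factor by which the affinity improves from the reference to the new ligand."""
    return ki_ref / ki_new


def delta_delta_g(ki_ref, ki_new, temperature=298.15):
    """Change in binding free energy (kJ/mol); negative values mean tighter binding."""
    return R_KJ * temperature * math.log(ki_new / ki_ref)


# Ki values in nM as listed in Fig. 21.16; the units cancel in the ratios.
KI_NM = {"21.13": 4100.0, "21.17": 1500.0, "21.18": 77.0}

for ref, new in [("21.13", "21.17"), ("21.17", "21.18"), ("21.13", "21.18")]:
    factor = fold_change(KI_NM[ref], KI_NM[new])
    ddg = delta_delta_g(KI_NM[ref], KI_NM[new])
    print(f"{ref} -> {new}: factor {factor:.1f}, ddG = {ddg:.1f} kJ/mol")
```

Under these assumptions, the roughly 50-fold gain from 21.13 to 21.18 corresponds to about 10 kJ/mol of additional binding free energy.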
It has already been demonstrated that filling the uracil33-binding pocket is
associated with an improvement in the affinity. Therefore groups were introduced
onto the 2-amino group. However, a methylene group was used as a bridge to keep
the amino group unconjugated to the added aromatic substituents. Of the synthe-
sized derivatives, morpholine derivative 21.20 proved to be the strongest binder. It
also has the best water solubility. Interestingly the added side chains in this area are
not clearly visible in the electron density. They are probably in a disordered state in
the binding pocket (Fig. 21.17). This speaks against a good enthalpic interaction for
the groups in this area, but this effect should be compensated for by entropic contributions so
that, in sum, a favorable contribution to the free energy is achieved and the binding
affinity is improved. This situation is explained in an example in ▶ Sect. 4.10.
The question was already addressed as to whether an appropriate side chain on
the ligand can span the region of the water network between Asp102 and Asp280
once attached to the lin-benzoguanine molecular scaffold. We first synthesized the
derivatives with an ethylene hydroxyl 21.25 and an ethylene amino substituent
21.26 at the 4-position (Fig. 21.17). The crystal structures of both derivatives are
identical, and the terminal hydroxyl or amino groups actively participate in the

[Figure 21.16: lin-benzoguanine derivatives and their inhibition constants. Substitution in the 2-position: 21.13, Ki = 4100 nM; 21.17, Ki = 1500 nM; 21.18, Ki = 77 nM; 21.19, Ki = 58 nM; further derivatives with Ki = 55, 35, 70, and 35 nM; 21.20, Ki = 6 nM. Comparison pairs: 21.21, Ki = 1.5 µM; 21.22, Ki = 2.1 µM; 21.23, Ki = 3.8 µM; 21.24, Ki = 4.0 µM]

Fig. 21.16 Substitution of the lin-benzoguanine scaffold 21.13 in the 2-position leads
to a significant improvement in the binding affinity. Above all, the introduction of a 2-amino
group (i.e., 21.18) alters the basicity of the derivatives so much that they bind in a positively
charged form. Because of this, a charge-assisted hydrogen bond is formed with the carbonyl
group of Leu231, which contributes strongly to the binding affinity. A comparison of
21.21 with 21.22, or 21.23 with 21.24 underscores the fact that this hydrogen bond only
increases the affinity if this part of the molecule is charged. If the charge is missing, the
H-bond-forming amino group can be left out without a concomitant loss in affinity. The
morpholine derivative 21.20 proved to be a nanomolar inhibitor and arranges its side chain in
the uracil33 pocket.

[Figure 21.17: panels (a) and (b) with the labeled residues Gly230, Leu231, Asp156, Asp102, Asp280, and Val45; structures of 21.20 (Ki = 6 nM), 21.25 (R = OH, Ki = 96 nM), 21.26 (R = NH2, Ki = 55 nM), and 21.27 (Ki = 4 nM)]

Fig. 21.17 (a) The crystal structure with the morpholine derivative 21.20 does not show a well-
defined difference electron density (green contour net) around the morpholine side chain in the
uracil33 pocket. This observation is an indication of severe disorder over multiple spatial orien-
tations. Computer simulations confirmed this hypothesis and suggest two possible placements for
the side chain. (b) Introducing a hydroxyl function 21.25 or a basic nitrogen atom 21.26 into the
side chain of the 4-position of the lin-benzoguanine scaffold leads to its participation in the
hydrogen bond network between Asp102 and Asp280. Introduction of a hydroxyethylene linker
leads to a loss of affinity by a factor of two compared to the unsubstituted parent structure, whereas
the aminoethylene linker derivative shows the same binding affinity as the parent structure. The substituents
prevent a collapse in the binding affinity from the destruction of the water network. Compound
21.27 is clearly recognizable in the electron density, forms an H-bond to Asp280, and fills the
small hydrophobic pocket. It binds to the enzyme with Ki = 4 nM.

water network. Interestingly, the hydroxyl derivative is less potent than the
unsubstituted parent structure 21.19 by a factor of 2. In contrast the amino deriv-
ative 21.26 gains about the same potency as the unsubstituted reference. Obviously in
the latter case, attachment of the ethylene amino substituent and the concomitant
perturbation of the water network is just cost-neutral. Most likely the terminal amino
group of the ethylene amino derivative is charged and present as an ammonium group.
Apparently this charge provides an advantage over the mere placement of a hydroxyl
group between the two neighboring aspartic acids. Derivative 21.27, which extends
the ethylene ammonium substituent by a hydrophobic group, achieves a binding
affinity of Ki = 4 nM. The crystal structure confirms the assumed binding mode. It
underscores the relevance of the originally proposed design hypothesis that the water
network should be crossed only by a polar linker and that the filling of the small
hydrophobic pocket achieves a strong increase in binding affinity. In the next step,
appropriate groups were added to the 2- as well as 4-positions. As a result, substances
were obtained that now inhibit the enzyme with subnanomolar potency.
The development of nanomolar TGT inhibitors had to take a few detours.
The many crystal structures that were determined with the system were decisive
for the breakthrough. The basic knowledge for the optimization process can be
summarized in three points. The destruction of the water network can be very
detrimental for the affinity. The filling of a hydrophobic pocket certainly improves
the affinity, but it must be checked whether a linker to place this group can afford an
optimal interaction geometry with the environment. The exchange of a neutral for
a charge-assisted hydrogen bond was critical for the optimization process.
This can be achieved by introducing groups in a molecular building block that
cause a distinct change in the pKa properties of the ligand. Unfortunately, the
inhibitors are not yet suitable for in vivo use. The three potentially positively
charged groups make them very polar. Therefore, an attempt must be made to avoid
these charges without losing a large portion of the achieved binding affinity.

21.10 Synopsis

• Shigella dysentery is a severe bacterial diarrheal illness. Shigella bacteria that
are ingested with contaminated water or food adhere to epithelial cells in the
intestinal mucosa. To gain entrance to these cells, the bacteria produce their own
virulence factors, so-called invasins.
• The invasins are only efficiently translated if the tRNA-modifying enzyme
tRNA guanine transglycosylase is effectively catalyzing the incorporation of
the modified base preQ1 in the wobble position of tRNA.
• A functional assay recording the exchange of guanine by radioactively labeled
guanine can determine the potency of ligands inhibiting the function of the target
enzyme.
• The first hits were detected by using the de novo design program LUDI, and
the predicted binding mode of a micromolar hit was confirmed by
crystallography.
• The active site shows adaptations by flipping a peptide bond and mediating
important interactions to the substrates through a water molecule.
• Virtual screening suggests a broad variety of basic scaffolds for
inhibitor design. A lin-benzoguanine scaffold served as the most promising
lead structure.
• Substitutions at the 2- and 4-position of the lin-benzoguanine scaffold caused
very different increases in potency. Attachment of a 2-amino group generates
a permanent charge on the scaffold and converts a normal to a charge-assisted
hydrogen bond. This caused a major improvement in affinity.
• Substitutions at the 4-position have to interfere with and partially replace
a contiguous water network between two facing aspartic acids. They can poten-
tially link the parent scaffold with substituents by filling a small hydrophobic
pocket. A significant potency enhancement can be achieved only if the spacer
linking the two portions contains polar atoms to cross the water network; these
atoms can actively participate in the network.
• Multiple iterative cycles of design, crystal structure analyses, and inhibitor
syntheses were necessary to develop the initial two-digit-micromolar hits into
sub-nanomolar inhibitors.

Bibliography
Brenk R, Naerum L, Grädler U, Gerber H-D, Garcia GA, Reuter K, Stubbs MT, Klebe G (2003)
Virtual screening for submicromolar leads of TGT based on a new unexpected binding mode
detected by crystal structure analysis. J Med Chem 46:1133–1143
Grädler U, Gerber H-D, Goodenough-Lashua DAM, Garcia GA, Ficner R, Reuter K, Stubbs MT,
Klebe G (2001) A new target for shigellosis: rational design and crystallographic studies of
inhibitors of tRNA-guanine transglycosylase. J Mol Biol 306:455–467
Hörtner S, Ritschel T, Stengl B, Kramer Ch, Klebe G, Diederich F (2007) Design, synthesis, and
biological evaluation of inhibitors of tRNA-guanine transglycosylase, an enzyme linked to the
pathogenicity of the Shigella bacterium. Angew Chem Int Ed 46:8266–8269
Meyer EA, Brenk R, Castellano RK, Furler M, Klebe G, Diederich F (2002) De Novo design,
synthesis, and in vitro evaluation of inhibitors for prokaryotic tRNA-guanine transglycosylase
(TGT): a dramatic sulfur effect on binding affinity. Chembiochem 2:250–253
Ritschel T, Hoertner S, Heine A, Diederich F, Klebe G (2009a) Crystal structure analysis and
in-silico pKa calculations suggest strong pKa shifts of ligands as driving force for high affinity
binding to TGT. Chembiochem 10:716–727
Ritschel T, Kohler PC, Neudert G, Heine A, Diederich F, Klebe G (2009b) How to replace the
residual solvation shell of polar active-site residues to achieve nanomolar inhibition of tRNA-
guanine transglycosylase. ChemMedChem 4:2012–2023
Stengl B, Reuter K, Klebe G (2005) Mechanism and substrate specificity of tRNA – guanine
transglycosylases (TGTs): tRNA modifying enzymes from the three different kingdoms of
life seem to share a common mechanism. Chembiochem 6:1–15
Stengl B, Meyer EA, Heine A, Brenk R, Diederich F, Klebe G (2007) Crystal structures of tRNA-
guanine transglycosylase (TGT) in complex with novel and potent inhibitors unravel pro-
nounced induced-fit adaptations and suggest dimer formation upon substrate binding. J Mol
Biol 370:492–511
Part V
Drugs and Drug Action: Successes of
Structure-Based Design

Design and development of a drug candidate starting from a small-molecule lead
for a given macromolecular target taken from the universe of all possible proteins
requires searching through the chemical space of all putative pharmacologically
relevant small molecules to find the most appropriate one. This goal is comparable
to a matching of both spaces. This figure tries to illustrate the merging of both
spaces by use of two spiral nebulae composed of protein and ligand structures.
(Announcement poster from the research group of the author on the occasion of a
conference in 2007 in Rauischholzhausen, Marburg.)
How Drugs Act: Concepts for Therapy
22

How many modes of action are there for drug therapy? There are estimations that
the currently commercially available drugs exert their action on approximately
500 target structures. Optimistic prognoses claim that this number can be
increased by perhaps a factor of 10. But this number is still small compared to
the diversity of proteins that play a role in our organism. Our genome has been
sequenced. We know that the number of our genes (about 25,000) is much smaller
than it was originally assumed (▶ Sect. 12.3). The number of relevant proteins for
which these genes code is, however, significantly larger, because, among other
reasons, versatile posttranslational modification and alternative splicing cause
the genetic information to be diversified over multiple protein variants. Accord-
ingly, our genome is mapped, but do we know what function is behind each
individual gene? How can predictions about proteins and their functions and
possible roles in pathophysiology be extracted from this flood of sequence infor-
mation? Many of the proteins that have been discovered in the genome can be
assigned to protein families based on sequence comparisons. Nonetheless,
a significant portion of our genetic information still awaits annotation. The first
step has been taken, but how do the spatial structures of these proteins look, for
which only sequences are known? Which ligands will be recognized by these pro-
teins, and what biochemical role do they assume in our organism? The biochemical
function, that is, the assignment of whether a protein represents, for example,
a protease, an ion channel, or a transporter, still affords no information at all
about what systemic roles the protein takes in the functional processes in a cell
or in a whole organism. The spatial structure of a protein is responsible for this
function. Therefore, the structures of the proteins in our genome are being inten-
sively investigated. The goal is to map the structural space of all proteins as well as
possible. Then it could be possible to find a spatially elucidated and adequately
homologous reference structure for each discovered sequence. Today, the structures
of all members of a few gene families have already been determined. Therefore, it is
only a question of time until we have the spatial structure of all relevant proteins at
our disposal. The way there may be long and hard, but it is clearly sketched out.
Will this revolutionize the market of potential pharmaceuticals and make entirely

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_22,
© Springer-Verlag Berlin Heidelberg 2013

new therapeutic approaches possible? The chemical space of all imaginable active
substances and the biological space of all possible pathology-relevant proteins are
discussed in ▶ Sects. 11.4 and ▶ 12.4. Drug design attempts to merge both of these
spaces with one another. Candidates for potential active substances are to be found
in the intersection of both spaces.

22.1 The Druggable Genome

In 2002, Andrew Hopkins and Colin Groom published a summary that accurately
illuminated the drug market at that time (Fig. 22.1). Back then, approximately
20 drugs per year were being launched into the market, and at the moment we are
seeing only a very small change in these circumstances. Approximately half of
today’s drugs inhibit enzymes. Another 30% modulate the behavior of G protein-
coupled receptors (GPCRs). About 7% exert their therapeutic effect on ion channels.
Transporters, nuclear hormone receptors or other receptors for growth factors, inter-
leukins, or peptides such as insulin are influenced by about 4% of the available drugs
each. Then a small portion remains that influences cell-surface integrins or DNA.
These market segments in no way reflect the frequency of these target structures in our
genome. For example, the GPCRs are only 2.3% of our genome if sensory GPCRs are
excluded. Approximately 15% of the “druggable” genome, that is, the portion for
which the function can be favorably influenced by a pharmaceutical therapy, is
assigned to the GPCRs. The kinases make up more than 22% of the genome, but
only nine small-molecule inhibitors are commercially available as drugs. However, it
is estimated that about 100 substances are in extensive testing. Therefore, it is to be
expected that the drug market will change in the next years.
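The market shares quoted above can be tabulated and checked in a few lines. The following Python sketch is only an illustration: the percentages are the ones given in the text and in Fig. 22.1, whereas the dictionary and function names are hypothetical and were chosen here for readability.

```python
# Market shares of drug target classes as given in the text and in Fig. 22.1 (percent).
TARGET_SHARES = {
    "Enzymes": 47,
    "GPCRs": 30,
    "Ion channels": 7,
    "Transporters": 4,
    "Nuclear hormone receptors": 4,
    "Other receptors": 4,
    "Miscellaneous": 2,
    "DNA": 1,
    "Integrins": 1,
}


def ranked_shares(shares):
    """Return (target class, percent) pairs sorted from largest to smallest share."""
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


def cumulative_share(shares, top_n):
    """Percentage of drugs that act on the top_n largest target classes."""
    return sum(percent for _, percent in ranked_shares(shares)[:top_n])


if __name__ == "__main__":
    for name, percent in ranked_shares(TARGET_SHARES):
        print(f"{name:<28}{percent:>3}%")
    print("Total:", sum(TARGET_SHARES.values()), "%")
    print("Enzymes + GPCRs:", cumulative_share(TARGET_SHARES, 2), "%")
```

The two largest classes, enzymes and GPCRs, together account for roughly three quarters of the marketed drugs in this summary.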
In the next chapters, examples of individual target structures will be introduced
that represent potential targets for drug therapy. They are discussed on the basis of
their most important structural characteristics because the structure of the target
generally defines what is needed to qualify a molecule as an inhibitor, agonist,
antagonist, or allosteric modulator. These principles serve as a general concept for
the design of new active substances. In modern drug research, the target structure
for which a new active substance is sought is usually known. In many historical
examples of drug development, this was not initially the case. In the meantime,
however, many modes of action are known. Peter Imming and his research group
have compiled a summary of the modes of action for a broad collection of drugs that
are used today. Furthermore, the database WOMBAT from Tudor Oprea at the
University of New Mexico in Albuquerque offers fast access to functionally
annotated drugs together with their characteristic properties.

22.2 Enzymes as Catalysts in Cellular Metabolism

Fig. 22.1 Distribution of the target proteins for drugs that are on the market today:
enzymes 47%, GPCRs 30%, ion channels 7%, transporters 4%, nuclear hormone receptors 4%,
other receptors 4%, DNA 1%, integrins 1%, miscellaneous 2%.

All metabolic processes, biosynthetic pathways, and the regulation of important
physiological processes are mediated by enzymes. Enzymes are macromolecular
biocatalysts that allow complex chemical reactions to take place in an aqueous
medium, usually at 37 °C, and under normal pressure. During the course of evolution,
families of enzymes have developed with analogous architecture and identical
catalytic sites. Small differences in the structure of the binding sites lead to entirely
different substrate specificities, which, according to the required function, render
these enzymes either highly specific or profoundly promiscuous.
Enzymes do not bind particularly strongly to their substrates and reaction
products. The bound conformation of the ligand is often different from the ener-
getically most favorable conformation in aqueous solution. An enzyme binds the
substrate in a geometry that prepares it for the transition state of the reaction.
Moreover, polar groups can induce the required shifts in charges. The enzyme
stabilizes the transition state of a chemical reaction through the spatial arrangement
and orientation of its reactive groups. It simultaneously lowers the activation energy
of the reaction and makes sometimes very dramatic rate accelerations of chemical
reactions possible. After dissociation of the product, the enzyme is available for the
transformation of the next substrate molecule.
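How strongly a lowered activation energy accelerates a reaction can be estimated from transition-state theory: if the pre-exponential factor is unchanged, the rate ratio is exp(ΔΔG‡/RT). The Python sketch below illustrates this relationship. The formula is standard physical chemistry rather than something stated in this chapter, and the function name and the example energies are illustrative assumptions.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def rate_acceleration(barrier_reduction_kj_per_mol, temperature_k=310.15):
    """Factor by which a reaction speeds up when its activation free energy
    is lowered by the given amount, assuming an unchanged pre-exponential
    factor: k_cat / k_uncat = exp(ddG / (R * T))."""
    ddg_joule = barrier_reduction_kj_per_mol * 1000.0
    return math.exp(ddg_joule / (R * temperature_k))


if __name__ == "__main__":
    # Hypothetical barrier reductions, evaluated at 37 degrees Celsius (310.15 K).
    for ddg in (5.0, 20.0, 40.0, 60.0):
        factor = rate_acceleration(ddg)
        print(f"{ddg:5.1f} kJ/mol lower barrier -> {factor:.2e}-fold faster")
```

Because the dependence is exponential, each reduction of the barrier by about 5.9 kJ/mol at 37 °C corresponds to a tenfold faster reaction, which is why transition-state stabilization can produce very large rate accelerations.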
Enzymes are classified according to the reactions that they catalyze. An inter-
national commission has divided enzymes into six classes, each of which is
assigned a four-number code (Table 22.1). The main class indicates what type of
reaction is catalyzed (redox reactions, transfer reactions, transfer of functional
groups to water, cleavage and elimination reactions, isomerization of groups within
the substrate, or condensation or linkage of molecular groups). The remaining
numbers classify, for example, which group is transferred or whether the protein
is regulated by cofactors. The MEROPS database, which is maintained by the
Sanger Institute in Cambridge, England, affords fast access and a broad
overview of proteases, their substrates, reaction mechanisms, and selectivities.
In ▶ Chaps. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”;
▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing
Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; and ▶ 27, “Oxidoreductase
Inhibitors,” the important enzyme classes for which drugs have been successfully
developed will be presented.

Table 22.1 Enzyme classification based on the four-digit number code.

Class       Name             Biochemical function            Examples                      Coenzymes
EC 1.x.x.x  Oxidoreductases  Catalyze redox reactions;       Dehydrogenases, Oxidases,     NAD+, NADP+, FAD, FMN,
                             transfer of H and O atoms or    Oxygenases, Hydroxylases      and lipoic acid
                             electrons between molecules
EC 2.x.x.x  Transferases     Transfer functional groups      Phosphotransferases           S-Adenosyl methionine, Biotin,
                             such as methyl, acyl, amino,    (including kinases),          cAMP, ATP, Thiamine
                             or phosphate groups from one    Aminotransferases             pyrophosphate (TPP),
                             molecule to another                                           Tetrahydrofolic acid
EC 3.x.x.x  Hydrolases       Hydrolytic cleavage of          Esterases, Lipases,           Not needed
                             molecules                       Phosphatases, and Peptidases
EC 4.x.x.x  Lyases           Non-hydrolytic addition or      Decarboxylases, Aldolases,    TPP, Pyridoxal phosphate
                             cleavage of groups on           Synthases
                             molecules, particularly double
                             bonds; cleavage of C–C, C–N,
                             C–O, and C–S bonds
EC 5.x.x.x  Isomerases       Intramolecular rearrangement    Racemases, Mutases            Glucose-1,6-bisphosphate,
                             and isomerization within a                                    Vitamin B12
                             molecule
EC 6.x.x.x  Ligases          Coupling of two molecules by    Synthetases, Carboxylases     ATP, NAD+
                             the formation of C–C, C–N,
                             C–O, or C–S bonds by using ATP
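The four-field code lends itself to simple parsing. The following Python sketch maps a complete EC number to its main class using the six classes of Table 22.1. The function names are hypothetical, and the sketch makes no attempt to interpret the remaining three fields.

```python
# Main classes of the EC system as listed in Table 22.1.
EC_MAIN_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec_number(ec_number):
    """Split an EC number such as 'EC 3.4.21.5' into its four integer fields."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec_number!r}")
    return tuple(int(field) for field in fields)


def main_class(ec_number):
    """Return the name of the main enzyme class for a complete EC number."""
    first = parse_ec_number(ec_number)[0]
    if first not in EC_MAIN_CLASSES:
        raise ValueError(f"unknown main class {first} in {ec_number!r}")
    return EC_MAIN_CLASSES[first]


if __name__ == "__main__":
    # Example input: EC 3.4.21.5 is the number assigned to thrombin.
    print(main_class("EC 3.4.21.5"))
```

Only the first field is needed to decide which row of the table applies; the later fields narrow the assignment down to the individual enzyme.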

22.3 How Do Enzymes Push Substrates Toward the Transition State?

Fig. 22.2 The enzyme creatinase cleaves creatine 22.1 with water into urea and sarcosine. The
structurally very similar molecule, carbamoylsarcosine 22.2, is an inhibitor of this enzyme.

To explain how an enzyme prepares its substrate for the transition state we should
consider an example. The crystal structure of creatinase with its natural
substrate creatine 22.1 and a very similar inhibitor, carbamoylsarcosine 22.2, was
determined in the research group of Robert Huber at the Max Planck Institute in
Martinsried, Germany. The enzyme catalyzes the cleavage of creatine to urea and
sarcosine (Fig. 22.2). For this, the central carbon in the C–N bond in the
guanidinium part of creatine is nucleophilically attacked by a water molecule.
All three C–N bonds in the guanidinium portion exhibit double-bond character and
a planar geometry because of electron delocalization. How does the enzyme
manage to distort creatine in the direction of the transition state of the reaction to
prepare it for the nucleophilic attack as well as for bond breaking? The zwitterionic
creatine is bound through its guanidinium function by two glutamate residues
forming two salt-bridge-like hydrogen bonds (Figs. 22.3 and 22.4). The opposite
acid function finds strongly polarizing bonding partners in two arginine residues.
Furthermore, a water molecule is found near the central imine-like carbon atom in
the crystal structure. A histidine is next to it in the binding pocket. This histidine
orients the water molecule in exactly the right position and also supports the
abstraction of a proton from this water molecule. This increases the nucleophilicity
of the water by generating an OH− group. The vice-like fixation of the guanidinium
group by the two glutamate residues causes a twisting of this building block, which
is planar in the unbound state. Because of this, the conjugation is disrupted and as a
consequence, the C–N bond that is to be cleaved is significantly weakened.
A nucleophilic attack occurs, and a tetrahedral transition state is formed. At the
same time, the now-protonated histidine is able to polarize the methyl-substituted
nitrogen atom and involve it in a hydrogen bond. This prepares our substrate for the
bond-breaking transition state. After transferring the proton from histidine to
the substrate, a positive charge is formed on the nitrogen atom of the bond that is
to be cleaved. Histidine accepts a proton from the oxygen atom of the tetrahedral
transition state as a C=O double bond is formed, and the central C–N bond is cleaved.
The products then leave the binding pocket. In this way, the enzyme creates
a stereoelectronically complementary environment for the cleavage reaction. Its
polar groups place the water molecule correctly for the nucleophilic attack, and
histidine induces a pyramidalization of the nitrogen atom in the bond to be broken.
At the same time, it serves as a proton donor as well as acceptor during the reaction.

Fig. 22.3 (a) In the first step, a water molecule is polarized by a neighboring histidine so
that a nucleophilic attack on the imine-like carbon is facilitated. (b) Then, the histidine transfers
a proton to the central nitrogen atom. (c) The substrate reacts further in that a C=O double bond
is formed and the C–N bond is cleaved. (d) The products urea and sarcosine leave the binding
pocket.

The crystal structure in Fig. 22.4 was determined together with carbamoylsarcosine.
This molecule differs from the substrate creatine in that an oxygen atom replaces
a nitrogen atom. As a result, this part of the molecule does not carry a positive
charge as it does in creatine. The addition of the nucleophilic OH− leads to
decomposition and compensation of the charge in the guanidinium part of creatine.
A comparable attack upon carbamoylsarcosine would lead to the formation of
a negative charge next to the two negatively charged glutamates. This is
energetically unfavorable. As a consequence, the cleavage reaction does not take
place on this molecule; instead, it blocks the transformation. The example shows
how precisely substrate and enzyme must be in harmony with one another. Small
changes can drastically change this system and convert a substrate molecule into
an inhibitor of the targeted transformation reaction.

Fig. 22.4 (a) The vice-like fixation of the guanidinium group by both glutamate residues causes
a twisting in this portion, which is planar in the unbound state. Because of this, the conjugation is
disrupted and the C–N bond to be cleaved is weakened. The twisting is indicated by the red and
yellow planes that pass through the atoms of the guanidinium group. (b) The neighboring
protonated histidine further polarizes the methyl-substituted nitrogen atom, and involves it in
a hydrogen bond. In doing so the nitrogen atom takes on a pyramidal configuration, by which it
deviates out of the plane (yellow) of its next three neighbors. (c) In the structure with the
substrate-like inhibitor carbamoylsarcosine, a water molecule can be found in the position from
which the nucleophilic attack on the substrate creatine is initiated. This occurs from above and
diagonally behind the C=N bond.

22.4 Enzymes and Their Inhibitors

Enzymes can be organized into multienzyme complexes that carry out multiple
reactions on one substrate sequentially. They can also form cascades in which one
enzyme activates the inactive precursor of the next enzyme. This activation con-
tinues to the next enzyme, and the next, and so forth. The coagulation cascade
(▶ Sect. 23.3) is activated by two independent pathways, each along multiple steps,
which merge into a common pathway in the end. Because of this, a minor initiating
event is amplified by multiple orders of magnitude. This is good for normal
coagulation after an injury, but in the context of a coagulopathy (i.e., a tendency
to form clots too easily) it can have disastrous consequences!
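The amplification achieved by such a cascade can be made concrete with a short calculation: if every activated enzyme molecule activates a certain number of molecules in the next step, the overall gain is the product of the gains of the individual steps. The Python sketch below uses purely hypothetical step numbers and turnover values; they are not measured data for the coagulation cascade.

```python
from math import prod


def cascade_amplification(activations_per_step):
    """Number of end-product molecules generated per initiating event when
    each activated enzyme activates the given number of molecules in the
    next step of the cascade."""
    return prod(activations_per_step)


if __name__ == "__main__":
    # Hypothetical cascade: four steps, each enzyme activates 100 successors.
    steps = [100, 100, 100, 100]
    print(f"Amplification factor: {cascade_amplification(steps):.0e}")
```

Even modest gains per step multiply to many orders of magnitude, which is the reason a small trigger can have such large consequences.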
Quite a number of inhibitors prevent the catalytic effect of an enzyme by occupying
the position at which the substrate binds. Such inhibitors are termed competitive
inhibitors. In addition, there are also allosteric inhibitors that bind at another position
on the enzyme and cause a change in its three-dimensional structure or dynamic
properties. This can prevent the enzyme from adopting the necessary conformation
for catalysis and can lead to a weakening of the catalytic activity. Detailed investiga-
tions of the enzyme kinetics allow for competitive inhibition to be distinguished from
noncompetitive inhibition. According to the type of interactions with the enzyme,
reversible and irreversible inhibitors can be differentiated. In the case of reversible
inhibitors, the binding to the enzyme must be strong so that the transformation of the
substrate can be reliably prevented. Some reversible inhibitors form a covalent bond to
the catalytic center that is chemically labile, and therefore fully reversible, for instance,
a hemiacetal bond. Irreversible inhibitors react with the enzyme by forming
a chemically stable bond. The inhibitors or the reacting groups cannot be detached,
and for the rest of the lifespan of the enzyme until protein degradation in the organism,
the enzyme remains inhibited. Moreover, there are naturally occurring protease inhib-
itors that indeed reversibly bind, but adhere so strongly that the complex is degraded
before the inhibitor is released.
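The kinetic distinction between competitive and noncompetitive inhibition can be sketched with the classical Michaelis-Menten rate laws. These equations are standard enzyme kinetics and are not derived in this chapter; the function names and all parameter values in the Python sketch below are illustrative assumptions.

```python
def rate_uninhibited(s, vmax, km):
    """Michaelis-Menten rate without inhibitor."""
    return vmax * s / (km + s)


def rate_competitive(s, i, vmax, km, ki):
    """Competitive inhibition: the inhibitor raises the apparent Km."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def rate_noncompetitive(s, i, vmax, km, ki):
    """Pure noncompetitive inhibition: the inhibitor lowers the apparent Vmax."""
    return (vmax / (1.0 + i / ki)) * s / (km + s)


if __name__ == "__main__":
    # Hypothetical parameters in arbitrary, consistent units.
    vmax, km, ki, inhibitor = 1.0, 1.0, 1.0, 1.0
    print(f"{'[S]':>8} {'none':>8} {'compet.':>8} {'noncomp.':>8}")
    for substrate in (0.1, 1.0, 10.0, 1000.0):
        print(
            f"{substrate:8.1f} "
            f"{rate_uninhibited(substrate, vmax, km):8.3f} "
            f"{rate_competitive(substrate, inhibitor, vmax, km, ki):8.3f} "
            f"{rate_noncompetitive(substrate, inhibitor, vmax, km, ki):8.3f}"
        )
```

In this model, raising the substrate concentration overcomes a competitive inhibitor but not a noncompetitive one, which is the kinetic signature used to tell the two apart.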
The rational design of an enzyme inhibitor usually starts with the structure of the
substrate. One approach that is particularly successful is to imitate the transition
state with a chemically analogous group that is not attacked by the enzyme. In
▶ Chaps. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”;
▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing
Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; and ▶ 27, “Oxidoreductase
Inhibitors”, many examples for the design of such inhibitors are presented. Overall,
irreversible enzyme inhibitors play a smaller role than reversible inhibitors, but
important drugs such as acetylsalicylic acid (ASA, ▶ Sect. 3.1), omeprazole
(▶ Sect. 3.5), clopidogrel (a thrombocyte aggregation inhibitor), penicillins and
cephalosporins (▶ Sect. 23.7), and a few monoamine oxidase inhibitors
(▶ Sect. 27.8) belong to this group.

22.5 Receptors as Target Structures for Drugs

Receptors are proteins or protein complexes that
• Mediate the information exchange between cells (membrane-bound
receptors);
• Regulate hormone-controlled gene expression (soluble receptors or transcription
factors);
• Are coupled to ion channels and control the flow of ions into or out of a cell
along a concentration gradient.
Important membrane-bound receptors are the receptors for adrenaline, sero-
tonin, dopamine, histamine, acetylcholine, adenosine, and thromboxane, and for
peptides such as the enkephalins (opiate receptor), neurokinins, and endothelins, for
glycoproteins, as well as the group of sensory receptors. Neurotransmitters are the
endogenous agonists of many membrane-bound receptors (▶ Sect. 1.4). Nerve cells
are connected to each other through synapses; these are zones in which chemical
information transfer is accomplished by neurotransmitters. The so-called synaptic
gap is found between the transmitting cell (presynaptic neuron) and the receiving
cell (postsynaptic neuron). Neurotransmitters are synthesized in the presynaptic
neuron and stored in vesicles. Upon nerve stimulation, they are released into the
synaptic gap. There, by binding to a specific receptor on the postsynaptic neuron,
they effect a change in the membrane potential and consequently stimulate this cell.
After reuptake in the cell, containment in vesicles, or after degradation by, for
example, the enzyme monoamine oxidase (amines), esterases (acetylcholine), and
peptidases, or in glial cells through the effect of catechol-O-methyltransferase, the
effect subsides again quickly (see Fig. 22.7).
Within the cell, these receptors act upon G proteins (Fig. 22.5), the name of
which is derived from guanosine di- and triphosphate. All G protein-coupled
receptors (GPCRs) have an identical construction and function principle.
They consist of a protein chain with seven hydrophobic segments that penetrate
the cell membrane and anchor the receptor. These individual sections are connected
to one another by loops. To date around 1,000 different GPCR sequences
are known, and new ones are constantly being discovered and characterized
(▶ Sect. 29.1).
After an agonist docks, the active conformation of the receptor is stabilized.
Antagonists prevent the docking of agonists, and inverse agonists stabilize the
inactive conformation of the receptor.
The provoked receptor response is carried out over identical pathways, despite
the different types of receptors, and then it branches off again. This economic
natural principle is also used in other cases, for example, the regulation of cell
proliferation. The more-or-less-pronounced effect specificity is achieved by:
• The different structures of the agonists and receptors and the resulting activation
of different G proteins and effector proteins;
• The different receptor occupancy and density of different cells;
• The location of the cells that produce and release the hormone or neurotransmitter.
This is accomplished in very specific cells; neighboring cells or organs
are not involved.

Fig. 22.5 Schematic representation of the structure and function of a G protein-coupled receptor
(GPCR). The seven cylinders symbolize the seven transmembrane helices. The extra- and
intracellular loops that bind the helices are not shown. After binding an agonist, the α-subunit
dissociates from the so-called G protein complex. If a Gs or Gq/11 protein is present, then an
enzyme is activated that generates an internal hormone, a “second messenger.” For example, the
membrane-bound enzyme adenylate cyclase generates cyclic adenosine monophosphate (cAMP)
from adenosine triphosphate (ATP). This second messenger can further affect target proteins via
protein kinase A, or open an ion channel. To avoid an overreaction, cAMP is constantly being
degraded by the enzyme phosphodiesterase. Gi/o proteins inhibit enzymes that form second
messengers.

The picture of such receptors can be very complex. For example, in the case of
the acetylcholine receptors, two different groups are distinguished that preferably
bind either muscarine, a toxin of the toadstool Amanita muscaria, or nicotine, the
active ingredient of the tobacco plant, Nicotiana tabacum. In contrast to the
muscarinic acetylcholine receptor, the nicotinic acetylcholine receptor (nAChR) is
a ligand-gated ion channel (▶ Sect. 30.4). It has a complex architecture of five
protein chains that are positioned in the cell membrane (Fig. 22.6a). Electron
microscopy pictures of the closed and open structure (after activation by
acetylcholine) of the 290 kDa nAChR protein complex (▶ Sect. 30.4) are available for the
nAChR from the electric organ of the Torpedo electric ray, a fish.

Fig. 22.6 (a) The nicotinic acetylcholine receptor (nAChR) is a ligand-gated ion channel
(▶ Sect. 30.4). Here the cylinders do not stand for segments but rather for five separate proteins,
each of which has four transmembrane domains. After binding acetylcholine, the channel is
quickly opened. (b) Soluble receptors dimerize after agonist docking to their ligand-binding
domains (LBD). Here homodimers composed of two identical receptors as well as heterodimers
of two different receptors can be formed. The so-called zinc fingers of the DNA-binding domains
(DBD) recognize very specific sequences of DNA. A particular DNA segment is addressed by
dimerizing two receptor units. (c) Membrane-bound receptors for growth factors and insulin also
dimerize. Two receptors form a complex in the membrane and in doing so activate the intracellular
domain of the receptor, in this case, a tyrosine kinase.

Many hormone receptors, for example, those for thyroid hormone, the sex hormones,
the corticosteroids, and retinoic acid, are soluble receptors that can move freely in
the cytosol, that is, the cell fluid. After binding the agonist, the complex migrates to
the nucleus. There, it binds as a dimer to the signal sequences of the DNA, the
operator and repressor genes, and induces or suppresses the new synthesis of
specific proteins (Fig. 22.6b).
All cytosolic hormone receptors or nuclear receptors are built from common
structural principles (▶ Sect. 28.2). They exhibit domains with a DNA-binding site
and a ligand-binding site. The DNA-binding site is highly conserved, that is, its
amino acid sequence varies very little between the different receptors. It contains
two “zinc fingers” comprising two Zn2+-binding sites that are highly conserved
motifs binding to very specific DNA segments, the so-called recognition sequences.
The ligand-binding site is much more variable. Dimers, either of two identical
receptors (homodimers) or of two different receptors (heterodimers), are formed
for the interaction with DNA. Four zinc fingers in the dimer recognize 12 base pairs
of DNA in total.
Dimerization is also found in other classes of membrane-bound receptors that
do not belong to the GPCR type. Among these are the receptors for growth
factors, for example, for human growth hormone (hGH), epidermal growth factor
(EGF), and insulin (▶ Sect. 29.8). Upon binding to the factor, these receptors
dimerize in the membrane with the extracellular domains. As a consequence, intra-
cellular kinases are activated that are part of the receptor protein (Fig. 22.6c). In
addition, there are receptors that must form complexes of more than two units to
provoke a receptor response. Among these are a series of immunologically important
receptors as well as receptors for the nerve growth factor (NGF) and tumor necrosis
factor (TNF).
Multiple examples of proteins are presented in this section that exert their function
as oligomers. Indeed, oligomer formation is also common in enzymes. There are
many reasons why oligomerization is advantageous. On the one hand, there are
functional requirements that demand, as described above, multiple neighboring
domains. On the other hand, there can be mechanistic advantages, especially with
enzymes. Individual domains of an oligomer are not necessarily independent of one
another. Their catalytic efficiency can depend upon what conditions the other
domains of the oligomer are currently in. This affords an additional possibility to
regulate the protein function. Oligomerization can also have another meaning. The
interior of a cell is crowded with proteins, ligands, substrates, and ions. It can be
compared with a ticker-tape parade given for a winning football team: hectic pushing
and shoving! One way to reduce the number of particles without limiting the catalytic
productivity by sacrificing catalytic centers is the formation of oligomers.

22.6 Drugs Regulate Ion Channels: Our Extremely Fast Switches

Ion channels, which are embedded in the cell membrane, allow ions to enter or leave
the cell along the corresponding electrochemical concentration gradient when they
are open. The opening or closing of the channel can be either voltage-, ligand-, or
receptor-gated. All of these processes occur extraordinarily fast (▶ Sect. 30.1).
The intracellular Ca2+ ion concentration in all cells is several orders of magnitude below
that of the surrounding medium. At the moment of cellular stimulation, all of the
voltage-gated calcium channels are momentarily opened by the arrival of an
electrical signal. An influx of Ca2+ ions into the cell occurs. The intracellular
concentration rises swiftly without ever reaching the extracellular concentration.
In smooth, skeletal, and heart muscle cells, this process induces a contraction. Then
the excess Ca2+ ions are pumped out of the cell, and a resting phase follows. This
process is repeated very quickly in heart cells, in a rhythm of less than a second,
corresponding to the length of a heartbeat.
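The driving force that such a concentration difference exerts on an ion can be estimated with the Nernst equation, E = (RT/zF) ln(c_outside/c_inside). The equation is standard electrochemistry and is not given in this chapter. In the Python sketch below, the function name and the two concentrations (a 10,000-fold gradient) are assumptions chosen only for illustration.

```python
import math

R = 8.314462618   # gas constant in J/(mol*K)
F = 96485.33212   # Faraday constant in C/mol


def nernst_potential_mv(conc_outside, conc_inside, charge, temperature_k=310.15):
    """Equilibrium potential of an ion (inside relative to outside) in millivolts."""
    ratio = conc_outside / conc_inside
    return 1000.0 * R * temperature_k / (charge * F) * math.log(ratio)


if __name__ == "__main__":
    # Assumed, illustrative Ca2+ concentrations in mol/L.
    outside, inside = 1e-3, 1e-7
    potential = nernst_potential_mv(outside, inside, charge=2)
    print(f"Ca2+ equilibrium potential: {potential:.0f} mV")
```

A strongly positive equilibrium potential means that Ca2+ ions flow into the cell as soon as the channels open, in line with the influx described above.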
Verapamil and nifedipine (▶ Sect. 2.6) affect such voltage-gated calcium chan-
nels and inhibit the influx of calcium ions. They are called “calcium channel
blockers,” which describes the mode of action of these substances. By inhibiting
the influx of Ca2+ ions, the excitability of the cells, for instance, of heart cells, is
decreased, less energy is used, and the muscle works more
economically. Furthermore, calcium channel blockers offer protection from the high
calcium concentrations that are caused by cell demise in poorly perfused areas, for
instance, during a heart attack. A particularly favorable therapeutic effect is their
ability to lower blood pressure.
The nicotinic acetylcholine receptor (nAChR, Fig. 22.6a) and the family of
glutamate receptors belong to the class of ligand- or receptor-gated ion channels.
Here the opening and closing of the channel is not accomplished by an electrical
impulse but rather by the binding of a ligand.
Many drugs affect ion channels (▶ Chap. 30, “Ligands for Channels, Pores, and
Transporters”). Local anesthetics and antiarrhythmic drugs, which are derived from
the former, are sodium channel blockers; they reduce the excitability of nerve cells.
The venom of the fugu fish, tetrodotoxin (▶ Sect. 6.2), also blocks this channel.
Other antiarrhythmic agents block potassium channels. Substances that stabilize
the K+ channel in an open state, so-called K+ channel openers, act as vasodilators
and decrease the blood pressure. The antidiabetic sulfonylureas are K+ channel
blockers that act on the insulin-producing cells in the pancreas (▶ Sect. 30.2).
Tranquilizers of the benzodiazepine type (▶ Sect. 30.6) increase the binding of
the neurotransmitter γ-aminobutyric acid (GABA) to chloride channels. Prolonged
opening of this channel causes an increased influx of chloride ions and with it
a change in the response behavior of the nerve cells. Barbiturates and inhaled
anesthetics also act on the GABA receptors, but on different domains.

22.7 Blocking Transporters and Water Channels

Transporters are proteins that bring about the active uptake of molecules or ions into
cells. They play a decisive role in the digestive process. Because amino acids
and sugars cannot cross membranes on their own, they can only be absorbed with the
help of transporters in the digestive tract.

Fig. 22.7 Nerve signal transmission through neurotransmitters is based on a complex interplay of
enzymes, receptors, ion channels, and transporters. Dopamine is produced by enzymatic
decarboxylation of the amino acid L-DOPA. As with other neurotransmitters, it is stored in special
vesicles. Upon electrical stimulation, Ca2+ ions flow into the cell. This causes the neurotransmitter
to be released into the synaptic gap. The nerve impulse is conducted further by the interaction with
the postsynaptic receptor. Finally, the uptake in the presynaptic cell is accomplished by
a transporter and the neurotransmitter is stored in a vesicle again, or degraded by the enzyme
monoamine oxidase (MAO).

Transporters are also exceedingly important for signal transmission of nerve
cells. A neurotransmitter must be rapidly removed from the synaptic gap after its
release to prevent a prolonged stimulation of the nerve cell. This is accomplished in
part by metabolic degradation, but that is very wasteful for the releasing cell. An
uptake (incorrectly called reuptake) with the help of a specific transporter is more
economical. The neurotransmitter is stored in vesicles and held at the ready for the
next release.
Transporters work against concentration gradients. The transport process is
relatively slow, much slower than an ion channel, and it costs energy. The amino
acid sequence of the specific transporter is known for many neurotransmitters,
amino acids, sugars, and nucleosides. As with the G protein-coupled receptors,
the transporters are differentiated into many families. Most have an even more
complex structure with 12 transmembrane domains (▶ Sect. 30.8).
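Why transport against a gradient costs energy can be illustrated with the free energy of transfer, ΔG = RT ln(c_to/c_from) + zFΔψ. This is a standard thermodynamic relation and is not derived in this chapter. The function name and the example values in the Python sketch below are hypothetical.

```python
import math

R = 8.314462618   # gas constant in J/(mol*K)
F = 96485.33212   # Faraday constant in C/mol


def transport_free_energy_kj(conc_from, conc_to, charge=0,
                             membrane_potential_v=0.0, temperature_k=310.15):
    """Free energy change in kJ/mol for moving a solute from a compartment
    at conc_from into a compartment at conc_to. membrane_potential_v is the
    electrical potential of the destination minus that of the origin."""
    chemical = R * temperature_k * math.log(conc_to / conc_from)
    electrical = charge * F * membrane_potential_v
    return (chemical + electrical) / 1000.0


if __name__ == "__main__":
    # Hypothetical uncharged solute moved against a tenfold gradient.
    uphill = transport_free_energy_kj(1.0, 10.0)
    downhill = transport_free_energy_kj(10.0, 1.0)
    print(f"Against the gradient: {uphill:+.1f} kJ/mol (requires energy)")
    print(f"Along the gradient:   {downhill:+.1f} kJ/mol (spontaneous)")
```

A positive value has to be paid for, for example by ATP hydrolysis or by coupling to another gradient, whereas a negative value corresponds to the passive flow through an open channel.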
A few active substances directly target the transporters and displace the natural
ligands. The euphoric effects of cocaine are due to its binding to the dopamine
transporter, which is responsible for the active transport and uptake of dopamine in
the nerve cells. A fast flood of cocaine causes a delayed uptake of dopamine from
the synaptic gap, and this is responsible for the typical physical and psychiatric
effects. A few antidepressants are ligands for the noradrenaline and serotonin
transporters (▶ Sect. 1.4). They are bound, but not transported into the cell. In
contrast, some analogues of amino acids are brought into nerve cells by transporters
and act there as neurotoxins. An overview of the complex interplay of neurotrans-
mitters, enzymes, receptors, and transporters is presented in Fig. 22.7. Some
anti-gout drugs bind to the uric acid transporter. They displace uric acid, inhibit
its absorption from primary urine, and accelerate the excretion of uric acid with the
urine. There are even specific transporters for bile acids.
In addition to the previously described transporters, other representatives of this
protein class are also important for the uptake or excretion of foreign substances into or
out of cells. Tumor cells often react to therapeutic measures by developing multiple
resistance to many structurally diverse substances (▶ Sect. 30.8). Glycoprotein GP
170, also a transporter with 12 transmembrane domains, is responsible for this process.
In contrast to ion channels, ion transporters work against the concentration
gradients. This is an active process that occurs at the expense of energy. Drugs can
influence this too. An example is agents that increase urine production: diuretics.
They inhibit different ion transporters. Na+/K+ ATPase, a pump that exchanges
sodium for potassium ions, is inhibited by the cardiac glycosides, which are
prescribed to treat congestive heart failure. Substances of the omeprazole type
(▶ Sects. 3.6 and ▶ 9.5) inhibit the H+/K+ ATPase, the so-called proton pump.
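
The energy cost of pumping against a gradient can be estimated with the standard thermodynamic relation ΔG = RT ln(c_to/c_from) + zFΔψ. The following is a minimal sketch of that estimate; the concentrations and the membrane potential are illustrative assumptions (typical textbook orders of magnitude), not values taken from this chapter, and the function name is hypothetical.

```python
import math

R = 8.314      # gas constant, J/(mol*K)
F = 96485.0    # Faraday constant, C/mol

def transport_free_energy(c_from, c_to, charge=0, potential_v=0.0, temp_k=310.0):
    """Free energy (kJ/mol) needed to move one mole of solute from c_from to c_to.

    potential_v is the electrical potential of the destination compartment
    minus that of the source compartment, in volts.
    A positive result means energy must be supplied (active transport).
    """
    chemical = R * temp_k * math.log(c_to / c_from)
    electrical = charge * F * potential_v
    return (chemical + electrical) / 1000.0

# Assumed example 1: an uncharged solute concentrated tenfold
uphill = transport_free_energy(1.0, 10.0)
downhill = transport_free_energy(10.0, 1.0)

# Assumed example 2: Na+ exported from 12 mM (inside) to 145 mM (outside),
# with the outside 70 mV more positive than the inside
sodium_export = transport_free_energy(12.0, 145.0, charge=1, potential_v=0.070)

print(f"tenfold uphill:   {uphill:.2f} kJ/mol")
print(f"tenfold downhill: {downhill:.2f} kJ/mol")
print(f"Na+ export:       {sodium_export:.2f} kJ/mol")
```

A positive value corresponds to the active, energy-consuming transport carried out by pumps such as the Na+/K+ ATPase; a negative value corresponds to the passive flow that a channel permits.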
Nature uses special water channels to regulate water homeostasis, and also to
quickly and selectively transport small, non-charged molecules such as glycerol
or urea across the cell membrane. In contrast to the transporters, and analogously
to the ion channels, these allow water to flow along the osmotic gradient
(▶ Sect. 30.9). Ten isoforms that display different permeabilities have been
discovered in mammals. They are tetramers in which each monomer is composed of
six transmembrane helices and forms its own channel. In part, the channels are made
available for water homeostasis by the release of cytosolic vesicles; alternatively,
they can be activated by phosphorylation. Regulation of the water channels by drugs
represents a diuretic therapy concept, but the treatment of parasitic infection has
also been discussed as an additional indication.

22.8 Modes of Action: A Never-Ending Story

The therapy of viral, bacterial, and parasitic diseases attempts to target a pathogen
very specifically. For this, various mechanisms are exploited, for example,
biosynthetic pathways that are either not present in humans in an identical form
or that do not play an important role in humans. In this way, the danger of adverse
effects can be minimized from the beginning.
Antimetabolites are substances that are incorporated as a false substrate instead
of the natural biological reagents, for example, as enzyme cofactors or in DNA. An
example is the sulfonamide sulfonamidochrysoidine. Its cleavage product sulfanil-
amide (▶ Sect. 2.3) is similar to p-aminobenzoic acid, which is the starting material
in the biosynthesis of an important bacterial cofactor, dihydrofolic acid. Only
bacteria are affected by this. Humans are not dependent on this biosynthetic
pathway. As with other mammals, humans must obtain dihydrofolic acid from
food. A few virostatics and tumor-inhibiting substances are nucleoside analogues.
Depending on their structure type, they use a modified base, a modified
sugar, or both. All influence the DNA or RNA synthesis. Aciclovir and a few other
analogues are taken into the cells as Trojan horses in the inactive form, and
“armed” once inside the cell. Their activation is carried out by viral enzymes,
and this process only occurs inside cells that have been infected by the virus
(▶ Sects. 9.5 and ▶ 32.5). Another mechanistic principle tries to interfere with
the translation process so that particular proteins are never manufactured in the first
place by protein biosynthesis. For this, the translation of the mRNA is blocked by
complexation to so-called antisense oligonucleotides (▶ Sect. 32.4). The formed
double-stranded mRNA cannot be read in the ribosome. Such a therapy can find
application for the treatment of exaggerated immune reactions, septic shock,
arterial hypertension, pulmonary emphysema, or pancreatitis.
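
The antisense principle described above rests on Watson-Crick base pairing: the oligonucleotide must be the reverse complement of the targeted mRNA segment. The following minimal sketch derives such a strand; the target sequence is a hypothetical fragment chosen only for illustration, and a plain DNA oligonucleotide is assumed (therapeutic oligonucleotides usually carry chemical modifications that are not modeled here).

```python
# Watson-Crick partners of the RNA bases; the oligo is written as DNA (T instead of U)
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_oligo(mrna_segment):
    """Return the DNA antisense strand (5'->3') for an mRNA segment given 5'->3'."""
    seq = mrna_segment.upper()
    unknown = set(seq) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # Complement each base and reverse the order so the result also reads 5'->3'
    return "".join(COMPLEMENT[base] for base in reversed(seq))

target = "AUGGCCAAGUUC"          # hypothetical mRNA fragment
oligo = antisense_oligo(target)
print(oligo)                      # prints GAACTTGGCCAT
```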
Many antibiotics, for example, the penicillins and cephalosporins
(▶ Sect. 23.7), inhibit the bacterial cell-wall biosynthesis. In the latter process,
they block the catalytic center of a transpeptidase that shows a similar mode of
action to a serine hydrolase (▶ Sect. 23.7). The antibiotic D-cycloserine, also an
inhibitor of the cell-wall construction, penetrates the interior of the bacteria by
using a D-alanine transporter. Other antibiotics are protein biosynthesis inhibitors
(▶ Sect. 32.6). Tetracycline (▶ Sect. 6.3), streptomycin (▶ Sect. 6.3), and
chloramphenicol (▶ Sect. 9.2) also inhibit the protein synthesis machinery. They
interact with the 30S or 50S subunit of the ribosome and block ribosomal
peptide synthesis. The elucidation of the spatial structure of the ribosome established
fundamentals that allowed the mode of action of a large number of macrolide
antibiotics to be understood and afforded a perspective on how mechanisms of
resistance develop (▶ Sect. 32.6). Antibacterial quinolone carboxylic acids
inhibit gyrase. The latter enzyme causes a twisting, and as a consequence enables
a dense packing of the DNA in the bacterial cells. Without this twisting, there is
simply not enough space in the cell for the genetic material. The so-called polyene
antibiotics are used to treat fungal infections. They form channels in the fungal cell
membrane that cause a loss of intracellular ions and, consequently, cell death.
Azoles inhibit the biosynthesis of ergosterol, which is absolutely required for the
construction of the intact cell membrane.
Alkylating agents play an important role in tumor therapy. Reading and writing
errors occur because of the alkylation of DNA bases, and these errors have a much
stronger effect on quickly dividing tumor cells than in normal cells, but they also
have considerable side effects. Intercalating tumor therapeutics are planar
molecules that slip between two base pairs of DNA (▶ Sect. 14.9). The disruption that
occurs as a consequence also leads to errors in cell division. Other DNA ligands
bind in the minor or the major groove on the exterior of the double helix. Taxol
(▶ Sect. 6.1) and the epothilones are important active substances for cancer
therapy. They bind to tubulin, a protein that forms tube-like structures: so-called
microtubuli. Because the formation of such structures is an important prerequisite
for cell division, Taxol or the epothilones inhibit this process in a very specific way.
The immunosuppressive ciclosporin (▶ Chap. 10, “Peptidomimetics,” Fig. 10.2)
blocks the activation of particular cells of the immune system, the so-called helper cells. Two
enzymes are involved in this process. One of them, cyclophilin, is a prolyl cis–
trans isomerase. The other, calcineurin, is a Ca2+/calmodulin-dependent phosphatase.
Ciclosporin acts as “putty” between these two proteins. The complex formation
prevents the activation of helper cells and therefore stops the stimulation of an
immune response. Modern transplant surgery would not be possible without the
immunosuppressive ciclosporin and substances with an analogous mode of action.
The so-called RAS proteins play an important role in tumorigenesis. They are
a family of enzymes with a relatively low molecular weight. RAS proteins with
mutated active centers lose their ability to control cell division, and the cells divide
unstoppably. Therefore they are oncogenic, that is, they cause tumors. Around 50%
of all lung and colorectal tumors have mutated ras genes, and about 95% of the ras
genes in pancreatic tumors are mutated. There are other approaches for therapy.
RAS proteins must migrate from the cytosol, the cell fluid, into the cell membrane
to signal the cell division. For this, they are enzymatically equipped with a farnesyl
group, which anchors the protein in the cell membrane. The prevention of the
membrane embedding by inhibiting farnesyltransferase represents an attractive
approach for targeted cancer treatment (▶ Sect. 26.10). In the meantime, it has been
demonstrated that this principle of blocking the farnesylation of proteins can also be
used to treat parasitic infections. For this, the farnesyl transferases of these parasites
are the target structures for drug development.
Tumor-suppressor genes produce proteins such as the p53 protein that prevent
cell division in the case of DNA damage. Any genetic defect in a cell leading to
a reduced concentration of one or more of these proteins has the consequence that
cells with defective DNA can proliferate. Cell division runs out of control, and
a tumor with additional genetic defects and uncontrolled growth forms.
A vascular occlusion is caused by the aggregation of blood platelets. Proteins on the
cell surface play an important role, for example, the adhesion glycoprotein αIIbβ3. Two
of these molecules form a complex with fibrinogen that “glues” the cells together. The
targeted development of low-molecular-weight peptidomimetics (▶ Sect. 10.6)
starting from an RGD motif (RGD stands for Arg-Gly-Asp) represents a great success
in rational drug design (▶ Sect. 31.2). The selectins are another system that plays an
important role in the cell–cell recognition between leukocytes and endothelial cells. In
cases of inflammation, the E- and P-selectins are upregulated and presented on the
endothelium, and these prevent leukocytes from rolling along the surfaces of the blood
vessels (▶ Sect. 31.3). After adhesion, the leukocytes penetrate the vessel and migrate
to the site of the inflammation to fight the infection. In some diseases, an excessive
leukocyte infiltration leads to tissue damage. To prevent this, an attempt is made to
interfere with the inflammatory cascade with compounds that block the surface
exposition of selectins. These receptors recognize sugar-like molecular groups on
the leukocyte surface; the development of appropriate antagonists based on
carbohydrates therefore represents a suitable therapeutic concept.
A surface contact must also be formed between the flu virus and the host cell for
infection to take place. The virus docks with its capsule protein, hemagglutinin, to
the host cell to initiate endocytosis. After gaining entry into the cell, it uses the
protein biosynthesis machinery of the infected cell to make copies of itself. After
maturation, the new virus must be expelled from the cell again. For this, the new
virus buds on the cell surface and the bud is finally cinched off. In the last step, the
viral neuraminidase cleaves sialic acid. It is through this acid that the viral
hemagglutinin is bound to the host cell. This last step can be blocked by neuraminidase
inhibitors (▶ Sect. 31.4). The inhibitors zanamivir and oseltamivir have been very
successfully introduced to the market. The CCR5 receptor antagonist maraviroc has
been launched for the therapy of HIV; the CCR5 receptor acts as an entry gate for
the HI virus, and its inhibition blocks host cell invasion.
The endogenous immune system has developed very efficient defensive
mechanisms. Antibodies represent one such defensive weapon. These proteins are
able to bind to foreign substances very selectively and with high affinity, and to
expose them to phagocytotic cells (i.e., dendritic cells and macrophages) for
degradation. This sophisticated, highly specific recognition system for molecules,
which ranges from very small low-molecular-weight antigens to complex
macromolecular systems, has been tapped for pharmaceutical therapy (▶ Sect. 32.3).
Today, numerous artificially manufactured antibodies directed against very
different target molecules are found in the therapy of many different diseases.
There is no end in sight because currently about 200 newly developed antibodies
are in clinical trials.
There are only very few really “unspecifically” acting drugs. Antacids, which
neutralize gastric acid purely chemically, belong to this class, as do purely
surface-active substances, for instance, amphiphilic bactericides, fungicides, and
hemolytics. Specific mechanisms of action have been recognized even for the
barbiturates, local anesthetics, inhalation anesthetics, and alcohol, which was
long considered to be an unspecific agent. Frequently the evidence of a specific
effect was provided by the different effects of the pure enantiomers of a racemate.
The β-antagonistic effect of an optically active β-blocker is associated with one
enantiomer (▶ Sect. 5.5). The unspecific adverse effects on membranes, however,
are attributed to both enantiomers equally.
Is there anything new to still be discovered? An absolute surprise was the finding
that nitrogen monoxide, NO, a minuscule molecule, is also a neurotransmitter.
Substances that release NO or that interfere with the NO biosynthesis lower or raise the
blood pressure (▶ Sect. 25.8). New subtypes are constantly being discovered for
already-established receptors. The question of to what extent it is reasonable to
optimize an active substance for absolute receptor specificity remains an unsolved
problem. It can certainly be the case that some active substances with targeted attacks
on multiple receptors or their subtypes are better suited for therapy than highly
specific analogues. This is particularly valid for compounds that bind to GPCRs.
Here, the activity profile against an entire palette of receptor subtypes is critical
for the efficacy of a compound. Numerous GPCRs are even involved in our sense
of smell, which follows this principle of multiple graduated receptor responses
(▶ Sect. 29.7). It is only in this way that the finely tuned and nuance-rich perception
diversity can be achieved. This is a broad field of research. To date, particularly for
CNS-active substances, only clinical research can deliver the results needed to make
a decision about the therapeutic usefulness of a compound.

22.9 Resistance and Its Origin

Pathogenic viruses, bacteria, and parasites defend themselves against drug therapy.
In the past the inappropriate and too-broad use of antibiotics led to selection
pressure for resistant strains. Unfortunately, it is the hospitals above all that are
the main location for the emergence and spread of resistant strains. The spatial
proximity and concentration of the most diverse pathogens is virtually unavoidable.
In some cases there are only a few effective weapons left, for example, the
glycopeptide antibiotics. They should be used prudently and purposefully, even if
that goes against the commercial interests of the manufacturer.
Bacterial pathogens overwhelmingly defend themselves against penicillins and
cephalosporins by producing β-lactamases (▶ Sect. 23.7). These are enzymes
that open the four-membered lactam ring of these antibiotics to give inactive
cleavage products. During the long time that this substance class was optimized,
metabolically stable analogues as well as specific β-lactamase inhibitors were
developed.
The causative agent of the immune deficiency disease AIDS (▶ Sects. 1.3 and
▶ 24.3), the HI virus, a retrovirus, transfers its genetic information from the RNA
back into DNA. This process is afflicted with an exceedingly high error rate of
about one base mutation per generation. The high mutation rate leads to the fast
emergence and selection of resistant strains. In the last 10 years many active
substances with entirely different modes of action against the HI virus have been
introduced to the market, but resistances to many inhibitors were very quickly
observed, for example, against the HIV protease (▶ Sect. 24.3) or reverse
transcriptase inhibitors (▶ Sect. 32.5), and even multiple resistances. The mutated
viruses are even resistant to multiple, structurally different inhibitors! The
combination of different active substances against one and the same target is of
little help here. Only a combination of active substances that hit the virus at
completely different instances of its lifecycle offers a reprieve.
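
Why attacking several independent targets helps can be made plausible with a back-of-the-envelope estimate. The sketch below is deliberately crude: it takes the rate of about one base mutation per genome copy from the text, and assumes a genome length of roughly 10,000 nucleotides, uniformly distributed and independent mutations, and a daily virus production of 10^10 particles. The last three are illustrative assumptions, not data from this chapter.

```python
GENOME_LENGTH = 10_000        # assumed order of magnitude of the HIV RNA genome
MUTATIONS_PER_GENOME = 1.0    # from the text: about one base mutation per generation
SUBSTITUTIONS_PER_SITE = 3    # each base can be replaced by three other bases
VIRIONS_PER_DAY = 1e10        # assumed daily virus production in one patient

def p_specific_substitution():
    """Chance that one particular base change occurs in one genome copy."""
    return MUTATIONS_PER_GENOME / (GENOME_LENGTH * SUBSTITUTIONS_PER_SITE)

def expected_mutants_per_day(k):
    """Expected number of new genomes per day carrying k specific changes at once."""
    return VIRIONS_PER_DAY * p_specific_substitution() ** k

for k in range(1, 5):
    print(f"{k} simultaneous specific mutation(s): "
          f"{expected_mutants_per_day(k):.1e} genomes per day")
```

Under these assumptions every single point mutant arises many times each day, whereas a genome that simultaneously carries three or more specific resistance mutations becomes a rare event. This is the rationale for hitting the virus at several points of its life cycle at once.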
Tuberculosis is also reemerging. Resistant pathogens require the development of
new therapeutics. After the convincing success of the mosquito extermination
campaign with DDT and therapy with synthetic antimalarials, malaria is again
progressing in developing countries.
The largest problem in the therapy of tumors is the development of multidrug
resistance (MDR) during the treatment. The resistance is not only directed against the
inducing agent; it also occurs simultaneously against entirely different
tumor therapeutics. This multidrug resistance is due to the overexpression of
a transporter (▶ Sects. 22.7 and ▶ 30.8), glycoprotein 170, which can largely
eliminate structurally deviating xenobiotics from the cell. Although GP170 prefers
cationic substances, another transporter, the multidrug resistance-associated protein
(MRP) eliminates amphiphilic anionic substances, compounds with polar and
nonpolar character. But amphiphilic substances are also able to break the resistance
of tumor cells. Quantitative structure–activity relationships show that tumor cell
resistance to particular drugs is mainly associated with similarities in their
molecular weights, that is, the size of the inducing agent, and its lipophilicity.

22.10 Combined Administration of Drugs

Combination drugs are very popular with pharmaceutical manufacturers, doctors,
and patients alike. The manufacturers value them because they expand the
indication field of a successful substance and bring new life into their sales
figures. Some physicians are pleased that the therapy is simplified in many
cases, but others reject such combination preparations. An advantage for older
patients is that they do not need to take so many different medications at different
times of the day and in different doses, rather only one or a few combination
drugs. This improves the reliability of the dosing, that is, the compliance. One of
the most common reasons for therapy failure is, in fact, the behavior of the
patient. Either the regular dose is forgotten, or the patient gives himself or herself
a break from the regime over the weekend or while on vacation. These behaviors
are particularly pronounced in older patients, with medications that show no
obvious immediate success, or with drugs that have side effects that the patient
subjectively experiences as unpleasant.
Clinical pharmacologists, academics, and many critically oriented physicians
have considerable reservations regarding combination preparations. This is
understandable if one considers that adjusting a patient to a particular
medication requires the observation of a dose–effect relationship over a long period of
time, and finally, an individual therapy. In a combination medication, there is
always a fixed ratio between the individual components. Many
combinations, for example, analgesics, contain components with different modes of action.
These are often misused without a strict medical indication, and are therefore to be
judged critically.
There are reasonable combinations that even opponents to the general concept
of combination therapy would accept without reservations. Among these are
• L-DOPA preparations, the side effects of which can be reduced by selective
combination (▶ Sects. 9.4, ▶ 26.9, and ▶ 27.8);
• Antihypertensives and diuretics, the different mechanistic principles of which
complement one another;
• Antibacterial preparations in which a dihydrofolate reductase inhibitor
(▶ Sect. 27.2) is combined with an appropriate sulfonamide;
• Hormonal contraceptives (▶ Sect. 28.5);
• Polyvalent vaccines, with which a single application offers protection against
multiple diseases.
In the case of L-DOPA therapy, only combinations of multiple active substances
reduce the side effects to a tolerable level. A single principle is often not enough to
accomplish the same effect that is achieved with combinations in the case of
antihypertensives and diuretics. In the case of the sulfonamide combinations and
with antituberculosis compounds, attacking via diverse modes of action can prevent
or delay the development of resistance. An inhibitor for the P450 family of metabolic
enzymes can be justifiable as an adjuvant to expensive medication or with drugs that
are used in very high doses. In this way, the concentration of the other drug can be
held at a higher level and for a longer time (▶ Sect. 27.7). Important prerequisites
for all combination medications are an adequate therapeutic window and adapted
pharmacokinetics of the components, at least those that support the actual mode of
action.
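
The effect of a metabolic inhibitor on the plasma level of a co-administered drug can be illustrated with the simplest pharmacokinetic model, a one-compartment model with first-order elimination after an intravenous bolus. This is only a sketch under stated assumptions: the initial concentration, the rate constants, the threshold, and the premise that the P450 inhibitor halves the elimination rate are all hypothetical numbers chosen for illustration.

```python
import math

def concentration(t_h, c0, k_el):
    """Plasma concentration at time t_h (hours) for first-order elimination."""
    return c0 * math.exp(-k_el * t_h)

def half_life(k_el):
    """Elimination half-life in hours."""
    return math.log(2) / k_el

def time_above(c0, k_el, threshold):
    """Hours during which the concentration stays above the threshold."""
    if c0 <= threshold:
        return 0.0
    return math.log(c0 / threshold) / k_el

C0 = 10.0                   # hypothetical initial concentration, mg/L
K_ALONE = 0.35              # hypothetical elimination rate constant, 1/h
K_INHIBITED = K_ALONE / 2   # assumption: the inhibitor halves metabolic elimination
MIN_EFFECTIVE = 1.0         # hypothetical minimum effective concentration, mg/L

for label, k in [("drug alone", K_ALONE), ("with P450 inhibitor", K_INHIBITED)]:
    print(f"{label}: t1/2 = {half_life(k):.1f} h, "
          f"C(6 h) = {concentration(6, C0, k):.2f} mg/L, "
          f"effective for {time_above(C0, k, MIN_EFFECTIVE):.1f} h")
```

In this simplified model, halving the elimination rate constant doubles both the half-life and the time spent above the effective concentration. The same reasoning shows why the therapeutic window matters: the higher, longer-lasting level must still remain below the toxic range.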

22.11 Synopsis

• A relatively small portion of the druggable genome has been pharmaceutically
addressed, and GPCRs are overrepresented among the targets for which active substances are
available. Protein kinases represent a particularly promising emerging family of
targets.
• Enzymes are very popular drug targets, and the natural substrates often provide
the starting point for a rational drug-design approach. There are three types of
enzyme inhibitors: competitive inhibitors, non-competitive inhibitors, and
allosteric inhibitors. Enzyme inhibition can also be classified as reversible
and irreversible. Nowadays reversible inhibition is desired, but some very
important drugs are irreversible inhibitors, and some reversible inhibitors have
such high affinity that they are de facto irreversible inhibitors.
• Receptors are also important drug targets; they can be subdivided into GPCRs, ion
channels, hormone receptors, and growth factor receptors. An agonist activates
the receptor, an antagonist prevents the agonist from docking at its binding site,
and an inverse agonist stabilizes an inactive conformation of the receptor.
• Ion channels are extremely fast gateways for ions and can be either voltage- or
ligand-gated. Ions can flow only passively with the concentration gradient through
an ion channel.
• Transporters are special proteins in the membrane that can pump molecules
and ions against the concentration gradient at the expense of ATP hydrolysis.
Many transporters are attractive drug targets, and others are responsible for the
development of drug resistance.
• There is a large variety of known modes of action for drugs. Some of the most
diverse modes of action are found in anti-infective drugs. Furthermore, tumor
therapeutics exploit diverse, toxic modes of action. The goal in addressing these
modes of action in terms of a therapy is to find a pathophysiological process that
is unique, or is as unique as possible to the disease to spare healthy tissue from
damage.
• Drug resistance is an increasingly serious problem and is both an inevitable
occurrence associated with using a pharmaceutical therapy, and a consequence
of the misuse of anti-infectives. There are several mechanisms of resistance
development in bacteria (e.g., enzyme production), viruses (e.g., fast genetic
mutations), and in cancer therapy (e.g., aberrant transporter expression). These
mechanisms are not mutually exclusive.
• The issue of combination drugs is a controversial topic. Some physicians are
against them, and others are in favor of them, and both sides of the argument
have good reasons. Nonetheless, some drug combinations are justifiable and
help with compliance, clinical efficacy and safety.

Bibliography

General Literature
Folkers G (1995) Lock and key – a hundred years after, Emil Fischer commemorate symposium.
Pharm Acta Helv 69:175–269
Hopkins AL, Groom CR (2002) The druggable genome. Nat Rev Drug Discov 1:727–730
Imming P, Sinning C, Meyer A (2006) Drugs, their targets and the nature and number of drug
targets. Nat Rev Drug Discov 5:821–834
Overington JP, Al-Lazikani B, Hopkins AL (2006) How many drug targets are there? Nat Rev
Drug Discov 5:993–996
The journals: Trends in Pharmacological Sciences, Chemistry & Biology, Nature Reviews Drug
Discovery or Pharmazie in unserer Zeit contain in each edition a highly topical article about the
mode of action of a biologically active substance.

Special Literature

Austin DJ, Crabtree R, Schreiber SL (1994) Proximity versus allostery: the role of regulated
protein dimerization in biology. Chem Biol 1:131–136
Hayes JD, Wolf CR (1990) Molecular mechanisms of drug resistance. Biochem J 272:281–295
Rawlings ND, Morton FR, Barrett AJ (2006) MEROPS: the peptidase database. Nucleic Acids Res
34:D270–D272, http://merops.sanger.ac.uk/
Saudou F, Hen R (1994) 5-HT receptor subtypes: molecular and functional diversity. Med Chem
Res 4:16–84
Westkaemper RB (1993) Serotonin receptors: molecular genetics and molecular modeling. Med
Chem Res 3:269–272
23 Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate

Peptidases and esterases are hydrolytic enzymes; 2–3% of all gene products are
assigned to this group alone. They are therefore an important group of target
proteins for the design of new medicines and have a special importance
for structure-based drug design. This is reflected in the fact that about 14% of
all known human peptidases are presently being investigated as possible target
structures for drug therapy.
The function of these enzymes is the cleavage of peptide or ester bonds for
which a nucleophile is needed for the attack on the carbonyl group of the amide or
ester bond to be cleaved. A large number of proteins use the OH or SH groups of
a serine, threonine, or cysteine for this purpose. In the following chapters, we will
see other cleaving enzymes that use a different mechanism. During the cleavage
reaction of the hydrolases discussed in this chapter, a temporary covalent bond
between substrate and enzyme is formed. This intermediate, the so-called
acyl–enzyme form, occurs with serine, threonine, and cysteine proteases, but
lipases, esterases, transpeptidases, and β-lactamases also use this reaction
mechanism. The design of inhibitors for these enzymes that act via an
acyl–enzyme intermediate shall be discussed. In the following two chapters,
peptidases that use a water molecule for the primary attack on the peptide bond
to be hydrolyzed shall be discussed: the aspartic and metallopeptidases.
Depending on whether they cleave the amino acid chain at the N or C terminus
or in the center, the peptidases are classified as amino-, carboxy-, or
endopeptidases. Some of these proteases are relatively unspecific, whereas others are
highly specific and only cleave very particular substrates. These latter enzymes
offer the best chance of finding a selective therapeutic inhibitor that causes
only few side effects. Bacteria and viruses also produce their own
peptidases, the inhibition of which can be exploited for chemotherapeutic treatment.
Because these proteins are not endogenous in humans, and therefore also have no
function in us, their inhibition should lead to therapeutic success without risking
severe side effects.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_23,
© Springer-Verlag Berlin Heidelberg 2013

23.1 Serine-Dependent Hydrolases

Serine proteases are the most extensive and best-studied class of peptidases. They
are closely related to the esterases and lipases (hydrolases) that hydrolyze ester
bonds. This enzyme class serves the human body in diverse ways. Some
serine proteases, such as the digestive enzymes trypsin and chymotrypsin, cleave
a broad spectrum of peptides and proteins. Others such as the coagulation enzymes
thrombin and factor Xa are highly selective and only cleave very particular
substrates. Frequently, proteases are expressed in a non-active precursor form, the
so-called zymogens. To transform these into their active form, in many
cases sequence segments of the zymogen polypeptide chain are cleaved that
otherwise serve as endogenous inhibitors of the activated enzyme. The release of
the active form can either occur by autocatalysis (e.g., trypsin) or by other activating
proteases (e.g., the coagulation cascade). An active site serine side chain plays
a decisive role in the catalytic mechanism of serine proteases, esterases, and lipases.
It is characterized by an extraordinarily high chemical reactivity. In chymotrypsin,
only this serine reacts with diisopropylfluorophosphate (DFP), whereas 27 other
serine residues in the enzyme remain unmodified. Upon chemical transformation
with DFP, the enzyme completely loses its catalytic activity.

23.2 Structure and Function of Serine Proteases

The digestive enzyme chymotrypsin was the first serine protease for which the
3D structure was determined, by David Blow in Cambridge, England. The
numbering of the amino acids in serine proteases of the chymotrypsin type is based on
the sequence of chymotrypsin. The spatial structures of a large variety of serine
proteases are now available, of which a few are listed in Table 23.1. The structures
show an extraordinarily pronounced similarity in the active site even for proteases
that have entirely different folding patterns (▶ Sect. 14.7, compare trypsin with
subtilisin): a serine, a histidine, and an aspartate adopt the same spatial arrangement.
This so-called catalytic triad of Ser–His–Asp is characteristic of serine
proteases. In some of these enzymes, the aspartate can be replaced by a glutamate,
whereas some transpeptidases and β-lactamases display a lysine in place of the
histidine in the active site.
As these three amino acids are very far apart from one another in the sequence,
the protein must fold appropriately to bring the three side chains into spatial
proximity to one another. The catalytic serine, found at position 195 in the
trypsin-like proteases, carries out the actual attack on the amide bond that is being
cleaved (Fig. 23.1). The oxygen atom of an unactivated hydroxyl group would not
be reactive enough for this step. Its nucleophilicity, which describes its tendency to
attack an electron-poor carbonyl carbon atom, is enhanced by the neighboring
histidine side chain. The imidazole side chain of this histidine can accept
a proton from the serine hydroxyl group, enabling a nucleophilic attack of the now
negatively charged oxygen atom on the partially positively charged carbon atom of

Table 23.1 Serine proteases with physiological importance (X = arbitrary amino acid). The
3D structures of all listed enzymes are known.

Enzyme          Cleavage site            Function or therapeutic approach
Trypsin         Arg–X, Lys–X             Digestive enzyme
Chymotrypsin    Tyr–X, Phe–X, Trp–X      Digestive enzyme
Elastase        Val–X                    Tissue degradation
Thrombin        Arg–Gly                  Blood coagulation
Factor Xa       Arg–Ile, Arg–Gly         Blood coagulation
Factor VIIa     Arg–Ile                  Blood coagulation
Tryptase        Arg–X                    Asthma
Matriptase      Arg–X                    Oncology
Urokinase       Arg–X                    Oncology
DPP IV          Ala–X, Pro–X             Diabetes
Furin           Arg–X                    Viral infection

Fig. 23.1 Catalytic mechanism of serine proteases. (a) The peptide substrate binds to the enzyme
in specific pockets on either side of the cleavage site. (b) The oxygen atom of the serine side chain
carries out a nucleophilic attack. This is fostered by the neighboring histidine side chain, which,
supported by an aspartate residue, accepts a proton from the hydroxyl group. (c) The transition
state collapses with formation of an acyl–enzyme intermediate. (d) This is hydrolyzed by the
attack of a water molecule to release the N-terminal cleavage product.

the amide carbonyl group. The neighboring aspartate can accept a proton from the
histidine imidazole ring, and release it again. In this way, it compensates for the
positive charge that is formed on the histidine residue. To stabilize the transition
state formed upon attack on the carbonyl group, serine proteases have another
characteristic structural motif, the so-called oxyanion hole. This is a small pocket
next to the side chain of Ser195 composed of two main-chain NH groups (Fig. 23.1).
In a few cases, the terminal amide groups of asparagine or glutamine can
accomplish this task. The function of the oxyanion hole is to stabilize the negative
charge formed on the tetrahedral transition state and to distort the geometry of
the attacked carbonyl carbon atom from a trigonal-planar to a tetrahedral config-
uration. The formed transition state collapses with release of the C-terminal cleavage
product which carries a free amino group at its end. The N-terminal cleavage
product remains covalently bound to the protease to give an acyl–enzyme
intermediate. In a subsequent step, a nucleophilic attack by a water molecule
again leads to a tetrahedral transition state. This finally collapses with release
of the N-terminal cleavage product. The catalytic enzyme is then ready for the next
transformation.
What happens if the amino acids serine, histidine, and aspartic acid of the
catalytic triad of a serine protease are individually or collectively exchanged for
amino acids without similar functional groups? In 1988, Paul Carter and James
Wells prepared various mutants of the bacterial serine protease subtilisin
(▶ Sect. 14.7) at Genentech. Exchange of the catalytic serine or histidine for alanine
leads to a reduction in the catalytic activity by more than six orders of magnitude.
Surprisingly, exchange of the aspartic acid, the only function of which is to
exchange a proton with histidine, reduced the catalytic activity by more than four
orders of magnitude. The combined exchange of multiple amino acids of the
catalytic triad led to no further reduction in the catalytic activity. The threefold-
alanine mutant, in which the catalytic triad is completely removed, still cleaves the
peptide substrate more than 1,000 times faster than the pure buffer solution! The
remaining substrate-binding sites and the oxyanion hole, the structure and properties
of which stabilize the tetrahedral transition state, are responsible for this
acceleration.
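
The rate factors quoted for the subtilisin mutants can be translated into approximate differences in activation free energy. The following Python sketch assumes simple transition-state theory, in which a rate ratio corresponds to ΔΔG‡ = RT ln(ratio). The temperature of 298.15 K, the function name, and the use of exactly 10^6, 10^4, and 10^3 as stand-ins for "more than six orders", "more than four orders", and "more than 1,000 times" are illustrative assumptions and are not taken from the original experiments.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def barrier_change_kj(rate_factor, temperature_k=298.15):
    """Activation free-energy difference (kJ/mol) for a given rate ratio.

    Assumes transition-state theory: ddG = R * T * ln(rate_factor).
    """
    if rate_factor <= 0:
        raise ValueError("rate_factor must be positive")
    return R_KJ * temperature_k * math.log(rate_factor)


# Lower bounds from the text, treated here as exact powers of ten (assumption).
examples = {
    "Ser or His -> Ala (activity loss, 10^6)": 1e6,
    "Asp -> Ala (activity loss, 10^4)": 1e4,
    "triple Ala mutant vs. buffer (residual gain, 10^3)": 1e3,
}

for label, factor in examples.items():
    print(f"{label}: about {barrier_change_kj(factor):.0f} kJ/mol")
```

Under these assumptions, each order of magnitude in rate corresponds to roughly 5.7 kJ/mol at room temperature, so the residual thousandfold acceleration of the triple mutant still represents a sizeable stabilization of the transition state.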
Now it is certainly not difficult to destroy the binding site of an enzyme or its
catalytic activity. It is, however, more difficult to purposefully alter its specificity
or function. The subtilisin mutants in which the histidine was exchanged for an
alanine cleave substrates with the sequence -Phe–Ala–X–Phe- (X = Ala or Gln,
for example) six orders of magnitude more slowly than the unaltered subtilisin,
with one exception: A substrate with the sequence -Phe–Ala–His–Phe- is cleaved
only four orders of magnitude more slowly. The histidine of the substrate
takes over the role of the histidine in the catalytic site to a certain extent!
This process is called substrate-supported catalysis. The transformation is indeed
still rather slow, but the specificity of this mutant is distinctly enhanced:
The -Phe–Ala–His–Phe- sequence is cleaved 200 times faster than any of the
other -Phe–Ala–X–Phe- sequences.

23.3 The S1 Pocket of Serine Proteases Determines Specificity

Proteases recognize polypeptide chains as substrates. For this task they use a series
of more-or-less-pronounced binding pockets on their surface, as described in
▶ Chap. 14, “Three-Dimensional Structure of Biomolecules”. These are structur-
ally and electronically complementary to the side chains of the substrate. As
a consequence, the polypeptide chain of the substrate will be immobilized on the
surface in the vicinity of the catalytic site. The crevices on the surface look very
different depending on the protease. Surface portions of four different serine pro-
teases from the trypsin family are shown in Fig. 23.2. A comparison of the different

Fig. 23.2 The surfaces of the trypsin-like serine proteases trypsin, thrombin, factor VIIa, and
factor Xa display deep pockets in the area of the catalytic site. To emphasize this surface
structure better, the color of the surface changes from blue to green to red with increasing
depth. The exposed physicochemical properties in the crevices determine the substrate selectivity
of the protease. The preferred cleavage sequences are indicated in the structures, in which XXX
represents an arbitrary amino acid at this position.

[Fig. 23.3 panel labels: Chymotrypsin (Ser189, Gly216, Gly226); Trypsin (Asp189, Gly216, Gly226); Elastase (Ser189, Val216, Thr226)]

Fig. 23.3 Comparison of the S1 pockets of chymotrypsin, trypsin, and elastase. The binding
pocket of chymotrypsin is tailored for large, lipophilic side chains. The S1 pocket of trypsin binds
amino acids with positively charged side chains through its negatively charged Asp189 residue.
Because of the space filled by the side chains of Val216 and Thr226, elastase has a relatively
small S1 pocket and therefore binds small hydrophobic amino acids such as alanine and valine.

serine proteases with different substrate specificities (Fig. 23.3) shows that, above
all, the structures of the S1 pockets of these enzymes are different. The S1 pocket is
largely formed by the sequence segments 189–195 and 214–220. The significant
differences lie in the side chains of the amino acids at positions 189, 216,
and 226. In chymotrypsin, these are Ser189, Gly216, and Gly226. They tailor the
depth and form of this pocket to accommodate the aromatic side chains of the
amino acids phenylalanine, tyrosine, and tryptophan. Correspondingly, chymotryp-
sin preferentially cleaves peptide chains after one of these three amino acids.
Trypsin also has a deep, spacious S1 pocket that is flanked by Gly216 and
Gly226. The negatively charged carboxylate group of Asp189 on the floor of the
pocket is decisive for the recognition of long, positively charged side chains in the
amino acids lysine and arginine in the substrate. In elastase, the S1 pocket is shaped
by the amino acids Val216 and Thr226. Because of this, the pocket is significantly
smaller. It can only accommodate amino acids with short hydrophobic side chains
such as alanine and valine. Amino acids with large groups are no longer accom-
modated. The amino acid 189, a serine, is buried. The substrate specificity of the
described serine proteases is primarily achieved by the recognition of the amino
acid in the P1 position. The neighboring pockets, however, are also important for
substrate binding and selectivity. It is remarkable that the substrate-binding pockets
of serine proteases recognizing the N-terminal part of the substrate (unprimed side,
S1–S4 pockets; ▶ Sect. 14.5) are more prominently established. The pockets on
the primed side that anchor the C-terminal part of the substrate are much less
well developed. Because the N-terminal cleavage product remains temporarily
covalently bound to the protease as an acyl–enzyme complex, this part of the
substrate is bound particularly selectively.
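
The P1 rule described above can be sketched as a simple lookup. The following Python example is a deliberately simplified illustration: it considers only the P1 residue, ignores the contributions of the neighboring pockets mentioned in the text, and uses a hypothetical test peptide. The dictionary and function names are assumptions made for this sketch.

```python
# P1 preferences, simplified from Table 23.1 and the S1-pocket discussion.
P1_PREFERENCES = {
    "trypsin": {"Arg", "Lys"},
    "chymotrypsin": {"Tyr", "Phe", "Trp"},
    "elastase": {"Ala", "Val"},
}


def predicted_cleavage_sites(peptide, enzyme):
    """Return (P1 position, P1 residue, P1' residue) for each predicted cut.

    `peptide` is a list of three-letter residue codes from N to C terminus.
    Positions are 1-based. Only the P1 residue is evaluated.
    """
    preferred = P1_PREFERENCES[enzyme.lower()]
    sites = []
    # The last residue has no P1' partner, so it cannot precede a cleavage site.
    for i in range(len(peptide) - 1):
        if peptide[i] in preferred:
            sites.append((i + 1, peptide[i], peptide[i + 1]))
    return sites


# Hypothetical peptide, chosen only for illustration.
peptide = ["Gly", "Phe", "Ala", "Arg", "Gly", "Val", "Lys", "Trp", "Ser"]
for enzyme in P1_PREFERENCES:
    print(enzyme, predicted_cleavage_sites(peptide, enzyme))
```

A real specificity prediction would also have to score the S2 to S4 pockets and the primed side; the sketch only shows why the three enzymes cut the same chain at different positions.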
These structural characteristics establish how a conceivable competitive inhib-
itor of a serine protease should look: It is decisive that the S1 pocket is filled as well
as possible. The chemical constitution of the parts of the inhibitor that bind in this
region must be complementary to the S1 pocket. In some cases, the occupancy of
the S1 pocket alone is sufficient to generate a selective serine protease inhibitor with

Fig. 23.4 The molecules 23.1–23.4 that bind in the S1 pocket of trypsin are micromolar
inhibitors. All of these molecules contain a strongly basic group that is protonated under
physiological conditions; therefore a positive charge is available to form a salt bridge
to the negatively charged side chain of Asp189. The thrombin inhibitors 23.5 and 23.6 contain
an additional functional group that can form a covalent bond to the catalytically active serine.
[Structure diagrams not reproduced. Values given in the figure:]
Compound    Trypsin Ki    Thrombin Ki
23.1        18 μM         220 μM
23.2        72 μM
23.3        380 μM
23.4        1500 μM
Below structures 23.5 and 23.6 the figure gives: Thrombin Ki = 6.5 μM.

Table 23.2 Reactive groups that can covalently react with the catalytically active serine.
Inhibitor type    Functional group
Irreversible      Chloromethylketone        –COCH2Cl
                  Sulfonylfluoride          –SO2F
                  Ester (a)                 –COOR
                  Boronic acid (a)          –B(OR)2
Reversible        Aldehyde                  –CHO
                  Ketone                    –COR (R = alkyl, aryl)
                  Trifluoromethylketone     –COCF3
                  α-Ketocarboxylic acid     –COCOOH
                  α-Ketoamide               –COCONHR
                  α-Ketoester               –COCOOR
(a) Reversible as well as irreversible examples are known.

a respectable binding affinity. Accordingly, in 1967 Marcos Mares-Guia and Elliott
Shaw described small-molecule trypsin inhibitors with micromolar binding affinity
that only occupy the S1 pocket. It is not difficult to see that all the molecules
23.1–23.4 in Fig. 23.4 imitate the basic amino acids arginine or lysine in the P1
position of the substrate.
A first approach to the design of serine protease inhibitors could be based on
the search for an appropriate group for the occupation of the S1 pocket, to then
be coupled with a chemically reactive group that binds to the catalytic serine.
The different groups that have been described in the literature for this purpose
are summarized in Table 23.2. Even natural products follow this principle.

[Fig. 23.5 structure diagram not reproduced. Labels: Trp60D, His57, Ser195, Tyr60A, oxyanion hole, Tyr228, Asp189, Cyclotheonamide A]

Fig. 23.5 Crystal structure of the inhibitor cyclotheonamide A with thrombin. The inhibitor forms
a covalent bond to the catalytic serine with its α-keto group to form a hemiketal structure. The now
negatively charged oxygen is stabilized by two hydrogen bonds in the oxyanion hole.

The macrocyclic pentapeptide thrombin inhibitor cyclotheonamide A from the
marine sponge Theonella sp. contains an α-keto function next to an amide bond.
As the X-ray structure shows, this ketone group forms a tetrahedral hemiketal
structure with the OH group of the catalytic serine (Fig. 23.5).
If the sequence of the peptide substrate of the serine protease is known, the
N-terminal amino acid prior to the cleavage site can be coupled with one of the
groups from Table 23.2 to produce a compound that will most likely be an inhibitor.
An example of this is the elastase inhibitor N-(methylsuccinoyl)-Ala–Ala–Pro–Val-
CF3 (23.21 in Fig. 23.14), which is derived from the substrate sequence Pro–Val. In
favorable cases, the P1 equivalent alone is sufficient, for example, in the trypsin and
thrombin inhibitors 23.5 and 23.6, respectively (Fig. 23.4). However, the usually
high chemical reactivity of the functional groups in covalently binding serine
protease inhibitors, which is necessary to interact with the catalytically active
serine, can be problematic. Because of their reactivity, such groups can also
undergo undesirable reactions with serine residues from other enzymes and there-
fore cause side effects. The design of highly potent and selective inhibitors requires
at least the occupancy of the S2, S3, and S4 pockets. An additional structural
characteristic that all serine proteases share should also be mentioned:
Their substrates are bound to the peptide backbone via two antiparallel-oriented

[Fig. 23.6 structure diagram not reproduced. Labels: His57, Ser195, Gly216, Asp189, P1, P2, P3, oxyanion hole]

Fig. 23.6 General binding mode of a peptide chain that is to be cleaved (gray carbon atoms) in
the catalytic site of a serine protease. The amide bond to be cleaved is shown in yellow.
The substrate’s P1 (light-blue) and P2 groups (green) are shown with a surface; they bind in the
S1 and S2 pockets of the protein. Two antiparallel-oriented hydrogen bonds (green) are formed to
the main chain. The H-bonds to the oxyanion hole are in purple, and the direction of the
nucleophilic attack of the Ser195 oxygen on the carbonyl carbon is indicated in blue.

hydrogen bonds. This orientation of the two hydrogen-bonding partners leads to
a pleated-sheet-like geometry. In most inhibitors, an attempt is made to imitate this
hydrogen-bonding pattern (Fig. 23.6).

23.4 Seeking Small-Molecule Thrombin Inhibitors

The serine protease thrombin plays a central role in the control of blood coagula-
tion. Thrombin is at the end of a complex, highly regulated cascade of serine
proteases. After an injury to the arterial vascular system, membrane-bound
tissue factor, which is found outside the vessel, comes into contact
with the precursor of the serine protease factor VII in the blood. The precursor is
activated to factor VIIa, and induces the coagulation cascade. Different factors are
released along the cascade, which are activated by proteases from the previous
step from their zymogen form. Finally, the cascade leads to the release of
“von Willebrand factor,” which binds to thrombocytes, and in doing so initiates
the formation of a blood clot. In addition to extrinsic activation, there is also an
intrinsic coagulation pathway. It is initiated by reduced blood flow or pathologically
altered vasculature. In this case, the coagulation cascade is started to form a platelet
aggregate, which is then stabilized by a fibrin network. Factor X is found in one of

Table 23.3 Relative binding affinity of tripeptide aldehydes to thrombin. Arg–H denotes the
aldehyde obtained by reducing the carboxylic acid of arginine. The larger the value of
the relative inhibition, the more strongly the inhibitor binds to thrombin.
Peptide               Relative inhibition
Gly–Val–Arg–H         1
Gly–Pro–Arg–H         9
Phe–Pro–Arg–H         57
D–Ala–Pro–Arg–H       469
D–Val–Pro–Arg–H       1273
D–Phe–Pro–Arg–H       7370

the last steps in which the two pathways merge. All of the different steps involve
proteases that represent conceivable target structures for a drug therapy. So far,
development efforts have focused in particular on the enzymes thrombin, factor Xa,
and factor VIIa. This has already led to development candidates and marketed
products for the first two.
Thrombin transforms the inactive fibrinogen into reactive fibrin. It forms
a polymer together with aggregated platelets in which different blood cells are
trapped. A thrombus is formed, which is further cross-linked and stabilized
by transglutaminase factor XIII (Sect. 23.8). This is an essential protective
mechanism of the body to ensure wound closure. In particular diseases or
situations, for example, after surgery, after a heart attack, or to prevent stroke
in patients with atrial fibrillation, it is necessary to reduce the coagulation
capacity of the blood. For this reason, there is great interest in the development
of selective, and above all, orally available coagulation cascade inhibitors.
Thrombin cleaves fibrinogen between the amino acids arginine and glycine.
This sequence served as a starting point for the development of the first
synthetic thrombin inhibitors that therefore possessed either an Arg or an
Arg-analogous building block.
In this section, three different approaches for the development of thrombin
inhibitors shall be presented: substrate analogues, benzamidine, and structurally
significantly modified analogues.
One approach for the design of thrombin inhibitors is provided by the P3–P3′
substrate sequence Gly–Val–Arg–Gly–Pro–Arg of fibrinogen. In the early 1970s,
the Japanese group of Hamao Umezawa established that peptide aldehydes with
C-terminal arginine residues that are isolated from bacteria are potent inhibitors of
some trypsin-like serine proteases. The tripeptide aldehydes that were investigated
by Sándor Bajusz were derived from the amino acids P3–P1 or P3′–P1′, that is, the
three amino acids “before” and “after” the cleavage site. The relative binding
affinities of a few peptide aldehydes are summarized in Table 23.3. Interestingly,
the direct comparison of Gly–Val–Arg-H and Gly–Pro–Arg-H shows that a proline
in the P2 position inhibits thrombin about ninefold more strongly. The introduction
of phenylalanine instead of glycine in the P3 position leads to an additional
significant increase in the binding. Then, D-amino acids were investigated in
position P3. Surprisingly, these led to a dramatic improvement in binding affinity.

Fig. 23.7 Comparison of the binding mode of the irreversibly binding thrombin inhibitors
D-Phe–Pro–Arg-CH2Cl (dark-red carbon atoms) with that of the fibrinopeptide derivative (gray
carbon atoms). Both inhibitors bind with an arginine side chain in the S1 pocket. The S2
pocket is occupied by a valine side chain of the fibrinopeptide. Its additional peptide chain is
folded back so that the Leu and Phe side chains in positions P8 and P9 are oriented into the
lipophilic S3 binding pocket. In the case of D-Phe–Pro–Arg-CH2Cl the phenyl ring of the
D-Phe is found in this pocket too.
[Structure diagram not reproduced. Labels: P8, P9, Trp60D, S1, S2, S3, Glu192, Glu215, Asp189]

This result was not expected if one considers that the substrate sequence from P5
to P1, Gly–Gly–Gly–Val–Arg, contains only achiral glycine residues without
lipophilic side chains in positions P5 to P3, which can hardly form any interactions
that would correspond to those of the D-Phe side chain.
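
The stepwise gains discussed in this paragraph can be read directly from Table 23.3 by dividing the relative inhibition values of two peptides that differ at a single position. The short Python sketch below does this arithmetic; the dictionary keys use plain hyphens, and the function name and the choice of comparisons are illustrative assumptions.

```python
# Relative inhibition values as listed in Table 23.3.
RELATIVE_INHIBITION = {
    "Gly-Val-Arg-H": 1,
    "Gly-Pro-Arg-H": 9,
    "Phe-Pro-Arg-H": 57,
    "D-Ala-Pro-Arg-H": 469,
    "D-Val-Pro-Arg-H": 1273,
    "D-Phe-Pro-Arg-H": 7370,
}


def fold_change(reference, modified):
    """Factor by which `modified` inhibits more strongly than `reference`."""
    return RELATIVE_INHIBITION[modified] / RELATIVE_INHIBITION[reference]


# Pairs of peptides that differ at exactly one position.
comparisons = [
    ("Gly-Val-Arg-H", "Gly-Pro-Arg-H", "Val -> Pro at P2"),
    ("Gly-Pro-Arg-H", "Phe-Pro-Arg-H", "Gly -> L-Phe at P3"),
    ("Phe-Pro-Arg-H", "D-Phe-Pro-Arg-H", "L-Phe -> D-Phe at P3"),
]

for reference, modified, change in comparisons:
    print(f"{change}: {fold_change(reference, modified):.1f}-fold")
```

Read this way, the table shows that inverting the configuration of phenylalanine at P3 contributes a larger factor than either of the two preceding substitutions.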
When the above-described work was carried out, the spatial structure of
thrombin had not yet been determined. Wolfram Bode and Milton Stubbs
managed to elucidate the structure of a thrombin complex with a chemically
activated fibrinopeptide, Gly–Asp–Phe–Leu–Ala–Glu–Gly–Gly–Gly–Val–Arg-CH2Cl.
This peptide corresponds to the N-terminal portion from P11 to P1 that
thrombin cleaves from fibrinogen. The comparison of this structure with that of
D-Phe–Pro–Arg-chloromethylketone (Fig. 23.7) provided an explanation for the
structure–activity relationship found by Sándor Bajusz. The S3 pocket is filled by
both ligands; in the case of the fibrinopeptide, it is achieved by the side chains of
leucine and phenylalanine in the positions P8 and P9. The peptide forms a β-turn
that enables the amino acids in this sequence to be positioned in the S3 pocket. The
same pocket is accessed by the tripeptide through the side chain of the D-amino acid
at position P3.
The compound D-Phe–Pro–Arg-H, synthesized by Bajusz, is a high-affinity
thrombin inhibitor (Ki = 75 nM). However, the compound proved to be chemically
unstable. This problem could be addressed by N-methylation of the free NH2
group. N-Methyl-D-Phe–Pro–Arg-H 23.7 (Gyki 14766/Efegatran, Fig. 23.8) is
chemically stable.
Jörg Stürzebecher and Fritz Marquardt took a different route. They pursued the
goal of managing inhibition without a covalent attachment. Their approach was
based on the finding that aside from trypsin (Ki = 18 μM), benzamidine 23.1
(Fig. 23.4, Sect. 23.3) also inhibits thrombin (Ki = 220 μM). The combination

[Fig. 23.8 structure diagrams not reproduced. Labels: 23.7 Gyki 14766, Efegatran; Ki = 1,8 mM; 23.8; 23.9]

Fig. 23.8 The inhibitor 23.7 (Gyki 14766, efegatran) contains an aldehyde group that binds
reversibly to Ser195. Compounds 23.8 and 23.9 are simple derivatives of the benzamidine that
non-covalently inhibit the enzyme.

of the benzamidine group with a reactive group from Table 23.2 gave potent
thrombin inhibitors. The first low-molecular-weight thrombin inhibitor that
was clinically tested in the 1970s was p-amidinophenylpyruvic acid 23.5
(Fig. 23.4, Sect. 23.3). The compound proved to be efficacious, but its selectivity
was unsatisfactory. The simple benzamidine derivatives 23.8 and 23.9 (Fig. 23.8)
are further typical representatives with micromolar affinity for thrombin, but
without selectivity compared to trypsin.
The coupling of the benzamidine groups with a peptide structure brought
significant improvement. Nα-(β-naphthylsulfonylglycyl)-D,L-p-amidinophenylalanylpiperidide,
23.10 (NAPAP, Fig. 23.9), was the result of a more than 10-year-long
systematic search for potent and selective thrombin inhibitors. NAPAP was
the most potent representative of the class of low-molecular-weight thrombin
inhibitors (Ki = 6 nM) for a long time, but it has only modest selectivity over trypsin.
In 1989, Wolfram Bode elucidated the crystal structure of thrombin with a bound
inhibitor at the Max Planck Institute for Biochemistry in Martinsried, Germany.
Initially the structure determination was accomplished with the irreversible inhibitor
D-Phe–Pro–Arg-CH2Cl and with NAPAP shortly thereafter. The 3D-structure of the
thrombin–NAPAP complex is shown in Fig. 23.10. The racemic form was used for
the co-crystallization. The result that the p-amidinophenylalanine binds to thrombin
as the D-amino acid was rather surprising. The substrate is composed of L-amino acids
only, therefore it was expected that p-amidinophenylalanine would also bind in the L
configuration.
The groups of the ligand that form polar interactions with the protein can be
directly deduced from the crystal structure. For NAPAP these are the glycine unit in
the center of the molecule (double hydrogen bond to the peptide backbone) and the

[Fig. 23.9 structure diagrams not reproduced. Values given in the figure:]
23.10 NAPAP (Thrombin : Trypsin 1:100)
    rac-NAPAP    Ki = 0.006 μM (thrombin), 0.69 μM (trypsin)
    L-NAPAP      Ki = 1.4 μM (thrombin), 25.5 μM (trypsin)
    D-NAPAP      Ki = 0.0021 μM (thrombin), 0.21 μM (trypsin)
23.11 CRC220 (Behringwerke)    Ki = 6 nM; Thrombin : Trypsin 1:200
23.12 (racemic)                IC50 = 15 nM; Thrombin : Trypsin 1:600

Fig. 23.9 The thrombin inhibitors NAPAP 23.10, CRC220 23.11 (developed at the
former Behringwerke), and 23.12, which was derived from 23.10. The two latter compounds have
distinctly better affinity to thrombin and improved selectivity relative to trypsin. The IC50 values
for 23.10 and 23.12 are given for the racemates. Inhibitor 23.11 was measured as an enantiopure
compound.

amidinium group in the S1 pocket. Omitting the positively charged
amidine group will result in a loss of binding affinity because the salt bridge to
Asp189 can no longer be formed. More recent work has shown however that chloro-
substituted aromatic rings can also bind in the S1 pocket and form a hydrophobic
interaction to Tyr228. Today, an arsenal of building blocks is available that can be
used as arginine side chain mimics to fill the S1 pocket of thrombin (Fig. 23.11).
With its naphthyl and piperidyl side chains, NAPAP largely fills the lipophilic S3
pocket and the spatially rather limited S2 pocket (Fig. 23.10). However, it seems as
if even larger substituents could fit in the S3 pocket. A weakness of NAPAP was its
inadequate selectivity compared to the digestive enzyme trypsin. Luckily, the
structures of NAPAP in complex with thrombin and also with trypsin are known
(Fig. 23.12). A comparison of the 3D structures shows that there is a significant
difference in the binding mode between the two enzymes in the S3 pocket that leads

[Fig. 23.10 structure diagram not reproduced. Labels: lipophilic binding pocket, Trp60D, Leu99, Ser195, Trp215, Gly216, Gly219, Asp189, S1, S2, S3]

Fig. 23.10 Structure of the thrombin–NAPAP complex. The most important interactions are
outlined on the left side. The positively charged benzamidine group occupies the S1 pocket and
forms a salt bridge to the negatively charged side chain of Asp189. Two hydrogen bonds are
formed to the amino acid Gly216. The piperidyl and naphthyl groups together occupy the two large
lipophilic pockets S2 and S3.

[Fig. 23.11 structure diagrams not reproduced. Substituent legends given in the figure: X = CH, N; Y = NH2, OH; R = H, Me; X = NH, S; R = H, Cl, O-alkyl; X = O, S; 5-membered-ring heterocycle]

Fig. 23.11 Numerous building blocks have been developed that bind as mimetics of the
arginine side chain in the S1 pocket of thrombin.

Fig. 23.12 Comparison of the 3D structures of trypsin (left) and thrombin (right), each in
complex with NAPAP. The active site in thrombin is further narrowed by an additional loop
from above. The depth of the pocket is, once again, color-coded (see Fig. 23.2).

to a 180°-flipped orientation of the naphthyl group about the bond to sulfur. In
thrombin, the S3 pocket is more pronounced and is surrounded by multiple lipo-
philic amino acid side chains. In trypsin the top end of this pocket is open, and is
spatially hardly restricted at all. Obviously its structuring is not necessary in the
largely unspecific digestive enzyme. Therefore, the selectivity can be increased by
occupying the S3 pocket of thrombin as optimally as possible. If the thrombin–
NAPAP complex is examined in more detail, it is apparent that an additional
methoxy substituent on the naphthyl ring should be suitable to enhance selectivity.
In fact, inhibitor 23.12 binds 600-fold more strongly to thrombin than to trypsin.
Compound CRC220 (23.11, Fig. 23.9), which fills the hydrophobic S3 pocket
much better than NAPAP, was developed at the former Behringwerke in Marburg,
Germany. Because of this improved filling, CRC220 inhibits thrombin almost
200-fold more effectively than trypsin.
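
The selectivity figures quoted in this section follow from the ratio of the two inhibition constants. As a minimal sketch, the Python example below computes this ratio for the three NAPAP forms from the Ki values given in Fig. 23.9. The definition of selectivity as Ki(trypsin) divided by Ki(thrombin), together with the names used in the code, is an assumption made for illustration.

```python
# Ki values in micromolar, as given in Fig. 23.9 for compound 23.10.
KI_MICROMOLAR = {
    "rac-NAPAP": {"thrombin": 0.006, "trypsin": 0.69},
    "L-NAPAP": {"thrombin": 1.4, "trypsin": 25.5},
    "D-NAPAP": {"thrombin": 0.0021, "trypsin": 0.21},
}


def selectivity_ratio(ki):
    """Ki(trypsin) / Ki(thrombin); values above 1 favor thrombin."""
    return ki["trypsin"] / ki["thrombin"]


for name, ki in KI_MICROMOLAR.items():
    print(f"{name}: thrombin preferred about {selectivity_ratio(ki):.0f}-fold")
```

With these numbers, the racemate and the D-enantiomer prefer thrombin by a factor on the order of 100, in line with the 1:100 ratio shown in the figure, whereas the weakly binding L-enantiomer discriminates far less.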
Another approach to searching for thrombin inhibitors was taken by the
researchers at Hoffmann-La Roche. Initially they concentrated on optimally filling
the S1 pocket. Benzamidine was known to be a weak thrombin inhibitor that
occupies the S1 pocket. It has, however, the disadvantage that it binds more strongly
to trypsin (Fig. 23.4). Accordingly, the researchers in Basel initially sought a small
molecule that binds more strongly to thrombin than trypsin. More than 200 small
molecules were tested in this narrowly focused search. Structures were chosen only
if their functional groups were able to interact with the negatively charged side
chain of Asp189. Guanidines, amidines, and amines were investigated.
N-Amidinopiperidine (23.13, Fig. 23.13) was identified as an interesting lead
structure. In contrast to benzamidine, amidinopiperidine binds more strongly to
thrombin (Ki = 150 μM) than to trypsin (Ki = 300 μM). A systematic derivatization
led to 23.14, a moderately active thrombin inhibitor (Ki = 0.48 μM). Based on the
structural model with the protease, it appeared obvious that the replacement of the
glycine unit with a D-amino acid, for example, D-Phe, should fill a lipophilic pocket

[Fig. 23.13 structure diagrams not reproduced. Compounds and values given in the figure:]
23.13    Ki = 150 μM
23.14    R = H, Ki = 0.48 μM
23.15    R = CH2Ph, Ki = 0.047 μM
23.16    R = CH2-(m-NO2-Ph), Ki = 0.024 μM
23.17    Napsagatran (Roche), Ki = 0.27 nM
23.18    Ximelagatran (Exanta®), AstraZeneca, prodrug of melagatran
23.19    Dabigatran (Rendix®), Boehringer Ingelheim
23.20

Fig. 23.13 One approach to the structure-based design of thrombin inhibitors began with 23.13 in
the S1 pocket. Compound 23.14 was derived from this lead structure. Its docking into the active
site of thrombin generated the idea for the synthesis of 23.15. The systematic variation of the side
chain R gave compounds with better binding affinity such as 23.16 and 23.17. Compound 23.17 was
tested in depth in the clinic under the name napsagatran. The compound melagatran from
AstraZeneca was introduced as the double prodrug ximelagatran 23.18, the first orally available
thrombin inhibitor on the market. It is derived from the tripeptide sequence D-Phe–Pro–Arg.
Another orally available inhibitor, dabigatran 23.19, was launched to market by Boehringer
Ingelheim. The tricyclic inhibitor 23.20, which was developed at the ETH in Zurich, forgoes
peptide character entirely.

and lead to a distinct increase in the affinity. The compound was quickly prepared
and tested. In fact, 23.15 bound tenfold more strongly to thrombin. Other D-amino
acids were then investigated and additional affinity could be achieved. High
selectivity against trypsin was also encouraging; 23.16 binds 840-fold more
strongly to thrombin than to trypsin. The surprise was great when the 3D structure
of 23.14 in complex with thrombin was determined: The compound binds differ-
ently than predicted in the binding pocket! In contrast to the original assumption,
the naphthylsulfonyl group exchanged positions with the benzyl side chain.
The incorporation of a non-proteinogenic amino acid proved to be unfavorable
from a synthetic point of view. Therefore, other central building blocks were sought
that were synthetically more easily accessible. This work finally led to napsagatran
23.17, a highly potent and exceedingly selective substance. Because it is only
intravenously applicable, however, it never found its way to a marketed product,
particularly because argatroban, a marketed product for intravenous use, was
discovered much earlier and was already available.
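
The optimization from the lead structure 23.13 to napsagatran 23.17 can also be followed on a free-energy scale. The Python sketch below converts the thrombin Ki values quoted above into approximate standard binding free energies using ΔG° = RT ln(Ki / 1 M). The temperature of 298.15 K, the 1 M standard state, and the function name are assumptions made for this illustration; the Ki values are those given in the text and in Fig. 23.13.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def binding_free_energy_kj(ki_molar, temperature_k=298.15):
    """Approximate standard binding free energy (kJ/mol) from Ki in mol/L.

    Uses dG = R * T * ln(Ki / 1 M); more negative means tighter binding.
    """
    if ki_molar <= 0:
        raise ValueError("Ki must be positive")
    return R_KJ * temperature_k * math.log(ki_molar)


# Thrombin Ki values in mol/L for the series 23.13 to 23.17.
SERIES = [
    ("23.13 amidinopiperidine", 150e-6),
    ("23.14", 0.48e-6),
    ("23.15", 0.047e-6),
    ("23.16", 0.024e-6),
    ("23.17 napsagatran", 0.27e-9),
]

for name, ki in SERIES:
    print(f"{name}: about {binding_free_energy_kj(ki):.1f} kJ/mol")
```

On this scale, every tenfold improvement in Ki corresponds to roughly 5.7 kJ/mol at room temperature, so the step from 23.14 to 23.15 is worth about one such increment.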
The search for low-molecular-weight, orally available thrombin inhibitors
intensively occupied numerous large pharmaceutical companies for many years.
It took a long time until AstraZeneca introduced Ximelagatran (23.18, Fig. 23.13)
to the market as the first orally available thrombin inhibitor. The compound is
a double prodrug of the actual active substance melagatran. Its relation to the initial
parent structures (e.g., the tripeptide sequence D-Phe–Pro–Arg) is still quite appar-
ent. The head group of the arginine residue was replaced with a benzamidine, the
five-membered ring of the proline was narrowed to a four-membered ring, and the
terminal benzyl group was shortened to a cyclohexyl ring. The N terminus was
substituted with a methylenecarboxylic acid group. It proved to be extremely difficult
to make the thrombin inhibitors adequately bioavailable and to maintain the neces-
sary plasma level over an acceptable length of time. With regard to the bioavailabil-
ity, AstraZeneca, in collaboration with the group of Bernd Clement at the University
of Kiel, pursued a double prodrug strategy: The terminal acid function was masked as
an ester, and the benzamidine group was transformed into an N-hydroxyamidine. The
release of the active substance, melagatran, in the body is made possible by ubiquitously
present esterases and a set of three reductases. AstraZeneca withdrew
Ximelagatran (Exanta®) after 2 years because some issues with liver toxicity were
observed in a small number of cases after weeks of use.
Many years of thrombin research finally also led to success at Boehringer
Ingelheim. The compound dabigatran (23.19, Fig. 23.13) was introduced to the
market in the spring of 2008 for the prevention of stroke in patients with atrial
fibrillation. It also has a benzamidine anchor, and it has a pyridine group for the
hydrophobic S3 pocket. A benzimidazole building block with an attached amide
bond was chosen as a linker between these groups. As with ximelagatran, it uses
a carboxylic acid on the N terminus. It shows distinctly less peptide character than
the lead structures. A double prodrug strategy was also used for this substance to
make it adequately bioavailable. In addition to the esterification of the acid group,
the amidine group was masked as a carbamoyl moiety. The prodrug carries the
name dabigatran etexilate (Pradaxa® in the USA and Europe and Pradax® in Canada).

The group of François Diederich at the ETH in Zurich managed to develop
a thrombin inhibitor (23.20, Fig. 23.13) that forgoes the peptidomimetic character
entirely. Precise design in the binding pocket led to an inhibitor with a central
tricyclic moiety, which was readily prepared by a 1,3-dipolar cycloaddition reaction.
With a benzamidine anchor for the S1 pocket and a piperonyl moiety for the S3
pocket, this very rigid derivative advanced into the realm of nanomolar inhibitors.

23.5 Design of Orally Available Low-Molecular-Weight Elastase Inhibitors

Human leukocyte elastase is a serine protease that is released in the lung to
destroy dead tissue and invading bacteria. The destructive potential of this
enzyme is normally controlled by a series of endogenous inhibitors, such as α1
protease inhibitor or leukocyte protease inhibitor. If the equilibrium between the
protease and inhibitor is shifted, for instance, because of a genetically caused
underexpression of an inhibitor or by toxic substances that are taken in with the
air, elastase also attacks healthy lung tissue. Cigarette smoke contains compounds
that oxidize an essential methionine side chain on the endogenous α1 protease
inhibitor and cause its deactivation. The chronic destruction of cells in the alveoli
leads to a life-threatening disease: emphysema.
A possible approach to the pharmaceutical treatment of this disease is therefore
the use of elastase inhibitors. In contrast to thrombin, elastase does not have a deep,
pronounced S1 pocket with an acidic amino acid, through which a polar contact to
a ligand can be made. It accepts only substrates with small hydrophobic amino acids
such as valine (Fig. 23.3). If a large binding contribution cannot be expected by
occupying the S1 pocket, as is the case with thrombin, the catalytic serine itself
can be involved in the protein–ligand interaction by forming a reversible covalent
bond with the inhibitor. Such a concept was pursued at the former ICI (today part of
AstraZeneca) by starting with a trifluoromethylketone R–COCF3 as a reversible,
covalent-binding serine protease inhibitor. By starting from the substrate sequence,
potent elastase inhibitors were found such as 23.21 and 23.22 (Fig. 23.14).
ICI 200880 (23.22) proved to be an efficacious elastase inhibitor in clinical
trials, but it lacked oral bioavailability and had a short biological half-life. The
spatial structure of the related inhibitor Ac-Ala–Pro–Val-CF3 had been determined
in complex with elastase. The most important interactions between elastase and the
inhibitor are shown in Fig. 23.15. The inhibitor binds to elastase in a β-pleated-sheet
conformation in which two H-bonds to Val216 and one to Ser214 are formed.
The valine side chain fills the S1 pocket, and the carbonyl group binds covalently as
a hemiketal to the side chain of Ser195. The research concentrated on non-peptidic
structures with functional groups that are able to form the same interactions as the
peptidic inhibitors.
By starting with the 3D structure of the protein–ligand complex, pyridones were
chosen as the most promising peptidomimetic replacement. The postulated binding
mode of the pyridone compared to the binding mode of the peptidic inhibitors is
shown in Fig. 23.15. Compounds of this chemotype were synthesized at
Zeneca (now AstraZeneca) and in fact proved to be very effective elastase
inhibitors. Compound 23.23 (Fig. 23.16) binds to the protein with Ki = 5.6 nM.
This compound, however, has multiple unfavorable properties. It is not orally
available and inhibits chymotrypsin (Ki = 60 nM) in addition to elastase. Poor oral
bioavailability was attributed to excessive lipophilicity (log P > 4), which led to
low water solubility.

Fig. 23.14 Elastase inhibitors 23.21 and 23.22 (ICI 200880) are substrate analogues. Compound
23.22 is a highly active compound, but it is not orally available.

Fig. 23.15 Comparison of the binding mode of the elastase inhibitor Ac-Ala–Pro–Val-CF3 with the
postulated binding mode of the pyridone moiety (e.g., 23.23, Fig. 23.16). Both compounds should
be able to form a double H-bond to Val216.

The pyrimidone class in which a carbon atom of the heterocycle was exchanged
for a nitrogen atom seemed to be synthetically simpler and therefore more broadly
variable. Compound 23.24 is less lipophilic (log P = 2.1) than 23.23, ten times
more water soluble, and orally available. Its binding to elastase proved to be
practically unchanged (Ki = 6.6 nM), whereas the chymotrypsin inhibition was
much less pronounced (Ki = 1,000 nM). Numerous representatives of the new
substance class were synthesized, and their inhibitory effects and bioavailability
were tested. In doing so it was shown that the strength of the inhibitory effect and
the in vivo activity did not run parallel to one another. For example, 23.25 is
a highly potent elastase inhibitor in the enzyme test but is not orally available.
With an oral bioavailability of 60–90%, compound 23.26 (Ki = 100 nM) proved to
be optimal in the animal model. The crystal structure with an analogous derivative
23.27, which carries only an additional sulfonamide group, confirmed the expected
binding mode (Fig. 23.17).
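The comparison of 23.23 and 23.24 can be made quantitative with a short calculation. The following Python sketch converts the Ki values quoted above into a selectivity ratio (chymotrypsin versus elastase) and into an approximate standard binding free energy via ΔG° = RT ln Ki. The temperature of 298.15 K and the 1 M standard state are assumptions made for illustration; the function names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ/(mol*K)
T_ASSUMED = 298.15      # assumed temperature in K (not stated in the text)

def binding_free_energy(ki_molar, temperature=T_ASSUMED):
    """Approximate standard binding free energy in kJ/mol from Ki in mol/L."""
    return R_KJ * temperature * math.log(ki_molar)

def selectivity(ki_off_target, ki_target):
    """Ratio of off-target Ki to target Ki; larger means more selective."""
    return ki_off_target / ki_target

# Ki values as quoted in the text (converted from nM to mol/L).
compounds = {
    "23.23": {"elastase": 5.6e-9, "chymotrypsin": 60e-9},
    "23.24": {"elastase": 6.6e-9, "chymotrypsin": 1000e-9},
}

for name, ki in compounds.items():
    dg = binding_free_energy(ki["elastase"])
    sel = selectivity(ki["chymotrypsin"], ki["elastase"])
    print(f"{name}: dG(elastase) = {dg:.1f} kJ/mol, selectivity = {sel:.0f}-fold")
```

Under these assumptions the elastase affinity of the two compounds is nearly identical, whereas the selectivity over chymotrypsin rises from roughly 10-fold to roughly 150-fold.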
The Japanese company ONO Pharmaceuticals Co. developed compound 23.28 (ONO-6818),
which was derived from 23.26 and carries a 1,3,4-oxadiazole ring in place of the
trifluoromethyl group on the ketone and an unsubstituted phenyl ring on the
pyrimidone. However, the development of ONO-6818 was discontinued in clinical
phase II because of abnormally elevated liver enzyme levels. Nevertheless, ONO
had success with ONO-5046, 23.29, which was developed under the name sivelestat
(Elaspol®; Fig. 23.16). This inhibitor reacts with elastase specifically and reversibly
acylates the catalytic serine.

23.6 Serine Protease Inhibitors, Thrombin Was Just the Starting Point

Factors Xa and VIIa occur along the coagulation cascade prior to thrombin and are
investigated as targets for antithrombotics. They both have an aspartic acid at the
bottom of their deep S1 pockets, similarly to thrombin. Moreover, a narrow
and deep S3 pocket that is flanked by aromatic amino acids (Tyr99, Trp215, and
Phe174) is specific to factor Xa. Therefore, this pocket is ideally suited for
aromatic groups on inhibitors. As already mentioned, in the mid-1990s a dogma
prevailed that the S1 pocket of trypsin-like serine proteases could only accept
groups with basic character. The binding of chloro-substituted aromatic portions,
however, could be demonstrated for thrombin at Merck & Co. in the USA. These
groups made a breakthrough for factor Xa inhibitors. Highly potent inhibitors could
be developed with chlorophenyl, chloronaphthyl, or chlorothiophene groups
protruding into the S1 pocket. So much affinity was gained by the accommodation
of additional groups in the deep, aromatic S3 pocket that these compounds bind to
the protease with single-digit nanomolar affinity without the benzamidine anchor.
Such a derivative was introduced by AstraZeneca (23.30, Fig. 23.19). In addition to
the development of compounds with chloro-substituted aromatic rings for the S1
pocket, other inhibitors with benzamidine groups were also synthesized as factor
Xa inhibitors. However, it is much more difficult to achieve adequate selectivity
compared to other trypsin-like serine proteases with these derivatives. Furthermore,
they showed similar problems with regard to bioavailability as the thrombin
inhibitors. In September 2008, Bayer HealthCare introduced rivaroxaban
(Xarelto®; 23.31, Figs. 23.18 and 23.19) to the market, a new factor Xa inhibitor that places
a chlorothiophene group in the S1 pocket.

Fig. 23.16 Design of orally available elastase inhibitors at Zeneca. The original idea to replace
the Ala–Pro unit with a pyridone afforded 23.23. Later, mainly pyrimidinones, which carry an
additional nitrogen atom in the heterocycle, were investigated. Very potent compounds (e.g.,
23.25) are found in this class. Compound 23.26 has the best in vivo properties. The p-fluorophenyl
group (in 23.26) or the p-aminophenyl group (in 23.27) increases the lipophilic contact with the
enzyme. The compound ONO-6818 23.28 was developed in Japan all the way to clinical trials, where
it was discontinued due to abnormally elevated liver values in the treated patients. Another compound,
23.29, was clinically tested under the name sivelestat (ONO-5046). These compounds specifically
transfer an acyl group to the catalytic serine and reversibly block the enzyme. Compounds shown:
23.23 (R = phenyl, Ki = 5.6 nM), 23.24 (R = phenyl, Ki = 6.6 nM), 23.25 (R = p-F-phenyl,
Ki = 1.6 nM), 23.26 (R = p-F-phenyl, Ki = 100 nM), 23.27 (R = p-NH2-phenyl, Ki = 15 nM),
23.28 (ONO-6818), and 23.29 (sivelestat, ONO-5046).

Fig. 23.17 The crystal structure of 23.27 (Fig. 23.16) in complex with elastase. The
inhibitor forms two H-bonds to Val216, and one H-bond to Ser214. Furthermore, the
oxyanion hole is occupied by an oxygen atom.

Other companies are working on inhibitors with comparable chloroaromatic
groups to fill the S1 pocket. The subnanomolar-binding inhibitor apixaban 23.32
(Fig. 23.19) from Bristol-Myers Squibb has recently been approved for market. It forgoes the
halogen group for the interaction in the S1 pocket completely. As a crystal structure
shows, the p-methoxy group displaces the water molecules that are almost always
in the S1 pocket. The previously mentioned compounds that lack a benzamidine
group show better bioavailability and good selectivity for factor Xa. Attempts were
also undertaken to develop dual inhibitors for thrombin and factor Xa. Inhibiting
both proteins together achieves synergistic rather than merely additive antithrombotic
effects, which will hopefully expand the therapeutic scope.
Factor VIIa is at the beginning of the extrinsic pathway of the coagulation cascade.
This enzyme also belongs to the family of trypsin-like serine proteases, and specific
inhibitors have been sought for this enzyme for years. In this case, the activation of
the protease is interesting. In cases of injury, blood comes into contact with tissue.
When this happens, factor VIIa and the membrane-bound tissue factor can form
a complex that causes a change in the conformation of the protease’s catalytic
domain. A peptide segment next to the catalytic center goes from being in an
unfolded conformation into a helical structure. This leads to a change in the
geometry of the catalytic site. Only in the complexed state does the protease have
a structure that allows the coagulation cascade to be initiated. Although numerous
nanomolar inhibitors are available by now, none of them has been able to forgo the
basic group on the P1 aromatic ring.

Fig. 23.18 Crystal structure of rivaroxaban 23.31 (Fig. 23.19) in factor Xa. The
inhibitor’s chlorothiophene group binds in the deep S1 pocket, at the end of which
Tyr228 and Asp189 are found. The chlorine atom forms interactions with the
aromatic rings. The phenyl ring and the terminal lactam ring of the inhibitor are found
in the S3 pocket, which is enclosed by the three aromatic groups of Tyr99,
Phe174, and Trp215.

In addition to the serine proteases of the coagulation cascade, other proteases
of this family have been chosen for drug development. The drug design for these
target enzymes has benefitted strongly from the experience gained with the
thrombin inhibitors. The concepts learned there transfer well to the special
conditions of these proteins. Tryptase, urokinase, and matriptase belong to this
family. Tryptase inhibitors are being investigated for the treatment of asthma, and
the two others are target structures for possible cancer therapeutics. Tryptase is
a tetramer with four trypsin-like catalytic sites. These sites are separated from one
another by several angstroms. To develop selective inhibitors, compounds were
designed that carry two benzamidine-like anchor groups and a long enough bridge
to link them together. In this way, two of the four sites in the tetrameric tryptase
molecule are simultaneously blocked. The disadvantage of this design concept is
that the developed inhibitors are very large. They are well over the molecular
weight limit of 600 Da, which should not be exceeded for good bioavailability.
The protease furin also belongs to the family of serine proteases; however, it
adopts the folding of the subtilisin family (▶ Sect. 14.7). It is involved in the
maturation of proproteins. In this way, the envelope proteins of viruses are cleaved to
transform them into their active form. Its involvement in the “arming” of viruses was
even reported on in the tabloid press: In the BILD-Zeitung in Germany on August
28, 2003, furin was referred to as “the most brutal protein in the world” that
“makes epidemics a deadly danger for humans and acts like a detonator on a bomb.”
Furin and other closely related subtilases cleave particular basic tetrapeptide
sequences C-terminally: Arg–X–(Arg/Lys)–Arg-. Many glycoproteins of lipid-enveloped
viruses are cleaved at this recognition sequence and consequently activated.
Examples are the highly pathogenic avian influenza viruses, which contain
such cleavage sequences in hemagglutinin, one of the surface glycoproteins.

Fig. 23.19 Three potent factor Xa inhibitors. The chloroaromatic rings of the first two examples
bind in the S1 pocket of the enzyme. Compound 23.30 was developed by AstraZeneca.
Rivaroxaban 23.31 (Xarelto®) was introduced to the market in 2008 by Bayer as the first orally available
factor Xa antithrombotic. Apixaban 23.32 from BMS binds at the subnanomolar level with its
methoxy-substituted aromatic ring in the S1 pocket of factor Xa.

Whether the viruses can be activated depends on the availability of the ubiquitously
occurring furin, and this is a prerequisite for the high pathogenic potential of the avian
influenza viruses. Other genetic combinations or prerequisites must be fulfilled to
convert these viruses to dangerous pathogens for animals and humans. Inhibitors of
furin could then contribute to the suppression of “arming” of these viruses. However,
the translation of such highly charged substrates into inhibitors that meet the common
rules for good bioavailability is a tremendous challenge.
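The recognition sequence Arg–X–(Arg/Lys)–Arg described above lends itself to a simple pattern search. The following Python sketch scans a one-letter amino acid sequence for this motif and reports where a C-terminal cut would fall. The function names and the toy sequence are hypothetical and chosen only for illustration; real substrate recognition depends on more than this four-residue pattern.

```python
import re

# Arg-X-(Arg/Lys)-Arg in one-letter code; the lookahead also finds overlapping motifs.
FURIN_MOTIF = re.compile(r"(?=(R.[KR]R))")

def furin_cleavage_sites(sequence):
    """Return 0-based indices of the residue that follows each motif.

    Cleavage is C-terminal to the motif, i.e., between index - 1 and index.
    """
    return [match.start() + 4 for match in FURIN_MOTIF.finditer(sequence)]

def cleave(sequence):
    """Split the sequence at every predicted cleavage site."""
    fragments, start = [], 0
    for cut in furin_cleavage_sites(sequence):
        fragments.append(sequence[start:cut])
        start = cut
    fragments.append(sequence[start:])
    return [fragment for fragment in fragments if fragment]

toy_sequence = "MKTRSKRGLF"  # hypothetical toy sequence, not a real protein
print(furin_cleavage_sites(toy_sequence))
print(cleave(toy_sequence))
```

The lookahead in the pattern is a design choice: runs of several basic residues can contain more than one overlapping motif, and a plain search would skip the later ones.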
In the early 1990s an interesting observation was made that the incretin
hormones GIP and GLP-1, which stimulate the pancreas to release insulin after
eating, are substrates for dipeptidylaminopeptidase IV (DPP IV). They are quickly
degraded by this serine aminopeptidase. Because incretins were already interesting
candidates for a diabetes therapy, the idea immediately arose that the inhibition
of DPP IV could be used as a principle for treating type-II diabetes (non-insulin
dependent). The membrane-bound protease cleaves dipeptides from its substrate
when a prolyl or alanyl group is in the second position from the N terminus.
Sitagliptin (Januvia®) 23.33 (Fig. 23.20) was approved in 2006 for the treatment of
type-II diabetes. It blocks the protease without forming a covalent bond.
Recently two additional compounds, vildagliptin (Galvus®) 23.34 and saxagliptin
(Onglyza®) 23.35, became available for clinical use. Both use a proline-derived
cyanopyrrolidine that binds reversibly and covalently to the catalytic serine.

Fig. 23.20 Sitagliptin 23.33 (MK-0431, Merck), vildagliptin 23.34 (LAF 237, Novartis), and
saxagliptin 23.35 (BMS-47718) are inhibitors of the serine aminopeptidase DPP IV for the
treatment of type-II diabetes.

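The substrate rule for DPP IV stated above (a prolyl or alanyl residue in the second position from the N terminus) can be written as a small check. The following Python sketch is a minimal illustration of that rule only; the function name and the toy peptides are hypothetical, and the rule ignores every other determinant of substrate recognition.

```python
def dpp4_cleave(peptide):
    """Apply the positional rule: remove the N-terminal dipeptide if
    residue 2 is Pro (P) or Ala (A).

    Returns (dipeptide, remainder), or None if the rule does not apply.
    """
    if len(peptide) > 2 and peptide[1] in ("P", "A"):
        return peptide[:2], peptide[2:]
    return None

# Toy N-terminal fragments in one-letter code, chosen for illustration.
for fragment in ("HAEGTF", "YPSKGA", "GGEGTF"):
    print(fragment, "->", dpp4_cleave(fragment))
```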
We can only wait and see whether other active substances come to the market in
the next few years for the many serine proteases currently under investigation. The field has
increasingly benefitted from the experience that has been collected on the individual
members of this protein family, so that lead structures can be quickly discovered
for new target structures.

23.7 Serine, a Favored Nucleophile in Degrading Enzymes

Serine peptidases use the OH group of an endogenous serine as an attacking
nucleophile. The neighboring histidine residue mediates the temporary proton
transfer, and the aspartate compensates for the intermediately occurring charge on
the imidazole ring of the histidine. A special feature is, however, the temporary
covalent bond between the N-terminal part of the substrate and the enzyme. Many
other hydrolytically cleaving enzymes use an analogous principle. Esterases and
lipases also have a catalytic triad. Occasionally an aspartate is exchanged for
a glutamate in these enzymes. The neurotransmitter acetylcholine acts on many
synapses in the vegetative nervous system and is involved in the transmission of
nerve impulses. It binds to the nicotinic acetylcholine receptor, among others, and
activates this ion channel (▶ Sect. 30.4). Acetylcholine must be removed to limit
the duration of the transmission process and to reset the receptor to the starting
point. An imbalance in this nerve impulse transmission system leads to acute and
chronic movement disorders. Acetylcholinesterase is responsible for the degrada-
tion of acetylcholine. Inhibitors of this enzyme are used to treat shaking palsy
(Parkinson’s disease). More recent research has also explored their potential for the
treatment of Alzheimer’s disease.
Acetylcholinesterase has a catalytic triad made of serine, histidine, and gluta-
mate. Acetylcholine (3.46, ▶ Fig. 2.11) is cleaved by the enzyme in that its acetyl
group is transferred to the catalytic serine; hydrolysis releases acetic acid slowly
from the esterase. The drug (S)-rivastigmine is also attacked by the catalytic serine,
and its carbamoyl group is transferred. Because of the enhanced stability of the
carbamoyl–enzyme complex, the esterase is then only very slowly deacylated and
regenerated for the next transformation. This causes an inhibition of the target
enzyme for several hours. Cholinesterase inhibitors are also used as insecticides.
Active substances such as paraoxon 23.37 (Fig. 23.21), parathion (E605) 23.38,
propoxur 23.39, or malathion 23.40 contain phosphoric acid ester, thiophosphoric acid ester, or carbamate groups
that are virtually irreversibly transferred to the catalytic serine. Because of this
inhibition, acetylcholine increases to lethal concentration in insects.
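The slow regeneration of the carbamoylated esterase described above can be pictured as a first-order decay of the covalent complex. The following Python sketch assumes that all enzyme is carbamoylated at time zero, that no further inhibitor is present, and that decarbamoylation follows simple first-order kinetics. The half-life of 4 h is a hypothetical value chosen for illustration, not a measured constant.

```python
import math

def regenerated_fraction(t_hours, half_life_hours):
    """Fraction of enzyme that is free again after t_hours, assuming
    first-order breakdown of the carbamoyl-enzyme complex."""
    k = math.log(2) / half_life_hours   # first-order rate constant in 1/h
    return 1.0 - math.exp(-k * t_hours)

ASSUMED_HALF_LIFE_H = 4.0  # hypothetical half-life of the carbamoyl-enzyme complex

for t in (0, 1, 4, 8, 24):
    frac = regenerated_fraction(t, ASSUMED_HALF_LIFE_H)
    print(f"t = {t:2d} h: {frac:.2f} of the enzyme regenerated")
```

With a half-life in the range of hours, most of the enzyme remains blocked for several hours after a single exposure, which is the qualitative behavior described in the text.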
Analogously to the esterases, lipases also hydrolyze ester bonds. Serine,
histidine, and aspartate or glutamate serve as the catalytic triad. Pancreatic lipase
cleaves triglycerides during the digestion of fats. Inhibitors of this enzyme, which is
present in the intestines, are used to treat obesity. A significantly reduced
absorption of fats and their cleavage products is the result. Orlistat (Xenical®,
23.41; Fig. 23.22), a synthetic hydrogenation product of the natural product
lipstatin, carries a reactive β-lactone ring at its core in addition to a very long
aliphatic side chain. Serine in the catalytic site of the lipase attacks the carbonyl
group of the lactone ring and opens the strained ring by transforming to a stabilized
acyl–enzyme complex. Once blocked, the enzyme is no longer able to cleave
triglycerides, which translates to a reduced ability to extract calories from food.
Lipases are often used for the kinetic resolution of racemates. This is usually
achieved by simply transforming a racemic mixture of esters in which one of the
two forms reacts faster than the other. An example was described in ▶ Sect. 5.4 in
which the lipase was used not only to hydrolyze but also to form a new amide bond.
For this, the intermediate acyl–enzyme complex cannot be exposed to a water
molecule as a nucleophile, but rather a compound with a free amino group must
be available. This transformation produces a new amide bond. Bacteria use such
a transpeptidase reaction for the construction of their cell wall. This has
a completely different composition than that of humans. Therefore the enzymes
used for cell wall synthesis are bacteria-specific and particularly well suited as
a target for a drug therapy with few side effects.
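The kinetic resolution mentioned above, in which one ester enantiomer reacts faster than the other, can be sketched numerically. The following Python example assumes that both enantiomers are consumed by independent pseudo-first-order reactions starting from a 1:1 racemate; the rate constants are hypothetical and serve only to show how the enantiomeric excess of the unreacted ester grows with conversion.

```python
import math

def kinetic_resolution(k_fast, k_slow, t):
    """Return (conversion, ee of remaining substrate) at time t for a racemate
    whose enantiomers react with first-order rate constants k_fast and k_slow."""
    fast_left = 0.5 * math.exp(-k_fast * t)   # remaining fast-reacting enantiomer
    slow_left = 0.5 * math.exp(-k_slow * t)   # remaining slow-reacting enantiomer
    remaining = fast_left + slow_left
    conversion = 1.0 - remaining
    ee_substrate = (slow_left - fast_left) / remaining
    return conversion, ee_substrate

# Hypothetical rate constants (arbitrary time units); their ratio is the selectivity.
K_FAST, K_SLOW = 50.0, 1.0

for t in (0.0, 0.02, 0.05, 0.1):
    conv, ee = kinetic_resolution(K_FAST, K_SLOW, t)
    print(f"t = {t:.2f}: conversion = {conv:.2f}, ee(substrate) = {ee:.2f}")
```

In this simplified picture the remaining ester approaches enantiomeric purity once the conversion passes about 50%, provided the two rate constants differ strongly.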

Fig. 23.21 (S)-Rivastigmine 23.36 transfers a carbamoyl group to the catalytic serine in the
binding pocket of acetylcholine esterase and blocks its function because the carbamoyl–esterase
complex decomposes very slowly. The acetylcholine esterase inhibitors paraoxon 23.37,
parathion (E605) 23.38, propoxur 23.39, and malathion 23.40 are phosphoric acid, thiophosphoric
acid, or carbamic acid esters and are used as insecticides. They also react with the catalytic serine
and form a stable covalent bond.

Cross-linking of the peptidoglycan strands is accomplished in the last step of the
cell wall biosynthesis. For this, the terminal amino group of a pentaglycine chain
attacks between two D-alanine residues of another peptide unit. The bond between
D-Ala–D-Ala is cleaved, and a new peptide bond between D-Ala and glycine is
formed. The cross-linking is mediated by a glycopeptide transpeptidase. It has
a catalytic machinery that is very similar to the serine proteases. In addition to
a catalytic serine, a lysine and a glutamate are also found in the reaction center, and
an oxyanion hole is also present. Penicillins 23.42–23.44 and cephalosporins 23.45
(Fig. 23.23) inhibit these transpeptidases. They have a spatial structure that is
analogous to the D-Ala–D-Ala dipeptide, and are therefore recognized as a “false”
substrate (Fig. 23.23). The β-lactam ring is opened by the attacking catalytic serine,
and an irreversible covalent coupling to the enzyme results. The cross-linking of the
glycan strands is prevented, and the newly synthesized cell wall does not achieve
adequate stability. It cannot withstand the osmotic pressure of the cell contents, and
the bacterial cell is killed as a consequence.

Fig. 23.22 Orlistat 23.41 is a synthetic hydrogenation product of the natural product
lipstatin, which has two additional double bonds. It has a reactive β-lactone ring
that reacts with the catalytic serine (Ser152) in the catalytic site of pancreatic lipase to form an
acyl–enzyme complex with ring opening. The enzyme’s function is then blocked.

Fig. 23.23 In the last step of the bacterial cell wall synthesis, a glycopeptide transpeptidase
cleaves the bond between the two residues of a D-Ala–D-Ala group and forms a new bond between D-Ala
and a glycine in a peptidoglycan strand. Lactam antibiotics of the penicillin (23.42–23.44) or
cephalosporin type (23.45) can block this step. The penicillin scaffold (green) is reminiscent of the
D-Ala–D-Ala group (orange) and is bound analogously by the enzyme. An irreversible inhibition of
the transpeptidase is achieved by a nucleophilic opening of the lactam ring with the help of the
catalytic serine. Substituents: 23.42, R = H (aminopenicillic acid); 23.43, R = PhCH2(C=O)
(penicillin G); 23.44, R = PhOCH2(C=O) (penicillin V).

Of the first penicillins that were discovered by Alexander Fleming (▶ Sect. 2.4),
only benzylpenicillin 23.43 and phenoxymethylpenicillin 23.44 still have clinical importance
(Fig. 23.23). The residues on the 6-amino group of penicillic acid were exchanged
to improve the pharmacokinetics, activity spectrum, and acid stability. Electronegative
atoms on the α-carbon atom of the acyl function increase the stability with
respect to acid-catalyzed decomposition and contribute to an improvement in the
oral bioavailability.
Bacteria quickly develop resistance to penicillins. They use lactamases, which
are enzymes that are structurally related to the transpeptidases. Four classes of
lactamases are known, of which three have a catalytic serine in their active sites.
A further class belongs to the zinc-dependent metalloenzymes (▶ Chap. 25, “Inhibitors
of Hydrolyzing Metalloenzymes”). Even in the β-lactamases, the catalytic serine
is acylated by penicillins and related cephalosporins (Fig. 23.23). Up
until this step, the mechanism in the transpeptidases and the β-lactamases is identical.
However, transpeptidases form very stable acyl–enzymes, whereas the covalent
intermediate of the β-lactamases is quickly hydrolyzed. The antibiotic intended to
deactivate the transpeptidase is thereby rendered inactive. β-Lactamases are probably
descendants of the transpeptidases. They are widespread in nature and have
evolved out of the competition between bacteria and molds. The resistance gene for
β-lactamases is easily transferred between bacteria because the information is stored
on an extrachromosomal plasmid. Such plasmids are transferred very quickly.
How are the β-lactamases different from the transpeptidases so that they are able
to quickly dispose of the covalently bound ring-opened penicillin? The release
requires a hydrolytic cleavage from the protein. For this a well-placed water
molecule in the active site is needed that can initiate the nucleophilic attack on
the acyl–enzyme species. Although the structural architecture of transpeptidases
and β-lactamases is very similar, the sequence identity is small. Nonetheless,
a transpeptidase has been equipped with the hydrolytic properties of a lactamase
by selective mutagenesis! Only a few amino acid exchanges were needed for this.
Above all, the hydrophobic amino acids such as phenylalanine and tryptophan are
the ones that protect the acyl–enzyme complex from hydrolysis in the
transpeptidase. In contrast, polar amino acids such as glutamic acid (Fig. 23.24,
Glu166) are found in the same positions in the lactamases. In contrast to the
transpeptidase’s hydrophobic amino acids, these anchor and activate a water mol-
ecule in the correct orientation for nucleophilic attack on the acyl–enzyme complex
in the lactamases. As a result, the covalent complex with the penicillin cleavage
product that was formed by ring opening in the lactamases is hydrolyzed, but it
remains stable in the transpeptidases.
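The difference between the two enzyme families can be pictured with a minimal two-state model: free enzyme is acylated by the antibiotic and the acyl–enzyme is hydrolyzed back to free enzyme. The following Python sketch computes the steady-state fraction of acylated (blocked) enzyme and the antibiotic turnover per enzyme molecule. The model and all rate constants are assumptions made for illustration; they are not measured values for any real transpeptidase or lactamase.

```python
def acylated_fraction(k_acyl, k_deacyl):
    """Steady-state fraction of enzyme present as acyl-enzyme.

    k_acyl:   pseudo-first-order acylation rate at a fixed antibiotic level (1/s)
    k_deacyl: rate of hydrolytic release of the ring-opened antibiotic (1/s)
    """
    return k_acyl / (k_acyl + k_deacyl)

def turnover_rate(k_acyl, k_deacyl):
    """Antibiotic molecules hydrolyzed per enzyme molecule per second."""
    return k_acyl * k_deacyl / (k_acyl + k_deacyl)

# Hypothetical rate constants: same acylation rate, very different deacylation.
K_ACYL = 1.0
enzymes = {"transpeptidase-like": 1e-4, "lactamase-like": 1e3}

for name, k_deacyl in enzymes.items():
    blocked = acylated_fraction(K_ACYL, k_deacyl)
    rate = turnover_rate(K_ACYL, k_deacyl)
    print(f"{name}: {blocked:.4f} blocked, turnover {rate:.4f} per second")
```

With slow deacylation almost all enzyme stays trapped as the acyl–enzyme, whereas fast deacylation leaves the enzyme nearly free and the antibiotic is continuously consumed.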
How can this lactamase-caused resistance be broken, and the degradation
of penicillins stopped? Unsubstituted penicillic acid 23.46 is quickly cleaved by
TEM-1 β-lactamase (Fig. 23.24). Based on structural considerations it was proposed
that a hydroxymethyl group should be added to the 6-position. This group should be
located in exactly the position from where the water molecule would start its
nucleophilic attack on the acyl–enzyme form. In fact, derivative 23.47 inactivates
TEM-1 β-lactamase. A water molecule was detected in the vicinity of the CH2OH
group in the subsequently determined crystal structure, but it is too far away to
successfully hydrolyze the acyl–enzyme. The hydroxyl group therefore blocks the
attack of a water molecule on the ester carbonyl group of the acyl–enzyme.

Fig. 23.24 Unsubstituted penicillic acid 23.46 is quickly cleaved by TEM-1 β-lactamase (a).
Adding a hydroxymethyl group to the 6-position gives 23.47, a compound that
forms a hydrolytically stable acyl–enzyme complex with the enzyme (b). A new crystal
structure was determined with this compound (b, lower picture). The hydroxyl group is
found at the position where the water molecule (orange sphere) starts its nucleophilic
attack on the acyl–enzyme intermediate (a, lower part, modeled structure with the
coordinates from the crystal structure of the complex with 23.47). The hydrophobic
amino acids such as phenylalanine and tryptophan are found at positions 166 and
170 of the transpeptidases, which are structurally related to the β-lactamases.
Labeled residues: Ser70, Glu104, Glu166, Asn170, and Arg244.


Fig. 23.25 β-Lactamase-resistant antibiotics of the penem and carbapenem type.
Imipenem 23.48 and meropenem 23.49 are derived from the carbapenem type.
The natural product clavulanic acid 23.50 opens its lactam ring and forms an
acyl–enzyme complex with the serine residue. A hydrolysis-resistant vinylogous
urethane analogue is formed by a rearrangement.

The incorporation of such a hydroxymethyl group has been undertaken in
important β-lactamase-resistant β-lactams such as imipenem 23.48 or meropenem
23.49 (Fig. 23.25). β-Lactamases can also be irreversibly inhibited. If such an
inhibitor is administered with a penicillin, the degradation of the penicillin by the
lactamase is blocked, and it is available to inhibit the transpeptidase. The natural
product clavulanic acid 23.50 forms an acyl–enzyme complex upon the opening of
its lactam ring. By rearrangement a vinylogous urethane is formed that is resistant to
hydrolysis.
With these examples, the spectrum of enzymes that use a serine as a nucleophile
is nowhere near exhausted. Viruses need cleavage enzymes. They must cleave the
polypeptide chains that were synthesized by the infected cell according to their own
specifications into functional viral proteins. Viruses either use proteases of the
infected host cell (cf. furin) or they employ their own viral proteases. Because
error-free functioning of the latter enzymes is essential for the maturation of the new
viruses and is also virus-specific, these proteases are privileged targets for drug
development. Peptidases with a catalytic serine as well as cysteine (Sect. 23.8) are
recognized. As we shall see in ▶ Sect. 24.3, an aspartic protease serves other viruses.
The assemblins, another group of serine peptidases, have been found in herpes
viruses. The enzyme from the cytomegalovirus belongs to this group, as do
those from the varicella zoster virus and the herpes simplex virus. These proteases
also use a serine and a histidine. An additional histidine forms the third amino acid
in the triad. Despite a different folding pattern, this triad spatially fits very well to
the trypsin triad. Even the oxyanion hole is present in this viral protease.
The hepatitis C virus belongs to the group of enveloped RNA viruses. In addition
to other viral proteins, its genome contains the sequence for a serine protease
(HCV-NS3/4A). It is needed for the cleavage of the initially produced polypeptide
chain into functional viral proteins. Therefore the inhibition of this protease
represents a therapeutic approach for fighting hepatitis infections. The infections
can easily become chronic and can lead to severe liver damage as well as liver
cirrhosis and hepatocellular carcinoma. Orally bioavailable inhibitors of this
serine protease, such as telaprevir from the company Vertex, have recently been
launched.
The carboxy serine peptidases represent another group that is folded analogously
to subtilisin (▶ Sect. 14.7). They have a triad of serine, glutamate, and aspartate.
A member of this family was recently found to be encoded by the human CLN2 gene. Mutations
in this gene lead to severe neurodegenerative diseases. An oxyanion hole is also
found in this enzyme, to which, interestingly, an aspartate contributes. It is only in the
protonated state, however, that it can fulfill its task as a hydrogen-bond donor and
negative-charge stabilizer in the transition state. Because the enzymes of this family
are active in a pH range of 3–5, the requirements for being protonated are fulfilled.
Many more cleavage enzymes that use a catalytic serine may still await
discovery. We can only wait and see which of the discovered peptidases will be
singled out for pharmaceutical development. Their catalytic machinery adopts the
same spatial architecture in all examples. Therefore, the general principles can be
transferred between the individual members of the family.

23.8 Triads in All Variations: Threonine as a Nucleophile

Aside from serine, another amino acid also carries an aliphatic-OH group: threo-
nine. This amino acid can also be catalytically active in a protease. The proteasome
represents the central protein-shredding machine for the cell and cleaves proteins
that have been marked with ubiquitin into small oligopeptides containing between
3 and 20 amino acids. The ubiquitin label is itself a highly conserved protein with
76 amino acids. As a cellular shredding machine, the proteasome plays a central
role in the metabolism of proteins, cell growth, and cell demise. Therefore it is an
important target structure for the treatment of cancer. It is a multiprotease complex
composed of more than 30 proteins and is found in the cytoplasm as well as the
nucleus (Fig. 23.26). The proteasome is constructed like a large barrel with two lid
regions that have regulatory function; these regions control the entry of substrates
into the shredder. A threonine is found in the catalytic sites of the proteases, which
have chymotrypsin-like, trypsin-like, and peptidyl-glutamyl-peptide-like substrate
specificity. The OH group of this threonine adopts the role of the nucleophile.
A neighboring, positively charged lysine and a balancing aspartate reinforce its
nucleophilic strength. Because the threonine is the first amino acid at the
N terminus, it carries a free amino group. This serves as a proton acceptor in the
mechanism. Two serines and one aspartate group contribute to the stabilization of
the transition state and complement the nucleophilic center.
The company Millennium Pharmaceuticals, which was founded as an academic
research institute, introduced bortezomib 23.51 (Velcade®) to the market in 2003;
this was the first active substance that blocks the threonine protease function of the
proteasome. Chemically, bortezomib is a boronic acid derivative (Fig. 23.26). The
inhibitor reacts with the threonine of the catalytic triad to create a covalent
connection. In addition to this reactive group, a distinct peptide-like character can
be seen in the molecule. The molecule is able to interact with the substrate-binding
site in the proteasome with this moiety.

Fig. 23.26 The proteasome, a cellular shredding machine, proteolytically cleaves ubiquitinylated
proteins selectively into small oligopeptides that have between 3 and 20 amino acids. The crystal
structure of the 20S proteasome from yeast (subunits are shown in different colors) is shown at the
left. Six of these units are inhibited by bortezomib (yellow). The boronic acid derivative
bortezomib 23.51 (right, gray) reacts with the N-terminal Thr1 and forms a covalent boronic
acid ester complex.

Another peptide analogue, carfilzomib, is in clinical testing. It carries a terminal
α′,β′-epoxy ketone. Upon inhibition, the threonine OH group nucleophilically
attacks the keto function of the inhibitor. Next, the neighboring N-terminal amino
group opens the epoxide ring. This leads to an irreversible covalent bond. Even
though the epoxy ketone is very reactive, carfilzomib is a highly selective
proteasome inhibitor. The vicinity of the nucleophilic Thr—OH of the first residue
in the sequence and the N-terminal amino group is an unusual and exceptional
combination. It is, however, necessary for the activation of the inhibitor.
At present, the proteasome is an important target structure being tested for tumor
therapy. More than 20 different inhibitors are currently in development.
Bortezomib is used for the treatment of multiple myeloma, a malignant disease
of the bone marrow. This tumor disease is based on the malignant transformation
of plasma cells, the physiological function of which is to produce antibodies for
immune defense. Even though bortezomib cannot cure multiple myeloma, its use
can extend the life of patients for whom other therapies have failed. In multiple
myeloma, the plasma cells produce massive amounts of misfolded proteins that
must be digested by the proteasome. Therefore these cells need a proteasome
that functions optimally; otherwise apoptosis would be induced. Blocking the
proteasome function is therefore desirable for such cells. Moreover these cells
are significantly more sensitive to bortezomib therapy than normal cells. Some
tumor cells also activate a transcription factor, NF-κB, which controls the proliferation
and survival of the tumor cells. The proteasome is critical for the activation
of NF-κB because it degrades an inhibitor of this transcription factor that acts as
a kind of emergency brake on NF-κB. Therefore the inhibition of the proteasome
serves to keep NF-κB in its benign form, because its inhibiting binding partner is no
longer being degraded. It is possible that bortezomib induces apoptosis of tumor
cells in that it stabilizes cyclin-dependent kinase inhibitors (▶ Sect. 26.2) as well as
the tumor-suppressor protein p53.
Interestingly, a protease was discovered in bacteria that exists as a 14mer and the
spatial structure of which is reminiscent of the proteasome. The ClpP-protein is
a serine protease that is involved in the degradation of cellular proteins in
bacteria. Treatment with a macrolide antibiotic can cause its function to run out
of control and degrade proteins in an unregulated way. This leads to cell death in the
bacteria. This principle was recognized by the company Bayer and exploited for an
antibiotic therapy. The goal was not to block the protease function of the ClpP-
protein, but rather to promote its uncontrolled effects through synthetic antibiotics.

23.9 Cysteine Proteases: Sulfur, the Big Brother of Oxygen as a Nucleophile in the Triad

In addition to the OH group of serine and threonine, the thiol group of cysteine is
able to carry out a nucleophilic hydrolytic attack on amide bonds. These enzymes
possess a catalytic triad, analogously to the serine proteases, and are termed
cysteine proteases. The first protease of this family to have been structurally
investigated in detail is papain, which is isolated from the latex of the papaya, the
fruit of the papaya tree (Carica papaya). Its triad is composed of a nucleophilic
cysteine as well as a histidine and an asparagine. The asparagine adopts the role of
the aspartate in the serine protease. The catalytic mechanism is comparable to that
of serine proteases. Even the oxyanion hole (Cys25 and Gln19) is found in proteases
from the papain family. There are indications that the transition state is
structurally similar to the acyl–enzyme intermediate. An attempt has been made to
exchange serine for cysteine in trypsin. The binding properties for the substrate (Km
value) remained virtually the same, but the catalytic rate of the reaction decreased
by five orders of magnitude. Even though the structures are geometrically nearly
unchanged, the experiment shows that the difference between serine and cysteine
proteases is more complicated than a simple exchange of sulfur for oxygen. The
fine-tuning of the structural and electronic properties is pivotal. In contrast to the
trypsin-like serine proteases, the nucleophilic cysteine exists as a preformed ion
pair with its neighbor, histidine.
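To get a feeling for what a loss of five orders of magnitude in the catalytic rate means energetically, the rate ratio can be converted into a difference in activation free energy. The following is only a rough sketch: it assumes simple transition-state theory with identical pre-exponential factors for both enzyme variants and a temperature of 25 °C, neither of which is stated in the text. The function name is chosen here for illustration.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def barrier_increase_kj_per_mol(rate_ratio, temperature_k=298.15):
    """Return the increase in activation free energy implied by a rate drop.

    Uses transition-state theory, k ~ exp(-dG/(R*T)), and assumes that the
    pre-exponential factor is the same for both enzyme variants.
    """
    return GAS_CONSTANT * temperature_k * math.log(rate_ratio) / 1000.0


# Five orders of magnitude slower catalysis (value taken from the text).
ddg = barrier_increase_kj_per_mol(1e5)
print(f"Barrier increase: {ddg:.1f} kJ/mol")
```

Under these assumptions the serine-to-cysteine exchange in trypsin corresponds to a barrier that is higher by roughly 28.5 kJ/mol, which is in the range of several hydrogen bonds, even though the geometry is nearly unchanged.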

Table 23.4 Cysteine proteases with physiological importance (X = arbitrary amino acid). The
3D structures of all of the listed enzymes are known.

Enzyme                       Cleavage site   Function or therapeutic use
Papain                       –Val–X–X–       Model botanical enzyme from papaya
Cathepsins B, L, K, M        –Arg–X–         Inflammation
                             –Gly–X–         Tumor metastasis
                             –Ser–X–         Muscular dystrophy
                             –Tyr–X–         Myocardial infarct
Calpains                     –Lys–Ser–       Stroke
                             –Arg–Thr–       Neuroprotection
                             –Tyr–Ala–       Cataract
Falcipain                    –Arg–Lys–       Malaria
                             –Lys–X–
Cruzipain                    –Lys/Arg–       Sleeping sickness
                             –Phe/Ala–
Caspases                     –Asp–X–         Rheumatoid arthritis, apoptosis, sepsis
Picornavirus 3C-proteinase   –Gln–X–         Viral infection
SARS-main proteinase         –Gln–Ser/Ala    Viral infection

Three families of cysteine proteases of importance as targets for drug therapy
have been characterized (Table 23.4). The first group is derived from papain, and
the cathepsins belong to this group. They are proteases that are involved in the
degradation of the extracellular matrix proteins and the basal membrane. Inhibiting
their function opens very different therapeutic possibilities, for example, for inflammation,
tumor metastasis, bone resorption, muscle atrophy, or myocardial
infarction. Another group comprises the calcium-dependent calpains, whose hydrolytic domain
has a fold very similar to that of papain. They occur in many cells and carry out
different functions. Calpains occur in higher concentrations at sites of cell damage
such as after traumatic brain injury, stroke, or during the formation of cataracts in
the eye. Calpains seem to be regulatory enzymes. For example, they reduce the
blood flow through vessels in cases of injury to limit blood loss. During a stroke,
this natural protective function unfortunately leads to the contrary situation: Acti-
vation of calpains reduces the blood flow, and parts of the brain become
ischemic. Destruction of the affected brain cells is the result. Specific inhibitors
could counteract the over-functioning of the calpains. Cysteine proteases of the
papain family have also been discovered in parasites. The inhibition of cruzipain
could be a concept to treat sleeping sickness. Falcipain, which is used to digest
hemoglobin by the parasite that causes malaria, represents a promising target
enzyme for malaria therapy.
The second large family of cysteine proteases includes the caspases. These are
involved in the control of apoptosis, or programmed cell death. If a cell is irrepa-
rably damaged so that it can no longer be remedied by natural repair mechanisms,
caspases are activated that initiate apoptosis. Misregulation of apoptosis leads to
different pathological conditions that are associated with tumor disease, disruption
of the immune system, or neurodegenerative damage. Inhibitors of different
caspases have potential as neuroprotective agents, as active substances for tumor
therapy, or for the treatment of rheumatoid arthritis.
The third family includes the viral 3C proteases, which occur in picornaviruses
(human rhinovirus, poliovirus, or hepatitis A virus) or coronaviruses (SARS).
These viral proteases process the primary polypeptide chain and generate the
specific viral proteins during maturation. Inhibitors of these proteases represent
a concept for antiviral chemotherapy.
A special feature associated with the papain-type proteases is the stereochemis-
try of the nucleophilic attack. In contrast to other serine and cysteine proteases, the
attack occurs from the opposite side, the so-called Si face. The S1 pocket in papain
is not prominent, and the P1 group of the substrate is oriented away from the protein.
In contrast, all of the neighboring pockets are much more prominent. Interestingly,
some of the pockets on the C-terminal side (the primed side, S1′–S4′) of cysteine
proteases are strongly structured. This can be exploited for the design of potential
inhibitors. Papain prefers substrates with hydrophobic P2 and P3 groups. An aspar-
tate is recognized as a P1 group by caspases of the second folding family. For these
reasons, many inhibitors that have been developed for caspases carry a functional
group with a carboxylic acid group or a corresponding mimetic at this position. The
interaction with the thiol group of the catalytic cysteine is decisive for the binding
of cysteine protease inhibitors to their target enzyme. It is interesting that many of
the developed inhibitors try to involve the sulfur atom in a covalent bond. Revers-
ible and irreversible head groups have been developed for this purpose. The
complex of the inhibitor leupeptin 23.52, a natural product with an aldehyde
function, with calpain II is shown in Fig. 23.27. This group reacts with the thiol
group of cysteine and forms a hemithioacetal. Leupeptin binds with high affinity to
many members of the papain family. In addition to the aldehyde head group, many
other functionalities (so-called warheads) that can be used to inhibit cysteine proteases
are known (Fig. 23.28). Irreversible inhibitors have been developed for
viral proteases; they carry a Michael-acceptor group (e.g.,
23.53). This reactive group forms an irreversible bond with cysteine and switches
the enzyme off permanently. Attempts have also been made to develop inhibitors for
cathepsins, calpains, and caspases that form a reversible connection to the thiol
group. Most of these structures are derived from aldehydes or ketones (23.54–
23.57). From a chemical point of view, the caspase inhibitor 23.56 from Vertex is
interesting. In a cyclic structure, it combines an aspartate-like side chain for the S1
pocket of the enzyme and a capped aldehyde function in the form of a cyclic acetal.
The aldehyde is released as active compound from this prodrug.

Fig. 23.27 The binding mode of leupeptin 23.52 (Ki = 0.021 µM) in the crystal structure with
calpain II. The natural product binds covalently through a terminal aldehyde function
to the thiol group of Cys115 to form a thiohemiacetal. The oxyanion hole is formed by
the NH group of Cys115 and the carboxamide group of Gln109.

Another group of enzymes that actually belongs to the family of transferases, but
follows a cysteine-protease-like mechanism, are the transglutaminases. Nine
isoenzymes have been discovered in our genome. They are constructed from four
domains and contain a catalytic domain composed of a Cys–His–Asp triad. Their task
is the posttranslational modification of proteins (▶ Sect. 26.2), that is, they modify
proteins after they have been synthesized on the ribosome. As one aspect, they can
carry out the deamidation of glutamine residues to glutamate. Furthermore, they
catalyze the cross-linking of chain strands on proteins by a transamidation reaction.
For this, the terminal amino group of a lysine is coupled with the side chain of a glutamine with
the formation of an isopeptide bond. A proteolytically stable cross-link results, so
that transglutaminases can be compared to “biological glue.” Their reaction is
analogous to that of the cysteine proteases. A nucleophilic cysteine initially forms an
acyl–enzyme with the substrate’s glutamine with loss of ammonia; this intermediate is cleaved
in the next step by the reactive lysine. A protein cross-link is the result.
Transglutaminases take on many tasks in our bodies; above all, they stabilize tissue proteins.
In the blood coagulation cascade, the transglutaminase factor XIII stabilizes the
initially formed clot by cross-linking (Sect. 23.4). Therefore inhibitors for factor
XIII could be potent anticoagulants. Other transglutaminases are also being investigated
as possible targets for drug development. Transglutaminase-2 (TG2) plays an
important role in celiac disease, which is a type of gluten intolerance. Patients that
have this disease are sensitive to gluten, which occurs in many grain products as an
adhesive protein. They develop inflammation in the mucous membranes of the small
intestines, which leads to the destruction of intestinal epithelial cells and severely
limits their ability to extract nutrients from food. Inhibitors of TG2 could represent
a therapeutic approach. Inhibitors for transglutaminases can be developed by follow-
ing analogous principles as in the case of the cysteine protease inhibitors.

Fig. 23.28 In addition to the aldehyde head group, many other functionalities have been
developed that reversibly or irreversibly couple (reactive site is marked in red) to the catalytic
cysteine and in doing so block cysteine proteases. Irreversible inhibitors such as 23.53 that have
a Michael-acceptor group are available for viral proteases. The two aldehydes 23.54 (MDL-28170)
and 23.55 (SJA-6017) are development substances for the inhibition of calpains; 23.56 (VX-740,
pralnacasan) and 23.57 (MX1013) are caspase inhibitors. Compound 23.56 is a prodrug that releases
an aspartate-like P1 side chain with ring opening and forms a thiohemiacetal with the protein
with its newly generated aldehyde function.

To date, no cysteine protease inhibitor has managed to achieve market approval,
although inhibitors for very many target structures are being developed and are in
clinical trials. We can only wait and see whether the first successes will be achieved
in the near future.

23.10 Synopsis

• Serine proteases belong to the class of hydrolyzing enzymes that cleave amide or
ester bonds. Depending on where they cleave a peptide chain, they are classified
as amino-, carboxy-, or endopeptidases.
• Three amino acids, a serine, a histidine, and an aspartic acid that reside at quite
distant positions in the sequence, are folded in characteristic proximity to one
another. The hydroxyl oxygen atom of the serine performs a nucleophilic attack
onto the carbonyl carbon atom of the scissile peptide bond. Its nucleophilicity is
enhanced by an H-bond to an adjacent imidazole moiety of a histidine.
• The histidine accepts a proton from the nucleophilic serine OH group and is
thereby transposed into a positively charged state. The neighboring aspartate
residue compensates for the positive charge. The simultaneously created nega-
tive charge on the former carbonyl oxygen is stabilized by NH functions in the
H-bond-donating oxyanion hole. Simultaneously, the carbon atom of the cleav-
ing amide bond rearranges to a tetrahedral geometry.
• Upon release of the C-terminal part of the peptide substrate, the N-terminal part
remains covalently bound as an acyl–enzyme complex. This is finally degraded via
a similar mechanism by using a water molecule as a nucleophile.
• The residues involved can be different; in particular, the nucleophilic serine can
be replaced by a threonine or cysteine. The corresponding enzymes are named
threonine and cysteine proteases.
• The peptide chain to be cleaved is primarily recognized in small binding pockets
on the protease surface that accommodate the amino acid side chains on the
C-terminal end adjacent to the cleavage site. Their composition determines the
chemical building blocks required for inhibitor design to develop highly potent
ligands for the protease.
• A number of warhead groups are known to either reversibly or irreversibly block
the catalytic serine, threonine, or cysteine residue.
• The major contribution to binding affinity and ligand specificity is achieved
through binding to the S1 pocket next to the cleavage site.
• Blood coagulation is a highly regulated cascade of serine proteases. Potent
inhibitors for antithrombotic therapy have been developed for thrombin and
factor Xa, which are located in the last steps of the cascade.
• Whereas thrombin and factor Xa exhibit deep and well-structured S1
pockets, elastase exhibits a flat S1 pocket. Binding to this pocket contributes
much less to the overall affinity of an inhibitor for this protease and the
developed compounds all involve the catalytic serine in a reversible cova-
lent attachment.
• Irreversible inhibition through covalent bond formation with the catalytic
serine is pursued to block lipases or transpeptidases. The covalent bond
formation is achieved by the ring opening of a reactive, highly strained
lactone or lactam ring. The latter principle is used by the penicillins and
cephalosporins.
• The transglutaminases follow an enzyme mechanism very similar to that of the cysteine
proteases. However, instead of cleaving a peptide bond in the main chain, they
form an isopeptide bond between the terminal amino group of a lysine and the
side-chain carboxamide group of a glutamine. Because these bonds cross-link
different segments of the polypeptide chain, they can be compared to
a biological glue that makes proteins more stable.

24 Aspartic Protease Inhibitors

The task of aspartic proteases is the cleavage of peptide bonds. Their name comes
from two aspartic acid residues that determine the catalytic mechanism. A water
molecule, which is appropriately polarized by these two residues, is used as
a nucleophile for the attack on the peptide bond. At the same time, these groups
stabilize the transition state, balance the charges, and transfer protons. The diges-
tive enzyme pepsin was intensively investigated as the first member of this enzyme
class. It is active at strongly acidic pH conditions between values of 1 and 5.
The first 3D structure of this aspartic protease was determined in the early 1970s
in the group of Alexander Fedorov. The aspartic protease family is relatively small
in the human genome; it contains 15 members. Table 24.1 lists a few important
aspartic proteases.

24.1 Structure and Function of Aspartic Proteases

Pepsin preferably cleaves peptides that contain hydrophobic residues to the right
and left of the cleavage site. Its spatial structure shows that two catalytically active
aspartic acid residues are found next to one another. One of these residues has an
unusually low pKa value of 1.5. The pKa of the other aspartic acid is higher: 4.7.
Therefore under the pH conditions in the stomach, apparently one of the side chains
in the catalytic site is protonated, but the other one is not. This difference is decisive
for the catalytic mechanism. In other aspartic proteases that exert their function at
higher pH values, a comparable difference is observed between the two groups. It is
the local environment that determines the pKa values (▶ Sect. 4.4). On the other
hand, both aspartic acid residues are spatially so close to one another that they can
no longer be considered as independent of one another. The two aspartates behave
like a coupled system, similar to a dicarboxylic acid; they are practically a diprotic
acid (Table 24.2).
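The statement that one aspartate is protonated and the other is not can be made semi-quantitative with the Henderson–Hasselbalch relation. The sketch below is an approximation only: it treats the two residues as independent titrating groups, although the text stresses that they behave as a coupled system, and the pH values scanned are chosen here for illustration rather than taken from the text. Only the two pKa values (1.5 and 4.7) come from the paragraph above.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of an acid present in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values of the two catalytic aspartic acids of pepsin, as given in the text.
PEPSIN_PKAS = {"Asp (pKa 1.5)": 1.5, "Asp (pKa 4.7)": 4.7}

# Illustrative pH values; treating the residues as independent is an assumption.
for ph in (1.0, 2.0, 3.0, 5.0):
    fractions = ", ".join(
        f"{name}: {fraction_protonated(pka, ph):.2f}"
        for name, pka in PEPSIN_PKAS.items()
    )
    print(f"pH {ph:.1f}  {fractions}")
```

Within this simplified picture, the residue with the low pKa is largely deprotonated and the other one largely protonated over most of the pH range in which pepsin is active.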
The mechanism of the peptide cleavage by aspartic proteases is shown in
Fig. 24.1. The cleavage of the amide bond is accomplished by the nucleophilic
attack of a water molecule on the carbonyl carbon atom. The deprotonated aspartate polarizes
this water molecule, and the protonated residue simultaneously forms an H-bond to
the carbonyl group of the amide bond being cleaved. In this way the C=O bond is
polarized and the nucleophilic attack on the carbon atom is facilitated. The reaction
proceeds via a tetrahedral transition state in that the oxygen atom of water attacks
nucleophilically, and a proton is transferred to the deprotonated aspartate. One
approach to the development of aspartic-protease inhibitors is to imitate the temporarily
occurring, unstable geminal-diol transition state with a stable molecule.
Hydroxyl compounds (Fig. 24.2) but also α-ketoamides and phosphinates can be
used for this purpose.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_24, © Springer-Verlag Berlin Heidelberg 2013

Table 24.1 A few aspartic proteases and the preferred site for enzymatic cleavage.

Enzyme         Cleavage site                          Function
Pepsin         Phe–Phe, Leu–Phe, etc.                 Digestion
Renin          Leu–Val, Leu–Leu                       Increasing blood pressure
Cathepsin D    Phe–Phe, Leu–Leu, etc.                 Tissue degradation
β-Secretase    Met–Asp, Leu–Asp                       Proteolytic degradation of membrane proteins
Chymosin       Phe–Met                                Milk curdling
HIV-Protease   Phe–Pro, Tyr–Pro, Phe–Tyr, Leu–Phe,    Virus replication
               Phe–Leu, Met–Met, Leu–Ala
Plasmepsin     Phe–Leu                                Hemoglobin digestion

Table 24.2 pKa values of a few dicarboxylic acids.

Dicarboxylic acid        pKa 1   pKa 2   Distance HOOC–COOH (Å)
HOOC–(CH2)n–COOH
  n = 0                  1.46    4.40     1.40
  n = 1                  2.83    5.85     2.60
  n = 2                  4.17    5.64     3.82
  n = 3                  4.33    5.52     4.95
  n = 8                  4.55    5.52    10.00
Z-HOOC–CH=CH–COOH        1.90    6.50     3.14
E-HOOC–CH=CH–COOH        3.00    4.50     3.80
1,2-C6H4(COOH)2          2.96    5.40     3.14
1,3-C6H4(COOH)2          3.62    4.60     4.93
1,4-C6H4(COOH)2          3.54    4.46     5.71
HCOOH                    pKa = 3.77
CH3COOH                  pKa = 4.76
C6H5COOH                 pKa = 4.22

A view into the binding pocket of five different aspartic proteases is shown in
Fig. 24.3. Substrate molecules are bound by these proteases in an elongated channel
that reaches from one side of the enzyme to the other and surrounds the substrate
like a tunnel. For this, the protease must allow the substrate access to the reaction
pathway by opening a flexible flap. The upper part of this tunnel is cut away in the
figure. The areas in which the protein orients its hydrogen-bond-forming groups are
shown in blue. The two catalytic aspartic acid residues are found in the center,
hidden underneath the blue area. Neighboring this are regions in which hydrogen
bonds are formed to the peptide backbone of the substrate. This binding motif is
common to all aspartic proteases. Binding pockets to the left and right of the
catalytic site are responsible for the selective recognition of the substrate. They
accommodate the side-chain groups of the substrate molecule. It is noteworthy that
in contrast to the serine proteases, the pockets on either side of the cleavage site are
well defined. This observation can be understood based on the reaction mechanism.
Also in contrast to the serine proteases, a covalent bond is never formed between
a reaction intermediate and the enzyme. Aspartic proteases often cleave between
hydrophobic amino acids. Because such residues cannot form strong interactions, it
is important to recognize and immobilize the substrate molecule by multiple
contacts on either side of the cleavage site. For the design of inhibitors, groups
must therefore be found that can mimic the interaction in the binding pockets S3, S2,
S1 and S1′, S2′, and S3′. A group from Fig. 24.2 that represents a transition-state
analogue is then placed in the cleavage site.

Fig. 24.1 Catalytic mechanism of aspartic proteases. A water molecule is polarized by one of the
two catalytically active aspartic acid residues and nucleophilically attacks the amide group to be
cleaved (a). The second aspartate forms an H-bond to the carbonyl group of the amide bond. This
causes the electrophilicity of the carbonyl carbon to increase. The tetrahedral transition state (b)
collapses upon formation of the cleavage product (c).

Fig. 24.2 Possible transition-state isosteres for the design of aspartic protease inhibitors.
Hydroxyl groups are particularly well suited. Statin, a non-proteinogenic amino acid, is found in
many inhibitor structures. Isosteres shown: hydroxyethylene, 1,2-dihydroxyethylene, statin,
norstatin, difluorketal, hydroxyethylamine, α-hydroxylamide, α-ketoamide, and phosphinate.

Fig. 24.3 A view into the binding pockets of the five aspartic proteases: HIV protease (a),
endothiapepsin (b), cathepsin D (c), plasmepsin (d), and renin (e). The catalytic site passes like
a tunnel through the protease (above left, inset). In the figures, the proteins are displayed in a way
that the cut runs through the middle of the tunnel, and the front side of the tunnel is clipped off. The
backside of the tunnel is visible through a view from the side (arrow); the protein surfaces are cut
above and below the active site. The blue areas on the backside of the tunnel surface indicate donor
or acceptor groups in the proteins to which the substrate is bound along the peptide backbone.

Fig. 24.4 Pepstatin 24.1 (Iva = isovaleric acid, Sta = statin) is an inhibitor for a large number
of different aspartic proteases.

Hamao Umezawa isolated one of the first potent and specific aspartic-protease
inhibitors, pepstatin, from a culture of Streptomyces sp. This peptide, Iva–Val–Val–
Sta–Ala–Sta-OH, 24.1 (Fig. 24.4) is a good to highly potent inhibitor of many
members of the aspartic-protease family. It contains the non-proteinogenic amino
acid statin with a hydroxyethyl group. The 3D structure of the pepsin–pepstatin
complex shows that statin indeed binds as a transition-state mimic to the catalytic
aspartic acids.

24.2 Design of Renin Inhibitors

Renin is an aspartic protease that is composed of 340 amino acids. It plays a pivotal
role in endogenous blood pressure regulation and in electrolyte and water homeo-
stasis. The enzyme cleaves the peptide angiotensinogen to form the decapeptide
angiotensin I (Fig. 24.5). This is subsequently cleaved by angiotensin-converting
enzyme (ACE, ▶ Sect. 25.4), a metalloprotease, to give the octapeptide angioten-
sin II, which increases blood pressure. Inhibition of the enzyme renin leads to
a decrease in the concentration of angiotensin I and, as a consequence, of angio-
tensin II. Renin inhibition therefore has a hypotensive effect. As a consequence of
the great therapeutic success of the ACE inhibitors, many pharmaceutical compa-
nies began to search for selective renin inhibitors. Renin has an unusually high
specificity. Angiotensinogen is the only known natural substrate of this enzyme.
Therefore, it should be possible to find a highly specific renin inhibitor that blocks
no other enzymes and causes no side effects, which is not the case with many other
antihypertensive compounds.
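
The cleavage cascade described above can be traced residue by residue. The following Python sketch is meant only as an illustration: the sequences are those given in Fig. 24.5, whereas the helper name `split_peptide` and the variable names are hypothetical.

```python
# Illustrative sketch: trace the renin-angiotensin cascade residue by residue.
# Sequences follow Fig. 24.5; the helper name `split_peptide` is hypothetical.

ANGIOTENSINOGEN_N_TERMINUS = [
    "Asp", "Arg", "Val", "Tyr", "Ile", "His", "Pro",
    "Phe", "His", "Leu", "Val", "Ile", "His",
]


def split_peptide(peptide, n_residues):
    """Return (first n_residues, remaining residues) of a peptide."""
    return peptide[:n_residues], peptide[n_residues:]


# Renin cleaves the Leu-Val bond and releases the decapeptide angiotensin I.
angiotensin_1, _ = split_peptide(ANGIOTENSINOGEN_N_TERMINUS, 10)

# ACE removes the C-terminal dipeptide, leaving the octapeptide angiotensin II.
angiotensin_2, released_dipeptide = split_peptide(angiotensin_1, 8)

# The Asp-aminopeptidase removes the N-terminal Asp to give angiotensin III.
removed_residue, angiotensin_3 = split_peptide(angiotensin_2, 1)

for name, peptide in [
    ("Angiotensin I", angiotensin_1),
    ("Angiotensin II", angiotensin_2),
    ("Angiotensin III", angiotensin_3),
]:
    print(f"{name:<16}{len(peptide):>3} residues  {'-'.join(peptide)}")
```

Blocking the first step (renin) lowers the amount of every peptide further down the list, which is the rationale for renin inhibition given in the text.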
538 24 Aspartic Protease Inhibitors

Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu-Val-Ile-His-Protein   Angiotensinogen
(P3 = Phe, P2 = His, P1 = Leu, P1′ = Val, P2′ = Ile)
    ↓ Renin
Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu   Angiotensin I (AI)
    ↓ Angiotensin-Converting Enzyme (ACE)
Asp-Arg-Val-Tyr-Ile-His-Pro-Phe   Angiotensin II (AII)   → Angiotensinases → Inactive Fragments
    ↓ Asp-Aminopeptidase (Angiotensinase A)
Arg-Val-Tyr-Ile-His-Pro-Phe   Angiotensin III (AIII)   → Angiotensinases → Inactive Fragments

Fig. 24.5 The renin–angiotensin system. The conversion of angiotensinogen to angiotensin II (AII), which increases blood pressure, is accomplished in two steps. Degradation by an Asp-aminopeptidase, angiotensinase A, leads to angiotensin III (AIII), which is still biologically active. Different angiotensinases (aminopeptidases, carboxypeptidases) degrade these two peptides into inactive fragments.

Table 24.3 The replacement of the cleavable amide bond in Leu–Val by a stable isostere leads to potent renin inhibitors. The Leu–Val group is replaced by a group in the inhibitors that the enzyme cannot cleave.

Substrate/Inhibitor                              IC50 (nM)
His Pro Phe His Leu Val Ile His                  300,000 (a)
His Pro Phe His Leu[COCH2]Val Ile His            500
His Pro Phe His Leu[CH2NH]Val Ile His            200
His Pro Phe His Statin Ile His                   20
His Pro Phe His Leu[CHOHCH2]Val Ile His          3
(a) Substrate, KM value

The starting point for the work was the peptide sequence of the substrate
angiotensinogen. Renin cleaves angiotensinogen between Leu and Val. Initially
an appropriate surrogate for the Leu–Val unit was sought that would allow the
retention of the amino acids in the positions P5 to P3′ (Table 24.3). The octapeptide
His–Pro–Phe–His–Leu–Val–Ile–His is cleaved as a renin substrate. Replacing the
Leu–Val amide bond, which is cleaved by the enzyme, with the stable, isosteric
groups CH2NH or COCH2 led to modestly effective inhibitors. The isostere with
the hydroxyethylene group, CH(OH)CH2, was better suited as a transition-state
analogue and afforded a strong inhibitor (IC50 = 3 nM). The incorporation of the

Table 24.4 Optimization of the P1 side chain. The binding pocket is lipophilic and obviously has just the right size for a cyclohexylmethylene group.

[Structure 24.2: Boc-Phe-His-NH scaffold carrying the variable P1 side chain R; drawing not reproduced]

R                        IC50 (nM)
Isobutyl                 81
Cyclohexylmethylene      4
Cyclohexyl               150
Adamantylmethylene       2,500
Benzyl                   15

Table 24.5 The introduction of a second hydroxyl group in the R2 position leads to a significant increase in the binding affinity.

[Structure 24.3: Boc-Phe-His scaffold with the substituents R1 and R2; drawing not reproduced]

R1                   R2     IC50 (nM)
Isobutyl             H      1,500
Isobutyl             OH     11
Cyclohexylmethyl     H      10
Cyclohexylmethyl     OH     1.5

non-natural amino acid statin (see Fig. 24.4) produced a very well-binding inhibitor.
As a dipeptide isostere, statin replaces the P1–P1′ unit in the Leu–Val segment
of the substrate.
The next step was the optimization of the P1 moiety. Different groups were
investigated as a replacement for the leucine side chain. The results of such
a structural variation of 24.2 are listed in Table 24.4. The replacement of the
isobutyl group by a larger cyclohexylmethylene group increased the affinity by
a factor of 20. An adamantylmethylene group is obviously too large for the pocket
because the corresponding derivative only weakly inhibits the enzyme. Next, the P2
moiety was investigated. Here, however, the replacement of the histidine by another
group did not lead to a significant improvement in the binding affinity. Nonetheless,
the replacement of the basic histidine in the P2 position brought significant progress in
the context of renin research because it enabled the discovery that glycols are potent
renin inhibitors. A few compounds from the 24.3 class are listed in Table 24.5.
The introduction of a second hydroxyl group in the correct configuration increased
the affinity by a factor of 10–200 depending on the chosen P1 side chain.
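
The affinity gains quoted in this optimization can be checked against the tabulated IC50 values. The short Python sketch below is illustrative only; the function name `fold_improvement` is hypothetical, and the numbers are taken from Tables 24.3 to 24.5.

```python
# Illustrative sketch: fold changes in potency from the IC50 values
# in Tables 24.3-24.5. The function name `fold_improvement` is hypothetical.

def fold_improvement(ic50_reference_nm, ic50_new_nm):
    """Return how many times lower the new IC50 is than the reference IC50."""
    return ic50_reference_nm / ic50_new_nm


# Table 24.3: reduced amide CH2NH (200 nM) -> hydroxyethylene CH(OH)CH2 (3 nM)
isostere_gain = fold_improvement(200, 3)

# Table 24.4: P1 side chain, isobutyl (81 nM) -> cyclohexylmethylene (4 nM)
p1_gain = fold_improvement(81, 4)

# Table 24.5: second hydroxyl group (R2 = H -> R2 = OH) for both P1 side chains
diol_gain_isobutyl = fold_improvement(1500, 11)
diol_gain_cyclohexylmethyl = fold_improvement(10, 1.5)

for label, gain in [
    ("CH2NH -> CH(OH)CH2", isostere_gain),
    ("isobutyl -> cyclohexylmethylene", p1_gain),
    ("diol, R1 = isobutyl", diol_gain_isobutyl),
    ("diol, R1 = cyclohexylmethyl", diol_gain_cyclohexylmethyl),
]:
    print(f"{label:<34}{gain:8.1f}-fold")
```

The P1 exchange gives the factor of about 20 mentioned in the text. For the two pairs actually listed in Table 24.5, the second hydroxyl group gives factors of roughly 7 and 136, which illustrates how strongly the gain depends on the P1 side chain.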

Fig. 24.6 Enalkiren 24.4 (A-64662, IC50 = 14 nM) and remikiren 24.5 (Ro 42-5892, IC50 = 0.7 nM) were the first renin inhibitors upon which a clinical trial was conducted.

Accordingly, it was possible to find tripeptide analogues with binding constants of
about 1 nM. Several companies developed renin inhibitors to the point of clinical
trials. Examples are A-64662 24.4 (Fig. 24.6) from Abbott and Ro 42-5892 24.5
from Roche.
The desired goal, however, had still not been achieved. The compounds had
short half-lives and were not orally available. It turned out that the amide bond
between the P3 moiety Phe and the P2 moiety His could be quickly cleaved by
the digestive enzyme chymotrypsin. The high molecular weight of the compounds,
which led to fast biliary excretion, was also a problem. Further work focused on the
goal of finding a suitable replacement for the P2 and P3 side chains.
Inhibitor stability to chymotrypsin was achieved by altering the P3 group,
phenylalanine. The stability of a few of these modified renin inhibitors 24.6 is
summarized in Table 24.6. A compound that was no longer cleaved by chymotrypsin
was obtained by using β,β-dimethylphenylalanine. This is because, in contrast
to phenylalanine, the very bulky side chain no longer fits into chymotrypsin’s
specificity pocket.
The extended search for possible replacements of Phe–His as the P3–P2 moiety led
to a wealth of new, non-peptidic renin inhibitors with high affinity. The introduction
of a terminal basic group was highly effective, even if adequate oral bioavailability
was not achieved. Typical representatives are 24.8 and 24.9 (Fig. 24.7).
Despite the enormous effort, renin research stagnated worldwide because no
compound could achieve the required oral bioavailability. All compounds
contained at least one amide bond, and their molecular weights were too high.
In addition, a 3D structure of renin became available only in the late 1980s from the

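Table 24.6 By modifying the P3 moiety, Phe, the stability to chymotrypsin is improved.

[Structure 24.6: common scaffold with the variable P3 group R; drawings of the scaffold and of the three R groups not reproduced]

R (drawing not reproduced)              IC50 (nM)   Hydrolysis by chymotrypsin, t1/2 (min)
First entry (no substituent label)      0.35        2.2
Entry bearing an OMe group              0.76        727
Entry bearing two CH3 groups            0.58        Stable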

laboratory of Michael James. By then, it had already
been recognized that renin had a certain, if modest, sequence homology of 20–30%
with aspartic proteases that came from fungi, for which the 3D structures were
known. This was the starting point for homology modeling in multiple laboratories.
The first model was published in 1984 by Tom Blundell’s group. They used the
crystal structure of endothiapepsin as a reference. Initially the renin sequence
was compared with that of other aspartic proteases to find structurally conserved
regions. Then the modeling in the interior of the protein was accomplished by
replacing residues in endothiapepsin with those of renin. Truncations and insertions
in the polypeptide chain had to be considered. The flap region had particular
importance. It opens so that the ligand can enter and form hydrogen bonds with
the protein. Its structural architecture is therefore important for ligand binding.
Unfortunately the renin sequence deviated from that of the fungal enzymes exactly
at the flap region. A comparison of the renin model with the later-determined crystal

Fig. 24.7 Structures 24.7 (A-72517, IC50 = 1.1 nM), 24.8 (PD-134672, IC50 = 0.57 nM), and 24.9 (EMD-65010) are a few moderately orally available renin inhibitors. All have a diol unit and a cyclohexylmethylene side chain in common that binds in the S1 pocket.

structure showed good agreement, especially in the lower face of the binding pocket
near the two aspartic acids. Significant differences were found in the turn areas of the
loop region. In the context of the entire protein architecture, these were less impor-
tant. In the context of drug design, they were decisive! Errors in the structural model
necessarily had to lead to the wrong suggestions for inhibitor design.
The renin structure in complex with the inhibitor CGP-38560 24.10 that was
determined by Markus Grütter and John Priestle at Ciba in Basel, Switzerland, is
shown in Fig. 24.8. Based on this structure, the researchers at what was then Ciba, now
Novartis, managed a breakthrough. If the arrangement in the binding pocket of
renin is considered, it is apparent that the S1 and S3 pockets merge into one large
hydrophobic cavity. The residues in the P1 and P3 position, a cyclohexylmethylene
and a benzyl group, approach one another closely. The following design concept
was then obvious: Instead of stretching the molecule and its groups on a peptide
backbone, the scientists disrupted the chain at the amide bond. This created a new
polar group: a charged, free amino terminus. The tethering of the molecule was
instead diverted to the closely placed hydrophobic residues in the S3/S1 pocket. An
entirely new, dipeptide-like scaffold resulted (24.11, Figs. 24.8 and 24.9). It had an

Fig. 24.8 Superposition of the crystal structures of renin complexes with the inhibitors CGP-38560 24.10 (gray carbon atoms) and aliskiren 24.12 (light-green carbon atoms). The inhibitors bind with their peptide-like architecture in an extended, pleated-sheet-like conformation. Compound 24.10 orients its benzyl and cyclohexylmethyl groups in the broad S3/S1 pocket. Both of the hydrophobic side chains from 24.10 were linked to each other for the design of aliskiren. In exchange, the peptide chain could be cut open to form a new polar N terminus. Labeled residues: Asp32, Ser76, Thr77, Asp215, Ser219.

IC50 of 6 nM. Finally, a few steps of side-chain optimization were undertaken on
the aromatic ring and on the amide bond. The methoxypropoxy side chain occupies
a somewhat different pocket than the corresponding groups in CGP-38560. It
causes a significant increase in the binding affinity. The optimized residue in P1′
exerts only a small influence on the in vitro affinity, but it is pivotal for the duration
of action. A geminal substitution with two methyl groups and a terminal
carboxamide group proved to be optimal for the P2′ position. This inhibitor was
introduced to the market in 2006 with the name aliskiren (24.12) as the first orally
available renin inhibitor. Despite being so well optimized, the compound does not
show ideal bioavailability. Therefore, it must be given in therapy at a rather high
dose. Nonetheless, aliskiren shows virtually no binding to other aspartic proteases
such as cathepsin D or pepsin.
Roche achieved another hit from their renin work that later proved to be
stimulating for the entire research field. The company had a potent inhibitor with
remikiren 24.5, which unfortunately lacked the desired oral bioavailability. The
company then initiated a broad screening program. Chlorophenylmethoxybenzyloxypiperidine
24.13 was discovered (Fig. 24.10) with an IC50 value of 50 µM.
This structure was surprising because it does not have a typical transition-state-
mimicking group. The crystal structure with a very similar derivative showed that
the protonated nitrogen on the piperidine ring binds between the two catalytic
aspartic acids. The lipophilic chlorophenyl portion orients into the broad S1/S3

Fig. 24.9 For the further development of 24.10 (CGP-38560, IC50 = 2 nM) to aliskiren 24.12 (IC50 = 0.6 nM) as an orally available renin inhibitor, the benzyl and cyclohexylmethyl side-chain groups in 24.10 were tethered together to give 24.11 (IC50 = 6 nM). Then the peptide chain could be cleaved after the nitrogen atom, and a new polar group could be formed that binds to the catalytic center. The analogous molecular parts in both inhibitors are shown in red.

pocket that is normally occupied by the leucine and phenylalanine of the
angiotensinogen substrate. Because the available space in this pocket was not yet
fully occupied, the researchers at Roche initially concentrated on structural varia-
tions in the para position of the aromatic ring as a surrogate for the chlorine atom.
The introduction of aromatic groups with variable chain lengths afforded derivatives
with up to 100-fold improved activity. It seemed to be critical that only hydrophobic
groups could be placed in this area. The best results were achieved with
a propylenedioxybenzyl side chain. With this, the compounds advanced into the
subnanomolar inhibition range. A crystal structure determination was accomplished
with derivative 24.14, which showed an entirely unexpected binding mode
(Fig. 24.11). The protonated nitrogen atom of the piperidine ring is still between
the two aspartic acid residues, but the lipophilic naphthyl group is oriented in the
broad S1/S3 pocket. The long, hydrophobic side chain of the 4′-substituted phenyl
group opens a new pocket in renin. As with all aspartic proteases, renin has a flexible
flap region that falls over the binding pocket after the substrate has bound. In the
present case the flap is pushed outward by the inhibitor. The enzyme adopts
a geometry that corresponds better to the open-flap conformation. A hydrogen
bond between Tyr75 and Trp39 that closes the flap is ruptured. At the same time,

Fig. 24.10 A piperidine derivative 24.13 that was found in a screening for renin inhibitors at Roche. A crystal structure was determined for the optimized compound 24.14.

Fig. 24.11 The crystallographically determined binding mode of the piperidine lead structure 24.14 with renin. The basic nitrogen of the inhibitor binds between the two aspartic acids of the catalytic dyad. The lipophilic side chain lies in a newly opened binding pocket. It was formed by breaking up a hydrogen bond that was originally present between Tyr75 and Trp39 in the uncomplexed protein. After binding of 24.14, both residues adopt a new position at a larger distance from one another. Labeled residues: Asp32, Trp39, Tyr75, Asp215.

the inhibitor’s 4-phenyl group occupies a region where the aromatic ring of Tyr75
would be located if the flap were closed. This structure afforded the researchers two
important pieces of information: (1) a nitrogen-containing heterocycle is an
interesting peptidomimetic that binds to the catalytic aspartic acids, (2) inhibitors
can bind to the aspartic protease family in more conformers than the closed-flap
conformation. The open conformer can also be stabilized by an inhibitor. These
exemplary studies on renin afforded important information for new work on the
aspartic proteases (Sect. 24.6).

24.3 Design of Substrate-Analogue HIV Protease Inhibitors

AIDS is an infectious disease that is caused by the human immunodeficiency
virus, HIV. HIV protease, which is needed for viral replication, is coded as a large
pro-protein in the viral genome. The function of HIV protease is to cut the poly-
peptides that are produced in the virus’s lifecycle into functional proteins. Inhibitors
of HIV protease should therefore be able to suppress the replication of HIV. The
existence of the HIV protease was postulated in 1985 and experimentally confirmed
in 1988.
In 1989, the first 3D structure of the enzyme as well as a few enzyme–inhibitor
complexes were determined. HIV protease is a homodimer made up of two iden-
tical chains. One catalytic aspartic acid comes from each chain of the homodimer.
The dimeric structure of HIV protease with its twofold symmetry is shown in
Fig. 24.12.
It was soon discovered that HIV protease is also inhibited by pepstatin. This was
the starting point in the search for HIV protease inhibitors. Many companies
already active in the renin area tested the compounds prepared in the course of
those programs for HIV protease inhibition. By starting from the non-natural amino
acid statin, which was known from the renin work, a series of active HIV protease
inhibitors was discovered. Just as was the case with renin, the hydroxyethylene
isostere proved to be a particularly well-suited building block. For example, H 261
24.15 (Fig. 24.13) is a potent HIV protease inhibitor with Ki = 5 nM.
Heptapeptides were identified as a minimal substrate for HIV protease. Ser–Leu–
Asn–Phe–Pro–Ile–Val is such a substrate. Cleavage of the amide bond is accom-
plished between the amino acids Phe and Pro. The replacement of the cleavable
amide bond with a hydrolytically stable hydroxyethylamino group –CHOH–CH2–
NH– led to 24.16 (JG 365, Fig. 24.13), a high-affinity HIV protease inhibitor
(Ki = 0.66 nM). This compound was, however, inactive in cell culture tests. It is
not able to penetrate the cell membrane to exert its antiviral effects.
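
The P/P′ labels used throughout this chapter follow from the position of the cleaved bond: residues are counted outward from the scissile bond, unprimed toward the N terminus and primed toward the C terminus. The Python sketch below illustrates this for the two substrates named in the text; the function name `label_positions` is hypothetical.

```python
# Illustrative sketch: assign P/P' labels around a scissile bond.
# The function name `label_positions` is hypothetical.

def label_positions(peptide, n_unprimed):
    """Label residues Pn...P1 before the cleaved bond and P1'...Pm' after it.

    n_unprimed is the number of residues on the N-terminal side of the bond.
    """
    labels = {}
    for i, residue in enumerate(peptide):
        if i < n_unprimed:
            labels[f"P{n_unprimed - i}"] = residue
        else:
            labels[f"P{i - n_unprimed + 1}'"] = residue
    return labels


# HIV protease minimal substrate, cleaved between Phe and Pro
hiv_substrate = ["Ser", "Leu", "Asn", "Phe", "Pro", "Ile", "Val"]
hiv_labels = label_positions(hiv_substrate, 4)

# Renin octapeptide substrate (Table 24.3), cleaved between Leu and Val
renin_substrate = ["His", "Pro", "Phe", "His", "Leu", "Val", "Ile", "His"]
renin_labels = label_positions(renin_substrate, 5)

for name, labels in [("HIV protease", hiv_labels), ("Renin", renin_labels)]:
    print(name, " ".join(f"{pos}={res}" for pos, res in labels.items()))
```

For the heptapeptide this places Phe in P1 and Pro in P1′, and for the renin octapeptide it reproduces the range P5 to P3′ mentioned in Sect. 24.2.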
The chemists at Roche proved that the design of an HIV protease substrate
analogue can lead to an effective drug. Because proline often occurs in the P1′
position (e.g., 24.17, Fig. 24.14), isosteres of analogues of the dipeptide Phe–Pro
were investigated as HIV protease inhibitors. The replacement of proline by
homoproline 24.18 or decahydroisoquinoline 24.19 led to a significant increase in
the binding. Moreover 24.19 exhibited marked selectivity regarding the other
aspartic proteases renin, pepsin, cathepsin D, and cathepsin E. More importantly,
the compound was active in a cellular test. It has the ability to penetrate the cell
membrane. In the enzyme tests, 24.19 inhibits HIV protease with Ki < 0.12 nM.
Viral replication is inhibited in cell culture with EC50 values of 1–10 nM. The
activity in cells is therefore on the same order of magnitude as the pure enzyme
inhibition. Saquinavir 24.19 was the first HIV protease inhibitor to pass all phases
of clinical trials and gain market approval in November of 1995. In subsequent
years, other pharmaceutical companies managed to introduce substrate-like HIV
protease inhibitors to the market. As a result, our drug arsenal has eight approved
drugs (24.19–24.26) with peptide-like scaffolds (Fig. 24.15). However, nelfinavir

Fig. 24.12 3D Structure of HIV protease in complex with the peptide substrate Arg–Pro–Gly–Asn–Phe–Leu–Gln–Ser–Arg–Pro. The structure with the substrate could be obtained with a catalytically inactive enzyme because both acidic aspartic acids of the catalytic dyad had been mutated to asparagines. The protease exists as a C2-symmetric homodimer. The peptide chains are shown in green and red, respectively. Labeled residues: Asn25, Asn25′, Ile50, Ile50′.

Fig. 24.13 The peptidic HIV protease inhibitors H 261 24.15 (Ki = 5 nM) and JG 365 24.16 (Ki = 0.66 nM) are potent inhibitors in the enzyme test. They are inactive in cell culture.

24.24 was withdrawn from the European market in 2007. It was noticed that tablets
containing this substance had an unusual smell. The subsequent analysis gave
the alarming result that the drug was contaminated with ethyl mesylate from the
synthesis. Because saquinavir has unsatisfactory bioavailability (3–5%), it is
administered in combination with ritonavir 24.20, a potent CYP 3A4 inhibitor

Fig. 24.14 The stepwise optimization of the substrate-analogue inhibitor 24.17 (IC50 = 140 nM) led to the highly potent HIV protease inhibitor Ro 31-8959 24.19 (IC50 < 0.4 nM, Ki < 0.12 nM) via 24.18 (IC50 = 2 nM). This compound was the first protease inhibitor to pass clinical trials and is marketed with the name saquinavir.

(Ki = 17 nM; ▶ Sect. 27.6). Upon co-administration, this significantly reduces the
first-pass metabolism of saquinavir. Amprenavir 24.23 was withdrawn in 2004
because it was replaced by the more soluble prodrug fosamprenavir (Lexiva®).

24.4 Structure-Based Design of Non-Peptidic HIV Protease Inhibitors

The relationship to the parent substrate is clearly visible in the inhibitors that were
introduced in the last section. Fundamentally the compounds are still peptides. The
crystal structures of the peptidic HIV protease inhibitors in complex with the
enzyme all show that the inhibitors form essentially the same H-bond pattern in
the immediate vicinity of the catalytically active aspartic acid residues (Fig. 24.16).
A water molecule is particularly interesting here because it is found in all of the
crystal structures. This water molecule forms two hydrogen bonds to both the

Fig. 24.15 Up to now, nine new compounds for AIDS therapy have been introduced to the market. Compounds 24.19–24.26 are peptide-like inhibitors, and tipranavir 24.27 alone has a completely non-peptidic structure.

24.19  Saquinavir      Invirase® (1995)
24.20  Ritonavir       Norvir® (1996)
24.21  Indinavir       Crixivan® (1996)
24.22  Lopinavir       Kaletra®, Aluvia® (2003)
24.23  Amprenavir (R = H)   Agenerase®, Prozei® (1999); Fosamprenavir (R = PO3H), prodrug, Lexiva® (2003)
24.24  Nelfinavir      Viracept® (1997)
24.25  Atazanavir      Reyataz®, Zrivada® (2000)
24.26  Darunavir       Prezista® (2006)
24.27  Tipranavir      Aptivus® (2005)

inhibitor and the enzyme. Inhibitors designed to displace this water molecule were
hoped to increase the binding affinity by the entropically favorable release of this
water (▶ Sect. 4.6). Moreover, it was expected that such an approach would also
increase the selectivity because a water molecule with a similar function is not
known to exist in the other aspartic proteases.

Fig. 24.16 The pattern of the hydrogen bonds between HIV protease and the peptide inhibitors in the vicinity of the catalytic aspartic acids (a). A water molecule is found in the binding pocket that forms two H-bonds to the inhibitor and to the protein. The hydroxyl group of the inhibitor displaces the water molecule that is involved in the catalytic process (cf. Fig. 24.1). By starting with this binding mode, the spatial pharmacophore of a potential inhibitor was defined (b). The search in databases of crystal structures of low-molecular weight compounds was started with this pattern. It produced the substituted phenol 24.28 as a hit. From there, six- and seven-membered cyclic ketones and a cyclic urea 24.29 were developed. These derivatives could displace the structurally conserved water molecule from the binding pocket of the protease with their carbonyl groups.
Panel (a) labels: backbone NH groups of Ile50 and Ile50′, water molecule (HOH), inhibitor hydroxyl group, Asp25 and Asp25′.
Panel (b) labels: P1 to P1′ distance 8.5–12 Å; distance from P1 and from P1′ to the H-bond donor/acceptor 3.5–6.5 Å each.

At Dupont–Merck a 3D database was searched for new scaffolds for HIV
protease inhibitors. For this, a pharmacophore pattern was derived from the crystal
structure of the enzyme. The assumption was made that the occupancy of the S1 and
S1′ pockets was essential for binding as well as an interaction with the catalytic

Fig. 24.17 The newly discovered lead structure of the cyclic urea 24.29 was optimized stepwise via the derivatives 24.30 (Ki = 4500 nM) and 24.31 (Ki = 0.3 nM) to DMP-323 24.32 and DMP-412 24.33 (Ki = 0.27 nM as given in the figure) at Dupont–Merck.

aspartates. Two lipophilic groups, separated by 8.5–12 Å, were sought that were
also 3.5–6.5 Å away from a hydrogen-bond acceptor or donor (Fig. 24.16b). In
addition, a functional group should be between the two lipophilic groups that can
displace the structurally conserved water molecule from the binding pocket.
A search in the Cambridge database (▶ Sect. 17.11) afforded a molecular scaffold
derived from a substituted phenol (24.28). From this, the idea emerged to use
4-hydroxycyclohexanone as a scaffold (Fig. 24.16). Modeling studies and intensive
discussions with the synthetic chemists finally led to a cyclic urea 24.29 as
a scaffold for the new inhibitors 24.30–24.33 (Fig. 24.17). The first result of this
development was DMP-323 24.32, a low-molecular-weight HIV-protease inhibitor.
The 3D structure of 24.31 in complex with the protease is shown in Fig. 24.18. It
confirms the hypothesis that the carbonyl group displaces the structural water
molecule, and the two hydroxyl groups bind to the catalytic aspartate residues. As
promising as the design of the cyclic urea as an HIV protease inhibitor seemed, to
date no compounds have survived all stages of clinical trial to achieve approval.
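
The distance criteria of such a pharmacophore query can be written down as a simple geometric filter. The Python sketch below is an illustration under stated assumptions, not the search procedure actually used: the distance windows are the ones quoted in the text, whereas the data layout, the feature labels, the function names, and the example coordinates are hypothetical. The third criterion, a group between the lipophilic groups that displaces the structural water, is not modeled here.

```python
# Illustrative sketch of a distance-based 3D pharmacophore filter.
# Distance windows are those quoted in the text (Fig. 24.16b); the data
# layout, feature labels, function names, and coordinates are hypothetical.
from itertools import combinations
from math import dist

LIPOPHILE_SEPARATION = (8.5, 12.0)  # Angstrom, between the two lipophilic groups
LIPOPHILE_TO_HBOND = (3.5, 6.5)     # Angstrom, lipophilic group to H-bond group


def within(value, window):
    """True if value lies inside the closed interval window = (low, high)."""
    low, high = window
    return low <= value <= high


def matches_pharmacophore(features):
    """features: list of (kind, (x, y, z)); kind is 'lipophilic' or 'hbond'."""
    lipophiles = [xyz for kind, xyz in features if kind == "lipophilic"]
    hbond_groups = [xyz for kind, xyz in features if kind == "hbond"]
    for a, b in combinations(lipophiles, 2):
        if not within(dist(a, b), LIPOPHILE_SEPARATION):
            continue
        for h in hbond_groups:
            if within(dist(a, h), LIPOPHILE_TO_HBOND) and within(
                dist(b, h), LIPOPHILE_TO_HBOND
            ):
                return True
    return False


# Made-up coordinates: lipophilic groups 10 A apart, H-bond group about 5.8 A from each
candidate = [
    ("lipophilic", (-5.0, 0.0, 0.0)),
    ("lipophilic", (5.0, 0.0, 0.0)),
    ("hbond", (0.0, 3.0, 0.0)),
]
# Made-up coordinates: lipophilic groups only 6 A apart
too_compact = [
    ("lipophilic", (-3.0, 0.0, 0.0)),
    ("lipophilic", (3.0, 0.0, 0.0)),
    ("hbond", (0.0, 3.0, 0.0)),
]

print(matches_pharmacophore(candidate), matches_pharmacophore(too_compact))
```

In a real database search, every stored conformation would first have to be reduced to such feature points; the filter then retains only scaffolds whose geometry can span the S1 and S1′ pockets while reaching the catalytic aspartates.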
A new lead structure 24.34 (Ki = 1.1 µM, Fig. 24.19) was found by screening at
Parke–Davis. The spatial structure with the protease was determined with the
homologous inhibitor 24.38 (Fig. 24.18). It showed that this structure, analogously
to 24.31, displaces the water molecule in the active site and forms H-bonds to the
552 24 Aspartic Protease Inhibitors

N N

HO OH

24.31

O O Br

OH

24.38

Fig. 24.18 The superimposition of the crystal structures of the complexes of HIV protease with
the urea-containing inhibitor 24.31 (gray) and the coumarin derivative 24.38 (light green).

catalytic aspartate residues as well as to the NH groups of Ile50 and Ile50′. The
X-ray structure was used to design derivatives with improved binding properties
such as 24.36. Modeling studies led to the idea to introduce an acidic group in the S3
pocket to form a salt bridge with Arg8. The corresponding compound with an
OCH2COOH group in the para position of the 6-phenyl ring was synthesized and
led to a marked increase in the binding affinity. The inhibitor 24.37 (Ki = 51 nM) is
achiral, has a low molecular weight, and can be prepared in three steps. The
hydroxypyrone building block proved to be successful for the inhibitor develop-
ment in the end. In 2005, Boehringer Ingelheim introduced tipranavir 24.27, the
first non-peptide HIV protease inhibitor, to the market. This compound binds to the
catalytic dyad with its hydroxyl function, but the side-chain optimization resulted in
a much more complex structure than was present in 24.34.

24.5 The Development of Resistance Against HIV Protease Inhibitors

The first inhibitor for the HIV protease was developed and introduced to the market
in less than 8 years. In the following 10 years, nine drugs have been introduced as
marketed products (Fig. 24.15). Compounds from completely different structural
classes have been successfully developed to block HIV protease. In the meantime,
an arsenal of non-peptide, low-molecular-weight, orally available compounds is
available for therapy. Inhibitors were also developed and introduced to the market
for another important viral enzyme, reverse transcriptase (▶ Sect. 32.5). In addition
to the substrate-analogue inhibitors such as zidovudine (AZT 24.39) and didanosine

Fig. 24.19 Optimization of the coumarin-like HIV protease inhibitor 24.34 (Ki = 1100 nM, IC50 = 3000 nM), which was discovered by mass screening at Parke–Davis. The extension of the thioether side chain to 24.35 (Ki = 700 nM, IC50 = 1670 nM) and 24.36 (IC50 = 1260 nM) as well as the introduction of a carboxyl group led to 24.37 (Ki = 51 nM, IC50 = 160 nM). A hydrogenated hydroxypyrone building block could be incorporated in tipranavir 24.27 at Boehringer Ingelheim. The compound represents the first non-peptide HIV protease inhibitor in therapy.

(DDI 24.40), allosteric inhibitors (such as nevirapine 24.41) are available
(Fig. 24.20). Another enzyme that can be used as a target to fight HIV was identified
in HIV integrase. Raltegravir 24.42 was granted approval at the end of 2007 as the
first drug to act on this enzyme. Furthermore, drugs such as enfuvirtide and
maraviroc that block the fusion process during viral entry into the infected cell
are noteworthy (▶ Sect. 31.5).
The virus has developed resistance to reverse transcriptase inhibitors
quickly. As with the other RNA viruses, HIV replication is very error prone. The
viral reverse transcriptase commits one error for about every 10,000 bases. In this
way, the virus produces large genetic diversity that directly leads to resistance.

Fig. 24.20 Structures of zidovudine (AZT) 24.39, didanosine (DDI) 24.40, nevirapine 24.41, and raltegravir 24.42. By using HAART therapy, which is a combination of a protease inhibitor (Fig. 24.15), a reverse-transcriptase inhibitor such as 24.39–24.41, or an integrase inhibitor such as 24.42, the hope is to break through the increasingly observed resistance to the drugs in AIDS therapy.

Because about 10⁸–10⁹ replication cycles take place each day, 10⁵ point mutations
occur in the population of viral proteins in one infected patient. Therefore, it is not
surprising that the introduction of HIV protease inhibitors has induced a great deal
of resistance. Different mutations in the binding pocket further away from the
active site also lead to a severe reduction in the binding affinity of HIV protease
inhibitors. The positions at which mutations have been observed are shown in
Fig. 24.21. If all of the previously observed exchanges are taken together, half of
all positions in the protease are affected by now. However, similar amino acid
exchanges in the vicinity of the catalytic center are observed over and over again.
This is certainly because the peptide-like inhibitors 24.19–24.26 all adopt a rather
similar binding mode in the protease (see Fig. 24.25).
Therefore a combination therapy is used to treat AIDS. The simultaneous
administration of multiple inhibitors should lead to better suppression of viral
replication. Here the formation of resistance is markedly slowed and hindered.
The best results are achieved with the co-administration of antiviral drugs with
different modes of action, for example, the combination of a nucleosidic and a non-
nucleosidic reverse transcriptase inhibitor and a protease inhibitor. Such therapies
are part of the so-called HAART strategy (highly active antiretroviral therapy)
that has found application in the clinic.

Fig. 24.21 Mutations in the amino acids in HIV protease lead to resistance to the inhibitors. The course of the polymer chain is coded in green or red. Red represents residues that, with high probability, have mutated, and green areas show little exchange. Many mutations are found in the vicinity of the active site, but some are fairly far away from the substrate-binding pocket. Figure label: substrate binding pocket.

24.6 A Basic Nitrogen as a Partner for the Aspartic Acids of the Catalytic Dyad

As mentioned above, Roche managed to discover a piperidine derivative 24.13
(Fig. 24.10) in a broad-based screening as a renin inhibitor with micromolar
activity. It could be further optimized to a subnanomolar hit. Its piperidine nitrogen
atom binds at the pivotal point between the two aspartic acids (Fig. 24.10). As
a result of this work, secondary amines have been intensively investigated in the
meantime as binding partners for the aspartic acids. An entire series of building
blocks have been described so far (Fig. 24.22), and their inhibitory effect on
different aspartic proteases has been tested. In a rational-design approach, the
five-membered ring pyrrolidine 24.43 was designed instead of a six-membered
piperidine. Such a ring with its nitrogen atom can be placed between the two
aspartic acid residues. At the same time, the specificity pockets can be symmetri-
cally reached with its side chains on the primed and unprimed sides. Just as was
done for the substrate-like inhibitors 24.19–24.26 (Fig. 24.15), hydrogen-bond
acceptor groups were incorporated on both sides of the heterocycle in the design.
In the special case of the HIV protease, a conserved water molecule (so-called
structural water) is found at a pivotal position to mediate the interaction with the flap
area. In other aspartic proteases direct contact with the flap region is achieved. The
pyrrolidine ring was augmented with aminomethylene groups on both sides, and
amide and sulfonamide groups were introduced as acceptor functions. At the same
time, the amide nitrogen atoms served as branching points to reach the four
subpockets of the protease with attached groups.
In a small series of compounds, Edgar Specker at the University of Marburg,
Germany, developed micro- to submicromolar lead structures for HIV protease and
cathepsin D. By using the racemate of 24.45, Jark Böttcher determined the crystal

[Figure 24.22 structures: building blocks 24.43 and 24.44 (X = C, S=O); labels: Ile50, Ile50′, Asp25, Asp25′, S1, S1′, S2, S2′]

Fig. 24.22 Secondary amines are promising binding partners for the aspartic acids of the catalytic
dyad of aspartic proteases. In a rational design approach, the nitrogen atom of the five-membered
pyrrolidine ring 24.43 was placed between the two aspartic acids. At the same time, the scaffold
and its side chains symmetrically reach the specificity pockets on the primed and unprimed side of
the protease. The incorporated acceptor function should form H-bonds to the structurally con-
served water in the flap region of HIV protease.

structure with HIV protease. It contained a big surprise (Figs. 24.23 and 24.24a).
The nitrogen in the pyrrolidine ring of the R,R enantiomer is, as intended, at the
pivotal point between the two aspartic acids. It takes the same position as the
hydroxyl groups in the transition-state-analogue inhibitors. However, in contrast
to the original concept, the inhibitor displaces the structural water from the binding
pocket! The oxygen of the sulfone group forms a direct hydrogen bond to the NH
group of Ile50 in the flap. The carbonyl group of the amide bond that is found on the
opposite side is not involved in any interactions with the flap region. On the other
hand, the loop of the flap adopts a distorted geometry in that the NH function of
Ile50′ makes a hydrogen-bond contact to the turn of the other monomer unit. Such
a geometry had never been seen before. In a more detailed analysis, it seemed that
the inhibitor does not optimally fill the S2 to S2′ subpockets of the protease.
Compared to amprenavir 24.23 (Fig. 24.15), the S2 pocket remains virtually
unoccupied. Moreover, the molecule’s bulky dimethylphenoxy group seemed to
protrude out over the S1′ pocket and interfere with the loop of the flap. Despite
single-digit micromolar affinity (Ki = 1.5 μM), the inhibitor seemed to be uncom-
fortable in the pocket. It seemed to break almost all of the “golden” rules of drug
design (▶ Sect. 4.11). Is the awkward dimethylphenoxy group responsible for
the binding mode? To check this, a three-armed inhibitor 24.46 (Fig. 24.23) was
synthesized. Surprisingly, the crystal structure of this inhibitor 24.46 (Ki = 52 μM)
shows the same binding mode: the structural water is displaced, and the loop region
takes on a distorted form, even though there is obviously no longer a large group to

[Figure 24.23 structures: inhibitors 24.45, 24.46, 24.47, and 24.59 drawn between Asp25/Asp25′ and the flap residues Ile50/Ile50′, with substituents directed into the S1, S1′, S2, and S2′ pockets]

Fig. 24.23 Schematic representation of the different binding modes of the four inhibitors that are
shown in Fig. 24.24. The three pyrrolidine derivatives 24.45, 24.46, and 24.47 differ in their
connecting geometry at the ring and the number of substituents. The central heterocycle was
opened in 24.59. Interestingly, the conserved structural water returns to the structure with this
ligand.

interfere with the region (Fig. 24.24b). The occupancy of the specificity pockets
seems to be anything but optimal with this inhibitor.
Next, an attempt was made to bring the substituents on both sides of the central
pyrrolidine ring closer together. Andreas Blum eliminated the two methylene
linkers to use 3,4-diaminopyrrolidine 24.44 as a central building block
(Fig. 24.22). This scaffold was symmetrically decorated with substituted sulfon-
amides on both sides. Initially benzenesulfonic acid derivatives were optimized
with regard to the substitution on the tertiary nitrogen atom (24.47–24.58;
Table 24.7). In addition to the inhibitory effects on the wild-type enzyme, a quickly
induced resistant mutant that carries a valine instead of an isoleucine in position 84
was also studied. The enlarged pocket in the resistant mutant shows reduced
binding affinity with many inhibitors because of the reduced hydrophobic contact
surface. It was also demonstrated that the wild-type enzyme does not tolerate

Fig. 24.24 Crystal structures of the inhibitors from Fig. 24.23 in HIV protease. (a) Compound
24.45 leaves the S2 pocket virtually unoccupied. Its voluminous o,o′-dimethylphenoxy substituent
only incompletely fills the S1′ pocket and seems to push against the loop in the flap region. The
structural water is displaced from the binding pocket. (b) Compound 24.46 only partially fills the
S2 and S1′ pockets. Water is also displaced from this structure and the loop takes on a distorted
geometry even though no unfavorable contacts are recognizable. (c) Compound 24.47 binds
almost C2 symmetrically and places its benzenesulfonyl groups in S2 and S2′. The N-benzyl
groups are found in S1 and S1′. Here too, the structural water is displaced from the complex.
(d) Compound 24.59 orients its p-aminobenzenesulfonyl groups in S2 and S2′. The N-benzyl
substituents occupy S1 and S1′. Both SO2 groups form H-bonds to the structural water, which
has returned in this structure. The inhibitor seems to fill the binding pocket perfectly, but it does
not achieve better binding affinity than the other derivatives despite the additional NH2 functions
that form H-bonds to the protein.

branched-chain groups (24.47–24.50) as well as the mutants (Table 24.7). The
benzyl group proved to be the best compromise for good inhibition of both
isoforms. A crystal structure determination was accomplished with this derivative
24.47 (Fig. 24.24c). The pyrrolidine nitrogen atom adopts the desired position
between the two aspartic acids. The structural water, once again, is displaced
from the pocket and one of the two sulfonamide groups forms a hydrogen bond
to the Ile50 NH group in the flap region. The inhibitor sits largely symmetrically in

Table 24.7 By modifying the R1 and R2 groups on 3,4-diaminopyrrolidine 24.44 the affinity and
resistance profile is improved (wt: wild type; I84V: mutant).

[Structure drawings of scaffold 24.44 and of the R1/R2 groups not reproduced; only the substituent labels that survive from the drawings are listed.]

Compound   Substituent label(s)   Ki (μM) wt   Ki (μM) I84V
24.47      -                      2.15         1.07
24.48      -                      12.3         84.0
24.49      -                      74.7         53.1
24.50      -                      1.57         5.82
24.51      CH3                    0.67         0.46
24.52      Cl                     0.77         0.47
24.53      Br                     0.46         0.55
24.54      I                      0.39         0.33
24.55      F3C                    0.80         0.50
24.56      NH2                    0.27         0.13
24.57      CONH2                  0.26         0.04
24.58      F3C, CONH2             0.06         0.01

the binding pocket. The benzyl groups on the amino group are placed in the S1 and
S1′ pockets. The benzenesulfonyl groups occupy the S2 and S2′ pockets. The
pockets seem to be much better filled with this inhibitor than they were with 24.45 or
24.46. Nevertheless, it stood to reason that the substituents in S1 and S1′ should be
enlarged in the para position for optimization. An added bromine or iodine substit-
uent increases the affinity for the wild type about sixfold. The mutant’s
inhibition is improved by a factor of 2. There even seemed to be adequate room in
the S2 and S2′ pockets for larger groups. Indeed, a methyl group or a chlorine atom
in the ortho position increases the affinity by a factor of 2. The effect was not as
pronounced with the mutant. Furthermore, the acidic amino acids Asp29 and
Asp30, which can be involved in the interactions with the ligand, are found at the
end of this pocket. This interaction is possible by introducing an amino or
carboxamide group in the para position. The binding to the enzyme then increases
by a factor of about 10. Further optimization led to derivative 24.58 with a CF3
group on the P1 benzyl group and an amide group on the P2 substituent. It inhibits
the wild-type enzyme with Ki = 61 nM and the mutant with 14 nM. The binding
mode of these new inhibitors based on a 3,4-diaminopyrrolidine 24.44 deviates
from that of all other currently marketed inhibitors. Perhaps this will open new
perspectives to break resistance (Fig. 24.25).
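The optimization factors quoted above can be checked directly against the inhibition constants in Table 24.7. The following is a minimal sketch, not part of the original study: the helper names `fold_improvement` and `resistance_factor` are introduced here for illustration only, and the Ki values are copied from the table in micromolar units.

```python
# Ki values in micromolar (wild type, I84V mutant), copied from Table 24.7.
KI_UM = {
    "24.47": (2.15, 1.07),
    "24.53": (0.46, 0.55),
    "24.54": (0.39, 0.33),
    "24.56": (0.27, 0.13),
    "24.57": (0.26, 0.04),
    "24.58": (0.06, 0.01),
}


def fold_improvement(reference, compound, column=0):
    """Ratio Ki(reference) / Ki(compound); values above 1 mean tighter binding.

    column 0 selects the wild-type values, column 1 the I84V values.
    """
    return KI_UM[reference][column] / KI_UM[compound][column]


def resistance_factor(compound):
    """Ratio Ki(I84V) / Ki(wt); values above 1 mean the mutant is inhibited less well."""
    wild_type, mutant = KI_UM[compound]
    return mutant / wild_type


if __name__ == "__main__":
    for name in KI_UM:
        gain = fold_improvement("24.47", name)
        factor = resistance_factor(name)
        print(f"{name}: {gain:5.1f}-fold vs 24.47 (wt), Ki(I84V)/Ki(wt) = {factor:.2f}")
```

Taking the benzyl derivative 24.47 as the reference is a choice made for this sketch; it follows the text, which describes the later substitutions as modifications of that compound.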
In one of the last steps, the central heterocycle was “cut open” and replaced with
open-chain, secondary amines (Fig. 24.23). Two- and three-membered aliphatic
chains were introduced between the SO2 groups, which were attached to address the
flap region, and the central amine nitrogen atom. The measurement of the inhibition

Fig. 24.25 Compared to all of the currently marketed HIV inhibitors (Fig. 24.15, beige)
the inhibitors based on 3,4-diaminopyrrolidine 24.44 (light green) adopt a deviating
binding mode. As a result, a different activity profile against resistant mutants is observed.

constants for different aspartic proteases gave single- and double-digit micromolar
values. A crystal structure determination was achieved with com-
pound 24.59 (Ki = 9.6 μM for HIV protease; Fig. 24.24d). As expected, the basic
nitrogen atom binds between the two aspartic acids, but at an H-bonding distance
from only one of the two aspartic acids. Interestingly the structural water returned in
the inhibitor complex and mediates a binding contact between the sulfonyl group
and the residues of the flap region.
The study of the open-chain compounds suggests how the sterically fixed
heterocycle determines the arrangement of the inhibitors in the binding pocket.
Its spatial requirement is responsible for displacing the structural water from the
binding pocket so that the H-bond-acceptor groups of the inhibitor can interact
directly with the flap region. The crystal structure with the open-chain compound
24.59 gives the impression that it has an ideally complementary fit to the protease. It
lies seemingly fully relaxed in the binding pocket, finds partners for its polar groups
in the protein, and allows the structural water to return. Even though it has amino
groups in the para position of the benzenesulfonic acid group that led to a tenfold
increase in activity in the series with the 24.44 scaffold (Fig. 24.22), it binds with
only micromolar affinity. The most beautiful binding mode does not help if the
open-chain compound first has to rearrange to achieve the necessary geometry at
the site of action. It loses too many degrees of freedom around rotatable bonds, and
this is a high price that is paid in binding affinity (▶ Sect. 4.7). Inhibitors with
a preorganized geometry have an advantage from an entropic point of view.
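To put the affinity differences discussed above on an energy scale, an inhibition constant can be converted into a standard binding free energy with the general thermodynamic relation ΔG° = RT ln Ki. The sketch below rests on assumptions that the text does not state: a temperature of 298.15 K, a 1 M reference state, and Ki treated as a dissociation constant. The function name `binding_free_energy` is illustrative.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)
TEMPERATURE_K = 298.15           # assumed temperature; not given in the text


def binding_free_energy(ki_molar, temperature_k=TEMPERATURE_K):
    """Standard binding free energy in kJ/mol from Ki in mol/L (1 M reference state assumed)."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(ki_molar)


# Ki values for HIV protease quoted in this section, converted to mol/L.
KI_MOLAR = {
    "24.45": 1.5e-6,
    "24.46": 52e-6,
    "24.58": 61e-9,
    "24.59": 9.6e-6,
}

if __name__ == "__main__":
    for name, ki in KI_MOLAR.items():
        print(f"{name}: {binding_free_energy(ki):6.1f} kJ/mol")
```

Under these assumptions, each tenfold change in Ki corresponds to roughly 5.7 kJ/mol, which gives a sense of the scale on which the entropic cost of freezing rotatable bonds competes with the gain from well-placed polar contacts.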

24.7 Other Targets from the Family of Aspartic Proteases

In addition to the two examples of renin and HIV protease, many other members of
the aspartic protease family have been validated as targets for drug development.
First, cathepsin D, a protein involved in protein catabolism, appeared to be
interesting, and concepts for the treatment of breast cancer or muscular dystrophy
have been pursued. The previously mentioned pepsin from the stomach has been
discussed as a possible therapeutic target for peptic ulcer disease. Secretory aspartic

proteases (SAP) from Candida albicans have been considered as possible target
enzymes to treat fungal diseases.
Drug design in the area of β-secretases looks very promising. Their inhibition
could lead to an efficacious Alzheimer’s disease treatment. β-Amyloid protein,
which makes up the pathological and dangerous plaques in the brains of
Alzheimer’s patients, is cleaved from a larger precursor, amyloid precursor protein
(APP). In 1999 two membrane-bound proteases from the aspartic protease family,
β- and γ-secretases, were reported that catalyze the release of β-amyloid. Since then,
potent inhibitors of these proteases have been intensively sought. The β-secretases are
termed BACE-1 and BACE-2, abbreviations of beta-site APP-cleaving enzyme. Addi-
tional target structures that are being currently investigated are the plasmepsins.
They serve the malaria parasite in the digestion of hemoglobin in the phagosome.
The parasite uses the components of hemoglobin as nutrition. Plasmepsin is used in
the initial cleavage of hemoglobin and cleaves its α-chain between Phe33 and
Leu34. Four isoforms of plasmepsin are involved in the further digestion to larger
peptide fragments. Moreover, falcipains, cysteine proteases, and falcilysin, a zinc
protease, are involved in this process. The plasmepsins show a structural homology
to cathepsin D. The first lead structures are derived according to entirely analogous
principles as, for example, were used with renin. Newer results have underscored
the fact that multiple enzymes must be simultaneously shut off to efficiently fight
the parasite and achieve a malaria therapy based on the protease inhibitors of
hemoglobin metabolism. Therefore, it might be appropriate to develop inhibitors
for the simultaneous, selective deactivation of all four plasmepsins.

24.8 Synopsis

• Aspartic proteases possess two facing aspartate residues in their catalytic cleav-
age site. A water molecule, located at the apex between both aspartates, is
polarized and nucleophilically attacks the carbonyl carbon atom of the amide
bond to be cleaved.
• The cleavage reaction proceeds through a tetrahedral transition state with
a temporarily formed geminal-diol structure. Peptidomimetic inhibitors imitate
this intermediate structure by using chemically stable building blocks. Hydroxyl
groups embedded into hydroxyethylene or statin moieties have been especially
used as transition-state isosteres.
• Aspartic proteases often cleave between hydrophobic amino acids. These resi-
dues cannot form strong interactions to the specificity pockets of the protease on
both sides of the cleavage site. They bind through multiple contacts, and the
recognition pockets are well formed on both sides.
• Renin specifically cleaves angiotensinogen to angiotensin I. Subsequently, this
product is further cleaved to the octapeptide angiotensin II, which stimulates
blood pressure to increase once recognized at its receptor. Renin exhibits
a large, virtually merged S1/S3 pocket; this gave rise to the design concept to
bridge P1/P3 substituents in the inhibitor aliskiren and to disrupt its central
peptide chain. A more polar, orally available antihypertensive agent resulted
with good duration of action.
• HIV protease is a viral aspartic protease that cleaves the incipient polypeptide
chain into mature proteins required for the life cycle of the virus. It is a
C2-symmetric homodimer with a structural water molecule mediating interactions
between the bound substrates and the flap region closing up the catalytic site.
• Through systematic variations of the minimal substrate and introduction of
transition-state isosteres, a series of potent and selective drugs with peptide-like
scaffolds has become available for therapy. Upon administration, the virus becomes
resistant through mutational modifications; meanwhile exchanges have been
reported for nearly half of all amino acid positions.
• A combination therapy, the so-called HAART strategy, is recommended for HIV
infection; this tries to achieve a better suppression of viral replication through
simultaneous administration of multiple inhibitors acting on different targets that
are crucial for the virus.
• Multiple design attempts have been followed to depart from peptide-like inhibitors
to non-peptidic structures. To date, tipranavir is the only compound to be success-
fully launched to market that has a central hydroxypyrone building block. Recently,
many scaffolds incorporating a basic nitrogen to address the catalytic dyad have
been proposed as novel skeletons to develop aspartic protease inhibitors.
• Aside from renin and HIV protease, cathepsin D, β- and γ-secretase, the
parasitic plasmepsin proteases, and the fungal secretory aspartic protease SAP
have been investigated as potential drug targets.

Bibliography

General Literature
Anderson PS, Kenyon GL, Marshall GR (eds) (1993) Therapeutic approaches to HIV, vol 1,
Perspectives in drug discovery design. Escom, Leiden
Babine RE, Bender SL (1997) Molecular recognition of protein-ligand complexes: applications to
drug design. Chem Rev 97:1359–1472
Dash C, Kulkarni A, Dunn B, Rao M (2003) Aspartic peptidase inhibitors: implications in drug
development. Crit Rev Biochem Mol Biol 38:89–119
De Clercq E (1995) Toward improved anti-HIV chemotherapy: therapeutic strategies for inter-
vention with HIV infections. J Med Chem 38:2491–2517
de Clercq E (ed) (2011) Antiviral drug strategies, vol 50, Methods and principles in medicinal
chemistry. Wiley-VCH, Weinheim
Eder J, Hommel U, Cumin F, Martoglio B, Gerhartz B (2007) Aspartic proteases in drug
discovery. Curr Pharm Des 13:271–285
Ghosh AK (ed) (2010) Aspartic acid proteases as therapeutic targets, vol 45, Methods and
principles in medicinal chemistry. Wiley-VCH, Weinheim
Greenlee WJ, Weber AE (1991) Renin inhibitors, drugs. News & Perspectives 4:332–339
Hutchins C, Greer J (1991) Comparative modeling of proteins in the design of novel renin
inhibitors. Crit Rev Biochem Mol Biol 26:77–127
Martin JA, Redshaw S, Thomas GJ (1995) Inhibitors of HIV proteinase. Prog Med Chem
32:239–288
Rosenberg SH (1995) Renin inhibitors. Prog Med Chem 32:37–144
West ML, Fairlie DP (1995) Targeting HIV-1 protease: a test for drug-design methodologies.
Trends Pharmacol Sci 16:67–74

Special Literature

Blum A, Böttcher J et al (2008) Structure-guided design of C2-symmetric HIV-1 protease
inhibitors based on a pyrrolidine scaffold. J Med Chem 51:2078–2087
Condra JH, Schleif WA, Blahy OM et al (1995) In vivo-emergence of HIV-1 variants resistant to
multiple protease inhibitors. Nature 374:569–571
Güller R et al (1999) Piperidine-renin inhibitors compounds with improved physicochemical
properties. Bioorg Med Chem Lett 9:1403–1408
Kleinert HD, Rosenberg SH, Baker WR et al (1992) Discovery of a peptide-based renin inhibitor
with oral bioavailability and efficacy. Science 257:1940–1943
Lam PYS, Jadhav PK, Eyermann CJ et al (1994) Rational design of potent, bioavailable,
nonpeptide cyclic ureas as HIV protease inhibitors. Science 263:380–384
Li YC (2007) Inhibition of renin: an updated review of the development of renin inhibitors. Curr
Opin Investig Drugs 8:750–757
Vacca JP et al (1994) L-735,524: an orally bioavailable human immunodeficiency virus type
1 protease inhibitor. Proc Natl Acad Sci 91:4096–4100
Vara Prasad JVN, Para KS, Lunney EA et al (1994) Novel series of achiral, low molecular weight,
and potent HIV-1 protease inhibitors. J Am Chem Soc 116:6989–6990
Wood JM et al (2003) Structure-based design of aliskiren, a novel orally effective renin inhibitor.
Biochem Biophys Res Commun 308:698–705

25 Inhibitors of Hydrolyzing Metalloenzymes

A metal ion in the catalytic site is needed for the function of another important
class of enzymes that cleave peptide and ester bonds. By coordinating the metal ion,
these enzymes activate a water molecule for nucleophilic attack on the bond that is
to be cleaved. The water molecule experiences a drastic change in its pKa value in
this state. By far, zinc is the most commonly used metal ion in these enzymes, but
iron, cadmium, cobalt, or manganese are also found. The presence of a metal ion is
essential for the activity of the protease or esterase. If the metal ion is removed
from the enzyme by the addition of a strong complexation reagent, for example,
β-mercaptoethanol or ethylenediaminetetraacetic acid (EDTA), the catalytic activ-
ity is lost.
Many therapeutically important enzymes are metalloproteases. The zinc pro-
teases must be mentioned first, above all, angiotensin-converting enzyme (ACE).
ACE inhibitors have been used for many years to treat high blood pressure.
Moreover, in recent years further metalloproteases have been identified as possible
targets for drug design. Among these are the endothelin-converting enzyme, neutral
endopeptidases, and the matrix metalloproteases (Table 25.1). Further groups of
important zinc enzymes are the carbonic anhydrases, the zinc-containing
β-lactamases and phosphodiesterases.

25.1 Structure of Zinc Metalloproteases

In 1967 William Lipscomb determined the first 3D structure of a zinc protease,
the digestive enzyme carboxypeptidase A. The zinc ion that is necessary for the
enzyme activity is complexed to two His and one Glu side chains. A water molecule
occupies the fourth coordination site. Moreover, an additional glutamate is found in
the vicinity of the zinc ion. The same amino acids are also responsible for binding
zinc in many other metalloproteases. The presence of the amino acid sequence
His–Glu–X–X–His (X is an arbitrary amino acid) is characteristic for most of the
known zinc proteases. For example, it is found in collagenase, thermolysin,
neutral endopeptidase 24.11, and in endothelin-converting enzyme (Table 25.2).

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_25,
© Springer-Verlag Berlin Heidelberg 2013

Table 25.1 Function and preferred cleavage sites of some metalloproteases.

Enzyme             Cleavage site                 Function                                                                        3D structure
Thermolysin        X–Ala, X–Val, X–Ile           Bacterial protease                                                              +
Carboxypeptidase   X–Tyr, X–Phe                  Digestion                                                                       +
ACE (a)            Phe–His, Phe–Leu, Pro–Phe     Transforms angiotensin I into angiotensin II, which increases blood pressure    +
NEP 24.11 (b)      Phe–Leu, Cys–Phe              Multifunctional (cleaves enkephalin, among others)                              −
ECE (c)            Trp–Val                       Transforms big-endothelin into endothelin, which increases blood pressure       −
Collagenase        Gly–Leu, Gly–Ile              Tissue remodeling                                                               +
Stromelysin        Gly–Leu, Gly–Ile              Tissue remodeling                                                               +

(a) ACE: angiotensin-converting enzyme
(b) NEP: neutral endopeptidase
(c) ECE: endothelin-converting enzyme

Table 25.2 Characteristic amino acid sequences in the active site of different metalloproteases.
Enzyme Position Amino acids
Thermolysin 142–146 His Glu Leu Tyr His
NEP 24.11 583–587 His Glu Ile Thr His
ECE 590–594 His Glu Leu Thr His
Astacin 92–96 His Glu Leu Met His
Collagenase 201–205 His Glu Phe Gly His
Stromelysin 201–205 His Glu Ile Gly His

The discovery of this amino acid sequence in the primary sequence of a new protein
is strongly indicative of a zinc protease. In matrix metalloproteases and carbonic
anhydrases, zinc is complexed by three histidine residues. A water molecule
occupies the fourth site.
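Scanning a primary sequence for the His–Glu–X–X–His signature is a simple pattern search. The following minimal sketch uses one-letter amino acid codes; the function name `find_zinc_motifs` and the short test string in the usage note are illustrative inventions, while the pentapeptides are transcribed from Table 25.2. A hit only suggests a zinc protease and does not establish one.

```python
import re

# Active-site pentapeptides from Table 25.2, transcribed into one-letter code.
SEGMENTS = {
    "Thermolysin": "HELYH",
    "NEP 24.11": "HEITH",
    "ECE": "HELTH",
    "Astacin": "HELMH",
    "Collagenase": "HEFGH",
    "Stromelysin": "HEIGH",
}

# The lookahead lets overlapping occurrences be reported as separate hits.
ZINC_MOTIF = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_zinc_motifs(sequence):
    """Return (1-based start position, matched pentapeptide) for each His-Glu-X-X-His hit."""
    return [(m.start() + 1, m.group(1)) for m in ZINC_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    for enzyme, segment in SEGMENTS.items():
        print(enzyme, find_zinc_motifs(segment))
```

For example, `find_zinc_motifs("AAHELYHGG")` reports one hit starting at position 3 of this made-up string.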
In the body, zinc exists as the double-positively charged cation Zn2+. This
positive charge is used by the enzyme for the amide cleavage. Ivano Bertini’s
group at the University of Florence in Florence, Italy managed to determine the
high-resolution structures of the uncomplexed and product-inhibited
metalloprotease MMP-12. These structures allow the following mechanism to be
derived: In the uncomplexed metalloprotease, the zinc ion is octahedrally coordi-
nated by three water molecules in addition to three amino acid residues (His or
Glu). One of the water molecules forms an additional hydrogen bond to
a neighboring glutamate. This residue (Glu219 in the MMPs, Glu270 in carboxy-
peptidase, and Glu143 in thermolysin) additionally polarizes the water molecule.
Therefore it probably exists as an OH− ion (Fig. 25.1). The peptide substrate
diffuses into the binding pocket and displaces the two other water molecules from
the zinc ion. The water molecule that is polarized by glutamate attacks the carbonyl
group of the substrate’s amide bond, which is to be cleaved; the substrate is held in

Fig. 25.1 Mechanism of peptide cleavage by a metalloprotease. The peptide substrate binds with
its P2, P1, P1′, and P2′ residues in the corresponding specificity pockets of the protease. The
amide group to be cleaved is found between the zinc ion and a water molecule (or OH–), which is
polarized by the acid group of the neighboring glutamate residue (a). This water molecule
nucleophilically attacks the carbonyl carbon atom to form a tetrahedral transition state. The zinc
ion is temporarily pentacoordinated and stabilizes the negative charge of the newly formed geminal
diol structure (b). The transition state collapses with release of both cleavage products (c).

place by hydrogen bonds to the peptide backbone on the C-terminal side. A geminal
diol structure is formed at the reaction site that is stabilized by the now
pentacoordinated zinc ion. The actual cleavage of the amide bond is achieved,
and the two product molecules initially remain in the vicinity of the zinc.

[Figure 25.2 labels: N-terminal cleavage product, C-terminal cleavage product, Glu219, Zn2+]

Fig. 25.2 A crystal structure of MMP-12 with both cleavage products has been determined
(Fig. 25.1c). The cleavage product of the former N terminus (left, light-red carbon atoms)
coordinates with its newly formed carboxylic acid function through an oxygen atom to the zinc
ion and forms no hydrogen bonds to the enzyme itself. The cleavage product originating from the
C terminus (right, light-green carbon atoms) forms four H-bonds to the main chain of the protein
and binds with its P1′ residue in the deep S1′ pocket. The released amino group
coordinates to Glu219 and the water molecule that is bound to the zinc ion.

The glutamate residue presumably adopts the role of the proton transfer agent in
this step. The cleavage product of the former N terminus is coordinated through an
oxygen atom of the newly formed carboxylic acid function to the zinc ion
(Fig. 25.2). However, it does not form any other hydrogen bonds to the protein.
On the other hand the cleavage product of the C terminus forms four hydrogen
bonds to the main chain of the enzyme, and its P1′ residue binds in the S1′ pocket.
The newly formed free amino group initially remains in the vicinity of the zinc ion.
It probably exists next to the zinc ion in an uncharged state. Then the cleavage
product that originated from the N terminus leaves the catalytic site. Presumably it

is displaced by water, which resumes its place at the zinc ion. Finally the C-terminal
product leaves the binding pocket.
Meanwhile, the three-dimensional structures of many zinc proteases have been
solved, among them those of the angiotensin-converting enzyme and many of the
interesting matrix metalloproteases such as collagenases, gelatinases, and
stromelysin (Table 25.1). Four subfamilies are known from the family of carbonic
anhydrases. The therapeutically most significant ones are the α-carbonic
anhydrases that fulfill important tasks in many organs and for which numerous
drugs have been developed.

25.2 Key Step in the Design of Metalloprotease Inhibitors: Binding to the Zinc Ion

The zinc ion plays a pivotal role in the catalytic mechanism. The known spatial
structures of metalloprotease–inhibitor complexes show that almost all highly
potent inhibitors contain functional groups that bind directly to the zinc ion. If
these groups are left out, the binding affinity drops significantly. Therefore the first
step in the design of new inhibitors must be to seek functional groups that can bind
to the Zn2+ ion particularly well. Different groups have been described in the
literature for this purpose and are summarized in Fig. 25.3. Phosphonamides –
PO2NH–, phosphonates –PO2O–, and phosphinates –PO2CH2– can all be seen as
transition-state analogues of the enzyme reaction. In fact a few potent
metalloprotease inhibitors are known, for instance, the natural product
phosphoramidon 25.1, that contain such a group. The relative binding strength of
different groups was investigated for carboxypeptidase A (Table 25.3). Likewise,
different zinc-binding groups were tested for endothelin-converting enzyme. The
results of these investigations are presented in Table 25.4.
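The relative binding strength of the zinc-binding groups can be read off Table 25.3 by ranking the Ki values and expressing each as a gain over the unsubstituted compound. This is a small sketch under the assumption that the R = H entry is a sensible reference; the function names are illustrative, and the values are those given in the table in nanomolar units.

```python
# Ki values in nM for phenylpropionic acids 25.2 binding to carboxypeptidase (Table 25.3).
KI_NM = {
    "H": 6200,
    "CH2COOH": 450,
    "CH2S(=NH)2CH3": 250,
    "OP(=O)(OH)2": 140,
    "CH2SH": 11,
}


def rank_by_affinity(ki_values):
    """Return the substituents ordered from strongest (lowest Ki) to weakest binder."""
    return sorted(ki_values, key=ki_values.get)


def gain_over_reference(ki_values, reference="H"):
    """Return Ki(reference) / Ki(group) for every group; larger means tighter binding."""
    reference_ki = ki_values[reference]
    return {group: reference_ki / ki for group, ki in ki_values.items()}


if __name__ == "__main__":
    gains = gain_over_reference(KI_NM)
    for group in rank_by_affinity(KI_NM):
        print(f"{group:15s} Ki = {KI_NM[group]:5d} nM, gain over R = H: {gains[group]:6.1f}")
```

With these numbers the thiol derivative comes out first, in line with the table caption.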
Remarkable variability has been observed in the binding potency of functional
groups that interact with the Zn2+ ion. Attenuated partial charges on the zinc ion and on
the anchor group are probably responsible for this effect. The zinc ion itself can be
found in very different local environments (i.e., 3 × His; 2 × His/1 × Glu; or 1 × Glu/1
× Cys). Thiol groups, –SH, and hydroxamic acids, –CONHOH, are evidently
particularly well suited to confer strong binding to the metalloprotease. The latter
group binds as a bidentate ligand to the zinc ion. Carboxylic acids and ketones bind
more weakly to the zinc ion than the above-mentioned groups. Nonetheless, acids are
of particularly great interest because, in the form of their esters, they often serve as orally
available prodrugs (▶ Sect. 9.2). In contrast to phosphinates and phosphonic acids,
phosphonamides are chemically not particularly stable and therefore are not the first
choice in the development of a new drug. On the other hand, sulfonamides are
excellent zinc anchors, above all for the carbonic anhydrases.
How might potential active substances for metalloproteases be designed?
A comparison of known crystal structures (e.g., MMP-12, Fig. 25.2) shows that
the binding pockets in these proteins are much better defined on the primed side of
the cleavage site. Therefore, inhibitor design must concentrate on the S1′ pocket and its

Fig. 25.3 Functional groups of metalloprotease inhibitors that are often used to bind to
the zinc ion. Hydroxamic acids and thiols (upper left) in particular lead to highly
potent inhibitors. A phosphoramide group is found in the natural product
phosphoramidon 25.1 that the inhibitor uses to coordinate to the zinc ion. It inhibits
thermolysin in the nanomolar range (25.1 Phosphoramidon, Ki = 2.8 × 10⁻⁸ M).

Table 25.3 Binding of phenylpropionic acids 25.2 to carboxypeptidase. The strongest binding
was found with the thiol derivative.

R                  Ki (nM)
H                  6,200
CH2COOH            450
CH2S(=NH)2CH3      250
OP(=O)(OH)2        140
CH2SH              11

neighboring pockets on the primed side. Nevertheless, it has been shown that
occupancy of the S1 and S2 pockets can be very important to afford inhibitors
with adequate selectivity.
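
The Ki values in Table 25.3 can be placed on a free-energy scale, which makes the
differences between the zinc-binding groups easier to compare. The following Python
sketch is only an illustration. It assumes the relation ΔG = RT ln Ki with a 1 M standard
state and a temperature of 298.15 K, neither of which is stated for the original
measurements, and the names KI_NM, delta_g, and gain_vs_reference are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol*K)
T = 298.15             # assumed temperature in K (not given in the text)

# Ki values in nM for the phenylpropionic acids 25.2, copied from Table 25.3
KI_NM = {
    "H": 6200.0,
    "CH2COOH": 450.0,
    "CH2S(=NH)2CH3": 250.0,
    "OP(=O)(OH)2": 140.0,
    "CH2SH": 11.0,
}


def delta_g(ki_nm):
    """Approximate binding free energy in kJ/mol from a Ki in nM (1 M standard state)."""
    return R_KJ * T * math.log(ki_nm * 1e-9)


def gain_vs_reference(ki_nm, ref_nm):
    """Free-energy difference in kJ/mol relative to a reference Ki (negative = tighter)."""
    return R_KJ * T * math.log(ki_nm / ref_nm)


# Substituents ordered from strongest to weakest binder
ranking = sorted(KI_NM, key=KI_NM.get)

for group in ranking:
    ki = KI_NM[group]
    print(f"{group:15s} Ki = {ki:7.0f} nM  dG = {delta_g(ki):6.1f} kJ/mol  "
          f"vs. R = H: {gain_vs_reference(ki, KI_NM['H']):6.1f} kJ/mol")
```

Under these assumptions the thiol derivative gains roughly 16 kJ/mol over the
unsubstituted compound (R = H), whereas the carboxylic acid gains only about 6.5 kJ/mol.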
The choice of appropriate groups for these pockets is dictated by their chemical
composition. Moreover, the inhibitors must be fitted with an appropriate head
group, as described previously, to coordinate to the zinc ion. It is easily understood
from the above-described mechanism for peptide cleavage why the binding pocket
on the unprimed side is less well-established. After peptide cleavage, a peptide is
formed on this side that has a terminal acid function. Such a function is itself a good

Table 25.4 Inhibition of the endothelin-converting enzyme by tryptophan derivatives 25.3. The
hydroxamic acid (R = CONHOH) as well as thiol compounds have much better affinity than the
carboxylic acid derivatives.

R              Ki (μM)
CONHOH         24
CH2SH          12
COOH           >100
CH2COOH        >100

coordination group for zinc. If the N-terminal end of the cleaved peptide were
bound in a strongly pronounced pocket on the unprimed side, a self-inhibition of the
protease would result. Usually there is no interest in this property. If the cleaved
N terminus has weak affinity to the protease, this type of inhibition can occur at high
product concentration. This can be a desirable regulatory mechanism (feedback
regulation) of Nature.

25.3 Thermolysin: Tailored Design of Enzyme Inhibitors

Thermolysin is a bacterial zinc protease that has no therapeutic importance. Nonetheless,
the 3D structure of thermolysin in complex with a large number of different
inhibitors has been determined. The influence of many elementary factors on the
strength of the protein–ligand interaction could be investigated with this protease.
Therefore this enzyme is particularly well-suited for the study of 3D structure–
activity relationships. Furthermore, its great stability makes it a robust object for
experimental investigations, and the 3D structure of thermolysin has been repeat-
edly consulted as the basis for modeling other metalloproteases.
One of the central assumptions of structure-based drug design is the idea that the
binding affinity of a ligand can be improved if the receptor-bound conformation
can be embedded into a rigid scaffold. This working hypothesis was more closely
investigated on the example of thermolysin inhibitors in the group of Paul Bartlett.
The 3D structure of Cbz–GlyP–Leu–Leu 25.4 (Ki = 9 nM, Fig. 25.4)
in complex with thermolysin served as the starting point for the work. The peptidic
inhibitor binds in a conformation that is similar to a β-turn. Therefore, the design of
a macrocyclic ligand that stabilizes this turn conformation seems possible. Essential
interactions became obvious from the analysis of the 3D structure of this inhibitor

Fig. 25.4 The development of cyclic thermolysin inhibitors based on the open-chained
inhibitor Cbz–GlyP–Leu–Leu 25.4. The cyclic inhibitor 25.5 binds 50-fold more strongly to
thermolysin than the open-chain compound 25.7. Compound 25.6 also contains the chromane
scaffold, but the conformation is not enforced by a ring closure. Structures shown: 25.4
Cbz–GlyP–Leu–Leu; 25.5 (Ki = 4 nM); 25.6 (Ki = 80 nM); 25.7 (Ki = 190 nM).

with thermolysin. The Bartlett group then sought a rigid structural element to form
a scaffold in which the conformation of both leucine side chains remained
unchanged. Chromane 25.5 (Fig. 25.4) was chosen. The additional methyl group
on the ring had to be added for synthetic reasons.
A comparison of the binding constants of compounds 25.5 and 25.7 proves that the
rigidization caused by the chromane group increased the binding affinity by a factor
of 50. This corresponds to an energetic gain of about 10 kJ/mol. The X-ray structure
analysis of the macrocyclic ligand 25.5 shows that it binds as expected. Both leucine
side chains and the main-chain atoms are found in the same position as with
Cbz–GlyP–Leu–Leu (25.4, Fig. 25.5). Certainly the gain in binding energy is not
only a result of the rigidization of the ligand. The direct interaction of the chromane
group with the enzyme additionally contributes to the affinity. The goal of the synthesis
of 25.6 was to differentiate between the two effects of the rigidization and the affinity
gain from the chromane unit. Compound 25.6 binds 20 times more weakly to
thermolysin than 25.5. However, the 3D structure shows that this open-chained
inhibitor binds to the enzyme in another conformation. This is a further example that
structures assumed to be very similar do not necessarily bind in the same way!
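
The conversion of a 50-fold affinity gain into about 10 kJ/mol follows from
ΔΔG = RT ln(ratio of the Ki values). A minimal sketch of this arithmetic is given below.
It uses the Ki values printed in Fig. 25.4 and assumes a temperature of 298.15 K, which
the text does not state; the names KI_NM and fold_and_ddg are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol*K)
T = 298.15             # assumed temperature in K (not given in the text)

# Ki values in nM as printed in Fig. 25.4
KI_NM = {"25.5": 4.0, "25.6": 80.0, "25.7": 190.0}


def fold_and_ddg(ki_weak_nm, ki_strong_nm):
    """Return (fold difference, free-energy difference in kJ/mol) for two Ki values."""
    fold = ki_weak_nm / ki_strong_nm
    return fold, R_KJ * T * math.log(fold)


# Macrocycle 25.5 versus the open-chain compound 25.7
fold_ring, ddg_ring = fold_and_ddg(KI_NM["25.7"], KI_NM["25.5"])
# Macrocycle 25.5 versus 25.6 (chromane present, ring not closed)
fold_closure, ddg_closure = fold_and_ddg(KI_NM["25.6"], KI_NM["25.5"])

print(f"25.7 -> 25.5: {fold_ring:.1f}-fold, {ddg_ring:.1f} kJ/mol")
print(f"25.6 -> 25.5: {fold_closure:.1f}-fold, {ddg_closure:.1f} kJ/mol")
```

The ratio 190/4 gives a factor of 47.5 and roughly 9.6 kJ/mol, which matches the
rounded figures quoted above. Because 25.6 binds in another conformation, the second
ratio should not be read as a clean measure of the ring-closure effect alone.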

25.4 Captopril, a Metalloprotease Inhibitor for Hypertension Therapy

Angiotensin-converting enzyme (ACE) transforms the decapeptide angiotensin I
into the octapeptide angiotensin II by cleaving off the C-terminal dipeptide

Fig. 25.5 The 3D structure of a complex of thermolysin and Cbz–GlyP–Leu–Leu 25.4 (gray
carbon atoms). The leucine side chain (right) neighboring a phosphate group occupies the deep S1′
pocket that is pointed toward the interior of the protein; the second leucine residue lies in the
shallow S2′ pocket that is open to the surface. The Cbz group is oriented in the S1 pocket (left). The
macrocyclic inhibitor 25.5 (green carbon atoms) with the chromane scaffold locks the conformation
of 25.4 and places the leucine side chains in S1′ and S2′ in an analogous way. Although this
inhibitor leaves the S1 pocket completely unoccupied, it binds to thermolysin more strongly than
the open-chain compound 25.4. Labeled in the figure: Phe114, Asn112, Arg203, Zn2+.

His–Leu (Fig. 24.5, ▶ Sect. 24.2). The release of this octapeptide leads to an
increase in blood pressure. Furthermore, ACE catalyzes the degradation of the
blood-pressure-lowering nonapeptide bradykinin to inactive peptides and in so
doing, also acts indirectly to increase blood pressure. This means that the inhibition
of ACE can simultaneously prevent blood pressure increases by blocking multiple
mechanisms. In 1965 Sergio Henrique Ferreira and John Robert Vane isolated
a peptide mixture from the venom of a snake, Bothrops jararaca (the South
American pit viper), which prolonged the blood-pressure-decreasing effects of
bradykinin in that it inhibited a protease that degraded bradykinin in the body. It
was shown that this peptide (initially called bradykinin potentiating peptide, BPP)
also inhibits the transformation of angiotensin I into angiotensin II. Multiple
structurally related peptides were identified. The most active was teprotide Pyr–
Trp–Pro–Arg–Pro–Gln–Ile–Pro–Pro (Pyr = pyroglutamic acid). This nonapeptide
was synthesized by Miguel Ondetti at the Squibb company. Teprotide is a potent
ACE inhibitor with a binding constant of Ki = 100 nM. In clinical trials it was shown
that the compound has not only a blood-pressure-decreasing effect in animal

Fig. 25.6 The crystal structure of the carboxypeptidase–benzylsuccinate complex. One carboxylate
group binds to the zinc ion and the other forms a chelate-like salt bridge to the arginine side chain
of Arg145. The phenyl group fills a lipophilic pocket. Labeled in the figure: Zn2+, Arg145, Glu270.

models but also in humans. Nevertheless, because it is a peptide, teprotide is not
orally bioavailable and is therefore not suitable as a drug. Despite this observation,
the studies proved that an ACE inhibitor is an interesting active compound for the
treatment of high blood pressure. Further investigations showed that even dipeptides
such as Val–Trp (Ki = 1.8 μM) and Ala–Pro (Ki = 230 μM) inhibit ACE, even
though more weakly than the nonapeptide.
The decisive breakthrough was the hypothesis from Miguel Ondetti and David
Cushman that ACE has a structural similarity to the metalloprotease carboxypep-
tidase A, which had been extensively investigated. Lipscomb had shortly before
determined the 3D structure of this enzyme. In addition, benzylsuccinic acid was
known as an extraordinarily effective inhibitor of carboxypeptidase A considering
its small molecular size (Fig. 25.6). A binding mode was postulated for this
molecule that invokes interactions to the enzyme also experienced by both products
of the substrate hydrolysis (Fig. 25.7). Ondetti and Cushman translated this concept
to ACE. Whereas carboxypeptidase A cleaves the last amino acid of a peptide,
ACE cleaves a dipeptide. This means that a succinic acid derivative that is
substituted with a suitable amino acid should be a potent ACE inhibitor (Fig. 25.8).
After the observation that proline led to good results as the C-terminal amino
acid of peptidic ACE inhibitors, carboxyalkanoylprolines were initially investi-
gated as possible ACE inhibitors (Fig. 25.9). Succinoyl-L-proline (25.8) was the
first compound to be synthesized in this project at Squibb. As hoped, it proved to
be an ACE inhibitor, but with affinity only in the micromolar range

Fig. 25.7 Comparison of the binding mode of the inhibitor benzylsuccinic acid and the peptidic
substrate to carboxypeptidase A. The inhibitor forms the same interactions with the enzyme as the
substrate. The amide group to be cleaved was replaced by a carboxylate group. Panels: substrate
(left) and inhibitor, benzylsuccinic acid (right), each shown with the lipophilic binding pocket
and the Zn2+ ion.

Fig. 25.8 The development of ACE inhibitors: A comparison of the substrates with the inhibitors
that were investigated by Ondetti and Cushman. In the initially investigated structure, the amide
bond to be cleaved is replaced by a carboxylate group. Panels: substrate (left) and inhibitor (right).

(IC50 = 300 μM). The replacement of the proline unit by another amino acid did
not produce an improvement in the binding: Proline was already the optimal
amino acid. Next, the length of the acid side chain was optimized. Glutaryl-L-
proline (25.9) proved to be the best representative with a moderate improvement
in binding (IC50 = 70 μM). The introduction of a methyl group in the side chain
(25.10 and 25.11) gave a strong increase in the binding affinity by a factor of 15.
Finally, the replacement of the carboxylate with a thiol group (25.12 and 25.13)
afforded the breakthrough with an increase in the binding by an order of
magnitude. The compound SQ 14225 (25.13, D-2-methyl-3-mercaptopropanoyl-
L-proline) binds to ACE with Ki = 1.7 nM and is orally available. SQ 14225 has
been marketed under the name captopril for many years and has proven itself as
an agent to treat high blood pressure. Because decreasing the blood pressure
significantly reduces stress on the heart, captopril has also been successfully
used for the treatment of congestive heart failure.
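
The progress of this optimization can be followed by simple ratios of the IC50 values.
The sketch below takes the values as printed in Fig. 25.9 (where 25.8 is listed with
330 μM, while the running text quotes 300 μM) and reports the fold change between
successive entries and over the whole series. The names IC50_UM and fold_change are
hypothetical and serve only this illustration.

```python
# IC50 values in micromolar as printed in Fig. 25.9, in the order of the figure
IC50_UM = [
    ("25.8", 330.0),
    ("25.9", 70.0),
    ("25.10", 22.0),
    ("25.11", 4.9),
    ("25.12", 0.200),
    ("25.13 (captopril)", 0.023),
]


def fold_change(ic50_before, ic50_after):
    """Factor by which the IC50 drops (values above 1 mean tighter inhibition)."""
    return ic50_before / ic50_after


# Fold change between each pair of neighbouring entries in the figure
steps = [
    (first[0], second[0], fold_change(first[1], second[1]))
    for first, second in zip(IC50_UM, IC50_UM[1:])
]
# Fold change from the lead structure 25.8 to captopril 25.13
overall = fold_change(IC50_UM[0][1], IC50_UM[-1][1])

for start, end, factor in steps:
    print(f"{start} -> {end}: {factor:.1f}-fold")
print(f"25.8 -> captopril: about {overall:.0f}-fold")
```

Neighbouring entries in the figure do not always differ by a single structural change,
so the stepwise ratios are a bookkeeping aid and not group contributions. Over the
whole series the IC50 drops by about four orders of magnitude.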
The compounds that are shown in Fig. 25.10 prove that both a free SH group
and a free carboxylate group are necessary for the strong binding of captopril to
ACE. Esterification of the carboxyl group to 25.14 or S-methylation to 25.15
leads to a dramatic loss in affinity, just as the exchange of the amide group

Fig. 25.9 Binding of ACE inhibitors. The rationally designed lead structure 25.8 is
optimized stepwise. The introduction of a methyl group in the side chain to give
25.10 as well as the replacement of the carboxylate group with a thiol are crucial for
the increase in affinity. The result was captopril 25.13. Values shown: 25.8, IC50 = 330 μM;
25.9, IC50 = 70 μM; 25.10, IC50 = 22 μM; 25.11, IC50 = 4.9 μM; 25.12, IC50 = 200 nM,
Ki = 12 nM; 25.13 captopril, IC50 = 23 nM, Ki = 1.7 nM.

Fig. 25.10 A free thiol and carboxylate function are necessary for binding to ACE.
Esterification of the acid group of 25.12 (Fig. 25.9) to give 25.14 reduced the
binding affinity by almost two orders of magnitude. The S-methylation of 25.12 gives
25.15, which has a binding affinity that is reduced by a factor of 20,000. Compound
25.17 contains merely the thiol and the carboxylate group. These two groups alone are
just enough to achieve detectable binding. Values shown: 25.14, IC50 = 17 μM;
25.15, IC50 = 4300 μM; 25.16, IC50 = 2.8 μM; 25.17, IC50 = 1100 μM.

for a –CH2CH2– group in 25.16 to 25.17 does. Because of their sensitivity to
oxidation, thiol groups are not very popular functional groups for drugs. Therefore,
other anchor groups were sought.
In the meantime, an entire palette of efficacious ACE inhibitors was made
available (Fig. 25.11); 17 products found their way into clinical trials. Of particular
note is enalapril 25.18 from Merck & Co. It, like most other marketed products,
with the exception of lisinopril, is administered as a prodrug to increase the oral
availability (▶ Sect. 9.2). As with other ethyl esters, it is quickly converted in the
body to its biologically active form enalaprilat, the anion of the free acid. Both
enalapril and lisinopril 25.19 have much longer plasma half-lives than captopril.

25.5 Finally the Crystal Structure of ACE: Do We Have to Redraft a Success Story?

In his seminal publication in 1977 on the design of captopril, David Cushman once
again highlighted the importance of structural models for the work at Squibb:

The studies described above exemplify the great heuristic value of an active-site model in
the design of inhibitors, even when such a model is a hypothetical one. Only when suitable
information on substrate specificity and mechanism of action of an enzyme is available can
one make a reasonable working hypothesis with regard to complementary functionality
needed in an inhibitor.

Could he have dreamed that it would take another 25 years before this structure
became available? In 2003 the group of Edward Sturrock in Cape Town, South Africa
accomplished the structure determination. Could it confirm the previously proposed
model? Not all of the assumed binding modes for the inhibitors were correct, but the
structure delivered critical knowledge that brought life into the ACE-inhibitor research
area again. The human enzyme is extensively glycosylated. Its extracellular part comprises
1,227 amino acids, and it is anchored in the cell membrane by 28
additional residues. Interestingly, it possesses two catalytic domains, a phenomenon that
is only very rarely seen in enzymes and has its origin in gene duplication. The
N-terminal domain contains 612 residues, and the C-terminal has 650. The two domains
are 60% identical. Both domains are catalytically active, and their catalytic sites differ
by only a few amino acids. Thus, a difference in selectivity for potential ligands is to be
expected. Moreover, the C-domain depends strongly on the local chloride concentration,
whereas the N domain does much less so. Aside from this so-called somatic form
(s-ACE), there is also a testis form (t-ACE), which is 701 amino acids long and is
composed of one domain. Except for the first 36 residues, it is almost identical to the
C domain of the somatic form. The structure determination with lisinopril 25.19
(Fig. 25.12) was accomplished with this form. The inhibitor binds with its central acid
group to the zinc ion. Its phenethyl group lies in the S1 pocket. The lysine-like group is in
the S1′ pocket and undergoes an interaction with Glu162. The proline part binds with its
acid group in S2′ to Lys511 and Tyr520. It was then interesting to model the differences
in the N and C domains of the s-ACE based on the t-ACE structure and to prepare the

Fig. 25.11 Examples of ACE inhibitors that are used in therapy. Structures shown:
25.18 Enalapril, 25.19 Lisinopril, 25.20 Spirapril, 25.21 Perindopril, 25.22 Ramipril,
25.23 Trandolapril, 25.24 Quinapril (R = H), 25.25 Moexipril (R = OCH3), 25.26 Cilazapril,
25.27 Benazepril, 25.28 Fosinopril.



Fig. 25.12 Crystal structure of lisinopril 25.19 (Fig. 25.11) with t-ACE. The central carboxylate
group of the inhibitor coordinates to the zinc ion. The NH group on the lysine residue of lisinopril
forms an H-bond to the C═O group of Ala354 in the S1′ pocket, and the carbonyl group also forms
hydrogen bonds with His353 and His513. The terminal ammonium group of the lysine residue
forms an H-bond to Glu162. The acid group of the proline residue forms an H-bond contact with
Lys511 and Tyr520. The phenethyl side chain is placed in the S1 pocket. Labeled in the figure:
Glu162, Ala354, His353, His513, Lys511, Tyr520, Zn2+.

proteins by mutagenesis. Both domains bind lisinopril with very similar affinity
(Table 25.5). The S1 pocket and the S2 pocket, which is not occupied by lisinopril,
exhibit Tyr396, Asn494, and Thr496 in the N domain. A phenylalanine, a serine, and
a valine are in these positions in the C domain. Moreover, an asparagine is found in this
pocket in the N domain that limits the entrance to the S2 pocket through a glycosylation.
Therefore, it is not surprising that keto-ACE 25.29, with its bulky benzamido group,
interacts much better with the C domain. Two other compounds are known, RXP 407
25.30 and RXP A380 25.31, which bind to the two domains with a selectivity differ-
ence of 1,000-fold. They are derived from phosphinic acids. Because the zinc-binding
group lies in the center of the molecule, these inhibitors can occupy all of the pockets
from S2 to S2′ well. RXP A380 has a much larger group for the S2 pocket. Furthermore,
this molecule has an indole moiety in the P2′ position that can undergo stronger
hydrophobic interactions in S2′. At this site the C domain has an advantage over the
N domain: Instead of a serine, a hydrophobic valine is found at position 379 which
translates into stronger binding of the inhibitor to the C domain.
What advantage does the domain-specific inhibition of ACEs offer? The enzyme
not only transforms angiotensin I into II, it metabolically degrades the blood-
pressure-decreasing bradykinin as well. ACEs are assumed to be also involved in

Table 25.5 Domain-specific inhibition of angiotensin-converting enzyme by structurally
deviating compounds. Structures shown: 25.29 Keto-ACE, 25.30 RXP407, 25.31 RXPA380.

Compound               N-domain inhibition (nM)    C-domain inhibition (nM)
RXP A380 25.31         10,000                      3.0
Captopril 25.13 (a)    8.9                         14.0
Enalapril 25.18 (b)    26.0                        6.3
RXP407 25.30           2.0                         2,500
Lisinopril 25.19 (b)   44.0                        2.4
Keto-ACE 25.29         15,000                      40.0

(a) Fig. 25.9
(b) Fig. 25.11

the cleavage of other signal peptides. ACE inhibitors are usually well tolerated by
patients. Some adverse effects, however, have been described. For example, many
patients develop an unpleasant dry cough, and occasionally a life-threatening
angioedema (an acute swelling of the mucous membranes) can occur. It is
suspected that this is associated with the blocked degradation of the described
peptides, especially bradykinin. The catalytic activity of the C domain seems to
be responsible for the blood pressure regulation under in vivo conditions, and
angiotensin I is efficiently cleaved there. Bradykinin, on the other hand, is degraded
equally well by both domains. By using compounds that are selective for the
C domain, it might be possible to decrease blood pressure while leaving
a residual degradation of bradykinin intact. The excessive concentration of this
peptide could then be avoided. The structure determination of ACE therefore opens
a new perspective for the development of selective inhibitors that allow for an
efficient regulation of blood pressure according to an established principle. Hope-
fully, they will display fewer adverse effects.
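
The domain preferences listed in Table 25.5 can be condensed into one number per
compound, the ratio of the N-domain to the C-domain inhibition value. The following
sketch shows this bookkeeping. The table does not say whether the values are Ki or
IC50 data, so the ratios are only indicative, and the names INHIBITION_NM and
c_domain_selectivity are hypothetical.

```python
# (N-domain value, C-domain value) in nM, copied from Table 25.5
INHIBITION_NM = {
    "RXP A380": (10000.0, 3.0),
    "Captopril": (8.9, 14.0),
    "Enalapril": (26.0, 6.3),
    "RXP407": (2.0, 2500.0),
    "Lisinopril": (44.0, 2.4),
    "Keto-ACE": (15000.0, 40.0),
}


def c_domain_selectivity(n_domain_nm, c_domain_nm):
    """Ratio above 1 means a preference for the C domain, below 1 for the N domain."""
    return n_domain_nm / c_domain_nm


selectivity = {
    name: c_domain_selectivity(n_nm, c_nm)
    for name, (n_nm, c_nm) in INHIBITION_NM.items()
}
most_c_selective = max(selectivity, key=selectivity.get)
most_n_selective = min(selectivity, key=selectivity.get)

# Print from the most C-selective to the most N-selective compound
for name, ratio in sorted(selectivity.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:10s} N/C ratio = {ratio:10.4f}")
```

With these numbers RXP A380 prefers the C domain by more than three orders of
magnitude and RXP407 prefers the N domain to a similar extent, whereas captopril,
enalapril, and lisinopril differ between the domains by less than a factor of 20.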

25.6 Inhibitors of Matrix Metalloproteases: An Approach to Treating Cancer and Rheumatoid Arthritis?

The family of matrix metalloproteases (MMPs) belongs to the neutral zinc endopeptidases.
They assume important roles in the construction and degradation of connective
tissue, for example after an injury or during angiogenesis (the proliferation of
blood vessels). In a healthy state, these proteases are kept in balance by tightly
controlled mechanisms. In this way, active proteases are released from inactive
precursors only when they are needed, or our bodies have adequate endogenous
inhibitors that can mediate the balance between matrix synthesis and matrix
degradation. In a disease state, this complex equilibrium is thrown off balance,
and different MMPs are produced in excess. Pathological situations ensue that are
associated with the construction and degradation of the extracellular tissue.
The etiology of rheumatoid arthritis is based on such chronic destructive
processes that lead to loss of bone and cartilage. Cartilage tissue is composed of
a glycoprotein matrix that is cross-linked and reinforced by collagen. The MMPs
cleave such scaffold proteins. In rheumatoid arthritis, the balance between matrix
synthesis and degradation is obviously lost. Excessive activity of the matrix
metalloproteases leads to an overwhelming degradation of the cartilage. Inhibiting
these proteases could therefore be a promising approach to treat rheumatoid
arthritis. The degradation of the extracellular matrix is also critical for the growth
of malignant tumors, the invasiveness of tumor cells, metastasis, and angiogenesis.
Therefore, the inhibition of MMPs could also open a route to cancer therapy.
In the meantime, almost 30 MMPs are known, which include the collagenases
(MMP-1, -8, -13), gelatinases (MMP-2, -9), stromelysins (MMP-3, -10, -11),
matrilysin (MMP-7), macrophage metalloelastases (MMP-12, -19), and enamelysin
(MMP-20). The collagenases, gelatinases, and stromelysins recognize collagen as
a substrate. Collagen is composed of three intertwined, left-handed helical
chains (α chains). Each individual chain is more than 1,000 amino acids long and contains
the repeating sequence –(Gly–X–Y)n–, in which the position X is usually occupied
by a proline or an alanine, and the position Y is usually occupied by
a hydroxyproline or an alanine. The collagenases cleave collagen in its native,
threefold helical structure, gelatinases cleave collagen in a denatured form, and it is
assumed that stromelysins cleave the proteoglycans.
A series of different collagens are cleaved by collagenases between the glycine
and leucine or isoleucine residues. A substrate comparison between the species
human, cattle, mouse, and chicken showed that three amino acids to the right and
left of the cleavage site are conserved. Therefore, the N- or C-terminal-protected
hexapeptide Ac–Pro–Leu/Gln–Gly–Leu/Ile–Leu/Ala–Gly–OEt, for example, 25.32
(Fig. 25.13) is recognized as a minimal substrate. This established the starting point
for the design of collagenase inhibitors. The peptide bond to be cleaved in the
minimal substrate 25.32 is replaced with a non-cleavable isostere. The replacement
of the amide bond between Gly and Leu with a ketomethylene group –COCH2–,
a hydroxymethylene group –CH(OH)CH2– or a hydroxylamine derivative led to
inactive compounds in all cases. These groups are apparently unable to form

Fig. 25.13 Collagenase inhibitors made from substrate analogues. Compound 25.32 covers the
substrate sequence from P3 to P3′. Replacement of the amide bond by a –PO2– group 25.33 leads to
a potent inhibitor. Compound 25.34 contains only the three amino acids prior to the cleavage site
as well as the C-terminal hydroxamic acid as a zinc-binding group. Compounds 25.35 and 25.36
contain the three or two amino acid side chains following the cleavage site in their structures; this
time they are augmented with an N-terminal hydroxamic acid group. The two inhibitors
marimastat 25.37 and batimastat 25.38 were in clinical trials for several years as compounds for
tumor therapy. Values shown: 25.32, minimal substrate (cleavage site marked); 25.33, IC50 = 70 nM;
25.34, IC50 = 10 mM; 25.35 Ro 31-4724, IC50 = 9 nM; 25.36 Ro 31-9790, IC50 = 5 nM.

Fig. 25.14 Crystal structure of Ro 31-4724 (25.35, IC50 = 9 nM) and collagenase; the binding
mode is shown. The hydroxamic acid binds in a bidentate-like manner to the zinc ion. Both amide
groups form hydrogen bonds to the enzyme. The leucine side chain of the inhibitor in the P1′
position fills the S1′ pocket, which is oriented toward the protein’s interior. The alanine methyl
group binds in the S3′ pocket, whereas the leucine side chain in position P2′ protrudes into the
solvent because the S2′ pocket is practically non-existent. Labeled in the figure: Asn80, Glu119,
Tyr140, His122, His128, Pro138, Zn2+, and the pockets S1′, S2′, and S3′.

a favorable interaction with the zinc ion. The use of a phosphinate group finally gave
a potent collagenase inhibitor 25.33. However, if only the N terminal proline is
eliminated from this hexapeptide, the inhibitory activity is largely lost. The search
for collagenase inhibitors based on the N terminal tripeptide fragment led to modestly
active compounds such as 25.34. The synthesis of potential inhibitors that contain the
C terminal tripeptide sequence Leu–Leu–Gly–O-alkyl was much more successful.
The coupling of these structural elements with the potent head group hydroxamic
acid to bind the zinc ion gave collagenase inhibitors with nanomolar affinity such as
Ro 31-4724, 25.35 and Ro 31-9790, 25.36. The X-ray structure of 25.35 in complex
with human fibroblast collagenase was solved. As expected, the compound binds to
the zinc ion as a bidentate ligand. The leucine side chain in the P1′ position fills the S1′
pocket, and the alanine methyl group binds in the S3′ pocket. The leucine side chain
in position P2′, which should formally occupy the S2′ pocket, orients away from the
enzyme. The binding mode is shown in Fig. 25.14.
Interestingly, exchanging the isobutyl side chain at position P2′ for a tert-butyl
group in 25.36 led to an increase in affinity, even though the group is not in direct
contact with the enzyme. This result is attributed to a conformational stabilization.
The voluminous tert-butyl group limits the mobility of the inhibitor so that the

conformation assumed in the enzyme is still energetically favorable. Compound
25.36 showed some activity after oral application in an animal model and was
chosen for clinical trials as a drug to treat arthritis. The structurally similar
inhibitors marimastat 25.37 and batimastat 25.38 from British Biotech were devel-
oped for many years as broad-spectrum MMP inhibitors for the treatment of tumors.
In the meantime, a great many lead structures in the field of matrix metalloproteases
have been discovered and further developed into potent inhibitors. Some of these
substances (25.39–25.48), which are all derived from hydroxamic acids and have
hardly any recognizable peptide character, are listed in Fig. 25.15. To date, none of
these compounds have successfully found their way through clinical trials to market
maturity. The results of the clinical trials were rather sobering. The development
product tanomastat 25.47 from Bayer to prevent angiogenesis, tumor growth, and
metastasis behaved worse than the placebo in the clinic. CGS 27023A 25.48 from
Novartis did not do much better.
Why was the drug development unsuccessful? One of the reasons could be an
inadequate selectivity of the development compounds. At the time these substances
were developed, only a few of the relevant MMPs were known. The MMPs are all
very similar to one another, and overlapping substrate profiles can be observed. To
some extent, another member of the family can take over the role of the protease that
has been inhibited. If the proteases are compared among themselves, it is conspicuous
that virtually only the S1′ pocket is deeply buried. All of the other pockets, S3, S2, S1,
S2′, and S3′, are relatively shallow and easily accessible from the solvent. Furthermore,
it has been shown that the S1′ pocket is highly adaptive toward bound substrates
and inhibitors. This can indeed be an opportunity for the development of
selective inhibitors, but as a rule, the development of drugs for such pockets is not
easy. It could be shown for the collagenase MMP-1 that a significantly larger S1′
pocket could be opened because of conformational rearrangement around Arg214
(Fig. 25.16). In the initially known conformation there was enough room for a sec-butyl
group such as in 25.49. After rearrangement of the arginine, a much longer
biarylether group (cf. 25.50) could be accommodated in the S1′ pocket!
Another complication is that there is a second family of zinc
proteases, the ADAM family (a disintegrin and metalloprotease, also called
adamalysins), that has low sequence homology with the MMPs but very similar catalytic
sites. The family was discovered only after the first MMP inhibitors
were in clinical trials. TNF-α-converting enzyme (TACE) belongs to this family.
Blocking this enzyme exerts an influence on the function of TNF-α, the
proinflammatory cytokine that plays a pivotal role in immune response. The
enzyme itself is being investigated as a target structure for drug therapy for
autoimmune disease. Cross-reactivity with MMP inhibitors is not desired. Unfor-
tunately it has also been shown that MMP inhibitors have no effect against
advanced and late tumors. In the beginning of MMP research, however, models
from an early phase in tumorigenesis were worked on.
Only time will tell whether the selectivity problems associated with the MMP
family can be solved and whether interesting clinical candidates can be developed
by this route that find their way into therapy.

Fig. 25.15 Development candidates 25.39–25.48 from different companies as potent MMP
isoenzyme inhibitors. Hydroxamates, inverse hydroxamates, and carboxylates were used as
anchor groups for the zinc ion. Tanomastat 25.47 from Bayer and CGS 27023A 25.48 from
Novartis were clinically developed for several years. Structures shown: 25.39 (R = Ph,
2-Pyridyl, N-Morpholino-ethyl), 25.40 to 25.46, 25.47 Tanomastat (BAY 12-9566), 25.48 CGS 27023A.

25.7 Carbonic Anhydrases: Catalysts of a Simple but Essential Reaction

Another group of zinc-dependent enzymes that share a very similar catalytic
mechanism with the zinc proteases are the carbonic anhydrases (CAs). They catalyze
a very important reaction in our bodies, the fixation of carbon dioxide as
586 25 Inhibitors of Hydrolyzing Metalloenzymes

[Chemical structures of inhibitors 25.49 and 25.50 and the crystal structure image are not
reproduced here.]

Fig. 25.16 Crystal structure of the collagenase MMP-1 with two different inhibitors 25.49 and
25.50. Because of a conformational rearrangement of Arg214, the S1′ pocket can accommodate
inhibitors with voluminous groups. This adaptive ability of the specificity pockets in MMPs
makes the development of selective inhibitors extremely difficult.

bicarbonate, or the back-reaction for the release of CO2. In total, four different
families of these enzymes are known, which are called the α-, β-, γ-, and δ-CAs.
Sixteen isoforms of the α-family occur in mammals. Some are found in the cytosol and
some are membrane anchored. They are involved in many physiologically important
processes such as respiration, CO2/HCO3− transport between metabolizing tissue
and the lungs, pH homeostasis, electrolyte secretion, biochemical reactions that need
C1 building blocks, bone resorption and calcification, and tumor growth.
The zinc ion is found at the end of a funnel-shaped catalytic site in the α-carbonic
anhydrases. It is held in position by three histidine residues. A water molecule is found
at the fourth coordination site. This water is strongly polarized by coordination to the
Zn2+ and is most probably present as an OH− ion. Furthermore, the OH group of Thr199
serves as a hydrogen-bond acceptor for it (Fig. 25.17). The proton of the OH
group of Thr199 forms an H-bond to the carboxylate group of Glu106. The water (or
OH− ion), which has strongly enhanced nucleophilicity, attacks a CO2 molecule, which
is positioned in a hydrophobic niche in the vicinity of Val121, Val143, and Leu198 at
the bottom of the binding pocket. One of the oxygen atoms of the CO2 finds a
hydrogen-bonding partner in the NH function of Thr199. The newly formed bicarbonate is

Fig. 25.17 The catalytic site in α-carbonic anhydrases is found at the end of a funnel-shaped
binding pocket. There, an OH− ion coordinated to the Zn2+ ion nucleophilically attacks
a CO2 molecule. Bicarbonate forms, which is held in place by Thr199 (left). A sulfonamide,
deprotonated at nitrogen, fits at the site of the bicarbonate in the very narrow binding pocket
(right). Because of the tetravalency of the sulfur, this site can carry another substituent, as is
shown here with a p-fluorophenyl group.

displaced from the temporarily pentacoordinated zinc ion, and a new water molecule
adopts its position at the zinc ion. A new catalytic cycle can begin. CA II is one of the
fastest enzymes known. The uptake or release of a proton is the rate-limiting step
in the reaction cycle. For this, carbonic anhydrases have a series of histidine
residues that relay the protons to and from the edge of the funnel-shaped binding pocket.
At the same time, this arrangement makes the funnel amphiphilic: one side is
hydrophobic, and the other is hydrophilic. The very narrow area around the catalytic
zinc ion affords only enough space for CO2 and HCO3−. Putative inhibitors must, on the
one hand, be able to form interactions equivalent to those of bicarbonate and, on the
other hand, occupy the funnel opening. In addition to ions such as cyanide,
thiocyanate, or isocyanate, it is above all the sulfonamides, sulfamates, and
sulfamides that have the appropriate head group for coordination in the catalytic site.
The amino group on these sulfur derivatives is acidic enough to easily release a proton
and coordinate to the zinc ion in a charged state, analogously to the OH− ion. The
remaining proton interacts with the threonine OH group. An oxygen
atom of the SO2 function satisfies the NH function of the same amino acid. The second
S═O group expands the coordination number at zinc to five. In most known inhibitors,
the fourth bond of the central sulfur atom leads to an aromatic carbon that is part of
a heterocyclic ring system. In further examples there is an additional oxygen or nitrogen
atom as a linker to this heterocycle.
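
The role of acidity in this binding mode can be illustrated with a small calculation. The
following Python sketch applies the Henderson–Hasselbalch relationship to estimate which
fraction of an acidic group is deprotonated at physiological pH. The pKa values are round,
illustrative assumptions and are not taken from this chapter; the function name is likewise
hypothetical.

```python
def fraction_deprotonated(pka: float, ph: float = 7.4) -> float:
    """Fraction of an acid HA present as A- at the given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Illustrative pKa values; these are assumptions, not data from the text.
assumed_pka = {
    "bulk water": 15.7,
    "metal-bound water (assumed)": 7.0,
    "weakly acidic sulfonamide NH2 (assumed)": 10.0,
    "more acidic sulfonamide NH2 (assumed)": 7.2,
}

for name, pka in assumed_pka.items():
    share = fraction_deprotonated(pka)
    print(f"{name:42s} pKa {pka:5.1f}  deprotonated at pH 7.4: {share:.2e}")
```

Under these assumptions, a group with a pKa near 7 is largely deprotonated at pH 7.4,
whereas a group with a pKa near 10 is deprotonated only to a fraction of a percent in
free solution.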
In the case of carbonic anhydrases, the coordination of the ligands to the zinc ion
in the catalytic site is essential for good binding. In this way, small ligands such as
phenyl sulfonamide 25.51 or its isostere thiophene-2-sulfonamide 25.52 achieve
submicromolar inhibition of carbonic anhydrase II (Fig. 25.18). More than 50 years
ago, the replacement of these aromatic rings with other heterocycles led to the first
marketed products, which were introduced to therapy as sulfonamides under the

[Chemical structures of compounds 25.51–25.64, not reproduced here. Labels recoverable from
the figure: 25.51 (Ki = 300 nM), 25.52, 25.53 acetazolamide, 25.54 methazolamide, 25.55,
25.56 MK 927 (Ki = 0.7 nM), 25.57 dorzolamide (Ki = 0.37 nM), 25.58 ethoxzolamide,
25.59 azosemide, 25.60 furosemide, 25.61 brinzolamide, 25.62 celecoxib, 25.63 topiramate,
25.64 saccharin.]

Fig. 25.18 The small aromatic sulfonamides 25.51 and 25.52 bind to carbonic anhydrase II with
submicromolar affinity. Exchanging the aromatic ring for a heterocycle gives acetazolamide
25.53 and methazolamide 25.54. Both drugs were used for a long time as systemic carbonic
anhydrase inhibitors for diuresis and for the treatment of glaucoma. Compound 25.55 was the
first topically active CA inhibitor, that is, the first usable as eye drops. The structure-based
design of new inhibitors led to the marketed product dorzolamide 25.57 by way of 25.56.
Compounds 25.58–25.61 are further drugs that inhibit carbonic anhydrases and are used for the
treatment of glaucoma or as diuretics. Even celecoxib 25.62, topiramate 25.63, and the
artificial sweetener saccharin 25.64 inhibit carbonic anhydrases, which explains some of their
observed side effects.

names acetazolamide 25.53 and methazolamide 25.54. In 1954 acetazolamide
represented the first mercury-free diuretic (▶ Sect. 30.9). It was also used as
a systemic treatment for glaucoma. Glaucoma is an eye disease that leads to
visual field loss and, in severe cases, to blindness. It is caused by insufficient
drainage of the aqueous humor from the eye. Because of this, pressure builds in
the eye that damages the optic nerve if left untreated. Carbonic anhydrase II
inhibitors reduce the production of aqueous humor and can reduce the internal
pressure in the eye. Acetazolamide 25.53 and methazolamide 25.54 were used for
many years to treat glaucoma. They must be systemically administered. A direct
application in the form of eye drops does not work because the compounds cannot
penetrate the eye externally. The systemic application and low selectivity regarding
the different isoforms of carbonic anhydrase mean that these enzymes are also
inhibited outside of the eye. Undesirable side effects are the consequence. Therefore,
both of these compounds have largely disappeared from therapy today. For
a long time it was assumed that carbonic anhydrase inhibitors could not be used as
eye drops because of their unfavorable physicochemical properties. In 1983, to
general surprise, the topically active carbonic anhydrase inhibitor 25.55 was
reported for the first time. The single exchange of a methyl for a trifluoromethyl
group caused this transformation! As a consequence of this discovery, the lipophilicity
range within which topical application is possible has been characterized for a large
number of carbonic anhydrase inhibitors.
The development of the active substance dorzolamide 25.57 resulted. Its design
is indeed the first example of a drug that was optimized with the help of structure-based
design by using the crystal structure. After the X-ray structure of carbonic
anhydrase II became available, the structure-based design of carbonic anhydrase
inhibitors was undertaken in the mid-1980s at Merck Sharp & Dohme. The first
active compound from this effort was thienothiopyran-sulfonamide 25.56 (MK
927). It binds to carbonic anhydrase with a subnanomolar inhibition constant
(Ki = 0.7 nM). The crystal structure with the enzyme shows the expected coordination
of the sulfonamide group to the zinc ion in the active site. Aside from
hydrogen bonds, the inhibitor forms hydrophobic interactions with the protein. The
observation that the isopropylamino group adopts an energetically unfavorable
axial position on the ring in the bound state was a surprise. The compound only
fits into the binding pocket in this unfavorable conformation. To improve the
affinity to the enzyme, a modification of the molecule was planned that decreased
the energetic difference between the equatorial and axial orientation of the side
chain. This was accomplished by introducing another methyl group onto the six-membered
ring. To balance out the increased lipophilicity, the isopropylamino
group was shortened to an ethylamino group. The result of this modeling was dorzolamide
25.57. It binds with Ki = 0.37 nM to carbonic anhydrase II. Dorzolamide
successfully passed all clinical trials. Since 1995, it has been marketed under the
name Trusopt® and was the first marketed topically active carbonic anhydrase
inhibitor for the treatment of glaucoma. A few other important drugs (25.58–25.61)
that inhibit carbonic anhydrase are shown in Fig. 25.18. They serve as
diuretics, glaucoma treatments, antiepileptics, altitude sickness treatments, gastric
ulcer disease treatments, or treatments for ankylosing spondylitis (also known as
Bechterew’s disease, a chronic autoimmune inflammatory disease that leads to
spinal fusion). Because tumors require an acidic milieu, carbonic anhydrases such
as CA IX and CA XII could be responsible for maintaining these conditions.
Therefore, they represent potential targets for tumor therapy because the acidic
homeostasis would be disrupted by CA inhibition. The 15 human isoenzymes of
α-carbonic anhydrase that have been characterized until now are highly homologous.
Small differences, for example, the exchange of a threonine for a histidine at
position 200, distinguish the isoform CA I from CA II. Drugs must exploit these
differences to achieve selectivity between these isoforms (▶ Sect. 18.14).
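
The inhibition constants quoted above and in Fig. 25.18 can be translated into binding free
energies with the relationship ΔG° = RT ln Ki. The following Python sketch does this for
25.51, MK 927, and dorzolamide. The temperature of 298.15 K, the 1 M standard state, and
the function name are assumptions made for illustration only.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ/(mol K)


def binding_free_energy(ki_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol from Ki in mol/L (assumes 1 M standard state)."""
    return GAS_CONSTANT_KJ * temperature_k * math.log(ki_molar)


# Ki values as quoted in the text and in Fig. 25.18, converted to mol/L.
ki_values = {
    "25.51": 300e-9,
    "25.56 (MK 927)": 0.7e-9,
    "25.57 (dorzolamide)": 0.37e-9,
}

for compound, ki in ki_values.items():
    print(f"{compound:22s} Ki = {ki:.2e} M  dG = {binding_free_energy(ki):6.1f} kJ/mol")

# Free energy gained by the step from MK 927 to dorzolamide.
gain = binding_free_energy(0.37e-9) - binding_free_energy(0.7e-9)
print(f"MK 927 -> dorzolamide: {gain:.2f} kJ/mol")
```

Under these assumptions, the step from 0.7 nM to 0.37 nM corresponds to a gain of roughly
1.6 kJ/mol, which shows how small an energetic contribution suffices to halve an inhibition
constant.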
In the meantime, a few very surprising adverse effects of known drugs can be
attributed to carbonic anhydrases. To improve the solubility, terminal sulfonamide
groups have often been incorporated in active substances as functional groups. The
analgesic celecoxib 25.62 is a cyclooxygenase II inhibitor (▶ Sect. 27.9). It is also
able to bind to carbonic anhydrase with nanomolar affinity through its sulfonamide
group. In patients with familial adenomatous polyposis (FAP), a disease that leads
to the development of polyps in the colon, a reduction in the number of tumors was
clinically observed in patients undergoing therapy with celecoxib. This result could
be consistent with carbonic anhydrase inhibition. One side effect of the
antiepileptic drug topiramate 25.63 is, among others, loss of appetite. As
a sulfamate, this compound is a potent inhibitor of mitochondrial CA V.
This isoenzyme is involved in de novo lipogenesis in the mitochondria. This observation
led to the in-depth investigation of CA V inhibition as a possible therapeutic principle
for obesity therapy. Even the very old and widely used artificial sweetener saccharin 25.64,
which contains a cyclic sulfonamide unit, can inhibit some carbonic anhydrases very
strongly. It is known that other clinically used carbonic anhydrase inhibitors leave
an unpleasant metallic aftertaste, just as saccharin does. This property is suspected
to be due to inhibition of CA VI, which is produced in the oral cavity. Its inhibition
influences the pH value and can thereby cause a bitter taste.
Presumably, other drugs with terminal sulfonamide groups also have effects on
carbonic anhydrases. Only time will tell whether the great problem of achieving
sufficient selectivity within this enzyme class can be resolved.

25.8 A Case for Two: Zinc and Magnesium in the Catalytic Centers of Phosphodiesterases

Phosphodiesterases (PDEs) represent a class of metalloenzymes with at least 12
gene families that hydrolyze the intracellularly formed second messengers cAMP
25.65 and cGMP 25.67 (cyclic AMP and GMP) to the open-chain analogues
(Fig. 25.19). They are broadly distributed in different tissues and organs and control
important processes in the regulation of calcium channels, the sense of smell,
platelet aggregation, aldosterone release, cell proliferation, myocardial contractility,
insulin release, inflammatory modulation, smooth-muscle contraction, mood,

[Reaction scheme, structures not reproduced here: PDEs hydrolyze cAMP 25.65 to AMP 25.66
and cGMP 25.67 to GMP 25.68.]

Fig. 25.19 cAMP 25.65 and cGMP 25.67 are hydrolyzed into their open-chain analogues AMP
25.66 and GMP 25.68, respectively, by phosphodiesterases.

penile erectile function, or muscle metabolism. The sequences of the members of the
family are highly conserved among themselves.
The crystal structures of PDE 4 and PDE 5 were the first to be solved. By now, eight
PDEs have been crystallographically characterized. Whereas inhibition of PDE 4
could lead to treatments for asthma, chronic obstructive pulmonary disease, or
autoimmune disease, PDE 5 inhibitors have been developed to treat erectile dysfunction.
The PDE 5 enzyme is expressed in different tissues and is specific for the hydrolysis
of cGMP. In addition to the zinc ion, which is essential for the hydrolytic cleavage,
a magnesium ion is found at the active site. The zinc ion is coordinated
by two histidine and two aspartic acid residues. A water molecule is found at the
fifth position; together with one of the two aspartic acid residues, it forms a bridge
to the magnesium ion (Fig. 25.20). The remaining coordination sites of the octahedrally
surrounded Mg2+ are occupied by water molecules. The Zn2+ also prefers an
octahedral geometry in the phosphodiesterases. For this, the sixth coordination
position is occupied by a water molecule. This water presumably adopts the
role of the nucleophilic OH− for the hydrolytic cleavage of the cyclic
phosphodiester.
Three PDE 5 inhibitors were brought to the market as drugs to treat erectile
dysfunction. Aside from sildenafil 25.69 (Viagra ®), the first to be introduced by
Pfizer, vardenafil 25.70 (Levitra ®) and tadalafil 25.71 (Cialis ®) have passed clinical
trials (Fig. 25.21). Interestingly, these inhibitors bind to the catalytic site of PDE 5,
but do not make direct contact with the zinc ion (Fig. 25.20). In fact, the binding of
the basic nitrogen to the metal ion is mediated by two water molecules. The
pyrazolopyrimidinone moiety in sildenafil replaces the analogous group in the

[Crystal structure image not reproduced here; labeled features: Gln817, Phe820, His653,
Phe786, Zn2+, Mg2+.]

Fig. 25.20 Crystal structure of sildenafil 25.69 (Fig. 25.21) in PDE 5. The pyrazolopyrimidinone
moiety of the inhibitor is recognized by Gln817 through two parallel hydrogen bonds, and the
inhibitor binds to the catalytic zinc ion (blue-gray) through a water molecule. The zinc ion is
found in the vicinity of a magnesium ion (light green), which is coordinated by five water
molecules and Asp654. A bridging water molecule is shared by Mg2+ and Zn2+.

natural substrate cGMP. The relationship with cGMP is even more apparent when it
is considered that 2-phenyl-substituted purines such as 25.72 served as lead
structures (Fig. 25.21). The pyrazolopyrimidine 25.73 and the imidazotriazinone 25.74
that are contained in sildenafil and vardenafil, respectively, were developed from them.
The chemically closely related vardenafil adopts a binding mode very similar to that of
sildenafil. On the other hand, the structurally deviating tadalafil adopts a distinctly
different orientation.
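
Potencies such as the IC50 values given for the lead structures in Fig. 25.21 are often
compared on a logarithmic scale. The Python sketch below converts them to pIC50 values
(the negative decadic logarithm of the IC50 in mol/L). The helper name is hypothetical,
and the numbers are those printed in the figure.

```python
import math


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in nmol/L."""
    return -math.log10(ic50_nm * 1e-9)


# IC50 values (nM) of the lead structures as printed in Fig. 25.21.
lead_structures = {"25.72": 10.0, "25.73": 40.0, "25.74": 5.0}

for compound, ic50_nm in lead_structures.items():
    print(f"{compound}: IC50 = {ic50_nm:5.1f} nM  pIC50 = {pic50_from_nanomolar(ic50_nm):.2f}")
```

On this scale, one unit corresponds to a tenfold change in potency, so the three lead
structures lie within about one pIC50 unit of each other.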
The discovery of the effects of sildenafil was once again accomplished by
serendipity. The compound was in clinical trials at Pfizer for the treatment of
angina pectoris. It proved, however, to be no better than the classic nitro
compounds (e.g., nitroglycerin or isosorbide dinitrate). These nitro derivatives
release NO under reductive conditions, which stimulates guanylate cyclase.
cGMP is then formed, which in turn exerts an influence on vascular constriction.
A phosphodiesterase inhibitor also increases the cGMP level because it blocks the
degradation of this second messenger. In the clinical trials, however, one side effect
in the male subjects proved to be remarkable: the compound stimulated penile erections.
NO is released into the cavernous body of the penis, and increased cGMP is produced by
activation of guanylate cyclase. This causes increased blood flow to the cavernous
body and stimulates penile erection. Sildenafil amplifies the effect by inhibiting the
degradation of cGMP. In 1998 sildenafil was approved for the treatment of erectile
dysfunction. The market accepted Viagra® euphorically. By 2005, more than

[Chemical structures of 25.69 sildenafil, 25.70 vardenafil, 25.71 tadalafil, and the lead
structures 25.72 (IC50 = 10 nM), 25.73 (IC50 = 40 nM), and 25.74 (IC50 = 5 nM), not
reproduced here.]

Fig. 25.21 Sildenafil 25.69, vardenafil 25.70, and tadalafil 25.71 represent potent PDE 5
inhibitors. The first two compounds were developed from phenyl-substituted purines such as
25.72, which were modified to pyrazolopyrimidines such as 25.73 or imidazotriazinones such
as 25.74.

177 million prescriptions in 120 countries around the world had been registered.
In addition to PDE 5, PDE 6 is also inhibited by sildenafil, vardenafil, and tadalafil.
This isoform is involved in visual processes, which explains why the
use of these drugs can be accompanied by visual disturbances. Tadalafil has better
selectivity against PDE 6, but inhibits PDE 11 in addition to PDE 5. Another
clinical application has been approved for sildenafil and tadalafil. They are used
in intensive care units to prevent and treat pulmonary hypertension in mechanically
ventilated patients.
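
The reasoning that blocked degradation raises the cGMP level can be made concrete with
a deliberately simple steady-state model. The Python sketch below assumes constant cGMP
synthesis, first-order degradation by PDE 5, and a competitive inhibitor at substrate
concentrations far below Km, so that the residual enzyme activity is 1/(1 + [I]/Ki). All
rate constants, concentrations, and the Ki value are hypothetical placeholders, not
measured data, and the function names are chosen for illustration only.

```python
def residual_pde_activity(inhibitor_nm: float, ki_nm: float) -> float:
    """Fraction of PDE activity left with a competitive inhibitor (assumes [S] << Km)."""
    return 1.0 / (1.0 + inhibitor_nm / ki_nm)


def steady_state_cgmp(synthesis_rate: float, degradation_constant: float,
                      inhibitor_nm: float = 0.0, ki_nm: float = 1.0) -> float:
    """Steady state of d[cGMP]/dt = synthesis - k_deg * activity * [cGMP]."""
    effective_constant = degradation_constant * residual_pde_activity(inhibitor_nm, ki_nm)
    return synthesis_rate / effective_constant


# Hypothetical parameters in arbitrary units; none of them come from the text.
SYNTHESIS_RATE = 1.0   # concentration units per second
K_DEGRADATION = 0.5    # per second, uninhibited enzyme
KI_NM = 4.0            # assumed inhibition constant in nM

for inhibitor_nm in (0.0, 4.0, 40.0):
    level = steady_state_cgmp(SYNTHESIS_RATE, K_DEGRADATION, inhibitor_nm, KI_NM)
    print(f"[I] = {inhibitor_nm:5.1f} nM  steady-state cGMP = {level:5.1f} (arbitrary units)")
```

In this simplified picture the steady-state cGMP level scales with (1 + [I]/Ki), so an
inhibitor concentration equal to Ki doubles the level of the second messenger.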
Do PDE 5 inhibitors have another career? What helps men apparently also gives
cut flowers more stamina. According to experiments by Heribert Warzecha at the TU
Darmstadt, cut daisies stay fresh longer when Viagra® is added to the water in the
vase! Aspirin®, however, is less expensive and also supposedly keeps cut flowers
fresh longer. In another study it was found that hamsters could reset their circadian
rhythm faster when they had Viagra® in their blood. A higher cGMP level apparently
helps the internal clock to adjust more easily to changes in external conditions.
Whether Viagra® also helps to overcome jet lag after long-distance travel remains to be
demonstrated. The examples show that no drug is without side effects. Often these are
discovered only after some time in clinical trials or in practical use.

25.8.1 What Zinc Can Do, Iron Can Too

What makes zinc so special that it preferentially occurs in the catalytic site of so
many enzymes? Zinc is an ion that often occurs in biological systems. This is also
true of an element like iron. Zinc exists as a doubly positively charged ion. This,
however, is also the case for other ions such as Fe2+, Co2+, Ni2+, or Cu2+. In
contrast to the latter elements, the zinc ion is not redox sensitive because
of its filled d-orbitals. If the reaction mechanism of an ester or amide cleavage is
considered, aside from the coordination properties, only the charge of the metal ion
is critical. It serves to polarize a water molecule that initiates the nucleophilic attack
on the carbonyl carbon atom of the ester or amide to be cleaved. This task can also
be assumed by other metal ions. In fact, under reductive conditions, hydrolyzing
enzymes can be found that have an iron instead of a zinc ion in the catalytic site.
New polypeptide chains that are synthesized in prokaryotes, mitochondria, or
plastids initially carry a methionine at the first position of the N terminus that is
substituted with a formyl group. In other compartments of more complex organisms
the same proteins are formed without these formyl groups. The methionine is
cleaved by a methionine aminopeptidase in about a third of all mature proteins.
The formyl group must be removed so that the formylated chains can also undergo
this process. This is achieved by peptide deformylases (PDFs). They carry an Fe2+
ion in their catalytic site and are therefore exceedingly sensitive to oxidation. An
exchange for Ni2+ or Co2+ is achieved only with a drastic concomitant loss in
catalytic activity. On the other hand, the exchange of iron for a Zn2+ ion leads to
a complete loss in enzymatic function in almost all PDFs. Peptide deformylases
occur in bacteria as well as plastids from plants and some parasites. Initially it was
thought that these enzymes do not occur in humans, so that they would seem to be
an ideal target structure for an antibacterial or antiparasitic therapy. In the
meantime, PDFs have also been discovered in the mitochondria of animals and in
humans. This must be considered when developing antibiotics based on PDF
inhibitors. The potent inhibitor actinonin 25.75 (Fig. 25.22) has not only
antibacterial effects but also inhibits proliferation in human cells. This can lead to
cytotoxic side effects, but can also be exploited for antineoplastic effects. More-
over, these inhibitors have importance as herbicides.
The iron ion is tetrahedrally coordinated by two histidines and one cysteine
in PDFs (Fig. 25.22). A water molecule occupies the fourth position. The pKa
value of this water molecule is drastically shifted by the direct coordination to the
metal ion; because it is easily deprotonated, the water gains nucleophilicity.
It presumably attacks the formyl peptide group being cleaved as a hydroxide ion.

Fig. 25.22 Crystal structure of actinonin 25.75 with the peptide deformylase from
Escherichia coli. The peptidic inhibitor binds to the Fe2+ ion with its hydroxamate
function. Its n-pentyl chain replaces the methionine side chain in the natural substrate
and lies in the deeply buried S1′ pocket. The iron ion is bound to a cysteine and two
histidines.
[Chemical structure of 25.75 actinonin and the crystal structure image, with the labels
Arg97, His132, Fe2+, and Ile44, are not reproduced here.]

The mechanism is very similar to that of the proteases. The carbonyl carbon
atom of the formyl group being cleaved passes through a tetrahedral transition state.
The charge that forms on the oxygen is stabilized by an NH of the main
chain, the terminal carboxamide group of a glutamine, and coordination to the
iron. The amino group of the bond to be cleaved is bound to a glutamate by an
H-bond. The polypeptide chain is cleaved with concomitant release of the
N terminus. The remaining formate group leaves the coordination site at the
metal ion and dissociates from the enzyme. Two water molecules take its place at
the catalytic site. Inhibitors of this enzyme have hydroxamate groups to anchor
them to the iron ion. Because the natural peptide substrate has a methionine in the
P1′ position, n-alkyl chains with four or five carbon atoms are ideal in the same
position on inhibitors. The S1′ pocket is well formed in the PDFs, but the surrounding
pockets are not well defined. This is because of the function of the
proteins: a broad palette of formylated substrates must be processed, that is, the
amino acid sequence after the formyl methionine is arbitrary. Interestingly,
thiorphan also inhibits PDFs. This underscores that a thiol group can
coordinate to the iron atom. The benzyl group of the inhibitor fills the S1′ pocket
of the enzyme.

Another group of deacetylating enzymes that actually belongs to the group of
transferases is being worked on as a target structure for the inhibition of cell
proliferation in cancer therapy. Inhibitors of such histone deacetylases, which
carry out the cleavage of acetyl groups from the terminal nitrogen of a lysine by
splitting an amide bond, can lead to apoptosis in tumor cells. The structure of
chromatin in the cell nucleus is changed by deacetylation so that the DNA is bound
more strongly to the histone (▶ Sect. 12.13). Histone deacetylases contain a zinc
ion in their active site that is responsible for the hydrolytic function. Inhibitors of
these enzymes are also derived from hydroxamates.

25.9 Synopsis

• In metalloproteases, a positively charged metal ion, usually a zinc ion, activates
a coordinated water molecule, which nucleophilically attacks the peptide bond to
be cleaved. Through expansion of its coordination sphere, the zinc ion also
polarizes the carbonyl group of the amide bond to be cleaved, and an adjacent
glutamate residue helps in the transfer of protons.
• Potent inhibitors exhibit appropriate functional groups to coordinate the zinc ion
efficiently; they also address the specificity pockets on the primed side that
recognize the C terminal part of the substrate to be cleaved.
• Angiotensin-converting enzyme (ACE) transforms angiotensin I to II by cleav-
ing a C terminal dipeptide. Rational design concepts resulted in dipeptide
mimetics with a carboxylate group at a proline-like moiety and a zinc-
coordinating group at the opposite end. Captopril was the first compound
introduced to therapy; a large number of ACE inhibitors followed.
• The target protease ACE is composed of two slightly different catalytic domains.
Aside from angiotensin I, ACE degrades other peptides such as the blood-pressure-decreasing
bradykinin. Undesired side effects of ACE inhibitors are
related to this degradation. Because the two domains of ACE show different
substrate profiles that can be translated into selective inhibition, the possibility
exists to develop domain-selective active agents with efficient blood-pressure regulation
properties that avoid the unwanted adverse effects.
• Matrix metalloproteases (MMPs) are a large family of structurally related
neutral zinc endopeptidases. They are involved in the construction and degra-
dation of connective tissue. Several therapeutic indications have been proposed
such as rheumatoid arthritis or cancer.
• The development of selective MMP inhibitors proved to be extremely difficult.
The adaptive nature of the binding pockets of this protein class has proven to be
challenging, and the pronounced overlapping substrate profiles are problematic
because different members of the family can mutually take over the role of the
protease being inhibited.
• Carbonic anhydrases are hydrolases that transform carbon dioxide to bicarbonate.
They are involved in important processes such as respiration, CO2 transport, pH
homeostasis, electrolyte secretion, C1 building block delivery, bone resorption
and calcification, and tumor growth.
• Due to the narrow funnel-shaped architecture of the enzyme with the catalytic
zinc ion at the end, almost all α-carbonic anhydrase inhibitors feature a terminal
sulfonamide group. In particular, diuretics and anti-glaucoma agents, which
reduce the internal eye pressure, have been brought to market.
• Phosphodiesterases (PDEs) are a family of metalloenzymes that hydrolyze
the intracellularly formed second messengers cAMP and cGMP. They are broadly
distributed in different tissues and regulate many important processes.
• Inhibitors of PDE 5 such as sildenafil, originally developed for the treatment of
angina pectoris, proved to be agents to stimulate penile erections via the inhibi-
tion of cGMP degradation.
• Under reductive conditions, iron (II) ions can be found instead of zinc ions in the
catalytic center of hydrolyzing enzymes. In peptide deformylases the iron ion
takes a similar role as the zinc ion and helps to remove the formyl group that is
found at the first position of the N terminus of a newly formed polypeptide chain
in prokaryotes, mitochondria, or plastids. Inhibitors of peptide deformylases are
either potential antibiotics or can be exploited for antineoplastic effects.

26 Transferase Inhibitors

At the end of the 1970s, evidence accumulated that proteins are not only synthesized at the
ribosomes, but can also undergo subsequent changes after their synthesis. In addition to
glycosylation, phosphate groups are attached to the alcohol functions of serine, threonine,
and tyrosine residues. Later it was recognized that even histidine can be phosphorylated.
Moreover, it was shown that the degree of phosphorylation of a protein can change
dramatically with time in the cell. Cellular reproduction proved to be strongly dependent on
these changes. It was therefore an obvious step to correlate phosphorylation with
intracellular signaling processes. ATP was established as the source of the transferred
phosphate groups. However, a phosphate group of ATP cannot simply be transferred to an amino
acid: this reaction is kinetically too slow in aqueous solution. Therefore, Nature developed
efficient catalysts for this task: the protein kinases. Conversely, the cleavage of
a phosphate group from a phosphorylated amino acid is also a very slow process. It likewise
requires efficient enzymes, for which the phosphatases are available. In this way, protein
phosphorylation is a reversible process that can be “switched” in both directions by these
two enzyme classes (Fig. 26.1). Although these enzymes catalyze very general reactions, their
substrate recognition is highly specific. Only in this way can the signal transduction
processes be precisely controlled and the protein function be switched on and off.
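The reversible switch described above can be illustrated with a deliberately simple model. The sketch below assumes that the kinase and the phosphatase each act with a single first-order rate constant on one substrate protein; the rate constants and the function name are hypothetical and are not taken from the text. Under this assumption, the steady-state fraction of phosphorylated protein is the kinase rate divided by the sum of both rates.

```python
def phosphorylated_fraction(k_kinase: float, k_phosphatase: float) -> float:
    """Steady-state fraction of phosphorylated protein in a two-state switch.

    Assumed model (not from the text): unphosphorylated -> phosphorylated with
    first-order rate k_kinase, and the reverse step with rate k_phosphatase.
    At steady state, fraction = k_kinase / (k_kinase + k_phosphatase).
    """
    if k_kinase < 0 or k_phosphatase < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_kinase + k_phosphatase
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_kinase / total


if __name__ == "__main__":
    # Hypothetical rate constants in arbitrary but identical units.
    for k_kin, k_phos in [(9.0, 1.0), (1.0, 1.0), (1.0, 9.0)]:
        fraction = phosphorylated_fraction(k_kin, k_phos)
        print(f"k_kinase={k_kin}, k_phosphatase={k_phos}: fraction={fraction:.2f}")
```

In this picture, a kinase inhibitor lowers the kinase rate and thereby shifts the switch toward the unphosphorylated state. Real signaling cascades are considerably more complex.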
The range of posttranslational modifications is by no means exhausted with these
examples. In bacteria, each newly synthesized protein carries an N-formylmethionine at its
N terminus. Initially this formyl group is cleaved by a deformylase (▶ Sect. 25.9)
before a methionine aminopeptidase removes the methionine residue from the
peptide chain of many proteins. The attachment of sugar residues (glycosylation)
not only improves the solubility and proteolytic stability of the protein, but above all
serves to label proteins with crucial recognition characteristics for
signaling and intracellular transport processes. In particular, sugar residues are of
crucial importance for cell–cell recognition and interactions with the extracellular
matrix (▶ Sect. 31.3). The transglutaminases, which posttranslationally crosslink
proteins by forming isopeptide bonds between glutamine and lysine side chains,

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_26, 599



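[Figure 26.1: scheme. On signal input, a kinase transfers a phosphate group (P) from ATP (adenosine-P-P-P) to the protein substrate, leaving ADP (adenosine-P-P); this switches the protein on for signal transmission. A phosphatase cleaves the phosphate group from the protein substrate again and switches it off.]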

Fig. 26.1 The posttranslational phosphorylation of proteins is critical for regulating intracellular
signal processes; cellular reproduction, for example, is strongly dependent on these processes.
A phosphate group P is transferred from ATP (green) to the alcohol function of a serine, threonine,
or tyrosine. This task of turning on protein function is performed by the kinases. Conversely,
phosphate groups can be cleaved from a phosphorylated amino acid again by the phosphatases. The
protein function is turned off by this step.

were discussed in ▶ Sect. 23.9. Transferases can also transfer alkyl groups. One
family of transferases uses methyl groups to modify residues. Others transfer
a prenyl group, a terpene anchor that can be used to
immobilize proteins at the membrane (Sect. 26.10). Finally, ubiquitin and SUMO
should be mentioned. Ubiquitin is a polypeptide chain that labels proteins for
proteolytic degradation in the proteasome (▶ Sect. 23.8). SUMO is also a small
protein that can be attached to proteins and influences, for example,
processes in the cell nucleus.

26.1 The Kinase “Gold Rush”

In the case of disease, it initially sounds very attractive to use drugs to regulate
enzymes that act as switches in signaling cascades. In ▶ Sect. 12.4 kinases were
identified as enzymes that are often involved in disease processes. In eukaryotes,
about 30% of all proteins are reversibly phosphorylated. Attaching a phosphate group
changes the electrostatic properties of the protein, induces conformational
rearrangements, and can create new binding sites. The design of
kinase inhibitors was initially focused nearly exclusively on a competitive displacement
of ATP from its binding site. But kinases are not the only proteins that use ATP as
a substrate. This molecule is the most important energy-transfer system in cellular
metabolism. Many cofactors use ATP as a building block to fulfill their cellular tasks.
About 2,000 proteins encoded in the human genome use ATP as a substrate in
a variety of ways. At 0.01 M, the intracellular concentration of ATP is very high. Overall,
the physiological turnover of ATP is 75 kg per day in an adult! In light of this
situation, the question as to how specifically and selectively the binding site of
a particular kinase can be blocked by an inhibitor is justified: the same substrate,
ATP, is transformed by each of these enzymes, and its cellular concentration is very
high. The problem is further complicated in that Nature has established redundancy
in many of these processes as a failsafe. If one signal transduction pathway is
blocked, a similar pathway can serve as a replacement by producing more of
its own phosphorylated proteins. In this way, it helps to correct the
deficit caused by the blocked function. This is especially true for signaling cascades
in which many different, structurally similar kinases and phosphatases are used to
transmit information. Up until the early 1990s, all of these problems were considered
to be so complex and unsolvable that anyone who tried to develop selective
kinase inhibitors as drugs was considered crazy. In the meantime, the tables have
been completely turned. Today, a pharmaceutical company that does not work on
multiple kinase projects is considered to be backward and not innovative! Until
now, no other protein family has been investigated with so much fervor. What
brought about this change of heart that led to a pharmaceutical “kinase gold rush”?
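The weight of the high cellular ATP concentration can be made tangible with a short calculation. The sketch below uses the standard Cheng–Prusoff relation for a purely competitive inhibitor, IC50 = Ki (1 + [S]/Km), which is textbook enzyme kinetics and not a result taken from this chapter. The ATP concentration of 0.01 M is the value quoted above; the Km of the kinase for ATP (100 µM) and the Ki of the inhibitor (1 nM) are hypothetical numbers chosen only for illustration.

```python
def apparent_ic50(ki_molar: float, substrate_molar: float, km_molar: float) -> float:
    """Cheng-Prusoff relation for a competitive inhibitor.

    IC50 = Ki * (1 + [S] / Km); all concentrations in mol/L.
    """
    if ki_molar <= 0 or km_molar <= 0 or substrate_molar < 0:
        raise ValueError("Ki and Km must be positive, [S] non-negative")
    return ki_molar * (1.0 + substrate_molar / km_molar)


if __name__ == "__main__":
    atp_cellular = 0.01   # mol/L, intracellular ATP concentration from the text
    km_atp = 100e-6       # mol/L, ASSUMED Km of a kinase for ATP (hypothetical)
    ki_inhibitor = 1e-9   # mol/L, ASSUMED Ki of an ATP-competitive inhibitor

    ic50 = apparent_ic50(ki_inhibitor, atp_cellular, km_atp)
    shift = ic50 / ki_inhibitor
    print(f"apparent IC50 in the cell: {ic50 * 1e9:.0f} nM ({shift:.0f}-fold above Ki)")
```

Under these assumptions, an inhibitor needs roughly a hundred times its Ki to reach half-maximal inhibition in the cell, which is one reason why ATP-competitive kinase inhibitors must bind very tightly.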

26.2 Structure of Protein Kinases: More than 500 Variations with Similar Geometry

Protein kinases represent one of the largest target families in the human genome.
More than 530 kinases switch the most diverse signaling pathways in our bodies
on and off and transform proteins from inactive to active states. They are related to
one another in varying degrees by their sequence and structure, and are divided into
subfamilies based on a family tree (Sect. 26.3). Kinases can also be regulated by
further binding partners. Allosteric binding sites and second messengers that
intervene in the regulation of kinase function are known. Inhibitory or activating
proteins (e.g., cyclins) control kinase activation via complexation to the kinase
domains. The autophosphorylation of kinases exerts an important influence on
their conformation and the correct positioning of the catalytic residues for the
transfer of the γ-phosphate group of ATP to the amino acids serine, threonine,
tyrosine, or histidine (Fig. 26.2). The conserved architecture of protein kinases is
shown in Fig. 26.3. The N-terminal domain is constructed from a five-stranded β-pleated
sheet. The C-terminal domain is overwhelmingly α-helical and contains the

[Figure 26.2: structural formulas of three phosphorylated residues (O–PO3H2) within a peptide strand.]

Fig. 26.2 Kinases transfer phosphate groups (red) to the alcohol function of serine, threonine, or
tyrosine (black, peptide strand is blue).

[Figure 26.3: ribbon model of a kinase catalytic domain. Labels: N-terminal domain, hinge region, substrate chain, C-terminal domain.]

Fig. 26.3 The catalytic domains of kinases all have the same fold. The N-terminal domain is
composed of a five-stranded β-pleated sheet and helices (yellow); the C-terminal domain is
overwhelmingly composed of α-helical segments (red). Both domains are linked to one another via
the so-called hinge region (green), which contains the recognition motif for the adenosine moiety
of ATP (molecule in the center). The terminal phosphate groups orient themselves near the
substrate chain (blue), which is symbolized here by a segment of its polymer chain. It carries
the Ser, Thr, or Tyr in the spatial vicinity of the phosphate group that is to be transferred.

substrate-binding site. The two domains are connected by the so-called hinge region.
This contains the recognition motif for the adenosine moiety of ATP. The ribose
building block and the triphosphate group are bound in a crevice between the two
domains and are coordinated by a magnesium ion, which is essential for the transfer
mechanism. The activation loop with the DFG (Asp–Phe–Gly) and APE (Ala–Pro–Glu)
motifs, which lies next to the catalytic site, is also important for the mechanism.

The structure determination of a cAMP-dependent kinase with bound ADP
and aluminum trifluoride afforded more detailed information about the
reaction mechanism. The aluminum trifluoride, which mimics the γ-phosphate group, sits
between the β-phosphate group of ADP and the serine residue of the substrate peptide chain.
Both could be crystallized in a complex with the enzyme. Additionally, two magnesium ions are
found in the binding pocket. Detailed information about the mechanism of phosphate
transfer could be derived from these structural data (Fig. 26.4). Asp184 from
the DFG loop coordinates one of the two Mg2+ ions that bring the three ATP
phosphate groups into the correct position for the reaction. The substrate serine,
which is to be phosphorylated, nucleophilically attacks the terminal γ-phosphate
group, and the phosphate group is transferred upon formation of a trigonal bipyramidal
phosphorus intermediate. The neighboring Asp166 polarizes the nucleophilic
serine OH group and accepts its proton during the reaction. The positively charged
residues Lys168 of the kinase and Arg18 of the substrate act as stabilizers.
Moreover, the two magnesium ions compensate for the negative charges on the
phosphate groups. The aromatic rings of Phe54 and Phe187 shield the transition
state from the aqueous milieu.
In principle, there are three strategies to inhibit kinases: blocking substrate
binding, displacing ATP from its binding site, or modulating allosteric
regulation (vide infra). In the first case, the formation of the protein–protein
contact must be prevented because kinases recognize and bind other proteins as
substrates. The inhibition of such contacts is considered to be exceptionally difficult
because of the size of the interaction surface that is formed, particularly if this is to
be achieved with a small molecule (▶ Sect. 10.6). The second strategy aims at the
competitive displacement of ATP from its binding site. But is such
a concept not also doomed to fail in light of the many structurally similar kinases, the
high ATP concentration in the cell, and the numerous other proteins that use ATP as
a substrate? Some kinases are regulated allosterically. There, the third strategy
offers a possibility to intervene in the regulatory function of the kinases through the
allosteric binding sites.

26.3 Isosteric with ATP, and Selective Nonetheless?

A detailed analysis of the binding sites for ATP in a large number of kinases
afforded a surprising and promising picture: There are indeed unoccupied regions
in the vicinity of the ATP-recognition site that are different for individual kinases!
Two hydrophobic regions open up, one deep toward the interior of the kinase, and
a second on the opposite side toward the surface (Fig. 26.5). The aminopyrimidine
ring of adenine forms two adjacent hydrogen bonds to the peptide main chain in the
hinge region of the kinase. A third interaction site on the polymer chain remains
unused by ATP 26.1, but can in principle be involved in an interaction with
a ligand’s donor function. The design of ATP-competitive kinase inhibitors has
uncovered many interaction motifs that address the hinge region. They have been
incorporated into many clinically tested kinase inhibitors (26.2–26.21, Fig. 26.6).

[Figure 26.4: reaction scheme and structural view of phosphate transfer. Labels: DFG loop, hinge region, adenine, α-, β-, and γ-phosphate (Pα, Pβ, Pγ), two Mg2+ ions, Asp184, Asp166, Phe54, Phe187, Lys168, and Ser21 and Arg18 of the substrate chain.]

Fig. 26.4 Based on the crystal structure of a cAMP-dependent kinase with bound ADP and
aluminum trifluoride as a transition-state mimetic, the reaction steps of the phosphate group
transfer from ATP (red) to the serine residue of the substrate (blue) can be modeled. Asp184
from the DFG loop is coordinated to the β- and γ-phosphate groups of ATP via a magnesium ion.
An additional Mg2+ helps to position the three phosphate groups correctly. Serine 21, which is to
be phosphorylated, nucleophilically attacks the terminal γ-phosphate group, and the phosphate
group is transferred with formation of a trigonal bipyramidal intermediate. The neighboring
Asp166 takes the proton from the Ser21 OH group during this reaction step. At the same time, the
positively charged residues Lys168 of the kinase and Arg18 of the substrate stabilize this
intermediate.

The ubiquitous H-bonding pattern of the hinge region, which occurs in all kinases,
makes it difficult to confer selectivity on ligands. Nonetheless, certain MAP
kinases (mitogen-activated protein kinases, which act in signal transduction pathways
controlling cell differentiation, cell growth, and cell death) provide the chance to design
inhibitors with interesting selectivity, which is associated with a conformational change in the

Fig. 26.5 Schematic overview of the recognition site of ATP 26.1 in kinases (so-called Traxler
model). The adenine moiety is recognized in the hinge region by two parallel hydrogen bonds from
the peptide strand. A third carbonyl group is available for interactions, but is not involved in
ATP binding. Kinases with a glycine residue at this position can switch an exposed acceptor
function for a donor function at this third position by flipping an amide bond (left). Two
differently composed pockets open in the kinases next to the ATP-binding site, the so-called front
and back pockets. The latter is bordered by the gatekeeper residue. The residues in
this pocket are not involved in ATP binding. The phosphate-binding site is found adjacent to it.

hinge region. The orientation of an amide bond in this region is flipped
so that a donor function instead of an acceptor function is oriented
toward the bound ligand (Fig. 26.5). The flip of the amide bond is possible in these
kinases because a glycine is present in the neighboring position. Because glycine
lacks a side chain on its Cα atom, this residue can access a much larger conformational
space. Inhibitors with a dihydroquinazolinone scaffold as a mimic for the adenine
motif of ATP can induce this conformational flip. In the altered protein conformation,
they can bind selectively to kinases that carry a glycine in this position of their
sequences. If an amino acid with a side chain is found at this position, as it is in other
kinases, the rearrangement cannot be induced because the required conformational
rearrangement of the main chain is not possible for steric reasons. Inhibitors that require
this conformational flip to produce the specific H-bond pattern with the hinge region will
therefore bind only with reduced affinity to the latter kinases.
The occupancy of the hydrophobic pockets on both sides of adenine’s binding
site (Fig. 26.5) is a generally applicable concept to impart selectivity to kinase
inhibitors. The pocket that is found deep in the protein (the so-called back pocket)
has amino acids in its front part that can have very different properties in different
kinases. These are called gatekeeper residues. For example, a threonine is found in
the gatekeeper position in the p38α and p38β kinases. A much larger methionine
residue is found in the same position in the structurally similar p38γ and p38δ
kinases (Fig. 26.7). The compound SB 203580 26.3 has a p-fluorophenyl group in
the 5-position of its imidazole ring. The steric demand of this group is such that
there is just sufficient space for its uptake into the binding pocket next to the threonine.

[Figure 26.6: structural formulas of 26.2 SB202190, 26.3 SB203580, 26.4 SP600125, 26.5 Imatinib, 26.6 VX-745, 26.7 BIRB-796, 26.8 BAY43-9006 (Sorafenib), 26.9 GW-2016 (Lapatinib), 26.10 Gefitinib, 26.11 Erlotinib, 26.12 CI-1033, 26.13 EKB569, 26.14 ZD-6474 (Vandetanib), 26.15 Vatalanib, 26.16 SU11248 (Sunitinib), 26.17 MLN-518 (Tandutinib), 26.18 LY-333531, 26.20 Flavopiridol, 26.21 Staurosporine.]

Fig. 26.6 Marketed products and development candidates of ATP-competitive kinase inhibitors 26.2–26.20; staurosporine 26.21 is a natural product.
All substances bind through hydrogen bonds to the peptide bonds in the hinge region of the kinases.

[Figure 26.7: binding pocket of a p38 kinase with bound 26.3 SB203580. Labeled residues: Tyr35, Thr106, Leu108, Met109, Asp168. Gatekeeper residues: p38α,β Thr; p38γ,δ Met; Erk1,2 Gln; JNK1,2,3 Met.]

Fig. 26.7 The kinases p38α and p38β have threonine (Thr106, violet) as gatekeeper residue;
a sterically more demanding methionine is in this position in the structurally related p38γ and
p38δ kinases. SB203580 26.3 binds with its p-fluorophenyl group at the central imidazole ring in
a small niche next to the threonine (green surface, interior is blue). The activity is significantly
reduced on other kinases with more voluminous amino acids in this position (Met, Gln) because of
steric conflicts.

On the other hand, a methionine in this position requires so much space that there
is insufficient room for the p-fluorophenyl group, and the affinity of 26.3 markedly
decreases.
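The gatekeeper argument can be written down as a small lookup. The sketch below encodes the gatekeeper residues listed in Fig. 26.7 and applies the simple rule stated in the text: the p-fluorophenyl group of 26.3 fits into the back pocket only next to a threonine. The function names and the yes/no rule are illustrative simplifications; a real assessment would require structural data and measured affinities.

```python
# Gatekeeper residues as listed in Fig. 26.7 (kinase names written in ASCII).
GATEKEEPER = {
    "p38alpha": "Thr",
    "p38beta": "Thr",
    "p38gamma": "Met",
    "p38delta": "Met",
    "Erk1": "Gln",
    "Erk2": "Gln",
    "JNK1": "Met",
    "JNK2": "Met",
    "JNK3": "Met",
}

# Simplifying assumption: only threonine leaves enough room for the
# p-fluorophenyl group of SB203580 (26.3) in the back pocket.
SMALL_GATEKEEPERS = {"Thr"}


def back_pocket_accessible(kinase: str) -> bool:
    """Return True if the gatekeeper of `kinase` is small enough for 26.3."""
    return GATEKEEPER[kinase] in SMALL_GATEKEEPERS


def expected_targets() -> list[str]:
    """Kinases from the table whose back pocket should accommodate 26.3."""
    return sorted(k for k in GATEKEEPER if back_pocket_accessible(k))


if __name__ == "__main__":
    print("Back pocket accessible for SB203580:", ", ".join(expected_targets()))
```

Such a table-driven rule is only a first filter, but it reproduces the qualitative observation that 26.3 prefers p38α and p38β over the other kinases listed.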
Analogously, 26.22 gains a binding advantage on the p90 ribosomal S6
kinase (RSK) because its p-tolyl group has enough space in a large pocket that is
gated by threonine as well as a neighboring cysteine (Fig. 26.8). The combination of
a Thr and Cys residue in these two positions has been found in only three kinases
in our genome. If a reactive fluoromethylene group is introduced as in 26.22, this group
can react with the neighboring cysteine to form a stable, covalent bond to the protein.
Another concept for the development of selective inhibitors exploits the
conformational adaptation of kinases. During the course of their activation, kinases go
through multiple steps on the way from the inactive to the active conformation.
Interestingly, kinases have a high degree of structural homology in the active state
in which ATP is bound. Inhibitors that exhibit high affinity to the active conformation
are less selective than those that stabilize the inactive conformation.
This is because the differences between the inactive conformations are significantly
greater. Therefore, the goal is, in particular, to develop inhibitors that bind to an
inactive state of the kinase (Sect. 26.4).

[Figure 26.8: structural formula of 26.22 in the ATP-binding site. Labels: back pocket, hinge region, phosphate pocket, front pocket, ribose pocket.]

Fig. 26.8 With its p-tolyl group, 26.22 achieves selective binding to the p90-ribosomal S6 kinase
because it finds a sufficiently large niche next to the threonine gatekeeper residue. Because of this,
the neighboring fluoromethylene group is placed in the vicinity of a cysteine residue with which
the inhibitor can subsequently react. In this way, a strong covalent bond is formed with the kinase.
The necessary arrangement of the Thr and Cys residues has been discovered in three kinases in our
genome so that 26.22 achieves high selectivity for kinases with this amino acid composition in the
back pocket.

Today it is common practice to compile a so-called inhibition or selectivity
profile for development candidates (Fig. 26.10). Their inhibition of a
large panel of kinases is measured in as many binding assays as possible. Then the
assay results are plotted onto a family tree that summarizes the structural
relationships between kinases from different subfamilies. The division and length of the
individual branches reflect the mutual degree of relatedness. The extent of the
inhibition of individual kinases is represented by differently sized circles, in which
the larger circles denote stronger inhibition (Fig. 26.10). Conspicuously, many of
the active substances from Fig. 26.6 affect individual branches of the kinase family
tree especially strongly. This indicates that the structural differences within
a subfamily, which is described by such a branch, are often so small that no
selectivity can be achieved with these compounds. As already mentioned, there is
functional redundancy between the kinases. If one kinase is blocked, another can
assume its function in that its expression is up-regulated. Therefore it could be
essential for a successful drug therapy that not only one member of a subfamily is
turned off, but rather all members are affected to an equal extent. The natural
product staurosporine (26.21, Fig. 26.6), a highly potent alkaloid of bacterial origin,
represents a promiscuous inhibitor of most kinases. It binds to the kinases in their
active conformations. In Sect. 26.6 it is shown how small modifications to this lead
structure can nonetheless lead to highly active inhibitors.
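How such a profile is condensed from raw binding data can be sketched in a few lines. The example below sorts dissociation constants into the five classes used for the circle sizes in Fig. 26.10 and computes the fraction of a panel that is bound below a chosen threshold. The panel names and Kd values are invented for illustration, and the fraction is only one simple way of summarizing selectivity, not necessarily the measure used in the cited study.

```python
from bisect import bisect_right
from collections import Counter

# Upper class limits in nM, following the legend of Fig. 26.10.
BIN_EDGES_NM = [1.0, 10.0, 100.0, 1_000.0, 10_000.0]
BIN_LABELS = ["< 1 nM", "1-10 nM", "10-100 nM", "100 nM-1 uM", "1-10 uM"]


def kd_class(kd_nm: float):
    """Return the legend class for a Kd in nM, or None if weaker than 10 uM."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    index = bisect_right(BIN_EDGES_NM, kd_nm)
    return BIN_LABELS[index] if index < len(BIN_LABELS) else None


def profile_summary(kd_by_kinase: dict) -> Counter:
    """Count how many kinases of a panel fall into each class."""
    return Counter(kd_class(kd) for kd in kd_by_kinase.values())


def selectivity_fraction(kd_by_kinase: dict, threshold_nm: float) -> float:
    """Fraction of the panel bound more tightly than `threshold_nm`."""
    if not kd_by_kinase:
        raise ValueError("panel must not be empty")
    hits = sum(1 for kd in kd_by_kinase.values() if kd < threshold_nm)
    return hits / len(kd_by_kinase)


if __name__ == "__main__":
    # HYPOTHETICAL panel: names and Kd values (nM) are made up for illustration.
    panel = {"kinase_A": 0.5, "kinase_B": 85.0, "kinase_C": 4_000.0,
             "kinase_D": float("inf")}  # inf = no binding detected
    print(profile_summary(panel))
    print("fraction bound below 100 nM:", selectivity_fraction(panel, 100.0))
```

A small fraction points to a selective compound, whereas a promiscuous inhibitor such as staurosporine would score close to one on most panels.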
[Figure 26.9: superimposed kinase structures. Labels: active, inactive, hinge region, ATP, substrate chain, Tyr, DFG loop.]

Fig. 26.9 Kinases go through multiple conformations during their activation from an inactive
(green) to an active (red) state. The shown ATP molecule binds to the active conformation. For
this, a complete loop of the protein, the so-called DFG loop (inactive form, yellow), moves from
an inward-oriented geometry into an exposed orientation (violet, arrow). At the same time the
binding site for ATP is rendered accessible and the substrate (blue) can bind. Interestingly,
kinases possess great structural homology among themselves in this state. Therefore, inhibitors
that bind with high affinity to the active conformation are less selective than inhibitors that
block the inactive conformation of the kinase.

26.4 Gleevec®: Success Stories Breed Copycats!

Well into the 1980s, drug development for cancer therapy was almost exclusively
concentrated on processes that intervene in DNA synthesis or cell division. This led
to the development of antimetabolites, alkylating compounds, microtubule
disruptors, and inhibitors of DNA synthesis. These strategies attempt to attack
target cells with a very high rate of division, such as cancer cells. The disadvantage of
such chemotherapy is the massive adverse effects that severely limit the treated
patients’ quality of life. In 1960, Peter Nowell and David Hungerford were the first
to recognize that chronic myeloid leukemia comes from a specific genetic
modification. This defect causes about 15% of all leukemia cases. Chronic myeloid
leukemia represents the second most common form of chronic leukemia and is
characterized by a severe proliferation of white blood cells, in particular the granulocytes.
A reciprocal translocation between chromosomes 9 and 22 causes chromosome 22
to be shortened. This is termed the Philadelphia chromosome. The exchange
generates the so-called BCR-ABL fusion gene, which codes for
a protein with constitutively activated tyrosine kinase activity. This protein
belongs to the group of receptor tyrosine kinases (▶ Sect. 29.8) and plays an
important role in the regulation of cell growth. Uncontrolled proliferation is the
result of unregulated activation, and the cell becomes a tumor cell. It has been

[Figure 26.10: kinase family trees, one panel for each of the inhibitors 26.2–26.21. Circle size legend: Kd < 1 nM; 1–10 nM; 10–100 nM; 100 nM–1 µM; 1–10 µM.]

Fig. 26.10 Inhibition profile of the inhibitors 26.2–26.21 that were shown in Fig. 26.6 for 113
different kinases. The size of the red circle quantifies the strength of the inhibition. The data are shown
on the kinase family tree. In this diagram, the branching and the length of the individual branches
denote the degree of relatedness between the members of the kinase families. The longer the distance
in the dendrogram is, the smaller the degree of relatedness. The natural product staurosporine 26.21 is
a largely unselective inhibitor, whereas 26.9 and 26.15 inhibit a few kinases very selectively.
Abbreviations: TK non-receptor tyrosine kinase, RTK receptor tyrosine kinase, TKL tyrosine
kinase-like kinase, CK casein kinase family, PKA protein-kinase-like family, CAMK
calcium/calmodulin-like kinase, CDK cyclin-dependent kinase, MAPK mitogen-activated protein
kinase, CLK CDK-like kinase (from M.A. Fabian et al. 2005, with kind permission from the author
and publisher).

shown in further leukemia models that this gene is responsible for causing
this type of cancer. Therefore, it seemed that the increased kinase activity resulting
from the misregulated gene is responsible for the disease. It should be possible to
counteract this overactivity with a pharmaceutical therapy. As a result,
a program for the development of selective inhibitors of ABL-tyrosine kinases
was undertaken at Sandoz.

[Figure 26.11: structural formulas of the screening hit 26.23, the derivatives 26.24, 26.25, and 26.26, and 26.5 imatinib; position 6 is marked.]

Fig. 26.11 Starting from the screening hit 26.23 for PKC inhibition, multiple development
steps afforded imatinib 26.5.

In the 1980s, several companies had already initiated the search for protein kinase
C inhibitors. Phenylaminopyrimidine (26.23, Fig. 26.11) was identified by a screening
campaign as a well-suited lead structure. The compound was derivatized (e.g., 26.24)
and initially optimized as a PKC inhibitor. It was noticed that the introduction of a
methyl group in position 6 (as in 26.26) completely reversed the kinase inhibition profile. This
“magic” methyl group influences the conformation between the central aromatic ring
systems, which are coupled through an amino group. In the binding mode observed
with ABL-tyrosine kinase, the inhibitor adopts an extended conformation, and the
methyl group contributes to a twisted arrangement between the two ring systems.
Compound 26.26 proved to be ideal for the inhibition of members of this family
of tyrosine kinases. Initially, this derivative had inadequate oral bioavailability and
water solubility. Therefore an attempt was made to improve these properties by
introducing polar groups such as an N-methylpiperazine group. Compound 26.5
proved to be optimal; it passed all phases of clinical trials and was introduced into
therapy in 2001 as imatinib (Gleevec®). The compound selectively blocks the
BCR-ABL receptor tyrosine kinase and prevents phosphorylation of the substrate
proteins of this kinase. Later it was determined that still other kinases, for example,
the related c-Kit and PDGF receptor kinases, are also inhibited.
Why did imatinib develop into such a success story? First of all, the development
of this inhibitor represented an entirely new approach to cancer therapy. At last,
a cancer variant was being treated by a selective therapy. The drug showed very few
side effects. Therapy with this compound, however, is not cheap. In short order, it
evolved into a blockbuster for Novartis, and achieved more than a billion Euros in
sales per year. In terms of both therapy and sales, such a success story had
an enormously stimulating effect on the area of kinase research. Success stories breed
copycats! The original pessimism about the selectivity problems and redundancy in
the kinases seemed to have blown over. But experience has shown how difficult it is
to write a similar success story. In the meantime over ten kinase inhibitors have
been introduced to the market for different indications (mostly cancer therapy).

[Figure 26.12: two views of the superimposed kinase complexes. Labels: 26.5, Thr315, Phe317, Asp381, Phe382, DFG loop.]

Fig. 26.12 Two views of the superimposed crystal structures of imatinib 26.5 and
tetrahydrostaurosporine 26.27 (Fig. 26.13) with the active (green) and inactive (red) states of the
BCR-ABL receptor tyrosine kinase. Whereas 26.5 binds to the inactive form of the kinase, the
unselective inhibitor 26.27 blocks the active conformation. With the so-called magic methyl group,
imatinib orients in the direction of the gatekeeper residue Thr315. The amino group that is found
between both rings forms a hydrogen bond to its OH group.

There are also follow-up compounds for imatinib, but no other compound has been
able to achieve a similar economic or therapeutic success.
The binding of imatinib to the kinase stabilizes an inactive enzyme conformation.
The DFG loop, which is critical for the catalytic mechanism, remains in
a conformation that is oriented outward (Figs. 26.9, 26.12). The inhibitor’s
N-methylpiperazine group, which was initially introduced to improve the solubility,
adopts a position that would be occupied by this loop in the active state.
Consequently, this group is decisive for the binding mode adopted by 26.5. A structural
comparison of the kinase in complex with imatinib 26.5 and tetrahydrostaurosporine
26.27 (Fig. 26.13) is shown in Fig. 26.12. The latter inhibitor stabilizes
the enzyme in its active conformation. The DFG loop takes on a completely
different course; in consequence the DFG sequence motif is oriented toward the
interior. The magic methyl group in the 6-position of the central phenyl ring of 26.5
forces a perpendicular arrangement of this ring relative to the neighboring pyrimidine
ring. This geometry enables favorable hydrophobic contacts to the gatekeeper
residue Thr315, and a hydrogen bond is formed between the NH group that

[Figure 26.13: structural formulas of 26.5 imatinib, 26.28 nilotinib, 26.27 tetrahydrostaurosporine, and 26.29 dasatinib.]

Fig. 26.13 Nilotinib 26.28, which has a resistance-breaking profile, was developed as a follow-up
compound for imatinib 26.5. This compound binds with almost the same binding mode, but with
stronger affinity to the BCR-ABL kinase. Dasatinib 26.29, which was developed at Bristol-Myers
Squibb, also binds to this kinase, but adopts an entirely different binding mode.

connects the two rings and the hydroxyl group of this threonine. The combination of
an optimal interaction with Thr315 and a potent binding to the inactive conforma-
tion of the protein provides imatinib’s selectivity advantage. c-Kit is the only other
kinase to which imatinib has a pronounced affinity. This is explained by the high
sequence homology of this kinase with the BCR-ABL kinase in the DFG loop and
in the ATP-binding region. In both cases, the gatekeeper residue is threonine.
Meanwhile, cases of resistance to imatinib have developed. The observed
mutations desensitize the kinase to imatinib inhibition. To date about 30 mutations
have been described. They are a consequence of single base pair exchanges in the
genetic code and have developed from multiple cell populations in which the
exchanges happened purely by chance or have been influenced by oxidative
damage to the DNA. These variants have established themselves under the selection
pressure of imatinib blockage. The most commonly observed resistance mutation is
caused by an exchange of the gatekeeper residue Thr315 for isoleucine. Because of
the larger size of the exchanged amino acid, the inhibitory effect of imatinib is lost.
Moreover, the hydrogen bond to the threonine hydroxyl group can no longer form. The affinity drops from
Ki = 85 nM to 10 µM. In the hinge region, Phe317 forms aromatic contacts with
the pyridine ring of the inhibitor. Mutation of this residue to a leucine causes a loss
in the aromatic interactions and reduces the binding affinity by a factor of 3. Most of
the other observed mutations are rationalized in that the conformation of the kinase
is shifted more in the direction of the active conformation. Consequently, the
selection advantage of imatinib, which is caused by its potent binding to the
inactive conformation, becomes a disadvantage in terms of susceptibility to resis-
tance mutations. Novartis has introduced a follow-up drug for imatinib, the struc-
turally similar nilotinib (Tasigna ®) 26.28 (Fig. 26.13), which shows an improved
resistance profile. With the exception of the mutant Thr315 → Ile, it shows good
affinity to all of the described resistance-imparting exchanges and stabilizes the
inactive conformation of the kinase. With its altered side chain exhibiting a
trifluoromethyl-substituted aromatic ring and an imidazole motif, nilotinib fits
into the preformed binding pocket better and achieves a higher binding affinity.
The affinity advantage is presumably the reason for its diminished susceptibility to
resistance because small shifts from the inactive to the active conformation are
better tolerated. Another compound, dasatinib (Sprycel ®), is available from Bristol-
Myers Squibb that can circumvent the observed resistance to imatinib. It adopts an
entirely different binding mode with the BCR-ABL kinase. Therefore it is also
reasonable to assume that it has a different selectivity profile than imatinib and
nilotinib; for example, it also binds to kinases of the Src family.
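
The affinity losses quoted above for the resistance mutations can be placed on an energy scale. The following minimal sketch assumes the standard relation ΔΔG = RT ln(Ki,mutant / Ki,wild type) at 298 K and reads the Ki of the Thr315 → Ile mutant as 10 µM (an assumption about the unit). The function name is illustrative and not taken from the text.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)

def ddg_from_ki_ratio(ki_ratio, temperature_k=298.15):
    """Binding free energy lost (kJ/mol) for a given fold increase in Ki."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(ki_ratio)

# Thr315 -> Ile: Ki rises from 85 nM to 10 uM (unit assumed to be micromolar)
gatekeeper_ratio = 10_000 / 85
# Phe317 -> Leu: affinity reduced by a factor of 3 (from the text)
hinge_ratio = 3.0

for label, ratio in [("Thr315Ile", gatekeeper_ratio), ("Phe317Leu", hinge_ratio)]:
    print(f"{label}: {ratio:6.1f}-fold, ddG = {ddg_from_ki_ratio(ratio):.1f} kJ/mol")
```

Under these assumptions the gatekeeper exchange costs roughly 12 kJ/mol of binding free energy, whereas the hinge mutation costs less than 3 kJ/mol.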

26.5 Tracing Selectivity: The Bump and Hole Method

The properties of a cell are driven by a complex network of many interwoven
signaling pathways. Kinases are regulators of such information cascades. Because
of the complexity of these networks, it is extremely difficult to isolate the individual
signaling paths and to tease apart the individually involved kinases. Furthermore,
this task is made even more difficult by the overlapping substrate specificities of the
kinases. Therefore, methods have been developed that can clarify these signal
pathways by using appropriate chemical probes and genetic techniques. In princi-
ple, these techniques are not limited to kinases; they can be used to analyze the
functional properties of individual representatives of other protein families. The
structural differences between kinases that allow the development of selective
inhibitors were highlighted in detail in Sect. 26.2. The gatekeeper residue adopts
a key position. The size and polarity of this residue vary between the individual
kinases. Because the gatekeeper residue is not involved in the ATP binding, ATP
binds as a substrate almost identically and with the same affinity to all kinases. If the
back pocket is enlarged by exchanging a given gatekeeper residue for an amino acid
with a smaller side chain (e.g., Thr → Gly), the modified kinase can recognize
a modified ATP, which has an attached side chain (i.e., 26.30), and uses this ATP
surrogate as a phosphorylation reagent for the protein substrate (Fig. 26.14). This
concept was colorfully termed the bump and hole method. A ligand that is too
large and must lead to steric conflict with the protein (bump) is converted into
a well-fitting ligand if a corresponding hole is made on the side of the protein.
The technique is, of course, not limited to the phosphorylation of substrates. It
can be used just as well for the development of specific inhibitors. In the research

[Structural formulas: 26.1 Adenosine triphosphate (ATP); 26.30 Enlarged adenosine triphosphate (ATP)]

Fig. 26.14 In the context of the bump and hole method, the back pocket of a kinase is enlarged by
exchanging the gatekeeper residue (yellow) for smaller amino acids (e.g., Thr → Gly). The altered
kinase can then recognize a chemically modified ATP 26.30 with an enlarged side chain, which
can subsequently be used as a phosphorylating reagent for the protein substrate.

group of Kevan Shokat, formerly at Princeton, later at UCSF in San Francisco,
protein kinases were modified by exchanging the gatekeeper residue for a glycine or
alanine (Fig. 26.15). Because of this back pocket enlargement, the mutated kinases
became highly sensitive to inhibition by 26.31 and 26.32, which only inhibit the
wild type weakly. This observation in an in vitro assay was later translated to in vivo
conditions. The researchers used baker’s yeast Saccharomyces cerevisiae as

[Schematic, panels a–e: wild-type or mutant kinase with adenosine triphosphate, protein substrate, and inhibitor. Structural formulas of the inhibitors 26.31 (IC50 wild type: 28000 nM; IC50 mutant: 4.2 nM) and 26.32 (IC50 wild type: 1000 nM; IC50 mutant: 1.5 nM)]

Fig. 26.15 (a) The wild type of a kinase activates a protein substrate by transferring a phosphate
group. (b) If a potent inhibitor is added, the phosphorylation is inhibited. (c) Exchanging
a gatekeeper residue for a smaller amino acid such as glycine does not change the catalytic
activity of this kinase. (d) An inhibitor that carries an enlarged substituent to fill the pocket
next to the gatekeeper residue can barely bind to the wild-type kinase
because of steric conflicts. (e) It can, however, block the kinase with the enlarged pocket. The
two inhibitors 26.31 and 26.32 hardly block the wild type at all, but efficiently inhibit
the kinase whose binding pocket has been enlarged by the modified gatekeeper residue.
a model organism. The yeast genome codes for 120 kinases, of which many are
related to kinase families in mammals. One such case is the Cdc28 protein kinase
(cyclin-dependent kinase) in yeast. It plays an important role in yeast reproduction
and drives special phases of the cell cycle. It exhibits 62% sequence identity with
a comparable enzyme, CDK2, in humans. To demonstrate the high specificity of the
inhibitors 26.31 and 26.32 for the mutated kinase, the altered protein had to be
incorporated in the genome of the yeast. This was accomplished with established
retroviral methods in molecular genetics (▶ Sect. 12.14). Finally, it had to be shown
that the cells of the genetically modified yeast exhibited normal growth. Only
a 20% longer reduplication time was observed. Next, the inhibitor 26.32 was
added to the cells of the wild-type yeast and the genetically modified yeast. The
cell growth of the wild-type yeast remained unaffected, except at an inhibitor
concentration above 50 µM, at which a longer replication time was observed. On
the other hand, the yeast with the modified cdc28 gene showed a strong dependence
on 26.32 under in vivo conditions. The growth was reduced by 50% at concentra-
tions as low as 50–100 nM; at 500 nM the growth was completely arrested. Obvi-
ously the inhibitor blocks the cells at the step before mitosis (cell nucleus division
during cell replication) because the phenotype of these inhibited cells seemed to be
very similar to those in which the mitotic cyclins (proteins with a key function in the
control of the cell cycle) were turned off. Individual processes in the cell cycle can
be investigated by using this method; above all, the phase in which a specific
inhibitor intervenes can be determined. This information is of critical importance
for the development of a therapeutically valid drug. Usually at the beginning of
a project though, adequately selective inhibitors are not yet available that would
allow this type of specific study. This problem is particularly pronounced when
many proteins with high homology are found in the cell. The bump and hole
method, a combined chemical–genetic technique, allows a specific therapeutic
validation of the biological relevance of the target protein as well as the optimization
of the inhibitor class that is intended for development, in a model organism and at
an early phase of the project.
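
The degree to which the enlarged pocket sensitizes the kinase can be read directly from the IC50 values given in Fig. 26.15. The following short sketch computes the fold preference of each inhibitor for the mutant over the wild type; the dictionary layout and the function name are illustrative choices, not part of the original work.

```python
import math

# IC50 values in nM as given in Fig. 26.15
IC50_NM = {
    "26.31": {"wild_type": 28000.0, "mutant": 4.2},
    "26.32": {"wild_type": 1000.0, "mutant": 1.5},
}

def sensitization(ic50):
    """Fold preference for the enlarged-pocket mutant over the wild type."""
    return ic50["wild_type"] / ic50["mutant"]

for compound, ic50 in IC50_NM.items():
    fold = sensitization(ic50)
    print(f"{compound}: {fold:,.0f}-fold ({math.log10(fold):.1f} log units)")
```

Both compounds discriminate between the two kinase variants by roughly three to four orders of magnitude in the in vitro assay.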

26.6 Metals Teach Kinase Inhibitors Selectivity

Metals and metal ions play an important role in biological systems, especially as
catalytic centers. Zinc and calcium ions can contribute to the crosslinking and
stabilization of proteins by acting as multidentate ligands (cf. zinc finger proteins,
▶ Sect. 28.2). Magnesium ions often serve as a kind of charge buffer to counteract
the electrostatic contribution of the strongly negatively charged phosphate groups.
As described in Sect. 26.2, they are involved in the phosphate-transfer mechanism
from ATP to the hydroxyl groups of Ser, Thr, or Tyr. In rare cases, metals serve as
a component of a ligand that binds to the biomolecule. One example is the
magnesium ions that are so tightly coordinated to the β-hydroxyketo group of
tetracycline 26.33 that they remain bound during complex formation on the ribo-
some or on the tet-repressor. Another example is cisplatin 26.34, which induces
crosslinking in the neighboring base pairs of DNA strands by a substitution reaction
at platinum; this makes the DNA unreadable in the replication process
(▶ Sect. 14.9, ▶ Fig. 14.19).
Indeed, metals can be incorporated into active substances with entirely different
intentions. Typically, carbon represents the architectural element in drugs. Its
coordination scheme is, however, rather boring. It is limited to linear, trigonal
planar, and tetrahedral geometries. A stereocenter can ensue when four different
substituents are on the tetrahedron (▶ Sect. 5.2); this affords the possibility of two
stereoisomers. Metals are much more exciting in this sense. By expanding their
coordination sphere, a markedly larger diversity of coordination geometries is avail-
able to them. An octahedral center with just six different substituents affords
30 stereoisomers! Initially, any medicinal chemist would resist the idea of incorporating
metals as a structural center in an active substance. The risk that such centers would
impart undesirable toxic properties to the substances seems too great. However, if
metals are considered that only form connections with the coordination partners that
are substitution-inert, this argument seems less valid. Ruthenium fulfills these
requirements for inert behavior very well. Why should the advantages of a much
more exciting coordination chemistry to construct an entirely different molecular
geometry not be exploited to generate an alternative pharmacophore pattern in very
small space? The goal is to use the metal center as a scaffold and not as an interaction
partner with the biomolecule. This concept, which seems unusual at first glance, was
pursued by Eric Meggers and his research group at the University of Marburg in
Germany. In Sect. 26.3, staurosporine 26.21 was introduced as a largely unselective
inhibitor of almost all kinases. This indolocarbazole alkaloid has a molecular building
block that is reminiscent of a carbohydrate and that adopts a comparable position to
the ribose ring in ATP (Fig. 26.16). On the other side, a scaffold for a chelating ligand
is suggested by the molecular structure of staurosporine. If the sugar moiety is
exchanged for a metal center, a wide range of novel and interesting scaffolds is
produced. Considering hexa-coordinated metals, four additional coordination sites
are available for further substitution.
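
The stereoisomer counts quoted above (two for a tetrahedral center with four different substituents, 30 for an octahedral center with six) can be checked by brute force. The sketch below is an illustration and not a method from the text: it places the ligands on the vertices in every possible order and treats two arrangements as the same stereoisomer if a proper rotation of the polyhedron maps one onto the other. The vertex numbering and the choice of generating rotations are assumptions made for this example.

```python
from itertools import permutations

def rotation_group(generators):
    """Close a set of vertex permutations under composition."""
    n = len(generators[0])
    group = {tuple(range(n))}
    frontier = list(group)
    while frontier:
        g = frontier.pop()
        for h in generators:
            product = tuple(h[g[i]] for i in range(n))
            if product not in group:
                group.add(product)
                frontier.append(product)
    return group

def count_stereoisomers(generators, ligands):
    """Count ligand arrangements that no proper rotation can interconvert."""
    group = rotation_group(generators)
    canonical = set()
    for arrangement in set(permutations(ligands)):
        images = (tuple(arrangement[g[i]] for i in range(len(g))) for g in group)
        canonical.add(min(images))
    return len(canonical)

# Octahedron vertices indexed as +x, -x, +y, -y, +z, -z
OCTAHEDRON = [
    (2, 3, 1, 0, 4, 5),  # 90 degree turn about the z axis
    (0, 1, 4, 5, 3, 2),  # 90 degree turn about the x axis
]
# Tetrahedron vertices 0..3; two threefold rotations generate all 12 rotations
TETRAHEDRON = [
    (1, 2, 0, 3),
    (0, 2, 3, 1),
]

print(count_stereoisomers(TETRAHEDRON, "ABCD"))    # four different substituents
print(count_stereoisomers(OCTAHEDRON, "ABCDEF"))   # six different substituents
```

Because mirror images are not related by a proper rotation, enantiomers are counted separately, which is why the octahedral case yields 30 arrangements (720 orderings divided by 24 rotations).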
Derivatives such as 26.35 were synthesized in the group of Eric Meggers
(Fig. 26.16). They proved to be highly potent kinase inhibitors. Interestingly, and
in contrast to staurosporine, they show distinctly narrower selectivity profiles.
Even complexes with cyclopentadiene groups could be synthesized in which the
five-membered ring covers three coordination positions. Compound 26.36 proved
to be selective for the GSK-3 and PIM-1 kinases in an inhibition study with
57 different kinases. Compared to staurosporine (IC50 = 40 nM), 26.36 is ten
times more potent (IC50 = 3 nM) at these kinases. The metal-free coordination
ligand 26.37 (IC50 = 50 µM) and the N-methyl compound 26.38 (IC50 > 300 µM)
were almost inactive. A crystal structure of the R stereoisomer with PIM-1 was
determined (Fig. 26.17). The structure largely coincides with the geometry of the
staurosporine complex. The ruthenium complex with its carbonyl group is oriented
opposite to a β strand in the kinase fold that is located above the ATP-binding site.
The cyclopentadiene group replaces the lower part of the sugar-like moiety in
staurosporine. The highly similar geometries of the complexes give no obvious

[Structural formulas: 26.33 Mg2+ · tetracycline; 26.34 cisplatin; 26.21 staurosporine; 26.35 ruthenium complex with ligand positions A–D; 26.36 (IC50 = 3 nM); 26.37 (IC50 = 50 µM); 26.38 (IC50 > 300 µM)]

Fig. 26.16 Examples of ligands that bind to a biomolecule together with a metal center. Tetracycline
26.33 chelates magnesium ions so strongly that protein binding of this ligand is achieved together
with the Mg2+ ion. Cisplatin 26.34 binds through substitution of the chlorine atoms by the basic
nitrogen atoms of the nucleotide bases of DNA. Replacement of the sugar moiety in staurosporine
26.21 led to chelating ruthenium complexes such as 26.35, which proved to be potent kinase inhibitors
(e.g., 26.36).

indication as to why the metal group converts the promiscuous staurosporine
scaffold into highly selective inhibitors. The selectivity profile for other kinases
can be shifted by exchanging the coordinating ligands on ruthenium and by inverting
the stereochemistry. Whether the severely altered charge distribution on the scaffold
is responsible for this shift, or the interactions with the polymer chain above the
ATP-binding site are determinant, remains unclear. Interestingly, the ruthenium
complexes proved to be active under in vivo conditions and demonstrated intervention
in the signal cascade of the so-called wnt signaling pathway in human cell lines, and in

[Fig. 26.17 structure labels: hinge, Leu120, C≡O, Ru, Asp186, Val126]

Fig. 26.17 Superposition of the crystal structures of the complex of PIM-1 kinase with the
unselective inhibitor staurosporine 26.21 (light blue) and the selective ruthenium carbonyl com-
plex 26.36 (olive green). The binding geometry is almost identical in both cases. In 26.36, the
carbonyl group is opposite to the β strand that runs above the binding pocket.

frog and zebrafish embryos. Time will tell whether such metal complexes do, in fact,
open a new perspective for drug development or whether they serve as interesting
probe molecules for basic research on signaling pathways. They may well hold an
answer to the specific question of how selective kinase inhibitors can be developed,
but that answer has yet to be provided.

26.7 Phosphatases Switch Proteins Off

The posttranslational modifications of proteins serve to regulate cellular processes.
Phosphorylation by kinases leads to the activation of proteins; their functions are
turned on. To turn them off again, Nature has developed a counterpart to the kinases:
the phosphatases (Fig. 26.1). They are able to remove phosphate groups from the
amino acids Ser, Thr, Tyr, and His by hydrolysis. Altogether there are three families
of phosphatases. The first group cleaves phosphate groups from serine and threo-
nine. It has two metal ions in its catalytic site: presumably zinc and manganese or
magnesium ions (Fig. 26.18a). These are held in position by histidine and aspartic
acid residues. A water molecule (or OH−) is found as a bridge between the two
metal ions. It is therefore strongly polarized and can carry out a nucleophilic attack
on the phosphate group that is to be cleaved. The phosphate group also experiences
a polarization and is prepared for the nucleophilic attack in that it coordinates to the

[Reaction schemes: (a) two metal ions (M) held by Asp, His, and Asn residues, with a bridging OH and the phosphorylated substrate; (b) cysteine-based mechanism with Asp, Arg, and Cys residues and the substrate, shown before (above) and after (lower) phosphate transfer to the cysteine sulfur]

Fig. 26.18 Two catalytic mechanisms have been described for the cleavage of phosphate groups
from serine, threonine, and tyrosine in peptide substrates. The first group (a) uses two metal ions
(presumably Zn2+ and Mn2+ or Mg2+), which are coordinated by a histidine or aspartic acid. A
water molecule (presumably in the form of an OH group) nucleophilically attacks the phosphate
group of the substrate and initiates the cleavage. The second class of phosphatases begins the
cleavage reaction with a nucleophilic attack by the thiolate group of a cysteine (b, above). The pKa
value of this cysteine is severely shifted by the dipole moment of a helix that is pointing toward the
site that accommodates the thiol group and the reaction starts from a deprotonated cysteine.
Finally, a water molecule initiates the cleavage of the phosphate group from cysteine (b, lower).

metal ions with two of its oxygen atoms. The intermediate collapses with transient
formation of a pentacoordinated phosphorus atom. The bond between the hydroxyl
oxygen atom of the Ser or Thr residue and the phosphate group is cleaved.
A neighboring histidine assists the cleavage by providing the required proton.
The reaction is reminiscent of the reaction mechanism of phosphodiesterases
(▶ Sect. 25.8). The second group of phosphatases does not use a metal ion for the
cleavage reaction, but rather a covalent intermediate is formed during the course of
the reaction (Fig. 26.18b). These phosphatases cleave phosphate groups from
tyrosine residues. The formation of a very deep, 9-Å-long binding pocket is
characteristic for the latter phosphatases. It is completely established only after
the substrate is bound. A loop that contains a tryptophan, proline, and aspartic acid
(WPD loop) lies over the catalytic site and closes it to the outside. It contributes the
catalytically important aspartic acid and is critical for substrate recognition
(Fig. 26.18). In a closed, substrate-bound state the aspartic acid forms an H-bond
with the phenolic oxygen atom of the phosphotyrosine residue. The phosphate group
is polarized by this interaction and is prepared for nucleophilic attack. This is
Table 26.1 Examples for phosphatases that have been recognized as target structures for drug
therapy
Family Description Disease, therapeutic approach
pSer, pThr PP1, PP2A Tumor suppression
PP2B, PP2C Cystic fibrosis
(Calcineurin) Immune suppression
Asthma
Cardiovascular disease
pTyr PTP1B Diabetes, obesity
CD45 Alzheimer’s disease
SHP Neuroprotection
Dual-specific VHR Regulation of MAP phophatases; stimulation of the cell
phosphatases Cdc25 kinases cycle; cancer therapy

accomplished by a neighboring cysteine, which is positioned near the end of a long
helix. Additionally, an arginine helps to stabilize the reaction’s transition state; this
is analogous to the oxyanion hole in serine or cysteine proteases. Similar to the acyl–
enzyme complex formed in the proteases (▶ Sect. 23.2), the protein is temporarily
phosphorylated at sulfur. The dephosphorylated substrate leaves the catalytic site. In
the next step, a water molecule attacks and cleaves the phosphate group on cysteine,
which is polarized by the neighboring aspartic acid. With this, the catalyst is reset to
its initial state. The next reaction cycle can begin.
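
Why a lowered pKa matters for the catalytic cysteine can be illustrated with the Henderson–Hasselbalch relation: only the deprotonated thiolate is a strong nucleophile. The pKa values and the pH in the sketch below are illustrative assumptions and are not taken from the text; the example merely shows how strongly a shift of a few pKa units changes the fraction of reactive thiolate.

```python
def deprotonated_fraction(pka, ph):
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

PH = 7.0                 # assumed near-neutral pH
UNPERTURBED_PKA = 8.5    # assumed value for an ordinary cysteine thiol
SHIFTED_PKA = 5.5        # assumed value for a cysteine at the end of a helix

for label, pka in [("unperturbed", UNPERTURBED_PKA), ("shifted", SHIFTED_PKA)]:
    fraction = deprotonated_fraction(pka, PH)
    print(f"{label} cysteine (pKa {pka}): {fraction:.1%} thiolate")
```

With these assumed numbers, the thiolate population rises from a few percent to well over 90%, which is consistent with the reaction starting from a deprotonated cysteine.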
Whereas the first and second families of phosphatases process different substrates
according to entirely different mechanisms, a third family exists that works
similarly to the second group of tyrosine phosphatases. It has a dual specificity and
accordingly can cleave phosphate groups from serine, threonine, and tyrosine. In
contrast to the specific tyrosine phosphatases, it has a shorter binding pocket that
allows phosphotyrosine as well as the shorter phosphoserine and phosphothreonine
to reach the catalytic site.
To date, 107 genes have been discovered in our genome that code for phosphatases.
Many phosphatases intervene in signaling cascades by targeted dephosphorylation.
The overwhelming majority cleave phosphate groups from activated
proteins and, in doing so, deactivate the involved receptors. Because the phosphate
group as well as the phosphorylated amino acids and residues in the immediate
vicinity are involved in the interaction with the phosphatase, the selectivity problem
is not as severe as it is with the kinases. Phosphatases have been characterized as
target structures in many different diseases. A few examples of the concepts that
have been identified and pursued for the development of drugs are summarized
in Table 26.1. How potent inhibitors for phosphatases can be
developed will be shown using the example of PTP-1B, a receptor tyrosine phosphatase
that has been pursued by many pharmaceutical companies as an innovative
target enzyme for the therapy of diabetes and obesity.

26.8 Inhibitors of PTP-1B: Treatment for Diabetes and Obesity?

Adult-onset type-II diabetes and obesity are diseases that have increased alarmingly
in our society in recent years. They must be considered typical
diseases of civilization. Adult-onset diabetes is based on increasing insulin resistance,
which is observed as a reduced ability of the cells in the target organ to respond to
insulin. As a consequence, high blood insulin levels occur even at normal blood
sugar concentration. Because of the resistance, the cells no longer respond as
required to the signal that insulin would elicit in a healthy person. Insulin causes
glucose uptake from food into the liver cells, where glucose is stored in the form of
glycogen. As resistance increases, pathophysiological changes
occur based on inadequate insulin control. The uptake of blood sugar into tissues
and the release of sugar from the liver go awry. As a result, the blood sugar level
increases even more, and this can manifest itself in the form of complications such
as coronary heart disease, retinopathy, cataracts, and vascular disease.
The other civilization disease is much more obvious. There are more and
more obese people; the sign is a disproportionate excess of body mass. The fact
that obesity is in no way limited to a particular age group is even more alarming. Even among the young,
the number of cases of obesity is increasing dramatically. There are estimates that
by the year 2015, 75% of all adults in industrialized countries such as the USA will
be overweight, and 40% defined as obese. The percentage is also distinctly increas-
ing in developing countries. Of course, this has something to do with our altered
lifestyles. An overabundance of food, often without any dietary fiber, coupled with
a lifestyle that demands ever-decreasing amounts of physical labor has caused this
development. Furthermore, a genetic predisposition contributes to the development
of obesity. Interestingly, type-II diabetes and obesity very
commonly develop together, and in so doing increase the health risks for the patient. The
resulting disease symptoms are called metabolic syndrome. For this diagnosis,
the following criteria apply: an abdominal girth of more than 80 cm in
a woman or 90 cm in a man, and two of the following other factors: an
elevated triglyceride level (>150 mg/dL), an elevated fasting blood sugar level
(>100 mg/dL), arterial hypertension (>130/85 mmHg), and/or a reduced HDL
cholesterol level (<40–50 mg/dL; ▶ Sect. 27.3). The costs to society that come as
a consequence of this increased health risk are barely estimable today. They are
most likely dramatic. Therefore, great effort has been made in the search for drug
therapies that can counteract metabolic syndrome and its consequences.
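
The diagnostic rule given above can be written down as a short decision function. This is a minimal sketch of the criteria as stated in the text, not a clinical tool. Three points are assumptions: "two of the following" is read as at least two, the HDL range "<40–50 mg/dL" is read as 40 mg/dL for men and 50 mg/dL for women, and hypertension is taken to be present if either the systolic or the diastolic value exceeds its limit. The function and parameter names are hypothetical.

```python
def meets_metabolic_syndrome_criteria(sex, waist_cm, triglycerides_mg_dl,
                                      fasting_glucose_mg_dl, systolic_mmhg,
                                      diastolic_mmhg, hdl_mg_dl):
    """Apply the criteria listed in the text; assumptions are marked below."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")
    waist_limit = 80 if sex == "female" else 90
    # Assumption: the '<40-50 mg/dL' HDL range means 40 for men, 50 for women
    hdl_limit = 50 if sex == "female" else 40
    additional_factors = [
        triglycerides_mg_dl > 150,
        fasting_glucose_mg_dl > 100,
        # Assumption: either elevated value counts as arterial hypertension
        systolic_mmhg > 130 or diastolic_mmhg > 85,
        hdl_mg_dl < hdl_limit,
    ]
    # Assumption: 'two of the following' is read as 'at least two'
    return waist_cm > waist_limit and sum(additional_factors) >= 2

# Hypothetical values, for illustration only
print(meets_metabolic_syndrome_criteria("male", 102, 180, 95, 138, 82, 45))
```

The abdominal girth acts as a mandatory criterion, so elevated laboratory values alone do not satisfy the rule as stated.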
The correlation between insulin resistance and obesity is not fully understood at
the molecular level. Insulin is indeed a hormone that is related to fat metabolism
and that exerts an influence on the fat inventory. For example, it influences the
storage of fat, and an insulin deficiency leads to weight loss. Insulin binds to the
insulin receptor, which undergoes autophosphorylation by its tyrosine kinase
domain as a response to this signal (▶ Sect. 29.8). This initiates a cascade of
multiple kinases that ends in the synthesis of the sugar-storing glycogen. The
synthesis of fatty acids and proteins is also induced. Dephosphorylation of the
insulin receptor attenuates its function. The cleavage of phosphate groups from two
tyrosine residues on the receptor is accomplished by the PTP-1B tyrosine
phosphatase. This leads to a deactivation of the insulin receptor and the cascade that the
receptor initiates. Blocking this dephosphorylation step seems to be a rewarding
concept to counteract insulin resistance. The real initiator in the search for PTP-1B
inhibitors was the observation that mice which have a ptp-1b gene turned off are
resistant to developing obesity despite unchanged nutrition, and their insulin sen-
sitivity is increased without apparent negative consequences. This spectacular
observation suggests that the ideal target to fight the most prominent civilization
disease has been found. This optimism was buttressed by the fact that antisense
nucleotides (▶ Sect. 32.4) that block the expression of PTP-1B also cause an
increased insulin effect. As a result, almost every pharmaceutical company of
distinction flocked to this enzyme to develop potent inhibitors. In a period of
4 years, over 200 patent applications appeared in the literature!
Did PTP-1B prove to be an easily addressed target? The mechanism of action is
described in the previous section, Sect. 26.7. The catalytic cysteine, which accepts
the cleaved phosphate group temporarily, is located at the tip of a long helix that
is arranged toward the reaction site. Such a helix establishes special electrostatic
conditions at its terminal end (▶ Sects. 30.2 and ▶ 30.6) and can stabilize negative
charges well. Additionally, an aspartic acid and an arginine are found there. The
phosphorylated tyrosine shown in Fig. 26.19 represents part of a
substrate. The structure with this substrate could be determined because the enzyme
was rendered almost catalytically inactive by exchanging the catalytic Cys215 for
a structurally analogous serine. The phosphate group is bound in a tight network of
H-bonds. The phenyl ring of tyrosine is taken into a hydrophobic clamp by two
neighboring aromatic residues, Tyr46 and Phe182. These two residues also deter-
mine the depth and width of the entrance passage to the phosphatase’s catalytic site
(Fig. 26.20a). First an attempt was made to replace the phenolic oxygen atom of the
tyrosine residue which attaches the phosphate group of the substrate 26.39 with
a non-hydrolyzable mimetic 26.40 (Fig. 26.21). A CF2 group was chosen to
substitute the oxygen atom. The compound’s polar properties were essentially
maintained, but the hydrolytic stability was markedly improved. A fragment-
based screening approach using crystallography and NMR spectroscopy (▶ Sects.
7.8 and ▶ 7.9) was used to discover oxalic acid anilide 26.41 and
N-oxalylanthranilic acid 26.42 as phosphotyrosine mimics. The thiophene analogue
26.43 proved to be a submicromolar inhibitor. Surprisingly, in the crystal structure
with phosphotyrosine a second molecule of 26.39 (pink) was found to be bound
(Figs. 26.19, 26.20a). It neighbors the first molecule (green) and occupies a second
pocket formed by Arg24, Arg254, Gln262, and Asp48. The affinity to this binding
site, however, was only millimolar. Nonetheless, the discovery led to the pivotal
idea to couple the phosphotyrosine mimetic to a molecular building block that
occupies the second binding site. The purpose was to arrive at inhibitors with much
better binding affinity.
Aromatic oxalic acid derivatives such as 26.44 and 26.45 were also worked on at
Abbott as mimics for the substrate to bind to the catalytic site. Interestingly, the
derivatives pursued by Abbott forced a conformational change at Phe182 at the

[Fig. 26.19 structure labels: Arg24, Phe182, Arg254, Asp181, Arg221, Asp48, Cys215, Tyr46, Arg47; second binding site; catalytic site]

Fig. 26.19 The crystallographically determined binding mode of a phosphorylated tyrosine
(26.39, green, Fig. 26.21) as a minimal mimetic for a peptide substrate in the human phosphatase
PTP-1B. The phosphate group is held in place by an arginine residue (221) and Cys215 is
positioned for nucleophilic attack. Asp181 is found above the Cys residue and buffers the
proton inventory. The entrance to the binding pocket is bordered by the two aromatic residues
Phe182 and Tyr46. The binding position of the cysteine is found at the end of a long helix. The
displayed geometry is based on a crystal structure with the catalytically inactive Cys → Ser
mutant. A second phosphotyrosine (pink) is found in the crystal structure that binds to Arg24 and
Arg254 in a second distal pocket. Consequently, the occupancy of this second binding pocket
was important for the development of nanomolar PTP-1B inhibitors (cf. Fig. 26.20).

entrance, so that the top of the catalytic site is opened (Fig. 26.20b). Abbott
additionally applied their SAR-by-NMR technique (▶ Sect. 7.8) to discover poten-
tial binders for the second binding site. Small aromatic acids such as 26.46–26.48
were discovered. Coupling such moieties (e.g., naphthylcarboxylic acids) with
the already-known catalytic-site mimetic 26.45 produced the
nanomolar inhibitor 26.49 (Ki = 22 nM, Figs. 26.20c, 26.21). This second binding
site was decisive for the lead structure optimization. At Novo Nordisk the initial
oxalic acid derivatives on the thiophene ring were expanded by using Asp48 as an
additional anchor point to arrive at more potent and selective inhibitors based on
scaffold 26.50 (Fig. 26.21).
The development of highly potent, PTP-1B-selective, and orally available
inhibitors was overshadowed by another observation. It had been suspected on
the basis of sequence comparisons that there is another phosphatase, the T-cell
protein tyrosine phosphatase TCPTP, which exhibits high similarity to PTP-1B.

[Fig. 26.20 panels a–d; labeled residues: Arg24, Phe182, Gln262, Arg254, Asp48, Tyr46, Lys41]

Fig. 26.20 (a) Binding mode of the substrate-analogous phosphotyrosine (26.39, Fig. 26.21) in
human PTP-1B. The phosphate group binds deeply in the catalytic site (green). The two hydro-
phobic amino acids Phe182 and Tyr46 form a narrow entry portal to the catalytic center. A second
phosphotyrosine (pink) is found in the crystal structure that binds to Arg24 and Arg254. (b) Crystal
structure of an aromatic oxalic acid derivative (26.54) that was developed at Abbott to occupy the
catalytic site (green). The compound induces a rearrangement of the Phe182 side chain and opens
the catalytic site to the top. (c) By chemically coupling an aromatic carboxylic acid that was
discovered with the SAR-by-NMR method as a binder for the second binding site (pink) and
a mimetic to occupy the catalytic site, a nanomolar inhibitor 26.49 (Fig. 26.21) was obtained.
(d) To achieve selective binding to PTP-1B compared to the structurally very similar TCPTP,
structural differences at position 41 were exploited (light blue outlining). There PTP-1B has
a lysine, and the related family member TCPTP has an arginine in this position. The nanomolar
inhibitor 26.51 achieves a significant selectivity advantage.

Such an observation is worrying because the developed PTP-1B inhibitors could, of
course, also inhibit this phosphatase. The crystal structure that was published in
2002 confirmed these suspicions: the sequence identity of the catalytic domains is 74%,
and the WPD loop, which lies on top of the catalytic site after substrate binding, is
identical. Knock-out mice that lack the tcptp gene are born healthy, but die within
3–5 weeks of birth. More alarmingly, turning the ptp-1b and tcptp genes off
simultaneously led to animals that had no chance of survival at all. This underscores
the danger that inadequately selective PTP-1B inhibitors, which simultaneously
inhibit the T-cell protein tyrosine phosphatase, could lead to a life-threatening situation.

[Fig. 26.21: chemical structures of compounds 26.39–26.51 (R = H2N-CH-COOH); inhibition
constants given in the figure:]
26.45   PTP-1B Ki = 39 μM    TCPTP Ki = 44 μM
26.49   PTP-1B Ki = 22 nM    TCPTP Ki = 44 nM
26.51   PTP-1B Ki = 4.9 nM   TCPTP Ki = 20 nM

Fig. 26.21 By starting with a substrate with a terminal phosphotyrosine 26.36, a hydrolytically
stable compound 26.40 was developed. A fragment screening drew attention to the two mimetics
26.41 and 26.42. Thiophene derivatives such as 26.43 were designed from the latter compound.
Aromatic carboxylic acids such as 26.46–26.48 were discovered by screening with the SAR-by-
NMR method as ligands for the second binding site. By chemically linking such aromatic
carboxylic acids as binders for the second binding site and a mimetic for the phosphotyrosine in
the catalytic site, 26.49 was obtained as a nanomolar inhibitor. Lead structures were also fitted with
side chains for the second binding site (26.50) at Novo Nordisk. Compound 26.51 is
a PTP-1B inhibitor with fourfold selectivity over TCPTP.
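
The selectivity figures quoted for these inhibitors follow directly from the ratio of the two inhibition constants. The short Python sketch below is an illustration only: it uses the Ki values printed in Fig. 26.21, while the function and variable names are chosen here for the example and are not taken from the original work.

```python
# Ki values as printed in Fig. 26.21, converted to nanomolar.
# The dictionary layout and all names below are illustrative assumptions.
KI_NM = {
    "26.45": {"PTP-1B": 39_000.0, "TCPTP": 44_000.0},
    "26.49": {"PTP-1B": 22.0, "TCPTP": 44.0},
    "26.51": {"PTP-1B": 4.9, "TCPTP": 20.0},
}


def selectivity(ki_target: float, ki_off_target: float) -> float:
    """Fold selectivity; values above 1 mean tighter binding to the target."""
    if ki_target <= 0 or ki_off_target <= 0:
        raise ValueError("inhibition constants must be positive")
    return ki_off_target / ki_target


def selectivity_table(ki_nm=KI_NM):
    """Return the PTP-1B over TCPTP selectivity per compound, rounded to one decimal."""
    return {
        name: round(selectivity(values["PTP-1B"], values["TCPTP"]), 1)
        for name, values in ki_nm.items()
    }


if __name__ == "__main__":
    for compound, fold in selectivity_table().items():
        print(f"{compound}: {fold}-fold selective for PTP-1B over TCPTP")
```

The ratio for 26.51 comes out at roughly four, which matches the fourfold selectivity stated in the caption, whereas 26.45 and 26.49 remain close to equipotent on both phosphatases.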

The need was great. Where do differences between the structures of the two
phosphatases occur that could be exploited to develop sufficiently selective com-
pounds? All of the developed inhibitors at that time showed almost equipotent
affinity for both proteins. Bidentate inhibitors such as 26.51 (Fig. 26.21), which
were reported in 2003, proved to be very interesting because they occupy the
catalytic site and neglect the second binding site (Fig. 26.20d). Even the sequence
of this region proved to be identical to TCPTP. With their somewhat altered
orientation, the new inhibitors address a lysine residue (Lys41) that is an arginine
in TCPTP. At the least, the nanomolar inhibitor 26.51 has a distinct selectivity
advantage for PTP-1B compared to TCPTP. In 2004, the Sunesis company reported
the discovery of an allosteric binding site 20 Å away on the back side of the
catalytic site in PTP-1B. An inhibitor was developed for this site that binds with
micromolar affinity to the enzyme. It blocks its function by preventing the closure
of the WPD loop. In this way, the loop cannot fold upon the substrate-binding site.
The essential residues such as the catalytically active aspartic acid are not brought
into the vicinity of the substrate. The most potent ligand from this series, 26.52
(IC50 = 8 μM), wraps itself around a phenylalanine that is found there, as proven by
the crystal structure (Fig. 26.22). In the structurally analogous TCPTP, a cysteine is
found at this position and forms entirely different interactions with the aromatic
groups of this ligand. Due to the deviating interaction pattern, this compound
achieves TCPTP inhibition only at 280 μM. Perhaps blocking this allosteric binding
site will open a new perspective for the selective inhibition of PTP-1B. The future
must show whether the severe selectivity problem can be resolved in an appropriate
way. Therefore, all hopes to block this, at first glance, ideal target are currently
focused on the antisense nucleotide ISI 113715 from Isis Pharmaceuticals that is
undergoing clinical trials (▶ Sect. 32.4).

26.9 Inhibitors of Catechol-O-Methyltransferase

A large family of transferring enzymes are the methyl transferases, which shift
methyl groups to other biomolecules. The DNA methyl transferases represent an
important group in this family. Their task is to chemically change the nucleobases
of DNA at particular positions by transferring methyl groups (▶ Sect. 12.13).
This methylation does not lead to changes in the genetic code, that is, the same
amino acids are translated as before. It acts, however, as a kind of DNA-strand
labeling that can, for instance, allow the recognition of foreign DNA or the
differentiation of the original from newly synthesized strands. Another group of
methyl transferases transfers methyl groups onto oxygen, nitrogen, or sulfur atoms in
small biomolecules. Methyltransferases use S-adenosyl-L-methionine (SAM 26.53)
as a cofactor (Fig. 26.23). A highly reactive methyl group on the sulfonium group is
transferred during the transmethylation reaction.
Inhibitors of catechol-O-methyltransferase (COMT) have gained importance in
pharmaceutical therapy. This enzyme deactivates the endogenous function of
catecholamines such as dopamine, adrenaline, or noradrenaline in that it transfers

Fig. 26.22 A new allosteric binding site was discovered at Sunesis that is approximately
20 Å away from the catalytic site of the phosphatase. Compound 26.52 inhibits PTP-1B 16-fold
more strongly than TCPTP. The crystal structure with PTP-1B shows that the inhibitor
basically wraps itself around the exposed Phe280. In TCPTP a cysteine residue is found in the
same position. (Residues labeled in the figure: Asn193, Phe196, Glu276, Phe280; the figure
also shows the chemical structure of 26.52.)

a methyl group onto the phenolic hydroxyl group of these neurotransmitters.
Polymorphisms in this enzyme have been associated with psychiatric changes
that could be related to anxiety disorders and schizophrenia. Inhibitors of this
enzyme are used in therapy, especially for the treatment of Parkinson’s disease.
This disease, originally known as the “shaking palsy,” occurs particularly in older
people. It is caused by a slow, progressive degeneration of the dopaminergic
neurons in the so-called Substantia nigra in the midbrain. A causal treatment of
the neuronal degeneration has not yet been achieved. Therefore, an
attempt is made to counteract the dopamine deficiency with exogenous replacement
substances. The amino acid L-DOPA was already introduced in ▶ Sect. 9.4 as
a precursor of dopamine. Although it exhibits more polar properties than dopamine,
it can penetrate the blood–brain barrier because it uses an amino acid transporter for
its access to the brain. In reality, however, only about 1% of the administered
amount reaches the brain. The overwhelming portion is degraded in the periphery
by decarboxylases that are found there. To prevent this degradation and the side
effects associated with dopamine being released in the periphery, a decarboxylase
inhibitor is simultaneously administered. This must be so polar that it cannot penetrate
the blood–brain barrier (e.g., benserazide 9.39, ▶ Fig. 9.9). This strategy allows
the bioavailability of L-DOPA in the brain to be substantially increased. The active
substance is degraded by monoamine oxidases (▶ Sect. 27.8) as well as by catechol-O-
methyltransferases. COMT recognizes L-DOPA as a substrate as well as dopamine.

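[Fig. 26.23: chemical structures of the inhibitor 26.54 (coordinated to Mg2+) and of
S-adenosyl-L-methionine (SAM) 26.53; crystal structure with the labeled residues Asp141,
Lys144, Asp169, Asn170, and Glu199, the Mg2+ ion, and the SCH3+ group of SAM]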

Fig. 26.23 The crystal structure of COMT with the cofactor S-adenosyl-L-methionine 26.53 and
the catecholamine-analogous nitro-substituted inhibitor 26.54. The methyl group that is to be
transferred to the phenolic oxygen atom (red) is within a short distance (2.63 Å). The phenolic
oxygen, which is the nucleophile in the transfer reaction, is presumably deprotonated because of
the electron-withdrawing effect of the nitro groups and the close proximity to the magnesium
ion, the sulfonium group, and the ammonium group of Lys144. The accumulated positive charges
shift the pKa value of this hydroxyl group additionally into the acidic range. The second phenolic
OH group is probably uncharged and forms an H-bond to Glu199.

L-DOPA is thus inactivated by the transfer of a methyl group onto one of its phenolic
hydroxyl groups. Inhibiting COMT therefore allows the bioavailability
of L-DOPA to be further increased, and a higher concentration of dopamine in the
brain can be achieved. The crystal structure of the enzyme was elucidated in 1994 in
the group of Anders Liljas (Fig. 26.23). A deeply buried magnesium ion that takes on
an octahedral coordination geometry is decisive for the mechanism. The neighboring
oxygen atoms of the catecholamine chelate the magnesium ion. The phenolic
oxygen atom is brought in close proximity (2.63 Å) to the sulfonium group as
a result. Presumably, this hydroxyl function is deprotonated due to its proximity to
the magnesium ion, the sulfonium group, and the ammonium group of Lys144,
which enhances its nucleophilicity for the SN2-like transfer of the methyl group from
the positively charged sulfur of SAM. The second, probably uncharged phenolic OH
group is involved in a hydrogen bond with Glu199. The crystal structure determi-
nation was accomplished with a substrate-like inhibitor 26.54, in which the
nucleophilicity of the oxygen atom is very strongly suppressed by two electron-
withdrawing nitro groups (Fig. 26.23). The methyl transfer does not occur anymore.
Molecules with multiply hydroxylated aromatic rings such as pyrogallol 26.55,
gallic acid 26.56, or tropolone 26.57 show weak, micromolar affinity to the enzyme.
The introduction of strongly electron-withdrawing nitro groups or carbonyl groups
to the aromatic rings leads to a distinct increase in the affinity of these substrate-like
inhibitors. The inhibitors tolcapone 26.58, entacapone 26.59, nitecapone 26.60, or
nebicapone 26.61 all have a substitution pattern with a nitro group in the ortho
position to the nucleophilic hydroxyl group and a second electron-withdrawing
group in the para position (Fig. 26.24). It could be shown by crystallography for
these derivatives that their nitro groups match with the one of 26.54, which orients
toward the SAM substrate (Fig. 26.23). The second electron-withdrawing substit-
uent lies where the other nitro group from 26.54 does, and orients in the direction of
the surrounding solvent. Tolcapone 26.58 was approved in 1997 as a peripherally
and centrally effective COMT inhibitor. Its therapeutic use has been severely
limited because of observed liver toxicity. The concomitant administration of
L-DOPA and entacapone 26.59, which works predominantly in the periphery,
proved to be more favorable. It has been on the market since 1998 and contributes to
a balanced level of L-DOPA.
All of these drugs compete with catecholamine for a magnesium ion in the
binding site. Recently, nanomolar bisubstrate inhibitors such as 26.62 have also
been developed. They displace the cofactor SAM as well as catecholamine from the
binding pocket. The original building blocks of the initial molecular model com-
pounds can be seen in the bisubstrate inhibitors. The crystal structure of one of these
inhibitors is shown in Fig. 26.25 superimposed with the natural substrate. Its
binding geometry largely matches the adenosine part of SAM on the nucleoside
side, and the catecholamine side with the nitroaromatic ring. The correct choice of
bridges between the two substrate-analogous molecular portions is critical for
binding affinity. A rigid five-membered chain made up of an amide group and an
E-configured double bond represents the optimum. Transition to a flexible structure
by hydrogenating the double bond causes the binding affinity to decrease by a factor of
100. Elongating the chain with an additional member leads to a further 25-fold
reduction in the binding affinity. Bisubstrate-analogue inhibitors should achieve
a higher selectivity for their target enzymes. In the present case, the rigid and
geometrically strained tether between the two molecular portions is responsible for

[Fig. 26.24: chemical structures of the following compounds]
26.55 Pyrogallol
26.56 Gallic acid
26.57 Tropolone
26.58 Tolcapone (IC50 = 0.3 nM)
26.59 Entacapone (IC50 = 0.3 nM)
26.60 Nitecapone (IC50 = 1 nM)
26.61 Nebicapone
26.62 (bisubstrate inhibitor)

Fig. 26.24 Pyrogallol 26.55, gallic acid 26.56, or tropolone 26.57 bind to COMT with micro-
molar affinity. Tolcapone 26.58, entacapone 26.59, nitecapone 26.60, or nebicapone 26.61 have
strongly electron-withdrawing groups directly on or conjugated to the aromatic ring. They are
nanomolar, competitive inhibitors of catecholamine. The linking of two moieties, each analogous
to catecholamine or adenosine with a rigid five-membered tether (amide bond and double bond,
red) affords the nanomolar bisubstrate-analogue inhibitor 26.62.

the prearrangement necessary for binding the pharmacophoric groups. This
preorganization of a ligand provides an advantage upon binding to the receptor.
Further development must show whether such bisubstrate inhibitors have
a chance of entering drug development.
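
The fold-changes in binding affinity reported for the different linkers can be translated into free-energy differences with the standard relation ΔΔG = RT ln(fold change). The following Python sketch is a rough illustration under stated assumptions: the temperature of 298.15 K is assumed here and is not given in the text, the fold-changes are treated as ratios of equilibrium binding constants, and the function name is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def ddg_from_fold_change(fold: float, temperature_k: float = 298.15) -> float:
    """Free-energy penalty in kJ/mol for a `fold`-times weaker binding constant.

    Assumes the fold-change is a ratio of equilibrium constants; the default
    temperature is an assumption made for this example.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(fold)


if __name__ == "__main__":
    flexible_linker = ddg_from_fold_change(100)  # saturated, flexible chain
    longer_linker = ddg_from_fold_change(25)     # one additional chain member
    print(f"100-fold loss: {flexible_linker:.1f} kJ/mol")
    print(f"further 25-fold loss: {longer_linker:.1f} kJ/mol")
    print(f"combined: {flexible_linker + longer_linker:.1f} kJ/mol")
```

Under these assumptions the loss of the rigid double bond costs on the order of 11 kJ/mol, and the additional chain member a further 8 kJ/mol or so, which gives a feeling for how much the preorganized tether contributes to binding.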

Fig. 26.25 Superposition of the crystal structures of COMT with SAM 26.53 and the
catecholamine-like inhibitor 26.54 (gray carbon atoms) with the bisubstrate inhibitor 26.62
(green carbon atoms).

26.10 Blocking the Transfer of Farnesyl and Geranyl Anchors

Kinases are not the only proteins that undertake posttranslational modification in
the course of signal transduction. The spatial location of proteins is often essential
for their correct function in the cell. Some proteins must be anchored in the
membrane. In addition to examples in which a section of the polymer chain sub-
merges in the membrane, proteins are known that are anchored there by an added
farnesyl 26.63 or geranylgeranyl anchor 26.64. These hydrophobic anchors are
made up of isoprenoid units (Fig. 26.26). The attachment to the proteins is accom-
plished via cysteine residues that are found in the vicinity of the C terminus. Three
classes of so-called prenylating enzymes are known: the farnesyl transferases
(FTases) and the geranylgeranyl transferases I and II (GGTase I and II). The
substrates of these catalysts are, among others, the GTPases of the Ras, Rab, and Rho
families, lamins, and the γ subunit of G protein heterotrimers. To become fitted with
a prenyl anchor by FTases and GGTases, substrate proteins must carry a CAAX
sequence 26.65 on their C terminus (Fig. 26.26). Here, C stands for the cysteine
upon which the prenyl group will be transferred, and A is usually an aliphatic amino
acid. If X is a serine, methionine, glutamine, or alanine, the protein is prenylated by
an FTase. A leucine in this position prefers a GGTase as catalyst.
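
The CAAX rule described above can be written down as a small decision function. The Python sketch below is a deliberately simplified illustration of that rule only: it looks at the C-terminal tetrapeptide, checks for the cysteine, and assigns the enzyme from the residue in position X. Real substrate recognition is more nuanced, the function name is hypothetical, and the tetrapeptides in the usage example are arbitrary illustrative sequences.

```python
from typing import Optional

# One-letter codes for the X residues named in the text.
FTASE_X = frozenset("SMQA")  # Ser, Met, Gln, Ala
GGTASE_X = frozenset("L")    # Leu


def predict_prenyltransferase(c_terminus: str) -> Optional[str]:
    """Apply the simple CAAX rule to the last four residues of a sequence.

    Returns "FTase", "GGTase", or None if there is no CAAX box with a
    residue in position X that is covered by the rule.
    """
    sequence = c_terminus.strip().upper()
    if len(sequence) < 4:
        return None
    box = sequence[-4:]
    if box[0] != "C":  # the cysteine that receives the prenyl group
        return None
    x_residue = box[3]
    if x_residue in FTASE_X:
        return "FTase"
    if x_residue in GGTASE_X:
        return "GGTase"
    return None


if __name__ == "__main__":
    for tail in ("CVIM", "GCVLL", "AVIM", "CVIK"):
        print(tail, "->", predict_prenyltransferase(tail))
```

The aliphatic character of the two A positions is not checked here, because the text only states that these residues are usually aliphatic.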
By now over 250 proteins have been discovered that require the posttranslational
attachment of a prenyl tail for their function. The interest in these prenylating

[Fig. 26.26: crystal structures of FTase and reaction scheme showing farnesyldiphosphate 26.63,
the geranylgeranyl chain 26.64 (red), the Zn2+-coordinated CAAX peptide substrate 26.65, and
the farnesylated product 26.66]

Fig. 26.26 Farnesyldiphosphate 26.63 binds to FTase and occupies a part of the large catalytic
site. The crystal structure of the enzyme with this substrate was determined (above, farnesyldi-
phosphate is green). Geranylgeranyl groups 26.64 that have an elongated isoprenyl chain
(isoprenyl chain indicated in red instead of a black chain in 26.63) are transferred by GGTase.
Trp102β and Tyr365β border the binding pocket in FTase and provide for substrate selectivity.
After binding the farnesyl substrate, the peptide substrate 26.65 (gray) with its CAAX terminus
diffuses into the binding pocket. The farnesyl group transfer onto the thiol group of the cysteine is
accomplished with the help of a neighboring catalytic zinc ion, which coordinates the cysteine
residue of the substrate. The diphosphate group is displaced by nucleophilic attack. A crystal
structure could also be determined with the product 26.66 (gray-green). It is shown above
superimposed with the binary complex. The farnesyl group moves into the pocket (arrow) and
the newly formed product coordinates to the zinc ion. The enzyme recognizes the tetrapeptide unit
by its two aliphatic residues A1 and A2 of the CAAX motif. The terminal methionine (X) forms
a hydrogen bond to Gln167α with its carboxylate group.

enzymes, and especially the FTases, began in the early 1990s. It was observed that
RAS proteins, which mediate a permanent growth signal in a mutated form in
cancer, must be farnesylated. It is only then that they are active. If the farnesylation
is omitted, the RAS activity is suppressed. After transferring the prenyl group in the
cytoplasm to the cysteine three amino acids away from the C terminus, the protein
migrates to the endoplasmic reticulum. There the AAX tripeptide tail is proteolyt-
ically cleaved and a methyl group is transferred to the C terminus by
a carboxymethylation step. Finally, the prenylated protein is anchored in the
membrane. The FTases and the GGTases contain a zinc ion in their catalytic site
that is coordinated by cysteine, aspartate, and histidine. First the farnesyl or
geranylgeranyl diphosphate diffuses into the large, funnel-shaped binding
pocket of the enzyme. FTases and GGTases form a heterodimer with a barrel-like
architecture, to which almost exclusively helical structural elements contribute.
FTase recognizes the shorter substrate farnesyldiphosphate 26.63 specifically
because the floor of its binding pocket is defined by Trp102β and Tyr365β. After
successful binding of the prenyl substrate, the peptide chain with the tetrapeptidic
C-terminal CAAX diffuses into the catalytic site. The prenyl substrate provides
a large interaction surface for the incoming peptide substrate.
The farnesyl chain must move to the peptide substrate for the actual reaction.
The CAAX substrate occupies the fourth coordination site on the zinc ion with the
thiol group of its cysteine. It binds with its hydrophobic aliphatic side chain A2 into
the preformed binding pocket of the enzyme. The A1 side chain protrudes into the
surrounding solvent. In the structure shown in Fig. 26.26, a methionine occupies the
X position and the C-terminal carboxylate group forms a hydrogen bond to
Gln167α. The prenyl group is transferred to the peptide chain by a nucleophilic
attack of the cysteine in the substrate onto the carbon atom next to the diphosphate
group. The prenylated substrate 26.66 diffuses out of the catalytic center. Interest-
ingly, this is the rate-determining step. There are indications that a new substrate
molecule is necessary to displace the product from the enzyme. For this, the product
molecule takes on a new position and binds in an area of the binding pocket,
through which it leaves the reaction site.
In line with the reaction mechanism, different concepts for the development
of inhibitors for this enzyme have been pursued. The first attempts aimed at competing
with the isoprenoid diphosphate binding. For example, the isoprenoid analogue
α-hydroxyfarnesylphosphonic acid 26.67 occupies the binding pocket comparably to
farnesyldiphosphate and forms extensive interactions with the enzyme as well as with
the CAAX peptide substrate. The second and most often-used strategy is the displace-
ment of the peptide substrate from the binding site. This goal can be achieved by the
development of peptidomimetics. An example is L-739750 26.68, an ester prodrug that
caused the regression of tumors in rats without systemic toxicity (Fig. 26.27).
It was also possible to completely depart from peptide lead structures. Examples
are R115777 (tipifarnib) 26.69 from Janssen Pharma or BMS-214662 26.70 from
Bristol-Myers Squibb. Both use their imidazole groups to coordinate to the zinc ion.
A superposition of BMS-214662 26.70 with the peptide substrate 26.66 is shown
in Fig. 26.28. Compound 26.70 replaces the isopropyl group of the peptide in the

[Fig. 26.27: chemical structures of the following compounds]
26.67 α-Hydroxyfarnesylphosphonic acid
26.68 L-739750 (R = H or R = iPr)
26.69 R115777 (tipifarnib)
26.70 BMS-214662
26.71 ABT-839
26.72 Lonafarnib
26.73 L-778123

Fig. 26.27 Development of compounds for the inhibition of FTase. Compound 26.67 represents
a competitive inhibitor for farnesyldiphosphate 26.63. Compounds 26.68–26.73 are inhibitors that
compete with the tetrapeptide substrate, CAAX. Only some of them (26.68–26.70, 26.73)
use their functional groups (e.g., imidazole rings) to block the zinc ion in the catalytic site.
Compounds 26.71 and 26.72 inhibit FTase without direct coordination to the Zn2+. Compound
26.73 blocks FTase and GGTase equipotently.

Fig. 26.28 Crystal structure of 26.70 (violet); 26.70, which has a completely non-peptidic
structure, mimics the binding mode of the peptide substrate 26.66. It coordinates to the catalytic
zinc ion with its imidazole group. The hydrophobic benzyl group and the thiophene group replace
the A1 and A2 side chains in the natural substrate. The binding area of the terminal amino acid
X (here methionine) remains unoccupied by 26.70.

A1 position with its thiophene ring. The inhibitor uses its benzyl group for the A2
position to emulate the side chain of the isoleucine. With ABT-839, Abbott has
found a compound that undergoes no coordination to the zinc ion at all. It carries
a methionine group at the end that is very similar to the peptide tail in position X of
the natural substrate. Lonafarnib 26.72, a tricyclic derivative, was developed at
Schering-Plough; its urea group orients into the binding area over which the
processed substrate leaves the binding pocket. This inhibitor also blocks the
enzyme without coordinating to the zinc ion. The compounds 26.68–26.72 all
show a selectivity advantage for FTase. Merck has developed the non-peptide
structure 26.73 that strongly inhibits both FTase and GGTase I. Of course
a strategy can be followed here, as with COMT (Sect. 26.9), that pursues the
simultaneous displacement of both substrates from the binding pocket. The
bisubstrate-analogue inhibitors have to contend with the problem that they must
be very large to successfully compete with the two large substrates.
Clinical studies on the non-peptidic farnesyltransferase inhibitors 26.68–26.73
are not advanced enough to be judged. Monotherapy with these inhibitors delivered
a rather disappointing picture, although very promising results have been seen
for tipifarnib 26.69 for the treatment of breast cancer. We must wait and see
whether FTase inhibitors find application in tumor therapy as a monotherapy or
whether they are more efficiently used with other cytostatic and hormone drugs.
Most recently, however, a new field has been opened for FTase inhibitors in drug
development. It seems that they are potential lead structures for the treatment of
infectious diseases that are caused by pathogenic microorganisms such as Plas-
modium (malaria), Trypanosoma (African sleeping sickness and Chagas disease),
and Leishmania (leishmaniasis, kala-azar). Causative agents of fungal diseases
such as Candida albicans can also be fought in this way. Evidently, the posttranslational
prenylation of their proteins is an essential step in the lifecycles of
these organisms. We can hope that the sequence differences in the transferases
are adequately large compared to the human enzymes to develop selective
compounds.

26.11 Synopsis

• Proteins can be modified after translation in the ribosome by the attachment of
groups such as phosphate, methyl, or acetyl, and larger building blocks such as
prenyl moieties or polypeptide chains such as ubiquitin or SUMO.
• Kinases transfer phosphate groups from ATP to the hydroxyl groups of Ser, Thr,
or Tyr residues or the imidazole group of His. This switches the function of the
modified protein substrates on; phosphatases can reverse this step by cleaving
the phosphate group off again from a phosphorylated amino acid residue.
• The more than 530 human kinases act as switches in signaling cascades and thus
seem very attractive as putative drug targets. However, their substrate ATP
is present in high concentrations in cells, it is recognized by multiple proteins,
often with other functions, and Nature has established many processes involving
kinases redundantly as a fail-safe. This makes selective competitive inhibition of
kinases at the ATP-binding site a difficult task.
• Kinases are rather flexible proteins that adapt to their substrates. The adenine
moiety of ATP is recognized by a peptide strand in the hinge region. Pockets are
found adjacent to the ATP binding site and they are called front and back pocket.
They are not involved in ATP recognition but they can be exploited to endow
competitive inhibitors with the required selectivity.
• Inhibitors are profiled against the kinase family (kinome) and exhibit either high
selectivity against individual members or show promiscuous binding to individ-
ual branches of the kinase family tree. Interestingly, introduction of inert metal
centers that expand the basic coordination architecture to attach pharmacophoric
groups can succeed in producing highly selective compounds.
• Imatinib and its follow-up compound nilotinib bind to the inactive conformation
of BCR-ABL kinase. They present an entirely new approach to cancer therapy in
curing chronic myeloid leukemia through the inhibition of the product of
a misregulated gene.
• The bump and hole method allows a specific therapeutic validation of the
biological relevance of a target protein as well as the optimization of an inhibitor
class. Genetically, the target protein is modified in its substrate specificity
(e.g., a kinase at its gatekeeper residue) and implemented into a model organism.
Selective inhibition of this protein under in vivo conditions is achieved via
inhibitors that are adapted to the modified binding site of the engineered protein.
• Phosphatases remove phosphate groups from Ser, Thr, Tyr, and His residues thus
switching off the function of the substrate protein. Two catalytically different enzyme
classes are known, either operating through nucleophilic attack of a water molecule,
which is highly polarized by two adjacent metal ions, or through the nucleophilic
attack of a cysteine residue via a pathway similar to that in cysteine proteases. In both
cases the tetrahedral phosphorus atom in the phosphate group is attacked.
• PTP-1B initially appeared to be an ideal target to treat the metabolic syndrome
because it dephosphorylates the insulin receptor kinase. Potent
inhibitors of this target with challenging druggability could be developed;
however, sufficient selectivity with respect to another phosphatase, TCPTP,
could not be achieved. Knock-out mice were unable to survive if the genes of both
phosphatases were simultaneously turned off. A similar life-threatening situation can be
anticipated with insufficiently selective inhibitors.
• Catechol-O-methyl transferase is representative of the family of methyl transferases
using S-adenosyl-L-methionine as cofactor for methyl transfer via its
sulfonium group. It transfers methyl groups to catecholamines such as dopa-
mine, adrenaline, or noradrenaline.
• Inhibition of the methyl transferase reaction is achieved by introduction of
strong electron-withdrawing groups, such as nitro groups, at the aromatic ring
of the natural substrates, producing substrate-like inhibitors.
• Farnesyl and geranylgeranyl transferases transfer prenyl anchor groups onto
protein substrates exhibiting a CAAX sequence on their C terminus. The phos-
phorylated prenyl anchor is attacked by the nucleophilic cysteine thiol group,
which is further polarized through the coordination to a neighboring zinc ion in
the catalytic center.
• Inhibitors of farnesyl and geranylgeranyl transferases bind competitively either
to the CAAX peptide substrate or the prenyldiphosphate substrate binding site.
Some of them show strong peptidomimetic character and involve coordination
of the zinc ion. However, completely non-peptidic inhibitors have also been
developed, some of which bind without zinc coordination.

Bibliography

General Literature

Alaimo PJ, Shogren-Knaak MA, Shokat KM (2001) Chemical genetic approaches for the eluci-
dation of signalling pathways. Curr Opin Chem Biol 5:360–367
Bialy L, Waldmann H (2005) Inhibitors of protein tyrosine phosphatases: next-generation drugs?
Angew Chem Int Ed 44:3814–3839
Bonifacio MJ, Palma PN, Almeida L, Soares-da-Silva P (2007) Catechol-O-methyltransferase and
its inhibitors in Parkinson’s disease. CNS Drug Rev 13:352–379
Bridges AJ (2001) Chemical inhibitors of protein kinases. Chem Rev 101:2541–2571
Cowan-Jacob SW, Guez V et al (2004) Imatinib (STI571) resistance in chronic myelogenous
leukemia: molecular basis of the underlying mechanisms and potential strategies for treatment.
Mini Rev Med Chem 4:285–299
Fabian MA, Biggs WH et al (2005) A small molecule-kinase interaction map for clinical kinase
inhibitors. Nat Biotechnol 23:329–336
Klebl BM, Müller G (2005) Second-generation kinase inhibitors. Expert Opin Ther Targets
9:975–993
Klebl B, Müller G, Hamacher M (2011) Protein kinases as drug targets. In: Mannhold R,
Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry, vol 49. Wiley,
Weinheim
Kubinyi H, Müller G (eds) (2004) Chemogenomics in drug discovery. A medicinal chemistry
perspective. Wiley, Weinheim
Lane KT, Beese LS (2006) Structural biology of protein farnesyltransferase and
geranylgeranyltransferase type I. J Lipid Res 47:681–699
Strickland CL, Weber PC (1999) Farnesyl protein transferase: a review of structural studies.
Curr Opin Drug Discov Devel 2:475–483

Special Literature

Bishop AC, Ubersax JA et al (2000) A chemical switch for inhibitor sensitive alleles of any protein
kinase. Nature 407:395–401
Cowan-Jacob SW, Fendrich G et al (2007) Structural biology contributions to the discovery of
drugs to treat chronic myelogenous leukaemia. Acta Crystallogr D63:80–93
Lerner C, Masjost B et al (2003) Bisubstrate inhibitors for the enzyme catechol-O-
methyltransferase (COMT): influence of inhibitor preorganization and linker length between
the two substrate moieties on binding affinity. Org Biomol Chem 1:42–49
Madhusudan, Akamine P, Xuong N-H, Taylor SS (2002) Crystal structure of a transition state
mimic of the catalytic subunit of cAMP-dependent protein kinase. Nat Struct Biol 9:273–277
Meggers E, Atilla-Gokcumen GE et al (2007) Exploring chemical space with organometallics:
ruthenium complexes as protein kinase inhibitors. Synlett 8:1177–1189
Puius YA et al (1997) Identification of a second aryl phosphate-binding site in protein-tyrosine
phosphatase 1B: a paradigm for inhibitor design. Proc Natl Acad Sci U S A 94:13420–13425
Szczepankiewicz BG et al (2003) Discovery of a potent, selective protein tyrosine phosphatase 1B
inhibitor using a linked-fragment strategy. J Am Chem Soc 125:4087–4096
Vidgren J, Svensson LA, Liljas A (1994) Crystal structure of catechol-O-methyltransferase.
Nature 368:354–358
27 Oxidoreductase Inhibitors

Chemical reactions that occur via the exchange of electrons are termed redox
reactions. Normally the carbon atom changes its oxidation state in biochemical
redox processes. As a general rule, derivatives with a significant number of directly
bound hydrogen atoms are transformed into derivatives with a larger number of
contacts to nitrogen, oxygen, and sulfur in oxidations. Because these bonds to the
above-mentioned electronegative elements are usually associated with the intro-
duction of polar functional groups, redox reactions exert a decisive influence on the
physicochemical properties of the oxidized substances. For example, the water
solubility is increased. This is of great importance for the elimination of xenobi-
otics. Cytochrome P450 enzymes, a large group of oxidizing enzymes, are involved
in the corresponding metabolic transformations. On the other hand, reductions
are of crucial importance for the organism too. In these reaction steps, reactive
aldehydes or ketones are transformed into alcohols, which subsequently are more
easily conjugated and eliminated (▶ Sect. 8.1). Transition metals, which can adopt
a variety of oxidation states, are predestined to serve as electron donors and
acceptors in redox reactions. In biological systems, one transition metal, iron, is
often used for this task. Once incorporated in a protoporphyrin ring scaffold, it
exists in a penta- or hexacoordinate state and can take on oxidation states
between +2 and +4. Moreover, it participates in complexes with sulfur. There it
forms interesting multinuclear structures: the so-called iron–sulfur clusters. In addi-
tion to iron, copper also plays a role as a mediator of biochemical redox processes.
Nature uses so-called cofactors for enzyme-catalyzed redox reactions. They are
embedded in the specific environment of a protein, and, shielded from the sur-
rounding solvent, they accomplish the electron or hydride ion transfer from the
group being oxidized to the group being reduced. Cofactors can be tightly coupled
to the protein. In these cases, they are referred to as prosthetic groups and do not
leave the enzyme during the reaction. Other loosely bound cofactors can be taken
up by the protein, just as the substrate is, chemically altered, and finally released
again. These cofactors must be regenerated for the next redox reaction cycle in
another independent reaction.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_27,
© Springer-Verlag Berlin Heidelberg 2013

This chapter addresses the oxidoreductase enzyme class. These enzymes are
involved in numerous electron-transfer reactions and need electrons or hydrogen in
the form of hydride ions. These particles are exchanged with cofactors such as
NAD(P)+/NAD(P)H (nicotinamide adenine dinucleotide (phosphate)), the flavin nucleotides
FMN (flavin mononucleotide) and FAD (flavin adenine dinucleotide), and the
already mentioned iron atom in the heme group. Because many of these enzymes
are involved in processes related to the development of disease, multiple
drugs act by inhibiting these enzyme systems.

27.1 Redox Reactions in Biological Systems Use Cofactors

As already mentioned, enzymes employ cofactors for the transfer of electrons or
hydride ions in redox reactions. NAD+/NADP+ 27.1 (nicotinamide adenine dinucleotide
(phosphate)) and NADH/NADPH 27.2 serve as hydride ion acceptors and
donors (Fig. 27.1). This cofactor is composed of three components: the nicotin-
amide with an attached ribose ring, the central diphosphate unit, and the adenosine
moiety. The last-mentioned group can carry a phosphate moiety on the 2’-OH
group, and then it is referred to as NADP+/NADPH. The redox-active part is the
nicotinamide moiety, a pyridine derivative. During the oxidation, the positively
charged NADP+ accepts a hydride ion (H−) in the 4-position of the pyridine ring. An
H− ion is released from the same position during the reverse reaction. Altogether, two
electrons are transferred. NAD(P)+ is loosely bound to the enzyme. It can be easily
exchanged and regenerated for a subsequent reaction cycle on another protein.
Typical oxidation and reduction reactions as they can occur in a dehydrogenase or
a reductase are outlined in Fig. 27.2.
A direct contact is formed between the group being oxidized or reduced and the
nicotinamide ring in the binding pocket of such an enzyme. The binding site in
these proteins is usually shielded from the aqueous solvent by a hydrophobic group,
a loop, or an amino acid lid. On the one hand, this shielding ensures that an unambiguous
stereochemistry is established for the hydride transfer. On the other hand, it excludes
access to protons; otherwise the hydride ion would react with a proton to give elementary
hydrogen, and the enzyme could not reduce the substrate. The interaction
geometry followed during such reaction steps is shown as an example for
dihydrofolate reductase in Fig. 27.3. Even though enzyme-catalyzed reactions are
principally reversible, and the direction of the reaction depends on the concentra-
tion of the cofactors in the environment, with few exceptions, NADP/H is involved
in reduction reactions. Oxidation reactions are almost exclusively carried out by
using NAD/H. The majority of enzymes can distinguish unambiguously between
the cofactors. This is because the additional phosphate group in the 2’-position is
specifically recognized in a binding pocket in the vicinity of the cofactor-binding
pocket as a kind of marker. The majority of NAD(P)H-dependent enzymes have
a structurally similar binding domain. It is composed of a total of four α-helices that
arrange on the top and bottom of a central six-stranded pleated sheet. A topological
switch in the arrangement of the helices from one side to the other occurs in the
[Structural formulas: 27.1 NAD+ (X = H) and NADP+ (X = PO₃²⁻); 27.2 NADH (X = H) and NADPH (X = PO₃²⁻); 27.1 and 27.2 are interconverted by uptake (+H−) or release (−H−) of a hydride ion]

Fig. 27.1 Many enzymatic redox reactions use NAD+/NADP+ 27.1 (nicotinamide adenine
dinucleotide) and NADH/NADPH 27.2 as a cofactor for the transfer of electrons and/or hydride
ions. It is made up of three components: the nicotinamide, which bears an attached ribose sugar,
the central diphosphate unit, and the adenosine moiety. NAD(H) and NADP(H) differ in the
phosphate group (blue) at the 2’-OH group of the ribose ring. When a substrate is oxidized, the
positively charged nicotinamide moiety takes on a hydride ion (red) at the 4-position; when
a substrate is reduced, the H− ion is released from this position.

[Reaction schemes: malate + NAD+ → oxaloacetate + NADH + H+ (malate dehydrogenase); aldehyde precursor + NADPH + H+ → homoserine + NADP+ (homoserine dehydrogenase)]

Fig. 27.2 Examples for an oxidation reaction with malate dehydrogenase (top) and for
a reduction with homoserine dehydrogenase (bottom). The transformation of a hydroxyl group
into a ketone function or vice versa (red) is carried out in both reactions.

middle of the pleated sheet (Fig. 27.4). The binding of the charged diphosphate
group to the conserved nucleotide-binding moiety occurs in an extension of this
position. The folding motif is termed “Rossmann fold” to honor its discoverer,
Michael Rossmann. We will return to this nucleotide-binding domain with the
enzymes dihydrofolate reductase, HMG-CoA reductase, and 11β-hydroxysteroid
dehydrogenase in the next sections. Other folding motifs can also make a binding
site for the NADPH cofactor available. A TIM barrel (▶ Sect. 14.3) is used for the
binding of this cofactor in aldose reductase (Sect. 27.4). Two protein superfamilies
[Reaction scheme and crystal structure: hydride transfer from NADPH to the substrate DHF, giving NADP+; labels NADPH and DHF]

Fig. 27.3 The stereochemically unambiguous transfer of a hydride ion from the NADPH cofactor
to the double bond of the substrate being reduced is accomplished deep in the protein’s binding
pocket. Crystal structure determination of the enzyme dihydrofolate reductase with bound
dihydrofolic acid (DHF) and cofactor (NADPH) provided detailed information about the course
of the reduction step. The two reaction sites come spatially very close to one another in the
structure. A hydride ion is transferred from the 4-position of the reduced nicotinamide ring onto
the neighboring double bond of the DHF substrate (violet line).

are known that can reduce or oxidize carbonyl compounds in biological systems.
The first group encompasses the aldo–keto reductases, to which aldose reductase
belongs as a representative. The second superfamily contains the short-chain
dehydrogenases/reductases, to which 11β-hydroxysteroid dehydrogenase (Sect.
27.5) belongs.
Flavoproteins use FMN 27.3 and FAD 27.4 as cofactors (Fig. 27.5). They are
derived from vitamin B2, riboflavin. FAD is composed of an adenosine that is
Fig. 27.4 In most NAD(P)H-dependent enzymes, the cofactor binds to the so-called Rossmann fold
in a structurally conserved domain. This forms a central six-stranded pleated sheet with at least
four α-helices on the top and bottom sides. The topological change of the helices from one side to
the other takes place in the middle of the pleated sheet. The charged diphosphate unit of the
cofactor (yellow arrow) binds in an extension to this position.

[Structural formulas: 27.3 FMN and 27.4 FAD (extension in blue); interconversion of enzyme-bound FAD (oxidized form) and FADH2 (reduced form) by 2 H+, 2 e−]

Fig. 27.5 Flavoproteins use FMN 27.3 and FAD 27.4 (extended by the blue part) as a cofactor.
FAD is composed of an adenosine moiety, a diphosphate bridge with the carbohydrate alcohol
ribitol, and the tricyclic isoalloxazine ring. This tricyclic heterocycle represents the redox-active
part of the molecule and can accept one or two electrons from the substrate or donate them to it.
The cofactor is anchored to the enzyme very tightly but reversibly, and in some cases via a covalent bond.
coupled by a diphosphate bridge and the carbohydrate alcohol ribitol to the tricyclic
isoalloxazine ring, which represents the redox-active part of the cofactor
(Fig. 27.5). This group can be reversibly reduced and oxidized in an exchange of
one or two electrons with the substrate. Usually two redox equivalents are transferred,
but the reaction can also stop at the semiquinone stage, which is a stable radical.
Radical intermediates thus form along the course of the reaction pathway. To avoid
damaging the cells with these reactive species, flavin cofactors never exist in free
solution. Instead, they are tightly anchored in the interior of the enzymes, in some
cases covalently. The flavin-dependent monoamine oxidases MAO-A and MAO-B are
introduced in Sect. 27.8. Many therapeutic inhibitors act on these enzymes by
irreversibly binding to the isoalloxazine ring and in so doing prevent the redox
processes.
The heme group 27.5 must be mentioned as a third important cofactor that
occurs particularly in proteins that use oxygen as an oxidant. The heme group
(Fig. 27.6) exists in cytochrome P450 enzymes, in cyclooxygenases, and in the
oxygen-transporting proteins such as hemoglobin and myoglobin. A central iron
atom is embedded in a protoporphyrin system. It coordinates to four pyrrole rings in
a planar geometry. The fifth apical position is occupied by a histidine or a cysteine.
The electron transfer or oxidative attack of a bound oxygen molecule is accom-
plished through the sixth coordination site on the iron ion. The iron changes its
oxidation state during the redox reaction. The heme group remains permanently
bound to the protein.
The properties of the iron ion, a good partner for coordinating ligands, can be
exploited to inhibit heme-containing proteins. Small molecules such as carbon
monoxide or cyanide ions can thus become attached to the sixth coordination
position. This blocks the function of the protein and is responsible for the toxicity
of both of these compounds. CO inhibits binding of O2 to hemoglobin and prevents
oxygen transport in blood, while cyanide reacts with the iron in cytochromes in the
respiratory chain. Heterocycles such as imidazoles or triazoles can also coordinate
to the iron ion. Potent antimycotics such as fluconazole 27.6 or ketoconazole 27.7
(Figs. 27.6 and 27.7) also follow this principle. Metyrapone 27.8, a drug used in the
diagnosis of adrenal insufficiency, is likewise a potent inhibitor of many P450
enzymes. Many natural products have been described as cytochrome inhibitors, for
example, the flavonoid naringenin 27.9, which gives grapefruit its bitter taste.

27.2 Chemotherapeutics for Cancer and Bacteria: Dihydrofolate Reductase Inhibitors

Together with thymidylate synthase and serine transhydroxymethylase,
dihydrofolate reductase forms a synthetic cycle that catalyzes the biosynthesis of
thymine (Fig. 27.8). Thymine is a pyrimidine base that represents a critical component
of DNA (▶ Sect. 14.9). Initially the nucleotide deoxyuridylate is methylated
by thymidylate synthase. The methyl group originates from the enzyme
cofactor, methylenetetrahydrofolate 27.10. After successful methyl group transfer,
[Structural formulas: 27.5 heme; 27.6 fluconazole; 27.7 ketoconazole (CYP 3A4 Ki = 15 nM); 27.8 metyrapone; 27.9 naringenin]

Fig. 27.6 The heme group 27.5 occurs as a cofactor in proteins that use oxygen as an oxidant.
An iron atom is embedded in a protoporphyrin system in a square-pyramidal or octahedral
geometry. The four pyrrole rings form a plane. The fifth apical position is occupied by a
histidine or a cysteine, and the sixth position is coordinated by a reactive-oxygen species.
This binding site can be blocked by a nitrogen-containing heterocycle such as a triazole or
imidazole ring in fluconazole 27.6, ketoconazole 27.7, or by pyridine rings as in metyrapone
27.8. Even natural products such as the flavonoid naringenin 27.9 represent examples of cyto-
chrome inhibitors.

the cofactor leaves the enzyme as dihydrofolate 27.11 and must be reduced to
tetrahydrofolate 27.12. Dihydrofolate reductase (DHFR) accomplishes this task.
As the carrier of genetic information, DNA is produced in increased quantities
when a high level of cell division is necessary. Cancer represents one example of
increased cell proliferation. Bacterial cells also divide at an increased
Fig. 27.7 Fluconazole 27.6 is an antimycotic and blocks the sixth coordination site on the iron ion
of a cytochrome P450 enzyme. The indicated binding geometry was determined by crystallography.

[Reaction cycle: thymidylate synthase converts dUMP into dTMP while 27.10 becomes 27.11; dihydrofolate reductase reduces 27.11 to 27.12 using NADPH + H+ and releasing NADP+; serine transhydroxymethylase regenerates 27.10 from 27.12 while converting serine into glycine]

Fig. 27.8 Dihydrofolate reductase, thymidylate synthase, and serine transhydroxymethylase
make up the synthetic cycle for the biosynthesis of thymine (TMP) from uracil (UMP). The
methyl groups to be transferred (red) are provided by methylenetetrahydrofolate 27.10, which is
regenerated via dihydrofolate 27.11 and tetrahydrofolate 27.12. The double bond marked in red is
hydrogenated by dihydrofolate reductase.
[Structural formulas with binding data:
27.13 X = N, R = CH3, methotrexate, Ki = 4.8 pM
27.14 X = N, R = H, aminopterin, Ki = 3.7 pM
27.15 X = C, R = C2H5, edatrexate, Ki = 11 pM
27.16 R = H, Ki = 34 pM
27.17 R = CH3, Ki = 2100 pM]

Fig. 27.9 Inhibitors 27.13–27.17 of human DHFR that are used as chemotherapeutics in
cancer therapy.

rate during infections. Therefore the inhibition of this enzyme in the synthesis cycle
represents a point of attack for the chemotherapy of tumor disease. If the target is
an enzyme of a bacterial organism, a compound with a bacteriostatic effect is
obtained. Dihydrofolate reductases from different species are rather small enzymes.
Depending on their origin, they are composed of between 150 and 260 amino acids.
The substrate dihydrofolate 27.11 is composed of a pteridine ring, a central para-
aminobenzoic acid, and a terminal L-glutamate moiety. The hydrogenation of the
5,6-double bond in the pteridine ring occurs stereospecifically by the attack of
a hydride ion with subsequent addition of a proton onto the neighboring nitrogen
atom. The mechanism is shown in Fig. 27.3 in detail.
Early on, before the first crystal structure of this enzyme was determined in the
group of Joseph Kraut in San Diego in 1982, methotrexate 27.13 was known to be
a potent dihydrofolate reductase inhibitor (Fig. 27.9). Aminopterin 27.14 and
edatrexate 27.15 were described as analogues. Chemically, they appear to be very
similar to the natural substrate dihydrofolate 27.11. Nonetheless, a decisive
exchange of a hydrogen bond acceptor group for a donor group on the heterocycle
is apparent. As explained in detail in ▶ Sect. 17.6, this causes a 90° twist in the
orientation of this moiety in the binding pocket of the reductase. Therefore, intimate
contact with the reduced nicotinamide group of the NADPH cofactor and the
double bond of the bound ligand cannot take place. A transformation is impossible;
the enzyme is blocked.
Methotrexate is a potent chemotherapeutic that is used in cancer therapy to treat
breast tumors, sarcomas, acute lymphatic leukemia, and non-Hodgkin lymphomas.
Both the natural substrate and methotrexate are very polar compounds and must be
transferred into the cell through the reduced folate carrier (RFC). Then the ligands
are augmented with additional glutamic acid residues. A prerequisite for good and
efficient inhibition of DHFR in cancer therapy is therefore not only a strong binding
to the reductase but also a highly specific uptake through the transporter. For
example, the derivatives 27.16–27.17, which were obtained by replacing the central
phenyl ring with the attached amide bond of methotrexate with a benzolactam
group (Fig. 27.9), indeed have somewhat poorer binding constants to DHFR.
[Structural formulas: 27.18 pyrimethamine; 27.19 trimethoprim; 27.20 piritrexim; 27.21 trimetrexate; 27.22 epiroprim; 27.23 cycloguanil]

Fig. 27.10 Bacteriostatic inhibitors 27.18–27.23 of bacterial dihydrofolate reductase.

This is, however, compensated for by an improved affinity to the RFC transporter so
that the tumor growth can be suppressed by these compounds equipotently. The
RFC transporter and the highly potent binding of folic acid analogues to this
receptor offer yet another perspective for tumor therapy. This transporter is
expressed on malignant cells to ensure that their increased need for folic acid is
met. Because of the tight binding of folic acid derivatives to this receptor and the
subsequent internalization of its substrates, there is a possibility that folic acid
derivatives could carry an additional molecular freight, which would be
piggybacked into the cell. Upon arrival, this freight could be chemically unloaded
and could, if it were a potent cancer therapeutic, unleash its destructive effects in
the interior of the tumor cell.
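The compensation between weaker DHFR binding and better transporter affinity described above for the benzolactam derivatives can be made explicit with a short calculation. The following sketch uses the values of Table 27.1 (uncertainties omitted) and expresses each quantity relative to methotrexate 27.13. The function and variable names are illustrative assumptions, and only ratios are formed, so the units of the table cancel.

```python
# Values transcribed from Table 27.1 (uncertainties omitted).
# Only ratios are computed, so the units of each column cancel.
TABLE_27_1 = {
    "27.13": {"dhfr_ki": 4.8, "rfc_ki": 4.7, "ic50": 14.0},
    "27.14": {"dhfr_ki": 3.7, "rfc_ki": 5.4, "ic50": 4.4},
    "27.16": {"dhfr_ki": 34.0, "rfc_ki": 0.28, "ic50": 5.1},
    "27.17": {"dhfr_ki": 2100.0, "rfc_ki": 1.1, "ic50": 140.0},
}


def fold_change(compound, reference="27.13"):
    """Return value(compound) / value(reference) for every column.

    A ratio above 1 means the compound is weaker than the reference in
    that column, a ratio below 1 means it is stronger.
    """
    ref = TABLE_27_1[reference]
    cmp_values = TABLE_27_1[compound]
    return {key: cmp_values[key] / ref[key] for key in ref}


if __name__ == "__main__":
    for name in ("27.16", "27.17"):
        ratios = fold_change(name)
        print(name, {key: round(value, 3) for key, value in ratios.items()})
```

For 27.16 the DHFR ratio is above 1 (weaker enzyme binding), whereas the RFC ratio and the cell-growth ratio are both below 1, which is the compensation the text describes.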
In addition to chemotherapeutics for tumor therapy, bacteriostatic inhibitors
such as trimethoprim 27.19 are directed against the corresponding bacterial
enzymes. A few of these non-classical antifolate inhibitors (27.18–27.23) are listed
in Fig. 27.10. Structurally, a relationship to the natural substrate is apparent. The
first heterocycle is the same as in methotrexate so that an identical binding mode for
this moiety is observed. In the DHFRs of all species, an aspartate or
a glutamate is conserved that interacts with the positively charged
nitrogen atom in the ring and the exocyclic 3-amino group. The amino group in
the 1-position finds interaction partners in two carbonyl groups in the protein
backbone (▶ Figs. 17.7 and ▶ 17.12). In contrast to methotrexate, trimethoprim-
like antibiotics display a more strongly hydrophobic group as the second ring
moiety. This grouping is decisive for the selective inhibition of DHFRs in bacteria.
At therapeutic doses, trimethoprim inhibits bacterial but not human dihydrofolate
reductase. For bacteria, the inhibitory concentrations range from a factor of
60 (Neisseria gonorrhoeae, the causative agent of gonorrhea) to 50,000
(the intestinal bacterium Escherichia coli) times lower than for human DHFR.
Table 27.1 Binding constants of a few dihydrofolate reductase (DHFR) inhibitors for the
human enzyme and the RFC transporter and cell-growth inhibition in tumor tissue
Compound   DHFR Ki (pM)   RFC Ki (µM)    Cell growth IC50 (nM, 72 h)
27.13      4.8 ± 0.45     4.7 ± 1.3      14 ± 2.6
27.14      3.7 ± 0.35     5.4 ± 0.09     4.4 ± 0.10
27.16      34 ± 3.0       0.28 ± 0.10    5.1 ± 0.25
27.17      2100 ± 200     1.1 ± 0.11     140 ± 5.0

Table 27.2 Dissociation constants Kd of trimethoprim 27.19 for dihydrofolate reductase from
different species
Species                                          Kd (nM)
Escherichia coli                                 0.02
Escherichia coli, Gln118 mutant                  0.09
Escherichia coli, Arg28/Gln118 double mutant     3.8
Lactobacillus casei                              0.4
Neisseria gonorrhoeae                            15
Chicken                                          3,500
Mouse                                            3,500
Cattle                                           330
Human                                            1,000

This enormous specificity is initially puzzling because trimethoprim binds to all of
these enzymes in an entirely comparable way. Even the amino acids that are
directly involved with ligand binding possess very similar physicochemical
characteristics (Table 27.2).
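The selectivity factors quoted for trimethoprim follow directly from the dissociation constants in Table 27.2. The sketch below forms the ratio Kd(human)/Kd(species) and converts it into a difference in binding free energy via ΔΔG = RT ln(ratio). The temperature of 298.15 K and all names in the code are assumptions made for illustration; they are not taken from the text.

```python
import math

# Dissociation constants of trimethoprim in nM, transcribed from Table 27.2.
KD_NM = {
    "Escherichia coli": 0.02,
    "Lactobacillus casei": 0.4,
    "Neisseria gonorrhoeae": 15.0,
    "Cattle": 330.0,
    "Human": 1000.0,
    "Chicken": 3500.0,
    "Mouse": 3500.0,
}

R_KJ = 8.314e-3  # gas constant in kJ/(mol K)


def selectivity(species, reference="Human"):
    """Factor by which trimethoprim binds `species` DHFR more tightly than `reference` DHFR."""
    return KD_NM[reference] / KD_NM[species]


def ddg_kj_per_mol(species, reference="Human", temperature_k=298.15):
    """Free-energy difference RT ln(selectivity); the temperature is an assumed value."""
    return R_KJ * temperature_k * math.log(selectivity(species, reference))


if __name__ == "__main__":
    for name in ("Neisseria gonorrhoeae", "Escherichia coli"):
        print(f"{name}: factor {selectivity(name):.0f}, "
              f"ddG {ddg_kj_per_mol(name):.1f} kJ/mol")
```

With these numbers the factor for Escherichia coli is 50,000 and that for Neisseria gonorrhoeae is between 60 and 70, in line with the range given in the text.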
An explanation for this observation comes from evidence with mutants of
Escherichia coli in which the amino acids binding the inhibitor remained unchanged,
but that nevertheless bind trimethoprim more poorly than wild-type DHFR. In fact
the inhibitor binds with an unchanged geometry. Trimethoprim is positively charged
at physiological pH values. Charges in the binding-site environment should therefore
decisively influence the affinity of these ligands. The negatively charged Glu118 is
exchanged for a neutral glutamine in one of the DHFR mutants. Despite a distance of
about 15 Å between the oppositely charged groups, the absence of a negative charge
in the extended environment causes a loss in affinity by a factor of 4–5. An even more
pronounced effect is seen in the double mutant, in which a leucine that is about
8 Å away is additionally exchanged for a positively charged arginine. Because of
this additional unfavorable change in charge, the affinity is lower than in
the wild type by a factor of about 200 (Table 27.2).
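To get a feeling for the size of these long-range effects, the affinity losses of the mutants can be converted into free-energy differences and compared with a simple Coulomb estimate. The sketch below is a rough illustration only: it treats the ligand and the residue as point charges in a medium with a uniform effective dielectric constant, assumes a temperature of 298.15 K, and uses names chosen for this example. Real proteins are electrostatically far less uniform.

```python
import math

R_KJ = 8.314e-3                # gas constant in kJ/(mol K)
COULOMB_KJ_ANGSTROM = 1389.35  # e^2 / (4 pi eps0) per mole, in kJ Å / mol

# Dissociation constants in nM, transcribed from Table 27.2.
KD_WILD_TYPE = 0.02
KD_GLN118 = 0.09
KD_DOUBLE_MUTANT = 3.8


def affinity_loss(kd_mutant, kd_wild_type=KD_WILD_TYPE):
    """Factor by which the mutant binds more weakly than the wild type."""
    return kd_mutant / kd_wild_type


def ddg_kj_per_mol(kd_mutant, kd_wild_type=KD_WILD_TYPE, temperature_k=298.15):
    """Free-energy penalty RT ln(Kd_mutant / Kd_wt); the temperature is assumed."""
    return R_KJ * temperature_k * math.log(affinity_loss(kd_mutant, kd_wild_type))


def coulomb_energy(distance_angstrom, dielectric, q1=1.0, q2=-1.0):
    """Point-charge interaction energy in kJ/mol (negative means attractive)."""
    return COULOMB_KJ_ANGSTROM * q1 * q2 / (dielectric * distance_angstrom)


def implied_dielectric(ddg, distance_angstrom):
    """Effective dielectric constant at which losing one unit-charge pair costs `ddg`."""
    return COULOMB_KJ_ANGSTROM / (distance_angstrom * ddg)


if __name__ == "__main__":
    ddg_single = ddg_kj_per_mol(KD_GLN118)
    print(f"Gln118 mutant: factor {affinity_loss(KD_GLN118):.1f}, "
          f"ddG {ddg_single:.1f} kJ/mol")
    print(f"Double mutant: factor {affinity_loss(KD_DOUBLE_MUTANT):.0f}, "
          f"ddG {ddg_kj_per_mol(KD_DOUBLE_MUTANT):.1f} kJ/mol")
    print(f"Implied effective dielectric at 15 Å: "
          f"{implied_dielectric(ddg_single, 15.0):.0f}")
```

Under these assumptions the loss of the Glu118 charge at about 15 Å corresponds to an effective dielectric constant of roughly 25, which shows that even a strongly screened charge at this distance is enough to account for a four- to fivefold change in affinity.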
A comparison of chicken DHFR (from the liver) with Escherichia coli DHFR
shows that seven amino acid side chains that are within 10–16 Å of the positively
charged nitrogen atom on the ligand change their charge: two from negative to
neutral, and another five from neutral to positive. The vicinity of the binding site in
chicken DHFR has therefore become unfavorable for the addition of a positively
[Structural formula of 27.24 and crystal structures of Pc-DHFR and murine DHFR with bound 27.24; labeled residues: Asn64, Phe69, Arg75]

Fig. 27.11 The exchange of the 3-methoxy group in trimethoprim 27.19 for an unsaturated
aliphatic side chain in 27.24 gives an affinity that is improved by a factor of 5,000 and selectivity
for the enzyme of the fungal pathogen Pneumocystis jirovecii compared to the mouse enzyme. An Asn64
residue is found in the crystal structure of the vertebrate enzyme in the place where Phe69 is found
in the Pneumocystis enzyme. This exchange to a more hydrophobic and less-charged environment in the
Pneumocystis enzyme allows the selectivity advantage for 27.24.

charged molecule with respect to seven charge units. Correspondingly, it is not only
the direct contacts that are responsible for the strength of the protein–ligand
interaction, but rather the electrostatic interactions in the remote environment.
The 3-methoxy group in trimethoprim was exchanged for an unsaturated, acidic
side chain in another model study (27.24, Fig. 27.11). The modified derivative
shows a significantly improved affinity (factor of 5,000) and therefore selectivity
for the enzyme of the fungal pathogen Pneumocystis jirovecii compared to the
enzyme from rodents. Crystal structures of the inhibitor 27.24 were obtained with
the enzymes from Pneumocystis and from a vertebrate. Asn64 in the vertebrate enzyme is
exchanged for Phe69 in the Pneumocystis enzyme. This change leads to a more strongly
hydrophobic and less-charged environment in the Pneumocystis reductase, and thus
causes an increased selectivity for the modified trimethoprim derivative.
In the Pneumocystis enzyme, a tighter and spatially more favorable contact exists
between the triple bond of the ligand and the aromatic ring of the
phenylalanine. A comparable contact to Asn64 in the rodent enzyme cannot
achieve this contribution.
In the pioneer era of structure-based drug design at the beginning of the 1980s,
DHFR was the model protein par excellence. Therefore much of the expertise that
shapes our current understanding of selectivity phenomena was collected on this
enzyme.
27.3 HMG-CoA Reductase Inhibitors: The Changing Fate of Drug Development

Coronary heart disease (CHD), atherosclerosis, and the concomitant heart attacks
and strokes are among the most common causes of death in the majority of European
countries and in the USA. CHD has multifactorial genetic causes and is also
a typical disease of the developed world. Risk factors are obesity, smoking, high
blood pressure, and elevated fibrinogen and cholesterol levels. High levels of
cholesterol are found in the plaques that constrict and occlude the blood vessels.
The overwhelming academic opinion considers a reduction in the cholesterol
level to be a reasonable treatment strategy so that medications that act in this way
are often prescribed. Cholesterol fulfills different functions in the construction of
the cell membrane (▶ Sect. 4.2) and serves as a starting material for the synthesis of
steroid hormones and bile acids (▶ Sect. 28.3). The brain, the adrenal glands,
skeletal muscle, skin, blood, and the liver have an increased need for cholesterol.
Between 0.9 and 2 g of this substance is required daily. About a third is obtained
from the diet, and the rest is synthesized in the liver.
A group of drugs that inhibit cholesterol biosynthesis are the statins. There is
hardly another class of compounds that illustrates the success and failure of drug
development in pharmaceutical research equally well. There is a thin line between
astronomical financial success from gigantic sales figures and catastrophic crashes
that can bring a company to the edge of financial ruin. The development of the
statins began as far back as the 1950s. The American company Merck & Co. began
intense work on the biochemistry of lipid metabolism. In 1956 Karl Folkers and
Carl Hoffman, both from Merck, discovered mevalonic acid 27.25, an intermediate
in the biosynthesis of cholesterol 27.26 (Fig. 27.12). Nonetheless, the importance of
the substance and the enzyme 3-hydroxy-3-methylglutaryl-coenzyme A reductase
(HMG-CoA reductase), which transforms HMG-coenzyme A into mevalonic
acid, was not recognized at that time. The enzyme reduces the substrate, which is
built up from three acetate units, by using two equivalents of NADPH in the
rate-determining step in the biosynthetic pathway.
As a therapeutic approach to decreasing the cholesterol level, Merck initially
pursued a basic ion-exchange resin (cholestyramine), which has a high affinity to
bile acids. Because bile acids are synthesized from cholesterol, the removal of bile
acids from the intestines causes more cholesterol from the diet to be used for the
replacement of these substrates. Altogether the cholesterol level in blood decreases.
The success of clofibrate 27.27 (Fig. 27.12, ▶ Sect. 28.6) began in the 1960s. This
substance decreases elevated triglyceride levels and, to a lesser extent, the choles-
terol level too. Long-term observations showed, however, that the number of
fatalities in the patient group that was treated with clofibrate was higher than in
the control group. Moreover, cases of liver cancer were observed in animal
experiments.
In 1973 Merck & Co. and other companies began to investigate the influence of
hydroxylated steroids on the biosynthesis of cholesterol. Although these substances
are active in vitro, they were inactive in animal experiments. In the same year the
[Reaction scheme: HMG-coenzyme A is reduced in two steps, each consuming NADPH and releasing NADP+, to 27.25 mevalonic acid; structural formulas of 27.26 cholesterol and 27.27 clofibrate]

Fig. 27.12 3-Hydroxy-3-methylglutaryl-coenzyme-A reductase (HMG-CoA reductase) transforms
HMG-coenzyme A into mevalonic acid 27.25 (blue) by consuming two equivalents of
NADPH. The reaction takes place in two steps. In the first step, the thioester is reduced to
a thiohemiacetal, which is cleaved to mevaldehyde. In the next step another equivalent of
NADPH is used to reduce the newly formed aldehyde function to an alcohol. Finally, mevalonic
acid is transformed via multiple steps to cholesterol 27.26. Clofibrate 27.27 is a PPARα agonist
(▶ Sect. 28.6).

importance of low-density lipoprotein (LDL) was recognized, which is made up
largely of apolipoprotein B-100. It serves as a transport vehicle in plasma for water-
insoluble substances such as cholesterol and carries the largest part of the free-
circulating substance with it. LDL can be easily oxidized. In this form it is taken up
into the arterial wall by macrophages and stored there. This overloading of the
macrophages leads to the formation of foam cells, which can rupture and, coupled
with the coagulation process, can lead to the deposition of plaques and even occlude
the artery entirely. A hardening of the arteries is the consequence (arteriosclerosis).
If such a plaque ruptures, it can obstruct an artery in a different place, and a heart
attack, stroke, kidney insufficiency, or angina pectoris is the consequence. In short,
it can be said that a high LDL level represents an increased health risk for the
formation of atherosclerosis. Interestingly, a high HDL level (high-density lipopro-
tein) on the other hand, is favorable and could even influence the disaggregation of
plaques. High LDL levels are particularly dangerous for those patients with
a genetically caused hypercholesterolemia and an apolipoprotein B defect. These
lead to an extremely high risk of atherosclerosis. A therapeutic approach for the
containment of this risk is the reduction of blood cholesterol level.

Beginning in 1974 Merck developed in vitro cell tests for the evaluation of
cholesterol biosynthesis inhibitors, especially for HMG-CoA reductase. At the same time,
Akira Endo and colleagues at Sankyo in Japan began investigating extracts from
8,000 microorganisms. The most active compound, which was also isolated at
Beecham in England, was compactin 27.28 (mevastatin, Fig. 27.13). Early in
1979 Endo registered a Japanese patent for another microbial HMG-CoA reductase
inhibitor, monacolin K, without knowing its structure. In the fall of 1978 microbial
extracts were being investigated at Merck too. In the second week of the experi-
ment, they found what they were looking for. In February 1979 the compound was
isolated and a patent for lovastatin 27.29 (Fig. 27.13), complete with structural
details, was registered in June of 1979. The substance was identical to monacolin K.
The Merck patent was awarded at the end of 1980 in the USA and later in other
countries as well. In a few countries the patent was awarded to Sankyo instead. The
reason for this varying credit was the different interpretation of the time priorities.
Sankyo registered the patent (first to file) 4 months earlier. Merck was awarded the
patent in the USA and in many other countries because they could demonstrate
a 3-month-earlier date of invention (first to invent).
Merck began clinical studies with lovastatin in April 1980, but these were
discontinued again in September 1980. The reason was rumors that compactin
was causing tumors in dogs. Toxicity studies with lovastatin showed no indication
of this, and the rumors could not be confirmed. Nonetheless, the project was
initially halted. In July 1982, Merck negotiated an agreement with the American
FDA that lovastatin could be clinically used by selected investigators. The use
would be limited to therapy-resistant cases with severely elevated cholesterol levels
because the risk of heart attack and stroke was particularly high in these patients.
The therapeutic effects on the LDL cholesterol level as well as the total cholesterol
level in blood were convincing, and the side effects were minimal. The chronic
toxicology and clinical studies were reinitiated. In November 1986 a licensing
application was made. Altogether, 160 volumes of preclinical and clinical data
were submitted to the FDA. Just 9 months later the drug was approved, and the
compound developed into a blockbuster with billions in sales.
Years later, the crystal structure of the target enzyme, HMG-CoA reductase, was
determined. The enzyme is a tetramer in its active form. Each monomer is composed
of three domains. The N-terminal domain has an anchor that fixes the enzyme
to the membrane of the endoplasmic reticulum. The smaller S domain that
contains the binding site of the reduced NADP(H) is nested in the larger
L domain. The S domain adopts the geometry of a Rossmann fold. The extended
HMG-CoA molecule binds to the L domain. It protrudes deeply into the interior of
the protein with its pantothenic acid moiety, whereas the ADP part is found in
a pocket with positively charged residues at the protein’s surface. The actual
binding site for hydroxymethylglutaric acid (HMG) is found between the L and S
domains. The product of the first reduction step, mevaloyl-CoA, has a negatively
charged oxygen atom that is stabilized by a neighboring Lys691 in the enzyme
(Fig. 27.14). The thiolate that is temporarily released from the CoA group is
stabilized by His752, which is presumably protonated. The activity of HMG-CoA

[Figure 27.13: chemical structures. 27.28 mevastatin (R1 = H, R2 = H); 27.29 lovastatin
(R1 = Me, R2 = H); 27.30 simvastatin (R1 = Me, R2 = Me); 27.31 pravastatin; 27.32 fluvastatin;
27.33 cerivastatin; 27.34 atorvastatin; 27.35 rosuvastatin; 27.36 pitavastatin; 27.37 gemfibrozil.]

Fig. 27.13 The natural products mevastatin (compactin) 27.28 and lovastatin 27.29
inhibit cholesterol biosynthesis at the HMG-CoA reductase step. Simvastatin 27.30 and
pravastatin 27.31 are partial-synthetic analogues that were developed later. The ring-opened form
27.31 is significantly less lipophilic than lovastatin and therefore has fewer CNS side effects.
The opened lactone ring is the actual active form of lovastatin and its analogues (▶ Sect. 9.2).
Fluvastatin 27.32, cerivastatin 27.33, atorvastatin 27.34, rosuvastatin 27.35, and pitavastatin 27.36
were introduced to the market later as synthetically prepared inhibitors. A fivefold-higher plasma
concentration of cerivastatin 27.33 as a consequence of a blockade of its metabolism by
cytochrome CYP 3A4 by gemfibrozil 27.37 was obtained when the two medications were
coadministered.

[Figure 27.14: crystal structure image. Labeled: HMG-CoA, NADPH, and the residues His752,
Lys735, Glu559, Lys691, Ser684, and Arg590.]

Fig. 27.14 The crystal structure determination of HMG-CoA reductase was accomplished with
the bound NADPH cofactor (green) and HMG-coenzyme-A (pink). The nicotinamide ring of the
cofactor lies underneath the thioester bond of HMG-coenzyme A. The hydride ion is transferred
from there in the first reduction step (cf. Fig. 27.12).

reductase can be controlled by phosphorylation. A serine residue in the vicinity of
the bound NADP+ cofactor is phosphorylated. This presumably leads to a decrease
in the affinity to NADP(H). The energetically demanding cholesterol biosynthesis
can be curtailed in the cell with this step.
As with the discovered natural products, the statins were developed as structural
analogues to the carboxylic acid chain of 3-hydroxy-3-methylglutaryl-CoA. These
inhibit the reductase competitively, but their affinity is a thousand times higher than
that of the natural substrate. The statins mevastatin 27.28, lovastatin 27.29, and the
later-developed simvastatin 27.30 (Fig. 27.13) are prodrugs with a lactone structure
that is opened into the actual active substances in the mucous membranes of the
gastrointestinal tract or in the liver.
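
What a thousandfold affinity advantage means for a competitive inhibitor can be illustrated with the standard Michaelis–Menten rate law for competitive inhibition, v = Vmax[S] / (Km(1 + [I]/Ki) + [S]). The following is only a minimal sketch under stated assumptions: concentrations are in arbitrary units, Km is set to 1, and the affinity advantage is modeled as Ki = Km/1000, which treats Km as a rough stand-in for substrate affinity. The function name and all numbers are hypothetical and are not measured values for any statin.

```python
def competitive_rate(s, i, vmax=1.0, km=1.0, ki=1.0):
    """Michaelis-Menten rate in the presence of a competitive inhibitor.

    v = vmax * s / (km * (1 + i / ki) + s)
    """
    return vmax * s / (km * (1.0 + i / ki) + s)


# Illustrative assumptions in arbitrary concentration units.
KM = 1.0           # assumed Km of the substrate
KI = KM / 1000.0   # assumed Ki: inhibitor binds 1000-fold more tightly

substrate = 1.0    # substrate present at a concentration equal to Km
for inhibitor in (0.0, 0.001, 0.01, 0.1):
    v = competitive_rate(substrate, inhibitor, km=KM, ki=KI)
    print(f"[I] = {inhibitor:<6} relative rate = {v:.4f}")
```

In this toy setting an inhibitor concentration a thousand times lower than the substrate concentration already reduces the rate noticeably, which is the qualitative point of the comparison in the text.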
If the structures of the reductase with the substrate and with the inhibitor
simvastatin are compared, it becomes apparent that the extended 3,5-
dihydroxycarboxylate moiety lies at the position of the HMG (Fig. 27.15). All
statins have a carbocyclic or heteroaromatic ring system that is attached to the C5
atom of the dihydroxycarboxylate unit via a two-membered spacer. This moiety binds
in the area in which the thiol side chain of the pantothenic acid residue is located.

[Figure 27.15: crystal structure superposition. Labeled: atorvastatin, simvastatin, and the
residues Lys735, Ser684, Glu559, and Arg590.]

Fig. 27.15 Superposition of the structures of HMG-CoA reductase in complex with simvastatin
27.30 (gray) and atorvastatin 27.34 (green). Both inhibitors bind Lys735, Ser684, and Arg590 with
their mevalonic acid analogue moiety, just as the natural substrate does. The remaining molecular
portion, which differs greatly between the natural-product-like simvastatin and the fully synthetic
atorvastatin, binds in the region that is occupied by the CoA residue in the substrate complex. The
NADPH pocket remains unoccupied in the structures.

The high degree of structural variation of this group in the newer fully synthetic
statins underscores the fact that this molecular portion indeed contributes to the
affinity of the inhibitors but no specific interactions are formed in the binding
pocket that is open to the surface.
The history of the statins would be incomplete without a discussion about
cerivastatin 27.33 and atorvastatin 27.34 (Fig. 27.13). Both are statins from the
more recent research and were prepared fully synthetically. Atorvastatin was
developed by Warner–Lambert in the USA. In 1997 it was introduced to the market
and changed hands from Warner–Lambert to Pfizer in a corporate acquisition.
There, it was developed into a success story par excellence. The compound became
the best-selling medication ever (Sortis® and Lipitor®). In 2004 it made up half of
the market share of the statins. Pfizer was able to earn US$14 billion in 2006 as
well as in 2007 with this compound. In Germany, the sales figures were low
because of a healthcare reform that required a fixed copayment for statins.
Cerivastatin 27.33 (Lipobay®) represented a similar cash cow for the Bayer
Corporation. This compound was also introduced in Germany in 1997 and in other
European countries and in the USA. At the end of 1998 the Bundesinstitut für
Arzneimittel und Medizinprodukte (BfArM, the German authority that monitors
adverse drug events) reported fatalities under cerivastatin therapy. After more
fatalities were reported in the USA and in Germany, Bayer withdrew the drug
from the market in mid-2001. What happened? The fatalities were a result of
rhabdomyolysis, an acute disintegration of skeletal muscle, and concomitant
kidney failure from the toxic muscle metabolites. This adverse event occurred
especially upon overdosing, and in particular with the combination of cerivastatin
and gemfibrozil 27.37, a compound that belongs to the fibrate group. Gemfibrozil
increases the plasma level of cerivastatin by a factor of five and can cause
myopathy by itself. The cause of death was assumed to be an overdose of
cerivastatin by simultaneous mutual inhibition of the degradation mechanism of
both compounds through the metabolizing cytochrome CYP 3A4 (Sect. 27.6).
Patients were informed of the risk by the enclosed leaflet describing the correct
use of medication, and pharmacists distributing the drug in the USA were also
informed. Cerivastatin was considered to be a growth product for Bayer’s phar-
maceutical business. Within a short time after its approval, it achieved sales of
2.5 billion Euros. Worldwide, about six million people took the medication. Its
withdrawal had broad consequences for the Bayer Corporation with which it had
to struggle for several years. The recall itself caused resentment because the press
and stockholders were informed before physicians and pharmacists. This approach
was certainly suboptimal because it readily contributed to a diminishment in trust
in the public eye regarding the pharmaceutical industry and suggested purely
commercial intentions. The two medications Sortis® and Lipobay® show, however,
how thin the line is between success and failure in the pharmaceutical
business, and how great the risks of introducing a new medication to the
market remain, even when it is based on a known, established principle.
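
The fivefold rise in the cerivastatin plasma level described above can be related to drug clearance with a very simple pharmacokinetic relationship. For repeated dosing, the average steady-state concentration is often approximated as Css = F · dose / (CL · τ), so that a fivefold increase at an unchanged dose corresponds to the clearance falling to one fifth. The sketch below only illustrates this arithmetic; the dose, dosing interval, and clearance are hypothetical placeholder values and are not pharmacokinetic data for cerivastatin.

```python
def css_average(dose_mg, interval_h, clearance_l_per_h, bioavailability=1.0):
    """Average steady-state plasma concentration (mg/L) under repeated dosing.

    Css = F * dose / (CL * tau)
    """
    return bioavailability * dose_mg / (clearance_l_per_h * interval_h)


# Hypothetical placeholder values, chosen only to show the proportionality.
DOSE_MG = 0.4
INTERVAL_H = 24.0
CLEARANCE = 12.0  # L/h, assumed

baseline = css_average(DOSE_MG, INTERVAL_H, CLEARANCE)
inhibited = css_average(DOSE_MG, INTERVAL_H, CLEARANCE / 5.0)  # metabolism blocked

print(f"fold increase in Css: {inhibited / baseline:.1f}")
```

Because Css is inversely proportional to clearance in this model, the ratio does not depend on the placeholder values chosen.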

27.4 Hitting a Moving Target: Aldose Reductase Inhibitors

The alarming increase in cases of type-II diabetes mellitus was already men-
tioned in ▶ Sect. 26.8. One hundred and fifty million people already suffer from the
consequences of glucose metabolism disorders. In the next 15 years this number is
expected to double. The treatment of diabetes and its consequences devours billions
and represents a massive economic and healthcare-system burden. Acquired
diabetes, which manifests itself as an increasing resistance of cells to insulin,
leads to grave complications if left untreated. These manifest themselves in
secondary complications, for example, increased atherosclerosis (Sect. 27.3) with
increasing risk of infarct and stroke. The long-term consequences of a poorly
controlled blood glucose level preferentially affect cells in tissues that do not
control their glucose uptake by insulin. This occurs especially in cells in the
vascular system, in nerves, in the eyes, and the kidneys. Exogenous insulin admin-
istration does not directly help these cells because they are not able to downregulate

[Figure 27.16: reaction scheme. 27.38 D-glucose is converted by aldose reductase
(NADPH + H+ → NADP+) to 27.39 D-sorbitol, which is converted by sorbitol dehydrogenase
(NAD+ → NADH + H+) to 27.40 D-fructose.]

Fig. 27.16 D-Glucose 27.38 is transformed by aldose reductase to D-sorbitol 27.39 and further by
sorbitol dehydrogenase to D-fructose 27.40 along the polyol pathway.

their glucose uptake. Early blindness, kidney damage, and peripheral vascular
disease can all be a consequence, the treatment of which could require the ampu-
tation of limbs.
One way to intervene in blood glucose regulation is the exogenous administra-
tion of insulin (▶ Sect. 32.4) in addition to significant changes in diet and lifestyle.
However, even a rigorously followed insulin-replacement therapy cannot match the
efficiency of endogenous insulin. Repeated episodes of injurious hyperglycemia
can occur that especially affect the insulin-independent cells. Despite therapy,
diabetics must count on long-term complications. These consequences particularly
affect the quality of life of older patients. All the more, therapeutic approaches are
sought that can reduce long-term complications.
One approach is intervention in the so-called polyol pathway. Glucose 27.38 is
reduced to sorbitol 27.39 and then oxidized to fructose 27.40 (Fig. 27.16) along this
pathway. The first step is catalyzed by aldose reductase, and the second by sorbitol
dehydrogenase. The transformation by aldose reductase is the rate-determining
step. It proceeds with the consumption of NADPH, which is oxidized to NADP+.
NADH/NAD+ is needed as a cofactor in the next step of the dehydrogenase. For
a long time, it was debated whether overloading the polyol pathway by severe
glucose enrichment would lead to an increased concentration of polar reaction
products in the cell. As a result, the cell would experience elevated osmotic
pressure, which would be alleviated by increased water uptake. This would, how-
ever, lead to cell swelling and increased osmotic stress in the membrane. The
oxidative stress that the cell experiences due to the overloaded polyol pathway
seems to be more serious. The elevated glucose flow along this degradation
pathway requires increasing amounts of NADPH and NAD+ and therefore severely
stresses the homeostasis of these redox-active substances. The body must protect
itself against reactive-oxygen species that are formed as a side product of the ca.
400–800 L of oxygen that we take in daily. These species possess cell-damaging
potential. If the production of these aggressive oxygen derivatives exceeds the
detoxification capacity of the endogenous antioxidant systems of the cell, then
this is referred to as oxidative stress. The main defensive system is glutathione,
which is oxidized to glutathione disulfide species under oxidative conditions.
To be once again available for the defense mechanism, it must be regenerated by
glutathione reductase, which consumes NADPH. If only a small amount of NADPH
is available, the capacity of the glutathione-defense system is quickly exhausted,
and the cell shows signs of oxidative stress. Today, these are especially associated
with inadequate regulation of the glucose homeostasis.
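
The competition for NADPH described in the preceding paragraphs can be summarized as net reaction equations. The scheme below is a compact restatement of the steps named in the text; the abbreviations GSH (glutathione) and GSSG (glutathione disulfide) are standard shorthand introduced here and do not appear in the original passage.

```latex
\begin{align*}
\text{D-glucose} + \text{NADPH} + \text{H}^{+}
  &\xrightarrow{\text{aldose reductase}} \text{D-sorbitol} + \text{NADP}^{+} \\
\text{D-sorbitol} + \text{NAD}^{+}
  &\xrightarrow{\text{sorbitol dehydrogenase}} \text{D-fructose} + \text{NADH} + \text{H}^{+} \\
\text{GSSG} + \text{NADPH} + \text{H}^{+}
  &\xrightarrow{\text{glutathione reductase}} 2\,\text{GSH} + \text{NADP}^{+}
\end{align*}
```

The first and third reactions draw on the same NADPH pool, which is why an elevated glucose flux through aldose reductase leaves less NADPH for the regeneration of glutathione.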
The inhibition of aldose reductase represents a possibility to avoid overloading
this degradation pathway. A genetic polymorphism that has been discovered and is
related to the risk of diabetic complications has afforded indications about the
therapeutic relevance of such a strategy. Variations in the genes that carry the
information for aldose reductase lead to an increased or reduced expression of
aldose reductase in affected individuals. Increased supply of the enzyme leads to
elevated NADPH consumption. Carriers of the allele for increased expression seem
to have an increased vulnerability to diabetic complications. On the other hand,
a reduced supply of the enzyme corresponds to a reduced prevalence. This is a clear
indication that a reduction in aldose reductase activity is a useful strategy to avoid
the long-term complications of inadequately controlled blood glucose levels.
Interestingly, aldose reductase seems to fulfill another role in the cell. It is able to
reduce a broad palette of aldehyde substrates. This is an important task in the
detoxification mechanism. Aldehyde reductase, an enzyme related to aldose reduc-
tase, fulfills a similar function. A blockade of both enzymes leads to grave problems
because the removal of toxic and highly reactive aldehydes from the cell is
suppressed. Therefore care must be taken in the development that a potent aldose
reductase inhibitor is also as selective as possible. Aldose reductase’s very broad
substrate promiscuity makes this goal nontrivial. This reductase has an astonishing
adaptive ability to be able to handle substrates of highly varying size. The enzyme’s
binding pocket opens itself as best it can to take in substrates, apparently without
significant energetic consequences. It is, however, the case that the properties that
afford this adaptability to accommodate such variably sized substrates can also be
exploited for its inhibition by active substances. Therefore the attempt to block
aldose reductase’s function with different inhibitors can be compared with shooting
at a moving and evasive target.
First, the architecture and mode of action of aldose reductase shall be consid-
ered. It uses NADPH as a cofactor and, unlike the majority of reductases, does not
have an NADPH-binding domain with a “Rossmann fold” but rather a TIM barrel
geometry. The active site is in the lid area of this barrel structure. It is divided into
a relatively rigid catalytic site, the so-called anion-binding pocket, and the adaptive
specificity pocket (Fig. 27.17). The reduction step with the hydride ion transfer
from NADPH to the substrate’s aldehyde group occurs there. An alcohol is formed
as a product. A series of head groups were discovered that can serve as mimics of
the geometry in the reduction step. Carboxylic acids, derived from acetic acids, are
commonly incorporated into many lead structures (Fig. 27.18). The pKa value of
these compounds is, however, unfavorable for their bioavailability. Therefore, other
groups were sought that can also take on anionic character but have more favorable
pKa qualities for transport and distribution (▶ Sect. 19.4). Hydantoins have proven
to be very useful for this. Other variations are listed in Fig. 27.18.
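
Why the pKa value of the acidic head group matters for transport can be made concrete with the Henderson–Hasselbalch relationship, which gives the fraction of an acid present as the anion at a given pH. The following sketch is an illustration under stated assumptions: the two pKa values are rough, hypothetical figures for a carboxylic acid and a hydantoin head group and are not taken from the text, and the function name is likewise only illustrative.

```python
def fraction_ionized_acid(pka, ph):
    """Fraction of a monoprotic acid present as the anion at a given pH.

    Derived from Henderson-Hasselbalch: [A-]/[HA] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumed, illustrative pKa values (not from the source text).
head_groups = {
    "carboxylic acid (assumed pKa 3.5)": 3.5,
    "hydantoin (assumed pKa 8.5)": 8.5,
}

for ph in (6.5, 7.4):
    for name, pka in head_groups.items():
        anion = fraction_ionized_acid(pka, ph)
        print(f"pH {ph}: {name:<36} anionic fraction = {anion:.4f}")
```

Under these assumptions the carboxylic acid is almost completely deprotonated at physiological pH, whereas a substantial neutral fraction of the weaker acid remains available for membrane passage while it can still adopt anionic character in the binding pocket.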

[Figure 27.17: schematic of the aldose reductase binding pocket. Labeled: anion-binding
pocket, specificity pocket, NADP+, and the residues Trp20, Val47, Tyr48, Trp79, His110,
Trp111, Thr113, Phe115, Phe122, Trp219, Val297–Leu300, Cys303, and Tyr309.]

Fig. 27.17 The binding pocket of aldose reductase is divided into a catalytic (blue) and
a specificity (orange) pocket. The cofactor NADPH/NADP+ binds below the catalytic pocket.
Phe122, Trp219, and Leu300 are most responsible for the structural adaptability of the specificity
pocket. Trp20 can also undergo conformational changes in the catalytic pocket. The segment
Val297–Leu300 (red) belongs to a loop that exhibits particularly high adaptability.

Another interesting property of aldose reductase raises the question as to how
an enzyme can adapt itself to so many different substrates. Protein–ligand
complexes of four different inhibitors (27.50, 27.44, 27.48, and 27.45; Fig. 27.18)
are shown in Fig. 27.19; each of these binds to a different protein conformer. In the case
of aldose reductase, it is assumed that many protein conformers coexist in
a dynamic equilibrium with one another, which causes different sub-pockets of the
specificity pocket to open and close. A substrate molecule, or likewise an inhibitor,
binds to one of these conformers in the equilibrium and stabilizes it upon complex
formation. Only in this way is it understandable that different conformers of the
enzyme can be accessed and strongly blocked at virtually no energetic cost and
without a concomitant loss in binding affinity.
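
The notion of conformers coexisting in a dynamic equilibrium can be illustrated with Boltzmann populations: states that differ by only a few kJ/mol in free energy are all appreciably populated at room temperature, so a ligand can select any of them at little cost. The sketch below is purely illustrative. The three conformer labels and their relative free energies are hypothetical assumptions, not values determined for aldose reductase.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def boltzmann_populations(rel_free_energies_kj, temperature_k=298.15):
    """Equilibrium populations of states from relative free energies (kJ/mol)."""
    rt = R_KJ_PER_MOL_K * temperature_k
    weights = [math.exp(-g / rt) for g in rel_free_energies_kj]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical conformers with assumed relative free energies in kJ/mol.
conformers = {
    "specificity pocket closed": 0.0,
    "sub-pocket A open": 2.0,
    "sub-pocket B open": 4.0,
}

populations = boltzmann_populations(list(conformers.values()))
for name, p in zip(conformers, populations):
    print(f"{name:<28} population = {p:.2f}")
```

With energy gaps of this size none of the states is rare, which is consistent with the picture of an inhibitor stabilizing one pre-existing conformer rather than forcing the protein into a costly new geometry.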
MD simulations can be carried out to gain access to the conformational diversity
of the possible geometries of the enzyme. How such a study can be carried out has
been described in ▶ Sect. 15.8 by using aldose reductase as an example. In this
study it was shown that it is largely the side chains of only a few amino acids that

[Figure 27.18: chemical structures. 27.41 alrestatin; 27.42 zopolrestat; 27.43 ponalrestat;
27.44 tolrestat; 27.45; 27.46 epalrestat; 27.47 zenarestat; 27.48 IDD594; 27.49 risarestat;
27.50 sorbinil; 27.51 fidarestat; 27.52 minalrestat; 27.53 ranirestat; 27.54.]

Fig. 27.18 Synthetic inhibitors 27.41–27.54 of aldose reductase. Epalrestat 27.46 was the only
one to reach the market.

[Figure 27.19: four crystal structure panels (a–d), each with Trp20, Phe122, and Leu300 labeled.]

Fig. 27.19 Crystal structures of aldose reductase with (a) sorbinil 27.50, (b) tolrestat 27.44,
(c) IDD594 27.48, and (d) 27.45 (Fig. 27.18). All inhibitors bind to a different conformer of the
protein. Above all, the residues Trp20, Phe122, and Leu300 undergo significant spatial
rearrangements and open up structurally altered sub-pockets in the enzyme.

allow this conformational adaptability and that induce the opening and closing of
entire areas of the binding pocket. The binding geometry of sorbinil 27.50
(Fig. 27.18) is shown in Fig. 27.20. The inhibitor blocks the catalytic site with its
hydantoin group and sits above the nicotinamide ring of the cofactor. Interestingly,
it leaves the specificity pocket closed. Phe122 and Leu300 orient toward one
another like wings of a swinging door and close the parts of the specificity pocket
that lie behind it.
The development of potent aldose reductase inhibitors has been worked on for
many years. Numerous candidates have successfully made their way into clinical
trials (Fig. 27.18). Unfortunately, the development of most of these was terminated
at this phase. Often, it was adverse effects or inadequate efficacy that led to these
decisions. In 1992 ONO Pharmaceutical Co. in Japan managed to introduce
epalrestat (Kinedak®) 27.46 to the market for the treatment of diabetic neuropathy.
Many other derivatives such as fidarestat 27.51, ranirestat 27.53, ponalrestat 27.43,

[Figure 27.20: crystal structure image. Labeled: NADPH and the residues Tyr48, Trp79,
Phe122, and Leu300.]

Fig. 27.20 Crystallographically determined binding geometry of sorbinil 27.50 to aldose
reductase. The inhibitor’s hydantoin group binds above the nicotinamide ring of the cofactor to Tyr48,
Trp79, and His110. The specificity pocket remains closed during this binding. This pocket can be
opened by rotating the side chains of Phe122 and Leu300 out of the way.

zopolrestat 27.42, or zenarestat 27.47 made it into phase-II clinical trials.
Sometimes the development was abandoned at this stage, or the trials have not
yet been completed.
It is surprising that such a modest therapeutic advantage has been achieved
despite intensive research and an apparently valid therapeutic principle. Aldose
reductase, however, holds another record. To date it is probably the best character-
ized protein in terms of structural and physicochemical properties. A crystal struc-
ture with 0.66 Å resolution was determined with the inhibitor IDD594 (27.48) that
shows almost every water molecule and H atom (cf. ▶ Fig. 13.9), and a well-
resolved neutron structure is also available. Almost no other enzyme has been
characterized by so many quantum mechanical calculations and MD simulations.
Thermodynamic and mutation studies have offered glimpses into the energetics of
the protein’s adaptive behavior. But the extensive knowledge about its properties
has not yet helped to find a reliable and broadly applicable therapy for late-onset
diabetic complications by using an appropriate inhibitor.

27.5 11β-Hydroxysteroid Dehydrogenase

Isoforms of 11β-hydroxysteroid dehydrogenase (11β-HSD) are mutually responsible
for the transformation of the biologically active glucocorticoid cortisol 27.56

[Figure 27.21: reaction scheme. 27.55 cortisone (R = OH) and 27.57 11-dehydrocorticosterone
(R = H) are reduced by 11β-HSD1/NADPH to 27.56 cortisol (R = OH) and 27.58 corticosterone
(R = H); the reverse oxidation is catalyzed by 11β-HSD2/NAD+.]

Fig. 27.21 The two isoforms HSD1 and HSD2 of 11β-hydroxysteroid dehydrogenase
transform inactive cortisone 27.55 into active cortisol 27.56 and vice versa. In rodents the same
enzyme pair transforms 11-dehydrocorticosterone 27.57 into corticosterone 27.58.

into the biologically inactive 11-keto form, cortisone 27.55 (Fig. 27.21). Two
isoenzymes, 11β-HSD1 and 11β-HSD2, were found that belong to the superfamily
of short-chain dehydrogenases/reductases. The sequence identity between the
two is only 15%. Chemically they are opponents. 11β-HSD1 is broadly distributed,
but with increased expression in the liver and in adipose tissue. The enzyme acts as
a reductase with consumption of NADPH and forms active cortisol from inactive
cortisone; cortisol binds to the glucocorticoid receptor (▶ Sect. 28.5) and activates
it. On the other hand, as a dehydrogenase 11β-HSD2 oxidizes cortisol to inactive
cortisone with consumption of NAD+. In doing so, it protects the mineralocorticoid
receptor from overexposure to this active hormone. This is especially important in
the colon and in the kidney. An overactivation of this receptor by cortisol, in
addition to aldosterone, leads to an increased renal reabsorption of sodium and
chloride ions. Water retention and an increase in blood pressure are the consequence.
A congenital gene defect that causes mutations in 11β-HSD2 can lead to
a hereditary form of hypertension. The mutated enzyme works less efficiently. An
excess of cortisol is the result. The receptor becomes overloaded and causes an
elevated blood pressure. Interestingly, glycyrrhizin 27.59 (Fig. 27.22), one of the
ingredients in licorice, is a potent 11β-HSD2 inhibitor. Excessive consumption of
this confectionery, which is made from the root of Glycyrrhiza glabra, can, in the
worst cases, lead to temporary symptoms that are comparable to those of the
congenital gene defect.
The short-chain dehydrogenases/reductases adopt a Rossmann fold. A Tyr-Lys-Ser
triad, which occurs in almost all members of this family, is critical for the
catalytic mechanism. The sequence of the reduction
reaction is outlined in Fig. 27.23. A hydride ion is transferred from the nicotinamide
ring of NADPH to the carbonyl group being reduced. The carbonyl function is
involved in a network of hydrogen bonds, which is responsible for its polarization
for the nucleophilic attack by the hydride ion. The hydroxyl group of a tyrosine
residue serves as a proton donor. Moreover, the ammonium group of a neighboring lysine

[Figure 27.22: chemical structures. 27.59 glycyrrhizin; 27.60 carbenoxolone; 27.61 BVT-2733;
27.62; 27.63.]

Fig. 27.22 A constituent of licorice, glycyrrhizin 27.59, is a potent inhibitor of both
11β-HSD isoforms. Its derivative, carbenoxolone 27.60, is also able to block both isoforms of
11β-HSD. Arylsulfonamidothiazole BVT-2733 27.61 was developed as an inhibitor of 11β-HSD1.
The crystal structure shown in Fig. 27.25 was obtained with an analogous compound, 27.62.
The development of adamantylsulfone 27.63 with a central amide bond was accomplished at
Abbott.

[Figure 27.23: mechanism scheme. Labeled: NADPH, cortisone, and the residues Lys, Ser, and Tyr.]

Fig. 27.23 11β-HSD1 uses NADPH as a cofactor for the reduction of cortisone to
cortisol. The substrate’s carbonyl group is involved in a hydrogen bond with
a neighboring Ser and Tyr residue. In this way it is prepared for the nucleophilic
attack by the hydride ion. Additionally, the positive charge of a neighboring Lys
residue polarizes the carbonyl group (cf. Fig. 27.25) and reduces the pKa value of
tyrosine, which serves as a proton donor.

[Figure 27.24: crystal structure image. Labeled: NADPH, corticosterone 27.58,
carbenoxolone 27.60, and the residues Ile121, Ser170, Tyr183, and Lys187.]

Fig. 27.24 Crystal structure of human 11β-HSD1 with carbenoxolone 27.60 (gray). The inhibitor
binds competitively to cortisol, the natural ligand. The binding geometry of corticosterone 27.58
(green) was extracted from the crystal structure with the murine enzyme and superimposed on
human 11β-HSD1. This shows the binding geometry of the natural substrate.

immobilizes the OH groups of the cofactor’s ribose moiety and facilitates the proton
transfer by lowering the pKa value of the tyrosine residue (Figs. 27.23 and 27.24). The
different isoforms of 11β-HSD are suitable for both oxidation steps (dehydrogenases)
and reduction steps (reductases), which are catalyzed according to very
similar mechanisms.
Endocrinologists have long noticed a phenotypical similarity between the
relatively rare Cushing syndrome and the metabolic syndrome, which commonly
occurs in industrialized countries. Cushing syndrome occurs as
a consequence of excessive cortisol production and leads to a “full-moon face”
and adrenocortical obesity (central fat distribution). The alarming increase in
obesity in the industrialized world and the simultaneous increase in type-II diabetes
were already discussed in the two previous sections and in ▶ Sect. 26.8. Interestingly,
an elevated cortisol level was demonstrated in the adipose tissue of obese people
compared to the tissues of lean people.
Evidently there is a tendency for an increase in 11β-HSD1 activity in the
adipose tissue of people who tend toward obesity. A resistance to nutritionally
caused obesity could be observed in genetically altered mice with no 11β-HSD1
activity. The mice showed better lipid and lipoprotein levels and an increase in
insulin sensitivity in the liver. On the other hand, transgenic mice with induced
overexpression of 11β-HSD1 in adipose tissue showed an increasing insulin
resistance. These results suggest that a reduction in 11β-HSD1 activity might represent
a promising therapeutic principle for the treatment of metabolic syndrome.

[Figure 27.25: crystal structure superposition. Labeled: NADPH, corticosterone 27.58,
inhibitors 27.62 and 27.63, and the residues Ile121, Ser170, Tyr183, and Lys187.]

Fig. 27.25 Crystallographically determined binding geometries of the inhibitors 27.62 (beige)
and 27.63 (gray) together with the binding mode of corticosterone 27.58 (green) taken from the
crystal structure of the murine enzyme. Despite entirely different molecular scaffolds, the
inhibitors largely occupy the steroid’s position. They bind with their amide (27.63) or amide-like
(27.62) groups to Ser170 and Tyr183 of the catalytic triad. Lys187 holds the ribose moiety in
position and polarizes the oxygen functionality of the neighboring Tyr183.

This strategy seems to be particularly attractive because the elevated 11β-HSD1
levels are only in particular tissues, especially adipose tissue. Additionally, the
unselective 11β-HSD inhibitor carbenoxolone 27.60 (Fig. 27.22) increases insulin
sensitivity in the liver in healthy volunteers as well as in diabetics without
increasing the glucose metabolism in the periphery.
Programs were therefore established in numerous pharmaceutical companies to
develop selective 11β-HSD1 inhibitors. The arylsulfonamidothiazoles were the
first substance class. BVT-2733 27.61 (Fig. 27.22) emerged from this series as
a promising development candidate. These inhibitors possess an aminothiazole
substructure that binds to the serine and tyrosine residues of the catalytic triad
and mimics the geometry of the 11β-ketone function of the natural steroid. The
crystal structure with one representative of this class, 27.62, is shown in Fig. 27.25.
Other substance classes use an amide group, a urea unit, or
a heterocycle as a mimetic for the keto function in the substrate. Sulfones derived
from adamantane (e.g., 27.63) and sulfonamides were developed at Abbott to
mimic the hydrophobic steroid scaffold. The hydrophobic adamantane moiety
replaces the C- and D-rings of the steroid scaffold (Fig. 27.25). The terminal
sulfone group mimics the 17-COCH2OH substituent, and 27.63 binds to Ser170
and Tyr183 of the catalytic triad with the carbonyl function of its central amide
670 27 Oxidoreductase Inhibitors

group. Also the example of this enzyme underscores the point that structurally very
different molecular scaffolds can mimic the geometry and properties of a steroid to
successfully block the binding pocket and therefore the catalytic mechanism.

27.6 The Cytochrome P450 Enzyme Family

The family of cytochrome P450 enzymes plays a central role in drug metabolism.
The fundamentals of distribution, transport, and degradation of drugs were already
discussed in ▶ Sect. 9.2. Here the architecture and mode of action of these
monooxygenases shall be introduced, above all their interaction with
low-molecular-weight active substances. The cytochrome P450s (CYPs) are
a superfamily of heme proteins that carry out biochemical transformations as
monooxygenases, usually by introducing oxygen into the substrate being
oxidized. They have an iron-containing protoporphyrin system as a prosthetic group
in their center. The fifth, apical position of iron is coordinated by a cysteine residue.
Oxygen is transiently bound at the sixth coordination position and is transferred
to the substrate from there. The name comes from a typical absorption band at
450 nm that is observed when the complex is blocked with carbon monoxide.
The proteins are constructed from about 500 amino acids. To date, more than
6,000 CYP genes have been described in Nature. In humans, 17 families have
been characterized, which are subcategorized into 57 isoenzymes. A combination of
numbers and letters is used to name the proteins: the first number indicates
the family, the letter the subfamily, and the second number the isoform. In
the body, these enzymes are found predominantly in the liver, lung, and
gastrointestinal tract. This provides clues about their function: above all, to
intervene in the metabolism of xenobiotics. Some CYPs carry out important
transformations on endogenous substrates, such as CYP 2R1 in vitamin D metabolism,
CYP 19A1 (aromatase) in steroid metabolism, or CYP 2J2 and CYP 5A1 (thromboxane
synthase) in eicosanoid metabolism. Xenobiotic compounds are transformed in
so-called phase-I reactions into more water-soluble and hence more easily
excretable substances. Usually these transformations serve to detoxify compounds,
but in a few cases a toxification of the substrate can also occur (▶ Sect. 9.1).
A few typical reactions that are catalyzed by CYPs are listed in Fig. 27.26.
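The naming scheme described above (family number, subfamily letter, isoform number) can be illustrated with a short sketch. The function and class names are hypothetical and chosen only for this illustration; the pattern accepts the spelling used in this chapter, with or without a space after "CYP".

```python
import re
from typing import NamedTuple


class CypName(NamedTuple):
    family: int      # first number, e.g. 3 in CYP 3A4
    subfamily: str   # letter, e.g. "A"
    isoform: int     # second number, e.g. 4


# "CYP", optional space, family number, subfamily letter, isoform number.
_CYP_PATTERN = re.compile(r"^CYP\s?(\d+)([A-Z])(\d+)$")


def parse_cyp_name(name: str) -> CypName:
    """Split a name such as 'CYP 3A4' into family, subfamily, and isoform."""
    match = _CYP_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a CYP isoform name: {name!r}")
    family, subfamily, isoform = match.groups()
    return CypName(int(family), subfamily, int(isoform))


for example in ("CYP 3A4", "CYP 2D6", "CYP 19A1"):
    print(example, "->", parse_cyp_name(example))
```

Names that lack one of the three parts, such as a bare family designation, are rejected with a `ValueError` rather than guessed at.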
The catalytic cycle of P450 enzymes is NADPH dependent. Initially the iron ion
in the heme center is in the +3 oxidation state. The substrate diffuses into
a reaction cavity that is practically fully shielded from the outside (Fig. 27.27).
A helical sequence segment allows entrance into the catalytic site and also acts as
a lid over the site. An NADPH reductase delivers the first electron to the cyto-
chrome and reduces the iron atom there. Then molecular oxygen coordinates to the
iron. In the next step, NADPH reductase delivers a second electron. Then a proton is
taken up, and a Fe2+OOH species forms, which homolytically cleaves the sub-
strate’s C—H bond being oxidized with concomitant water release. An OH group is
stereoselectively transferred from the iron to the carbon being oxidized. The iron
returns to its original +3 state, and the oxidized product can leave the binding

Fig. 27.26 Examples for typical oxidation reactions as they are catalyzed by cytochrome P450
enzymes; X stands for heteroatoms such as nitrogen or sulfur.

pocket. Even today, the reaction is not yet fully understood in detail. It has been
shown, however, that P450 enzymes are able to adapt to their substrates, sometimes
to an extreme extent. Even the uptake of two different molecules, as opposed to just
one substrate molecule, into the binding pocket is possible. As we shall see, this has
broad consequences for drug metabolism. The majority of CYPs are in the liver. In
mammals, they are embedded in the membrane of the endoplasmic reticulum by an
anchor. The distribution of CYPs into different families is shown in Fig. 27.28. If
their role in drug metabolism is considered, CYP 3A4, CYP 2D6, and CYP 2C9
take on the lion’s share of this task (Table 27.3). CYP 3A4 in particular has
a pronouncedly adaptive structure. Its binding pocket expands
from 900 Å³ in the uncomplexed state to 2,000 Å³ upon
binding erythromycin (Fig. 27.27). Moreover, erythromycin must completely
rearrange in the binding pocket, because in the experimentally determined crystal
structure, the group being oxidized is still 17 Å away from the heme center.
P450 enzymes can be blocked by diverse compounds. Compounds containing
heteroaromatic rings such as imidazole or triazole tend to inhibit them. Fluconazole
27.6 and ketoconazole 27.7 represent potent CYP 3A4 inhibitors (Fig. 27.6). Other
examples are the flavonoids such as naringenin 27.9, which is contained in grape-
fruit juice. They are metabolized by CYPs to active inhibitors and finally bind
irreversibly to diverse CYPs, above all to CYP 3A4. Additional examples are listed
in Table 27.3. These inhibitory properties must be considered when a CYP inhibitor

Fig. 27.27 Crystal structures of human CYP 3A4 in an uncomplexed state (a), with bound
metyrapone 27.8 (b), with erythromycin 32.29 (c), and with ketoconazole 27.6 (d). The protein
is shown with a white surface that is red colored on the interior. The ligands are shown with their
own surfaces (outside green, inside blue). In the case of ketoconazole, two ligands bind to the
protein (the second molecule is shown with a violet surface and a cyan interior). CYP 3A4’s
binding pocket, which is nearly fully closed to the exterior (cf. (a) and (b)), has proven to be
extremely adaptive. It is only because of this that the enzyme can take on ligands with entirely
different sizes and shapes.


Fig. 27.28 Percentage of cytochrome P450 (CYP) enzymes involved in drug metabolism and their
relative distribution. (a) A study from 2002 compiled the data for the relative portion of the
different CYP enzymes that take part in the metabolism of the 200 best-selling drugs. (b) Proportion
of the different CYP enzymes in the small intestine. (c) Relative distribution of CYP enzymes over
the different P450 families in humans.

Table 27.3 Examples of drugs that act as substrates, inhibitors, or inducers of CYP 3A4, CYP
1A2, and CYP 2D6

Enzyme    Substrate          Inhibitor          Inducer
CYP 3A4   Amitriptyline      Ketoconazole       Barbiturate
          Clarithromycin     Cimetidine         Carbamazepine
          Ciclosporin        Ciprofloxacin      Glucocorticoids
          Dexamethasone      Erythromycin       Phenobarbital
          Carbamazepine      Fluconazole        Rifampicin
          Terfenadine        Ritonavir          St. John’s wort
          Ethinylestradiol   Grapefruit juice
CYP 1A2   Caffeine           Cimetidine         Insulin
          Amitriptyline      Ciprofloxacin      Omeprazole
          Paracetamol        Grapefruit juice   Aromatic hydrocarbons
          Theophylline                          Smoking
          Verapamil
CYP 2D6   Amitriptyline      Cimetidine         Dexamethasone
          Captopril          Haloperidol
          Chlorpromazine     Clotrimazole
          Codeine            Quinidine
          Imipramine         Ritonavir
          Metoprolol
          Propafenone
          Debrisoquine

is coadministered with a drug that is metabolized by this enzyme. Because of
limited metabolism, the plasma concentration of the coadministered drug can
increase, and this can lead to grave consequences with regard to the fraction of
the dose that is in the body (cf. cerivastatin 27.33, Sect. 27.3). On the other hand,
such a factor can be exploited to reduce the dosing of an expensive drug such as
ciclosporin (▶ Sect. 10.1). The simultaneous administration of ketoconazole 27.7
allows lower doses of this immunosuppressant because ciclosporin is metabolized
by CYP 3A4.
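The effect of reduced metabolism on plasma levels can be made concrete with a minimal sketch. It assumes a simple linear model in which the average steady-state concentration equals bioavailability times dose divided by clearance times dosing interval, and in which the inhibitor acts only by lowering clearance. All numbers are hypothetical and are not data for ciclosporin or any other drug; the function names are chosen for this illustration.

```python
def average_steady_state_conc(dose_mg, interval_h, clearance_l_per_h,
                              bioavailability=1.0):
    """Average steady-state plasma concentration in mg/L: F * dose / (CL * tau)."""
    if interval_h <= 0 or clearance_l_per_h <= 0:
        raise ValueError("interval and clearance must be positive")
    return bioavailability * dose_mg / (clearance_l_per_h * interval_h)


def dose_for_unchanged_exposure(dose_mg, remaining_clearance_fraction):
    """Dose that keeps the average concentration constant after clearance drops."""
    if not 0 < remaining_clearance_fraction <= 1:
        raise ValueError("remaining clearance fraction must be in (0, 1]")
    return dose_mg * remaining_clearance_fraction


# Hypothetical example: 100 mg every 12 h, clearance 20 L/h.
baseline = average_steady_state_conc(100.0, 12.0, 20.0)
# Assumption: the coadministered inhibitor halves the clearance.
inhibited = average_steady_state_conc(100.0, 12.0, 20.0 * 0.5)

print(f"baseline average concentration:  {baseline:.3f} mg/L")
print(f"with inhibitor (same dose):      {inhibited:.3f} mg/L")
print(f"dose giving the baseline level:  {dose_for_unchanged_exposure(100.0, 0.5):.1f} mg")
```

In this simplified picture, halving the clearance doubles the average plasma level at an unchanged dose, which is the risk described above, and the same relation explains why a lower dose can suffice when an inhibitor is deliberately coadministered.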
The organism must react flexibly to xenobiotic exposure and adapt its
degradation mechanisms. Therefore CYPs can also be induced, that is, the body
upregulates the availability of a particular isoform as needed. Presumably this
induction has multiple mechanisms. One way is by the stimulation of the
transcription factor PXR, which belongs to the group of nuclear receptors. Such a case
is introduced in ▶ Sect. 28.7. For example, one of the components of St. John’s
wort, hyperforin, induces an increased expression of CYP 3A proteins. As a result,
the metabolism of drugs that are degraded by this protein family is boosted. This
can cause drug levels to fall below the therapeutically adequate range. A dangerous
situation can ensue for the patient, especially when the St. John’s wort is discontinued. Further
examples of inducers of different CYP isoforms are listed in Table 27.3. In addition
to its metabolism by alcohol dehydrogenase, alcohol is also degraded by CYP 2E1.
This enzyme is upregulated in cases of excessive alcohol consumption, especially
in chronic alcoholics, and is available in increased levels for alcohol metabolism.
This explains the tolerance that leads to an ability to “hold a drink” better by heavier

[Reaction scheme: CYP 2E1 converts paracetamol 27.64 into the toxic intermediate 27.65, which is
conjugated with glutathione, if available in adequate amounts, or otherwise with macromolecules.
Sulfation and glucuronidation are shown as further pathways of paracetamol.]

Fig. 27.29 In addition to alcohol dehydrogenase, alcohol is metabolized by CYP 2E1. This
enzyme is overexpressed because of induction in chronic alcoholics. The analgesic paracetamol
27.64 is partly metabolized by CYP 2E1. Compound 27.65 is formed in the process as a toxic
intermediate. At low concentrations, it can be detoxified by glutathione. If, however, elevated
levels of paracetamol end up in this pathway because of CYP 2E1 upregulation, the supply of
glutathione will be insufficient, and poisoning can occur.

drinkers. If, however, these drinkers wish to treat their hangovers with paracetamol
(acetaminophen) the next morning, problems can occur. Paracetamol 27.64 is
partially metabolized by CYP 2E1, and it is through this enzyme that the toxic
intermediate is formed (Fig. 27.29). If the intermediate is present in low concen-
trations, it can be transformed with the available glutathione and detoxified. If
paracetamol is metabolized extensively via this pathway, the available amount of
glutathione is inadequate, and toxicity symptoms can occur. This danger is the most
severe with heavy drinkers, in whom the CYP 2E1 concentration is permanently
elevated by continuous induction, and in whom paracetamol is predominantly
metabolized through this pathway.
Saturation and upregulation of cytochromes by induction or inhibition of cyto-
chromes by drug–drug interactions represent a serious potential danger in drug
metabolism. Therefore efforts are made in drug design to estimate the metabolic
profile of a development candidate. It would be nice to know at what position
a compound is metabolized and whether cytochrome inhibition is expected, espe-
cially of the most important enzymes. The crystal structure determination of the
essential human CYPs was pursued aggressively. The information obtained was
rather disillusioning. The proteins have such extremely adaptive properties that it
seems practically impossible to predict plausible binding modes to estimate inhi-
bition data. Even a prediction about what parts of a molecular scaffold are prefer-
ably metabolized and which metabolites would be expected has not gotten any
easier, despite the many crystal structures. Currently, routine structural determina-
tion of each development candidate with these proteins still seems rather utopian.
Moreover it has been shown that not only binary but also ternary complexes with
one or two different ligands can be formed. Only time will tell how the methodology
in this area develops. The current state of the art, however, allows the estimation
of metabolic properties with empirical QSAR models and 3D comparisons
(▶ Chaps. 17, “Pharmacophore Hypotheses and Molecular Comparisons” and
▶ 18, “Quantitative Structure–Activity Relationships”). The program MetaSite
from Gabriele Cruciani at the University of Perugia in Italy attempts to find the
best-fitting patterns by considering possible complementary interaction patterns on
the surface of the ligand and in the binding pocket. Multiple ligand conformations
are considered for this. Next, concepts about possible binding modes in the binding
pocket of the metabolizing cytochromes are developed that estimate the spatial
accessibility for oxidative attack by the iron atom on the different sites in the ligand.
Furthermore, the technique draws on a system of rules for judging the reactivity of organic
molecules, similar to the rules that were introduced to construct the Hammett
equation (▶ Sect. 18.2). Both concepts rank the individual centers in a molecule with
regard to the probability for a metabolic transformation. The combination allows the
metabolic properties of drugs to be estimated surprisingly well.
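How two such rankings might be combined can be shown with a toy sketch. This is not MetaSite's actual algorithm: the multiplicative combination, the 0 to 1 score ranges, the site labels, and all numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    label: str            # hypothetical atom or group label
    accessibility: float  # assumed 0..1: how well the site can reach the heme iron
    reactivity: float     # assumed 0..1: intrinsic ease of oxidation at this site


def rank_sites(sites):
    """Return (label, normalized score) pairs, most likely site first."""
    raw = {site.label: site.accessibility * site.reactivity for site in sites}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("all sites have a zero score")
    ranked = sorted(raw.items(), key=lambda item: item[1], reverse=True)
    return [(label, score / total) for label, score in ranked]


# Invented scores for three sites of an imaginary ligand.
candidate_sites = [
    Site("aromatic CH", accessibility=0.3, reactivity=0.4),
    Site("benzylic CH2", accessibility=0.8, reactivity=0.9),
    Site("N-methyl", accessibility=0.9, reactivity=0.6),
]

ranking = rank_sites(candidate_sites)
for label, probability in ranking:
    print(f"{label:14s} {probability:.2f}")
```

Multiplying the two scores means that a site must be both reachable and reactive to rank highly; other combination rules would be equally conceivable.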

27.7 What Makes Slow and Fast Metabolizers Different?

A standardized prediction about the metabolism of a drug is practically impossible
simply because we are all different. The complement of cytochrome P450
enzymes varies from person to person. On the one hand, this has to do with varying
enzyme concentrations in our bodies; on the other hand, polymorphisms
(▶ Sect. 12.10) cause varying metabolic behavior between different individuals.
This has been intensively investigated for the enzymes CYP 2D6, CYP 2C9, and
CYP 2C19. For example, CYP 2C9 is absent in 1–3% of Caucasians. These people
have difficulty metabolizing S-warfarin, and prodrugs such as codeine, tramadol,
and losartan are not activated. Polymorphisms of CYP 1A2 are
responsible for the fact that different individuals react differently to caffeine.
Multiple polymorphisms have been described for CYP 3A4. The best-investigated
example for a correlation between genetic variability and mode of action is the
degradation of the antihypertensive debrisoquine 27.66 to 4-hydroxydebrisoquine
27.67 (Fig. 27.30). This drug is metabolized by CYP 2D6. Caucasians can be
divided into slow, extensive, and ultrafast metabolizers with respect to their
ability to metabolize this drug. A study carried out in Sweden established the
distribution that is shown in Fig. 27.30. This has consequences for the prescription
of this drug. If a standard dose is administered to all patients, the extensive
metabolizers would be correctly dosed. The slow metabolizers would have
a plasma level that was too high, which can lead to undesirable side effects. It
would be barely possible, on the other hand, to establish a plasma level that was
adequate for therapy in the ultrafast metabolizers; the desired effect of the drug would
be absent. At this point, it would be ideal for the physician or pharmacist if the patient
had a directly readable gene chip that indicated to which group this patient belonged.
If an alternative antihypertensive were available that was metabolized by a different

[Histogram: number of probands (0 to 120) plotted against the metabolic ratio of
debrisoquine/4-hydroxydebrisoquine (logarithmic axis, 0.01 to 100), with the metabolizer groups
ultrafast, extensive, and slow marked from left to right. Inset: CYP 2D6 converts debrisoquine
27.66 into 4-hydroxydebrisoquine 27.67.]

Fig. 27.30 Correlation between genetic variability and the metabolism of the antihypertensive
debrisoquine 27.66 to hydroxydebrisoquine 27.67. The Caucasian population metabolizes this
drug with CYP 2D6 and is divided into slow, extensive, and ultrafast metabolizers. If a standard
dose of the drug is prescribed, the extensive metabolizers would respond well. On the other hand,
the same dose will lead to a plasma level that is too high for the slow metabolizers, which can lead
to side effects. The ultrafast metabolizers will barely reach a plasma level that is adequate for
therapy, and the desired effect of the drug will not be achieved.

CYP, the patient might benefit from a switch to a different drug. There are already
chips on the market on which a patient’s genetic CYP profile can be recorded. It must
be said, however, that the genomic information coding for a particular
enzyme is not adequate to assign an individual to a metabolic group. What matters
for metabolic efficiency is not the genotype, that is, the individual genetic complement
coding for the proteins, but rather the quantity of a protein that is actually expressed.
This determines the appearance, that is, the phenotype. Additionally, the phenotype can vary
according to the lifestyle and state of health of a person. One only has to think
about the induction of CYP 2E1 in heavy drinkers. This patient profile becomes
important if the therapeutic window for the use of a drug is very narrow, that is, if
the difference between the dose with the desired effect and a toxic dose is small (▶ Sect. 19.7).
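Assigning a person to a metabolizer group from a measured phenotype can be sketched as follows. The metabolic ratio is taken as debrisoquine divided by 4-hydroxydebrisoquine, as on the axis of Fig. 27.30. The two cutoff values are assumptions chosen for illustration and are not given in the text; the function names are likewise hypothetical.

```python
def metabolic_ratio(parent_amount, metabolite_amount):
    """Ratio of unchanged debrisoquine to 4-hydroxydebrisoquine."""
    if parent_amount < 0 or metabolite_amount <= 0:
        raise ValueError("amounts must be non-negative, metabolite above zero")
    return parent_amount / metabolite_amount


def classify_metabolizer(ratio, ultrafast_below=0.1, slow_above=12.6):
    """Assign a metabolizer group; both cutoffs are illustrative assumptions."""
    if ratio < 0:
        raise ValueError("metabolic ratio cannot be negative")
    if ratio < ultrafast_below:
        return "ultrafast"   # almost everything is converted to the metabolite
    if ratio > slow_above:
        return "slow"        # mostly unchanged drug is recovered
    return "extensive"


# Invented measurements (parent, metabolite) in arbitrary but equal units.
for parent, metabolite in [(0.5, 10.0), (4.0, 5.0), (30.0, 1.0)]:
    ratio = metabolic_ratio(parent, metabolite)
    print(f"ratio {ratio:6.2f} -> {classify_metabolizer(ratio)}")
```

Because the ratio is measured rather than read from the genome, it reflects the phenotype, including effects such as enzyme induction.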
It should also be mentioned that genetic differences in the cytochrome comple-
ment are not the only factors that lead to variable metabolic behavior. Transferases
(▶ Chap. 26, “Transferase Inhibitors”) that transfer, for instance, acetyl groups,
sugar moieties, or methyl groups, play an important role too. These enzymes are
also differently expressed in the general population, which is divided into, for
example, fast and slow acetylators. More attention must be paid to the metabolism
and the genetic and phenotypical variability. How the proband groups are distrib-
uted in terms of their metabolic characteristics must also be better investigated in
clinical trials. It is only then that reliable data about the therapeutic breadth of
a drug can be obtained before the drug gains widespread use in therapy.

27.8 Blocking the Degradation of Neurotransmitters: Monoamine Oxidase Inhibitors

Monoamine oxidases are examples of oxidoreductases that have a FAD molecule in
their active site acting as a cofactor. Two isoenzymes, MAOA and MAOB, have been
characterized. A sequence identity of 70% exists between the two forms. These
enzymes are embedded in the mitochondrial membrane. They were described for
the first time in 1928 as tyramine oxidases. In addition to tryptamines such as
serotonin 27.68, which are preferentially transformed by MAOA, both isoforms can
metabolize dopamine 27.69, adrenaline 27.70, and tyramine 27.71 (Fig. 27.31).
Their function is to degrade neurotransmitters in the synaptic gap by oxidative
deamination. The neurotransmitters are released into the synaptic gap and bind to
a postsynaptic G protein-coupled receptor (▶ Sect. 22.5, ▶ Fig. 22.7). To terminate
the stimulation, the neurotransmitters are removed from the synaptic gap by
a transporter and shuttled back into the presynaptic cell. There they are stored in
a vesicle again, or their chemical degradation is accomplished by monoamine
oxidase. Inhibition of MAOA or MAOB reduces the oxidative deamination of the
transmitters. As a result, they remain available longer for nerve transmission. This
therapeutic principle can be exploited in diseases in which, for instance, the brain
metabolism or the neurotransmission has fallen out of equilibrium. Examples of this
are depression, Alzheimer’s disease, or Parkinson’s disease.
Inhibiting MAOA raises the level of serotonin. Drugs that inhibit this isoform are
used to treat severe depression. Blocking MAOB increases the dopamine level,
which can represent an approach to treating dementia and Parkinson’s disease. The
catalytic reaction produces H2O2, an aldehyde, and free ammonia as products. The
peroxide can play an important role as a source of hydroxyl radicals in metabolic
processes, which, depending on the amount, can either have a protective or destruc-
tive effect. The other degradation products also have biological roles. There is,
however, a danger of a cytotoxic effect if their concentrations are too high.
The first MAO inhibitors were found by accident. Isoniazid 27.72 was synthe-
sized at the University of Prague by Hans Meyer and Josef Mally in 1912
(Fig. 27.31). Its antibiotic effects were recognized in the Second World War.
Even today, it is still a component of tuberculosis therapy. The development of
a hydrazide-substituted derivative, iproniazid 27.73, was accomplished at Roche. It
was introduced to the market under the name Marsilid®. Shortly after its introduc-
tion, a mood-brightening side effect was noticed in the tuberculosis patients. On
this basis, it was prescribed to patients suffering from depression. Because the only
method available for the treatment of these patients until then was electroconvul-
sive therapy, it was quickly celebrated as the “Drug of the Year.” There were
fatalities, however, because of liver toxicity, and in 1960 it was withdrawn from the
market. The success of this compound stimulated the search for other inhibitors that
lacked side effects. For example, phenelzine 27.74, tranylcypromine 27.75, and
pargyline 27.76 (Fig. 27.31) resulted from this work. They react with the FAD
system in the enzyme and render it useless for the electron-transfer mechanism
(Fig. 27.32b, c).

Fig. 27.31 Serotonin 27.68, dopamine 27.69, adrenaline 27.70, and tyramine 27.71 are
metabolized by MAOs. At first, hydrazide derivatives such as isoniazid 27.72 and iproniazid 27.73
or the hydrazine phenelzine 27.74 were discovered as inhibitors. Follow-up drugs such as
tranylcypromine 27.75 react with the FAD system of the flavoenzyme by a ring-opening reaction,
whereas others such as pargyline 27.76, L-deprenyl (selegiline) 27.77, and clorgyline 27.78 react
through their propargyl groups.

In the meantime both isoforms have been characterized by crystallography.
Interestingly, MAOB is a dimer, and MAOA is a monomer. The complex of
MAOB with tranylcypromine 27.75 was structurally investigated. How the compound
forms an irreversible covalent bond with the flavin scaffold can be seen in
Fig. 27.33. Many other MAO inhibitors have a propargyl amine group in their

Fig. 27.32 Possible mechanism for the deamination and inhibition of MAO enzymes. (a) Biogenic
amines are transformed into iminium compounds in a redox reaction by a hydrogen abstraction next
to the amino group. Formally, a hydride ion is transferred to the oxidized form of the FAD system.
After hydrolysis of the iminium ion, ammonia and aldehyde are obtained. The prosthetic group is
reoxidized with molecular oxygen, whereby H2O2 is formed. (b) Tranylcypromine 27.75 reacts with
the oxidized form of the FAD system upon ring opening and forms a covalent bond to C4a on the
ring. (c) Derivatives such as L-deprenyl 27.77 transfer one of their propargylic hydrogens onto the
oxidized form of the FAD scaffold. A covalent bond is formed with the N5 nitrogen atom.
A delocalized electron system between the FAD molecule and the inhibitor is formed.

Fig. 27.33 (a) Crystal structure of MAOB in complex with covalently bound tranylcypromine
27.75. The inhibitor is attached to the FAD system through the C4a carbon atom. (b) Crystal
structure of MAOB in complex with covalently bound L-deprenyl 27.77, which is coupled to the
cofactor through the N5 nitrogen atom. A delocalized electron system between the FAD molecule
and the inhibitor is formed.

scaffold. This moiety reacts with the nitrogen of the central FAD ring by building
a covalent bond. A delocalized electron system incorporating multiple bonds is
formed (Figs. 27.32 and 27.33).
L-Deprenyl 27.77 selectively blocks MAOB whereas clorgyline 27.78 selectively
inhibits the MAOA isoform. Both isoforms are very similar in the vicinity of the
FAD-binding site. Deviations occur only in the region of the pocket where the
biogenic amine binds as a substrate. Of the 20 amino acids that make up this area,
seven are structurally different. Above all, the two residues Ile199 and Tyr326 in
MAOB are exchanged for Phe208 and Ile335 in MAOA. They give the binding
pocket a different shape. The pocket is shorter, but broader and shallower in
MAOA. It is bordered by Phe208 from below, and can easily accommodate the
2,4-dichlorophenoxy group of clorgyline. The phenoxy group must adopt
a conformation that results in a parallel orientation extending the aliphatic chain
of the inhibitor. This is achieved due to the conformational properties of the
phenoxymethyl group, which, despite the ortho substituent, prefers to adopt
a planar arrangement with the attached chain (Fig. 27.34a). In MAOB, the pocket
takes on a deeper crevice shape in which a phenyl ring fits alongside its edge. The
volume of the pocket is restricted at the rim by the larger Tyr326 residue. Instead of
a phenyl ring on the bottom (cf. Phe208 in MAOA), the wall of the pocket is made
up of Ile199. This residue is considered to have the properties of a flexible entrance
gate. The inhibitor’s phenethyl group, upon which both ortho positions must remain
unsubstituted, enables the ligand, deprenyl, to take on the needed conformation
with a perpendicular orientation of its terminal aromatic ring relative to the
aliphatic chain (Fig. 27.34b).
In addition to the irreversible, covalently binding inhibitors, reversibly binding
inhibitors such as moclobemide 27.79, befloxatone 27.80, or toloxatone 27.81
(Fig. 27.35) are also known. They also occupy the part of the binding pocket that
takes up the biogenic amine substrate. They do not, however, form a covalent bond
with the FAD scaffold.
MAO inhibitors are especially used as antidepressants and for the treatment of
Parkinson’s disease. The antidepressant effect is primarily achieved by a specific
inhibition of MAOA in the central nervous system. In the brain, the levels of
dopamine, noradrenaline, and serotonin rise. The Parkinson’s disease therapy,
which is usually coupled with an L-DOPA strategy (▶ Sect. 26.9), is focused on
the inhibition of MAOB because this isoform is overexpressed in the brains of
Parkinson’s patients. Because both isoforms metabolize dopamine equally well, an
attempt is made to intervene in the Parkinson’s etiology with selective MAOB
inhibitors.
In addition to the above-mentioned liver toxicity that was observed with the first-
generation antidepressive hydrazide-type MAO inhibitors, hypertensive crises as
a result of an acute dysregulation of the blood pressure were also observed. This led
to these substances’ withdrawal from the market. The liver toxicity could be largely
avoided with compounds such as tranylcypromine 27.75 or pargyline 27.76
(Fig. 27.31), but the hypertensive crises continued to occur. These could be provoked
by increased concentration of tyramine in the body, above all when certain foods
containing high levels of tyramine (e.g., cheese, causing the so-called cheese
effect, or wine) were ingested, and the metabolic degrading enzymes were irre-
versibly blocked with a MAO inhibitor. An elevated concentration of noradrenaline
is the consequence, which activates the vascular system and can lead to arrhythmias
or heart attacks. Reversible MAOA inhibitors can avoid this problem to a certain
extent. They adequately block the enzyme in the central nervous system to achieve
the desired antidepressant effect. In the periphery, tyramine displaces the reversible
inhibitor and this allows tyramine degradation.
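The displacement of a reversible inhibitor by tyramine can be illustrated with the standard equilibrium expression for two ligands competing for one binding site. The sketch below assumes simple competitive binding at equilibrium; the concentrations and dissociation constants are hypothetical values in arbitrary but consistent units, not measured data for any MAO inhibitor.

```python
def inhibitor_occupancy(inhibitor_conc, k_i, substrate_conc, k_s):
    """Fraction of enzyme occupied by a reversible competitive inhibitor.

    k_i and k_s are the dissociation constants of inhibitor and substrate.
    """
    if min(inhibitor_conc, substrate_conc) < 0 or min(k_i, k_s) <= 0:
        raise ValueError("concentrations must be >= 0 and constants > 0")
    i = inhibitor_conc / k_i
    s = substrate_conc / k_s
    return i / (1.0 + i + s)


# Hypothetical values: inhibitor at ten times its dissociation constant.
low_tyramine = inhibitor_occupancy(10.0, 1.0, substrate_conc=0.0, k_s=1.0)
high_tyramine = inhibitor_occupancy(10.0, 1.0, substrate_conc=100.0, k_s=1.0)

print(f"inhibitor-bound fraction, no tyramine:   {low_tyramine:.2f}")
print(f"inhibitor-bound fraction, high tyramine: {high_tyramine:.2f}")
```

An irreversibly bound inhibitor is not described by this equilibrium: once the covalent bond to the FAD system has formed, a rise in tyramine concentration cannot free the enzyme again.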

Fig. 27.34 MAOA (a) and MAOB (b) differ in the binding region of the biogenic amine. The
shape of the binding pocket is mainly determined by the exchange of Phe208 → Ile199 and Ile335
→ Tyr326. The 2,4-dichlorophenoxymethyl moiety of the selective inhibitor clorgyline 27.78
(violet) binds in the broad and shallow binding pocket, which is bordered by Phe208, in the
complex with MAOA (residues are violet). The inhibitor’s aliphatic chain lies in the same plane as
the aromatic ring. This chain conformation relative to the ring is the preferred geometry for this
group. A statistical evaluation of the geometry of the ortho-chlorophenoxymethyl group in
small-molecule crystal structures indicates torsion angles (red) of preferentially 180°. The binding
pocket in the complex of MAOB (orange residues) with the selective inhibitor L-deprenyl 27.77 is
severely limited by Tyr326 and opens only a narrow crevice. The phenyl group of the inhibitor
(gray) submerges into this crevice. For this, the aromatic ring must adopt an orientation that is
perpendicular (90°) to the attached chain. A geometry similar to the one of the
dichlorophenoxy group in clorgyline is not possible for steric reasons (cf. superimposed geometry
of 27.78). On the other hand, deprenyl’s phenyl ring cannot bind to MAOA with the same
“submerged” edge-on geometry because a steric conflict with Phe208 would occur. Also here,
a statistical analysis of the torsion angles (blue) shows a clear preference for values of 90°, which
corresponds exactly to the desired perpendicular orientation of the plane of the phenyl ring to
the chain.
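The torsion angles evaluated in such statistics are computed from the positions of four consecutively bonded atoms. The following sketch shows one common way to do this with plain vector algebra; the coordinates are made up for illustration and do not come from any crystal structure, and the function names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond for atoms p0-p1-p2-p3."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)          # normal of the plane through p1, p2, p3
    b1_length = math.sqrt(_dot(b1, b1))
    y = b1_length * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: three fixed atoms and two alternative positions of the fourth.
a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
in_plane = torsion_angle_deg(a, b, c, (-1.0, 0.0, 1.0))
perpendicular = torsion_angle_deg(a, b, c, (0.0, 1.0, 1.0))

print(f"extended, in-plane arrangement: {abs(in_plane):.0f} degrees")
print(f"perpendicular arrangement:      {abs(perpendicular):.0f} degrees")
```

The sign of the angle depends on the chosen convention, so statistics such as those in the figure are often compiled from absolute values.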

Oxazolidinones represent a new group of antibacterial substances that presumably
inhibit the peptidyl transferase center in the bacterial ribosome (▶ Sect. 32.6).
A representative of this group is linezolid 27.82 (Fig. 27.35). Because of its structural
similarity to reversible MAO inhibitors such as toloxatone 27.81, it is also an MAOA
inhibitor. Therefore administration of this compound can induce hypertensive crises

[Structures shown: moclobemide 27.79, befloxatone 27.80, toloxatone 27.81, linezolid 27.82,
citalopram 27.83, sertraline 27.84, and almotriptan 27.85.]

Fig. 27.35 Examples of reversible MAO enzyme inhibitors 27.79–27.81. The antibiotic linezolid
27.82 bears structural similarity to the oxazolidinones 27.80 and 27.81. It also blocks MAOA.
MAO enzymes also play a role in drug metabolism. Citalopram 27.83, sertraline 27.84, and
triptans such as 27.85 are metabolized by these enzymes.

as described above. An attempt is therefore made to develop oxazolidinones with
adequate selectivity for the bacterial target, without such side effects.
In the two previous sections, the importance of cytochrome P450 enzymes for drug
metabolism was introduced. The MAO enzymes also take on a certain portion of
this task. Compounds such as citalopram 27.83, sertraline 27.84, and triptans such as
27.85 are also MAO substrates and are metabolized by these enzymes.

27.9 Cyclooxygenase: A Key Enzyme in Pain Sensation

The organism synthesizes a great many important signal molecules from compo-
nents of the lipid membrane. Phospholipids are the starting materials from which
arachidonic acid 27.86 is formed (Fig. 27.36). This chain-type molecule of

[Reaction scheme: 27.86 Arachidonic Acid is converted by cyclooxygenase to 27.87 PGG2 and by peroxidase to 27.88 PGH2, from which 27.89 PGI2 (prostacyclin), 27.90 PGE2, 27.91 PGF2, 27.92 PGD2, and 27.93 TXA2 (thromboxane) are derived]

Fig. 27.36 Arachidonic acid 27.86 is transformed into the prostaglandin PGH2 27.88 by the
bifunctional enzyme cyclooxygenase by using a cyclooxidation and a peroxidase step. PGH2 is the
starting material for the synthesis of a variety of prostaglandins 27.89–27.93, which are formed by
specific synthases.

20 carbon atoms has a carboxyl group as its only polar function. It is characterized
by four isolated cis double bonds. To be able to generate paracrine hormones such
as the prostaglandins 27.87–27.93 with sufficient water solubility, arachidonic
acid must be oxidized, that is, oxygen-containing functional groups must be introduced.
This role is taken on by the cyclooxygenases (COX). These are bifunctional
enzymes that catalyze the transformation to prostaglandins in two steps.
Initially a cyclooxidation takes place, then a peroxidase reaction (Fig. 27.36).
Because of the poor water solubility, arachidonic acid diffuses directly from the
membrane into the reaction site of the cyclooxygenase. The enzyme submerges into
the membrane. Three helices allow it to practically swim in the
membrane. These helices anchor the protein in the membrane, but they do not
traverse it as is often observed in membrane-anchored proteins. There are two
isoforms, COX-1 and COX-2, the amino acid sequences of which are 65% identical
(Fig. 27.37). Their catalytic sites are almost identically constructed. They are active
as dimers. Access to the catalytic site is obtained through a long channel that opens

Fig. 27.37 Two isoenzymes, COX-1 (green) and COX-2 (blue), which have 65% sequence
identity, are known. They are catalytically active as dimers and submerge into the membrane
with a ring of the hydrophobic helices (coming out of the page, toward the reader). This ring
represents an opening to the channel through which arachidonic acid 27.86 (dark blue) can diffuse
from the membrane into the catalytic site. The superposition of the crystal structures of both
isoforms is shown from the direction of the membrane.

directly to the membrane environment. The natural substrate arachidonic acid 27.86
is taken up in this way. The channel is somewhat narrower in COX-1 than in COX-2
because an isoleucine in a central position is exchanged for a valine. The
arachidonic acid that has diffused into the channel is transformed into endoperoxide
PGG2 by addition of oxygen at C11 and C15 (Fig. 27.36).
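
The statement that the two isoforms are 65% identical refers to the share of aligned positions that carry the same amino acid. The sketch below shows one common way to compute such a value from two pre-aligned sequences. The function name and the two short peptide strings are hypothetical (they are not COX sequences), and the choice of denominator is an assumption, because several conventions are in use (gapped columns are ignored here).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over all gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip columns in which either sequence has a gap
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment in one-letter code, for illustration only.
toy_identity = percent_identity("GIVLA-KF", "GVVLAQKY")
print(round(toy_identity, 1))
```

A conservative exchange such as Ile → Val counts as a mismatch in this measure, although it changes the size of a side chain by only one methyl group.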
The heme cofactor in the vicinity of the reaction channel is essential for the
transformation. Its fifth coordination site is occupied by histidine. The oxidative
oxygen species is bound at the sixth position. The dioxygen species is transferred as
a hydroperoxide in a two-electron reaction. Tyr385 acts as an intermediate tyrosyl
radical for the electron transfer and abstracts a hydrogen atom from C13
(Fig. 27.38). The temporarily present, unsaturated radical adds the peroxide group
to the allyl position at C11. Subsequently, a cyclic peroxide is closed with C9; C8
reacts with C12, which is spatially nearby, and forms a 5-membered carbocyclic
ring. Another hydrogen atom abstraction at C13 initiates a peroxide transfer onto
C15, which is also in an allylic position. This is transformed into a hydroxyl
function in the subsequent reduction steps catalyzed by the peroxidase activity. It
is presumed that the peroxidase reaction site of the enzyme, located on the opposite
side, is accessed from the outside of the protein in the vicinity of the endoplasmic
reticulum for this reaction step. For this, the oxidized substrate must diffuse out of
the arachidonic acid channel to the position of the peroxidase reaction. Tyr385,
which is found deep in the protein in the vicinity of the heme center, is critical for

[Reaction scheme: stepwise conversion of 27.86 Arachidonic Acid to 27.87 PGG2 and 27.88 PGH2 next to Tyr385 and the Fe-heme center; carbon atoms 5, 8, 9, 11, 12, and 15 are labeled]

Fig. 27.38 The chemical transformation of arachidonic acid 27.86 to PGG2 27.87 and PGH2
27.88 occurs by an attack of the Tyr385 tyrosyl radical on the C13 carbon atom, from which
a hydrogen atom is abstracted. The intermediately formed, unsaturated radical adds a peroxide
group to C11. A cyclic peroxide is formed with C9 by a ring-closing reaction. The tyrosyl radical
abstracts another hydrogen atom from C13, and C8 closes a carbocycle with C12 to form PGG2
27.87. The product leaves the binding pocket and is further chemically transformed to PGH2 27.88
in a peroxidase reaction.

the overall reaction. The heme center catalyzes the oxidation and reduction steps according to the
changing oxidation states of the iron. The oxygen species that are to be transferred
are simultaneously supplied from this center. Two enzymatic processes that are
tightly interwoven take place in the COX enzymes. A dioxygen species is needed as
a reagent for cyclooxygenase activity. Tyrosine has a special task because it is
coupled as an intermediate radical to both activities. The radical state of this residue
is formed during the peroxidase reaction and it initiates the cyclooxygenase reac-
tion via homolytic hydrogen atom abstraction. The crystal structures of COX-1 with

[Figure labels: Fe (heme), Ile523, Tyr385, Arg120, Ser530; substrate atoms 8, 11, 13, and 15]

Fig. 27.39 Superposition of arachidonic acid 27.86 (violet) and PGH2 27.88 (gray) in the reaction
channel of COX. The heme center to which oxygen is bound is at the top of the right side. Tyr385
(yellow) is responsible for the hydrogen abstraction from C13 of arachidonic acid. The atoms of
the protein are largely removed, and the reaction channel is indicated with a transparent surface.
The displayed geometry is based on the crystallographically determined complexes of COX with
arachidonic acid and PGH2.

the superimposed arachidonic acid substrate 27.86 (violet) and the product PGH2
27.88 (gray) are shown in Fig. 27.39.
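
The oxygen bookkeeping of the two steps described above can be checked with a short calculation. The sketch below is only an illustration. It assumes the standard elemental compositions of arachidonic acid (C20H32O2), PGG2 (C20H32O6), and PGH2 (C20H32O5), which are not stated in this section, and rounded average atomic masses.

```python
from collections import Counter

# Rounded average atomic masses in g/mol (assumption: standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(composition):
    """Sum the atomic masses for an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())


arachidonic_acid = Counter({"C": 20, "H": 32, "O": 2})
dioxygen = Counter({"O": 2})

# Cyclooxygenase step: two O2 molecules are incorporated (at C11 and C15).
pgg2 = arachidonic_acid + dioxygen + dioxygen
# Peroxidase step: the hydroperoxide at C15 is reduced to a hydroxyl group,
# which removes one oxygen atom from the molecule.
pgh2 = pgg2 - Counter({"O": 1})

for name, composition in (("arachidonic acid", arachidonic_acid),
                          ("PGG2", pgg2), ("PGH2", pgh2)):
    print(name, dict(composition), round(molar_mass(composition), 2))
```

The gain of four oxygen atoms in the first step and the loss of one in the second step account for the increasing polarity of the products compared with arachidonic acid.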
PGH2 27.88 is the central starting material for the synthesis of a series of
arachidonic acid derived products (Fig. 27.36). A variety of synthases are involved
in the transformations that afford the different prostaglandins. COX catalyzes the
rate-determining step, and this explains its central role in the regulation of inflam-
matory processes. Prostaglandins are referred to as inflammatory mediators.
Prostacyclin PGI2 27.89 and PGE2 27.90 increase the vascular permeability. This
leads to tissue swelling, and rubor (redness) occurs as a result of the increased
perfusion. Nociceptive nerve endings are sensitized, and pain perception is
increased. In the stomach, PGI2 and PGE2 are involved in the regulation of the
mucous membranes and the stomach acid production. PGE2 is also associated with
the occurrence of fever in inflammatory processes. The prostaglandin PGF2 27.91 is
associated with reproductive processes. At the beginning of labor, COX-2 is
expressed in the placenta at elevated levels. The PGE2 that is produced is involved
in stimulating the uterus to contract. PGD2 27.92 takes on the task of regulating
contractions in the bronchial airway. PGH2 is a starting material for the synthesis of

thromboxane TXA2 27.93. It is formed by COX-1, which is present in thrombocytes
(blood platelets). TXA2 binds to the thromboxane receptor, a GPCR
(▶ Sect. 29.1), and activates thrombocyte aggregation. This last step initiates cellular
coagulation and serves to close injured blood vessels (▶ Sects. 23.4 and ▶ 31.2).
Analogously to PGD2, thromboxane also causes smooth-muscle contraction in the
vasculature of the lungs. Moreover, the prostaglandins are associated with important
regulatory processes such as kidney perfusion, body temperature regulation, immune
response modulation, and regulatory processes in the ovarian cycle.
Because of their central role in the regulation of such diverse processes in
tissues, enzymes in the synthesis cycle of prostaglandins are ideal candidates for
drug therapy. As already mentioned, we have two isoforms of cyclooxygenases.
COX-1 is ubiquitously expressed in all tissues. It is constitutively present, that is,
its production is largely independent of cell type, cell stage, or other external
influences. It is the only isoform found in the platelets. COX-1 occurs in
the endothelial cells of normal blood vessels, whereas the COX-2 isoform is found
in the endothelial cells of proliferating blood vessels, in inflamed tissue, and at sites
of atherosclerotic damage. Furthermore, COX-2 is strongly expressed in some
tumor cells, where it could play a role in tumor growth. It is involved in the
production of prostacyclin PGI2 in the kidney, which then activates renin produc-
tion (▶ Sect. 24.2). COX-1 occurs in the renal cortex and produces PGE2 and PGI2
there, which increase the kidney perfusion and the glomerular filtration rate.
Overdoses of COX-1 inhibitors can therefore exert damaging side effects on the
kidney function. The expression of the second isoform, COX-2, can be induced in
many different ways, and the amount that is present in the cell depends strongly on
the cell’s condition and that of the environment. Presumably COX-2 diverged from
COX-1 by gene duplication long before the development of vertebrates, and
both isoforms developed in parallel. It was first speculated in 1972 that
there might be two isoforms. Twenty years later the new form was found and
sequenced. Its structural determination guided the development of specific inhibi-
tors for both forms. The first selective COX-2 inhibitors were introduced into
therapy in 1999.
Cyclooxygenase inhibitors are very old and have been used in therapy for a very
long time (Fig. 27.40). Acetylsalicylic acid (ASA) 27.94 was the first to find wide
use (▶ Sect. 3.1). It has an interesting mode of action because it inhibits both
isoforms equally well by an irreversible acetylation of Ser530 (Fig. 27.41).
ASA diffuses into the COX reaction channel. It most probably forms a salt bridge
with Arg120 and transfers its acetyl group, in principle comparable to the reaction
of a serine hydrolase, to the OH group of Ser530, which is nearby. The channel is
then irreversibly blocked and the enzyme is permanently deactivated. The function
of COX in the inhibited cells can only be restored by new synthesis. This means
that, for example, in the thrombocytes, which lack a nucleus and are therefore
unable to synthesize proteins, the thromboxane production is permanently blocked.
For the 8–12-day lifetime of the blood platelet, the ability to supply thromboxane
A2 (TXA2) and to initiate aggregation is severely limited. This is what is
responsible for the blood-thinning effect of Aspirin®. Patients are therefore asked

[Chemical structures: 27.94 Acetylsalicylic Acid (ASA), 27.95 Ibuprofen, 27.96 Ketoprofen, 27.97 Flurbiprofen, 27.98 Indometacin, 27.99 Sulindac, 27.100 Diclofenac, 27.101 Celecoxib, 27.102 Valdecoxib, 27.103 Rofecoxib, 27.104 Etoricoxib, 27.105 Lumiracoxib]

Fig. 27.40 Inhibitors of COX isoenzymes. Acetylsalicylic acid 27.94 and the arylacetic acids or
propionic acids 27.95–27.100 are unspecific inhibitors of both isoforms. After the discovery of the
induced COX-2, the coxibs 27.101–27.104 were developed as selective inhibitors of this isoform.
Rofecoxib was withdrawn from the market due to an increased risk of cardiovascular diseases.
Lumiracoxib 27.105, which is structurally identical to diclofenac with the exception of a Cl/F
exchange and an additional methyl group, was introduced to the market as a COX-2-selective
inhibitor.

Fig. 27.41 The most probable binding mode of acetylsalicylic acid (ASA) 27.94 with COX-1.
ASA binds in the middle of the reaction channel (gray surface) that is normally occupied by the
natural substrate, arachidonic acid 27.86. The channel spans through the protein with a bent shape
from the lower left. ASA forms a salt bridge with Arg120 and reacts with the OH group of Ser530 by
transferring its acetyl group. This blocks the channel irreversibly. The additional volume that the
acetyl group blocks is indicated with a violet surface (interior is yellow). The displayed geometry is
based on a crystal structure that was determined with a bromine derivative of ASA.

before any surgery whether they have taken Aspirin® in the last week. Salicylic
acid, which lacks the acetyl group, is a weak but reversible inhibitor of COX that is
competitive to arachidonic acid. If Ser530 is mutated to Ala, the enzyme is
catalytically fully active. The mutant, however, is only weakly inhibited by ASA.
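
Why the question about the last week matters can be illustrated with a deliberately crude turnover model. The sketch assumes that a single dose acetylates COX-1 in all circulating platelets, that every platelet lives for the same fixed time, that platelet ages are evenly distributed, and that newly released platelets are unaffected. None of these simplifications comes from the text apart from the lifetime range of 8–12 days; the function name and the default of 10 days are hypothetical choices.

```python
def uninhibited_platelet_fraction(days_since_dose, platelet_lifetime_days=10.0):
    """Fraction of circulating platelets with functional COX-1.

    Simplified model: inhibited platelets are replaced linearly by new,
    uninhibited ones until the whole population has turned over.
    """
    if days_since_dose < 0 or platelet_lifetime_days <= 0:
        raise ValueError("times must be non-negative and the lifetime positive")
    return min(days_since_dose / platelet_lifetime_days, 1.0)


for day in (0, 3, 7, 10):
    print(day, uninhibited_platelet_fraction(day))
```

Under these assumptions roughly 70% of the platelets would be functional again one week after the last dose, which is in line with the time window that is asked about before surgery.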
In addition to ASA, the arylacetic and propionic acids are another group of
slightly selective and reversible COX inhibitors that deserve mention. Among
others, ibuprofen 27.95, ketoprofen 27.96, flurbiprofen 27.97, indometacin 27.98,
sulindac 27.99, and diclofenac 27.100 (Fig. 27.40) belong to this class. Ibuprofen
also binds in the arachidonic acid channel and forms a salt bridge with its terminal
carboxylic acid function to Arg120. Moreover, oxicams, anthranilic acids, and
pyrazole derivatives are important COX inhibitors. They are termed NSAIDs
(non-steroidal anti-inflammatory drugs). The mode of action of paracetamol (acet-
aminophen) 27.64, a very old and widely used analgesic, was associated with COX
enzymes for a long time. Now, however, it seems that this drug might act by

being conjugated with arachidonic acid through amidation with its metabolite,
p-aminophenol, and in this way, it intervenes in the pain cascade. The newly formed
N-arachidonoyl-p-aminophenol is a nanomolar vanilloid and CB1 receptor antag-
onist, both of which are examples of GPCRs, and the cellular uptake of the
analgesically active anandamide (arachidonoylethanolamide) is inhibited.
Because COX-1 is constitutively expressed in all tissues, unselective COX
inhibitors also act in places where the prostaglandins are needed for other tasks
that have nothing to do with pain. An example is the production of prostacy-
clin 27.89, which is responsible for the regulation of the production of mucus in
the stomach. COX inhibitors block its synthesis, and the protective effect
on the stomach epithelial cells against the severely acidic milieu is lost as an
undesirable side effect. Gastric irritation is the result and can lead to severe
complications.
When it was discovered early in the 1990s that the expression of COX-2 is
upregulated at the site of pain, hopes were high that a side-effect-free pain therapy
could be achieved by selectively inhibiting this enzyme. A careful analysis of both
enzymes showed that there are small but significant differences: in position 523,
COX-1 has an Ile residue, whereas COX-2 has a Val residue. Further, though of less
importance, is an exchange of a Phe residue in COX-1 for a Leu residue in COX-2
at position 503. What can be expected in terms of selectivity from such a small
difference as a methyl group exchange? At the very least, the binding pocket of
COX-2 is 17% larger, and there is a new sub-pocket in the arachidonic acid channel
(Fig. 27.42). It stood to reason that structurally larger inhibitors could be developed
that take advantage of the additional sub-pocket. Such inhibitors can no longer
inhibit COX-1 because of the steric conflict that is caused by the isoleucine residue
at position 523. The first generation of successfully developed COX-2 inhibitors
27.101–27.104 all have a similar structure (Fig. 27.40). In the center is either a five-
or six-membered ring that is usually functionalized with aromatic substituents. This
causes a branched structure that fits the larger binding pocket of COX-2 better
than that of COX-1. In practice, it was demonstrated that the selective COX-2 inhibitors
left COX-1 uninhibited, and the side effects, such as bleeding of the gastric
mucous membranes or a decrease in kidney function, were almost fully eliminated.
The first compounds to come to the market were celecoxib 27.101, valdecoxib
27.102, and rofecoxib 27.103 (Fig. 27.40). Their indications for use ranged from
rheumatism, to osteoarthritis, to chronic polyarthritis, and ankylosing spondylitis
(Bechterew’s disease). All of these diseases are associated with severe pain.
Rofecoxib 27.103 (Vioxx®) quickly achieved sales in the billions. In 2004, how-
ever, the drug was withdrawn from the market because significant side effects were
observed in patients undergoing long-term therapy. Specifically, an increased risk
of cardiovascular disease was observed, and especially the risk of heart attack,
unstable angina pectoris, and stroke increased. As a result, Merck & Co. experi-
enced a drop in profits in 2004 of 29%. As of March 2006, 10,000 claims for
damages had already accumulated. However, shortly after the withdrawal of
rofecoxib 27.103, Merck introduced a new COX-2 inhibitor, etoricoxib 27.104, to
the market.

[Figure labels: Ile359, Ser530, Ile523 (COX-1), Val523 (COX-2)]

Fig. 27.42 Structure of celecoxib 27.101 with COX-2. The inhibitor is shown with a green
surface (interior is blue). Position 523 is a valine in COX-2, but it is an isoleucine in COX-1. If the
Ile residue from COX-1 is superimposed on the valine from the COX-2 structure, the increased
spatial demand of the additional methyl group of the Ile is apparent (surface indicated by the light-
blue net). Ile demands a larger volume in the binding pocket and prevents the binding of the
branched-substituted, five-membered-ring inhibitors. The displayed structure is based on a crystal
structure that was determined with a bromine derivative of celecoxib.
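
Isoform selectivity of the kind discussed here is often expressed as a ratio of inhibitory concentrations measured against the two enzymes. The sketch below shows this ratio as one possible measure. The function name is hypothetical, and the IC50 values are invented for illustration only; they are not data for any compound in Fig. 27.40.

```python
def selectivity_index(ic50_cox1, ic50_cox2):
    """Ratio IC50(COX-1) / IC50(COX-2); values above 1 favor COX-2 inhibition."""
    if ic50_cox1 <= 0 or ic50_cox2 <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_cox1 / ic50_cox2


# Invented IC50 values in micromolar, for illustration only.
branched_inhibitor = selectivity_index(ic50_cox1=15.0, ic50_cox2=0.5)
nonselective_inhibitor = selectivity_index(ic50_cox1=2.0, ic50_cox2=2.0)
print(branched_inhibitor, nonselective_inhibitor)
```

Such ratios depend strongly on the assay conditions, so values from different test systems should be compared with caution.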

Altogether this raises the question of whether the side effects that were seen with
rofecoxib 27.103 are typical of all COX-2 inhibitors. The cardiovascular risk must
be weighed against the risk of the gastric bleeding that can occur with
acetylsalicylic acid, diclofenac, ibuprofen, or indometacin. Rofecoxib belongs to
the first generation of COX-2 inhibitors, which all have a five-membered ring in the
center. In 2006 lumiracoxib 27.105 (Prexige®), a COX-2 inhibitor, was introduced
to the market. Structurally it is similar to diclofenac, which is less selective.
It remains to be seen if it shows a different side-effect profile. The example of the
coxibs impressively demonstrates how careful design can exploit even the smallest
difference of a methyl group, that is, an Ile → Val exchange between COX-1 and
COX-2, to lead to a new class of compounds and successful drugs.

27.10 Synopsis

• Enzyme-catalyzed redox reactions use different cofactors to accomplish the
electron or hydride transfer from the group being oxidized to the group being
reduced. The most important cofactors in oxidoreductases are dinucleotides such

as the nicotinamides NAD(P)+ or the flavin derivatives FMN and FAD and the
iron-containing protoporphyrin ring system in heme enzymes.
• The nicotinamide moiety in NAD(P)+ is an N-substituted pyridine derivative
that either accepts or releases a hydride ion in the 4-position. The cofactor binds
in many oxidoreductases to a conserved fold motif, the nucleotide-binding
Rossmann fold.
• Dihydrofolate reductase is involved in the biosynthesis of thymine. Inhibitors
competitive with the binding site of the natural substrate dihydrofolate have been
developed as potent chemotherapeutics in cancer therapy, or as bacteriostatics to
fight bacterial infections.
• Reduction of the cholesterol blood level is a strategy to fight coronary heart
disease and atherosclerosis because a high excess of cholesterol is found in the
plaques that constrict and thus occlude blood vessels.
• HMG-CoA reductase is involved in the biosynthesis of precursors of cholesterol.
The substrate, composed of two acetate units, is reduced by using two equivalents
of NADPH. Inhibitors, the statins, which occupy the cofactor-binding site, were
first derived from natural compounds discovered by screening microorganisms.
Later fully synthetic derivatives were developed that evolved into the best-selling
drugs ever.
• Aldose reductase, an NADPH-dependent reductase lacking a Rossmann-folded
nucleotide-binding domain, is involved in the polyol pathway, along which
glucose is metabolized to sorbitol and subsequently to fructose. Overloading
this pathway results in increased production of polar compounds, which creates
osmotic stress and oxidative stress as a result of high reductase activity.
• Long-term consequences of poorly controlled blood glucose level in the case of
type-II diabetes preferentially affect cells that do not control their glucose uptake
by insulin. Inhibition of aldose reductase is a viable principle to reduce long-
term complications.
• Aldose reductase is able to reduce a broad scope of different aldehyde substrates.
This is achieved by a highly adaptive binding pocket, which also allows the
development of inhibitors showing largely deviating scaffolds and binding modes.
• Cortisol is transformed to cortisone and vice versa via two isoforms of 11β-
hydroxysteroid dehydrogenase, which is an NADPH-dependent reductase that
takes on a Rossmann fold. 11β-HSD1 inhibition has been suggested as
a promising therapy concept to treat metabolic syndrome.
• The cytochrome P450 enzymes are a superfamily of heme proteins that carry out
biochemical transformations as monooxygenases by introducing oxygen onto
a substrate being oxidized. They are particularly involved in the metabolism of
xenobiotics, and a major part of the administered drug molecules are metabolized
by CYP 3A4, CYP 2D6, and CYP 2C9.
• The CYP enzymes are highly adaptive and accommodate substrates of signifi-
cantly different sizes. They can be inhibited by drug molecules, particularly
those containing heteroaromatic rings that coordinate the catalytic iron ion in the
heme center. Their expression can be induced and thus upregulated by xenobi-
otics activating, for instance, the PXR transcription factor.

• Because the complement of cytochrome P450 enzymes varies with genotype and
phenotype, this polymorphism causes varying metabolic behavior between dif-
ferent individuals. Differentiation into slow, extensive, and fast metabolizers has
consequences for the prescription and required dose level of a given drug
metabolized by the involved CYPs.
• Because the activity of a metabolizing CYP enzyme can be further modulated
either through inhibition or induction by coadministered drugs or by xenobiotics
taken up in the diet, severe consequences with regard to the dose level present in
the body can result, and this can cause undesired and dangerous side effects or
unexpected failure of drug action.
• Monoamine oxidases MAOA and MAOB are FAD-dependent oxidases and
metabolize important neurotransmitters such as dopamine, adrenaline, or sero-
tonin. Inhibition of these enzymes can help in the therapy of depression,
Alzheimer’s, or Parkinson’s disease.
• Most of the current MAO inhibitors are activated by an initial redox step and
a covalent attachment is formed to the FAD cofactor via a highly reactive
intermediate; this leads to an irreversible chemical modification of its redox
properties.
• The membrane-associated cyclooxygenases COX-1 and COX-2 synthesize the
endoperoxide PGG2, which is a precursor to a large variety of prostaglandins,
from arachidonic acid. Prostaglandins are an important class of paracrine hor-
mones and also referred to as inflammatory mediators.
• COX contains a heme center, and PGG2 is synthesized through a cyclooxidation
step involving radical intermediates. In a subsequent peroxidation step involving
release and diffusion of the substrate to another reaction site, PGG2 is further
modified to PGH2.
• COX is inhibited by non-steroidal anti-inflammatory drugs such as
acetylsalicylic acid, ibuprofen, indometacin, or diclofenac. They bind to the
reaction channel and block access of the natural substrate arachidonic acid.
• Acetylsalicylic acid transfers its acetyl group irreversibly to a channel-exposed
hydroxyl group of Ser530 in a reaction similar to that in serine hydrolases. As
a consequence, in cells lacking a nucleus such as thrombocytes, prostaglandin
synthesis and its products such as thromboxane are permanently blocked for the
lifetime of the cell.
• Two isoforms of COX exist. COX-1 is ubiquitously expressed in all tissues
and constitutively present. Because it is involved in many physiolog-
ical processes, overdosing of COX-1 inhibitors can exert severe side
effects. COX-2 is induced and found in endothelial cells of proliferating
blood vessels, inflamed tissue, sites of atherosclerotic damage, and in some
tumor cells. This makes selective COX-2 inhibition a prospective therapeutic
principle.
• COX-1 and COX-2 differ in the reaction channel by the crucial exchange of an
isoleucine for a valine residue. The additional volume created in COX-2 by the
absent methyl group gives rise to the development of size-extended, branched

inhibitors, the coxibs. Their indications range from rheumatism, osteoarthritis,
and chronic polyarthritis to ankylosing spondylitis; all of these diseases are associ-
ated with severe pain.

28 Agonists and Antagonists of Nuclear Receptors

For all the joy of the structure-based design of enzyme inhibitors, it must not be
forgotten that less than half of the prescribed drugs act on enzymes. Many other
drugs have receptors, transporters, pores, or ion channels as target structures. Most
receptors mediate the information transfer from the exterior into the interior of the
cell. Either activating or blocking them changes the cell’s state. In this way they can
take on modulating tasks. Transporters, pores, and ion channels serve to transport
selected substances across the membrane, especially substances that are unable to
cross by passive diffusion because of their polar character. Like receptors, the
latter proteins are embedded in the cell membrane. Before we turn to this class of
membrane-bound targets, another class of receptors that is found in the cell’s
interior should be considered. Nuclear receptors are controlled by specific ligands.
An endogenous hormone must first penetrate the cell to achieve activation. This is
usually accomplished by passive diffusion through the membrane. The ligands must
therefore possess adequate lipophilic or amphiphilic properties or must be sub-
strates of transporters.

28.1 Nuclear Receptors Are Transcription Factors

Nuclear receptors are soluble receptors that are found in the cytosol. As tran-
scription factors, they regulate the expression of specific genes in the cell nucleus
and are therefore responsible for the production of proteins. They bind directly to
DNA and take on an important role in gene regulation in embryonic development,
in cell growth, and in cell differentiation and specialization. Malfunctioning of
these receptors leads to diseases with uncontrolled cell growth (e.g., cancer),
metabolic disorders (diabetes or obesity), or reproductive disruption (infertility).
They are activated by hormones. These natural ligands, which include the steroid
hormones and also lipophilic ligands such as retinoic acid, diverse fatty acids,
triiodothyronine, vitamin D, prostaglandins, bile acids, and phospholipids, must
passively cross the cell membrane barrier (Fig. 28.1). Once they arrive at the site of
action, they bind to the ligand-binding domains of the nuclear receptors. From the

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_28, 697


© Springer-Verlag Berlin Heidelberg 2013

[Structural formulas of Fig. 28.1: 28.1 Estradiol; 28.2 Progesterone, R = COCH3;
28.3 Testosterone, R = OH; 3,5,3'-Triiodothyronine; 1,25-Dihydroxyvitamin D3;
Prostaglandin D2; All-trans Retinoic Acid; α-Linolenic Acid]

Fig. 28.1 The natural ligands of nuclear receptors are steroids such as estradiol 28.1,
progesterone 28.2, and testosterone 28.3 as well as molecules such as retinoic acid, fatty acids,
triiodothyronine, vitamin D, or prostaglandins.

point of view of drug design, these receptors are interesting target structures
because the natural ligands correspond to the typical size of a drug molecule. In
2003, 34 of the 200 most often prescribed drugs acted on nuclear receptors. At first
glance, these target structures seem ideally suitable, but the biological control of
gene expression is, in contrast, very complex. The receptors not only have ligand-
dependent domains, but also ligand-independent domains to activate transcription.
As soon as the receptors migrate into the cell nucleus, the coactivators, corepressors,
and transcription factors contribute to the regulation of gene expression. An
upregulation as well as a downregulation can be achieved. They also seem to interact
with other signal transduction pathways that are controlled by, for instance, NF-κB
or activator protein AP-1. On the other hand, in view of molecular diversity, the
protein family of nuclear receptors seems to be straightforward. In our genome, there
are 48 genes that code for the different receptors.

28.2 The Structure of Nuclear Receptors

Nuclear receptors are all constructed according to the same blueprint. They contain
three domains. The N-terminal A/B region is the most variable in the family.
It contains the transactivation domain and is involved in the ligand-independent
recognition of cofactors and further transcription factors. Next comes the
DNA-binding domain, which contains about 70 amino acids and two so-called
zinc-finger motifs. This domain is the most conserved in the entire gene family.
The C-terminal domain ends with the ligand-binding region, which contains about
250 amino acids. It hosts the binding site of low-molecular-weight ligands and
contributes an additional regulatory element to the recognition of coactivators
and other transcription factors.
Nuclear receptors are divided into two groups. The first one comprises steroid
receptors, which form a homodimer to be activated. The second large group
contains receptors that form a heterodimer with the promiscuous retinoid-X
receptor (RXR) to function. There are further receptors that can bind to DNA as
monomers. The dimerization is achieved as a response to the binding of an agonist,
or the dimer formation is stabilized by the bound agonist. Some nuclear receptors
reside in the cytosol as inactive complexes with heat shock proteins. Ligand binding
stimulates the dissociation of these initially inactive complexes and triggers the
signal to migrate into the cell nucleus. There, the dimerized receptor binds with its
DNA-binding domain to a so-called DNA-response element, which resides on the
target gene in the promoter or repressor region. The newly formed complex serves
as a further docking site for coactivators. Their additional binding is translated into
an initiation signal for the start of transcription and subsequent gene expression.
Each DNA-binding domain recognizes a specific pattern of six bases in the
major groove of DNA by using a two-helix motif (▶ Sect. 14.9). This pattern is
located mirror-symmetrically on both complementary strand segments (Fig. 28.2)
in opposite directions. Two zinc fingers stabilize the two-helix motif. For this, the
zinc ion coordinates tetrahedrally to four neighboring cysteine residues, which
enables crosslinking within the protein strand.
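
The mirror-symmetric arrangement of the six-base pattern on the two complementary strands can be made concrete with a short sketch. The following Python snippet checks whether two half-sites separated by a spacer form an inverted repeat, that is, whether the second half-site is the reverse complement of the first. The half-site AGGTCA, the three-base spacer, the test sequences, and the function names are illustrative assumptions and are not taken from the text.

```python
# Sketch: test whether a DNA response element is an inverted repeat
# (two half-sites that read identically on opposite strands).
# Half-site length, spacer length, and sequences are illustrative assumptions.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_inverted_repeat(element: str, half_len: int = 6, spacer: int = 3) -> bool:
    """True if the second half-site is the reverse complement of the first."""
    element = element.upper()
    if len(element) != 2 * half_len + spacer:
        return False
    first = element[:half_len]
    second = element[half_len + spacer:]
    return reverse_complement(second) == first


if __name__ == "__main__":
    print(is_inverted_repeat("AGGTCACAGTGACCT"))  # half-sites AGGTCA / TGACCT
    print(is_inverted_repeat("AGGTCACAGAGGTCA"))  # direct repeat, not inverted
```

In such an element, each monomer of a receptor dimer would encounter the same six bases when reading its own strand, which is one way to picture why a symmetric dimer fits a mirror-symmetric binding site.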
The ligand-binding domains of the nuclear receptors also follow a common
construction principle. They are made up of 12 helices. The sequence at the end of
the 12th helix has a particular task. It opens and closes access to the ligand-binding
pocket like a door. In doing so, it undergoes a spatial rearrangement that gives
a signal for the activation of the receptor (Sect. 28.4).
The ligand-binding pockets in the nuclear receptors encompass about
400–600 Å³. They have polar amino acids on both ends and a belt of hydrophobic
residues in the center. The ligand-binding pockets are even larger in the receptors
that form heterodimers with the retinoid-X receptor RXR. In the peroxisome
proliferator-activated receptors (PPARs), they can encompass up to 1,300 Å³.
Despite having a common architecture and rather broad variation in volume
for ligand accommodation, many ligand-binding domains are able to achieve
astonishing selectivity with respect to the recognition of their ligands. This
selectivity shall be illuminated in more detail in the following section.

Fig. 28.2 Portion of the crystal structure of the DNA-binding domain of the estrogen
receptor. The receptor binds to the major groove of DNA (backbone is white, bases are
color coded, cf. ▶ Fig. 14.17) with a zinc finger made from a two-helix motif (brown) and
catches a specific six-base pattern. The two-helix motif is crosslinked by a zinc ion
(blue-gray), which is tetrahedrally coordinated by cysteine residues in the vicinity.
[Figure labels: Zn2+ (two ions); base color legend G, C, T, A]

28.3 Steroid Hormones: How Small Differences Translate to the Receptor

The male and female sexual hormones and the corticosteroids are substances with
stunningly similar structures. All are derived from an identical basic scaffold. On
a grand scale, Nature manages to invoke a broad spectrum of the most diverse
biological effects with minimal structural variation. In doing so, a mistake could
have fatal consequences. The difference between estradiol 28.1, progesterone 28.2,
and testosterone 28.3 shall be examined in greater detail. A hydroxyl group occurs
on the aromatic ring of estradiol in the first ring of the steroid scaffold that is
changed to a carbonyl group in the partially hydrogenated ring of progesterone and
testosterone. The aromatic A-ring of the female hormone estradiol adopts a planar
structure, but the ring in the male hormone, testosterone, forms a half-chair
(Fig. 28.3). Furthermore, a methyl group occurs at carbon atom 10 in the male
hormones and in progesterone. This 19-methyl group is missing in estradiol
because of the aromatic character of its first ring. The 19-methyl group
shields the first ring from above and makes a fairly large spatial demand in
progesterone and testosterone.
How is this small difference recognized by the receptor? As the crystal
structure of the estrogen receptor with bound estradiol shows, the hydroxyl
group on the aromatic A-ring is involved in a hydrogen-bonding network with
Glu353 and, via a water molecule, with Arg394 (Fig. 28.4). Glu353 is most
probably deprotonated and recognizes the hormone by the donor functionality of
its hydroxyl group. A glutamine is found in the same position in the structure
of the progesterone receptor (Fig. 28.5). It forms a hydrogen bond with the
carbonyl group in the A-ring of progesterone through the amino group of its

[Structural formulas of Fig. 28.3: 28.1 Estradiol; 28.3 Testosterone (C10 marked)]

Fig. 28.3 The difference between the female hormone estradiol 28.1 and the male hormone
testosterone 28.3 consists of a change from a hydroxyl group on the aromatic ring of estradiol to
a carbonyl group in a partially hydrogenated ring of testosterone. The aromatic A-ring of estradiol
takes on a planar structure, whereas the A-ring in testosterone forms a half-chair. A methyl group
occurs on carbon C10 in the male hormone that gives the molecule additional volume.

[Figure labels of Fig. 28.4: Leu384, Leu387, H2O, Glu353, His524, Arg394]

Fig. 28.4 Portion of the crystal structure of the estrogen receptor with bound estradiol (surface is
green, interior is blue). The hydroxyl group of the aromatic A-ring forms an H-bond to Glu353 and
a water-mediated bond to Arg394. The volume above the planar, aromatic A-ring is limited by
Leu384 and Leu387.

[Figure labels of Fig. 28.5: Met756, Met759, Gln725, H2O, Thr894, Arg766]

Fig. 28.5 Portion of the crystal structure of the progesterone receptor with bound progesterone
(surface is green, interior is blue). The carbonyl group on the partially hydrogenated A-ring
accepts an H-bond from Gln725 and binds to Arg766 through a water molecule. Because of the
19-methyl group above the A-ring, the steroid occupies a larger volume that is limited by Met756
and Met759, which have more flexible and therefore better adaptive side chains.

terminal carboxamide. The water-mediated H-bond to an arginine is also found
in this receptor. Therefore, the exchange from glutamate to glutamine in the receptor
matches the change from a hydroxyl group to a carbonyl function in the hormone. H-bond
donors and acceptors are exchanged in pairs! How is the additional volume
demand of the 19-methyl group recognized by the receptor? It is hydrophobic
amino acids in the central area that form the contact surface to the bound
hormone. In the estrogen receptor, two bulky, terminally branched leucine
residues shield the space above the A-ring. They efficiently limit the volume
of the binding pocket with their rigid geometry (Fig. 28.4). Two methionine
residues are found in the same position in the progesterone receptor (Fig. 28.5).
They are also bulky and hydrophobic. Because of their linear construction,
however, they can adapt themselves to the shape of the bound ligand and they
allow a small volume for the 19-methyl group. The structure of the androgen
receptor with testosterone is also available. The same amino acids as in the
progesterone receptor are present for the recognition of the carbonyl function on the
A-ring and are found in the direct vicinity of the 19-methyl group. Each receptor
achieves the required selectivity for its specific ligand through these small but
distinct changes. For example, the difference in the side chain at C17 helps to
discriminate between progesterone and testosterone.

28.4 Helix Open, Helix Closed: How Agonists and Antagonists Are Differentiated

The ligand-binding domains of nuclear receptors are constructed from 12 helices
(Fig. 28.6). The 12th and last helix in the sequence, which is also called the AF-2
helix (activation function 2), closes up the entrance to the ligand-binding pocket
like a terminal gate. For a ligand to gain access, helix 12 must undergo a spatial
rearrangement. If an agonist is bound, the gate closes itself again. At the same
time, this rearrangement opens the recognition site for the coactivator. This
coactivator interacts with the available surface segment through a Leu-x-x-Leu-
Leu (or LxxLL motif, whereby x stands for an arbitrary amino acid) binding
motif, which is a segment of an amphiphilic helix (Fig. 28.8). Antagonist
binding suppresses the rearrangement of helix 12, which can then no longer close the
entrance area and instead blocks the recognition site for the LxxLL motif of the
coactivator. Signal transduction does not occur, the receptor does not migrate
into the cell nucleus, and binding to the DNA response element also does not
occur. At the molecular level, the difference between agonists and antagonists has
been most thoroughly investigated for the estrogen receptor. If the natural agonist
estradiol 28.1 or a synthetic replacement such as diethylstilbestrol 28.4 binds
(Fig. 28.7), helix 12 lies in its active position and allows the coactivator to bind to
the peptide-recognition motif. Asp351 takes on an important role in the stabili-
zation of this helix position. It is located in the middle of the rather long helix 3,
and it is found exactly opposite from the N-terminal end of helix 12. The three
NH groups that protrude over the edge of the helix end are 3–4 Å away from the
carboxylate group of this acidic residue. Positions that lie opposite such a helix
end are predestined for the stabilization of negative charges. This is caused by
a strong dipole moment that is formed along the helix’s axis. Antagonists such as
raloxifen 28.5 or 4-hydroxy-tamoxifen 28.6 have side chains that remain in the
entrance channel after ligand binding, and in doing so, prevent the closure by
helix 12. The antagonists carry a basic group on the end of this side chain. It is
most probably in a positively charged state and forms a salt bridge to Asp351.
In this way, the antagonists manage to compensate for the negative charge of the
acidic amino acid.
The orientation of helix 12 in the active position during agonist binding is an
important prerequisite for the availability of the recognition site for the LxxLL
motif of the coactivator on the receptor surface. Cocrystallization of an 11-membered
LxxLL peptide with the estradiol-bound receptor has been accomplished (Fig. 28.8).
The peptide takes on a helical geometry and orients the decisive leucine
residues in a hydrophobic groove on the receptor surface. Three amino acids
from helix 12 help to form a part of this surface. Once again, it is a negatively
charged carboxylate group, here of Glu448 on helix 12, that is positioned on the
opposite side of the N-terminal end of the helical segment from the LxxLL
peptide. The electrostatic interaction stabilizes the intermolecular contact
here too.
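
The Leu-x-x-Leu-Leu recognition motif described above is simple enough to locate in a sequence by pattern matching. The following Python sketch scans a one-letter amino acid sequence for LxxLL segments. The peptide used in the example and the function name are hypothetical and serve only as an illustration; a sequence match says nothing about whether the segment is helical or accessible for binding.

```python
import re

# L, two arbitrary residues, then two leucines. The lookahead makes sure
# that overlapping occurrences are reported as well.
LXXLL = re.compile(r"(?=(L..LL))")


def find_lxxll(sequence: str) -> list[tuple[int, str]]:
    """Return (zero-based start index, matched segment) for each LxxLL motif."""
    return [(m.start(), m.group(1)) for m in LXXLL.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical 13-residue peptide, used for illustration only.
    for start, segment in find_lxxll("KHKILHRLLQDSS"):
        print(start, segment)
```

Such a scan can only serve as a first filter on sequence level. Whether a candidate segment actually forms the amphiphilic helix that fits into the hydrophobic groove of the receptor has to be established by structural data.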

[Panels a–d of Fig. 28.6; Asp351 is labeled in the upper panels]

Fig. 28.6 The ligand-binding domain of the nuclear receptors is constructed from 12 helices.
Upon binding an agonist such as estradiol 28.1, the 12th and last helix (blue) closes like a gate over
the entrance to the ligand-binding pocket (a, c). Asp351 orients on the tip of the helix and stabilizes
it in the active position. At the same time, the recognition site is opened for the coactivator with the
helical LxxLL motif (violet) to bind to the receptor. Upon binding an antagonist such as raloxifen
28.5, helix 12 cannot close up the entrance channel (b, d). The terminal basic group of the
antagonists forms a hydrogen bond to Asp351.

Fig. 28.7 Estradiol 28.1 and diethylstilbestrol 28.4 are estrogen receptor agonists,
raloxifen 28.5 and 4-hydroxy-tamoxifen 28.6 are antagonists.
[Structural formulas: 28.1 Estradiol; 28.4 Diethylstilbestrol; 28.5 Raloxifen;
28.6 4-Hydroxy-Tamoxifen]

[Figure labels of Fig. 28.8: Helix 12, Glu448, LxxLL peptide]

Fig. 28.8 The recognition site for the LxxLL motif of the coactivator is revealed by the crystal
structure of the estradiol-bound receptor with an 11-membered peptide. The peptide
takes on a helical geometry and orients its three leucine residues in the hydrophobic groove on the
receptor surface. Three amino acids of helix 12 (blue) form a part of this surface. Glu448 on helix 12
binds to the LxxLL motif at the tip of the N-terminal end of the helix.

28.5 Agonists and Antagonists of Steroid Hormone Receptors

Steroid hormones are produced in endocrine adenocytes, for example, in the adrenal
glands, the testes, or the ovaries, and are released into the bloodstream. There they
circulate, often bound to a transport protein. Far from their site of
production, they reach the target cells for which the signal is meant. Because of their
lipophilic character, they can passively permeate through membranes. Once in the
cytosol, they bind to the corresponding steroid receptor. Five classes of steroid
receptors are differentiated: glucocorticoid, mineralocorticoid, androgen, estrogen,
and progesterone receptors. Two subtypes of estrogen receptor (α-ER and β-ER)
have been discovered that differ in the exchange of a leucine for a methionine, and
a methionine for an isoleucine in the vicinity of the binding site of the C- and D-rings
of the steroid scaffold. The binding affinity of steroid hormones to their receptors is
extremely high, typically 0.05–50 nM. As a result of the binding, the gene expression
described in the previous sections is initiated. The cellular response to these
processes occurs within hours to days. In addition to this control process, which
has direct gene expression as a goal, steroid hormones can also initiate fast regula-
tory processes in cells. For this, binding occurs to receptors on the cell exterior.
These receptors, which belong to the class of G protein-coupled receptors or to the
dimerizing receptors with a tyrosine kinase domain, are discussed in ▶ Chap. 29,
“Agonists and Antagonists of Membrane-Bound Receptors.”
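
To put the affinity range of 0.05–50 nM quoted above into energetic terms, the values can be converted into binding free energies with the standard relationship ΔG° = RT ln K_d. The following Python sketch does this under the assumption that the quoted affinities can be read as dissociation constants and that a temperature of 298.15 K applies; neither assumption is stated in the text, and the function name is chosen for illustration only.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol from a dissociation constant in mol/L.

    Assumes dG = R * T * ln(Kd) with a 1 mol/L standard state.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


if __name__ == "__main__":
    for kd_nm in (0.05, 50.0):
        dg = binding_free_energy_kj(kd_nm * 1e-9)
        print(f"Kd = {kd_nm} nM -> dG = {dg:.1f} kJ/mol")
```

Under these assumptions the range corresponds to roughly −59 to −42 kJ/mol. The three orders of magnitude between the two ends of the range translate into a difference of about 17 kJ/mol.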
As an example, the function of the estrogen receptor shall be examined in more
detail. Estrogen controls the menstrual cycle of women in childbearing years. In
addition to this function, estrogen reduces the risk of coronary heart disease and
supports the maintenance of bone density. After menopause, at an age of about
50 years, the ovaries stop producing estrogen so that women at this age are at an
increased risk of coronary heart disease and osteoporosis. Altogether the hormone
homeostasis of the organism must find a new equilibrium. Often this is accompa-
nied by unpleasant physical and psychological symptoms in menopause. Hormone
replacement therapy was proposed in the 1960s as a solution. The body is supplied
with estradiol 28.1 or an analogous receptor agonist. For example, diethylstilbestrol
28.4, which is related except that it lacks a steroid scaffold, was once used, but is no
longer prescribed because of an elevated risk of cancer.
The long-term use of hormone replacement therapy increases the risk of breast
cancer significantly. This devastating result was proven in a study in the USA in
which a million nurses took part. The relationship between ovarian function and the
development of breast cancer had already been described over a hundred years ago.
In 1936, Antoine Lacassagne speculated that the effect of estrogen antagonists could
lead to the prevention of breast cancer. The discovery of the first antagonists was
once again purely by accident. Compound 28.7 was synthesized at Merrell in the
USA in the late 1950s as part of a cardiovascular research program (Fig. 28.9).
Because of its chemical similarity to 28.8, a then-known synthetic estrogen surro-
gate, it was also examined in an estrogen-activity test. This effect was not seen, but
rather the opposite: antiestrogen activity. Clomiphene 28.9 was obtained by minor
structural modification. This compound was introduced to the market in the 1960s

[Structural formulas of Fig. 28.9: 28.6 Tamoxifen; 28.7; 28.8; 28.9 Clomiphene;
28.10 Nafoxidine; 28.11 Fulvestrant]

Fig. 28.9 Tamoxifen 28.6 was developed from compound 28.7, which originated in cardiovas-
cular research. The marketed product has a hydrogen atom in the 4-position, but the actual active
substance is the oxidation product, the 4-hydroxy derivative 28.6. Fulvestrant 28.11 does not show
the same resistance that has been observed with tamoxifen.

as an ovulation inducer to treat infertility in women. With this, the goal of introducing
a drug to prevent breast cancer was initially missed. The development of
nafoxidine 28.10 was also discontinued because of pronounced side effects.
In England, ICI had been pursuing a program for the development of non-steroidal
estrogen replacements for breast cancer therapy since 1940. Because the interest in
contraceptives was in the foreground in the 1970s, it must be seen as a stroke of luck
that tamoxifen 28.6 emerged from this program in 1973 and obtained approval for
the treatment of breast cancer. The compound quickly proved to be a breakthrough
in the treatment of breast cancer. Today it is estimated that the use of tamoxifen in
the industrialized countries has saved one million years of women’s lives each year.
It was only discovered in retrospect that tamoxifen is a prodrug. The actual
active substance is obtained by hydroxylation at the 4-position. It was ripe for
further development, from which raloxifen 28.5, among others, emerged
(Fig. 28.7). All derivatives with antagonistic effects carry a side chain with
a basic group. As explained in Sect. 28.4, this side chain blocks the refolding of
helix 12 into the active position. The example of raloxifen also shows how complex
effects on the total organism can be. Originally raloxifen was developed for breast
cancer therapy. However, this goal was abandoned in the late 1980s because the
compound displayed no advantages over tamoxifen. It proved, however, to be

[Structural formulas of Fig. 28.10: 28.12 Ethinylestradiol; 28.13 Mifepristone (RU486);
28.14 Cyproterone Acetate]

Fig. 28.10 The introduction of a 17α-ethinyl group leads to orally active steroids, for example,
ethinylestradiol 28.12. The progesterone receptor antagonist mifepristone 28.13 acts as an
antigestagen: a “morning after pill.” The antiandrogen cyproterone acetate 28.14 has gained
importance in the specific therapy of prostate cancer.

a potent drug for the treatment and prevention of osteoporosis. Moreover, it lowered
the risk of breast cancer. Raloxifen is considered to be a selective estrogen receptor
modulator (SERM). Compounds with such a profile are believed to have great
potential as a hormone-replacement therapy without increasing the risk of osteo-
porosis, coronary heart disease, or breast cancer.
Often the entire profile of a compound is only apparent after long-term use.
Tamoxifen afforded the unsettling result that 50% of breast tumors began to grow
again under long-term therapy. The development of resistance is explained by the
fact that the estrogen receptor is phosphorylated by protein kinase A. This does not
prevent tamoxifen from binding, but the antagonistic effect is reversed. Fulvestrant
28.11 seems to provide a solution to this problem because resistance has not yet
been observed (Fig. 28.9).
The progesterone receptor is closely related to the estrogen receptor. Whereas
estrogen 28.1 (follicle hormone) promotes and steers oocyte maturation in the
proliferation phase and indirectly initiates ovulation, progesterone 28.2 (corpus
luteum hormone) is formed in the secretory phase of the menstrual cycle. It controls
the cyclic changes in the uterus and uterine mucous membranes, decreases
fertility, and maintains an already-intact pregnancy. Gestagens (progesterone
receptor agonists) as well as estrogen derivatives were introduced as contraceptive
hormones (Fig. 28.10). In the 1950s, Carl Djerassi and Gregory Pincus had already
laid the foundations for oral contraceptives. Oral contraception is based on the timed
administration of a combination of an estrogen with a gestagen; this suppresses ovulation,
that is, the release of a mature egg cell, at mid-cycle. A progesterone antagonist,
mifepristone 28.13 (RU486), which, in analogy to the estrogen antagonists, carries
a nitrogen function on its side chain, was discovered at Roussel Uclaf during
a search for glucocorticoid receptor antagonists. Its use as a “morning after pill”
due to its antigestagen effects is highly controversial in many countries. For the
termination of an intact pregnancy, the single administration of 600 mg of mifep-
ristone, and then 36–48 h later a prostaglandin to induce uterine contractions, is
used. This combination leads to a termination of pregnancy in 96% of cases up until
the 7th week of pregnancy. Persistent bleeding can occur as a side effect, and in rare

cases, heart function disorders. The opponents to this substance can be consoled
that, for these reasons alone, it is not appropriate for widespread use.
The male hormone testosterone 28.3 acts as an agonist on the androgen receptor.
It is responsible for the development of secondary male characteristics, intervenes
in the process of spermatogenesis, and regulates protein synthesis. The
enlargement of skeletal muscle cells by androgens has led to their use as
anabolic hormones to improve performance in competitive athletics and bodybuilding,
and in livestock breeding. Antiandrogens such as cyproterone acetate 28.14 are
suitable for the treatment of prostate cancer.
In addition to the sexual hormones, there are even more active substances from
the class of steroids. In addition to the cardiac glycosides, which occur in plants, the
adrenal corticosteroids or corticoids are of great importance. If the adrenal glands
fail, the absence of these substances can lead to death, or in the case of an adrenal
under- or overfunction, severe illness can occur. The corticoids are divided into
glucocorticoids and mineralocorticoids according to the nuclear receptor to which they
bind. The basic scaffold is very closely related to progesterone 28.2, though they
carry more functional groups (28.15–28.17; Fig. 28.11). The natural agonists of
the two receptors are cortisol 28.16 and aldosterone 28.17, respectively. The therapeutic importance
of glucocorticoids was underestimated in the beginning. It was only after specific
drugs without mineralocorticoid side effects, such as dexamethasone 28.18
and betamethasone 28.19, were available that broad therapeutic application was
possible. Glucocorticoids influence metabolism, intervene in water and electrolyte
homeostasis, and influence the cardiovascular and nervous system. They are
anti-inflammatory, immunosuppressive, and antiallergic. Highly active variants are used in
emergency cases of anaphylactic shock or sepsis. They also have severe side effects.
Their use requires that strict attention is paid to the indication and dosage.
The mineralocorticoids influence the water and electrolyte homeostasis. They
increase the resorption of sodium ions in the kidney, and increase the excretion of
potassium. Ligands for the mineralocorticoid receptor can be used as diuretics.
A potassium-sparing diuresis can be achieved with the structurally related
spironolactone 28.20, which competitively displaces aldosterone from its receptor.
The selective antagonist eplerenone 28.21 is used for the
treatment of hypertension and congestive heart failure.

28.6 Ligands of PPAR Receptors

Of the group of RXR heterodimer receptors, the peroxisomal proliferator-activated
receptors PPAR have achieved great importance as a drug target. Multiple subtypes
are distinguished including PPARα, PPARβ/δ, and PPARγ, whereby three isoforms
are described for the γ-type. Their natural ligands are metabolic products derived
from fatty acids, prostaglandins, leukotrienes, cholesterol, and bile acid. These
receptors serve as sensors to control the biosynthesis and metabolism in lipid
homeostasis. They are also involved in the release of cytokines such as TNF-α
and other mediators from adipocytes. PPARα occurs predominantly in the liver.

[Structural formulas of Fig. 28.11: 28.15 Corticosterone, R = H; 28.16 Cortisol, R = OH;
28.17 Aldosterone; 28.18 Dexamethasone, R = α-CH3; 28.19 Betamethasone, R = β-CH3;
28.20 Spironolactone; 28.21 Eplerenone]

Fig. 28.11 Corticosterone 28.15 and cortisol (hydrocortisone) 28.16 are glucocorticosteroids.
They regulate the release of glucose, both by stimulating gluconeogenesis, and by inhibiting its
metabolic degradation. A stress-induced release of cortisol leads to rapid release of glucose as an
energy source. The mineralocorticoid aldosterone 28.17 is responsible for the regulation of the
water and electrolyte homeostasis. The naturally occurring glucocorticoids act in an anti-
inflammatory manner, but they have mineralocorticoid side effects. Dexamethasone 28.18 and
betamethasone 28.19 are “pure” glucocorticoids. They have 30-times stronger anti-inflammatory
activity and the mineralocorticoid side effects of cortisol are absent. The diuretic spironolactone
28.20 achieves its effect by a competitive displacement of aldosterone from its receptor.
Eplerenone 28.21 is a mineralocorticoid receptor antagonist and is used for the therapy of
hypertension and congestive heart failure.

Its activation increases the fatty acid degradation in this organ. Artificial ligands
of this receptor type are lipid-lowering compounds from the group of fibrates
28.22–28.26 (Fig. 28.12). Crystal structures with the bound agonist 28.27 and
antagonist 28.28 were determined with the PPARα receptor (Fig. 28.13). As in the
case of the estrogen receptor, it is again helix 12 that orients over the entrance gate
of the ligand upon agonist binding. The terminal acid group of the agonist forms
a hydrogen bond to Tyr464 and stabilizes helix 12 in the active position. Antagonist
28.28 is elongated by a propionamide group. It blocks the refolding of helix 12 into
the active position. In the unfolded geometry, helix 12 is accommodated in another region of
the receptor surface.

Fig. 28.12 Activation of the peroxisomal proliferator-activated receptor PPARα increases
fat metabolism. Ligands such as the fibrates 28.22–28.26 activate this receptor and act
as lipid-lowering agents.
[Structural formulas: 28.22 Clofibric Acid; 28.23 Etofibrate; 28.24 Etofyllin clofibrate;
28.25 Fenofibrate; 28.26 Bezafibrate]

Because of a reduced concentration of fatty acids and a diminished release of
mediators that inhibit insulin release, PPARγ-receptor agonists can induce an
increase in glucose metabolism. In this way, they act against insulin resistance,
which is held responsible for the massive increase in type-2 diabetes. A variety of
thiazolidinedione derivatives were developed as PPARγ agonists. The starting
point was the lipid-lowering compound clofibrate 28.22, a PPARα agonist
(Fig. 28.14). In 1979, ciglitazone 28.31, the first insulin sensitizer, was developed
at Takeda via compounds 28.29 and 28.30. GlaxoSmithKline followed with
rosiglitazone 28.33 and Takeda introduced another compound, pioglitazone 28.32
to the market a short time later. Both are highly selective for PPARγ. They are
administered as the racemate because the stereocenter isomerizes in the organism.

Fig. 28.13 Superimposed crystal structures of the PPARα receptor with bound agonists
(green) and antagonists (brown). The terminal acid group of agonist 28.27 forms a
hydrogen bond to Tyr464 and stabilizes helix 12 (turquoise) in the active position. In
doing so, access is granted to the recognition site for the LxxLL peptide motif of the
coactivator (violet). Glu462 stabilizes the helical segment of this peptide strand. The
antagonist 28.28 is elongated by a propionic acid amide group. It blocks the refolding of
helix 12 (beige) into the active position. It remains unfolded and adopts a different
orientation.
[Structural formulas: 28.27; 28.28. Figure labels: LxxLL, Glu462, Tyr464]

Fig. 28.14 The insulin sensitizer ciglitazone 28.31 was developed at Takeda as a PPARγ ligand
through compounds 28.29 and 28.30, which themselves were developed from the lipid-lowering
drug clofibrate 28.22, a PPARα agonist. Rosiglitazone 28.33 and pioglitazone 28.32 also act on
PPARγ. Structures shown: 28.22 clofibric acid, 28.29, 28.30, 28.31 ciglitazone,
28.32 pioglitazone, 28.33 rosiglitazone.

The crystal structure with the bound ligand showed, however, that the S enantiomer
is bound by the receptor. It was also possible to cocrystallize a peptide with the
LxxLL recognition motif with this structure. Once again, helix 12 makes the
binding pocket available in the active position and stabilizes the helical segment
of the LxxLL recognition peptide by positioning Glu471.
Recently, the insulin sensitizers of the glitazone type have come under discussion because
of unexpected side effects. Rosiglitazone was withdrawn from the European
market in 2010 because of the risk of heart failure. For pioglitazone this risk is unknown,
however, a risk of bladder cancer has been indicated, and this compound was therefore
also withdrawn from the market in some countries in 2011. Nevertheless,
PPARs also represent a possible target structure for cancer therapy. Prostacyclin
(▶ Sect. 27.9) is the natural ligand for a receptor that was initially termed
PPARδ, but later proved to be closely related to PPARβ. Its expression is regulated
by a variety of oncogenic signaling pathways. The receptor is often overexpressed
in tumor cells. Therefore, antagonists of this receptor could represent a new concept
for the development of antitumor drugs.

28.7 Ligands of Nuclear Receptors Stimulate Metabolism

Cytochrome P450s were discussed as metabolic enzymes in ▶ Sect. 27.6. They
carry out the first oxidative attack on xenobiotics. They attach polar
groups to lipophilic compounds to prepare the substances for renal elimination.
Drugs can also induce their own metabolism or that of other xenobiotics. This
property is based on the increased expression of cytochrome P450 enzymes in liver
and gastrointestinal cells. This process is mediated by the nuclear receptors PXR
(pregnane-X receptor) and CAR (constitutive androstane receptor). When activated
by the binding of such an inducing drug, the nuclear receptors bind to a
xenobiotic-response element in the promoters for particular cytochromes and induce their
transcription and expression. The increased biosynthesis of cytochromes leads to
an elevated metabolic activity. This property of particular drugs must be taken into
account when they are prescribed, above all with regard to possible drug–drug interactions with
other simultaneously administered compounds (▶ Sect. 27.7).
The pregnane-X receptor can be activated by ligands of entirely different sizes
(Fig. 28.15). Small molecules such as phenobarbital 28.34 or the cholesterol-lowering
compound SR12813 28.35 can activate the receptor. Paclitaxel 28.36, the macrolide
rifampicin 28.37, and the natural compound hyperforin 28.38 from St. John’s wort,
which are all much larger, are accommodated in the same binding pocket. Molecules
with different volumes are obviously able to activate PXR. We were introduced
to the steroid receptors as highly selective proteins that can distinguish
between the sole exchange of an OH function for a carbonyl group, and the absence
or presence of a methyl group. The architecture of these receptors
allows the high selectivity. The pregnane-X receptor belongs to the same family
and adopts an analogous folding. Nevertheless, Nature incorporated small modifications
in the spatial geometry and secondary structural elements that allow this
fold architecture to make the transition from a highly selective to a promiscuous type of
receptor that barely differentiates between “large” and “small” (Fig. 28.16).
Forty-five amino acids were added between helices 1 and 3, and helix 2 has been
replaced by a multistranded pleated sheet. This structural element is significantly
enlarged compared to the estrogen receptor. Moreover, helix 6 is unfolded
and exists as a long loop. As a consequence of these modifications in the architecture
of the general folding pattern of nuclear receptors, the ligand-binding
pocket, which is in the center of the protein, is equipped with pronounced

Fig. 28.15 Upon binding an activator, the pregnane-X receptor induces the expression of
cytochrome P450s from the CYP 3A family, which metabolize numerous drugs. Small ligands
such as phenobarbital 28.34 and the cholesterol-lowering SR12813 28.35 as well as large natural
products such as paclitaxel 28.36, hyperforin 28.38, or the macrolide rifampicin 28.37 activate
PXR. The insulin sensitizer troglitazone 28.39 was withdrawn from the market because of its
activity on the PX receptor. Small changes such as the exchange of a phenyl group for
a tert-butoxy group on paclitaxel, which gives docetaxel 28.40, can be enough to suppress this
activating property. Structures shown: 28.34 phenobarbital, 28.35 SR12813, 28.36 paclitaxel
(R = phenyl), 28.37 rifampicin, 28.38 hyperforin, 28.39 troglitazone, 28.40 docetaxel
(R = tert-butoxy).

adaptive properties. In this way, PXR is able to bind structurally diverse
xenobiotics as agonists. CAR is related to PXR and is also activated by
a large range of chemically diverse ligands, such as barbiturates, chlorpromazine,
acetaminophen, or the heme metabolite bilirubin. Activation possibly also occurs
without direct ligand binding by promoting translocation from the cytosol into the
nucleus. Crystal structures of CAR suggest that the central binding pocket is
about half the size of that in PXR; the activating ligands therefore also seem to be
smaller for this receptor.

Fig. 28.16 Schematic representation of the polypeptide chain in the estrogen receptor (a) and the
pregnane-X receptor (b–d). An insertion of 45 amino acids occurs in PXR that renders the lower
right structural portion extremely adaptive. Because of this, the receptor can bind ligands of very
different size: (b) crystal structure with bound SR12813 28.35, (c) crystal structure with bound
hyperforin 28.38, (d) crystal structure with bound rifampicin 28.37. For comparison, the estrogen
receptor bound to estradiol is shown (a).

What are the consequences of the observation that activated PXR induces
cytochrome P450 expression? Above all, CYP 3A proteins,
which preferentially control the metabolism of drugs, are produced at an accelerated
rate. In the last section, the glitazones were discussed as insulin sensitizers.
Troglitazone 28.39, another potent drug for the treatment of diabetes, was withdrawn
from the market because it activated PXR. As a consequence, this compound
increased the production of CYP 3A4, which metabolizes troglitazone to a potentially
toxic quinone, which, in turn, can lead to liver damage. The same scenario has not
been observed with rosiglitazone 28.33 or pioglitazone 28.32. Even paclitaxel activates
PXR, so that this chemotherapeutic is eliminated from the body at an accelerated
rate by the additionally produced CYP 3A4. Replacement of the terminal
phenyl group with a tert-butoxy group leads to docetaxel 28.40, which can no
longer activate PXR. Today, attempts are made early during drug development to exclude
potential PXR activation that could lead to increased CYP 3A4 metabolism.
The use of a natural substance such as hyperforin 28.38 from St. John’s wort, which
is commonly used to treat mild depression, must be considered carefully because this
substance is a potent PXR activator. It leads to increased metabolism of other
drugs such as hormonal contraceptives, HIV protease inhibitors,
statins, or coumarin-like anticoagulants. This can severely reduce the therapeutic
success of these other compounds.

28.8 Synopsis

• The nuclear receptors are a family of 48 members that are present as soluble
proteins in the cytosol. They are transcription factors and play an important role
in gene regulation. They form either homo- or heterodimers and are activated by
small molecules such as steroid hormones, retinoic acid, fatty acids, triiodothy-
ronine, vitamin D, prostaglandins, bile acids, or phospholipids.
• Nuclear receptors exhibit a ligand-binding and a DNA-binding domain; however,
ligand-independent domains are also involved in the activation of transcription. The
activated receptor, stimulated through agonist binding, migrates to the cell
nucleus and recruits co-activators, co-repressors, and transcription factors to
regulate gene expression.
• The ligand-binding domains can exhibit impressive selectivity in the recognition
of their ligands. Steroid receptors can distinguish the tiny structural differences
between male and female hormones in terms of H-bond donor/acceptor func-
tional group exchanges and presence or absence of the 19-methyl group.
• Agonist and antagonist binding induce a different orientation of helix 12, which
closes the entrance to the ligand-binding site. Antagonist binding hampers
reorientation of this helix across the entrance gate and simultaneously blocks
the recognition site for the binding of a helical LxxLL motif found on the surface
of the coactivator.
• Agonists and antagonists of the steroid receptors are important drugs: they interfere
with the menstrual cycle as contraceptives, are used in anticancer therapy, show
anti-inflammatory, immunosuppressive, or antiallergic activity via the glucocorticoid
receptor, or act as diuretics or antihypertensive agents via the mineralocorticoid
receptor.
• PPAR receptors occur as several subtypes and form heterodimers with the retinoid X
receptor upon activation. Agonists of the PPARγ receptor can induce an increase
in glucose metabolism. They are used as insulin sensitizers in diabetes therapy.
• The transcription and expression of cytochrome P450 enzymes involved in
the metabolism of xenobiotics can be regulated by the nuclear receptors
PXR and CAR. These receptors can be activated by a variety of structurally
rather diverse xenobiotics, which leads to increased biosynthesis of cytochromes
and therefore elevated metabolic activity.
• In contrast to the stereochemically highly specific nuclear steroid receptors, the
pregnane-X receptor shows pronounced promiscuous binding of structurally
highly diverse activators. Binding of small to large ligands is accomplished by
an additional highly adaptive structural element in this receptor.

29 Agonists and Antagonists of Membrane-Bound Receptors

Messenger molecules assume the task of conveying and transmitting information
between cells. These molecules can be as small as single ions, but can also
range in size from signaling peptides all the way to proteins. They bind to
a membrane-bound receptor on the extracellular side to transmit signals. There
are hardly any alternative pathways for these messenger molecules because substances
such as dopamine, histamine, or adrenaline, and also peptides and proteins
such as insulin, interleukins, angiotensin, endothelin, or neurokinin cannot cross the
cell membrane. Ligand-binding signals are transmitted to the interior of the cell by
a transition of the conformational state of the receptors. In the case of activation, the
bound ligand stabilizes the active receptor conformation. In the case of inhibition, the
ligand binds to the receptor from the outside and either leaves the conformational
equilibrium unchanged or stabilizes the inactive conformation. Signal transmission does not
occur. Both approaches can be beneficial for drug therapy. In the first case the ligands are
called agonists, and in the second case antagonists or inverse agonists.
G protein–coupled receptors (GPCRs), which span the membrane
with seven helices, encompass a huge group of membrane-bound receptors. Agonists
stimulate an activation of the coupled G protein in GPCRs, which initiates
subsequent processes in the cell. The second class is made up of receptors that also
penetrate the cell membrane with a helical segment. Dimerization is a prerequisite
for their activation. The attached cytosolic tyrosine kinase domains in the interior
of the cell begin to mutually phosphorylate one another. This transforms them into
a state in which the functions of other proteins are turned on by phosphorylation.
Another group of oligomeric membrane-bound receptors binds interleukins as
messenger molecules. They also initiate kinase-dependent intracellular signaling
pathways as a result of ligand binding. About a third of our pharmaceuticals act on
GPCRs in a regulatory fashion. The picture is much less clear for the second class of
membrane-bound receptors. These receptors are all regulated by large ligands,
making the development of a competitive, small xenobiotic extremely difficult
(▶ Sect. 10.6).

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_29, © Springer-Verlag Berlin Heidelberg 2013

29.1 The Family of G Protein–Coupled Receptors

The G protein–coupled receptor (GPCR) family represents the largest group of
integral membrane proteins in our genome. About 800 members of this family have
been found. They mediate the information flow through the membrane and react to
very different extracellular signals. They can be activated by light, protons, individual
ions, but also by small biogenic amines (neurotransmitters), hormones,
prostaglandins, signal peptides, all the way to proteins. Once the receptor is in the
active state, a cascade of intracellular processes is initiated. A conformational change
in the receptor represents the transition from the active to the inactive state. Both states
are in a thermodynamic equilibrium. The active or inactive state is stabilized by
the binding of an agonist or antagonist, respectively. Even without bound agonists,
the receptor has an intrinsic activity that an antagonist cannot turn off. Antagonists
block the binding of agonists and prevent their activating effect. Inverse agonists
are able to fully suppress the receptor function.
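
The equilibrium picture described above is often formalized as a two-state receptor model. The following Python sketch is one such formalization and is not taken from the text: the function name, the parameter `basal_ratio` (ratio of active to inactive receptor without ligand), the parameter `alpha` (how much more tightly the ligand binds the active state than the inactive state), and all numbers are assumptions chosen for illustration.

```python
def active_fraction(ligand_conc, kd_inactive, alpha, basal_ratio):
    """Fraction of receptors in the active state in a two-state model.

    basal_ratio : [active]/[inactive] in the absence of ligand
    kd_inactive : dissociation constant of the ligand for the inactive state
    alpha       : affinity ratio active/inactive state
                  (>1 agonist, =1 neutral antagonist, <1 inverse agonist)
    """
    if ligand_conc < 0 or kd_inactive <= 0 or alpha < 0 or basal_ratio < 0:
        raise ValueError("parameters must be non-negative (kd_inactive positive)")
    x = ligand_conc / kd_inactive
    active = basal_ratio * (1.0 + alpha * x)   # free plus ligand-bound active state
    inactive = 1.0 + x                         # free plus ligand-bound inactive state
    return active / (active + inactive)


# Hypothetical receptor with 10:1 preference for the inactive state.
basal = active_fraction(0.0, 1.0, 1.0, 0.1)
for name, alpha in [("agonist", 100.0),
                    ("neutral antagonist", 1.0),
                    ("partial inverse agonist", 0.5),
                    ("inverse agonist", 0.01)]:
    bound = active_fraction(1000.0, 1.0, alpha, 0.1)
    print(f"{name:24s} active fraction {bound:.3f} (basal {basal:.3f})")
```

In this sketch a neutral antagonist leaves the basal activity untouched, whereas an inverse agonist lowers it; a partial inverse agonist removes only part of the basal activity.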
Conformational changes in the receptor are transmitted to the intracellularly
bound heterotrimeric G protein, which, as a result, exchanges a bound GDP for
a GTP in its α-subunit. Then, the α-subunit dissociates from the activated
heterotrimer, and the complex spatially rearranges. As long as GTP is bound, the
G protein is in an active state. It returns to its inactive state by the slow hydrolysis
of the bound GTP to GDP. The separated subunits reunite to form the original
trimer. Multiple subtype families are known for each of the individual subunits.
They can be combined to form numerous different trimers.
The receptor remains in the activated state and repeatedly activates G proteins
on the interior of the cell as long as the activating ligand remains bound in
the receptor’s binding pocket. Signal amplification is achieved in this way. On
the other hand, the terminating hydrolysis of GTP to GDP regulates the speed of the
process and also contributes to the intensity and duration of the receptor signal.
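
The interplay between repeated activation and terminating hydrolysis can be sketched with a deliberately simplified two-state kinetic scheme. The sketch below is an assumption-laden illustration, not a model from the text: it treats the G protein as either GDP-bound (inactive) or GTP-bound (active), ignores subunit dissociation and reassociation, and uses hypothetical rate constants `k_exchange` (receptor-catalyzed nucleotide exchange while the agonist is bound) and `k_hydrolysis`.

```python
import math


def steady_state_active_fraction(k_exchange, k_hydrolysis):
    """Fraction of GTP-bound G protein once exchange and hydrolysis balance."""
    if k_exchange < 0 or k_hydrolysis <= 0:
        raise ValueError("k_exchange must be >= 0 and k_hydrolysis > 0")
    return k_exchange / (k_exchange + k_hydrolysis)


def active_fraction_over_time(t, k_exchange, k_hydrolysis):
    """Time course of the active fraction, starting from fully GDP-bound protein.

    Solves df/dt = k_exchange * (1 - f) - k_hydrolysis * f with f(0) = 0.
    """
    f_ss = steady_state_active_fraction(k_exchange, k_hydrolysis)
    return f_ss * (1.0 - math.exp(-(k_exchange + k_hydrolysis) * t))


# Hypothetical rate constants in 1/s: fast exchange, slow hydrolysis.
for t in (0.0, 1.0, 5.0):
    print(f"t = {t:3.1f} s  active fraction = "
          f"{active_fraction_over_time(t, 1.0, 0.25):.3f}")
```

In this simplified picture, slower hydrolysis raises the steady-state level of active G protein and lengthens the mean lifetime of the active state (1/k_hydrolysis), which corresponds to the statement that hydrolysis governs the intensity and duration of the signal.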
Quite different signaling pathways in the cell are initiated depending on whether
an activating, stimulating, or inhibitory G protein makes up the α-subunit. If a Gs or
Gq/11 protein represents the α-subunit, an effector protein is activated that releases
a second messenger in the cell. This is then responsible for the actual effect, for
instance, the activation of further proteins, especially kinases, or the regulation of
an ion channel. The best-known effector protein, adenylate cyclase, forms
adenosine-3′,5′-cyclophosphate (cAMP) from ATP. The newly generated cAMP
can then activate kinases such as protein kinase A or MAP kinases. It can also act on
other channels in a stimulating manner. Other second messengers are
guanosine-3′,5′-cyclophosphate (cGMP), inositol-1,4,5-trisphosphate (IP3), diacylglycerol,
arachidonic acid, or simply Ca2+ ions. Their formation can be partially initiated
by Gq/11 proteins, which in turn activate phospholipase C-β. Then other
messenger molecules are formed over multiple steps. There are also inhibitory
G proteins (Gi and Go) that exert a blocking effect on the enzymes that are
responsible for the synthesis of the second messenger. Another family of
G proteins (G12/13) activates Rho proteins that serve to regulate the actin–myosin
cytoskeleton. Muscle contraction is regulated by such a pathway.

The misregulation of GPCRs is associated with many disease patterns because
of their central role in mediating information during changes in the cell’s state and
function. Therefore GPCRs represent common targets for drugs. Altogether, five
classes of GPCRs are differentiated, among which class A is by far the largest and
most important. Numerous subtypes are known for the individual receptors. They
differ in their tissue distribution, ligand specificity, and also in the subsequent
signaling pathways that the different G proteins initiate. This fact explains why
influencing these receptors by drugs has been successful in entirely different
indications. Adrenaline and noradrenaline (▶ Sect. 1.4) act on the so-called adrenergic
receptors. In 1948, Raymond Ahlquist demonstrated that the different effects
of adrenaline on diverse organs can be attributed to two different types of these
receptors, α and β receptors. Later, the division into α1 and α2, and β1 and β2
receptors and even further subtypes was made. For example, depending on the
tissue, the β2-adrenergic receptor is associated with asthma, hypertension, or heart
attack. These differences helped very much in the development of specific drugs,
for example, β-agonists or β-antagonists (β-blockers, ▶ Sects. 8.5, 29.3). The
serotonin receptor displays an exceedingly complex spectrum of different subtypes;
these are also called 5-HT receptors according to the chemical structure of
serotonin (5-hydroxytryptamine; Table 29.1). Their misregulation is associated
with disease patterns such as migraine, pulmonary hypertension, depression,
schizophrenia, eating disorders, and nausea and vomiting.
Initially it was assumed that the GPCRs carry out their function as monomers.
This picture has evolved. Today it is known that the formation of homo- and
heterodimers represents an additional regulatory and controlling signal for the
differentiation of pharmacological cell response.
About 30% of all drugs on the market today exert their effects on GPCRs.
Therefore it is all the more desirable that more information about the spatial
geometry of these structures is available.

29.2 Rhodopsins Provide the First Models of G Protein–Coupled Receptors

The crystallographic structure determination of G protein–coupled receptors has
proved to be extremely difficult. These membrane-bound proteins have an extracellular
N terminus and a cytosolic C terminus, and as such are not easily transferred
from their natural habitat into a crystal lattice. Moreover, they have
pronounced loop areas on both sides of the membrane that bridge the individual
segments that traverse the membrane. These loops are crucial for the function and
also for the intact spatial architecture. Furthermore, the production of adequate
quantities of these receptors for structure determination represents a big problem.
In 1990 the publication of the structure of bacteriorhodopsin (Fig. 29.1a) gave
a first idea of how to construct models of GPCRs, which belong to the
group of seven-transmembrane receptors. With the help of high-resolution electron
microscopy (▶ Sect. 13.6), Richard Henderson determined the structure on

Table 29.1 Particularly many subtypes are known for the serotonin receptors, with therapeutic
possibilities to treat hypertension, migraine, schizophrenia, depression, anxiety, emesis, and
gastrointestinal motility disorders

Receptor | Gene | Type; therapeutic indication | Modulated enzyme
-------- | ---- | ---------------------------- | ----------------
5-HT1A | 5-ht1A | GPCR, Gi; CNS diseases such as anxiety and depression | Adenylate cyclase
5-HT1B | 5-ht1B | GPCR, Gi; neuronal inflammatory processes, migraine |
5-HT1D | 5-ht1Dα (h), 5-ht1Dβ ≙ 5-ht1B (R) | GPCR, Gi; neuronal inflammatory processes, migraine |
5-HT1E | 5-ht1E | GPCR, Gi; neuronal inflammatory processes, migraine |
5-HT1F | 5-ht1F | GPCR, Gi; neuronal inflammatory processes, migraine |
5-HT2A | 5-ht2A | GPCR, Gs; CNS disease, atypical antipsychotic, wound healing, arterial hypertension | Phospholipase C
5-HT2B | 5-ht2B | GPCR, Gs; CNS disease, atypical antipsychotic, wound healing, arterial hypertension |
5-HT2C | 5-ht2C | GPCR, Gs; CNS disease, atypical antipsychotic, wound healing, arterial hypertension |
5-HT3 | 5-ht3 | Ion channel; suppression of cytostatic-induced emesis | –
5-HT4 | 5-ht4 | GPCR, Gs; gastrointestinal tract, irritable bowel syndrome | Adenylate cyclase
5-HT5 | 5-ht5A, 5-ht5B | GPCR, ?; circadian rhythm | ?
5-HT6 | 5-ht6 | GPCR, Gs; involved in memory and learning | Adenylate cyclase
5-HT7 | 5-ht7 | GPCR, Gs; regulation of the day/night rhythm | Adenylate cyclase

HT, ht: 5-hydroxytryptamine (serotonin); R: rat; h: human; GPCR: G protein–coupled receptor;
Gs, Gi: stimulatory or inhibitory G protein

Fig. 29.1 Schematic representation of the spatial orientation of the transmembrane helices in
bacterial rhodopsin (a), bovine rhodopsin (b), and in the human β2-adrenergic receptor (c).

two-dimensional crystals. Bacteriorhodopsin is itself not a GPCR but rather a proton
pump that establishes a pH gradient across the membrane. Like all GPCRs, however, it
is constructed from seven transmembrane helices. Bacteriorhodopsin has negligible
sequence homology with human GPCRs. That notwithstanding, the structure was
intensively used in the early 1990s to model a large number of pharmacologically
relevant GPCRs. This difficult work stimulated the development of many technical
advances at the time to reliably construct multiple sequence alignments, model helix
properties, and consider a large amount of mutation data and binding profiles from
ligand series. One challenge was the exact determination of the beginning and end of
a sequence segment that corresponds to a transmembrane helix. An error of a single amino
acid in the assignment automatically leads to a rotation of about 100° around the helix axis
because on average, 3.6 amino acids contribute to each turn (▶ Sect. 14.2).
Moreover, the loop regions proved to be especially problematic. Often they were
left out of the models entirely or were modeled with knowledge-based methods based
on database information. Nevertheless, a proof of the relevance of these models by
structure determination of a real GPCR was still lacking in those days.
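
The effect of a register error on the orientation of a side chain follows directly from the 3.6 residues per turn quoted above. The short Python sketch below works through this arithmetic; the function name is hypothetical, and the rise of 1.5 Å per residue is the usual value for an ideal α-helix and is an assumption added here, not a number from the text.

```python
RESIDUES_PER_TURN = 3.6   # value given in the text
RISE_PER_RESIDUE = 1.5    # in Å; ideal α-helix value, assumed for illustration


def register_error(n_residues):
    """Rotation (degrees) and axial shift (Å) caused by misassigning n residues."""
    rotation = (n_residues * 360.0 / RESIDUES_PER_TURN) % 360.0
    shift = n_residues * RISE_PER_RESIDUE
    return rotation, shift


for n in (1, 2, 3, 4):
    rotation, shift = register_error(n)
    print(f"{n} residue(s): side chain rotated by about {rotation:.0f} degrees, "
          f"shifted by {shift:.1f} Å along the axis")
```

A side chain that should point into the binding pocket is therefore turned by roughly a quarter of the helix circumference after an error of only one position, which explains why the alignment of the helix boundaries was so critical for these models.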
It took another 10 years until the first high-resolution structure of a real GPCR
could be determined (Fig. 29.1b). Bovine rhodopsin represents a particularly favor-
able case. It occurs at a high concentration in the eye and is stabilized by covalently
bound 11-cis-retinal. Rhodopsin is a light-regulated receptor. The high-resolution
structure determination was accomplished in an inactive state, in which the receptor
is found in darkness. There, no signal would be transmitted to the cell. Years later,
the structure determination was achieved with the photoactivated receptor. Indica-
tions about the activation process and conformational changes in the receptor could
be deduced from this. In the inactive state, a network of charge-assisted hydrogen
bonds is formed between two glutamate and one arginine residues. The interactions,
termed an “ionic lock,” couple transmembrane helices 3 and 6 to each other. Upon
light activation, the network is torn apart, which leads to helices 3 and 6 shifting to
the cytosolic side. This shift creates a binding site for the heterotrimeric G protein in
the interior of the cell and is associated with its activation. Recently, these structural
rearrangements upon activation have been confirmed by a structure with a bound
peptide epitope of the G protein (cf. next section).
The structure of bovine rhodopsin served as a basis for many modeling
approaches with the knowledge that this receptor represented a special, light-
sensitive switch with a covalently bound ligand. Additionally, all modeling
attempts were based on the geometry of the receptor in its inactive state.

29.3 Structure of the Human β2-Adrenergic Receptor

The question of what the structure of a human GPCR actually looks like became all the
more exciting. In 2007 the answer arrived. Two structures of the human β2-adrenergic
GPCR were described under the auspices of Brian Kobilka at Stanford University
with the collaboration of groups at the Scripps Research Institute in La Jolla, California,
and the MRC in Cambridge, England. For this tremendous achievement Brian Kobilka

Fig. 29.2 Segment of the crystal structure of the human β2-adrenergic receptor with bound
carazolol 29.1, a partial inverse agonist. Labeled residues: Phe193, Val114, Asp113, Ser203,
Asn312, Phe290, Phe289.

was awarded the Nobel Prize in 2012 together with Robert Lefkowitz, who first
characterized and isolated the β-adrenergic receptors in the 1970s. The receptor’s
high flexibility and proteolytic instability represented the biggest problems in its
crystallization. The third intracellular loop proved to be particularly troublesome.
The researchers had to use a trick to overcome this. It was only after a specific
antibody was found that binds to this loop and stabilizes the receptor in its native
functional structure that the crystallization was successful. In a second strategy, the
critical loop was excised from the receptor and replaced by T4 lysozyme, a well-known
and easily crystallized protein. The newly formed fusion protein displayed
largely unchanged pharmacological properties. Both structures were crystallized
together with a potent partial inverse agonist, carazolol 29.1 (Figs. 29.2, 29.3). They
differ only slightly from each other in the transmembrane region. However, precisely
in this region the difference from the previously determined structure of bovine
rhodopsin is almost three times as large. This underscores the structural differences
to the rhodopsin receptor (Fig. 29.1). Because carazolol only blocks about 50% of
the receptor’s basal activity, it is referred to as a partial inverse agonist.
Carazolol binds to Asp113^3.32 and Asn312^7.39 (the superscript numbers indicate
the helix and the position at which the amino acid is found, respectively) with its
alkylamino and alcohol functions. From mutational studies it was known that the
exchange of Asp113 for an asparagine leads to the loss of antagonist binding.
It hampers the G-protein activation by agonists by four orders of magnitude.
The mutation of Asn312 to a non-polar amino acid such as alanine or phenylalanine
causes the receptor function to collapse, whereas the function is partially

Fig. 29.3 Ligands of the human β2-adrenergic receptor. Carazolol 29.1, pindolol 29.2,
propranolol 29.3, and betaxolol 29.4 are β-blockers, whereas isoprenaline 29.5 and
adrenaline 29.6 are receptor agonists.

preserved by replacement with an amino acid with a polar side chain (threonine or
glutamine). Carazolol’s heteroaromatic tricyclic moiety forms a hydrogen bond to
Ser203^5.42 with its NH group. This residue was also recognized as critical in
a mutagenesis study using catecholamine agonists. It was known from β-blockers
of the aryloxyaminopropanol class with nitrogen-containing heterocycles
such as pindolol 29.2 that exchanging this serine for another residue causes the
affinity of these compounds for the β-adrenergic receptor to drop substantially.
Carazolol is surrounded by numerous contacts to hydrophobic amino acids
(Val114^3.33, Phe290^6.52, Phe193^5.32). This explains why all β-blockers display an
aromatic moiety in this region (Fig. 29.3).
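
The superscript notation used above, which gives the helix and the position within that helix, is commonly known as Ballesteros–Weinstein numbering (a name not used in the text). Because positions within one helix are counted consecutively, two residues on the same helix must differ by the same amount in sequence number and in helix position. The Python sketch below checks this; the label format with a caret, the regular expression, and the function names are assumptions made for illustration.

```python
import re

# Label format assumed here: three-letter residue, sequence number,
# caret, helix number, dot, position within the helix (e.g. "Asp113^3.32").
_LABEL = re.compile(r"^([A-Z][a-z]{2})(\d+)\^(\d)\.(\d+)$")


def parse_label(label):
    """Split a residue label into (residue, sequence number, helix, position)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    residue, number, helix, position = match.groups()
    return residue, int(number), int(helix), int(position)


def same_helix_consistent(label_a, label_b):
    """True if both labels agree in numbering; None if they lie on different helices."""
    _, num_a, helix_a, pos_a = parse_label(label_a)
    _, num_b, helix_b, pos_b = parse_label(label_b)
    if helix_a != helix_b:
        return None
    return (num_b - num_a) == (pos_b - pos_a)


# Residues of the carazolol binding site mentioned in the text.
print(same_helix_consistent("Phe193^5.32", "Ser203^5.42"))
print(same_helix_consistent("Asp113^3.32", "Asn312^7.39"))
```

Such a check is a quick way to catch transcription errors in residue labels before they propagate into a sequence alignment or a homology model.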
Many β-blockers display poor selectivity for the subtypes of the β-adrenergic
receptors (β-AR). Nonetheless, such selectivity is exceedingly desirable because,
for example, the β1 receptor is found in the cardiac vasculature and the β2 subtype is
found in the bronchi. Efficient β1 receptor inhibition reduces the contractility and
frequency of the heart. At the same time though, bronchoconstriction by blocking
the β2 receptors is undesirable. Interestingly, all amino acids that surround
carazolol in the binding site of the β2 receptor are conserved in the β1 receptor. The
observed 94 exchanges between the β1 and β2 receptors are all in the loop regions.
Therefore it is assumed that the pharmacological differences that are exploited in
selective ligands such as betaxolol 29.4 are found in the entrance region to the
binding site and cause small changes in the helix packing.
The concept of cutting out unstable loops and exchanging them for T4 lysozyme
proved to be extremely successful. In the meantime, the research group of Ray
Stevens at the Scripps Institute in La Jolla, California has managed to elucidate
the structures of the adenosine A2A receptor, the dopamine D3 receptor, and the
CXCR4 chemokine receptor. All have the same overall architecture, but the shape
and position of the binding pockets differ significantly. The CXCR4 receptor is not
regulated by a small ligand, but rather by a protein. Structure determinations were
carried out with two antagonists, initially with the low-molecular-weight ligand IT1t,
then with the cyclic 15-residue peptide CVX15. In spite of the binding pocket of
this receptor being much larger, it was shown that the receptor can be controlled by
peptidic macromolecules as well as small ligands. The CXCR4 receptor represents
a possible target structure in cancer therapy as well as for HIV infection. In the
latter case, it serves as a co-receptor that give the virus initial access to T cells.
It binds to the viral glycoprotein gp120 (▶ Sect. 31.4).
In the first structures of the β-adrenergic receptor using the T4-lysozyme-fused
receptor, the protein was in an inactive state. If the partial inverse agonist carazolol
is compared with structurally smaller agonists such as isoprenaline 29.5, it is
obvious that the two hydroxyl groups of the catechol moiety form hydrogen
bonds with Ser204^5.43 and Ser207^5.46. Moreover, Asn293^6.55 and Tyr308^7.35 have
been described as being critical for agonist binding. These residues are too far apart
in the carazolol structure to efficiently interact with agonists such as isoprenaline.
This confirms the assumption that the receptor must undergo a conformational
change to successfully accommodate an agonist.
The structure of the β1 receptor with an antagonist, cyanopindolol 29.6
(Fig. 29.4), was resolved in the research group of Gebhard Schertler, initially in
Cambridge, England, and later at the Paul Scherrer Institute in Zurich, Switzerland.
The scientists used a thermostable receptor variant from turkey with six exchanged
amino acids. Its overall structure is not different from the β2 receptor, but the
stability of the inactive form is increased. Overall, the β1 receptor has less intrinsic
(or basal) activity than the β2 receptor. This increased intrinsic activity of the β2-AR
is physiologically important. The T164I mutant of β2-AR, which occurs as a human
polymorphism and displays reduced intrinsic activity compared to β1-AR, is
associated with heart disease.
In addition to the cyanopindolol antagonist, a crystal structure was also
determined for a complex with the full agonist isoprenaline 29.5. As expected, binding of
this structurally small agonist leads to a contraction of the binding pocket; helices
H5 and H7 move toward one another (Fig. 29.4). Agonists as well as antagonists use
their aminopropanol group to form two hydrogen bonds to Asn329 on H7. The
larger antagonist, cyanopindolol 29.6, uses its indole NH function for an interaction
with the side chain of Ser211 on H5. The catecholamine isoprenaline 29.5, on the

Fig. 29.4 Superposition of the crystal structures of the β1-adrenergic receptor with
the agonist isoprenaline 29.5 (red-brown) and the antagonist cyanopindolol 29.6
(green). The agonist forms H-bonds to Ser211 and Ser215, and the binding pocket
contracts because helices H5 and H7 lean in by about 1 Å. This helix movement leads
to activation of the receptor. (Structures drawn in the figure: cyanopindolol 29.6 and
isoprenaline 29.5; labeled residues: Asn329, Ser211, Ser215.)

other hand, employs its two aromatic OH groups for hydrogen bonds to Ser211 and
Ser215 to pull H5 to the agonist-bound conformation. The relative shift of these two
helices against one another seems to change their mutual interaction areas and
contribute to the transition from the inactive to the active state. The mentioned
polymorphism, which leads to a reduction in the intrinsic activity of β2-AR, also
exerts an influence on the contact between H5 and H7 in this receptor.
As biophysical investigations have demonstrated, the activation of the
β-adrenergic receptor occurs analogously to that of rhodopsin. In the case of the
light-dependent receptor, the activation is triggered by a cis–trans isomerization of
covalently bound retinal. The retinal-binding site is spatially located in the same region as
the ligand-binding site in other GPCRs. A comparison of the structures of a stabilized
Glu113Gln mutant of the active and inactive rhodopsin has allowed detailed glimpses
into the activation mechanism. After photoactivation, the receptor binds one
retinal molecule in the all-trans configuration (Fig. 29.5). At the same time, the
activated receptor was characterized in complex with an 11-residue peptide, which

[Figure panels: inactive rhodopsin with 11-cis retinal and active rhodopsin with all-trans
retinal; labeled: Trp265^6.48, Tyr301^7.48, Arg135^3.50, Glu134^3.49, Glu247^6.30, ion lock,
αG peptide]

Fig. 29.5 Comparison of the crystal structures of inactive (green) and active rhodopsin. The
photoactivation is triggered by retinal, when its cyclohexene ring shifts because of an
isomerization of the 11-double bond from a cis to a trans configuration. This movement is translated to
Trp265 and from there through a cascade of water-mediated H-bonds all the way to the “ionic
lock,” which is made up of Glu134, Arg135, and Glu247. The salt bridges are dissolved, and the
binding site for the binding epitope of the α-domain of the G protein is established.

corresponds to the interacting epitope from the α-subunit of the G protein. The
β-ionone ring of retinal is shifted by 4.3 Å in the direction of a gap between helices
H5 and H6 in activated rhodopsin. In doing so, Trp265^6.48 is also moved from its initial
position in the ground state. This transition requires a global restructuring of the
orientation of the helices, and a water-mediated interaction network between H6
and H7 is disrupted and rearranged. The cytosolic end of helix 6 moves by twisting
away from the center of the helix bundle H1–H4 and H7. The side chains from
Tyr223^5.58 and Tyr306^7.53 orient in the interior of the receptor, form new contacts to
the water network, and undergo interaction with the highly conserved E(DRY) motif
at the cytosolic end of helix 3. The salt bridges between the side chains of Glu134^3.49,
Arg135^3.50, and Glu247^6.30, which form the so-called ionic lock, open and allow
access to the binding site for the peptide epitope of the α-domain of the G protein.
Upon ligand binding on the extracellular side, pharmacologically relevant
GPCRs presumably undergo very similar spatial shifting of the helical ends at the
contact surface in the cell interior. The activation process is presumably a multistep
cascade, during which multiple conformational states are passed through.
Now that all the new structures have become available, it is interesting to compare
the initially constructed homology models of the β2-adrenergic receptor with the
later-determined crystal structures. Discrepancies occur between the helices of
rhodopsin and the human β2-adrenergic receptor that do not change the overall
topology, but have consequences for the immediate geometry. It is directly apparent
that the models correctly reflected the total topology of the receptor, and that many
of the observations from the mutation experiments were correctly interpreted. The
accuracy that is required for modeling the exact binding mode of a ligand in detail,
however, could not be achieved by the models. It turned out that the models all have
greater structural similarity to the rhodopsin template structure than to the actual
structure of the β2 receptor. This result naturally gives one pause.
It illustrates the difficulty of modeling complex and flexible proteins.
Time will tell whether the modeling of new structures of pharmacologically
relevant GPCRs will be simpler and more reliable, and whether the generated
models will achieve greater relevance and significance in lead-finding and lead-
optimization campaigns.

29.4 Tracing Selective Dopamine D1 Agonists

More effort has been invested in the search for and optimization of potent ligands for
G protein–coupled receptors than in any other area of drug research. Thousands of
examples could be introduced here, but only two cases shall be discussed as
representative examples. The first case deals with the development of selective
agonists for a receptor that recognizes a small neurotransmitter as its natural ligand:
the dopamine receptor. An example of a receptor that is controlled by peptide
ligands will be discussed in Sects. 29.5 and 29.6.
Dopamine 29.7 (Fig. 29.6) is an important neurotransmitter that carries out
multiple functions in the body. A reduction in the dopamine concentration is
observed in particular brain regions in patients that suffer from Parkinson’s disease;
this is caused by the destruction of dopamine-producing cells. The disease can be
treated by the administration of L-DOPA. This compound is actively transported

[Structure drawings: dopamine 29.7 and the scaffold of 29.8 and 29.9 with substituent R]

Compound   R        Ki (nM), D1   Ki (nM), D2
29.8       Phenyl   63            6300
29.9       H        10000         2500

Fig. 29.6 Dopamine 29.7 and the dopamine receptor ligands 29.8 and 29.9. Compound 29.8
binds selectively to the D1 receptor. A comparison of the binding affinities of 29.8 and 29.9 shows
that the introduction of a phenyl substituent is responsible for D1 selectivity.

Fig. 29.7 Comparison of two conformations of 29.8 with the phenyl substituent in the plane of
(left) or above the plane of (right) the seven-membered ring. The phenyl rings occupy different
regions of space. It can be assumed that only one of these two conformations is suitable for
binding to the receptor.

across the blood–brain barrier as an amino acid and is transformed in the brain to
biologically active dopamine (▶ Sect. 9.4).
In this section, work carried out at Abbott on the search for new dopamine
agonists that selectively bind to the D1 receptor shall be discussed. The goal of the
work was to find a compound that could be used to treat Parkinson’s disease that
lacked the known side effects of L-DOPA. The investigations, which were carried
out between 1988 and 1991, proved that the use of computer-aided methods, even
without knowledge of the 3D structure of the protein, can deliver decisive contri-
butions to the discovery of a new lead structure.
Initially an attempt was made to obtain data about the receptor-bound confor-
mation of D1 agonists to use the information later for the targeted selection of new
structures. The starting point for this work was another company’s compound:
SKF 38393 29.8 (Fig. 29.6). The simpler derivative 29.9, which lacks the phenyl
substituent found in 29.8, was first synthesized at Abbott. Compound 29.9 binds
more than a hundred times less potently to the D1 receptor than 29.8. Interestingly,
the affinity to the D2 receptor remained almost unchanged. This aroused the
suspicion that the phenyl group binds in an additional pocket in the D1 receptor
that is absent in the D2 receptor. Because it was known that the hydroxyl groups and
the amino groups are important for receptor binding, the question as to how the
phenyl ring is positioned relative to these functional groups was raised.
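
The comparison of 29.8 and 29.9 can be retraced numerically from the Ki values listed in Fig. 29.6. The following Python sketch is only an illustration. The function names are invented here, and the conversion of Ki into a free energy via ΔG = RT ln Ki assumes a temperature of 298.15 K and treats Ki as a dissociation constant relative to a 1 M standard state, neither of which is stated in the text.

```python
import math

# Ki values in nM, taken from Fig. 29.6.
KI_NM = {
    "29.8": {"D1": 63.0, "D2": 6300.0},     # R = phenyl (SKF 38393)
    "29.9": {"D1": 10000.0, "D2": 2500.0},  # R = H
}

GAS_CONSTANT = 8.314462618e-3  # kJ/(mol K)


def fold_change(ki_a, ki_b):
    """Ratio ki_a / ki_b; values above 1 mean compound a binds more weakly than b."""
    return ki_a / ki_b


def delta_g(ki_nm, temperature=298.15):
    """Binding free energy in kJ/mol from dG = RT ln(Ki), with Ki in mol/L.

    Assumptions made for this illustration: 298.15 K and a 1 M standard state.
    """
    return GAS_CONSTANT * temperature * math.log(ki_nm * 1e-9)


# Loss of D1 affinity when the phenyl group is removed (29.8 -> 29.9).
d1_loss = fold_change(KI_NM["29.9"]["D1"], KI_NM["29.8"]["D1"])
# Ratio of the D2 values for the same pair of compounds.
d2_change = fold_change(KI_NM["29.8"]["D2"], KI_NM["29.9"]["D2"])
# D1 selectivity of each compound, expressed as Ki(D2) / Ki(D1).
selectivity = {name: ki["D2"] / ki["D1"] for name, ki in KI_NM.items()}
# Free energy difference at D1 attributed to the phenyl group.
phenyl_gain = delta_g(KI_NM["29.8"]["D1"]) - delta_g(KI_NM["29.9"]["D1"])

print(f"D1: 29.9 binds {d1_loss:.0f}-fold more weakly than 29.8")
print(f"D2: the Ki values differ by a factor of {d2_change:.1f}")
print(f"D1 selectivity, Ki(D2)/Ki(D1): {selectivity}")
print(f"Phenyl contribution at D1: {phenyl_gain:.1f} kJ/mol")
```

Expressing selectivity as a ratio of Ki values keeps the two observations from the text apart: the large change at D1 and the small change at D2.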
A conformational analysis showed that 29.8 can basically adopt two different,
energetically favorable conformations. In one, the phenyl substituent lies
approximately in the plane of the bicyclic ring; in the other, it is significantly above the
seven-membered ring (Fig. 29.7). To decide which of the two conformations is
adopted in the receptor, pairs of compounds were synthesized, each with a phenyl
substituent either above or in the plane of the ring. The corresponding unsubstituted
derivatives were also prepared. In doing so, rigid compounds were chosen that
correspond to only one of the two conformations. It was shown that the compound
with the phenyl substituent in the plane of the neighboring seven-membered ring
displayed potent dopamine D1 receptor binding. Evidently this is the biologically
active conformation.
In parallel to this work, an unselective new dopamine agonist 29.10 (Fig. 29.8)
was identified that binds equally potently to the D1 and D2 receptors. The previ-
ously compiled criteria for potent D1 binding were then used to determine

[Structure drawings of the three scaffolds bearing substituent R are not reproduced]

Compound   R        Ki (nM), D1   Ki (nM), D2
29.10      H        1600          5000
29.11      Phenyl   63            >100000
29.12      H        16000         >100000
29.13      Phenyl   250           6300
29.14      Phenyl   2             1000

Fig. 29.8 The pharmacophore hypothesis developed at Abbott for D1-selective agonists led to the
synthesis of the high-affinity, selective compound 29.14 via compounds 29.10–29.13.

the position where a phenyl substituent should be attached, analogous to 29.8. The
molecular comparison produced the suggestion for 29.11 (Fig. 29.8). This com-
pound was extremely successful! The binding affinity corresponds roughly to that
of 29.8, but 29.11 is D1 selective. The synthesis of 29.11, however, was not entirely
simple. Therefore additional D1 agonists were sought.
The problem was addressed on the computer by a 3D database search. ALADDIN,
a program developed at Abbott for this purpose, was used. The 3D database
of all Abbott substances was searched for structures that could have dopaminergic
activity by using the known pharmacophore pattern of dopaminergic compounds.
The computer search produced, among others, compound 29.12. This compound
indeed binds to the dopamine receptor. Adding a phenyl group in the
correct position gave compound 29.13, for which a strong increase in the
binding affinity was observed. This lead structure, found in a 3D database search,
was systematically modified. The final result of the work was 29.14. Of all the
analogues that were known at the time, this compound represented the most
potently binding selective D1 agonist.
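
The idea behind a 3D database search with a pharmacophore pattern can be sketched in a few lines of Python. This is not the ALADDIN algorithm, whose details are not given here. It is a minimal illustration that assumes a query defined by distance ranges between feature points and a database holding one rigid conformer per entry. All feature names, distance ranges, coordinates, and entry names are hypothetical.

```python
import math

# Hypothetical query: allowed distance ranges (in Å) between pairs of feature points.
QUERY = {
    ("amine_N", "hydroxyl_O"): (5.0, 7.5),
    ("amine_N", "phenyl_centroid"): (4.0, 6.0),
    ("hydroxyl_O", "phenyl_centroid"): (7.0, 10.0),
}

# Hypothetical database: one stored conformer per entry, reduced to feature coordinates.
DATABASE = {
    "entry_A": {
        "amine_N": (0.0, 0.0, 0.0),
        "hydroxyl_O": (6.0, 0.0, 0.0),
        "phenyl_centroid": (0.0, 5.0, 0.0),
    },
    "entry_B": {  # lacks the phenyl feature entirely
        "amine_N": (0.0, 0.0, 0.0),
        "hydroxyl_O": (6.0, 0.0, 0.0),
    },
    "entry_C": {  # amine and hydroxyl are too close together
        "amine_N": (0.0, 0.0, 0.0),
        "hydroxyl_O": (3.0, 0.0, 0.0),
        "phenyl_centroid": (0.0, 5.0, 0.0),
    },
}


def matches(features, query):
    """True if every feature pair of the query is present and within its distance range."""
    for (first, second), (low, high) in query.items():
        if first not in features or second not in features:
            return False
        distance = math.dist(features[first], features[second])
        if not low <= distance <= high:
            return False
    return True


hits = [name for name, features in DATABASE.items() if matches(features, QUERY)]
print(hits)
```

A real search would also have to handle conformational flexibility and several candidate atoms per feature type. The sketch only shows why a purely geometric query can retrieve scaffolds that look chemically unrelated to the starting compound.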
Yvonne Martin, who was intimately involved in the above-described work at
Abbott, offered an explanation for the success of the project. She identified two
factors as decisive: on the one hand the rational, very systematic approach in which
appropriate synthetic model compounds were chosen to establish the
pharmacophore hypothesis, and on the other hand the very close cooperation
between computer-based considerations and synthetic chemistry.

29.5 Peptide-Binding Receptors: Development of Angiotensin II Antagonists

The importance of the renin–angiotensin–aldosterone system for the treatment of
hypertension was already emphasized in ▶ Sects. 24.4 and ▶ 25.4. The vasoactive
octapeptide angiotensin II, Asp–Arg–Val–Tyr–Ile–His–Pro–Phe, is formed from
angiotensinogen by the action of the enzymes renin and ACE. Blocking this system
at any step leads to a drop in blood pressure. First, renin secretion from
the kidney can be suppressed by inhibiting the β-adrenergic receptors. Second, renin
and ACE can be blocked by inhibitors, and finally the use of an angiotensin-II
antagonist thwarts the binding of angiotensin II to the AT1 receptor. ACE,
a relatively unspecific protease, cleaves other peptides such as bradykinin, enkephalin,
and substance P in addition to angiotensin I. These reactions are also suppressed by
ACE inhibitors (▶ Sect. 25.5), which have long been used therapeutically. As
a consequence, about 5–10% of all patients treated with ACE inhibitors develop a dry
cough as a burdensome side effect. This is a result of the inhibition of the metabolism of
bradykinin. An angiotensin-II antagonist leaves bradykinin levels unaltered. An
AT1 receptor antagonist intervenes at the terminal end of the cascade and thereby also
turns off the effects of angiotensin synthesized in the organism
by other proteases that are independent of renin–ACE.
In 1971 the octapeptide saralasin, Sar–Arg–Val–Tyr–Val–His–Pro–Ala
(Sar = sarcosine, or N-methylglycine), was identified as the first specific
angiotensin-II antagonist. This peptide acts to lower the blood pressure in patients
who have high renin levels, but it is not orally available. Moreover it has a short
half-life and other undesirable properties. Therefore saralasin was not suitable as
a drug. Attempts to find non-peptidic antagonists based on saralasin and other
peptides were futile.
Until the beginning of the 1980s, there was hardly any progress in this research
area. At that point at Takeda, efforts toward angiotensin-II antagonists were
abandoned in favor of an ACE project. In 1982 however, the decisive impulse for
further research was provided by the publication of two patents. The non-peptidic
antagonists S 8307 29.15 and S 8308 29.16 (Fig. 29.9) that were covered in these
patents were indeed only weakly active, but the first non-peptidic antagonists
caused a sensation.
Numerous companies investigated these new lead structures, Dupont included.
Because of the extensive basic research on the peptidic structures, broad knowledge
about the conformation of angiotensin II and many analogues was available. The
Takeda structure was compared with the assumed receptor-bound conformation of
angiotensin II. The structural superposition led to the conclusion that a modification
of the Takeda structure at the para position of the benzyl group should be the most
promising strategy to increase the affinity. The result of these considerations was
the synthesis of 29.17. The compound is 10 times more potent than S 8307 and
S 8308.
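
The potency gains along this series can be expressed as fold changes and pIC50 values. The following Python sketch uses the IC50 values listed in Fig. 29.9 and reads them as micromolar, which is an assumption that agrees with the 140 nM and 19 nM quoted in the text for 29.18 and 29.20. The function names and the choice of S-8307 as reference are made here for illustration only.

```python
import math

# IC50 values in µM as listed in Fig. 29.9 (units read as micromolar, see lead-in).
IC50_UM = {
    "29.15 (S-8307)": 40.0,
    "29.16 (S-8308)": 13.0,
    "29.17": 1.6,
    "29.18": 0.14,
    "29.19": 0.30,
    "29.20 (losartan)": 0.019,
}

REFERENCE = "29.15 (S-8307)"  # arbitrary choice of starting point for the comparison


def pic50(ic50_um):
    """Negative decadic logarithm of the IC50 expressed in mol/L."""
    return -math.log10(ic50_um * 1e-6)


def fold_vs_reference(name):
    """How many times lower the IC50 of a compound is than that of the reference."""
    return IC50_UM[REFERENCE] / IC50_UM[name]


for name, ic50 in IC50_UM.items():
    print(f"{name:18s} IC50 = {ic50:7.3f} µM   pIC50 = {pic50(ic50):.2f}   "
          f"{fold_vs_reference(name):7.0f}-fold vs. {REFERENCE}")
```

On the logarithmic pIC50 scale each tenfold gain in potency corresponds to one unit, which makes the stepwise progress of the optimization easier to compare than the raw concentrations.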
Further systematic variations at this position led to 29.18, which binds another
10 times more potently to the AT1 receptor (IC50 = 140 nM). The first compounds

[Structure drawings of the two scaffolds with substituents R and X are not reproduced]

Compound         X       R                                          IC50 (µM)
29.15 S-8307     -       Cl                                         40
29.16 S-8308     -       NO2                                        13
29.17            COOH    COOH                                       1.6
29.18            COOMe   drawn group containing amide NH and COOH   0.14
29.19            OH      drawn group containing COOH                0.30
29.20            OH      drawn group containing a tetrazole ring    0.019

Fig. 29.9 The most important intermediates in the development of the angiotensin-II receptor
antagonist losartan. The basic structure of the angiotensin-II antagonists 29.15 and 29.16, which
were published in a patent from Takeda, was retained. Variations of the substituent R were
guided by a superposition of the Takeda structure with a model of the receptor-bound
conformation of angiotensin II (Fig. 29.11). Compounds 29.19 and 29.20 are orally available
angiotensin-II receptor antagonists. Losartan 29.20 successfully completed clinical trials and has
been available for clinical use since 1994.

to be prepared from this substance class led to a dose-dependent reduction in blood
pressure in the rat, but they were not orally available. Biphenyl derivative 29.19
made the breakthrough to oral availability. The slightly poorer binding to the
receptor is unimportant in view of this important quality. The replacement of the
carboxyl group on the aromatic ring by a more lipophilic tetrazole isostere finally
led to DuP 753, 29.20 (losartan), which binds to the receptor with an IC50 of 19 nM, is orally
available, and has a very long half-life. Losartan passed all of the clinical trials
successfully, and has been marketed as Lozaar® since 1994. Losartan was therefore
the first angiotensin-II receptor antagonist to gain approval for the treatment of
hypertension. Only one year later, Novartis followed their colleagues at Dupont
with valsartan 29.21. In the meantime, an entire class of drugs, called sartans, has
received approval for therapy (Fig. 29.10). After multiple years of clinical use,
however, not all sartans have proven to be equally effective, for instance in the
treatment of congestive heart failure. A comparative study with more than 5,000
patients in Sweden demonstrated that candesartan gives better therapeutic results
than losartan. Ninety percent of patients treated with candesartan survived for
one year, and 61% survived for 5 years. In contrast, 83% of patients treated
with losartan survived for one year and 44% survived for 5 years. The higher
affinity of candesartan for the AT1 receptor compared to losartan is conspicuous,
as is its persistence at the site of action, which is 10–30 times
longer. Perhaps these parameters explain the greater efficacy of candesartan.

29.6 Do Peptidic Agonists and Small-Molecule Antagonists Bind at the Same Position of the AT2 Receptor?

The blood-pressure-increasing effect of angiotensin II is based on its binding to the
AT1 receptor, of which two isoforms have been described. As a result, arterial
vascular contraction is provoked. Aldosterone is released to control the electrolyte
homeostasis, the cardiac contractility is increased, and the glomerular filtration rate
of blood through the kidney is regulated. The AT2 receptor, which belongs to the
same family of GPCRs, is associated with other regulatory processes.
The octapeptide that binds as an agonist served as the reference compound for
the development of low-molecular-weight antagonists. As described above, the
design hypothesis intended an overlap of the C-terminal amino acids,
Ile–His–Pro–Phe, with the sartan scaffold (Fig. 29.11). Although this comparison
yielded a successful working hypothesis, it later turned out to be incorrect.
Mutation studies on the AT1 receptor showed that the amino acids that have an
influence on angiotensin II binding are all in three extracellular loops and on the
N-terminal sequence segment. In contrast, the mutations that change losartan
binding lie deep in the interior of the transmembrane portion of the receptor. The
binding areas for the peptide agonists and the low-molecular-weight
antagonists therefore cannot overlap. This result of divergent binding areas was
supported by a study on the frog Xenopus laevis. The AT1 receptor of the frog recognizes the
octapeptide with nanomolar affinity; losartan, on the other hand, binds in the
two-digit micromolar range. The highly potent antagonist on the human receptor
therefore fails on the frog receptor. This observation inspired the scientists to
incorporate the antagonist-binding site of the human receptor into the initially
non-responsive amphibian receptor in an experiment. The targeted mutation of
13 residues in the transmembrane region that are important for losartan binding
in the human receptor caused the frog receptor to suddenly bind antagonists with
high affinity! This elegant experiment underscores the spatial separation of binding

[Structure drawings: 29.20 Losartan, 29.21 Valsartan, 29.22 Eprosartan, 29.23 Irbesartan,
29.24 Telmisartan, 29.25 Candesartan prodrug]

Fig. 29.10 Losartan 29.20 was the first angiotensin-II receptor antagonist; valsartan 29.21
followed shortly thereafter. Eprosartan 29.22, irbesartan 29.23, telmisartan 29.24, and candesartan
29.25 are further representatives of the sartan class. Candesartan is a prodrug; the red-colored
portion is cleaved to release the actual active substance.

areas for peptide agonists and low-molecular-weight antagonists. The pragmatist
can still claim that sometimes incorrect design hypotheses nonetheless have their
value and can guide the development in the correct direction.

29.7 Lessons Taught by the Nose: We Smell with GPCRs

The wealth of nuances that our sense of smell can perceive is impressive. Almost
poetically we try to describe gradations in scent with words. Our sense of smell is
probably the biological system that most readily illustrates how the spatial
chemical structure of molecules determines their biological activity. With each
breath, volatile molecules are pulled into our noses and brush against the olfactory
receptors. There they leave a nuance-rich signal that is translated to a multifaceted
sense of smell in the brain. That the shape of molecules is coupled to a particular
odor has been known for a long time. Elliptical molecules, for example, have
a camphor-like scent. Long stretched-out molecules are described as having an
ether smell, and floral character requires a construction that is reminiscent of the
shape of a violin case. However, even small structural changes can exert impressive
effects on our sense of smell (▶ Sect. 5.7).

[Structure drawings: angiotensin II (labeled Ile5, His6, Pro7, Phe8, C-terminus) and
losartan 29.20, shown separately and superimposed]

Fig. 29.11 The C-terminal part (red) of the octapeptide angiotensin II (yellow, below) is
compared with the structural architecture of losartan (green) to generate a working hypothesis,
which later turned out to be wrong. The butyl side chain mimics an isoleucine residue and the
imidazole ring with the CH2OH group is superimposed on the histidine. The proline and the
phenyl ring of phenylalanine are mimicked by a biphenyl group. The tetrazole represents an
isostere for the terminal acid function.

The elucidation of our sense of smell is based on the work of Linda Buck and
Richard Axel, who were awarded the Nobel Prize in medicine in 2004 for their
accomplishments. Scent molecules are perceived by the olfactory cells in the
olfactory mucosa of the nasal cavity. Different olfactory cells are depolarized and
activated by diverse scents; this means that the receptor proteins on the cells can
distinguish between structurally divergent aromas by their affinity to them. The
GTP-dependent activity of an adenylate cyclase increases as a signal in the cell. This is
interpreted as a distinct indication of the involvement of intracellular G proteins in
the olfactory process.
Linda Buck and Richard Axel therefore searched for a family of G protein–
coupled receptors that are expressed in the olfactory mucosa of the rat. They
were quickly successful. In the meantime it is known that there are about
1,000 GPCR-type olfactory receptors in mice, and about 350–400 different
receptors in humans. They make up about 1–5% of the mammalian genome.
Despite their similar function, they vary considerably in sequence.
This hypervariability is consistent with the nuance-rich recognition and binding
of odorants of very different structures. But for such diversity, are 350–400
different receptor variants really enough? Only one type of olfactory receptor is
expressed per neuron, which means that each neuron has one individual olfactory
receptor gene available to it. By studying the response profiles that different neurons
generate upon recognizing structurally modified odorants, the following result was
obtained: Each olfactory receptor can recognize multiple odorants. Conversely,
a given odorant is recognized by multiple receptors, but the induced receptor
response varies in intensity. That means that different scents are
registered by different receptor combinations and a graded signal distribution is
generated. This corresponds to a combinatorial coding of the sense of smell. This
trick of encoding olfactory signals in composite receptor profiles makes it possible
to distinguish an almost unlimited number of scents. This is the secret to the
versatility of our sense of smell. The kind of magical world that is accessible to
mice is almost unimaginable. They are equipped with almost three times as many
olfactory receptors as we are. Perhaps even Jean-Baptiste Grenouille from Patrick
Süskind’s novel, Perfume: The Story of a Murderer, would pale in comparison.
It is well known that the olfactory sense can vary between people. Some scents are
imperceptible to some people. Some individuals experience a scent as disgusting and
offensive, while others describe it as pleasant and welcoming. This can be seen,
for example, in the case of the steroid androstenone. Androstenone is an important
component of a typical male scent. It is a metabolite of the male sexual hormone
testosterone and serves as a pheromone in diverse mammals. In wild boars, it makes
the sow receptive to the boar. In a study with a group of almost 400 test subjects, the
perception of this scent fell into three populations of approximately equal size.
A third could not smell the compound at all, another third experienced it as
repulsive, and the remaining third found that it had a pleasant vanilla-like smell.
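
The combinatorial coding described above can be made tangible with simple counting. The Python sketch below assumes, purely for illustration, that each receptor type is either active or silent for a given odorant. Real responses are graded, so these counts understate the number of distinguishable profiles. The receptor numbers are the approximate figures quoted in the text, and the function names are invented here.

```python
import math

HUMAN_RECEPTORS = 350   # lower end of the range quoted for humans
MOUSE_RECEPTORS = 1000  # approximate number quoted for mice


def patterns_with_k_active(n_receptors, k_active):
    """Number of distinct receptor subsets if exactly k receptor types respond."""
    return math.comb(n_receptors, k_active)


def all_on_off_patterns(n_receptors):
    """Number of on/off patterns over all receptor types, including 'none active'."""
    return 2 ** n_receptors


for k in (1, 2, 3, 5):
    human = patterns_with_k_active(HUMAN_RECEPTORS, k)
    mouse = patterns_with_k_active(MOUSE_RECEPTORS, k)
    print(f"{k} active receptor types: human {human:,}  mouse {mouse:,}")

digits = len(str(all_on_off_patterns(HUMAN_RECEPTORS)))
print(f"All on/off patterns for {HUMAN_RECEPTORS} receptor types: a number with {digits} digits")
```

Even with only a handful of receptor types responding per odorant, the number of possible combinations far exceeds the number of receptor genes, which is the point of the combinatorial argument.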
As an explanation for this discrepancy, genetic differences in the olfactory
receptors activated by androstenone were found. The receptor OR7D4 showed the
strongest sensitivity to the steroid. In the search for genetic polymorphisms,
variations in single base pairs in the genome of individuals (so-called single
nucleotide polymorphisms, SNPs, ▶ Sect. 12.11) were found. Two coupled exchanges occur
most often; they cause the exchange of an arginine for a tryptophan, and a threonine
for a methionine. Both expression forms (alleles) of the mutated receptor show
a reduced response to the odorants under in vitro conditions. Interestingly, the
test subjects who carry a mutated receptor seem not to perceive the scent, or they
perceive it only weakly. People with the unchanged gene variant are particularly
sensitive to the odorant. Therefore it could be in the genes that some men cannot
smell each other!
The exchange of a serine for an asparagine at another position in the OR7D4
receptor increases the sensitivity to the steroid. Interestingly, this exchange in
addition to four other mutations in the receptor is what distinguishes humans
from chimpanzees. The recognition of possible rivals in the wild for a male chimp
or a sexual partner for a female chimp might be more important than for us humans.
Perhaps Nature equipped chimpanzees with a more sensitive olfactory receptor for
androstenone for this reason.
Two aspects from the study of the sense of smell can be translated to the effects
of drugs on GPCRs. Synthetic agonists and antagonists compete for the same
binding site, in particular in GPCRs that are regulated by small biogenic amines.
Usually the synthetic ligands are larger and interact with a larger number
of amino acid residues than the endogenous competitor. Polymorphisms based on
individual base exchanges are described for these receptors, too. Therefore an
attenuated sensitivity to the effects of these drugs on the mutated receptors must
be expected. This becomes noticeable when, within one group of patients,
variations in the therapeutic window are found. The other aspect that was
illuminated by the research on olfactory receptors is the combinatorial composition of
a binding profile made up of the individual interaction signals from the different
receptors. Multiple subtypes of pharmacologically relevant GPCRs, for
example of the serotonin receptor, are known to be expressed on the cells. Efforts have
indeed been made to develop highly selective ligands for these subtypes, but this is
not an easy task if the receptor subtypes are particularly closely related. There is
always graded binding to all related receptors. Therefore the signal that
reaches the cell is a composite of information from all the individual binding
profiles. These profiles differ for different ligands and afford a divergent
pharmacological activity spectrum. This makes it extraordinarily difficult to estimate
the therapeutic value of a development candidate in this area before clinical trials.
It could be that these ligands have just the right balance on multiple subtypes to
achieve their value in therapy. Analogously, one scent develops its lofty potential by
the optimally graduated stimulation of a whole array of olfactory receptors,
whereas another does not rise above the modest level of a cheap perfume!

29.8 Receptor Tyrosine Kinases and Cytokine Receptors: Where Insulin and EPO Display Their Activity

Not only GPCRs relay extracellular signals into the cell’s interior. A further large
group of membrane-bound receptors that can also achieve this task are the classes
of dimerizing or oligomerizing receptors that bind growth factors. These recep-
tors carry a tyrosine kinase domain on the cytosolic side. Therefore these receptor
tyrosine kinases can also be considered to be allosterically regulated enzymes, the
controlling domains of which are found on the cell’s exterior. The ligands for these
receptors, the growth factors, are themselves proteins with ca. 50–400 amino acids.
Binding of the growth factor, presumably initially to a monomeric building block of the
receptor, brings about dimerization. Conformational changes on the cell’s interior are
induced in both of the tyrosine kinase domains that have come together, and this
results in the autophosphorylation of the receptor at multiple sites. This is the
trigger for the recruitment of further adapter proteins that also activate
phosphorylation. These processes lead into a subsequent kinase-dependent signal
cascade and activate processes in the cell nucleus that regulate gene expression.
About 20 classes have been characterized in the family of dimerizing tyrosine
kinase receptors. A large group of these is represented by the insulin-like growth
factor receptors (IGFRs). Insulin, a 51-residue protein composed of two chains,
binds to one such receptor. One special feature of the IGFRs is that they are
permanently in the dimeric form. Disulfide bridges couple both receptor halves. It
was discovered for another group, the epidermal growth factor receptors, that
heterodimers between receptor subtypes can be formed. Not all of the tyrosine
kinases of these receptors are functional so that an additional receptor regulation of
the subsequent signaling cascade is possible by the dimerization of different units.
The therapeutic goal to be achieved by activating or suppressing these receptors
can be very different. Insulin receptor stimulation represents a concept for the
treatment of diabetes mellitus because this receptor regulates the uptake of glucose
into the cell. On the one hand, one can concentrate on modified insulin derivatives
(▶ Sect. 32.2) that stimulate the receptor analogously to natural insulin.
Alternatively, an attempt can be made to activate the receptor’s tyrosine kinase
activity. The non-peptidic fungal metabolite L-783281 29.26 (Fig. 29.12) was
discovered in a cell-based screening assay at Merck & Co. This orally administered
compound reduces the blood sugar level in a diabetic mouse model. Perhaps the
binding of specific low-molecular-weight ligands represents a new perspective for
the development of orally available insulin-replacement therapy. The insulin-like
receptors such as the epidermal growth factor receptor represent an interesting
target structure for tumor therapy. Their expression is upregulated in tumors.
They stimulate cell growth and prevent programmed cell death (apoptosis). In
these cases, blocking the function of these receptors is of interest. Until now all
experiments to develop low-molecular-weight inhibitors that block the recognition
of growth factors on the surface segment of the dimerizing receptor have failed. The
interaction between the large interaction surface and the protein being recognized
represents a nontrivial problem for drug design (▶ Sects. 7.10, ▶ 10.6, and
Fig. 29.13). Nonetheless a highly specific antibody could be found that competes

Structural formulas: 29.26 L-783281; 29.27 Gefitinib

Fig. 29.12 The fungal metabolite L-783281 29.26 was discovered as an insulin mimetic. It
stimulates the receptor tyrosine kinase of the insulin receptor and has antidiabetic properties.
Gefitinib 29.27 inhibits the tyrosine kinase domain of the epidermal growth factor receptor.
Fig. 29.13 The crystal structure of erythropoietin (EPO) with the ligand-binding domain of the
erythropoietin receptor (EPOR). The structural elements are represented schematically. EPO adopts
a tetrahelical bundle folding pattern (yellow). The dimeric receptor is basically constructed from
β-pleated sheets. The contact surfaces between receptor and ligand are shown in white and blue
(interior side is yellow).

with the natural ligand and foils receptor binding (▶ Sect. 32.3). An alternative
concept is the inhibition of tyrosine kinases from the interior of the cell. In this case
successes have been registered. Gefitinib 29.27 is a tyrosine kinase inhibitor
(▶ Sect. 26.3) for the epidermal growth factor receptor that has found clinical
application (Fig. 29.12). Antisense nucleotides (▶ Fig. 32.4) represent alternative
therapeutic concepts, as does gene silencing with siRNA (▶ Sect. 12.7) to reduce
the expression rate of the target receptor. In addition to the signal cascades that the
tyrosine kinase receptors use, living organisms also use cytokines for signal transduction.
Here too, the signaling molecules are proteins, which often adopt folding patterns
made up of a bundle of four helices (Fig. 29.13). Cytokines are
released from many cells. Their principal function is to disseminate signals in the
immune system. They are therefore involved in immune, inflammatory, and infectious
diseases. They also give signals to leukocytes and macrophages to migrate to
sites of inflammation. They play an important role in cell differentiation and cell
proliferation so that they also have importance in cancer therapy.
Cytokines are recognized by cell-surface receptors that are also coupled to
a protein kinase on the cell’s interior. They are able to initiate cellular processes
through these kinases. This can lead to the up- or downregulation of gene expression.
Cytokines include the interferons, interleukins, and chemokines. Interferons
stimulate cells involved in immune defense, in particular during viral infections.
Interleukins were initially considered to be involved in the communication between
leukocytes, but because they are also involved in the modulation of cell growth and
cell death, they are also used for the treatment of tumors. Chemokines are signaling
molecules that attract immune cells to sites of inflammation.
From the point of view of therapy, cytokines themselves or functional surrogates
are interesting. Either stimulation or inhibition of their receptors can represent
a therapeutic concept. Because the receptors in question are, once again, regulated by
proteins, it is difficult to intervene with low-molecular-weight compounds.
Therefore native cytokines are used as the therapeutic medication. Erythropoietin
(EPO), which is used to stimulate the formation of erythrocytes and is
therefore abused by athletes for doping purposes; the interferons IFN-α and IFN-β,
which are used to treat multiple sclerosis and virally induced chronic hepatitis; and an
artificial TNF-α receptor, which is used for the treatment of chronic arthritis, are at
the forefront of sales of the so-called biologicals (▶ Chap. 32, “Biologicals:
Peptides, Proteins, Nucleotides, and Macrolides as Drugs”). Anakinra, an
interleukin-1 receptor antagonist that is manufactured by gene technology, was
approved for the treatment of rheumatoid arthritis in 2002. As an antagonist, it
blocks the inflammatory effects of the interleukin IL-1.

29.9 Synopsis

• Membrane-bound receptors of the family of G protein–coupled receptors
(GPCRs) and oligomeric receptors with attached tyrosine kinase domains transmit
information from outside of the cell into the interior.
• GPCRs represent a large family of proteins in the human genome; the family has
about 800 members and is targeted by 30% of the marketed drugs. GPCR
activation is initiated by the binding of an extracellular ligand and transmitted
through conformational transitions. This results in the generation of a binding
site to accommodate the G protein.
• Extracellular ligands can feature the properties of an agonist by activating the
receptor and stabilizing its active conformation. An antagonist prevents
the binding of an agonist; however, it does not turn off the basal activity of the
receptor. An inverse agonist is able to fully suppress the receptor function and
stabilizes the inactive conformation.
• GPCRs occur as many subtypes, and their misregulation is associated with many
disease patterns, which strongly depend on the tissue in which they are expressed.
• The first structural insights into GPCRs were obtained from bacteriorhodopsin, a
seven-transmembrane receptor that is a proton pump and not a real GPCR.
Rhodopsin, the light-regulated receptor in the eye, is at present the best-studied
GPCR, and structures of its inactive and active state are known. They suggest
how activation is transmitted via conformational rearrangement of amino acid
residues and reshuffling of a water network leading to helical movements and the
opening of an ionic lock. This results in the generation of the recognition site of
the G protein.
• Crystal structures of the β-adrenergic receptors explain binding features of
classical β-blockers and suggest tiny structural transitions from the inactive to
active state.
• Ligand-based optimization, including freezing the conformational state of bound
agonists, has helped in the development of selective D1 agonists.
• Antagonists for the peptide-binding AT1 receptor have been developed by
taking the four C-terminal residues of the peptide angiotensin II as a reference.
The resulting sartans are potent antihypertensives. Later it was demonstrated that
the peptide agonist and the small-molecule antagonists do not share a common
binding site; nevertheless, an incorrect working hypothesis resulted in successful
drug design.
• The wealth of nuances of our sense of smell is achieved by a simultaneous
recognition of odorants at multiple GPCRs with composite and attenuated receptor
profiles.
• Genetic polymorphism of the odorant receptors results in an attenuated sensi-
tivity of individuals for different scents.
• Composite receptor profiles and attenuated sensitivity due to genetic polymor-
phism can be expected for GPCRs targeted by marketed drugs too.
• Dimerizing or oligomerizing receptors bind growth factors and cytokines and
carry a tyrosine kinase domain on the cytosolic side. Upon activation, the kinase
domain starts autophosphorylation, which initiates kinase-dependent signaling
cascades.
• Activation or suppression of oligomerizing receptors needs ligands that interfere
with the binding of macromolecular endogenous ligands. Antibodies have suc-
cessfully been raised to compete with the natural ligands. Furthermore, small-
molecule kinase inhibitors have been developed to block function of the attached
cytosolic tyrosine kinase domain.

30 Ligands for Channels, Pores, and Transporters

The cell is the smallest structural and functional unit of all living things. Single-cell
organisms contain only one such unit. In complex organisms such as humans,
10¹³–10¹⁴ cells come together. Due to their constitution, cells are capable of
metabolism. They possess a complex architecture that is directly related to their
function. Because of the high degree of differentiation in higher-developed
organisms, it is not possible to refer to a typical, representative cell. Each cell is
surrounded by a cell membrane. It ensures that the cell represents an individual
closed unit. Signals must be transmitted through this membrane. Systems that achieve
this task were discussed in ▶ Chaps. 28, “Agonists and Antagonists of Nuclear
Receptors” and ▶ 29, “Agonists and Antagonists of Membrane-Bound Receptors”.
Material exchange must also be possible, however, so that the cell can be supplied
with the necessary substances for its function. The selective permeability of the
membrane is of special importance. Amphiphilic compounds can passively diffuse
through the membrane of their own accord. For example, the steroid hormones
discussed in ▶ Chap. 28, “Agonists and Antagonists of Nuclear Receptors” have
this property. Polar compounds such as amino acids, peptides, or sugars do not
permeate the membrane passively but they are essential for the maintenance of the
cell. Therefore the cell is equipped with special transporters that are sometimes
highly selective, but sometimes also surprisingly promiscuous. Because the substance
transport of polar compounds generally occurs against a concentration gradient, this
is accomplished only with the consumption of energy. Nature couples the task of such
a transporter with an energetically favorable reaction. In biological systems, the
hydrolysis of the triphosphate unit of ATP primarily serves this purpose.
Another group of charged particles, the ions, have fundamental importance in
the regulatory function of cells. Without special protein systems, however, they
could not permeate the membrane. If different concentrations of certain ions are in
the cell interior and exterior, a difference in the electrochemical potential will
result. Changes in the membrane permeability for ions play a decisive role in cell
stimulation and signal transduction. Nerve and muscle cells in particular react to
such stimuli with specific changes in their states. For example, the contractions of
muscle cells determine the heart beat. Nerve cells transmit stimuli over short or
long distances and serve to distribute information in the central nervous system.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_30, © Springer-Verlag Berlin Heidelberg 2013

The establishment and maintenance of such concentration gradients across the
membrane requires the transport of the relevant ions across the membrane barrier. It
is primarily the ion pumps that build a concentration gradient across the membrane
barrier. They work relatively slowly and consume energy. Therefore their function is
coupled to an energetically favorable reaction. Ion pumps achieve a transport rate of
10²–10⁴ particles per second. At 10³–10⁵ molecules per µm², their local density
in the membrane is indeed relatively high, but for the fast switching of cellular processes,
the ion pumps are much too slow. Therefore there are specific ion channels that are
responsible for selective and passive ion passage along a concentration gradient. They
achieve a flow rate of 10⁶–10⁸ ions/s, which is only slightly below the rate of diffusion.
Their occupancy density in the membrane is much lower at 1–10 molecules/µm². Ion
channels are either voltage- or ligand-gated and allow a change in the membrane
potential within the millisecond range. If the pumps have established an electrochem-
ical gradient across the membrane, the opening of a specific ion channel leads to ion
flow across the membrane purely for entropic reasons.
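
The orders of magnitude quoted above for pumps and channels can be put side by side in a short calculation. The following Python sketch uses only the ranges given in the text; the helper function, the variable names, and the choice of multiplying range end-points are illustrative assumptions and not part of the original discussion.

```python
# Ranges quoted in the text, written as (lower bound, upper bound).
PUMP_RATE = (1e2, 1e4)          # particles per second per ion pump
CHANNEL_RATE = (1e6, 1e8)       # ions per second per open ion channel
PUMP_DENSITY = (1e3, 1e5)       # pumps per unit membrane area
CHANNEL_DENSITY = (1.0, 10.0)   # channels per the same unit membrane area


def range_product(a, b):
    """Multiply two (low, high) ranges end-point by end-point."""
    return (a[0] * b[0], a[1] * b[1])


# How much faster is one open channel than one pump?
# Smallest ratio: slowest channel vs. fastest pump; largest: the reverse.
per_protein_ratio = (CHANNEL_RATE[0] / PUMP_RATE[1],
                     CHANNEL_RATE[1] / PUMP_RATE[0])

# Throughput per unit membrane area (rate per protein times density).
pump_flux = range_product(PUMP_RATE, PUMP_DENSITY)
channel_flux = range_product(CHANNEL_RATE, CHANNEL_DENSITY)

print(f"channel/pump speed ratio per protein: {per_protein_ratio[0]:.0e} to {per_protein_ratio[1]:.0e}")
print(f"pump throughput per unit area:    {pump_flux[0]:.0e} to {pump_flux[1]:.0e} per second")
print(f"channel throughput per unit area: {channel_flux[0]:.0e} to {channel_flux[1]:.0e} per second")
```

On these figures a single open channel is faster than a single pump by several orders of magnitude, whereas the throughput per membrane area falls in overlapping ranges because the pumps are far more numerous.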
The cell must also regulate its water homeostasis. Individual water molecules
can directly diffuse across the membrane. To transport larger quantities of water
across the membrane, specific pores called aquaporins are necessary that regulate
the in- or outflow of water according to the gradient in the osmotic pressure.
These systems of specific particle transport across a membrane shall be consid-
ered in this chapter in detail. They are all integral membrane proteins. Examples
of these membrane proteins that have been characterized by structural biology will
be presented. Ligands shall be discussed that represent an approach to the therapeutically
relevant regulation of these proteins. Furthermore, bacterial transport
systems that change the permeability of membranes, and can thereby act as antibiotics to combat
other microorganisms, shall be discussed.

30.1 Electric Potential and Ion Gradients Stimulate Cells

Certainly, everyone is familiar with the construction of electrochemical redox
cells from electrochemistry. If a vessel that is fitted with an ion-permeable glass
membrane is filled on either side of the membrane with solutions of different
concentration of, for example, copper sulfate, and copper metal plates are placed
in both solutions, a voltage difference can be measured. The voltage can be
calculated by using the Nernst equation. Because an identically constructed half-cell
is used on both sides of the membrane, just the logarithm of the concentration
ratio between both sides and the number of migrating charges determine the
potential. In a thought experiment, imagine that the different sides of the vessel
are filled with potassium and sodium chloride solutions. The separating membrane
should only be permeable for potassium ions, and impermeable for sodium ions.
The system will try to balance the concentration gradient of both Na+ and K+ ions
for entropic reasons. The membrane, however, only allows this for potassium ions.
Schematic of Fig. 30.1: cell membrane with Cl− channel, Na+/K+-ATPase, Ca2+ channel, Na+ channel, and K+ channel.
Exterior: [Na+] 150 mM, [K+] 5 mM, [Cl−] 120 mM, [Ca2+] 2 mM.
Interior: [Na+] 15 mM, [K+] 150 mM, [Cl−] 10 mM, [Ca2+] 10–100 nM.

Fig. 30.1 Different pumps and ion channels ensure the calibration of ion gradients across the cell
membrane, and in doing so establish a potential difference across the membrane. The channels can be
either ligand- or voltage-gated. Potassium channels (blue) carry potassium ions out of the cell
highly selectively and are largely responsible for the calibration of the resting potential. The
opening of fast sodium (red) and also calcium channels (green) causes an action potential, which
leads to depolarization. A pump (violet) that exchanges three Na+ for two K+ ions ensures the
re-establishment of the Na+/K+ ion concentration in the resting state. Chloride channels (yellow)
allow an influx of Cl− ions, which hyperpolarizes the cell and hinders depolarization. The
concentrations in mol/L (M) that are given on both sides of the membrane correspond to the
approximate values in the resting state.

Therefore an excess of positive ions accumulates on one side of the membrane, and
a deficit is on the other side. A potential difference is formed that, as in the first
case, can be calculated from the concentration ratio at the boundary surface by
using the Nernst equation. After a little while the net migration of potassium ions
comes to a stop because the tendency to reduce the concentration gradient is
counterbalanced by the difference in electrostatic potential. Only a few potassium
ions, in fact, migrate before this dynamic equilibrium is established.
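
The Nernst relation referred to above can be sketched numerically. The following Python example evaluates it for the approximate resting-state concentrations listed in Fig. 30.1. The temperature of 37 °C, the sign convention (potential of the interior relative to the exterior), and all function and variable names are assumptions made for illustration.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_potential_mV(conc_out, conc_in, charge, temperature_K=310.15):
    """Equilibrium potential of one ion species in millivolts.

    The value is the potential of the cell interior relative to the exterior.
    Both concentrations must be given in the same unit.
    """
    return 1000.0 * R * temperature_K / (charge * F) * math.log(conc_out / conc_in)


# (outside, inside, charge), concentrations in mol/L as listed in Fig. 30.1.
ions = {
    "K+": (5e-3, 150e-3, +1),
    "Na+": (150e-3, 15e-3, +1),
    "Cl-": (120e-3, 10e-3, -1),
    "Ca2+": (2e-3, 100e-9, +2),   # upper end of the 10-100 nM interior range
}

potentials = {
    name: nernst_potential_mV(outside, inside, z)
    for name, (outside, inside, z) in ions.items()
}

for name, value in potentials.items():
    print(f"{name}: {value:+.0f} mV")
```

Under these assumptions the potassium and sodium values come out with magnitudes close to the roughly 90 mV and 60 mV quoted in this section, with the potassium value negative and the sodium value positive.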
In living nature, it is above all sodium, potassium, calcium, and chloride ions
that form such potential differences between the interior and exterior of the cell.
Initially ion pumps are responsible for the membrane’s concentration gradient. If
the cell membrane were only permeable for potassium ions, a 30-fold concentration
gradient of K+ ions between the interior and the environment would result in
a voltage of −90 mV (Figs. 30.1 and 30.2). This is the situation in the resting
state of a cell. As described in the thought experiment, the outflow of K+ ions
through a highly specific potassium channel establishes this voltage difference. The
above-mentioned −90 mV is not measured, however; rather, a resting membrane
potential of about −70 mV (Fig. 30.2) is observed. Because other ions also have
a certain permeability, the membrane potential at any arbitrary time reflects
a complex mixture of different contributions of individual ions and their conductivities
(Fig. 30.1). To stabilize the cell in a particular phase (for instance, in the
resting state), the cellular ion distribution is maintained by Na+/K+-ATPases. They
pump ions against the electrochemical concentration gradient by consuming ATP.
For each transport process, three sodium ions are pumped out of the cell while two
potassium ions are pumped in. There is another such pump to establish the calcium
ion concentration.
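
How a mixture of ion permeabilities moves the membrane potential away from the pure potassium value can be illustrated with the Goldman-Hodgkin-Katz voltage equation. This equation is not named in the text; it is used here as a standard textbook model, and the relative permeabilities below are hypothetical values chosen for illustration only. Concentrations are the approximate values from Fig. 30.1, and 37 °C is assumed.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)
F = 96485.33212   # Faraday constant, C/mol


def ghk_potential_mV(p_k, p_na, p_cl, conc, temperature_K=310.15):
    """Goldman-Hodgkin-Katz voltage (inside relative to outside) in mV.

    conc maps an ion name to (outside, inside) concentrations in a common unit.
    For the anion Cl-, inside and outside are swapped relative to the cations.
    """
    numerator = (p_k * conc["K"][0] + p_na * conc["Na"][0] + p_cl * conc["Cl"][1])
    denominator = (p_k * conc["K"][1] + p_na * conc["Na"][1] + p_cl * conc["Cl"][0])
    return 1000.0 * R * temperature_K / F * math.log(numerator / denominator)


# (outside, inside) in mM, approximate resting-state values from Fig. 30.1.
conc_mM = {"K": (5.0, 150.0), "Na": (150.0, 15.0), "Cl": (120.0, 10.0)}

# Membrane permeable to K+ only: reduces to the Nernst potential of K+.
only_k = ghk_potential_mV(1.0, 0.0, 0.0, conc_mM)

# Hypothetical relative permeabilities P_K : P_Na : P_Cl (illustrative assumption).
mixed = ghk_potential_mV(1.0, 0.04, 0.45, conc_mM)

print(f"K+ only: {only_k:.0f} mV, mixed permeabilities: {mixed:.0f} mV")
```

With these assumed permeabilities the calculated potential is less negative than the pure potassium value, which is the qualitative behavior described in the text; the exact number depends entirely on the chosen permeability ratios.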
Schematic of Fig. 30.2: upper trace, membrane potential (axis from −100 mV to +50 mV) versus time; lower trace, ion flow versus time (iK+ outward; iNa+ and iCa2+ inward).

Fig. 30.2 The membrane potential in the resting state is about −70 mV and is stabilized by the
efflux of potassium ions (iK+←, blue). Upon excitation, a fast sodium channel opens. The influx of
sodium ions (iNa+→, red) shifts the membrane potential by about 100 mV in the positive direction.
When this value is reached, the sodium channel closes. The efflux of potassium ions repolarizes the
cell and shifts the potential below the resting membrane potential (hyperpolarization).
In cells that have calcium channels, their opening can also contribute to depolarization and
therefore to an action potential (iCa2+→, green).

If the cell is stimulated by a so-called action potential, the membrane permeability
for the individual ions changes. First of all, the permeability for sodium ions
changes dramatically. In the resting state, the sodium ion concentration in the
extracellular space is about tenfold higher than in the cell’s interior. At
a threshold potential of about −60 mV, the sodium channels open, and the membrane
potential is temporarily shifted toward the positive range to about +40 mV by
depolarization. Even before the Na+ equilibrium potential of about +60 mV is
reached, the fast influx of sodium ions stops because the sodium channel
closes again. Finally the potential changes in the direction of the resting potential:
the outflux of potassium ions from the cell causes this so-called repolarization.
The voltage-dependent KV potassium channels control this process. The membrane
potential decreases slightly below the resting membrane potential, and the potassium
channels close (Fig. 30.2). If the membrane potential has fallen to more
negative values than the resting potential, this is referred to as hyperpolarization. The
excitability of the cell is reduced in this state.
Calcium ions can contribute to the stimulation and action potential of a cell. These
play a decisive role in, for instance, heart muscle cells. The extracellular calcium
concentration is significantly higher than that inside the cell. Therefore an additional
influx of Ca2+ ions into the cell can intensify the depolarization (Fig. 30.1). Entry into
the cell’s interior is gained through highly selective calcium channels. A slowing of
the depolarization can be achieved if the opening of these sodium or calcium channels
is blocked. Many local anesthetics work by the principle of inhibiting sodium
channels on nerve cells. Calcium channel blockers minimize the Ca2+ influx. This
slows, for example, the diastolic depolarization in heart cells and the heart muscle
works more efficiently. Therefore compounds such as nifedipine, diltiazem, and
verapamil are used to treat hypertension and cardiac arrhythmias (▶ Sect. 2.6).
The electrophysiological processes described in this section reflect a highly
simplified picture. According to the function and tissue-specific location of the
considered cell, multiple ion-specific channels are at work to achieve a finely tuned
setting of the required membrane potential.

30.2 Molecular Function of a Potassium Channel at the Atomic Level

The finely attenuated setting of ion gradients across the membrane establishing the
overall membrane potential shows that channels must have a high selectivity for
individual ions. The difference between the ions to pass is very small. Sodium and
potassium ions have the same charge and differ in size by only a little more than
0.35 Å. Their hydration enthalpies are slightly different, but the geometry of their
hydration shell differs significantly. The larger potassium ion is surrounded by eight
water molecules, whereas the sodium ion preferably accommodates six nearest
neighbors. How can a protein exploit this small difference efficiently so that
a selective ion filter results? The achieved discrimination is impressive: only one
sodium ion is smuggled in for every 10,000 potassium ions!
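
To get a feeling for what a discrimination of 1 in 10,000 means energetically, the ratio can be converted into a free-energy difference with the relation ΔΔG = RT ln(ratio). This conversion is a simplifying assumption: it treats the selectivity as an equilibrium preference and ignores kinetic contributions. The temperature of 37 °C and the function name are likewise illustrative choices.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def selectivity_free_energy_kJ(ratio, temperature_K=310.15):
    """Free-energy difference in kJ/mol that corresponds to a selectivity ratio."""
    return R * temperature_K * math.log(ratio) / 1000.0


# One sodium ion per 10,000 potassium ions, as stated in the text.
ddg_kJ = selectivity_free_energy_kJ(10_000)
ddg_kcal = ddg_kJ / 4.184  # 1 kcal = 4.184 kJ

print(f"{ddg_kJ:.1f} kJ/mol ({ddg_kcal:.1f} kcal/mol)")
```

Under this assumption the filter has to disfavor sodium by somewhat more than 20 kJ/mol, an amount comparable to a few hydrogen bonds, even though the two ions differ so little in size.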
Ion channels are gigantic molecular constructions. They are embedded in the
membrane. It is extremely tricky to remove them from the membrane and embed
them in a crystal lattice with auxiliary material without destroying them. Once this
is achieved, a crystal structure can be determined. Roderick MacKinnon managed
this masterpiece in 1998 at Rockefeller University in New York. Only 5 years later
his achievement was honored with the Nobel Prize.
Initially the structure of the KcsA potassium channel from the bacterium Streptomyces
lividans was determined. The channel is constructed from a homotetramer
and traverses the membrane with two long helices per monomer (Fig. 30.3). The
C-terminal end of another shorter helix is oriented in a cavern in the middle of the
channel. Such a helix forms a dipole moment due to the periodic orientation of well
aligned amide bonds along the protein backbone (▶ Sect. 14.2). The preferred
binding site for a positive charge is formed at the end of such a helix (Fig. 30.4).
Four of these helices are oriented toward the cavern in the interior of the channel.
The potassium ions, which are surrounded by a shell of eight water molecules, are
essentially pulled out of the cytosol. This allows the potassium ions to enter the
hydrophobic membrane environment. With that, however, no discrimination
against sodium ions has been achieved. After this accelerated entry into the
channel, a selectivity filter comes into play. For this, the potassium ions must shed their
hydration shells. During structure determination it was possible to capture some
Fig. 30.3 The four shorter helices (bright red) of the tetrameric potassium channel
orient their negatively polarized C termini toward the binding site where the
potassium ion (violet sphere) sheds its water shell. They draw the positively charged
ions into the ion channel and stabilize them in the interior.

potassium ions in the channel. A potassium ion has a square-antiprismatic
water shell shortly before entering into the selectivity filter (Fig. 30.4). The latter is
constructed in a way that four oxygen atoms from four threonine residues (Thr75) of
the tetramer take on four coordination sites of the eightfold-coordinated potassium
ion. Adjacent to these, four carbonyl oxygen atoms of the main chain (Thr75, Val76, Gly77, and
Gly79) act as further coordinating ligands. This motif of four carbonyl groups arranged
in a ring is repeated three times in the selectivity filter. In doing so, these ring-like-
oriented carbonyl groups take on an arrangement relative to one another that replaces
the perfectly square-antiprismatic orientation of the water molecules in the potassium
ion’s coordination sphere. Nature has ingeniously reconstructed the coordination
geometry of the potassium ion. With the architecture constructed from the TVGYG
motif, it managed to achieve impressive selectivity. The orientation of the coordinating
groups simply does not fit lithium or sodium ions. Nature preferred to use main chain
carbonyl groups for this task. They build up a rigid corset; oxygen functionalities from
side chains would be spatially too flexible for this purpose. Only the entry is opened by
side-chain OH groups from four threonine residues, which have a certain mobility.
By now, the structure of a channel from Bacillus cereus has also been determined,
which shows the same overall architecture but lacks the selectivity for the
potassium ion. It can barely distinguish between K+ and Na+ ions. The selectivity
filter of this protein is constructed differently. The tyrosine residue in the
TVGYG motif of the potassium channel has been exchanged for an aspartic acid
to give TVGDG. This has far-reaching consequences. In the lower part that is
formed by the threonine and valine residues, the geometry of the filter remains
largely unchanged in both channels. The backbone carbonyl groups of the
Fig. 30.4 Crystal structure of the bacterial potassium channel KcsA in the open state. The
channel forms a tetramer (a); each monomer is constructed from three helices. Two of these
helices (red) traverse the entire membrane, whereas the third, shorter helix (blue-violet) is oriented
toward a cavern in the interior of the channel. There the potassium ions (violet spheres), which are
surrounded by eight water molecules, shed their water shell and enter the selectivity filter (b).
A potassium ion with its square-antiprismatic coordination is shown before entering into the
filter. The carbonyl groups from the main chain reproduce the eightfold coordination sphere, wrapping
around the potassium ion with similar geometry, and transfer the ion across the membrane.

subsequent amino acids are turned toward the interior in the potassium-selective
channels and contribute to the filter (Fig. 30.5). In unselective channels, these
carbonyl groups are turned away from this area and open a chamber that can indeed
accommodate an ion, but cannot achieve selectivity filtering.
The mechanism of the opening and closing of the potassium channel is understood
in greater detail thanks to further structural determinations, also on channels
from other organisms. A change in the membrane potential of about 50 mV causes
the channel to open. Because this voltage difference occurs within a distance of
about 50 Å, it corresponds to a tremendous field strength of about 100,000 V/cm. Evidently,
the parts of the channel that sense the difference in voltage are strongly positively
charged and float like paddles on the exterior of the membrane. A change in the
voltage across the membrane causes a movement of these paddles and initiates the
opening or closing of the channel. A kink in one of the extended transmembrane

[Figure panels a and b: filter residues labeled Thr, Val, Gly, Tyr, Gly (a) and Thr, Val, Gly, Asp, Gly (b)]

Fig. 30.5 Comparison of the tetrameric ion filter of the highly selective potassium channel KcsA
from Streptomyces lividans (a) and the sodium and potassium-permeable channel from Bacillus
cereus (b). The selective channel forms a tetramer from a TVGYG motif; a TVGDG is found at the
same place in the Na+/K+ channel. Both channels have the same geometry in the lower part formed
by threonine and valine residues. The backbone carbonyl groups of the following amino acids
Gly–Tyr are rotated toward the interior in the potassium-selective channel and contribute to the filter,
whereas the C=O groups from the four Gly–Asp motifs are rotated away in the unselective channel.
It opens to a chamber that can accommodate an ion, but does not achieve selectivity filtering.

helices enables this process. A combination of a kink and a turning movement of about
30° of the helical end in each subunit of the tetramer causes the closure or the opening
of the channel. A highly conserved glycine residue is found at the bend position. Its
lack of a side chain gives this amino acid greater conformational flexibility.
Glycine is therefore frequently involved in conformational switches.
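
The field strength quoted above follows directly from the two numbers given in the text. The following is a minimal Python sketch of the unit conversion; the function name is chosen here purely for illustration.

```python
def field_strength_v_per_cm(voltage_mv: float, distance_angstrom: float) -> float:
    """Return the electric field in V/cm for a voltage drop over a given distance."""
    voltage_v = voltage_mv * 1e-3            # 1 mV = 1e-3 V
    distance_cm = distance_angstrom * 1e-8   # 1 angstrom = 1e-8 cm
    return voltage_v / distance_cm


# Values from the text: about 50 mV across about 50 angstroms.
field = field_strength_v_per_cm(50.0, 50.0)
print(f"{field:.0f} V/cm")
```

The result reproduces the order of magnitude stated in the text, which shows why even a small change in membrane potential exerts a strong force on charged parts of the protein.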
One group of potassium channels is ATP-dependent. Their structural architecture
is much more complex than that of the bacterial channel described above. Two genes, Kir6.1
and Kir6.2, are known that code for the pore-forming part of the ATP-dependent
channel. The channels are hetero-octamers, each constructed from four Kir channel
proteins and four regulatory units. The latter are called sulfonylurea receptors
because they can be blocked by sulfonylureas. ATP binds to the Kir subunit and
the channel closes. ATP is hydrolyzed to ADP via a multistep process, and the
ATP-induced closure is reversed. Through dissociation and renewed binding of
Mg–ADP, the channel is arrested in the open state. Its state therefore depends on the
ATP/ADP ratio in the cell. Active substances such as pinacidil 30.1, diazoxide 30.2,
or levcromakalim 30.3 are known to stabilize the channel in the open state
(Fig. 30.6). Pinacidil is used to treat high blood pressure, and diazoxide is used in
the therapy of Langerhans islet cell tumors. In contrast, the large group of sulfo-
nylureas (Fig. 30.6) blocks the regulatory subunit and leads to closure of the
attached potassium channel in the insulin-producing cells of the pancreas. Elevated
glucose concentrations stimulate insulin secretion from the pancreatic β-cells.
This release occurs as a response to a series of intracellular metabolic and

[Structural formulas: 30.1 Pinacidil, 30.2 Diazoxide, 30.3 Levcromakalim, 30.4 Sulfonylurea (general
scaffold with R1 and R2), 30.5 Tolbutamide, 30.6 Gliclazide, 30.7 Glibornuride, 30.8 Glibenclamide,
30.9 Gliquidone, 30.10 Glimepiride, 30.11 Glisoxepide]

Fig. 30.6 Pinacidil 30.1, diazoxide 30.2, and levcromakalim 30.3 are potassium channel openers.
In contrast, sulfonylureas such as 30.4 block the regulatory subunit of the ATP-dependent
potassium channel in the insulin-producing cells of the pancreas. The basic scaffold of the
sulfonylureas can be broadly varied on both termini (30.5–30.11) with aliphatic (R1), aromatic,
or other cyclic groups (R2).

electrophysiological processes. Glucose penetrates the β-cells via the GLUT-2
transporter. There it is phosphorylated and extensively metabolized. This results
in an increase of the intracellular ATP/ADP ratio and is associated with the closure
of the ATP-dependent potassium channels. The membrane potential depolarizes.
When the threshold potential of about −50 mV is reached, the voltage-dependent
calcium channels open. The influx of calcium provokes an action potential. Finally,
insulin release occurs at the end of this complex cascade as a result of insulin-
containing granules (membrane-enclosed, lysosomal vesicles) fusing with the cell
membrane. Insulin secretion happens in two phases. The first, immediate release
occurs via the fusion and opening of the vesicles that are found in the direct
vicinity of the membrane. The second phase initially requires a replenishment of
insulin granules from the cellular storage. This is caused by a signal cascade that is
initiated by the signaling of some GPCRs (e.g., GLP1R, GPR119). These receptors
are stimulated by incretin hormones such as GLP-1 (▶ Sect. 23.6). The activation of
the adenylate cyclase downstream (▶ Sect. 29.1) finally causes an increase in the
intracellular calcium level and therefore increased insulin release. Sulfonylureas
block the regulatory subunit of ATP-dependent K+ channels and mediate an
increase in the intracellular calcium concentration. The ensuing effect is analogous
to an increase in the ATP/ADP ratio and, as a result, augments insulin secretion
from the pancreatic β-cells. This is exploited as a therapeutic principle for the
treatment of type-II diabetes mellitus. A side effect of the sulfonylureas is the
danger of insulin release despite a low glucose level. This can lead to life-
threatening hypoglycemia. Therefore agonists of the GLP1R and GPR119 receptors
have been developed for which insulin release in the presence of a low glucose level
has not been observed.
To date, none of the voltage-gated calcium channels have been structurally
characterized. It has been speculated that four glutamate residues (so-called
EEEE locus), arranged in a ring relative to one another, form their center and act
as a selectivity filter. The selectivity and transit rate of these channels are
impressive; sodium and calcium ions are almost the same size, and sodium ions
occur at 110-fold higher concentrations. Despite this, a Na+/Ca2+ selectivity of
1:1000 is achieved. It has been proposed that calcium channels bind Ca2+ ions more
tightly, and this results in their selectivity. Sodium ions, which can indeed fit into
the channel well, are prevented from passing by the competition with the more
tightly binding calcium ions. The higher affinity of the Ca2+ ions for their own
channel therefore blocks the flow of the more weakly binding Na+ ions. The channel manages
the discrimination of the larger potassium ions even better.
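
To get a feeling for what the two figures above mean in combination, one can estimate how often a sodium ion passes compared with a calcium ion. The sketch below rests on a deliberately simple assumption that is not taken from the text: passage is taken to be proportional to concentration multiplied by the selectivity weight. Real permeation through calcium channels is more complex, so this is only an order-of-magnitude illustration.

```python
def sodium_per_calcium(conc_ratio_na_to_ca: float, selectivity_ca_over_na: float) -> float:
    """Na+ passages per Ca2+ passage in a simple proportional model (an assumption)."""
    return conc_ratio_na_to_ca / selectivity_ca_over_na


# Values from the text: 110-fold Na+ excess, Na+/Ca2+ selectivity of 1:1000.
na_per_ca = sodium_per_calcium(110.0, 1000.0)
print(f"Na+ passages per Ca2+ passage: {na_per_ca:.2f}")
print(f"Ca2+ passages per Na+ passage: {1.0 / na_per_ca:.1f}")
```

Under this assumption, calcium would still dominate the ion flow by roughly one order of magnitude despite the large excess of sodium.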

30.3 Binding Unwanted: The hERG Potassium Channel as an Antitarget

In September 2007, after more than 45 years of use in therapy, the drug clobutinol
30.12 was withdrawn from the market (Fig. 30.7). This drug was used to treat dry
coughs. Over the years, it is estimated that it was used by approximately 200 million

[Structural formulas: 30.12 Clobutinol, 30.13 Terfenadine, 30.14 Astemizole, 30.15 Sertindole,
30.16 Thioridazine, 30.17 Grepafloxacin, 30.18 Cisapride, 30.19 MK499]

Fig. 30.7 Clobutinol 30.12 was withdrawn from the market after 45 years of clinical use because
of the risk of provoking an arrhythmia. Terfenadine 30.13, astemizole 30.14, sertindole 30.15,
thioridazine 30.16, grepafloxacin 30.17, and cisapride 30.18 met with the same fate and were
either withdrawn, or their indications for use were severely limited. MK499 30.19 is a potent
class-III antiarrhythmic agent, and it binds to the hERG potassium channel.

Fig. 30.8 The prolongation of the QT interval between the beginning (Q) and end (T) of the heart’s
ejection phase can lead to fatal arrhythmias, including sudden tachycardia (heart racing),
ventricular fibrillations, and cardiac arrest.
[ECG trace: voltage (mV) plotted against time, with the P, Q, R, S, and T waves labeled]

patients. It was even converted to an over-the-counter drug. Recent clinical studies
on healthy adults raised the suspicion that the compound could cause cardiac
arrhythmias, which can lead to fatalities in the worst case. Several other well-known
drugs have met the same fate. For the same reason, terfenadine
30.13, astemizole 30.14, sertindole 30.15, thioridazine 30.16, grepafloxacin 30.17,
and cisapride 30.18 have been withdrawn, or their use has been severely limited
(Fig. 30.7). In all of these cases the danger of a rare but life-threatening cardiac
arrhythmia was the reason for the withdrawal. This is all the more unsettling
because it happens with medications that are usually not used to treat life-
threatening conditions, but rather, for example, nervous coughing, allergies, infec-
tions, or gastrointestinal diseases.
What suddenly triggers such arrhythmias, which in the worst case lead to
death, especially during physical activity? The depolarization and repolarization of
heart muscle cells, as described above, is regulated by the in- or outflow of sodium
and potassium ions through ion channels. Accordingly, some pharmaceuticals
act as antiarrhythmics by blocking the sodium channel. Other drugs inhibit the
potassium channel and extend the action potential of the cells. If a prolongation of
the so-called QT interval occurs, which is between the beginning and end of the
ejection phase of the heart beat, a dangerous arrhythmia can result. This can lead to
sudden heart racing (torsades de pointes tachycardia, Fig. 30.8), ventricular fibril-
lation, and cardiac arrest. The prolongation of the QT interval is caused by blocking
one potassium channel, the hERG channel (human Ether-à-go-go Related Gene).

The channel was found as a result of detailed genetic investigations of patients with
an inherited long-QT syndrome. An undesirable drug side effect can cause the same
condition when the hERG channel is inhibited by the administered drug. Even
though this side effect is rare, in acute cases it is extremely dangerous. It is
estimated that about 3,000 fatalities each year in the USA are attributable to such
adverse events. To avoid this side effect, attempts are now made to eliminate
binding to the hERG channel early in drug development. A structure
of this potassium channel is currently still unavailable. It is, however, related to the
bacterial KcsA channel that was discussed in the previous section.
An alanine-scan was carried out to determine which amino acids are decisive for
the inhibition. The altered binding of the potent class-III antiarrhythmic drug MK499
30.19 was tested. Two aromatic residues in this channel, Tyr652 and Phe656, proved
to be decisive. They are found on the four subunits in the interior of the broad cavern
before the entry into the selectivity filter (cf. Fig. 30.3, approximately at the height of
the potassium ion’s position). Moreover, the binding decreased even more when four
additional residues were replaced by alanine. With this information, homology
models of the hERG channel were constructed based on the crystal structure of the
KcsA channel. The residues that were determined to be critical are all oriented into
this cavern. Drugs that are held responsible for a prolonged QT interval fit in the
model so that an interaction with the aromatic residues is suspected. These model
considerations allow the construction of a superimposition model of known inhibi-
tors. It indicates that inhibitors bind with extended geometry and exhibit a charged
basic nitrogen atom in the center. This atom is in the middle of a pyramidal arrange-
ment formed by three to four hydrophobic aromatic moieties. This spatial pattern has
been further refined by using structure–activity relationships. It serves as a kind of
reference to check whether newly designed active substances could possibly bind to
the hERG channel. The goal of this design is not to optimize but to prevent binding.
The hERG channel is therefore considered an antitarget. In addition to these design
considerations, the actual hERG-channel inhibition of synthesized compounds
is now measured. In this way, an attempt is made in an early phase of drug discovery to avoid
the bitter and very expensive surprise of finding severe side effects later.
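
The spatial pattern described above can serve as a crude pre-filter before any measurement. The following Python sketch is only an illustration under stated assumptions: the record type, the function name, and both example candidates are hypothetical, and the check ignores the three-dimensional (pyramidal) arrangement that an actual pharmacophore model would evaluate.

```python
from dataclasses import dataclass


@dataclass
class LigandFeatures:
    """Hand-annotated features of a designed compound (hypothetical record)."""
    name: str
    has_charged_basic_nitrogen: bool   # protonated amine near the molecule's center
    hydrophobic_aromatic_groups: int   # aromatic moieties arranged around that nitrogen


def matches_herg_pattern(ligand: LigandFeatures, min_aromatics: int = 3) -> bool:
    """Flag compounds that show the pattern described in the text.

    The text speaks of three to four hydrophobic aromatic moieties; the
    default threshold of 3 is taken from that range. Geometry is ignored.
    """
    return (ligand.has_charged_basic_nitrogen
            and ligand.hydrophobic_aromatic_groups >= min_aromatics)


# Two made-up candidates, not real drugs.
candidates = [
    LigandFeatures("candidate A", True, 3),
    LigandFeatures("candidate B", False, 4),
]
for c in candidates:
    status = "check hERG binding" if matches_herg_pattern(c) else "pattern not matched"
    print(f"{c.name}: {status}")
```

A flag from such a filter is only a reason to look more closely; as the text notes, the actual channel inhibition still has to be measured.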

30.4 Tiny Ligands Gate Giant Ion Channels

There are multiple classes of ligand-gated ion channels. The nicotinic acetylcho-
line receptor, the 5-HT3 receptor, and the inhibitory glycine and GABAA
receptors belong to the first class: the Cys-loop superfamily. The first two are
excitatory receptors that respond to acetylcholine and serotonin. They are essential
for the fast nerve impulse transmission at the synapses. The inhibitory glycine and
GABAA receptors are controlled by glycine and γ-aminobutyric acid, respectively.
These ion channels have a common architecture. They form a pore in the mem-
brane, and open and allow the passive flow of ions in response to the binding of an
agonist. They have a pentameric construction. The composition of this
heteropentamer varies. A multitude of different receptors are composed from

a set of 17 homologous subunits (10 α, 4 β, as well as γ, δ, and ε units). Each of the
five subunits contributes a transmembrane domain of four helices, and together these
domains encircle the ion channel in their interior. Each pentamer has two extracellular
ligand-binding domains. In the center of the five transmembrane domains, the
innermost helices of the five domains, the so-called M2 helices, form the channel.
They have hydrophobic amino acids such as valine, phenylalanine, and leucine in
their mid-point that are responsible for the opening and closing.
Thanks to the ground-breaking work of Nigel Unwin, we have detailed glimpses
into the construction of a channel from this family, the nicotinic acetylcholine
receptor. By using an electron microscope on two-dimensional crystals, he managed
to derive a picture of this ligand-gated ion channel in the closed state (Fig. 30.9) from
the electric organ of the electric ray Torpedo species with 4-Å resolution. It traverses
the membrane with five subunits that form the ion channel, each composed of four
transmembrane helices. It protrudes about 60 Å over the membrane into the synaptic
gap. The ligand-binding domain, which is made of b-pleated sheets, is found here; it
carries the binding pocket for the neurotransmitter acetylcholine. After loading the
two-dimensional crystal with acetylcholine, Unwin was able to observe the resulting
conformational changes. He registered spatial rearrangements that were invoked in
the vicinity of the acetylcholine-binding region (Sect. 30.5) and then transmitted into
the transmembrane portion of the ion channel. In this way the receptor “feels” the
ligand binding and transmits this to the pore for ion permeability, which is 30 Å away.
The pore remains in an open state after ligand binding. The M2 transmembrane
helices, which are arranged directly in the interior of the channel, bear bulky,
hydrophobic amino acids. These surround the channel’s center in the closed state
like a hydrophobic belt (Fig. 30.9). The remaining opening of approximately 6 Å is
too narrow to allow Na+ or K+ ions with their hydration shells to pass through.
Because the channel does not have a polar environment similar to that of the potassium
channel described in Sect. 30.2, the ions cannot simply shed their hydration shells for
the passage. Passage through the membrane is therefore blocked for them.
By binding the ligand acetylcholine, a cascade of conformational changes is
initiated that are transmitted through to the M2 helices. This is accomplished by
a concerted rotation of all five M2 helices by about 15°. The tension on the
hydrophobic belt loosens. As a result, the pore widens by 3 Å, and this allows the
passage of sodium ions with their hydration shells in the open state. The work has
allowed a first fascinating glimpse into the function and dynamics of a ligand-gated
ion channel. Huge protein constructions react to the binding of a comparatively tiny
agonist. Information is transmitted over large distances. It is suspected that all
channels of the Cys-loop family work on this principle.
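
The effect of the 3 Å widening is larger than the number suggests, because the available cross-section grows with the square of the diameter. The short Python sketch below uses the two values from the text and assumes, for illustration only, that the narrowest point of the pore is circular.

```python
import math


def circular_area(diameter_angstrom: float) -> float:
    """Area of a circle with the given diameter, in square angstroms."""
    return math.pi * (diameter_angstrom / 2.0) ** 2


closed_diameter = 6.0                    # narrowest opening, closed state (from the text)
open_diameter = closed_diameter + 3.0    # the pore widens by 3 angstroms (from the text)

area_ratio = circular_area(open_diameter) / circular_area(closed_diameter)
print(f"open diameter: {open_diameter:.0f} angstroms")
print(f"cross-section grows by a factor of {area_ratio:.2f}")
```

Under the circular-pore assumption, the open pore offers a little more than twice the cross-section of the closed one.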

30.5 Ligands Gate as Agonists and Antagonists: The Function of an Ion Channel

In the meantime the extracellular domains of different acetylcholine receptors have
been successfully isolated and crystallized with bound ligands. Analogously to the

Fig. 30.9 In the crystal structure, the nicotinic acetylcholine receptor has a diameter of 80 Å, and
is 125 Å long (a). It represents a pentamer made from five subunits. It traverses the membrane with
the central region that is composed of four helices per monomer. An extracellular domain binds to
the ligands and additional helices attach on the cytosolic side. The narrowest position in the
interior of the channel (b) reduces to 6 Å in the closed state (middle, indicated by the white
surface). There a belt of hydrophobic residues constricts and prevents the passage of sodium ions.
Upon opening, the helices rearrange by a concerted rotation and expand the channel passage by
3 Å, which is enough to allow the sodium ions to pass with their hydration shells. The interior of
the channel is polar and has many acidic amino acids (b, yellow indication).

receptor that was introduced in Sect. 30.4, these acetylcholine-binding proteins
represent a pentamer. The binding protein of the California sea slug (Aplysia
californica) exists as a homopentamer. Its structure could be determined with
agonists and antagonists. This allowed insight into two very interesting aspects.
On the one hand, it illustrates the molecular origins of the conformational
rearrangement that is translated from the ligand-binding domain to the isthmus of
the ion channel, which then opens or closes. Agonists and antagonists differ
markedly in size in these examples. The agonist nicotine 30.20 is the main
alkaloid in tobacco (Fig. 30.10). In low doses it stimulates neurotransmission;
at high doses it leads to a permanent depolarization and blocks neurotransmission.
Epibatidine 30.21 naturally occurs in the skin of the Ecuadorian poison dart frog
and has strongly analgesic properties. Lobelia inflata, a flowering plant, contains
α-lobeline 30.22. The foliage of this plant was smoked by Native Americans
to treat asthma. At higher doses, the compound is exceptionally toxic. As a result of
the binding of this tiny agonist, a long loop lies across the binding site at the
receptor. The extracellular domains of the receptor then adopt a compact structure.
Upon antagonist binding, for example, by the peptide α-conotoxin 30.23, the loop

Fig. 30.10 As agonists, nicotine 30.20, epibatidine 30.21, and α-lobeline 30.22 open the nicotinic
acetylcholine receptor. The dodecapeptide α-conotoxin 30.23 and the diterpene alkaloid
methyllycaconitine 30.24 block the receptor as antagonists. All bind to the ligand-binding domain
of the pentameric receptor. They have functional groups (red) that can exist in a positively
charged state.
[Structural formulas: 30.20 Nicotine, 30.21 Epibatidine, 30.22 α-Lobeline, 30.23 α-Conotoxin,
30.24 Methyllycaconitine, 30.25 Pyrantel]

remains spread apart for steric reasons. The peptide α-conotoxin serves
a carnivorous sea snail, which lives in tropical oceans, as a venom. Because the
snail cannot bite, it shoots its venom, which is packaged in small chitin-coated
arrows that even have barbed hooks, through a sort of blowpipe. The diterpene
alkaloid methyllycaconitine 30.24 from the seeds of the medicinal plant larkspur
achieves the same effect. A movement of more than 10 Å is registered (Fig. 30.11)
in the receptor protein upon the binding of this ligand. This difference is transmitted
to the narrowest passage region of the channel via a cascade, and in doing so
regulates the sodium ion permeability.
As an additional aspect, these structures offer an insight into how chemically
completely different structures can evoke the same effect on a receptor.

Fig. 30.11 By binding an agonist or antagonist in the ligand-binding domain of the nicotinic
acetylcholine receptor, a loop (red) lies either directly on the binding site, or it remains spread apart
by about 10 Å (right). The conformational signal is transmitted to the channel isthmus, which is
30 Å away, and leads to the channel remaining closed or opening.

The antagonist α-conotoxin 30.23, a dodecapeptide with two intramolecular disulfide
bridges, binds with the geometry shown in Fig. 30.12. The plant alkaloid
methyllycaconitine 30.24 binds in the same pocket, but the binding areas of the
peptide and alkaloid overlap only in the central region. The agonists epibatidine
30.21, nicotine 30.20, and α-lobeline 30.22 also bind only in this region, but they
occupy a much smaller area. The diterpene and the above-mentioned agonists all
have a secondary or tertiary basic nitrogen atom that is most probably protonated in
the binding pocket. In all of these structures, this nitrogen atom lies in the vicinity of
a tryptophan residue. There an H-bond can be formed to the carbonyl group of the
amino acid, and a cation–π interaction with the neighboring aromatic ring can
also play an important role. The peptide α-conotoxin does not have a chemically
comparable nitrogen atom. It places, however, a positively charged arginine
residue in the vicinity of the tryptophan and, in doing so, provides a comparable
binding relationship. These structures certainly represent an extreme example of
bioisosteres. They illustrate, however, the diversity that living nature creates to
arrive at the same goal through entirely different molecular skeletons. Medicinal
chemists can only learn from these creative solutions!
Pyrantel 30.25 (Fig. 30.10) is an anthelmintic used to treat ascariasis and
enterobiasis (intestinal roundworm and pinworm infections, respectively). It is
related to the above-mentioned agonists in size and structure. It binds to the
nicotinic acetylcholine receptor of the worms. As a result, the ion channel opens,
and this leads to depolarization. Consequently, a neuromuscular blockade is initi-
ated, and the worms are paralyzed. They can then be washed out of the infected
intestines. Because of the poor absorption of the drug from the gastrointestinal tract,
it is safe for humans and well tolerated.

Fig. 30.12 Crystal structure of the ligand-binding domain of the nicotinic acetylcholine receptor
with the bound agonists epibatidine 30.21 (a) and α-lobeline 30.22 (b), the peptidic antagonist
α-conotoxin 30.23 (c) and the diterpene alkaloid methyllycaconitine 30.24 (d). Despite their very
different sizes, they all occupy the same binding site. In the case of the agonists a loop (Fig. 30.11)
lies across the binding site; in the case of the antagonists it remains spread apart. All of these ligands
have a positive charge with which they undergo a cation–π interaction with the aromatic ring of
a tryptophan in the vicinity.

30.6 Power Brake Boosters for GABA-Gated Chloride Channels

The glycine and GABAA receptors are inhibitory neuroreceptors because they
regulate the influx of chloride ions. This leads to hyperpolarization and lowers the
voltage-dependent excitability; the depolarization of the cell is hindered. Both of
these receptors are regulated by the low-molecular-weight ligands glycine 30.26
and γ-aminobutyric acid (GABA) 30.27 (Fig. 30.13). Anesthetics as well as alcohol

Fig. 30.13 Glycine 30.26 and γ-aminobutyric acid 30.27 regulate ligand-gated chloride channels.
Alfaxalone 30.28 and barbiturates such as barbital 30.29 open the GABAA receptor for
a prolonged period; benzodiazepines such as diazepam 30.30 enhance the effect of GABA and
activate the receptor by allosteric regulation.
[Structural formulas: 30.26 Glycine, 30.27 GABA, 30.28 Alfaxalone, 30.29 Barbital, 30.30 Diazepam]

modulate the activity of these receptors and lead to a stabilization of the channel in
the open state. Even cholesterol and other steroids can achieve this effect. The
synthetic pregnane steroid alfaxalone 30.28 opens the GABAA receptor for a longer
duration.
The GABAA receptor, like other members of the nicotinic receptor family, is
a heteropentamer whereby the simultaneous incorporation of an α-, β-, and γ-subunit
is required. The inhibitory effect of drugs such as barbiturates 30.29 or benzodiaz-
epines 30.30 on the channel is based on this regulation (Fig. 30.13). Presumably,
they exert an influence on the dynamic properties of the receptor and stabilize it in
the open state. The benzodiazepines amplify the effect of the endogenous ligand
GABA in that the excitability of the cell is hindered by opening the chloride
channel. They are allosteric regulators and are therefore also termed “power
brake boosters.” The barbiturate-binding site is on the β-subunit, whereas benzodiazepines
bind on the α-subunit. They act as sedatives, hypnotics, anxiolytics,
anticonvulsives, muscle relaxants, and anterograde amnestics (they cause memory
loss for the time the drug is in the system). Barbiturates have lost their importance
as sleeping pills and sedatives, above all because of their high addictive potential
and the risk of being used for suicide. They have been replaced by benzodiazepines,
which are better tolerated and cannot be used for suicide as a monosubstance.
Indeed, hardly any other substance class has illustrated the concept of
bioisosteric replacement as thoroughly as the benzodiazepines. As a result,
a plethora of derivatives are available that, depending on the individual profile
and pharmacokinetics, have opened therapeutic approaches for the treatment of

Fig. 30.14 Numerous benzodiazepines with varying pharmacodynamic profiles and kinetics have been
provided for therapy by the systematic bioisosteric replacement of the groups R1–R4. Flumazenil
30.31 represents an antagonist that reverses the sedating effects of benzodiazepines.
[Structural formulas: benzodiazepine scaffold (ring positions 1, 2, 3, 5, 7, and 2′ labeled) with
R1 = H, CH3; R2 = Cl, NO2; R3 = H, OH; R4 = H, F, Cl; ring variants with X = C, N; 30.31 Flumazenil]

insomnia, sedation, anxiety, and agitation, as well as for use as hypnotics or muscle
relaxants (Fig. 30.14). A seven-membered 1,4-diazepine ring onto which a benzene
moiety is condensed is common to all drugs. Moreover, a phenyl group is found in
the 5-position, which can be replaced with a thiophene ring as a bioisostere. Most
benzodiazepines have a lactam moiety in the seven-membered ring. This can be
replaced by an amidine or a condensed heterocyclic five-membered ring. The
lactam nitrogen atom bears an aliphatic substituent in many derivatives. A C=N
bond occurs as an additional unsaturated structural element that can also be
expressed as an N-oxide function. The 7-position of the condensed benzene core,
para to the lactam nitrogen atom, is usually blocked with a chlorine, bromine, or
nitro substituent. Such a group can help to increase the lipophilicity, but it also
blocks the activated position for metabolism and reduces the electron density on the
benzene core. Substituents in the 2’-position of the attached phenyl group also serve
to increase lipophilicity, but have a conformational effect too. The 3-position in the
seven-membered ring is also interesting. As an enantiotopic position, a chemical
change at this site leads to the introduction of a stereogenic center. 3-Hydroxylation
results in more hydrophilic derivatives that are absorbed more slowly. Benzodiazepines

with increased lipophilicity (alkylation at N1, chlorine substituents in the 7- and 2’-
positions) quickly reach their effective concentration in the central nervous system.
This causes the sedative and hypnotic components to be amplified. Increased
hydrophilicity (unsubstituted N1 atom, 3-hydroxylation, no 2’-halogenation) is
desired for the profile as a tranquilizer.
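
The qualitative structure–activity rules of this paragraph can be written down as a small scoring heuristic. The Python sketch below is an illustration only: the class, the scoring scheme, and the cut-off values are assumptions introduced here and are not a validated model. It simply counts the lipophilic features named in the text and subtracts one point for 3-hydroxylation.

```python
from dataclasses import dataclass


@dataclass
class Substitution:
    """Substitution pattern on the benzodiazepine scaffold (see Fig. 30.14)."""
    n1_alkylated: bool        # alkyl group on the lactam nitrogen N1
    chlorine_at_7: bool       # chlorine in the 7-position
    halogen_at_2prime: bool   # halogen in the 2'-position of the phenyl ring
    hydroxyl_at_3: bool       # 3-hydroxylation


def lipophilicity_score(s: Substitution) -> int:
    """+1 per lipophilic feature named in the text, -1 for 3-hydroxylation."""
    score = int(s.n1_alkylated) + int(s.chlorine_at_7) + int(s.halogen_at_2prime)
    return score - int(s.hydroxyl_at_3)


def expected_profile(s: Substitution) -> str:
    """Map the score to a profile; the cut-offs are arbitrary illustration values."""
    score = lipophilicity_score(s)
    if score >= 2:
        return "sedative/hypnotic"
    if score <= 0:
        return "tranquilizer"
    return "intermediate"


# The two extreme patterns listed in the text (hypothetical compounds).
lipophilic = Substitution(True, True, True, False)
hydrophilic = Substitution(False, False, False, True)
print(expected_profile(lipophilic))
print(expected_profile(hydrophilic))
```

Such a count captures only the trend in lipophilicity; it says nothing about metabolism or receptor subtype selectivity, which also shape the profile of an individual drug.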
Almost all benzodiazepines have agonistic effects and amplify the effect of
GABA. The modification to flumazenil 30.31 led to a compound with antagonistic
activity. It prevents the agonistic effects of benzodiazepines and reverses the
sedative effects. Interestingly, it is missing the phenyl substituent in the 5-position.
The above-described activity profile of benzodiazepines is broad and multifaceted.
Pharmaceutical research has therefore sought selective representatives of this
class that have only one quality of action, for example,
only an anxiolytic or only a sedative component.

30.7 The Mode of Action of a Voltage-Gated Chloride Channel

The structure of the nicotinic acetylcholine receptor was introduced in Sect. 30.4.
As mentioned, the ligand-gated chloride channels belong to this family and
have a pentameric architecture. The structural details of these channels are still
unknown because a high-resolution structure determination of such
a channel has not yet been achieved. On the other hand, it is possible to gain more
detailed insights into the architecture of another class, the voltage-gated chloride
channels.
Nine isoforms of these ClC channels are present in our genome. They take on
numerous physiological functions, for example, the control of the resting potential
in skeletal muscle and non-excitable cells. Moreover, they exert an influence on the
absorption of sodium chloride from the kidney into the blood stream or they are
involved in processes that are necessary for the establishment of an acidic milieu.
Malfunction and genetically caused mutations in these channels are associated with
diseases such as myotonia (a pathological muscle tension) or particular forms of
epilepsy, neuropathy, and osteopetrosis (a bone disease).
In 2003 the research group of Roderick MacKinnon managed to elucidate the
crystal structure of a bacterial ClC channel. It is constructed from two identical
subunits that are coupled through twofold symmetry. Interestingly, this membrane
protein does not have long helices that are oriented perpendicular to the membrane.
Rather, the 18 helices of this channel are packed tightly together and tilted up to 45°
to the membrane axis. The channel pore is reminiscent of the form of an hourglass.
The pore broadens to an atrium on the intracellular and extracellular sides, where
positively charged arginine residues are found in the vicinity (Fig. 30.15). The
channel narrows in the center over a distance of about 15 Å. A selectivity filter
together with a conserved glutamate residue is found at the apex. This residue takes
on the function of a gatekeeper. Additionally, two antiparallel-oriented
helices terminate exactly there. They form a preferred binding site for a negative charge.
For this, the helices must have the opposite orientation compared to the potassium

Fig. 30.15 Two long helices orient their positively polarized, N-terminal ends toward
the narrowest position in the channel in the crystal structure of the voltage-gated
ClC channel (a). Glu148 is found at this position, which acts as a gatekeeper and opens
and closes the channel. A conformational rearrangement of the negatively charged residue
opens passage for chloride ions. Upon passage, the chloride ion sheds its water shell.
This is replaced by coordination to the hydroxyl groups of Ser107 and Tyr445 and two
contacts to NH groups from the main chain (b).

channel. Here they have their N-terminal ends at the most narrow place in the
channel. As with the potassium channel, the dipole moment in the helices generates
a special binding site for negatively charged ions. In the crystal structure, the
carboxylate group of Glu148 is found exactly at this position. If this residue is
exchanged for a neutral glutamine, the position is freed, and the glutamine adopts
another position. Instead a bound chloride ion is then found in this position. Upon
mutation to glutamine, the channel is left in a permanently open state. It is assumed
that the two structures describe the open and closed state of the ClC channel. The
fact that the Gln148 mutant exhibits a chloride ion in this position underscores how
important the special position between the two oppositely oriented helix ends is for
the stabilization of a negative charge.
In addition to this chloride ion, two other chloride ions were found in the open as
well as the closed channel. One sits deep in the pore and has completely shed its
solvation shell. It is stabilized by two NH groups from the main chain and the OH
groups from Ser107 and Tyr445 (Fig. 30.15b). The other chloride ion is found at the
entry and is still partially solvated by water molecules.

The regulation via the glutamate as a placeholder allows the channel to open and
close based on external signals. The structurally related human ClC-0 channel is
voltage-gated when the potential on the interior of the cell shifts to the positive
range. An adjacent negative potential closes the channel. Upon increase of the
extracellular chloride ion concentration, the channel opens. The same can be
observed when the pH value of the environment drops. It is possible that the
glutamate residue changes its protonation state when it swings out of the cusp of
the pore to make way for the chloride ion. This would explain its regulatory
function under changing pH conditions and the stoichiometric exchange of Cl− for H+.
The ClC channels are specific for monovalent anions. In addition to chloride, albeit
with reduced permeability, Br−, I−, NO3−, and SCN− are also able to pass through.
Because the latter-mentioned ions play a subordinate role in biological systems,
a pronounced selectivity is not necessary. Nonetheless divalent ions such as sulfate
and hydrogenphosphate are denied passage. Time will tell how well the structure of
the bacterial channel reflects the properties of channels in higher organisms. It is in
question whether it is possible to modulate the functions of the channels with
ligands that can be developed into drugs.

30.8 Transporters: The Gatekeepers to the Cell

All cells need to be able to selectively transport endogenous and exogenous
compounds across the cell membrane. A large class of proteins that fulfills this
task is the membrane transporters. They ferry, for example, hormones, amino
acids, bile acids, uric acid, or lipids across the membrane barrier. Mutations in these
transporters accompany serious genetic diseases such as adrenoleukodystrophy
(which causes neurological degeneration) or retinal degeneration. An important
group of transporters is responsible for the economical return of released neuro-
transmitters from the synaptic gap (▶ Sect. 22.7, Fig. 22.7) into the presynaptic
nerve cell. This reuptake can be blocked with drugs. For example, serotonin and
noradrenaline reuptake inhibitors have been investigated very intensively and
successfully in the pharmaceutical industry. Frequently these inhibitors also
display an additional mode of action as antagonists against the corresponding
receptor on the postsynaptic side. These receptors belong to the GPCR family
and divide into a broad palette of subtypes (cf. ▶ Table 29.1). Because they bind
to entirely different proteins with apparently structurally related binding sites, these
inhibitors have varying pharmacological profiles and different side-effect spectra.
Transporters not only import compounds into cells, they also take on the task of
removing exogenous compounds from the cell. The majority of drugs belongs to the
group of exogenous, or xenobiotic compounds. Frequently drug resistance develops
during therapy, for instance, to drugs used to treat infections. Transporters that are
presumably expressed at increased levels to expel drugs from the cell are also responsible
for the development of multiple drug resistance (MDR). They exploit either
a proton gradient for the substance passage, or their transport is coupled with
the energetically favorable hydrolysis of ATP (in the ATP-binding cassette, or ABC, transporters).
The latter group of so-called ABC transporters represents a large family of pro-
teins that imports a broad palette of substances such as amino acids, ions, sugars,
lipids, or other drugs into the cell, but also removes them again. To date, 46 of these
ABC transporters have been identified in humans. They are composed of at least
two nucleotide-binding (NBD) and two transmembrane (TMD) domains. Several
structures of NBDs have been elucidated, and they all have a largely similar
construction. They bind ATP, which is essential for their operation. The TMDs
are decisive for the actual membrane passage. They ensure a buffer from the
hydrophobic membrane environment for hydrophilic substances.
The best-investigated transporter is the human MDR-ABC transporter
P-glycoprotein GP170 (MDR1/ABCB1). Like a hydrophobic vacuum cleaner, it
removes lipids as well as a broad palette of drug molecules from the cell. Electron
microscopy on 2D crystals afforded the first indications about its 3D structure and
mode of action (▶ Sect. 13.6). It traverses the membrane with 12 helices. The
NBDs are found on the cytosolic side. In its initial state, the transporter has low
affinity to ATP, and the two NBDs are found in spatially separate configurations.
The two transmembrane domains spread apart and open a cavity in the center that
can accept molecules with high affinity from the outer leaflet of the membrane’s
interior. The cavity seems to be highly adaptive which explains the transporter’s
pronounced substrate promiscuity and its ability to adapt to the requirements of
very different molecules. The substrate is passed from the membrane’s interior to the
exterior of the cell. Once initiated by substrate binding, the transporter undergoes
a dramatic conformational change that brings the two transmembrane domains
together again. The binding affinity for ATP increases. Simultaneously, the NBDs
complete their rotational movement and come together spatially. Presumably the
energetically favorable ATP hydrolysis is also coupled with this step. The activa-
tion barrier for the conformational transition is decreased. The substrate being
transported is released from the transmembrane domain into the exterior surface
layer of the membrane.
The development of resistance due to transporters represents a serious problem for
drug therapy. Therefore, it is all the more important to investigate the molecular
criteria that make molecules good substrates for these transporters. Consequently,
it can be understood how to modify molecules so that they are no longer good
substrates. This task is nontrivial because the binding pockets in these transporters
are evidently highly adaptive, and therefore the typically small changes to
a drug molecule that are tolerable to its mode of action have no effect on its binding
behavior to the transporters. On the other hand, potent inhibitors of these transporters
can be sought. Some compounds such as R-verapamil (▶ Sect. 2.6) have been
discovered to serve this purpose. Their clinical use for breaking resistance, however,
has proven problematic because the inhibition of the transporters also prevents their
natural function from being carried out. On the other hand, it must not be forgotten
that the inducible and heterologous expression of these transporters represents
a decisive defensive mechanism of the cells against xenobiotics. It is not without
cause that Nature has developed such a highly efficient and flexible protective
mechanism. Therefore, it is possible that these transporters do not represent an
ideal drug target in humans. This picture, however, may be very different for the fight
against bacteria and parasites. They also invoke such transporters to fight drug
molecules (cf. ▶ Sect. 3.2). The currently used weapons against bacteria and parasites
will eventually become ineffective. To break resistance, attempts have recently been
made to inhibit parasite and bacterial transporters. If these goals are achieved, it
would be a double success. On the one hand, resistance against older and well-proven
therapeutic drugs would be broken. On the other hand, the undesirable pathogens
would be additionally damaged, because the transporter would no longer be available
as a defense mechanism against undesirable foreign substances that are potentially
injurious for them. We must wait and see whether this concept, which is currently
being pursued in research, will bring the desired success.
Ion pumps also belong to the transporters that can carry ions across the mem-
brane against a concentration gradient with consumption of ATP. Recently crystal
structures have been elucidated for the first representatives of this protein class, the
so-called P-type pumps. Embedded by multiple long transmembrane helices, these
pumps undergo complex conformational rearrangements to accomplish their task.
These systems are also points of attack of very well-known and successful drugs.
Digoxin exerts its effect on the sodium/potassium pump (▶ Sect. 6.1). The proton
pump inhibitors omeprazole and pantoprazole block the H+/K+ pump in the stom-
ach (▶ Sect. 9.5).

30.9 Membrane Passage in Bacteria: Pores, Carriers, and Channel Formers

Gram-negative bacteria are surrounded by two membranes: an inner plasma membrane
and an outer membrane. These are separated by the periplasmatic space.
Although most proteins penetrate the inner membrane with a helical sequence
segment, interesting pores are found in the outer membrane that display
a pleated-sheet construction. They belong to the most commonly occurring proteins
in bacteria. Each of these openings, called porins, represents a water-filled channel
that allows the passive diffusion of nutritional building blocks into the cell and waste
products out of it. Their diameter is limited, which prevents potentially toxic
compounds from passing through. The porin structure from the bacterium Rhodobacter
capsulatus was first elucidated in the research group of Georg Schulz and Wolfram
Welte in Freiburg, Germany (Fig. 30.16). The pore exists as a trimer in which the
monomers are packed together in a triangular form. Each pore is formed by
a 16-stranded up-and-down β barrel (▶ Sect. 14.3), and the individual β strands
adopt an antiparallel orientation. The β barrel is a commonly occurring folding
pattern in enzymes. As a general rule, however, only up to eight strands
come together there and form the tightly packed core of the barrel-like structure.
Due to the large number of strands, there is enough room in porins to open a passage
to the interior. Nonetheless, it is partially closed by a long loop limiting the
remaining eyelets to a maximum diameter of 8 Å. The eyelet area is almost entirely
made up of positively and negatively charged amino acids that are oriented to the

Fig. 30.16 Crystal structure of the porin from the bacterium Rhodobacter capsulatus.
Each pore of the trimeric protein (only one monomer is shown) is made up of
a 16-stranded “up-and-down” β barrel. The pore traverses the membrane along the view
axis and broadens to about 8 Å. It is flanked by positively (blue) and negatively (red)
charged amino acids that establish an electrical field gradient across the membrane.

opposite sides of the pore. This orientation of charged groups also contributes to the
selection of molecules that can pass through the pore.
Bacteria also synthesize small peptide-like systems that penetrate the mem-
branes of other organisms and in doing so also offer a possibility for the passage
of, for example, ions. These systems are termed transport antibiotics. They render
the membrane permeable in different ways. The antibiotic gramicidin A is an
oligopeptide made of 15 amino acids that have alternating L and D configurations.
The peptide forms a tube-shaped helical structure and traverses the membrane as
a dimer (Fig. 30.17). This creates a channel with a diameter of 4 Å in the interior. It
is highly permeable for monovalent cations such as Na+ and K+. On the other hand,
multivalent cations and anions are prevented from entering. Up to 10⁷ cations per
second can pass through this channel, a transport rate that is only a factor of
10 below the diffusion rate in water. The cations must shed their hydration shells.
Then they apparently slide through the opening along the amide bonds that are
oriented parallel along the channel’s axis. The side chains of the hydrophobic
amino acids orient in the surrounding lipid membrane. The depsipeptide
valinomycin follows an entirely different mode of action. It is made up of valine,
lactate, and hydroxyisovalerate residues. It encapsulates the potassium ion with its
polar groups, which are oriented toward the interior. It presents its hydrophobic
groups to the outside. When wrapped into such a chelate–ligand complex, charged
ions can pass through the membrane barrier inside the covered, hydrophobic
particle. In addition to valinomycin, other such carriers are known, for instance,
nonactin (Fig. 30.18). These transport antibiotics alter the ion permeability of the
bacterial cell membranes and intracellular compartments. As a consequence, they
can cause bacterial cells to die. Valinomycin accumulates in, for example, the
mitochondrial membranes, increases the potassium influx, and in doing so disrupts
the mitochondrial energy homeostasis and ATP synthesis. The transport antibiotics
have importance as combination pharmaceuticals for external use, for instance, to
treat oropharyngeal infections.
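
The transport rate quoted above for gramicidin A can be translated into an electrical current, which is the quantity a single-channel recording would report. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the ions are taken to be monovalent, and the rate of 10⁷ ions per second is used as the upper bound given in the text rather than as a measured value.

```python
# Minimal sketch: convert an ion flux through one channel into a current.
# Assumption: every ion carries `valence` elementary charges and all ions
# move in the same direction. The function name is illustrative only.

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs, exact SI value


def single_channel_current_pA(ions_per_second: float, valence: int = 1) -> float:
    """Return the current in picoamperes carried by the given ion flux."""
    current_amperes = ions_per_second * valence * ELEMENTARY_CHARGE_C
    return current_amperes * 1e12  # amperes to picoamperes


if __name__ == "__main__":
    # Upper transport rate stated in the text for gramicidin A.
    rate = 1e7
    print(f"{single_channel_current_pA(rate):.2f} pA")
```

For 10⁷ monovalent cations per second this gives a current on the order of 1.6 pA, that is, in the picoampere range.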

Fig. 30.17 Gramicidin A (Val–Gly–Ala–Leu–Ala–Val₃–(Trp–Leu)₃–Trp–ethanolamine)
is made up of 15 alternating L- and D-configured amino acids and forms a narrow
channel of about 4 Å through the membrane along which monovalent cations such as
Na+ and K+ can migrate.

Recently the lipopeptide daptomycin has been introduced into therapy to fight
Gram-positive bacteria. The cyclic peptide penetrates the bacterial cell membrane
with its hydrophobic side chain. It forms channels for ions by oligomerization. This
causes the cell membrane to be permeable for potassium ions. Their efflux leads to
depolarization and finally to bacterial cell death. Peptides with 20–25 amino acids
such as magainin (Locilex®) use an analogous mechanism to form amphipathic
helices in the membrane.

Fig. 30.18 Nonactin is a chelating ligand that coordinates potassium ions. It wraps
optimally around the ion and can penetrate the membrane as a chelate complex. This
transport antibiotic presents its hydrophobic side chains to the exterior.

30.10 Aquaporins Regulate the Cellular Water Inventory

The cellular lipid double layer represents a barrier for water molecules. Despite an
osmotic gradient across the membrane, simple diffusion does not occur. Therefore,
larger amounts of water molecules cannot cross the membrane actively or passively, or in
association with other particles. In 1992, the group of Peter Agre in
Baltimore, MD, discovered a 28-kDa protein in the erythrocyte membrane that turned
out to be a water pore. It serves only for water transfer; neither ions nor other small
molecules such as glycerol or urea can pass through it. The direction of the water flow
is determined by the osmotic pressure alone. This first aquaporin, discovered in
erythrocytes, was termed AQP1. In the meantime, over 100 aquaporins have been
discovered in all possible organisms. Humans alone have over ten isoforms, seven of
which are used in the kidney at different sites. Some aquaporins are specialized
exclusively for water; others, despite their similar architecture, also allow the transfer of
small molecules such as glycerol and urea. The discovery of aquaporins has
revolutionized our understanding of the regulation of water homeostasis. Peter
Agre was awarded the Nobel Prize in 2003 for this achievement.
Sequence analyses of the aquaporins indicate an architecture constructed from
two almost identical segments. Each half contains a highly conserved Asn–Pro–
Ala (NPA) motif. The functional aquaporin unit is a tetramer in which each
monomeric unit encloses a pore. As the crystal structure determination shows,
each pore is made up of six transmembrane helices. The channel extends like
a hose through the protein and widens on the extracellular and cytosolic sides to a
15 Å funnel-shaped vestibule (Fig. 30.19b). At its mid-point it narrows to a diameter
of 2.8 Å. The vestibules have many polar, but mostly uncharged amino acids.
A chain of accessible carbonyl oxygen atoms stretches along the wall of the pore;
these are presumably involved in passing the transitory water molecules along
(Fig. 30.19a). The opposite wall is made up of hydrophobic residues. Both impart
amphipathic character to the hose-shaped selectivity filter. The geometry of the

Fig. 30.19 An aquaporin widens like a funnel on the extracellular and cytosolic sides (a). At the
narrowest position, the pore reduces to about 2.8 Å. At this site, positively charged His and Arg
residues are opposite one another and this prevents ion passage. As if on a string, the water
molecules migrate through the channel as they are passed along by hydrogen bonds to the carbonyl
oxygen atoms (a). The carbonyl groups are on one side of the channel, the opposite wall is made up
of hydrophobic amino acids. A cysteine residue is found in the vicinity of the isthmus that can
complex mercury ions and clog the channel. This explains the diuretic effects of mercury salts.
Residues labeled in the figure panels: Arg197, His182, Cys191.

carbonyl groups that are arranged toward the interior is reminiscent of the selec-
tivity filter in the potassium channel. Because they are only found on one side of the
channel, they cannot completely replace the hydration shell around a cation.
A cation that wanders into the pore is therefore too large to pass through the
pore. A histidine and an arginine are found at the narrowest point of the isthmus.
A phenylalanine is found on the opposite side. These three amino acids are highly
conserved among the porins that specialize in water permeability. Because of the charge
on His and Arg, they cause a further sieving for positively charged ions, even H3O+.
Negatively charged ions are so strongly repelled by the many negatively polarized
carbonyl groups that their passage is energetically much too unfavorable. The
channels that allow glycerol to pass in addition to water have an additional 1 Å in
diameter at their narrowest position. Simultaneously, the histidine, which is
conserved in exclusive water channels, is replaced with a glycine. Altogether, the
glycerol-permeable channel has a somewhat more hydrophobic character.
Aquaporins occur virtually ubiquitously in our bodies, though in larger numbers
and diversity in the kidney. To achieve quick control over their function, they are
partly stored in vesicles. When needed, the vesicles fuse with the cell membrane.
In this way, the number of active aquaporins is increased. The water channels
represent an outstanding target structure for therapeutic intervention. In addition to
the development of diuretics, their use for the treatment of glaucoma or obesity, or to
fight angiogenesis in tumors, has been discussed. They have also moved into the
focus of research as a target for the development of drugs to treat parasitic
infections. Interestingly, mercury salts were used as diuretics a long time ago.
The thiol group of an accessible cysteine residue is found in the upper pore region
of AQP1 (Fig. 30.19b). Presumably, the mercury ion blocks the pore by coordina-
tion to this cysteine. Because of their toxicity, mercury salts are certainly not drugs
of choice. Time will tell whether research can find potent and selective alternatives
that can intervene in the targeted regulation of aquaporins to treat diseases that are
associated with their misregulation.

30.11 Synopsis

• Cells require material exchange across the membrane. Amphiphilic compounds
can diffuse through the membrane of their own accord. For the transfer of polar
compounds, cells are equipped with special transporters that sometimes exhibit
remarkable selectivity, but sometimes have broad promiscuity too.
• For the biologically relevant ions (Na+, K+, Ca2+, Cl−) special ion channels exist
that allow ions to flow along a concentration gradient building up an
electrochemical potential across the membrane.
• Cells are stimulated by action potentials. In the resting state a potential of
−70 mV is maintained. The extracellular sodium ion concentration is tenfold
higher than in the cell interior. At a potential of about −60 mV fast sodium ion
channels open allowing Na+ influx and gradually shift the potential to +40 mV.
They close at this value.
• Repolarization of the cell results from an efflux of potassium ions through slow
and highly selective potassium channels. If the membrane potential falls to more
negative values than the resting state, this is termed hyperpolarization. In some
cells this state can also be induced by a Cl− influx through chloride channels.
In other cells, an influx of Ca2+ ions through specific channels can intensify the
depolarization across the membrane.
• KcsA potassium channels cross the membrane as tetramers with long helices.
Four helices that are oriented with their N-terminal ends into a central cavern
drag the positively charged ions across the membrane. The potassium ions,
which are coordinated with eight water molecules in a quadratic-antiprismatic
geometry, pass through a selectivity filter by shedding their solvation shell. The
protein provides a fourfold arrangement of backbone carbonyl groups that
perfectly replaces the water coordination sphere around potassium ions thus
achieving impressive selectivity over other cations.
• Some potassium channels are ATP-dependent with several domains and regulatory
units. Sulfonylureas can block the regulatory unit on the pancreatic β-cells
that is responsible for insulin secretion. This is exploited as a therapeutic
principle for the treatment of type-II diabetes mellitus because blocking the
regulatory unit results in enhanced insulin secretion from the β-cells.
• Depolarization and repolarization of the heart muscle cells are important for
correct control of the heart beat frequency. Drug molecules with a particular
pattern of aromatic moieties and a central basic nitrogen can block the hERG
channel, a potassium channel involved in the regulation of heart beat. A fatal
arrhythmia can occur. Therefore, potential binding to the hERG channel as an
anti-target is avoided in the early phase of drug discovery.
• Ligand-gated ion channels are huge transmembrane constructions of pentameric
architecture with 20 transmembrane helices. The extracellular ligand–binding
domains, also of pentameric geometry, accommodate binding sites of agonists
and antagonists and transmit the signal of ligand binding through a cascade of
conformational changes to the isthmus of the channel pore. The pore widens by
3 Å by concerted rotations of the five innermost helices; this allows passage of
sodium ions with their hydration shell.
• The ligand-binding domains of the pentameric ion channels can be addressed in
the case of the AChR with agonists such as nicotine or antagonists such as
α-conotoxin. Allosteric regulators are known for the GABA-gated chloride
channel of similar construction. They can amplify the effect of the endogenous
ligand GABA which blocks the excitability of the cells by opening the chloride
channel.
• Benzodiazepines bind to the α-subunit of the GABA-gated chloride channel and,
depending on their substitution patterns, act as sedatives, hypnotics, anxiolytics,
anticonvulsives, muscle relaxants, or anterograde amnestics.
• The voltage-gated ClC chloride channels orient two extended helices with their
N-terminal ends toward the center of the channel. Together with a conserved
glutamate residue at the apex, they achieve the required selectivity, possibly via
an intermediate change of protonation state of a glutamate residue taking the role
of a gatekeeper.
• Transporters shuttle endo- and exogenous compounds across the cell membrane.
Transport is usually coupled with the energetically favorable hydrolysis of ATP
to allow membrane passage against a concentration gradient. Particularly the
human MDR-ABC transporter P-glycoprotein GP170 is upregulated in drug
resistance and removes a broad palette of drug molecules from the cell.
• Bacteria have developed special transporter systems to either allow access to
cells, or to penetrate the membrane of other organisms. One class of pores is
formed by large β barrels of antiparallel-oriented strands that open a passage to the
interior. Other systems either wrap around cations to form hydrophobic carriers
on their exteriors, or penetrate into membranes with helix-forming elements to
build up channels that make them permeable for ions.
• Despite an osmotic gradient, water molecules cannot diffuse passively across the
membrane. The regulation of water homeostasis is performed by aquaporins,
which are channels that extend like a hose through the membrane-bound protein.
They have a 15-Å-wide funnel-shaped vestibule on both sides and narrow to
a diameter of 2.8 Å in the center. A chain of accessible carbonyl oxygen
atoms stretches along one side of the pore and passes the transitory water
molecules along. The opposite wall is made up of hydrophobic residues. At
the isthmus, charged His and Arg residues prevent permeation of cations. To
achieve quick control over their function, aquaporins are partly stored in vesicles
and fused with the cell membrane as needed.
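
The tenfold sodium gradient mentioned in the synopsis can be related to a voltage with the Nernst equation. That equation is not stated in this section and is brought in here as an assumption, as are the body temperature of 37 °C and the function name. The result is the equilibrium potential of a single ion species, not the membrane potential of a real cell, which depends on the permeabilities of several ions at once.

```python
# Minimal sketch: Nernst equilibrium potential for one ion species.
# Assumptions (not from the text): the Nernst equation itself, a temperature
# of 310.15 K (37 degrees Celsius), and the function name.
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)
FARADAY = 96485.33212       # C/mol


def nernst_potential_mV(conc_outside: float, conc_inside: float,
                        valence: int = 1,
                        temperature_K: float = 310.15) -> float:
    """Return E = (R*T / (z*F)) * ln(c_out / c_in) in millivolts."""
    volts = (GAS_CONSTANT * temperature_K / (valence * FARADAY)
             * math.log(conc_outside / conc_inside))
    return 1000.0 * volts


if __name__ == "__main__":
    # Only the ratio matters: extracellular Na+ is tenfold higher than inside.
    e_na = nernst_potential_mV(conc_outside=10.0, conc_inside=1.0, valence=1)
    print(f"Na+ equilibrium potential: {e_na:+.1f} mV")
```

Under these assumptions a tenfold outward-to-inward sodium ratio corresponds to roughly +61.5 mV, which lies above the peak of about +40 mV quoted for the action potential.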

Bibliography

General Literature
Cascio M (2006) Modulating inhibitory ligand-gated ion channels. AAPS J 8:E353–E361
Higgins C (2007) Multiple molecular mechanisms for multidrug resistance transporters. Nature
446:749–757
MacKinnon R (2003) Nobel lecture, potassium channels and the atomic basis of selective
ion conduction, Accessed on 6 June 2012 from http://nobelprize.org/nobel_prizes/chemistry/
laureates/2003/mackinnon-lecture.html
Sanguinetti MC, Mitcheson JS (2005) Predicting drug–hERG channel interactions that cause
acquired long-QT Syndrome. Trends Pharmacol Sci 26:119–124
Sather WA, McCleskey EW (2003) Permeation and selectivity in calcium channels. Ann Rev
Physiol 65:133–159
Sui H, Han BG, Lee JK, Walian P, Jap BK (2001) Structural basis of water-specific transport
through the AQP1 water channel. Nature 414:872–878
Triggle DJ, Gopalakrishnan M, Rampe D, Zheng W (2006) Voltage-gated ion channels as drug
targets. In: Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal
chemistry, vol 29. Wiley, Weinheim
Unwin N (1993) Nicotinic acetylcholine receptor at 9 Å resolution. J Mol Biol 229:1101–1124
Unwin N (1995) Acetylcholine receptor channel imaged in the open state. Nature 373:37–43
Unwin N (2003) Structure and action of the nicotinic acetylcholine receptor explored by electron
microscopy. FEBS Lett 555:91–95
Vaz J, Klabunde T (eds) (2008) Antitargets. Prediction and prevention of drug side effects. In:
Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry,
vol 38. Wiley, Weinheim

Special Literature

Doyle DA, Cabral JM, Pfuetzner RA, Kuo A, Gulbis JM, Cohen SL, Chait BT, MacKinnon R
(1998) The structure of the potassium channel: molecular basis of K+ conduction and selec-
tivity. Science 280:69–77
Dutzler R (2004) Structural basis for ion conduction and gating in ClC chloride channels. FEBS
Lett 564:229–233
Shi N, Ye S et al (2006) Atomic structure of a Na+- and K+-conducting channel. Nature
440:570–574
31 Ligands for Surface Receptors

In ▶ Chap. 29, “Agonists and Antagonists of Membrane-Bound Receptors,” receptors
were discussed that allow a signal transduction from outside the cell to its
interior. Numerous processes in the cell are initiated by these systems that alter
the cell’s state. In addition to this type of information exchange, a cell must also
have other ways to remain constantly in contact with its environment. To accomplish
this task, cells have many other surface receptors. For example, the cell’s
integrin receptors not only can accept signals from outside, they can also transmit
signals into the environment. If a cell moves, for example in a blood vessel or in
tissue, it must remain in constant communication with the environment during the
translocation. In this way, leukocytes find their way to sites of infection as part of
the immune response to pathogens. For this, they receive signals from the environment
through their surfaces by using special surface receptors. In viral diseases,
a virus attempts to adhere to a host cell and finally to penetrate the cell. The
recognition of endogenous cell-surface receptors or special adhesion molecules
initially occurs before the target cell under attack can be reprogrammed for the
invasion process. After viral maturation and reproduction, the new virus must be
budded and released from the infected host cell (exocytosis). Proteins that are
exposed to the surface also regulate this process. Drugs can be used to intervene
in both processes: the attack and release of viruses. Our immune system uses
specific surface proteins to distinguish between diseased and healthy cells. Influenc-
ing these processes leads to immune stimulation. The structure and function of the
above-mentioned surface receptors shall be discussed in this chapter. How specific
ligands can suppress or reprogram the actual tasks of these surface receptors to lead
to a successful therapeutic concept shall be explained.

31.1 The Family of Integrin Receptors

Integrin receptors are responsible for bidirectional communication between cells.
As surface-exposed receptors, they penetrate the membrane and possess an architecture
with intra- and extracellular domains. With their extracellular portion,

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_31,
© Springer-Verlag Berlin Heidelberg 2013

which is easily accessible to possible active substances, they interact with the
extracellular matrix and mediate cell adhesion. This property has already been
used to reconstruct the contact between bones or bone implants and the surrounding
tissue. An improved regeneration of the tissue around bones can be achieved by the
adhesion of the extracellular domains of integrin receptors or the fixation of ligands
that stimulate these receptors.
The integrins are found in almost all types of cells in mammals. The family of
integrins is divided into numerous subtypes, and multiple subtypes can be simultaneously
expressed on the same cell. They react quickly to external signals, that is, in
less than a second. They have the complex structural constitution of a heterodimeric
membrane protein; α and β subunits are distinguished, each of which is composed of
multiple domains. Some subtypes display an additional insertion domain. Several
divalent calcium and magnesium ions that form the so-called metal-ion-dependent
adhesion site (MIDAS) are essential for the function of integrin receptors. To date,
18 α- and 8 β-subunits have been characterized in humans. They can be combined to
form heterodimers with different compositions. Until now, 24 different combinations
of these subunits have been identified in integrin receptors. The nomenclature for the
receptors follows this convention: they are termed αxβy receptors,
whereby x is expressed as a Roman numeral, and y is an Arabic number.
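The subunit bookkeeping above can be made concrete with a short sketch. It only counts the theoretically possible pairings from the numbers quoted in the text (18 α- and 8 β-subunits, 24 observed heterodimers). The constant and function names are illustrative assumptions, and no actual list of subunits is implied.

```python
# Counts taken from the text; all names here are illustrative only.
N_ALPHA = 18      # characterized human alpha subunits
N_BETA = 8        # characterized human beta subunits
N_OBSERVED = 24   # heterodimers identified so far

# Every alpha could in principle pair with every beta.
possible = N_ALPHA * N_BETA
fraction = N_OBSERVED / possible


def receptor_name(alpha: str, beta: str) -> str:
    """Compose a receptor label following the alpha-x/beta-y convention."""
    return f"α{alpha}β{beta}"


print(possible)                    # 144
print(f"{fraction:.1%}")           # 16.7%
print(receptor_name("IIb", "3"))   # αIIbβ3
```

Only about one in six of the conceivable pairings is therefore realized, which illustrates that the subunits do not combine freely.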
Signal processing occurs via a complex scheme of multiple sequential confor-
mational transformations. The completed transformation is reminiscent of the
opening of a pocketknife (Fig. 31.1). The folded receptor geometry initially goes
into a twisted geometry as the knife blade and the handle spread apart, and then
goes into an open horseshoe-like form. This geometry is presented when the receptor
is in an active state. The extracellular domain of the activated receptor is available for
interactions with other proteins. The binding takes place via a so-called
β-propeller-like domain and an insertion domain (I-like domain, Fig. 31.1), which is transferred
to the active state by the conformational changes outlined. At the same time, the
active conformation makes the MIDAS binding site available. The described struc-
tural considerations are based on crystal structure determinations made on the
individual domains of the receptor. Assembling these individual building blocks
allows an overview of the composition of the total construction. Nonetheless, this
approach cannot provide a more accurate picture of the individual intermediate
conformations that the receptor passes through during its activation.
The construction and function of the αIIbβ3 receptor shall be considered in
detail as an example. Fibrinogen receptor antagonists have been successfully developed
for this receptor and introduced into therapy. The αIIbβ3 receptor plays an
important role in the coagulation cascade. It occurs on the surface of platelets
(thrombocytes). In the resting state, about 50,000–70,000 inactive copies of this
receptor are available. If an injury occurs that stimulates blood coagulation, an
additional 50,000 receptors are transferred from the interior to the surface and are
conformationally activated. The receptor can now bind ligands that contain
a specific motif: an Arg–Gly–Asp (RGD motif) sequence. Fibrinogen, a dimeric
soluble plasma protein, contains such a motif and reacts with the αIIbβ3 integrin
receptor on the surface of the activated platelet. This crosslinking leads to

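[Figure 31.1: schematic of the α and β subunits of an integrin receptor in the inactive and active states. Labeled elements: receptor binding site, MIDAS binding site, β-propeller, I-domain, I-like domain, thigh, calf-1, calf-2, PSI, hybrid, EGF1+2, EGF3, EGF4, β-tail.]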

Fig. 31.1 Integrin receptors have a complex structural construction consisting of a membrane-
bound heterodimer made from α- and β-subunits. Each subunit is constructed from multiple
domains, and some subtypes show an additional insertion domain (I-domain). The signal processing
occurs by a complex scheme of sequentially occurring conformational modifications that progress
from an inactive folded structure to an active horseshoe form. Multiple divalent calcium and
magnesium ions that form a metal-ion-dependent adhesion site (MIDAS) are essential for the
function. The receptor–ligand-binding region is on the β-propeller domain and an I-like domain.

aggregation of the platelets and initiates the formation of a thrombus for the wound
closure, a so-called primary or cellular hemostasis. Via a second docking site on the
platelet, the forming blood clot binds to the von Willebrand factor, which is
produced by endothelial cells. A permanent connection is then created between
the aggregating blood platelet and the vascular wall by this contact.
A blockade of the surface receptors on the blood platelet leads to an arrest in the
coagulation process. Because this process is broadly needed over the entire organism,
internal bleeding could be the consequence. The common saw-scaled viper
or carpet viper (Echis carinatus) uses this active principle in its venom to subdue its
prey. Because these snakes are often found in the vicinity of human settlements in Africa and
Asia, their bite has already been fatal for some members of our species as well. Their
venom contains a 49-residue peptide that has an RGD sequence in its center. A drug
following this inhibitory principle is desirable to achieve a local anticoagulatory
effect. This is of interest in the context of angina pectoris, myocardial infarction,
stroke, atherosclerosis, or in emergency medicine to prevent ischemic complications.

31.2 Successful Design of Peptidomimetic Fibrinogen Receptor Antagonists

As described in the last section, antagonists of the αIIbβ3 integrin receptor, found on
the surface of thrombocytes, represent a rewarding point of attack for the

[Figure 31.2: chemical structures of the cyclic peptides 31.1, 31.2, 31.3, and eptifibatide 31.4, and of the benzodiazepines 31.5 (Ki = 2.3 nM) and lotrafiban 31.6 (Ki = 2.3 nM).]

Fig. 31.2 The bound conformation of the RGD motif of the natural ligand fibrinogen to the
αIIb/β3 integrin receptor subunits could be determined by using structurally rigid cyclopeptides.
They served as the first lead structures for the development of non-peptidic receptor antagonists
such as the benzodiazepines 31.5 and 31.6. The cyclopeptide eptifibatide 31.4 was introduced to
therapy as a drug.

development of anticoagulants. Fibrinogen initiates the coagulation cascade by
interacting with the αIIbβ3 receptor with a sequence containing the tripeptide
motif Arg–Gly–Asp (RGD motif). First, the conformation in which this tripeptide
binds to the receptor had to be determined. For this,
cyclic pentapeptides with the Arg–Gly–Asp sequence were synthesized. The
peptide cyclo-(Arg–Gly–Asp–Phe–D-Val) 31.1 (Fig. 31.2) proved to be a high-affinity
ligand for the αIIbβ3 receptor with an IC50 = 2 nM.
NMR spectroscopic investigations showed that this cyclic pentapeptide adopts the
conformation of a β-turn. Years later, this geometry was confirmed in a crystal
structure with the structurally related pentapeptide cyclo-(Arg–Gly–Asp–Phe–D-MeVal)
31.2 (Fig. 31.3).

[Figure 31.3: crystal structure of cyclopeptide 31.2 bound to the receptor. Labeled elements: β-propeller domain, Asp224, Asp232, Tyr122, Ca2+, MIDAS domain.]

Fig. 31.3 Crystal structure of the αIIb/β3 integrin receptor with the cyclopeptide 31.2. The
structure confirms the assumption that the peptide is in a β-turn conformation at the receptor.
The peptide’s RGD motif binds in an extended geometry with its arginine residue between two
aspartic acids in the propeller domain, and with the aspartic acid residue to the metal ions in the
MIDAS binding site.

Additional highly potent peptidic structures were found; among others,
the cyclic peptide 31.3, with a disulfide bridge, was discovered at SmithKline
Beecham. Another cyclopeptide, eptifibatide 31.4, which is also stabilized by
a disulfide bridge, was introduced into therapy in 1999 by COR Therapeutics
under the name Integrilin®. However, the actual goal of arriving at non-peptidic
low-molecular-weight structures had not yet been achieved. Therefore, small organic
molecules were sought with functional groups that could mimic the orientation of the
side chains of arginine and aspartic acid in 31.1–31.4.
The researchers at SmithKline Beecham concentrated on benzodiazepine deriv-
atives. This structural class displays two favorable properties. On the one hand,
benzodiazepines have been intensively investigated by synthetic chemistry, and
many derivatives are easily accessible. On the other hand, benzodiazepines are rigid
and therefore outstandingly well suited for conformational stabilization. Moreover,
they had already been extensively investigated as β-turn mimetics (▶ Sect. 10.5).
A comparison of multiple benzodiazepine derivatives with the peptidic lead struc-
ture indicated that derivative 31.5 should be able to position the Arg and Asp side
chains exactly as in 31.3. In fact, 31.5 proved to be a potent fibrinogen receptor
antagonist (Ki = 2.3 nM). Through further modifications, lotrafiban 31.6 was
identified as a candidate for clinical studies (Fig. 31.2). The compound later failed in
the clinic because of inadequate efficacy and isolated fatalities.
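To put an inhibition constant in the nanomolar range into thermodynamic terms, the following minimal sketch converts the Ki of 31.5 into a standard free energy of binding using the general relation ΔG° = RT ln Ki. It assumes that Ki can be treated as a dissociation constant relative to a 1 M standard state and that T = 298.15 K. Neither assumption is stated in the text, and the function name is illustrative.

```python
import math

R = 8.314462618   # molar gas constant in J/(mol*K)
T = 298.15        # assumed temperature in K (not given in the text)


def binding_free_energy(k_molar: float, temperature: float = T) -> float:
    """Return the standard free energy of binding in kJ/mol.

    k_molar is an inhibition or dissociation constant in mol/L,
    referenced to a 1 M standard state.
    """
    return R * temperature * math.log(k_molar) / 1000.0


dg = binding_free_energy(2.3e-9)   # Ki of 31.5 as quoted in the text
print(f"{dg:.1f} kJ/mol")          # -49.3 kJ/mol
```

Under the same assumptions, every tenfold change in Ki corresponds to roughly 5.7 kJ/mol at this temperature.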
The research group at Searle went in a somewhat different direction (Fig. 31.4).
The starting point here was the peptide Arg–Gly–Asp–Phe (31.7, IC50 = 29 μM).
In the first step, the dipeptide fragment Arg–Gly was exchanged for an
8-guanidinooctanoyl group (31.8, IC50 = 3 μM). Inspired by the result with thrombin
inhibitors indicating that an alkylguanidine group can be replaced by
a benzamidine, such a moiety was introduced. This gave a dramatic increase in
the binding affinity (31.9, IC50 = 0.072 μM). Although this compound was not
orally available, SC-52012 was the first fibrinogen receptor antagonist from Searle

[Figure 31.4: chemical structures of 31.7, 31.8, 31.9 (SC-52012), 31.10, 31.11 (xemilofiban), 31.12 (sibrafiban), and 31.13 (tirofiban, labeled IC50 = 35,7 nM).]

Fig. 31.4 By starting with the linear peptide Arg–Gly–Asp–Phe 31.7, xemilofiban 31.11 was
obtained by stepwise modification. The ethynyl group instead of the pyridine ring does not change
the binding affinity but it significantly increases the bioavailability. A similar development
candidate, sibrafiban 31.12, was already tested but was not pursued to a marketed product because
of bleeding problems. Tirofiban 31.13 was introduced to the market for the emergency prevention
of ischemic complications due to a thrombus in the course of a stroke or heart attack.

for which a clinical trial as an i.v. application was undertaken. The goal of the work
was no longer a further increase in the binding affinity, but rather an improvement
in the bioavailability. For this, derivatives with reduced molecular weight were
preferentially investigated. It was shown that the C-terminal amino acid,

phenylalanine, could be replaced with a simple pyridine ring without a massive loss
in affinity. By additionally esterifying the carboxylate group, the Searle research
group arrived at a compound with weak oral activity. Compound 31.10 is a prodrug
that is quickly transformed in the body by esterases to the free carboxylate, which is
the actual active substance (IC50 = 0.15 μM for the free carboxylate). Finally,
aminobenzamidino succinates were investigated. Here the idea was to increase the
affinity by forming an additional H-bond to the receptor by reintroducing an amide
group. Indeed, 31.11 is a highly potent fibrinogen-receptor antagonist (IC50 =
0.067 μM for the free acid). The compound is well absorbed after oral administration.
Searle introduced xemilofiban 31.11, as the compound was later named, into
clinical studies, which were, however, discontinued in phase III.
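The course of the Searle optimization can be followed numerically. The sketch below computes the change in potency from one compound to the next, using only the IC50 values quoted in the text. Fold changes are ratios and therefore do not depend on the concentration unit, as long as all values share the same one. The function name and the list layout are illustrative.

```python
# IC50 values quoted in the text for the Searle series (all in the same unit).
series = [
    ("31.7", 29.0),
    ("31.8", 3.0),
    ("31.9", 0.072),
    ("31.10 (free carboxylate)", 0.15),
    ("31.11 (free acid)", 0.067),
]


def fold_change(before: float, after: float) -> float:
    """Factor by which potency improves (>1) or drops (<1) between two IC50 values."""
    return before / after


# Compare each compound with its predecessor in the series.
for (prev_name, prev_ic50), (name, ic50) in zip(series, series[1:]):
    print(f"{prev_name} -> {name}: {fold_change(prev_ic50, ic50):.2f}x")

overall = fold_change(series[0][1], series[-1][1])
print(f"overall: {overall:.0f}x")   # overall: 433x
```

The step from 31.9 to 31.10 gives a factor below 1. This reflects the text's point that affinity was deliberately traded for a lower molecular weight and better bioavailability at that stage.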
The work at Roche had led to the comparable development candidate
sibrafiban 31.12. A double prodrug came into the clinical trials. The company
undertook a broad study on 9,000 high-risk patients with this compound. At low
doses, an effect comparable to ASA (Aspirin®) was found. At higher doses,
bleeding problems increased significantly. The development of this compound
was therefore abandoned. Despite many clinical studies with a large number of
development candidates, only Merck introduced a non-peptidic receptor antagonist,
tirofiban 31.13 (Aggrastat®), for emergency medicine to prevent ischemic
complications associated with a thrombus as a result of a stroke or heart attack.
According to the established RGD pharmacophore pattern of a basic group,
a bridge, and an acidic group, 31.13 was formed as an inhibitor with an IC50 =
375 nM (Fig. 31.5) by replacing the benzamidino group with a piperidine ring,
and by abandoning the amide group in the bridge between the basic group and the
acid function. Because it has inadequate oral availability, it is administered
intravenously. Time will tell whether fibrinogen-receptor antagonists will
achieve importance in the therapy of thrombotic diseases over and above their
use in emergency medicine.

31.3 Selectins: Surface Receptors Recognizing Carbohydrates

Leukocytes, white blood cells, are transported through the body with the blood
flow. Their principal task is to defend against pathogens during inflammatory
processes. To achieve this task, they initially must be stopped in the normal
blood flow in the vessel that runs alongside the site of inflammation (Fig. 31.6).
This deceleration is made apparent in a type of leukocyte rolling adhesion. Surface
receptors that are also found on the rolling leukocyte are involved in stopping the
leukocytes. On the other hand, in cases of inflammation and in the vicinity of
the actual site, selectins are increasingly expressed on the cell surface of the
endothelium. Temporary contacts that are weak but very selective sugar–protein
interactions are responsible for deceleration. Finally the leukocyte is stopped
completely. Integrins on the leukocytes that interact with
intercellular adhesion molecules (ICAMs) on the endothelium are responsible for this. In the last step,
the leukocytes leave the blood vessel (extravasation). After their migration to

[Figure 31.5: superimposed binding modes of tirofiban 31.13 and eptifibatide 31.4. Labeled elements: Asp224, Tyr122, two Ca2+ ions, and one Mg2+ ion.]

Fig. 31.5 Superposition of the crystal structures of eptifibatide 31.4 with tirofiban 31.13 and the
αIIb/β3 integrin receptor. Peptidic as well as non-peptidic marketed products bind on one side to the
aspartic acids of the propeller domain and on the side opposite to the metal ions of the MIDAS
binding site. The example demonstrates how amino-acid residues can be replaced by other, non-
peptidic groups.

the site of inflammation, they fight the infection by releasing cytokines and
degrading substances. The latter attack the inflammation site both oxidatively and
proteolytically.
Some inflammatory processes lead to vascular damage by excessive leukocyte
infiltration, for instance in conjunction with a heart attack (reperfusion), by chronic
irritation as in rheumatoid arthritis, in atherosclerosis, diabetic angiopathy, or during
a carcinoma metastasis. In such situations, a therapeutic concept that intervenes in the
inflammatory cascade lends itself well to reducing excessive leukocyte infiltration.
This can be achieved by binding an antagonist to the selectins.
The selectins belong to the large group of lectins, a family of complex glycoproteins.
They form interactions with carbohydrate structures and are able to achieve
the anchoring between cells and/or cell membranes through these contacts. The
selectins are a subgroup of these glycoproteins. They are classified as E-, L-, and
P-selectins. Structurally, they are related to each other and differ in the number of
particular repeat sequences (short consensus repeats). In addition to a C-terminal
cytoplasmic part, they have a transmembrane domain. The binding site for carbohydrate
molecules is found on a lectin domain at the N terminus. The structure of
such selectin domains is shown in Fig. 31.7.

[Figure 31.6: four panels (a–d) showing a leukocyte in a blood vessel near a site of inflammation. Labels: integrin receptor, selectin ligands, P- and E-selectins, endothelial cell, ICAMs. Numbered steps: 1 rolling, 2 inflammation, 3 selectin receptors, 4 integrin receptors, 5 fixation, 6 penetration, 7 degradation.]

Fig. 31.6 (a) Leukocytes are transported through the body in blood vessels with the blood flow
(1). If the vessel passes a site of inflammation (2), the leukocytes are stopped out of the normal
blood flow. (b) They change their rolling behavior by interactions with selectin receptors (3),
which are increasingly expressed on the endothelium in the vicinity of the actual site. Integrin
receptors on the surface of the leukocytes are activated (4). The leukocytes are fully fixed because
of integrin receptor binding to intercellular adhesion molecules (ICAM; 5, c). Leukocytes leave
the vasculature (6, d) and migrate into the neighboring tissue to the site of inflammation, which
they fight by releasing cytokines and degrading substances such as oxidants and proteases (7).

The endogenous ligand of the selectins is PSGL-1, a glycoprotein on the surface
of leukocytes. As an exposed binding epitope, the PSGL-1 protein has multiple
copies of a motif made from four sugar molecules. It is termed sialyl LewisX
31.14, and abbreviated sLeX. The four sugar moieties are made up of an
N-acetylglucosamine, a fucose, a galactose, and a sialic acid. Their binding mode is
schematically outlined in Fig. 31.7a. The four sugar molecules form numerous
hydrogen bonds with their hydroxyl groups to a shallow, bowl-shaped binding site
on the protein. A calcium ion that undergoes interactions with multiple exposed
residues in the binding pocket binds directly next to the sLeX binding epitope. It also
forms contacts with the fucose. The sLeX binding epitope is unsuitable as a drug
because it is easily degradable by glycosidases, and because it has relatively weak
binding affinity (IC50 = 4 mM). Therefore compounds that mimic the sugar binding
have been sought. Initially the hydroxyl group of fucose that interacts with Asn82,
Glu80, Asp106, and the Ca2+ ion was recognized as being critical for binding.
Moreover, the acidic function on the sialic acid, which forms interactions with
Tyr48 and Ser99, was concentrated upon. These two polar ligand-binding regions
should be coupled with a hydrophobic biphenyl moiety. Instead of fucose,
the synthetically more easily accessible mannose was used, and inhibitor

[Figure 31.7: (a) schematic binding mode of sialyl Lewis X 31.14 with its four sugar units (N-acetylglucosamine, fucose, galactose, sialic acid), the Ca2+ ion, and the residues Tyr48, Glu80, Asn82, Asn83, Glu92, Tyr94, Ser99, Asn105, Asp106, and Glu107; (b) view of the binding pocket with Glu80, Asn82, Asn83, Glu92, Tyr94, Ser99, and Asn105.]

Fig. 31.7 The crystallographically determined binding mode of sialyl Lewis X 31.14, the exposed
binding epitope of the PSGL-1 protein, to the selectin surface domain (a). The four carbohydrate
moieties: N-acetylglucosamine (violet), fucose (green), galactose (blue), and sialic acid (red) form
numerous hydrogen bonds with their oxygen atoms to the protein in a shallow, bowl-shaped
binding pocket (b). A calcium ion (violet sphere) is involved in the binding and interacts with
multiple protein residues as well as with the ligand’s fucose moiety.

31.15 (Fig. 31.8) with an IC50 = 500 μM resulted. The affinity was further improved
by a factor of 5 by adding a second, structurally similar group to give 31.16.
Another path was forged at Revotar Biopharmaceuticals. Compound 31.15 was
used as a reference substance. Smaller, multiply hydroxylated aromatic rings were

[Figure 31.8: chemical structures of sialyl Lewis X 31.14 (IC50 = 4 mM), TBC265 31.15 (IC50 = 500 μM), bimosiamose 31.16 (IC50 = 95 μM), 31.17 (IC50 = 1 μM), and 31.18 (IC50 = 0.75 μM).]

Fig. 31.8 By starting from sialyl–LewisX 31.14, a micromolar lead structure (31.15) was
developed by exchanging fucose for mannose and adding a hydrophobic linker with a terminal
oxygen group. The addition of a second, structurally analogous building block to give
bimosiamose 31.16 improved its affinity. By starting from a pyrogallol scaffold, sugar-dissimilar
structures with submicromolar affinity were obtained (31.17, 31.18).

sought as a replacement for the mannose moiety. A pyrogallol substituent proved to
be the best mimetic. When coupled with a biphenyl moiety, 31.17 showed an IC50
value for L-selectin in the nanomolar range. The scaffold was further optimized.
The introduction of an expanded bridge between the two terminal anchor groups
and the exchange of a phenyl ring for a thiophene led to 31.18. This compound

showed in vitro affinity in the upper-nanomolar range with a molecular weight
below 500 Da. In view of the very shallow and broadly opened binding pocket, this
is an impressive binding affinity for such a small antagonist for this protein. At the
current state of research, we can only wait and see whether the followed strategy
finally leads to compounds that also prove to be successful therapeutic principles in
the clinic.
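Because the selectin ligands in this section span millimolar to submicromolar affinities, it helps to put them on a common logarithmic scale. The sketch below converts the IC50 values labeled in Fig. 31.8 to mol/L and reports pIC50 = -log10(IC50). The pIC50 scale is a common medicinal chemistry convention and is not used in the text itself. The dictionary, the function name, and the spelling "uM" for micromolar are illustrative choices.

```python
import math

# Factors converting the concentration units used in this section to mol/L.
UNIT_TO_MOLAR = {"mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50(value: float, unit: str) -> float:
    """Return -log10 of an IC50 given as a value with a concentration unit."""
    return -math.log10(value * UNIT_TO_MOLAR[unit])


# IC50 values as labeled in Fig. 31.8.
series = [
    ("31.14 sLeX", 4, "mM"),
    ("31.15 TBC265", 500, "uM"),
    ("31.16 bimosiamose", 95, "uM"),
    ("31.17", 1, "uM"),
    ("31.18", 0.75, "uM"),
]

for name, value, unit in series:
    print(f"{name:20s} pIC50 = {pic50(value, unit):.2f}")

# Improvement from 31.15 to 31.16, quoted in the text as "a factor of 5".
print(f"{500 / 95:.1f}")   # 5.3
```

On this scale the series rises from about 2.4 for sLeX to about 6.1 for 31.18, a gain of almost four orders of magnitude. Values measured in different assays are only roughly comparable, so the numbers should be read as an orientation rather than a ranking.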

31.4 Fusion Inhibitors Impede Viral Invasion

Because viruses do not have their own metabolic and reproductive machinery, they
are forced to hijack a host cell for these tasks. They do, however, contain the
program and information for their reproduction archived in the form of their own
DNA or RNA. To gain entrance into a host cell, they must dock onto this cell, and
their envelope must merge with the host cell membrane. Let us discuss an example.
The HI virus fuses with T lymphocytes and in so doing, initiates an AIDS infection
(Fig. 31.9). The virus has a diameter of about 120 nm (1,200 Å). More than
70 glycoproteins are embedded in its membrane envelope. Each of these surface
proteins consists of so-called gp120 and gp41 subunits that arrange themselves as
trimers. The gp41 unit sticks up like a sewing pin in the membrane envelope, whereas
the gp120 unit is a nearly spherical external head to this pin. Both subunits could be
structurally and biologically characterized. The gp120 protein, which is constructed
from pleated sheets and helices, acts as a mooring anchor for the virus. It binds to the
CD4 receptor on the surface of the T lymphocytes. A conformational change in
the gp120 protein then occurs. This initiates the subsequent interaction with the
CCR5 or CXCR4 co-receptors, which are found in the vicinity. The binding to these
chemokine receptors causes another conformational change in the sewing-pin-like
“warhead” on the envelope of the virus. The monomers that make up the trimeric helix
bundle and form the gp41 subunit are each composed of three segments, the HR1,
HR2, and FP domains. The virus penetrates the membrane of the host cell with the FP
domains. The bundle of the three HR1 domains makes three grooves on its surface
available that are optimally suited to accept the HR2 domains (Figs. 31.9 and 31.10).
For this, they must adopt a helical geometry. The three initially extended and parallel-
oriented HR1 and HR2 peptide chains “zip” together and form a compact bundle of six
helices. This zipping together causes the membranes of the virus and host cell to be
pulled together. The fusion process of the envelopes is therefore initiated.
Can the fusion process be blocked so that the beginning of the infection process
can be stopped? The tightly packed bundle of HR1 helices makes a groove available
on the surface to accommodate the HR2 peptide with its helical construction.
Therefore peptides were synthesized at Duke University in Durham, North Carolina,
USA, to mimic the sequence of the HR2 domain. In 1996, one of these peptides was
discovered at the subsequently founded company, Trimeris. DP178,
a 36-residue peptide, is able, as is the HR2 peptide, to dock in the available groove
on the HR1 peptide and block gp41 from zipping. The lead structure was further
developed in cooperation with Roche to the drug enfuvirtide, a peptide made up of
[Figure 31.9: four panels (a–d) showing the virus approaching and fusing with the host cell. Labeled elements: gp120, gp41 with its HR1 and HR2 segments, and the CD4 receptor.]

Fig. 31.9 An AIDS infection is initiated by an attack of the HI virus (orange) on T-lymphocytes (gray; a). It uses a trimer of its surface proteins containing
gp120 (violet) and gp41 (red/green) subunits for this purpose. The gp120 protein binds to the endogenous CD4 receptor (blue). A conformational change in
the gp120 protein takes place (b). For this, an interaction with the CCR5 or CXCR4 co-receptors (yellow), which are in the vicinity of the CD4 receptor, is
formed (c). Both receptors belong to the GPCR class. By binding to these chemokine receptors, the sewing-pin-like “warhead” gp41, which consists of three
segments in a helix bundle (red/green), undergoes a conformational change. The virus penetrates the membrane of the host cell with this helix bundle, and the
fusion process is initiated (d). Finally, the initially extended peptide chains assemble and compress themselves into a tight bundle of six helices. This brings the
virus and the host cell even closer together (d, inset).

Fig. 31.10 The bundle of three HR1 domains (green) makes three grooves on its surface
available that are optimally suited to accept the HR2 domains (red) once these have transformed
to a helical geometry. Three initially extended and parallel-oriented peptide chains fold together
and form a tight bundle of six helices. This tying together pulls the membranes of the virus and the
host cell together.

36 amino acids (Ac-Tyr–Thr–Ser–Leu–Ile–His–Ser–Leu–Ile–Glu–Glu–Ser–Gln–
Asn–Gln–Gln–Glu–Lys–Asn–Glu–Gln–Glu–Leu–Leu–Glu–Leu–Asp–Lys–Trp–Ala–
Ser–Leu–Trp–Asn–Trp–Phe-NH2) and a molecular weight of 4,492 Da. It was
introduced to the market under the name Fuzeon® as the first fusion inhibitor for
viral disease. It must be subcutaneously injected, and to date it is used as a replacement
therapy when resistance to the HAART therapy (▶ Sect. 24.5) has developed.
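The sequence of enfuvirtide given above can be checked against the stated length and molecular weight with a few lines of code. The sketch below parses the three-letter sequence, derives the one-letter code, and sums average residue masses. The masses are standard tabulated values rounded to three decimals; they and the cap corrections for the N-terminal acetyl group and the C-terminal amide are assumptions added here, not data from the text.

```python
# Sequence as printed in the text (N-terminal acetyl, C-terminal amide).
SEQ3 = (
    "Tyr Thr Ser Leu Ile His Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys "
    "Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe"
)

# One-letter code and average residue mass in Da (standard tabulated values),
# restricted to the residues that occur in this peptide.
RESIDUES = {
    "Ala": ("A", 71.079), "Asn": ("N", 114.104), "Asp": ("D", 115.089),
    "Gln": ("Q", 128.131), "Glu": ("E", 129.115), "His": ("H", 137.141),
    "Ile": ("I", 113.159), "Leu": ("L", 113.159), "Lys": ("K", 128.174),
    "Phe": ("F", 147.177), "Ser": ("S", 87.078), "Thr": ("T", 101.105),
    "Trp": ("W", 186.213), "Tyr": ("Y", 163.176),
}

WATER = 18.015       # H2O completing the free peptide termini
ACETYL_CAP = 42.037  # C2H2O added by N-terminal acetylation
AMIDE_CAP = -0.984   # C-terminal OH replaced by NH2

residues = SEQ3.split()
one_letter = "".join(RESIDUES[r][0] for r in residues)
mass = sum(RESIDUES[r][1] for r in residues) + WATER + ACETYL_CAP + AMIDE_CAP

print(len(residues))   # 36
print(one_letter)      # YTSLIHSLIEESQNQQEKNEQELLELDKWASLWNWF
print(round(mass))     # 4492
```

Both the residue count and the rounded mass agree with the figures given in the text.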
The interaction of the helical HR2 peptide strand with the bundle
of HR1 domains has stimulated the search for low-molecular-weight fusion inhibitors.
Above all, the three hydrophobic amino acids Trp628, Trp631, and Ile635
mediate the contact between the helix strands. Until now, only a few, relatively highly
charged structures such as 31.19 and 31.20 have been found by screening that can
form contacts and block zipping (Fig. 31.11). Because low-molecular-weight inhibitors
have been successfully found in other projects, such as the helix mimetic for the
BCL-XL protein described in ▶ Sect. 10.6 that compete with a contact between
a helix and an extended groove, there is hope that low-molecular-weight lead structures
can also be found here. Time will tell whether resistance also develops to these
compounds.
At this point it should be mentioned that there is another important co-receptor
for cell entry processes that can also be antagonized with low-molecular-weight
ligands: the chemokine receptor CCR5 (Fig. 31.9). The chemokine receptors
belong to the class of GPCRs (▶ Sect. 29.1). CCR5’s function can be suppressed
with ligands such as maraviroc 31.21, which was introduced into therapy by Pfizer
in 2007. Viral fusion processes can be suppressed with this concept too.

[Figure 31.11: structures labeled 31.19, 31.20, and 31.21.]

Fig. 31.11 The HR2 peptide strand (red) interacts via the three hydrophobic amino acids
Trp628, Trp631, and Ile635 with the bundle structure of the HR1 domain (green). In screening,
the multiply charged structures 31.19 and 31.20 were discovered as mimics that can block the
bundle-type packing of the helices. Maraviroc 31.21 antagonizes a chemokine receptor that is
involved in initiating the fusion process between the HI virus and T lymphocytes (Fig. 31.9).

31.5 Neuraminidase Inhibitors Prevent Budding of Mature Viruses

As described in the previous section, viruses are incapable of living an autonomous
life. Therefore they are forced to find a host cell that they can reprogram for their
own reproduction and exploit for their own metabolism. Viruses store their genome
and therefore their construction plans in single or double-stranded DNA or RNA,
according to type. These nucleic acids are found in the interior of the virus and are
surrounded by a protein coat, the so-called capsid that can also be built from lipid
building blocks depending on the virus type. Glycoproteins are embedded in this
capsid. Defense mechanisms that are mediated by antibodies are particularly
targeted against the proteins that are presented in the capsid. Viruses code the
information for numerous enzymes that are specifically needed for their replication
in their genome. Moreover, they can also have channel proteins that allow material
transfer between the interior of the virus and its environment.
One of the most common viral diseases, the flu, is caused by the influenza virus.
This virus belongs to the family of enveloped viruses, and three subtypes, A, B, and
C are known. Their transmission is airborne; usually the viruses are released by
sneezing and are transferred to the next creature. Influenza viruses do not only
infect humans, they are also taken in by animals and can be transmitted in this way.
Usually at first there is no infection transmission from one species to another.
However, such transmission routes from animal to human and vice versa are
observed in regions in which these individual species live in close contact to one
another. Once arrived in the airway of a new victim, the influenza viruses adhere to

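[Figure 31.12: schematic of the influenza virus with M2 protein, neuraminidase, hemagglutinin, matrix proteins, viral nucleic acids, and polymerase; chemical structures of amantadine 31.22 and rimantadine 31.23, both drawn as ammonium chlorides.]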

Fig. 31.12 In addition to the docking protein hemagglutinin (blue), of which 16 subtypes are
known, the envelope of the influenza virus contains a neuraminidase (red), which has nine
variations (N1–N9), and the M2 proton channel protein (green). This pore can be blocked by
amantadine 31.22 and rimantadine 31.23. Upon maturation and budding of a newly formed virus,
the glycosidase activity of neuraminidase is needed to detach from the host cell (gray) in the last step
by cleaving a sugar chain (green).

the mucous membranes by using hemagglutinin proteins found on their surfaces. In
addition to the docking protein hemagglutinin, the viral envelope also contains
neuraminidase and the M2 proton channel protein (Fig. 31.12). Proteins can
sometimes vary significantly in their amino acid sequence and still be
able to perform the same functions. Such variations are called subtypes. Sixteen
subtypes are known for hemagglutinin (H1–H16), and nine variants of the surface
enzyme neuraminidase (N1–N9) have been characterized. New variations constantly
form from new combinations, which then make their way through the
population. In recent years, the variants H5N1 (bird flu) and the H1N1 (swine
flu) have kept us on edge. Their development is also traced to a jump from the
corresponding animal species to humans. The observed H5N1 variant proved to be
particularly pathogenic, but the infection mechanism in humans and the interspe-
cies crossing was less efficient. The swine flu variant H1N1 of autumn 2009 was
especially infectious, but its clinical course was less serious. Unfortunately, this
picture can be quickly modified by small changes in the viral proteins. Antigen
drift and antigen shift are distinguished. In drift, genetic changes occur that are
usually a result of copy errors from an error-prone transcription process of the viral
genome. The virus slowly changes its surface proteins by pure chance. Because the
31.5 Neuraminidase Inhibitors Prevent Budding of Mature Viruses 793

infected host organism develops its own antibodies, or the immune system is stimulated
to produce such antibodies by a flu vaccine, the viruses can still be adequately
recognized, even with small modifications in the surface proteins, and rendered
harmless. Antigenic shift is much more dangerous; it is caused by the exchange of
genetic information between virus species or subtypes. It can especially occur between
species, that is, upon transmission from animal to human. Because this route produces
new combinations of surface proteins, it is difficult for the immune system to
build a sufficiently high antibody titer fast enough to render such a modified virus
harmless. Such antigenic shifts can lead to pandemics. As mentioned, they usually
originate in regions where numerous species such as ducks, chickens, pigs, cats, dogs,
and humans live together in close quarters. Because of the lifestyle, high population
density, and traditional animal husbandry practices in which animals and humans
live under the same roof, regions such as Southern China in East Asia, or Mexico, have
repeatedly proven to be incubators for such genetically varied virus forms. There
have been many pandemics in the past. The most serious one worldwide was
certainly the so-called Spanish flu in 1918, which claimed at least 25 million lives.
The influenza viruses of this pandemic belonged to a particularly virulent subtype, H1N1. In
1957 a pandemic occurred with the H2N2 subtype, and in 1968 it was the H3N2
combination that was particularly dangerous. The last pandemic warning from the
WHO was issued in the fall of 2009 after a renewed H1N1 variant (cf. 1918) in the
form of the so-called swine flu originated in Mexico. One year later, we
knew better. This variant proved to be not nearly as dangerous in its observed form
as was initially expected. The current preventive therapy for the flu is vaccination.
The vaccine contains parts of the surface proteins hemagglutinin and neuraminidase,
or also matrix proteins, as antigens, and it stimulates the immune system to produce
antibodies. The production of a new vaccine takes some time and represents a great
financial effort. Therefore an attempt is made to estimate which viral subtypes might
be involved before a flu wave strikes. Viral envelope proteins are isolated from these
subtypes, and a vaccine is developed for the next vaccination campaign based on
these proteins. It was just this step that was initiated in the summer of 2009 to
prepare a vaccine against the swine-flu-type H1N1 for the more heavily populated
northern hemisphere in time for winter. The population was also simultaneously
asked not to neglect the vaccines against virus strains from previous years and to obtain
adequate protection from such a vaccination.
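The number of formally possible surface-protein combinations follows directly from the subtype ranges H1–H16 and N1–N9. The short Python sketch below simply enumerates them; it is a counting exercise only and says nothing about which combinations actually circulate. The variable names are illustrative.

```python
from itertools import product

# Subtype ranges as given in the text: H1-H16 and N1-N9.
hemagglutinin = [f"H{i}" for i in range(1, 17)]
neuraminidase = [f"N{j}" for j in range(1, 10)]

# Every formal pairing of one hemagglutinin with one neuraminidase subtype.
combinations = [h + n for h, n in product(hemagglutinin, neuraminidase)]

print(len(combinations))
for name in ("H1N1", "H2N2", "H3N2", "H5N1"):
    print(name, name in combinations)
```

The pandemic subtypes named in the text are all members of this list, which illustrates how small a part of the formal combination space has so far caused pandemics.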
Three surface proteins can be targeted in a defensive therapy with
low-molecular-weight compounds. The drugs amantadine 31.22 and rimantadine
31.23 (Fig. 31.12), which block the M2 proton channel protein, are already rather
old. The target protein is a pore that is open for protons. It is opened and regulated in
a pH-dependent manner by a ring of four histidine residues. If the four histidine
residues are deprotonated, the pore is closed due to a network of H-bonds between
the histidines. If the histidines are protonated and exist in a charged state, their
spatial orientation changes and the H-bond network is disrupted. As a consequence,
the channel of the M2 protein is opened. The two ligands 31.22 and 31.23 are not
very specific and do not allow an efficient therapy. Furthermore, multiple resistance
mutations have been observed that abolish the action of these drug molecules.
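The pH dependence described above can be made concrete with the Henderson–Hasselbalch relation. The sketch below estimates the protonated fraction of a single histidine side chain. The pKa of 6.0 is a generic textbook value assumed here for illustration, not a measured value for the M2 channel, and the histidines are treated in isolation even though they influence one another in the real protein.

```python
def fraction_protonated(pH: float, pKa: float = 6.0) -> float:
    """Fraction of a titratable base in its protonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Compare a near-neutral environment with increasingly acidic ones.
for pH in (7.4, 6.0, 5.0):
    print(f"pH {pH:.1f}: {fraction_protonated(pH):.2f} protonated")
```

Under these assumptions the charged form is a small minority near neutral pH and dominates only in an acidic environment, which is the qualitative behavior a pH-gated pore requires.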

[Figure 31.13: reaction scheme showing the bound sialic acid residue, the sialosyl cation 31.25, and the released sugar, each surrounded by the active-site residues Asp151, Glu277, Arg371, and Tyr406.]

Fig. 31.13 Reaction mechanism of the glycosidic cleavage of a sialic acid residue. The residue is at
the end of a carbohydrate chain that couples the virus to the host cell. The sialic acid binds to viral
neuraminidase. The glycosidic bond is cleaved from the remaining sugar chain with assistance from
the two neighboring acidic amino acids, Glu277 and Asp151. A sialosyl cation is formed that is
temporarily stabilized by Tyr406. The sugar is released after transfer of an OH group to the trigonal
center. The stable stereoisomer is formed by ring opening and reformation of the cyclic sugar.

The docking protein hemagglutinin initially seemed to be an ideal target structure for
a further low-molecular-weight ligand. If this protein were rendered non-functional,
the viral infection would be stopped at the host-cell-penetration stage. Unfortunately,
this protein changes so severely through constant mutations that it is difficult to
develop ligands for long-term use. Therefore neuraminidase remains as a further
target structure. It plays no role in the viral penetration of the host cell. Instead,
it regulates the budding of a newly formed virus, and in particular the
detachment from the host cell. In the last step of detachment, the newly formed
virus is still coupled to the host cell by a sugar chain (Fig. 31.12). The last two sugar
residues of this anchor are a galactose and a sialic acid 31.24 (or N-acetylneuraminic
acid), which are coupled by a glycosidic oxygen bridge. The virus uses its neuraminidase
to permanently free itself from the host cell. Because this protein must
specifically recognize sialic acid, the virus cannot change it structurally too much
without giving up the efficient recognition of sialic acid and the catalytic glycosidic
cleavage process. It would be in danger of sacrificing its own ability to survive.
Neuraminidase has an enzymatic glycosidase function. Such enzymes possess
a dyad of two aspartic or glutamic acid residues. We have already seen the
mechanism of such an enzyme in ▶ Sect. 21.3. Neuraminidase attacks the glycosidic
bond between the terminal sialic acid and the galactose with its Glu277
residue. A cation 31.25 (Fig. 31.13) is formed that is temporarily stabilized by the

Fig. 31.14 The development of the neuraminidase inhibitors zanamivir 31.28 and oseltamivir
31.32. Compound 31.26 was developed as a stable structural analogue to the sialosyl cation 31.25.
By exchanging the OH group for an NH2 group, 31.27 is formed with Ki = 40 nM. The introduction
of a guanidinium group to form 31.28 brought a further improvement in the activity. Carbocyclic
analogues 31.29 and 31.30 were synthesized at Gilead Sciences and further optimized to 31.31 by
exchanging an OH for an NH2 group. To improve the bioavailability, an ester prodrug, 31.32, was
introduced into therapy. A depot form was developed with 31.33. Peramivir 31.34 is an intravenously
applicable neuraminidase inhibitor.

spatially neighboring Tyr406 residue. The crystal structure of the influenza neuraminidase
with a sialic acid analogue 31.26 was determined in 1983 (Fig. 31.14). The analogue
imitates the transition state of the enzymatic reaction. Compound 31.26 blocks the
protein with Ki = 4 µM. To discover further key positions for additional functional
groups in the binding pocket, the GRID program from Peter Goodford
(▶ Sect. 17.10) was used. This method suggested that a favorable position
for a large, positively charged group exists in the vicinity of the 4-OH group and the
neighboring Glu119 and Glu227 residues. Exchanging this OH group for an
aliphatic amino group led to 31.27, which forms a stronger hydrogen bond to
the protein. The binding affinity improved into the nanomolar range. If the amino
group is modified to a guanidino group as in 31.28, the two neighboring glutamate
residues can be involved in an interaction with the ligand. Compound 31.28 binds to
the protein with Ki = 0.2 nM. The substance was clinically developed by
GlaxoSmithKline (GSK), and zanamivir 31.28 was marketed under the name
Relenza® in 1999. Because of its high polarity, the drug has poor oral bioavailability.
It can only be applied by inhalation. A special inhalation device had to be developed
for the administration of the drug. Nevertheless, the launch of an orally available drug
to the market was still desired. A different approach was taken at Gilead Sciences
with the goal of reducing the high polarity of zanamivir (Fig. 31.14). Initially the

[Figure 31.15: panels (a) and (b) showing the binding pocket with residue labels Arg115, Glu119, Asp148, Arg149, Arg222, His274, Glu276, Glu277, Arg291, and Arg373.]

Fig. 31.15 The crystallographically determined binding geometries of zanamivir 31.28 (a) and
oseltamivir 31.32 (b) in neuraminidase. The acidic function of the inhibitors is anchored
by Arg115, Arg291, and Arg273. The opposite N-acetyl group interacts with Arg149.
Asp148 forms a layered geometry with the guanidinium group from zanamivir, whereas the
amino group found in the same position in oseltamivir forms a hydrogen bond to Asp148.
Compound 31.28 forms a hydrogen bond to Glu276 by using its glycerol groups, whereas
the more hydrophobic iso-pentylether group in oseltamivir induces a rearrangement of Glu276
to form a salt bridge to Arg222. Interestingly, upon exchange of His274 for a Tyr, resistance
to this drug occurs because the rearrangement that is needed for oseltamivir binding can no
longer occur.

central pyran ring of 31.26 was exchanged for a carbocycle (31.29) and the double
bond was relocated by one position. With this, 31.30 can better imitate the transition
state of the reaction. Next, the glycerol function was exchanged for a more hydrophobic
iso-pentylether group in 31.31. To improve the bioavailability of 31.31,
a prodrug strategy was chosen. By esterifying the free acid function, a new orally
available inhibitor, oseltamivir 31.32, resulted. Roche licensed this compound in
1999 and introduced it to the market as Tamiflu® (Fig. 31.15).
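Inhibition constants can be translated into binding free energies via ΔG = RT ln Ki, which makes the gain from each structural modification easier to compare. The sketch below applies this relation to the Ki values quoted for 31.27 (Fig. 31.14) and for zanamivir 31.28. The temperature of 298.15 K and the function name are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def binding_free_energy(ki_molar: float, temperature_k: float = 298.15) -> float:
    """Return dG = RT ln(Ki) in kcal/mol for Ki given in mol/L."""
    return R_KCAL * temperature_k * math.log(ki_molar)

# Ki values as quoted in the text and in Fig. 31.14.
ki_values = {"31.27": 40e-9, "31.28 (zanamivir)": 0.2e-9}
for compound, ki in ki_values.items():
    print(f"{compound}: dG = {binding_free_energy(ki):.1f} kcal/mol")

# Free-energy gain from replacing the amino group by a guanidino group.
gain = binding_free_energy(0.2e-9) - binding_free_energy(40e-9)
print(f"amino -> guanidino: ddG = {gain:.1f} kcal/mol")
```

On this scale, a tenfold improvement in Ki corresponds to roughly 1.4 kcal/mol at room temperature.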
After almost 10 years of clinical use, the first cases of resistance to oseltamivir
and zanamivir were described. Even cross-resistance to both compounds has
occurred. The effects of such mutations can be illustrated for a His→Tyr exchange
at position 274, which creates oseltamivir resistance. The reorientation of Glu276,
which is necessary for oseltamivir binding, is blocked in the viral mutant.
A substantial reduction in the binding affinity is the result. Significantly less
resistance has been described for zanamivir. Perhaps this is because it is structurally
more similar to the sialic acid substrate and it is therefore more difficult for the
virus to develop a mutation without limiting its ability to bind its own substrate.
Such a concept is certainly the silver bullet to prevent fast resistance to a new

promising drug. Perhaps, however, zanamivir has been largely protected from
resistance development because its administration by inhalation is less convenient,
and it has therefore simply been used less often.
Follow-up drugs are already being sought. A divalent zanamivir 31.33 has been
described. It requires far fewer applications, analogous to a depot form. For
special circumstances peramivir 31.34, an intravenously administered drug, was
found. It was developed from a furanose derivative and, like zanamivir, has inadequate
oral bioavailability. During the last H1N1 pandemic (“swine flu”), peramivir was
in the last phase of its clinical trials and received an emergency use authorization for
the parenteral treatment of severe cases. Its binding also requires a rearrangement of
the Glu276 side chain, as is the case for oseltamivir. Therefore, cross-resistance
between the two drugs has already been described.

31.6 Stopping the Common Cold: Inhibitors for the Capsid Protein of Rhinovirus

The common cold or mundane “sniffles” is caused by rhinoviruses, which belong to
the family of picornaviruses (RNA viruses, pico = small). They are non-enveloped
viruses that do not have a lipid coat. Their genome is contained on a single, positive
RNA strand that is packed in an icosahedral capsid. Their diameter is ca.
200–300 Å. They prefer temperatures between 3 °C and 33 °C; that is, at higher
temperatures, such as body temperature, their growth is inhibited. Cooler weather
suits them especially well, and infection occurs preferably on cold, wet days. They
infect our noses in particular because the local cool temperature there offers them
an ideal breeding ground. Usually rhinoviruses are transferred by direct contact via
contaminated hands (so-called smear infections). People whose immune systems
are suppressed by momentary constitutional weakness, or children whose immune
systems are not completely developed are particularly vulnerable to infection. The
incubation time for a common cold is only 12 h. Then, the first newly formed
viruses already leave the infected host cells. Rhinoviruses are strictly localized to
the nose and throat. Humans respond to the viral attack with an inflammatory
reaction of the mucous membranes. The nose becomes red, swells up, and its
temperature increases. A general feeling of being unwell ensues with headache
and fatigue. Often a secondary bacterial infection or infection with a much more
pathogenic virus occurs on top of the primary viral infection, which, as
a consequence, can represent a real health risk.
As a general rule, our immune systems can handle rhinoviruses without intervention,
and after a week, the body has defeated the viral invaders. There are over
100 viral serotypes. This term refers to variations in the surface proteins of these
viruses that repeatedly force the immune system to create different antibodies to
defend against them. Even if the common cold is seldom a real long-term health
threat to us, it must not be forgotten that the economy is severely taxed by this
disease. It is estimated that over 40 million workdays are lost each year to
absenteeism due to the common cold!

[Figure 31.16: panels (a) to (c) with labels VP-1, VP-2, VP-3, canyon, and ICAM.]

Fig. 31.16 The capsid of the picornaviruses has an icosahedral construction (a). Each of the
20 triangular surfaces of the icosahedron is made up of three viral surface proteins, VP1 (yellow),
VP2 (green), and VP3 (red). A fourth chain, VP4 attaches to VP2 and forms a deep ridge
(“canyon,” blue) that is important for the recognition of the adhesion proteins (ICAM-1, blue
chain) of the infected host cell (orange; b). The binding of an antiviral compound such as
pleconaril 31.40 (Fig. 31.18) below the canyon causes its conformation to change so drastically
that binding to the host cell’s surface protein can no longer occur.

The picornaviruses form one of the largest families of viruses, which, in
addition to the harmless rhinoviruses, also contains viruses that are very dangerous
such as the poliovirus, the hepatitis A virus, or viruses that can cause
meningitis, myocarditis, or encephalitis. The virus that causes foot-and-mouth disease also
belongs to this family. Although this virus is not dangerous for humans, it can
quickly spread among cloven-hoofed animals such as cattle, swine, or sheep to
epidemic proportions. Many times already, serious threats to animal husbandry in
entire areas have occurred, above all when severely virulent strains with malignant
courses and high lethality occur. The surviving animals are often left with permanent
heart muscle damage. The poliovirus, which was a massive threat to humanity
until the 1960s, could be largely defeated with the introduction of a successful oral
vaccine. In this infection, which is also known as infantile paralysis, the poliovirus
attacks the nerve cells in the spinal cord that control muscles, and this leads to
permanent paralysis, or even death. The victory lap over this virus will only
continue if future generations show a high degree of discipline with prophylactic
vaccinations. Complacency easily creeps in when the acute threat of such a disease
falls out of sight of the general population because of fewer cases.
The 30-Å-thick capsids of the picornaviruses are constructed from 180 polypeptide
chains, namely 60 identical copies of each of three chains. This is caused by the icosahedral
architecture of the virus (Fig. 31.16a). The three polypeptide chains VP1, VP2, and
VP3 are arranged on a 20-faced icosahedron. Trenches that are about 25 Å deep,
called “canyons,” form between the faces. A fourth, short peptide chain, VP4,
attaches to VP2. It is oriented toward the interior of the virus and has no contact with
the exterior. The viral genome contains 7,500 bases that code for the capsid
proteins, two proteases, a polymerase, an ATPase, and four additional proteins.
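The numbers in this paragraph can be checked with simple arithmetic. The sketch below assumes the standard geometry of an icosahedron (20 triangular faces, each contributing three symmetry-equivalent positions) and treats the whole genome as coding sequence, which overestimates the real coding capacity because untranslated regions are ignored. All constant names are illustrative.

```python
FACES = 20               # triangular faces of an icosahedron
POSITIONS_PER_FACE = 3   # symmetry-equivalent positions per face (assumed geometry)
SURFACE_CHAINS = ("VP1", "VP2", "VP3")

# Each surface chain occupies every symmetry-equivalent position once.
copies_per_chain = FACES * POSITIONS_PER_FACE
total_chains = copies_per_chain * len(SURFACE_CHAINS)

GENOME_BASES = 7500             # value from the text
max_codons = GENOME_BASES // 3  # upper bound: three bases per amino acid

print(copies_per_chain, total_chains, max_codons)
```

The result reproduces the 180 chains of the capsid and shows that the genome can encode a few thousand amino acids at most, which is why the virus relies on many copies of a few small proteins.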

Fig. 31.17 Superposition of the crystal structure of the capsid proteins VP1 (yellow), VP2 (green), VP3
(red), and VP4 (violet). The bound inhibitor pleconaril is shown in the lower right. The course of the
protein’s Cα chain is depicted. A cryo-electron-microscopic structure determination of the viral proteins
(white chain) and the adhesion protein ICAM-1 (blue) of the host cell is superimposed with this
structure. That the chain adopts a deviant course in the vicinity of the binding site (yellow and white
strands, circled in orange) is recognizable. It is enough to alter the canyon for the interaction with
ICAM-1 to prevent host cell infection.

The surface-exposed canyons are particularly important in the area that is formed
by VP1. On the one hand, the canyon forms binding sites for adhesion molecules
(ICAM-1, cf. Sect. 31.3) that are found on the surface of the infected host cells.
Because the virus may not change its surface composition there too much, antibodies
from vaccination serums (▶ Sect. 32.1) are also targeted against this part of the
canyon. Such a strategy can be very successful with viruses that have less-broad
distributions of serotypes than the rhinoviruses. On the other hand, the canyon has an
opening in the vicinity of VP1 to the interior of the viral capsid that is important for
the release of the viral genome. Michael Rossmann’s research group at Purdue
University studied the proteins of the viral capsid in detail. By using cryo-electron
microscopy (▶ Sect. 13.6) they managed to determine a structure of the capsid
protein with an adhesion molecule. This complex was only determined at
a reduced resolution, but it proves that the binding of the adhesion molecule
ICAM-1 occurs in the deep crevice of the canyon (Fig. 31.17).
Because of the common architecture of the picornaviruses, an antiviral
therapy can be developed against these viruses by using the same concepts. One
strategy that was initially developed and pursued at Sterling–Winthrop was oriented
toward the stretched-out pocket that is found underneath the VP1 canyon
(Fig. 31.16b). Accommodation of antiviral compounds in this pocket causes
a conformational change at the bottom of the canyon. Because of this change in
geometry, the interactions with the adhesion molecule on the surface of the infected

[Figure 31.18: chemical structures of 31.35 (β-diketone juvenile hormone mimetic), 31.36 (arildone), 31.37 (disoxaril), 31.38 (WIN 54954), 31.39 (WIN 61893), and 31.40 (WIN 63843, pleconaril).]

Fig. 31.18 β-Diketones (31.35, 31.36) that showed antiviral activity against picornaviruses were
prepared in the course of a synthetic program toward developing juvenile hormone mimetics. By
introducing a terminal heterocycle, varying the chain length, and blocking positions on the
heterocycles to improve the metabolic stability (31.37–31.39), pleconaril 31.40 could be developed
as an inhibitor of the viral attack on host cells.

host cell are altered. A stable contact with the host cell is no longer possible, and
the virus cannot transfer its viral RNA to the infected cell. The viral infection
is stopped.
The first lead structures at Sterling–Winthrop were β-diketones 31.35, which
were synthesized as intermediates in a research project for the development of
juvenile hormone mimetics (Fig. 31.18). Arildone 31.36 resulted from this lead substance,
and it successfully blocked the replication of the poliovirus. Because the β-diketone
building block exhibited unsatisfactory chemical and metabolic stability, it was
replaced with an oxazole ring. Further optimization led to disoxaril 31.37, which
was able to block viral infection by multiple picornaviruses in animal models.
By the end of the 1980s Michael Rossmann had already managed to elucidate
the binding geometry of the Winthrop compounds in complex with the viral capsid
proteins. The structures showed the occupancy of an extended pocket below
the canyon. Disoxaril 31.37 has no activity against the rhinovirus, and its bioavailability
of less than 15% also seemed to be unsatisfactory. Further development led
to derivatives with a di-ortho-substitution on the central phenyl ring. WIN54954
31.38 is significantly more potent and has better bioavailability. However, it does
not have the desired broad efficacy against all viral strains, and its metabolic
stability still leaves something to be desired. The methyl derivative 31.39 with

[Figure 31.19 labels: VP-1, VP-2, VP-3, VP-4, and pleconaril.]

Fig. 31.19 Crystal structure of the viral capsid proteins of the rhinovirus HRV-14 with the
inhibitor pleconaril. The antiviral drug binds to VP1 (yellow) below the canyon (cut away in the
figure) in a narrow, stretched-out pocket formed by numerous hydrophobic amino acids. They sit
above an opening into the interior of the viral capsid. The drug induces a conformational change in
VP1 upon binding, and consequently disrupts the recognition of the host cell’s adhesion proteins.

the terminal oxadiazole ring also lacks the desired stability. It was only the
replacement with a CF3 group that led to success. The compound, pleconaril
31.40, entered clinical trials. Its binding mode to the capsid protein is shown in
Fig. 31.19. Its application in 2,100 patients with a rhinovirus infection showed that
the length and severity of the disease were shorter and milder, respectively, compared
to a placebo group. The regulatory authority, the FDA, however, rejected the
market approval for pleconaril in 2002. Concerns regarding the safety of the drug
were expressed. There were indications that complications in women using oral
contraceptives occurred. For a disease such as the common cold from which our
bodies can recover without drugs, it is certainly appropriate to examine the effects
and risks of a drug very stringently. Undoubtedly a compound such as pleconaril
can reduce the inappropriate use of antibiotics or prevent a severe secondary
bacterial infection. Such a compound can also help asthma and COPD (chronic
obstructive pulmonary disease) patients with an infection. Schering–Plough
conducted further clinical trials with a compound that is used as a nasal spray.
Knowledge about the mode of action of this compound, which is transferable to
other picornaviral diseases, could be very valuable. It could help to develop drugs
for other infectious diseases with higher health risks. For these, however, there will
hardly be a comparably lucrative market.

31.7 MHC Molecules: Where the Immune System Presents Peptide Fragments

Our immune system defends us against harmful invasions by antigens and eliminates
cells that either have been infected or have transformed into a potentially
pathological state. A distinction is made between nonspecific and specific defensive
mechanisms that are served by cellular and humoral (in body fluids) components.
The nonspecific defensive mechanisms try to deactivate pathogens and foreign
substances upon first contact. We have various glycoproteins and interferons in
circulating blood and in tissues that carry out the first attack as the so-called
humoral complement system. Their defensive efforts are not targeted and serve
to degrade the foreign substances (cf. Sect. 31.3). For example, they adhere to
bacteria and create an opening in their membrane that allows fluid and salts to flow
in. This causes the bacterial cell to swell and finally burst. Lysozyme represents
a further factor that enzymatically hydrolyzes the cell walls of particular bacteria.
Additionally, interferons are released that have an immune-stimulating effect on the
neighboring cells. Proteins are produced in these cells that initiate entirely different
mechanisms of combating foreign substances. Moreover, the organism has an
additional, very effective and specific protective barrier, which, however, must
first be developed and “trained.” This immunological defense mechanism is first
active when a damaging material is recognized as such. The consequently initiated
immune response is largely made up by three types of cells: macrophages and B
and T lymphocytes. These defensive mechanisms are highly specific and usually
lead to immunity. With this, the body becomes insensitive to foreign materials that
it has had contact with once before. In the context of the humoral defense,
antibodies (▶ Sect. 32.3) assume this task. They are formed 5–7 days after an
immune-competent B lymphocyte makes contact with an antigen. After the first
contact with the invader, effector cells are formed for the production of antibodies
as well as memory cells that further circulate in the blood. Upon renewed exposure,
the defenses can immediately attack, even if the antigen was recognized years ago.
Vertebrates have developed an adaptive system for cellular defense that distinguishes
between healthy and infected cells. T lymphocytes, also known as T cells,
play a decisive role in the cellular immune response. They belong to the white blood
cell group. Produced from stem cells in the bone marrow, they mature in the
thymus to the actual T cells. They carry T-cell receptors on their surface that are
responsible for the recognition of antigens. By scanning cells for the characteristic
“diseased or healthy” they find these antigens in the form of peptide sequences that
are presented on the surface of the cells by MHC molecules (major histocompatibility
complex). Two different types of MHC molecules are distinguished that are
termed class I and II. MHC-I presents 8–10-residue peptides that are preferably
found in the cytosol of cells that have a nucleus. MHC-II presents longer peptides
that are formed during endosomal protein degradation. They occur on professional
antigen-presenting cells such as macrophages or B cells. T helper cells “look at” the
antigens presented by these cells, and regulate the immune response to these
antigens. The term for these molecules is “histocompatibility complex” because it

was originally recognized that the rejection reaction in organ transplantation is
initiated by the presentation of foreign proteins. Therefore, before a transplantation,
“tissue typing” of the antigen patterns of the donor and patient is carried out.
It is now known that MHC molecules differentiate between healthy and
infected cells for the cellular immune system. Analogous to the learning process of
the B lymphocytes, the T lymphocytes also form antigen-specific daughter cells after their
first contact. These serve as long-living memory cells and can initiate immune
defense upon renewed exposure to the antigen.
The developmental pathway for the class-I MHCs shall be considered in detail
(Fig. 31.20). MHC-I molecules are usually loaded with peptides that come from
cytosolic proteins. They are cleaved to peptide fragments of 8–10 amino acids in
the proteasome, a kind of cellular shredder (▶ Sect. 23.8). In healthy cells peptides
are formed that originate exclusively from endogenous cellular proteins. If the cell
is infected with a virus or has undergone transformation that has led to mutated
proteins, foreign or altered peptide fragments result. These are also bound to MHC
molecules and presented on the cell surface. To the T cells, it is immediately
apparent which cell has been infected with a virus or has degenerated into
a diseased state. The loading of MHC molecules with peptides that were
generated in the proteasome occurs in the endoplasmic reticulum (ER). For this,
the peptides are channeled into the ER with a specific transporter (TAP). The
passage of the loaded, membrane-anchored MHC molecules to the cell surface is
carried out by vesicles that fuse with the ER and/or the cell. If a foreign or virus-infected
cell or a tumor cell carries such an antigen peptide fragment in complex
with an MHC-I molecule, it is recognized by so-called CD8+ lymphocytes.
Cytotoxic T-killer cells belong to this cell type, and after binding they release
cytokines, pore-forming perforins, as well as proteases. As a result, they lyse the
cell that has been recognized as diseased, and initiate apoptosis.
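The length restriction of MHC-I ligands can be illustrated by listing every 8–10-residue window of a protein sequence. This is a deliberate simplification: real proteasomal cleavage and MHC binding are sequence-selective, so only a small subset of these windows would actually be presented. The sequence and the function name below are hypothetical.

```python
def candidate_peptides(sequence: str, min_len: int = 8, max_len: int = 10) -> list[str]:
    """Return all contiguous fragments of min_len..max_len residues."""
    peptides = []
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            peptides.append(sequence[start:start + length])
    return peptides

# Hypothetical 12-residue sequence in one-letter code, for illustration only.
protein = "MKTLLVAGYPQR"
fragments = candidate_peptides(protein)
print(len(fragments))
print(fragments[:3])
```

Even this short sequence yields a dozen candidate fragments, which hints at how many different peptides a full proteome can offer for presentation.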
At this point, the molecules of cellular immune defense become
attractive for drug therapy. Tumor diseases are a common cause of death, especially
in advanced years. Surgical resection, chemotherapy, and radiation therapy
with tissue-destructive effects are the mainstay of cancer treatment. The newest
knowledge about the role of the immune system in the control of malignant
degeneration, the molecular interaction of antigen-presenting tumor cells with
immune cells, as well as insight into antigen processing and presentation have
opened entirely new perspectives in tumor therapy. The immune system recognizes
and destroys many cells that have degenerated into tumor cells. The tumor
cells, however, use a variety of strategies to evade the immune response. One approach
to the development of drug therapy tries to stimulate the immune response by using
specific tumor antigens. Peptide-like vaccines are developed that are able to
stimulate an immune response to the tumor cells. The first successes of such
a therapy have been described in melanoma (malignant degeneration of pigment
cells) patients. The target structures of the peptidic vaccines are the antigen-presenting
MHC-I molecules in complex with the CD8+ T-cell receptors. After
they have been stimulated to T-effector cells by so-called dendritic cells, the T cells
are qualitatively and quantitatively able to capture peptides that are presented on

[Figure 31.20: schematic of a virus-infected cell with labels for the virus, viral protein, proteasome, peptide fragments, TAP, cytosol, vesicle, MHC, CD8, and T cell.]

Fig. 31.20 When a cell is infected with a virus, viral proteins are found in the cytosol (gray).
They are degraded in the proteasome along with endogenous cellular proteins and cut into peptide
fragments. These fragments are relocated to the endoplasmic reticulum (ER) by the TAP
transporter (green). There, the membrane-bound MHC class-I molecules (blue-violet) are loaded
with 8–10-residue peptides. Enclosed in vesicles, the peptide-presenting MHC molecules are
relocated to the cell surface and anchored in the membrane. T cells (green) scan through the
presenting MHC molecules by forming a complex with their T-cell receptors and recognize
whether the presented fragments are from endogenous or foreign proteins. If the protein is of
foreign origin, or an endogenous protein that has been overexpressed (e.g., in tumor cells), an
immune response is initiated.

somatic cells. On the one hand foreign proteins are recognized, on the other hand
the cytotoxic killer cells are also able to filter out cells that overexpress endogenous
proteins based on the high density of presented peptides.
These properties can be exploited to design peptide vaccines. By offering a large
amount of endogenous peptides, the immune tolerance to native proteins should be
overcome. Killer cells stimulate a specific and amplified immune defense against
the degenerate tumor cells. The goal of the development of such specific vaccine
serums is the replacement of endogenous peptides with analogues that can
31.7 MHC Molecules: Where the Immune System Presents Peptide Fragments 805

Fig. 31.21 Crystal structure of the complex of an MHC-I molecule with a bound nonapeptide
Leu–Leu–Phe–Gly–Tyr–Pro–Val–Tyr–Val (gray) and the T-cell receptor: (a) total structure,
(b) peptide-binding site. The MHC molecule is composed of a heavy chain with the domains α1,
α2, and α3 (violet) and a light β2m chain (blue). The pleated-sheet structure formed by the α1/α2
domains forms a bowl that is open at the top and is bordered by two long, parallel-oriented α-helices
(yellow). It accommodates the antigen peptide fragment and presents its upper face to the T-cell
receptor. This heterodimeric receptor, made up of an α-chain (light-blue) and a β-chain (gray), also
has a pleated-sheet-like geometry. It recognizes the amino acid tyrosine in position-5 of the
antigen peptide with its hypervariable loops CDR3α and CDR3β.

provoke the same or an exaggerated immune stimulation, but that also have
much better stability and bioavailability because of the incorporation of non-
proteinogenic amino acids or peptidomimetic groups.
First the architecture of the complex of an MHC-I molecule with a presented
peptide and the T-cell receptor shall be considered in greater detail (Fig. 31.21).
The MHC-I molecule is composed of a heavy (ca. 360 amino acids) and a light
chain (90 amino acids). The heavy chain is anchored to the membrane and is
constructed from three domains α1, α2, and α3. The α1 and α2 domains form
a sort of bowl, the base of which is made up of a six-stranded antiparallel
pleated sheet. The bowl is edged by two long helices oriented parallel to one
another. A crevice that accepts the antigen peptide fragment opens between the
helices. Peptides with a length of 8–10 amino acids fit in this area. Peptide binding occurs
largely because of hydrogen bonds to the N and C termini. MHC molecules are
highly polymorphic in their amino acid composition despite sharing the same architecture,
so interactions with the peptide backbone are primarily responsible for
binding in the crevice, and these interactions can be formed by peptides in general.
In the middle sequence segment, the antigen peptide protrudes slightly out of the
binding pockets of the α1/α2 domains. Residues at the beginning and end of the
oligopeptides orient in the small pockets of the MHC molecule. They determine
the binding affinity of each peptide to the protein. The residues in the center that
bulge out do not contribute much to the binding to the MHC molecule, but they
are decisive for the recognition and interaction with the T-cell receptor. The
sequence of the β-chain is virtually invariant, and most genetic modifications
occur in the α-chain. Furthermore, polymorphisms (▶ Sect. 12.10) have been
discovered there that vary from individual to individual. This is what the tissue
compatibility between donor and recipient in the case of organ transplantation
depends on. The susceptibility to infection and autoimmune disease can also find
an explanation in these variations.
MHC molecules bind the antigen peptide based on its sequence. They force it into
an extended conformation and expose the peptide’s central amino acid residues to the
exterior for molecular recognition by a T-cell receptor. The T-cell receptor is
a heterodimeric transmembrane glycoprotein that occurs exclusively on T cells. It
is constructed from an α- and a β-chain. The folding pattern of the two chains
is reminiscent of the structural construction of the light chains in antibodies
(▶ Sect. 32.3). The antigen-binding site is found in the loop area between the
individual pleated sheets of the domains. These loops are hypervariable and determine
the recognition properties of each receptor. The receptor lies diagonally over the
peptide-binding site in complex with the MHC molecule. With its variable loops, above
all CDR3α and CDR3β, it covers the amino acid residues of the antigen peptide
that point away from the MHC molecule. At the same time, the T-cell receptor is in
contact with the surface portions of the flanking helices of the MHC molecule.
The design of peptidomimetics as candidates for a vaccine therapy to stimulate
the immune defenses shall be illustrated using the case of the melan-A/MART-1
antigens. These antigens are presented on the surface of melanoma tumor cells by
an MHC-I complex. The nonapeptide Ala–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val
and the decapeptide Glu–Ala–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val 31.41 were
isolated from melan-A in patients with this disease (Fig. 31.22). Both oligopeptides
bind with low affinity to the MHC molecule. Replacing alanine with leucine
in position-2 of the decapeptide significantly increases the binding affinity. The leucine-carrying
peptide 31.42 is significantly more immunogenic than 31.41. It was
therefore chosen for a clinical vaccine study on melanoma patients. As a peptide
though, it has low stability in the organism and is quickly degraded. Therefore the
group of Francine Jotereau and Stéphane Quideau in Bordeaux, France adopted
the goal of developing a peptidomimetic. It should show the same binding affinity to

31.41 Glu–Ala–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val
31.42 Glu–Leu–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val
31.43 β-Ala–Leu–Ala–[N-(2-aminoethyl) unit with CO–CH2-indolyl group]–[AMBA]–Leu–Thr–Val
[Structural formulas not reproduced]

Fig. 31.22 Decapeptide 31.41, which was isolated from patients, was optimized to 31.42
by exchanging an alanine for a leucine in position-2 to develop peptidomimetics as candidates
for an immune-stimulatory vaccine for melanoma. It served in clinical vaccine studies. By
the stepwise replacement of building blocks in 31.42, a peptidic lead structure could be modified
into a stabilized peptidomimetic that binds identically to the MHC molecule but provokes
an amplified immune defense by binding to the T-cell receptor. Four modifications were under-
taken in the stepwise development of 31.43. The N-terminal glutamic acid was exchanged
for a β-alanine (red) to improve stability. The replacement of the Gly–Ile unit by a
2-aminoethylene (blue) together with a change to a CO–CH2-indolyl group increased the immune
response on the T-cell receptor. The exchange of the second Gly–Ile unit for a peptidomimetic
moiety, 3-aminomethylbenzoic acid (green, AMBA) allowed the peptide backbone to take
the same course.

the MHC molecule, have the same or better affinity to the T-cell receptor, and be
significantly more stable. Because only a crystal structure of the binary complex of
31.42 with the MHC molecule, without the T-cell receptor, was available, a model
was developed with the help of a ternary complex with a structurally similar peptide.
The glutamate residue in the first position was to be replaced by a peptidase-
stable moiety. The choice fell on β-alanine, which is barely proteolytically

Fig. 31.23 Modeled binding geometry of the reference peptide 31.42 (brown) superimposed with
the peptidomimetic 31.43 (green) in the binding pocket of the ternary complex with the MHC
molecule (yellow) and the T-cell receptor (α-chain is light-blue, β-chain is gray). The receptor
recognizes the side chain of the first isoleucine or the CO–CH2-indolyl group with the hypervariable
loops CDR3α and CDR3β. The introduced AMBA building block allows the course of the
peptide chain to remain unchanged and fills the volume of the replaced Ile side chain with its
phenyl ring.

cleavable. The leucine in position-2 and the valine in position-10 should be retained
because they are decisive for anchoring to the MHC molecule. A spatially conserved
orientation of the backbone scaffold was anticipated. The group proceeded stepwise.
The amino acid in position-5, which is an isoleucine in the reference peptide 31.42,
seemed to be critical for the interaction with the CDR3 loop of the T-cell receptor.
The sec-butyl group of isoleucine was replaced by an aromatic moiety, whereby an
indole group proved to be optimal. Next an attempt was made to change the central
peptide bonds of the Gly–Ile–Gly–Ile motif by reduction. Finally, an N-(2-
aminoethyl) bridge was chosen for this segment. The second Gly–Ile unit could be
replaced by the known peptidomimetic group 3-aminomethylbenzoic acid (AMBA).
The peptidomimetic 31.43 was obtained as a result of this optimization, which
displays virtually the same binding affinity as the reference peptide 31.42 to the
MHC molecule. Its modeled binding mode is shown in Fig. 31.23. This compound
provoked the most intense release of γ-interferon in an assay to test the immune-
response stimulation. This is probably a result of the more intense interaction with
the T-cell receptor. Further development must demonstrate whether 31.43 repre-
sents a promising lead structure for the development of peptidomimetic vaccines for
an immune therapy for melanoma tumors.
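
The sequence relationships among the melan-A peptides described in this section can be checked with a few lines of code. The following Python sketch is only an illustration and is not part of the original study. The helper name substitutions is hypothetical and was introduced here. It compares the isolated decapeptide 31.41 with the optimized peptide 31.42 position by position and checks that the nonapeptide corresponds to 31.41 without its N-terminal glutamate.

```python
# Peptide sequences as given in the text and in Fig. 31.22 (three-letter code).
PEPTIDE_31_41 = "Glu-Ala-Ala-Gly-Ile-Gly-Ile-Leu-Thr-Val".split("-")
PEPTIDE_31_42 = "Glu-Leu-Ala-Gly-Ile-Gly-Ile-Leu-Thr-Val".split("-")
NONAPEPTIDE = "Ala-Ala-Gly-Ile-Gly-Ile-Leu-Thr-Val".split("-")


def substitutions(reference, variant):
    """Return (position, old, new) for every residue that differs.

    Positions are counted from 1, as in the text. This hypothetical helper
    only handles peptides of equal length (no insertions or deletions).
    """
    if len(reference) != len(variant):
        raise ValueError("peptides must have the same length")
    return [
        (position, old, new)
        for position, (old, new) in enumerate(zip(reference, variant), start=1)
        if old != new
    ]


diffs = substitutions(PEPTIDE_31_41, PEPTIDE_31_42)
for position, old, new in diffs:
    print(f"position {position}: {old} -> {new}")

# The nonapeptide is the decapeptide 31.41 without its first residue.
nonapeptide_matches = NONAPEPTIDE == PEPTIDE_31_41[1:]
print("nonapeptide = 31.41 minus Glu1:", nonapeptide_matches)
```

The comparison reports a single exchange, alanine to leucine in position 2, which is the anchor position that the text describes as decisive for binding to the MHC molecule.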

31.8 Synopsis

• Integrin receptors are responsible for the bidirectional communication between
cells. They are cell-surface exposed and possess intra- and extracellular domains
of complex architecture. Upon activation, these receptors undergo a series of
sequential conformational transformations.
• Integrins are formed as heterodimers of 18 α and 8 β subunits. Transition to the
active conformation makes the MIDAS binding site available, which is com-
prised of divalent calcium and magnesium ions. The activated receptor is
accessible for interactions with other proteins.
• In the case of the αIIbβ3 receptor, which is present on the surface of platelets,
recognition of an Arg–Gly–Asp (RGD) motif leads to activation and subse-
quently to platelet aggregation, an initial step in thrombus formation.
• Blockage of the surface receptors on blood platelets leads to an arrest in the
coagulation process. Compounds either of cyclic peptide or open-chain
peptidomimetic structure comprising a mimic of the RGD motif have been
successfully designed as potent fibrinogen-receptor antagonists for the therapy
of thrombotic events.
• To defend against pathogens during inflammatory processes, leukocytes, which
are transported through vessels with the blood stream, have to be stopped and
fixed through sugar–protein interactions with selectins.
• Intervention in the inflammatory cascade can help in disease situations resulting
from damage by excessive leukocyte infiltration. Inhibitors of the selectins on
the surface of leukocytes have been developed as low-molecular-weight surro-
gates of the corresponding sugar moieties on the endogenous PSGL-1 proteins.
• Viruses gain entry into host cells by docking and subsequently merging their
envelope with the host cell membrane. First contact with the HI virus occurs via
the CD4 receptor and is assisted by contacts to cytokine receptors. A trimeric
helical bundle with a sewing-pin-like structure accesses the host cell as a kind of
warhead and a helical peptide stretch refolds and zips together to initiate the
fusion process. Structurally similar peptides such as enfuvirtide can block the
zipping and thus work as fusion inhibitors.
• Influenza is a viral disease caused by influenza viruses. These enveloped viruses
exhibit the docking protein hemagglutinin, the glycosidase neuraminidase, and
the M2 proton channel on their surface. For hemagglutinin and neuraminidase
various subtypes (H1–H16, N1–N9) are known, and they constantly vary, giving
rise to antigen drift and shift. They force the immune system to constantly adapt
and produce new antibodies.
• The influenza virus conquers the host cell, exploits its machinery for reproduc-
tion, and, after reassembling, buds from the host cell. Final detachment from the
host cell is catalyzed by its neuraminidase function, which cleaves the terminal
galactose–sialic acid bridge. Two potent inhibitors for the catalytic cleavage site
of the glycosidase, zanamivir and oseltamivir, have been developed as antivirals.
• The common cold is caused by rhinoviruses belonging to the class of non-
enveloped single-stranded RNA picornaviruses. Their capsid, formed as
a regular 20-faced icosahedron encompassing the RNA, is constructed from four
surface proteins. They show a structured surface with deep canyons that bind to
cellular adhesion proteins. A stretched-out pocket is found underneath the
canyon to which small molecules can be bound. They induce a small shift in
the canyon, and this prevents efficient binding to the cell-adhesion proteins.
• Even though pleconaril showed efficacy in fighting the common cold, FDA
approval was not granted due to the risk of interference with oral contraceptives.
• The cellular immune system can scan cells for the characteristic of being “healthy”
or “diseased” via interactions of surface-exposed MHC molecules and hypervar-
iable recognition loops of the CD8+ T-cell receptor on T lymphocytes.
• Antigen-presenting MHC molecules are loaded with 8–10-residue-long peptide
sequences originating from protein degradation in the proteasome. This way,
peptide fragments from foreign (viral attack) or altered proteins are also exposed.
• To use cellular immune defense mechanisms in drug therapy (e.g., for tumor
treatment), strategies have to be followed that overcome the tumor cells’ evasion
of the immune response. This can be achieved by applying peptidomimetic
surrogates of the MHC-exposed peptide stretches as vaccines that stimulate the
immune response to tumor cells.

Bibliography

General Literature

Andronati SA, Karaseva TL, Krysko AA (2004) Peptidomimetics – antagonists of the fibrinogen
receptors: molecular design, structures, properties and therapeutic applications. Curr Med
Chem 11:1183–1211
Chhabra SR, Abdul Rahim AS, Kellam B (2003) Recent progress in the design of selectin
inhibitors. Mini Rev Med Chem 3:679–687
De Palma AM, Vliegen I, De Clercq E, Neyts J (2008) Selective inhibitors of picornavirus
replication. Med Res Rev 28:823–884
Doranz BJ, Baik SW, Doms RW (1999) Use of a gp120 binding assay to dissect the requirements
and kinetics of human immunodeficiency virus fusion events. J Virol 12:10346–10358
Kolata G (2001) The story of the great influenza pandemic of 1918 and the search for the virus that
caused it. Touchstone, New York
Lazoura E, Apostolopoulos V (2005) Rational peptide-based vaccine design for cancer immuno-
therapeutic applications. Curr Med Chem 12:629–639
Matthews T, Salgo M et al (2004) Enfuvirtide: the first therapy to inhibit the entry of HIV-1 into
host CD4 lymphocytes. Nat Rev Drug Discov 3:215–225
Shimaoka M, Springer TA (2003) Therapeutic antagonists and conformational regulation of
integrin function. Nat Rev Drug Discov 2:703–716
Somers WS, Tang J, Shaw GD, Camphausen RT (2000) Insights into the molecular basis of
leukocyte tethering and rolling revealed by structures of P- and E-selectin bound to SLeX and
PSGL-1. Cell 103:467–479
von Itzstein M (2007) The war against influenza: discovery and development of sialidase inhib-
itors. Nat Rev Drug Discov 6:967–974

Special Literature
Douat-Casassus C, Marchand-Geneste N, Diez E, Gervois N, Jotereau F, Quideau S (2007)
Synthetic anticancer vaccine candidates: rational design of antigenic peptide mimetics that
activate tumor-specific T-cells. J Med Chem 50:1598–1609
Garboczi DN et al (1996) Structure of the complex between human T-cell receptor, viral peptide
and HLA-A2. Nature 384:134–141
Jiang S, Zhao Q, Debnath AK (2002) Peptide and non-peptide HIV fusion inhibitors. Curr Pharm
Des 8:563–580
Kim CU, Lew W et al (1997) Influenza neuraminidase inhibitors possessing a novel hydrophobic
interaction in the enzyme active site: design, synthesis and structural analysis of carbocyclic
sialic acid analogues with potent anti-influenza activity. J Am Chem Soc 119:681–690
Kim CU, Lew W et al (1998) Structure–activity relationship studies of novel carbocyclic influenza
neuraminidase inhibitors. J Med Chem 41:2451–2460
Kolatkar PR et al (1999) Structural studies of two rhinovirus serotypes complexed with fragments
of their cellular receptor. EMBO J 18:6249–6259
Kranich R, Busemann AS et al (2007) Rational design of novel, potent small molecule pan-selectin
antagonists. J Med Chem 50:1101–1115
Ku TW, Ali FE, Barton LS et al (1993) Direct design of a potent non-peptide fibrinogen receptor
antagonist based on the structure and conformation of a highly constrained cyclic RGD
peptide. J Am Chem Soc 115:8861–8862
Smith PW, Sollis SL et al (1996) Novel inhibitors of influenza sialidases related to GG167. Bioorg
Med Chem Lett 6:2931–2936
Williams MA, Lew W et al (1997) Structure–activity relationships of carbocyclic influenza
neuraminidase inhibitors. Bioorg Med Chem Lett 7:1837–1842
Zablocki JA, Rico JG, Garland RB et al (1995) Potent in vitro
and in vivo inhibitors of platelet aggregation based upon the Arg-Gly-Asp sequence of
fibrinogen. (Aminobenzamidino)succinyl (ABAS) series of orally active fibrinogen receptor
antagonists. J Med Chem 38:2378–2394
Zhang Y et al (2004) Structural and virological studies of the stages of virus replication that are
affected by antirhinovirus compounds. J Virol 78:11061–11069
32 Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs

The importance of peptides, proteins, sugars, and nucleotides for functional pro-
cesses in our bodies has been discussed in many chapters in this book. An attempt
can be made to regulate or intervene in the processes in which these endogenous
substances are involved by using exogenous, low-molecular-weight drugs. On the
other hand, the question can be raised as to whether the administration of endog-
enous biomolecules themselves might be a promising therapeutic concept in the case of
some diseases. This is especially true for diseases in which a particular endogenous
substance is insufficiently produced by the organism, or is produced but is not
functional, for instance, because of an amino acid mutation. Only gene technology
methods (▶ Chap. 12, “Gene Technology in Drug Research”) opened up the prospect
of selectively producing polypeptides and proteins with specific characteristics in
adequate quantities.
As part of a strategy to use endogenous proteins and peptides as drugs, it can be
reasonable to slightly modify the native substances to endow them with additional
properties such as a longer half-life, better stability, or higher bioavailability. A serious
problem is that peptides and proteins are often far too unstable and too poorly
bioavailable for oral application. Nonetheless, there are many promising application
areas such as the treatment of digestive disorders with the administration of lipases.
The issue of bioavailability is also different for skin diseases than it is for oral
application and systemic drug use. Even the skin, however, has a protective enzymatic
barrier that sensitive biomolecules cannot easily overcome. In hospital drug use, the
treating physician can easily choose an intravenous application for which this issue is
less critical. This problem shall be discussed in more detail by using the example of
insulin, the daily exogenous administration of which is essential for diabetics.
Another pharmaceutical concept with regard to the application of exogenously
administered biomolecules exploits the principle of the body’s own immune
defense. The body uses macromolecular structures for the recognition and targeted
deactivation of pathogenic substances. A drug therapy can copy this principle to
fight pathogens or malignant cells according to the same concept. These antibody
proteins from the humoral defense system are not orally bioavailable because of
their size and require intravenous application.

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5_32,
© Springer-Verlag Berlin Heidelberg 2013

The recognition of an oligonucleotide or a segment of DNA or RNA is decisive for
many biological processes. Transcription factors (▶ Sect. 28.2) act here, as do
enzymes that undertake the transcription of DNA into RNA, or that transcribe RNA and
DNA into one another. DNA is a huge molecule that can only be stored in the cell if
efficiently packaged. In ▶ Sect. 12.14 we learned that it could be, for instance,
wrapped around histone proteins. It is brought into a compact form in bacterial cells
by over-spiralization, that is, by adding additional turns in the DNA. For this, a strand
must be broken, which is catalyzed by the enzyme topoisomerase with consumption of
ATP. The function of such proteins, which require an exact recognition of DNA or
RNA, can be blocked by drugs that act as a molecular wedge in the complex structure.
Further drug concepts have been developed in the area of nucleotides that inter-
vene in the translation of RNA or transcription of DNA. For this, oligonucleotide or
nucleoside analogues have been developed. The goal of the transcription into mRNA
is the final translation into the amino acid sequence of a protein in the ribosome.
Microorganisms have developed multifaceted strategies to outcompete rivals, which
are often other microorganisms or parasites. They have a multienzyme complex that
allows the construction of complex, often macrocyclic compounds according to
combinatorial principles. They achieve their cyclic construction in the form of larger,
multimembered rings through lactone formation. Many of these substances, termed
macrolides, block the ribosomes of hostile organisms. Other members of this com-
pound family inhibit cell-cycle processes. Here too, it is possible to use these natural
products, sometimes with minor chemical modification, as biologicals in therapy.
A few showcase examples of such biologicals will be discussed in this chapter.

32.1 Gene-Technological Production of Proteins

Endogenous proteins have long been used in substitution therapy. Earlier,
material from animal pancreases was used as an insulin source for the therapy of
diabetes mellitus; this insulin was different from human insulin by one amino acid
(porcine insulin) or three amino acids (bovine insulin). Although these insulins are
suitable for therapy, and there are techniques to exchange the structurally deviant
amino acid of porcine insulin for that in human insulin, all of the slaughterhouses
in the world would not be enough to supply all diabetics with the necessary
insulin. Factor VIII deficiency in hemophiliacs used to be compensated for by
blood transfusions. Today, recombinantly manufactured proteins are exclusively
used because the possibility of contamination with viruses is too great with products
taken, e.g., from human blood. Often it was recognized much too late that the factor
VIII batches were infected with hepatitis viruses and HIV, the causative agent of
AIDS. Therefore efforts were made very early on to produce human proteins by
using gene technology. The first protein to be produced in this way was human
insulin from the bacterium Escherichia coli, which was introduced into therapy by
Eli Lilly in 1982. Although Hoechst also had worked out a promising method for
industrial manufacturing, their production could not begin until 1994. It took that
long in Germany until all of the objections to the manufacturing license were
Table 32.1 Important gene-technologically manufactured proteins in the pharmaceutical industry, their applications, and the manufacturers (at the time the drug was launched).
Drug                                           Indication                                       Manufacturer
Insulin                                        Diabetes                                         Novo-Nordisk, Eli Lilly, Hoechst, and others
Growth hormone                                 Growth disorders                                 Pharmacia, Novo-Nordisk, Eli Lilly, and others
Hepatitis B vaccine                            Vaccine                                          SmithKline Beecham, Merck & Co.
Tissue plasminogen activator                   Thrombolysis                                     Genentech, Boehringer Ingelheim
α-Interferon                                   Viral hepatitis, leukemia, diverse tumors, AIDS  Sumitomo, Schering-Plough, Roche, and others
Erythropoietin (EPO)                           Anemia of renal failure                          Amgen, Johnson & Johnson, Chugai, and others
Factor VIII                                    Hemophilia                                       Baxter, Cutter/Miles (Bayer)
Granulocyte colony-stimulating factor (G-CSF)  Chemotherapy                                     Amgen, Chugai, Sankyo, Immunex, and others
Glucocerebrosidase                             Gaucher’s disease                                Genzyme

addressed. Afterward, gene-technologically manufactured insulin, human growth
hormone, a hepatitis B vaccine, tissue plasminogen activator, and many other
proteins followed. An overview of the most important proteins to be manufactured
in the pharmaceutical industry by gene technology is given in Table 32.1.
The production in bacteria and cell cultures is only one possibility to manufacture
human proteins. For a few years, efforts have been made to introduce genetic
information into animals. In the Netherlands, Herman the bull has achieved fame.
The animal, which died in 2004, carried the information in his genome for human
lactoferrin, a component of human milk that protects small children from gastroin-
testinal infections. Patients with weakened immune systems, for example, because of
AIDS or during chemotherapy, can also benefit from lactoferrin. Herman’s female
offspring produce milk that contains lactoferrin. The company Genzyme Transgenics
has bred transgenic sheep. Their milk contains tissue plasminogen activator (tPA),
which is used to dissolve blood clots, in quantities of up to 3 g per liter.
This reduces the production costs from a few hundred dollars
per gram in cell culture to a few dollars. The next step could be the transfer of genetic
information into agricultural crops. Imagine a field of sugar beets that can produce
insulin or another human protein in large quantities!
In addition to the production of proteins to treat diseases in which a particular
protein must be replaced, and the already-mentioned application in drug research
(▶ Chap. 12, “Gene Technology in Drug Research”), gene technology plays an
important role in:
• Antibodies and vaccines
• Enzymes for medical diagnostics
• Proteins for biosensors
• Proteins for biotechnological processes, for instance, the enzymatic production of
optically active intermediates (▶ Chap. 5, “Optical Activity and Biological Effect”)

32.2 Tailored Modifications to Insulin

Diabetes is caused by a deficiency in the hormone insulin, which is produced in the
pancreas. This polypeptide is made up of two chains with 21 (A-chain) and 30
(B-chain) amino acids that are cross-linked by three disulfide bridges. It acts as an
agonist on the insulin receptor, which is structurally related to the group of growth
hormone receptors (▶ Sect. 29.8). Frederick Banting and Charles Best isolated insulin
in a pure form for the first time in 1921. Two years later it was successfully used
therapeutically when a 13-year-old boy was rescued from certain death. Since this time,
insulin has been used therapeutically as isolated from the pancreas of pigs and cattle.
Since 1982, insulin manufactured by gene technology has been available. Companies
that produce insulin by gene technology have taken up concepts to improve the prop-
erties of human insulin through targeted modifications. Longer-acting insulin has
been of particular interest. Such a depot effect can be achieved, for instance, by
decreasing the solubility of the protein. Because of its predominantly acidic amino
acids, insulin’s isoelectric point is at pH = 5.5. Because the solubility of a peptide or
protein is the lowest at its isoelectric point, a shift in this value to the neutral point at
pH = 7 should lead to a decrease in the solubility of insulin at physiological pH. In practice, this concept can
indeed be used successfully. The introduction of an additional arginine in the B-chain
(ArgB31) does not change the biological properties, but it does lead to an intensified
depot effect. If another arginine B32 is added, this modified insulin is no longer active.
It crystallizes after injection and is therefore not available in adequate quantities. X-ray
structural analyses of the double-mutated insulin show that the crystal packing is more
stable than in normal insulin because of additional contacts. An additional amino acid
exchange that weakens this contact led to an insulin analogue with optimal depot
properties. It need only be administered once daily. The C-terminal end of the B-chain
of insulin was also modified at Eli Lilly. Amino acids were not added, but rather the
next-to-last amino acids Pro28 and Lys29 were exchanged with one another. This
double mutant has the same hypoglycemic properties as native human insulin, but
surprisingly, it acts much faster. This made a quickly absorbed, short-acting, and
therefore well-controlled insulin available. It represents a huge advantage for patients
because the time between injection and meal is significantly shorter.
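
The isoelectric-point argument made in this section can be illustrated with a rough calculation. The following Python sketch is not taken from the original work. It assumes generic textbook pKa values, a simple Henderson–Hasselbalch charge model that ignores neighboring residues and three-dimensional structure, and the commonly cited one-letter sequences of the human insulin A- and B-chains. All six cysteines are treated as disulfide-bonded and are therefore not counted as ionizable. The function names net_charge and isoelectric_point are hypothetical helpers introduced here.

```python
# Rough isoelectric-point estimate for insulin and arginine-extended analogues.
# Assumptions (not from the text): generic pKa values, no structural effects,
# and all cysteines tied up in disulfide bridges (so they are not ionizable).
PKA_POSITIVE = {"Nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"Cterm": 3.6, "D": 3.9, "E": 4.1, "Y": 10.1}

# Human insulin chains in one-letter code (commonly cited sequences).
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"           # 21 residues
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # 30 residues


def net_charge(chains, ph):
    """Henderson-Hasselbalch net charge of a set of peptide chains at a pH."""
    charge = 0.0
    for sequence in chains:
        # Each chain contributes one free N terminus and one free C terminus.
        for group in ["Nterm", "Cterm"] + list(sequence):
            if group in PKA_POSITIVE:
                charge += 1.0 / (1.0 + 10.0 ** (ph - PKA_POSITIVE[group]))
            elif group in PKA_NEGATIVE:
                charge -= 1.0 / (1.0 + 10.0 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(chains, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(chains, middle) > 0.0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


pi_native = isoelectric_point([A_CHAIN, B_CHAIN])
pi_arg_b31 = isoelectric_point([A_CHAIN, B_CHAIN + "R"])       # ArgB31
pi_arg_b31_b32 = isoelectric_point([A_CHAIN, B_CHAIN + "RR"])  # ArgB31, ArgB32

print(f"native insulin:   estimated pI = {pi_native:.1f}")
print(f"ArgB31:           estimated pI = {pi_arg_b31:.1f}")
print(f"ArgB31 + ArgB32:  estimated pI = {pi_arg_b31_b32:.1f}")
```

Under these assumptions, the estimate for native insulin should fall between pH 5 and 6, and each added arginine should move the value closer to neutral pH. The numbers depend strongly on the chosen pKa set, so they show the direction of the shift rather than a measured value.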

32.3 Monoclonal Antibodies as Vaccines, Chemotherapeutics, and Receptor Antagonists

The structure and function of our immune system, which defends against foreign
substances, so-called antigens, were introduced in ▶ Sect. 31.7. It is divided into
unspecific and specific defense. In the specific immune response, a distinction is made
between the humoral and cell-specific systems. The role of MHC molecules in
complex with the T-cell receptor as a control system to detect and cull diseased
cells among healthy ones was discussed in detail. Antibodies adopt a corresponding role in
the humoral system in that they detect foreign substances, which then are delivered
to phagocytic cells, such as macrophages, for degradation. Analogous to the

Fig. 32.1 Crystal structure of a complete IgG antibody. The two Fab regions form the left and
right branches (red, green) of the Y-shaped molecule. They are made up of a light (light color) and
a heavy chain (dark color). The antigen-binding site (light-blue arrow) is found at the end of both
branches. It is formed by eight loop regions. The Fc domain is connected through a hinge region
with multiple disulfide bridges. It forms the trunk of the Y-shaped molecule. Two chain strands
with a pleated-sheet architecture are positioned against one another here too. The schematic
construction of the antibody with the same color codes is shown below right.

cell-specific system, the humoral defense is a “learning” system. Once a foreign
substance has been recognized, it is remembered by the immune system in the form
of memory cells that can initiate an immediate response when confronted with the
same invader, even if the second contact is years later. Antibody-producing cells can
only ever produce one specific antibody. About 1012 different antibodies occur in the
human organism. Upon appearance of an antigen, only those immune cells are
selected for propagation that produce antibodies to intercept the antigen. Once
recognized, intruders are bound at their surface and delivered to the immune system
for disposal.
Antibodies have a common structural architecture. Their geometry is roughly
comparable to the form of the letter Y. The branches of the Y are made up of two
identical copies that are formed by a light and a heavy chain, which are coupled in
the center by multiple disulfide bridges (Fig. 32.1). The light chain folds into two
domains (VL and CL) with a pronounced pleated-sheet-like architecture. The heavy-chain
domains VH and CH1 adopt a very similar spatial structure. These areas are called the
Fab domains (ab for antigen binding). The trunk of the Y likewise
adopts a pleated-sheet-like construction from two chains (CH2 and CH3). It is
termed the Fc domain (c for constant). The last-mentioned domain contains the
recognition regions where the antibody is detected by the phagocytic cells.

Fig. 32.2 Comparison of the crystal structures of two Fab domains that were released by
proteolytic cleavage with papain. The eight variable loop regions that form the antigen-binding
site are represented with different colors. The structure (a) binds a small molecule as an antigen;
structure (b) on the other hand recognizes the surface of a protein as a foreign substance.

Loop areas are found at both ends of the bifurcated Y, three loops of which have
proven to be extremely variable among different antibodies in terms of their length
and sequence. Antibodies are able to offer binding sites for very different antigens
with these hypervariable loops, or complementarity-determining regions
(CDR). Like the fingers of a hand, these variable loops grasp or surround the
antigen. Eight CDR loops are shown in different colors in Fig. 32.2. Two antibody
structures are shown that, despite their almost identical folding, bind to two entirely
different antigens. One structure grasps phosphocholine 32.1, a small antigen, whereas
the other recognizes and binds the protein lysozyme (129 amino acids) via a large
surface patch (Fig. 32.3). Phosphocholine orients its charged quaternary ammonium
group to interact with two glutamic acid residues and one asparagine. The terminal
phosphate group forms H-bonds to a tyrosine and an arginine residue. The interface
between the antibody and lysozyme stretches over an area of about 20–30 Å. The
highly structured contact area takes on a shallow form. Seventeen residues of the
antibody are in direct contact with 16 lysozyme residues. Only a few antigen residues
burrow deeper into the antibody surface and form hydrogen bonds on their ends.

[Figure 32.3. (a) Phosphocholine 32.1, (H3C)3N+–CH2–CH2–O–PO3²⁻, bound to CDR1-8; (b) lysozyme bound to CDR1-8]

Fig. 32.3 The contact surfaces with the bound antigens are shown for both of the Fab domains
shown in Fig. 32.2. (a) The small molecule phosphocholine (green surface) is bound in a deep
pocket in the antibody. It penetrates the violet-colored surface of the antibody deeply. In the case
of the antigen lysozyme (b) a 20–30-Å contact surface is formed and 16 or 17 residues of both
binding partners, respectively, are involved in the interaction. An additional antigen contact
surface (green) with the antibody (violet) is in the direct vicinity.

Their ability to bind highly efficiently to chemical structures that have entirely
different sizes and compositions seems to make antibodies ideal for the detection
and culling of disease-causing foreign substances and malignant or degenerated cells.
To use them in diagnostics or therapy, they must be purposefully developed against
specific antigen surface structures and produced in adequate quantities.
The development of suitable antibodies can be accomplished in a donor organ-
ism. Antibody-producing cells can be isolated from the serum of an immunized
mammal and purified. To obtain larger quantities of antibody-producing cells, an
attempt can be made to culture the cells. Under these conditions though, the cells
grow for only a few generations, then they die. In 1975 Georges Köhler went to the
laboratory of César Milstein in Cambridge, England, to improve the production of
antibodies in cell cultures. There the idea emerged to hybridize normal antibody-
producing cells with easily reproducing tumor cells to make hybridoma cells, and
to combine the properties of both cell types in this way. Once again, serendipity
helped. Köhler decided on murine cells. Later it was discovered that these cells fuse
100-fold better with tumor cells than other cells do. The hybridoma cells produce
the desired antibodies and continue to divide for unlimited generations. They
became immortal antibody-producing cells. In the meantime, this method for the
manufacture of monoclonal antibodies has developed into a billion-dollar busi-
ness. Georges Köhler and César Milstein received the Nobel Prize. That was,
however, their entire pay. They neither patented their method, nor tried to establish
a company to profit from their invention.
Antibodies that are produced in this way are useful for medical diagnostics. On
the other hand, they can be used, for instance, to treat tumors or septic shock. In
general, they are used to fight diseases in which a protein in the body should be
neutralized. A problem occurs when antibodies are isolated from an animal organ-
ism. They can act as antigens themselves and therefore provoke an immune
response. Here, the formation of chimeric proteins, that is, combinations of
mouse antibody with parts of human antibodies, can help. The so-called humaniza-
tion, in which only the variable antigen-binding site of the mouse antibody is coupled
with a human antibody, is even more elegant. In vitro production of completely human
antibodies with certain viruses is another method. Just as with important proteins,
human antibodies can be produced in the milk of sheep. Companies such as Genzyme
Transgenics have bred transgenic sheep that produce human monoclonal antibodies in
their milk.
A bigger application field for antibodies is the prevention and treatment of
diseases with vaccines. Gene technology has contributed
decisively to progress in the production of vaccines. For example, vaccines for
hepatitis B used to be isolated from the blood of chronically infected patients, a very
laborious and dangerous technique. For a vaccine, however, the entire virus is not
needed. For the recognition by an antibody, it is sufficient to reproduce only a typical
surface segment from the envelope. The genetic information for this area is taken from
the virus and incorporated into plasmids (▶ Sect. 12.1). The envelope protein segment
is then produced just like any other protein in bacteria or in other appropriate cells.
Gene-technologically produced vaccines against AIDS and other viral and bacterial
diseases, even against parasitic diseases such as malaria, are being intensively
investigated.
Antibodies have achieved increasing importance as drugs in recent years. Well
over 200 examples are in clinical trials. Increasingly more recombinantly pro-
duced antibodies, usually with tongue-twisting names that end with “-mab” (for
monoclonal antibody) are arriving on the market. As mentioned, antibodies have
the huge advantage that they can be specifically raised against virtually any surface
structure. They then fish the corresponding antigen out of the organism highly
selectively and, once bound, deliver it to the usual degradation pathway of the
immune system via the phagocytic cells. In this way, not only are undesirable
intruders neutralized, but cancer cells or undesirable signaling and regulatory
proteins can also be removed from the organism. On the other hand, antibodies can
activate or block a cell-specific receptor as a proteinogenic signal molecule, just as a drug
would. They can also be exploited as a sort of tracking hound: combined with a
sophisticated molecular ferry, they can transport an active molecule to the site of
action. Once arrived, the transported molecule is released in a very high local
concentration to exert its action.
One disadvantage of antibodies should not go unmentioned. For them the cell
membrane represents an insurmountable barrier in almost all disease processes.
They are limited to the recognition of structures on the cell’s surface or they can
only be raised against extracellularly occurring substances. If an intervention in
intracellular processes is desired, the antibodies must interact with cell-surface receptors
that stand at the start of a signaling cascade. For this, of course, it must be
ensured that they bind specifically to the correct receptor, on the desired cell, in the
right compartment of our bodies.

[Figure 32.4, panels a–c. Labels: ligand, growth factor receptor, homodimerized receptor, membrane (exterior/interior), tyrosine kinase domain, activated tyrosine kinase, deactivated tyrosine kinase]
Fig. 32.4 (a) The epidermal growth factor receptor is stimulated by binding a macromolecular
ligand. (b) Autophosphorylation initiates the intracellular tyrosine-kinase cascade and the signal is
transmitted into the cell. (c) A specific antibody that was raised against the surface structure of the
receptor can bind so tightly to the receptor that it blocks the uptake of the natural ligand. The signal
cascade is antagonized, and the signal transmission does not occur.

Cetuximab is an antibody that is manufactured by gene technology. It was raised
against the epidermal growth factor receptor as its antigen and blocks
this receptor. This receptor is overexpressed on solid tumor cells. Once the antibody
arrives at its site of action, it binds specifically to the extracellular domain of the
receptor (Fig. 32.4). There it completely blocks the docking of the endogenous ligand:
the epidermal growth factor. The subsequent steps of the signaling cascade that
stimulate cell division are then turned off. In addition to this therapeutically indicated
signal transduction inhibition, these cells, which are labeled with the antibody, are
delivered to phagocytic cells for disposal. Many antibodies exert their effect according
to this strategy, which is described by using cetuximab as an example. Aside from the
therapy of cancer, an attempt is made to quench excessive or undesired immune
reactions with antibodies. Basiliximab and daclizumab were raised against the
α-subunit of the IL-2 receptor on T cells. It is only after they are activated that
T cells form the α-subunit. The binding affinity for IL-2 then also increases significantly.
After organ transplantation, T cells are activated by the presentation of the organ
donor’s antigens. Processes that finally lead to rejection of the transplanted organ are
initiated. If a specific antibody against the α-subunit of the IL-2 receptor is given, it is
possible to selectively eliminate only the immune response to the transplanted organ.
The tumor necrosis factor, TNF-α, is an important indicator in the pathogenesis of
infectious and neoplastic diseases, including autoimmune disease. The elimination or
suppression of this factor represents a beneficial therapeutic option for these diseases.

Fig. 32.5 An antibody raised against the surface protein (red) from tumor cells (orange) finds
such cells in the organism. If a metal ion chelator carrying a radioactive isotope is covalently
coupled to the antibody, a radiation source can be specifically brought into the direct vicinity of the
malignant cancer cell. Ionizing radiation from the nuclear decay is released, where it exerts its
tissue-destroying effects locally. The tumor tissue is treated by radiation therapy directly at the site
to fight the tumor.

TNF-α binds to a receptor that is constructed from three identical domains. To
deactivate TNF-α, the extracellular, soluble ligand-binding site of this receptor is
cut out and fused with the Fc part of an antibody. A soluble hybrid protein is obtained
that recognizes TNF-α specifically through the original receptor domain and, analogous
to an antibody, delivers it to the phagocytic cells by means of its Fc domain. These
hybrid molecules have been introduced to therapy by Amgen under the name
etanercept (Enbrel®) for the treatment of autoimmune rheumatic diseases and severe
forms of psoriasis. Another option is to isolate purely human antibodies by phage
display and to produce them in animal cells such as CHO cells (adalimumab,
Humira®; Abbott Laboratories).
Another interesting concept to improve the efficiency of antibodies was acted
upon in cancer therapy. CD20 antigen is overexpressed on the surface of cancer
cells. A specific antibody could be developed that finds the CD20 cells in the body.
This therapeutic measure can be combined with local radiation therapy. For this, the
CD20 antibodies are marked with a radioactive metal ion. Depending on the half-life
of the isotope, a local release of ionizing radiation is achieved because of the
decay of the unstable nuclide. Depending on the nuclear reaction, either α or
β particles are emitted. This radiation can exert its tissue-destroying effects in the
direct vicinity of the tumor, and in doing so cause the tumor to die (Fig. 32.5). The
attachment of a radioactively labeled antibody to a cancer cell with the CD20
antigen ensures that high concentrations of ionizing radiation are emitted only
there. Usually ions such as copper, yttrium, rhenium, lutetium, bismuth, or astatine
are used as radioactive ions. The stable coupling between the metal ion and the
antibody is accomplished by a suitable chelating ligand. Additionally, the scaffold
of the chelating agent is equipped with a reactive group that can, for instance, be
coupled through an exposed lysine residue to the surface of the Fc domain of the
antibody via a covalent bond. Tositumomab and Ibritumomab are two antibodies
that have been raised against CD20. They carry ¹³¹I or ⁹⁰Y as radioactive sources.
This therapeutic approach couples radiation therapy with the body’s own immune
defense.
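
The dependence of the radiation release on the half-life of the isotope can be illustrated with the exponential decay law N(t) = N0 · 2^(−t/T½). The following minimal Python sketch computes the fraction of nuclei that has decayed after a given time. The half-life values (about 64 h for ⁹⁰Y and about 8 days for ¹³¹I) are approximate literature values that are assumed here and do not appear in the text above; the function name is purely illustrative. Only physical decay is considered, not the biological clearance of the antibody conjugate.

```python
# Approximate physical half-lives in hours (assumed literature values,
# not taken from the text above).
HALF_LIFE_HOURS = {
    "90Y": 64.1,          # yttrium-90
    "131I": 8.02 * 24.0,  # iodine-131, about 8 days
}


def fraction_decayed(half_life_hours: float, elapsed_hours: float) -> float:
    """Return the fraction of nuclei that has decayed after elapsed_hours."""
    if half_life_hours <= 0 or elapsed_hours < 0:
        raise ValueError("half-life must be positive and time non-negative")
    # Exponential decay law: remaining fraction = 2^(-t / T_half)
    remaining = 0.5 ** (elapsed_hours / half_life_hours)
    return 1.0 - remaining


if __name__ == "__main__":
    for nuclide, t_half in HALF_LIFE_HOURS.items():
        one_week = fraction_decayed(t_half, 7 * 24.0)
        print(f"{nuclide}: {one_week:.0%} decayed after one week")
```

A short-lived nuclide delivers most of its dose within days of binding, whereas a longer-lived one irradiates the tissue over a correspondingly longer period.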

32.4 Antisense Oligonucleotides as Drugs?

It is desirable to suppress the unwanted effects of very specific proteins in cancer,
excessive immune reactions, and septic shock, but also in other diseases such as
hypertension, emphysema, or pancreatitis. This can be accomplished in many different
ways. At the protein level, enzymes can be blocked by inhibitors, and receptors can be
blocked by antagonists or inverse agonists. These modes of action were discussed in
previous chapters in great detail. It is also possible to intervene on the DNA level
by inhibiting protein biosynthesis. Soluble (cytosolic) receptors, for example,
steroid receptors, act directly on DNA in that they regulate specific gene segments
(▶ Sect. 28.1). Agonists and antagonists of these receptors indirectly regulate the
de novo synthesis through the enzymes that are produced by the gene. In this way it is
not necessary to suppress the protein function because its biosynthesis is prevented.
As already discussed in ▶ Sect. 12.7, there is another way to block the formation
of a particular protein: by intervening on the level of the messenger RNA (mRNA).
When a protein is expressed, the double-stranded DNA is first transcribed into
mRNA. Only one of the two DNA strands makes “sense,” which means, the “sense
strand” carries the hereditary information. The corresponding mRNA is single-
stranded. After docking to the ribosome (Sect. 32.6), the base sequence is translated
to the amino acid sequence of the coded protein as the final step. This step can be
prevented by the addition of an antisense oligonucleotide that is complementary to the mRNA. If
a length of 12–28 bases is reached, the mRNA forms a double strand with the
complementary sequence. The hybrid that results from the base pairing is then
either digested by RNAse H (see below), or this sequence segment cannot be read
during the protein biosynthesis. As a result, the cell does not produce the protein.
Nature uses an analogous principle in RNA interference in which it suppresses gene
expression by using short RNA sequences of 20–23 bases: so-called RNA silencing
(▶ Sect. 12.7).
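
The principle of complementary base pairing can be sketched in a few lines of Python. The example below derives the antisense DNA sequence for a given mRNA segment and checks whether its length falls into the 12–28 base window mentioned above. The function names and the example sequence are hypothetical and serve only as an illustration; real antisense design must also consider the chemical modifications, target accessibility, and off-target binding.

```python
# Watson-Crick partners of the RNA bases, written as DNA antisense bases.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(mrna_segment: str) -> str:
    """Return the antisense DNA oligonucleotide (5'->3') for an mRNA segment (5'->3')."""
    segment = mrna_segment.upper()
    unknown = set(segment) - set(RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"not an RNA sequence, unexpected symbols: {sorted(unknown)}")
    # The two strands pair antiparallel, so the complement is read in reverse.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(segment))


def within_typical_length(oligo: str, shortest: int = 12, longest: int = 28) -> bool:
    """Check the 12-28 base window mentioned in the text."""
    return shortest <= len(oligo) <= longest


if __name__ == "__main__":
    target = "AUGGCCAAGUUCGGAUCC"  # hypothetical 18-base mRNA segment
    oligo = antisense_dna(target)
    print(oligo, within_typical_length(oligo))
```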
mRNA can be complexed with an antisense–DNA or RNA segment. This leads
to the enzymatic digestion of the mRNA. Another possibility is the preparation of
an antisense mRNA segment that competes with native mRNA in the ribosomal
protein synthesis. In viral diseases, it is possible to synthesize a complementary
oligonucleotide sequence that is targeted directly against individual viral genes.

[Figure 32.6. Structures: sugar–phosphate backbone with X = O (oligonucleotide, R = H, OH; RNA) or X = S (oligonucleotide analogues, R = H, OAlkyl); PNA (peptide–nucleic acid) backbone]
Fig. 32.6 Modifications are performed on the backbone of the oligonucleotide strand to reduce its
polarity and increase its metabolic stability. A complete exchange of the ribose phosphate chain is
accomplished by using an oligoglycine peptide strand. Such a PNA shows a high degree of
geometric analogy to the RNA strand. As the crystal structure (right) of an RNA (gray arrow
stands for the phosphate sugar strand) and PNA (orange-colored strand with the green-colored
amide bonds) double strand shows, both scaffolds can successfully hybridize with one another.

As simple as the principle of complementary complexation of a nucleic acid
sounds, it is difficult to translate into practice. Oligonucleotides (Fig. 32.6) are very
polar and highly negatively charged due to their sugar–phosphate scaffold; they
cannot penetrate the cell membrane without assistance. Their scaffold must be
chemically changed, for example, by exchanging the oxygen atoms on the phosphate
groups for sulfur. This simple modification indeed causes better nuclease stability,
but also results in poorer complexation to the complementary mRNA. In addition to
the many further modifications of this sort, for example, substitution with carbonates,
carbamates, acetals, imines, or oximes, the sugar moiety has also been chemically
modified. Methylation or methoxyethylation of the 2′-OH group of the ribose ring
leads to reduced toxicity and improved stability to RNAse H. This enzyme has the task
of degrading RNA that was needed for the gene-expression process but is of no use
thereafter. By cleaving the bond in the sugar–phosphate backbone, the nuclease
reduces the mRNA into its monomeric building blocks again. The desired higher
stability can also be achieved by the formation of a cyclic ether between the 2′-OH
group and C4′ of the ribose ring to create the so-called locked nucleic acid (LNA).
A rather extensive exchange is the replacement of the sugar–phosphate group with an
oligoglycine strand. The thus-formed peptide–nucleic acid (PNA) can form a com-
plex with the mRNA very well. The crystal structure of a double-stranded hybrid of
a DNA and a PNA strand is shown in Fig. 32.6. The PNA strand shows little toxicity
because of its high biological stability, but there are problems with its cell penetration
due to its poor solubility. Chimeric structures of LNA/PNA with DNA oligomers have
been considered as alternatives.
The important criteria that an antisense drug must fulfill are as follows:
• Simple chemical synthesis
• Adequate in vivo stability
• Good membrane permeability and distribution in the organism
• Adequate intracellular half-life
• Strong and sequence-specific binding to the target mRNA
• Good nuclease stability
• No unspecific binding to other biological macromolecules.
Antisense therapy can be applied locally as well as systemically. The local
application allows a high concentration of the antisense nucleotide at the site of action.
In 1998, fomivirsen (Vitravene®) was introduced by Novartis Ophthalmics as the first
antisense nucleotide for the treatment of cytomegaloviral retinitis. This disease occurs
as an opportunistic infection in immunodeficient AIDS patients. The compound must
be applied directly in the vitreous humor and prevents the production of viral proteins
by binding to viral mRNA. In 2002 the company discontinued its marketing for
financial reasons. Other local therapies target skin diseases such as psoriasis.
The systemic application is usually oriented toward the treatment of different
cancer diseases. Antisense nucleotides against the mRNA of the BCL-2 protein, which
is expressed in many malignant diseases, have been developed. Other approaches are
oriented against TGF-β2 (transforming growth factor β2) because this protein is not
only held responsible for the growth and metastasis of tumors but also because it
protects tumor cells from attack by the body’s own immune cells (▶ Sect. 31.7).
Moreover, antisense nucleotides are used to fight inflammatory diseases (Crohn’s
disease, ulcerative colitis, and asthma) and the metabolic syndrome. It was discussed
in ▶ Sect. 27.8 how high hopes are placed on an antisense strategy for the blockade of
the expression of phosphatase PTP-1B.
It is noteworthy that antisense–DNA technology is already well established in
plants and is an important auxiliary for elucidating specific metabolic pathways.
Here, an mRNA nucleotide is not applied, but rather an antisense DNA that is
loaded onto small gold particles and “shot” into the cell. Transcription of the
antisense DNA affords antisense mRNA, which then forms a complex with the
“right” mRNA and prevents the biosynthesis of the corresponding protein in this
way. The first gene-technologically altered food products to be generated in
this way were the long-keeping Flavr-Savr tomatoes.

32.5 Nucleosides and Nucleotides as False Substrates

As monomeric DNA and RNA building blocks, nucleosides have an analogous role
for the construction of oligonucleotides and genes as amino acids have for protein
construction. As carriers of the hereditary information and coding instructions for
protein biosynthesis, DNA and RNA are essential biomolecules for a multitude of
processes in our bodies. Interventions in the synthesis of these biomolecules, above
all in processes that are necessary for the production of larger quantities of
these molecules, can afford important principles for drug therapy. The inhibition
of these processes is especially interesting. This is primarily possible with mole-
cules that are very similar to nucleosides but that are modified at decisive positions.
As false substrates, they are indeed recognized by the enzymes as starting material
for the DNA and RNA biosynthesis, but in subsequent steps they lead to
a termination of the synthesis. An increased synthetic capacity is especially needed
in reproducing cancer cells and proliferating viruses. A restriction in the synthesis
rate of these molecules can lead to an effective strategy for the fight against cancer
diseases and viral infections.
Nucleosides are constructed from a purine (adenine and guanine) or
a pyrimidine base (cytosine and thymine in DNA or cytosine and uracil in RNA)
and a pentose. If the OH group is missing from the 2′-position of the cyclic five-membered-ring
sugar, the nucleoside is used as a building block for DNA. The
hydroxylated form serves RNA as a monomeric building block. By transforming
the exocyclic hydroxymethylene group into a phosphate ester, a nucleoside
becomes a nucleotide.
The biosynthesis of thymine was discussed in ▶ Sect. 27.3. The enzyme
thymidylate synthase transfers a methyl group onto the pyrimidine base uracil
to convert it into thymine (▶ Fig. 27.8). If a slightly modified substrate is offered
to thymidylate synthase, this molecule is indeed recognized by the enzyme and
bound, but the subsequent biosynthesis is terminated. Therefore such pyrimidine
analogues are used as chemotherapeutics in tumor therapy. Exchanging
a hydrogen atom for fluorine in the 5-position of the uracil scaffold 32.2 to give
5-fluorouracil 32.3 is initially not recognized because of the very similar sizes of
H and F (Fig. 32.7). The modified base is then metabolized via the mono- and
diphosphate to 5-fluoro-2′-deoxyuridine diphosphate. After cleavage of the phosphate
group, it is accepted by thymidylate synthase as a false substrate. There it
reacts with Cys146 by forming a covalent bond, and in doing so, it irreversibly
blocks the enzyme. Tegafur 32.4 represents a prodrug of 5-fluorouracil that is
activated in the liver by CYP 3A4. As an advantage over 5-fluorouracil, tegafur can
be administered orally as a chemotherapeutic and used on an outpatient basis as palliative
chemotherapy. Capecitabine 32.5 represents another prodrug for the treatment of
colorectal cancer. It must be activated in multiple steps in tumor tissue. After
cleavage of the carbamate group and exchange of the NH2 function for
a carbonyl group by cytidine deaminase, fluorouracil is released, which can be
further biotransformed.
Several purine base analogues have also been described such as
6-mercaptopurine 32.6 or 6-thioguanine 32.7. After biotransformation and phos-
phorylation, they competitively inhibit purine biosynthesis. Accordingly, nucleo-
sides such as fludarabine 32.8, cladribine 32.9, and pentostatin 32.10 inhibit
adenosine deaminase and are used as chemotherapeutics for leukemia.
Antivirals follow a completely different mode of action. Because viruses
store a program for their reproduction and proliferation, but lack their own
metabolism, they must exploit the infected host cell for their own purposes.

[Figure 32.7. Structures shown: 32.2 uracil-deoxymonophosphate, 32.3 5-fluorouracil-deoxymonophosphate, 32.4 tegafur, 32.5 capecitabine, 32.6 6-mercaptopurine, 32.7 6-thioguanine, 32.8 fludarabine, 32.9 cladribine, 32.10 pentostatin, 32.11 aciclovir, 32.12 thymidine, 32.13 AZT (zidovudine), 32.14 zalcitabine, 32.15 stavudine, 32.16 tenofovir]

Fig. 32.7 Nucleoside analogue inhibitors of thymidylate synthase, diverse deaminases, and
reverse transcriptase.

For this, they reprogram the host cell so that it takes on the production of the
necessary viral components. As a prerequisite, the viral hereditary information must
be introduced into the genome of the infected cell. Depending on the type of virus,
a reverse transcriptase (RT) or a DNA polymerase carries out this task. These
enzymes need RNA/DNA nucleosides as starting material for the synthesis or
translation. If a false substrate is offered as a nucleotide building block, this can
lead to the termination of the reproduction of the viral genome by the synthetic
machinery of the host cell. An effective principle for the treatment of viral
infections is therefore achieved.
The group of herpes viruses stores their genes on double-stranded DNA that is
synthesized by a viral DNA polymerase. If this viral polymerase is offered a false
substrate that is very similar to the natural nucleosides, but unsuitable to continue
the nascent chain construction, termination of the DNA synthesis will result. It is
important that this drug has adequate selectivity for the viral polymerase so that the
endogenous DNA polymerases of the host cell are not excessively inhibited in
parallel. The OH groups in the 5′- and 3′-positions are critical with respect to the
construction of the backbone of a DNA strand. Drugs that are meant to lead to
a chain termination during DNA replication are usually altered at the 3′-position
of the pentose ring. Aciclovir 32.11 was introduced in ▶ Sect. 9.5 as a prodrug for
the treatment of viral infections (Fig. 32.7). Formally, the five-membered ring of the
nucleoside is opened, and the OH group at the 3′-position is missing. Nonetheless,
the guanosine analogue is initially phosphorylated by the viral thymidine kinase
only, and therefore transformed into the 5′-monophosphate exclusively in virally
infected cells. The further transformation to the triphosphate is carried out by
endogenous kinases. Once activated in this manner, it is incorporated into the
nascent DNA strand by the viral DNA polymerase with hydrolysis of the triphosphate.
In the subsequent step, however, any further attachment of a nucleoside
building block is impossible, and chain termination occurs because the necessary
3′-OH group is absent.
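
The logic of chain termination can be made explicit with a deliberately simplified toy model in Python. It is only a sketch under strong assumptions: the class and function names are invented for illustration, and template pairing, enzyme selectivity, and kinetics are ignored entirely. The model captures a single rule, namely that a building block can be attached only if the previously incorporated unit still carries a 3′-OH group.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Nucleotide:
    base: str
    has_3prime_oh: bool = True  # False for a chain terminator


def extend_strand(incoming: Iterable[Nucleotide]) -> List[Nucleotide]:
    """Append nucleotides until one without a 3'-OH group has been incorporated."""
    strand: List[Nucleotide] = []
    for nucleotide in incoming:
        # Incorporation itself only requires the 3'-OH of the previous unit.
        strand.append(nucleotide)
        if not nucleotide.has_3prime_oh:
            break  # no 3'-OH left to attach the next building block
    return strand


if __name__ == "__main__":
    pool = [
        Nucleotide("A"),
        Nucleotide("T"),
        Nucleotide("G", has_3prime_oh=False),  # stands in for an activated terminator
        Nucleotide("C"),
        Nucleotide("A"),
    ]
    product = extend_strand(pool)
    print("".join(n.base for n in product))
```

The terminator is still incorporated into the strand; it is the following attachment that fails, which is why the false substrate must first be accepted by the polymerase.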
In so-called retroviruses, a large group of enveloped viruses, the genetic
information is stored in the form of a single RNA strand. These viruses are the
cause of a few widespread infectious diseases. They infect animals and humans, but
most are specialized on a particular host. In humans, it is above all the HI virus that
represents a deadly threat.
To reproduce, the retroviruses must transcribe their RNA into DNA and incor-
porate the latter into the genome of the host cell. For this purpose, they have the
following enzymes: a reverse transcriptase (RT) and an integrase. The principle
of a reverse transcriptase was first described in 1970 by Howard Temin and David
Baltimore independently of one another; they were awarded the Nobel Prize in
1975. The discovery toppled the previously accepted dogma that information in
biology must always flow in the direction from DNA to RNA to protein. The RT
initially synthesizes an RNA–DNA hybrid strand. For this, the enzyme uses its
DNA polymerase function. It reads the synthetic protocol, however, from its own
single-stranded RNA. Then the hybrid must be converted into a pure double-
stranded DNA. For this, the RT uses a second domain that has an RNAse
H function. Proteins with this activity are used to degrade RNA after it has already
been read in the protein biosynthesis and is no longer needed. The remaining
single-stranded DNA is finally completed to a double-stranded DNA by the DNA-
polymerase activity of the RT. The newly formed DNA with the viral construction
plan is then incorporated into the host cell’s chromosome by the integrase.

[Figure 32.8. Labels: thumb, palm, finger; p66, p51; DNA strand, RNA strand; bases: guanine, adenine, thymine, uracil, cytosine]
Fig. 32.8 Crystal structure of the HIV reverse transcriptase. The protein is made up of a p66
(purple) and a p51 (yellow) subunit. A hybrid double strand of DNA (pink) and RNA (bright-
green) is positioned in the protein structure. The palm area, where the polymerase activity of the
transcriptase is carried out, lies between the finger and thumb area.

Since its discovery and structural characterization, HIV reverse transcriptase
has been a preferred target enzyme for drug design and shall be considered in the
following section in greater detail. The enzyme is a heterodimer constructed from
a p66 and a p51 subunit (Fig. 32.8). Both subunits are coded by the gag-pol gene
and are cut out of the primary gene product by HIV protease. The p66 subunit
carries the residues for the polymerase and RNAse activity. The p51 domain is
important for the protein’s structural architecture, and it completes the binding site
for the double-stranded DNA and the DNA–RNA hybrid strand. The architecture of
the p66 subunit can be compared with the shape of a hand. It can be divided into
finger, thumb, and palm regions. To accomplish its function, the RT must undergo
significant conformational changes. The thumb and finger regions in particular must
rearrange to grasp the DNA strand and to accommodate the next nucleotide
triphosphate that is to be incorporated into the DNA sequence. The crystal structure
of HIV-RT together with the RNA–DNA hybrid strand is shown in Fig. 32.8. By
artificially anchoring the DNA strand covalently with the enzyme, it was possible to
determine the crystal structure of a tertiary complex of protein, DNA, and the newly
accepted nucleoside triphosphate (Fig. 32.9a). The nucleotide to be incorporated is
coordinated by two magnesium ions through its phosphate group and brought into
position at the end of the nascent DNA strand. The two magnesium ions that
mediate the binding are fixed in place by two aspartic acid residues, 110 and 185.

Fig. 32.9 (a) Crystal structure of reverse transcriptase with a covalently attached DNA strand.
A thymidine-5′-triphosphate (TTP) together with two magnesium ions is located in the polymerase site.
As the reaction proceeds, this TTP substrate is added to the backbone of the phosphate sugar chain
of the newly synthesized DNA. (b) Binding mode of AZT–monophosphate in the binary complex
with reverse transcriptase and the DNA strand. The AZT substrate is added to the nascent DNA
strand. In the subsequent step, the chain elongation stops because the azide group is unsuitable for
the addition of the next phosphate group.

RT can be inhibited by structurally modified nucleoside analogues. Azidothymidine
or zidovudine 32.13 (AZT) was approved in 1987 as the first HIV-RT inhibitor
(Fig. 32.7). This thymidine analogue has an azide function in place of the 3′-hydroxyl
group. Initially the modified analogue is phosphorylated and incorporated into the
DNA strand, as the natural substrate thymidine 32.12 would be (Fig. 32.9b). The
azide function orients itself into the binding region of both aspartic acids and induces
a rearrangement of these residues. When the next nucleotide is to be added to the
nascent DNA strand, chain termination occurs. Because the 3′-OH group is absent,
the next requisite phosphodiester bond of the backbone cannot be formed.
Aside from AZT 32.13, an entire series of additional nucleoside analogues
32.14–32.16 has been developed that all have either a chemically modified
3′-OH group, or lack a substituent in the 3′-position (Fig. 32.7). The open-chain
inhibitor tenofovir 32.16 already carries a terminal phosphate group. To exert its
effect, it is also converted to a triphosphate and then incorporated as a false
substrate. As with HIV protease, the high mutation rate of the virus represents
a challenge in drug development. Resistant strains emerge very quickly that render
inhibition by a potent nucleoside analogue useless. Initially, residues in the direct
vicinity of the catalytic site mutate, which causes better discrimination between the
natural and false substrates. At the site of action, both substrates are in competition
with one another. Their local concentrations as well as their binding affinities determine
whether the correct or false substrate is incorporated. Here even a small reduction in the
binding affinity of the false substrate relative to the natural one can lead to significant
effects. An additional resistance mechanism arises if the growth-terminated
DNA strand with the false substrate is phosphorolytically degraded at an accelerated
rate. Mutations that improve this degradation step have also been observed.
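The competition between natural and false substrate can be made concrete with a minimal sketch. It rests on a simplified equilibrium picture that is an assumption and is not taken from the text: each substrate is weighted by its local concentration divided by its dissociation constant, and the chance of incorporating the false substrate is its share of the total weight. The function name, the parameter names, and all numbers are hypothetical and serve only as illustration.

```python
def false_substrate_fraction(c_natural, kd_natural, c_false, kd_false):
    """Share of incorporation events that use the false substrate.

    Assumption (simplified model, not from the text): both substrates
    compete for a single site, each weighted by concentration / Kd.
    """
    weight_natural = c_natural / kd_natural
    weight_false = c_false / kd_false
    return weight_false / (weight_natural + weight_false)


# Illustrative numbers only (arbitrary units, not measured values).
wild_type = false_substrate_fraction(1.0, 1.0, 1.0, 1.0)
# Hypothetical mutant: tenfold weaker binding of the false substrate.
mutant = false_substrate_fraction(1.0, 1.0, 1.0, 10.0)

print(f"wild type: {wild_type:.3f}")
print(f"mutant:    {mutant:.3f}")
```

In this toy model, a tenfold loss in affinity for the false substrate lowers its share of incorporation events from one half to one eleventh, which illustrates why a modest change in discrimination can already undermine a nucleoside analogue.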

32.6 Molecular Wedges Destroy Protein–Nucleotide Recognition

A second HIV-RT inhibition mechanism was initially discovered by serendipity in
screening and later elucidated. It causes an allosteric enzyme blockade that is
not competitive with the natural nucleosides. A hydrophobic pocket in the palm
region of the protein can open and accommodate small organic molecules. Like
a wedge, such a molecule fixes the enzyme in a broadly open conformation that prevents
the protein from accepting the RNA–DNA hybrid strand (Fig. 32.10). In doing so, these allosteric
inhibitors do not prevent the uptake of the nucleoside triphosphate substrate, but rather
obstruct the subsequent reaction steps that cause the incorporation of the nucleotide
into the nascent DNA strand. The small, allosteric binding site is formed by aromatic
and hydrophobic residues that almost exclusively come from the p66 subunit. Inter-
estingly, the binding pocket accepts ligands that are chemically very different
(Fig. 32.11). The first-discovered inhibitors nevirapine 32.17, TIBO 32.18, and
loviride 32.19 adopt a butterfly-like geometry in the binding pocket.

[Fig. 32.10 image labels: palm, thumb, and finger regions; 32.17 Nevirapine; uncomplexed; nevirapine bound]

Fig. 32.10 Nevirapine 32.17 was discovered as an allosteric inhibitor of reverse transcriptase in
screening. The rigid molecule binds to the protein in a small, hydrophobic pocket adopting
a butterfly-like conformation (below left). Like a wedge, the occupancy of this pocket leads to
the fixation of the open conformation of the enzyme (green). The thumb and finger regions remain
far from one another. Upon binding the RNA–DNA hybrid double strand, both of these regions
must move toward one another (green → red) to grasp the double helix. The allosteric inhibitor
prevents this movement and does not allow the protein to rearrange into its active conformation.

[Fig. 32.11 structure drawings: 32.17 Nevirapine; 32.18 TIBO, Tivirapine; 32.19 Loviride; 32.20 Efavirenz; 32.21 Delavirdine; 32.22 ITU; 32.23; 32.24; 32.25 Dapivirine; 32.26 Etravirine]

Fig. 32.11 Non-nucleosidic, allosterically acting reverse transcriptase inhibitors.

Resistance mutations were observed very quickly in this allosteric binding site
too. They change the shape and the aromatic character of the binding pocket and
rapidly lead to a drop in binding affinity of the allosteric inhibitors. At Janssen
Pharmaceuticals in Beerse, Belgium, under the direction of Paul Janssen and in
close cooperation with the research group of Edward Arnold at Rutgers University
in New Jersey, a triazine or pyrimidine moiety was incorporated as a structural
element into 32.23–32.26, starting from loviride 32.19 and indolylthioureas
(ITU) 32.22 (Fig. 32.11). The new derivatives were systematically analyzed by
crystallography. To the great surprise of the scientists, different binding modes
were found for the structurally very similar derivatives 32.23 and 32.24
(Fig. 32.12). In the context of evading resistant mutants, this result is ideal. It is
distinctly more difficult for the viruses to develop effective resistance mutations
against compounds that exhibit adaptive, chameleon-like binding modes. Consequently,
the researchers exploited this behavior. Compounds were developed that
had the ability to reorient into alternative binding modes (so-called jiggling). In
addition, they had a sufficient number of conformational degrees of freedom so
that they could adapt to small changes in the enzyme (so-called wiggling), if,
for example, a small amino acid is exchanged for a larger one upon mutation.

[Fig. 32.12 image labels: Phe227, Val106, and Trp229 in both panels; compounds 32.23 and 32.24]

Fig. 32.12 The two triazines 32.23 and 32.24 block the allosteric binding site of HIV reverse
transcriptase. Surprisingly, the ligands, which have very similar chemical structures, adopt entirely
different binding modes. Clinical candidates were developed from this compound series that have
a remarkable resistance-breaking profile. This is attributed to the multiple binding modes of the
adaptive ligands able to adjust to a binding pocket that has been altered by mutagenesis.

As a result dapivirine 32.25 and etravirine 32.26 were developed that display an
impressively invariable resistance profile compared to the precursor compounds.
This example shows that adaptive inhibitors in particular have a clear advantage.
This is especially true if substances are to be developed that should have a high
tolerance profile against a broad range of mutated variants of a viral protein.
Another class of molecular wedges that destroy protein–nucleotide recognition is
the quinolone carboxylic acids, or quinolones for short. They represent an important
class of antibiotics to fight infections that are caused by Gram-negative bacteria
in particular. They attack gyrase, an enzyme that belongs to the group of
topoisomerases and catalyzes the over-spiralization (supercoiling) of bacterial DNA. This DNA
over-spiralization is caused by the addition of extra turns and is necessary to pack
the molecule in the bacterial cell as efficiently as possible. Gyrase must twist the
cyclic bacterial chromosome around itself so that the DNA is placed in the form of
a noose around the enzyme. To introduce an additional turn, the enzyme must make
a temporary break in the DNA double strand. Then the topologically lower end of
the cut strand must be moved to the upper end and reconnected. The cleavage of the

[Fig. 32.13 image labels: Tyr118 (twice), Ser78, Asp508; 3′- and 5′-ends of the DNA; moxifloxacin 32.30 (twice)]

Fig. 32.13 Crystal structure of the topoisomerase (topo IV from Streptococcus pneumoniae, gray
ribbon model) with two oligomeric DNA sequences (blue and violet) and two bound moxifloxacin
molecules (green). The protein must wrap the ring-shaped bacterial DNA around itself like
a noose for over-spiralization. The two DNA segments in the crystal structure emulate this
orientation. To achieve an extra turn in the DNA the double strand must be broken. This cut
occurs with an offset of four base pairs. In doing so, the 5′-end of the free phosphate group is
temporarily covalently attached to Tyr118. The 3′-end remains non-covalently bound in the
vicinity of the magnesium ion (near Asp508). The 1-cyclopropyl group points in the direction of
Ser78. Exchanging this residue for a Phe or Tyr leads to resistance to this antibiotic. The large
basic group in the 7-position orients itself outside the complex in the surrounding solvent.

DNA double strand is accomplished so that an offset of four base pairs occurs. The
5′-end of the freed phosphate group is temporarily coupled via a covalent
phosphoester bond with a tyrosine residue (Tyr118, Fig. 32.13). The 3′-end with
its OH group remains non-covalently bound in the spatial vicinity of one of the
acidic residues of the magnesium-binding site that is formed by Glu433, Asp508,
and Asp510.
The first representative of the quinolones was nalidixic acid 32.27, which was
introduced into therapy in 1962 for the treatment of urinary tract infections
(Fig. 32.14). The 1-alkyl-4-pyridone-3-carboxylic acid scaffold 32.28 was varied
in further drug development, especially around the 1-alkyl group and in
the 7-position with basic piperazine-like groups. The addition of a fluorine at the
6-position led to a significant improvement in the activity. Important representa-
tives of the drug class are ciprofloxacin 32.29 and moxifloxacin 32.30.
A structure determination of the protein–DNA complex with moxifloxacin was
accomplished in 2009. Two antibiotic molecules intercalate between the two
cleaved ends of the DNA (Fig. 32.13). Like a wedge, they prevent the reassembly
of the cleaved ends of the double strand. Their planar heteroaromatic scaffold is
sandwiched on either side by a guanine from the one strand and an adenine from
the other strand. The cyclopropyl group resides in a pocket that is formed by Ser78 and
Asp83. The development of resistance has been observed as a result of mutations in
these residues. Above all, the replacement of Ser78 by larger residues such as Phe or
Tyr led to a reduction in activity due to steric reasons. The basic ring substituent in the
7-position is oriented between the base pairs four positions further in the sequence

[Fig. 32.14 structure drawings: 32.27 Nalidixic acid; 32.28 (positions 1, 4, 6, and 7 marked; 1-alkyl; H or F in the 6-position; X = N, C; R1 = aliphatic heterocycle); 32.29 Ciprofloxacin; 32.30 Moxifloxacin]

Fig. 32.14 Nalidixic acid 32.27 was the first quinolone to be approved; all quinolones
have the 1-alkyl-4-pyridone-3-carboxylic acid scaffold 32.28. Variations were made
in the 1-, 6-, and 7-positions. A fluorine in the 6-position proved to be beneficial for the
activity. Aliphatic, basic heterocycles were substituted in the 7-position. Two
important representatives of this drug class are ciprofloxacin 32.29 and
moxifloxacin 32.30.

from the cleavage site and resides in a solvent-accessible region. This explains
why this group could be broadly varied in the context of quinolone development. The
6-fluorine group is oriented away from the protein and DNA; presumably its electron-
withdrawing properties are needed to optimally adjust the electron density of the
central aromatic moiety for stacking with the neighboring bases. Interestingly the
3-carboxyl and the 4-keto groups, which all quinolones have in common, are oriented
away from the above-mentioned magnesium-binding site so that an involvement of
these groups in the chelation of the metal ion seems unlikely.
Another example of such a molecular wedge that disrupts protein–DNA recog-
nition was observed in the resistance development to tetracyclines (6.13,
▶ Fig. 6.3). Tetracyclines inhibit ribosomal function, which will be introduced in
the next section. Interestingly, tetracyclines also bind to a transcription factor, the Tet
repressor, which regulates the supply of the transport protein TetA. TetA is responsible
for expelling foreign substances from bacterial cells, including tetracyclines. As
long as the Tet repressor is bound to the gene segment that codes for the transport
protein, its expression is suppressed. If, on the other hand, tetracycline binds to the
repressor, the repressor loses its affinity for the regulatory DNA segment. Similar to a switch, it
falls off the DNA, and the gene expression is initiated. The transport protein is
produced, and the antibiotic is expelled from the cell. Resistance occurs because the
tetracycline concentration in the bacterial cells that is needed to block the ribosome
can no longer be achieved.
Interestingly, tetracycline, together with a bound magnesium ion, positions itself
like a wedge between the helices of the repressor and causes a conformational
change (Fig. 32.15). The repressor works very similarly to the zinc finger that was
discussed in ▶ Sect. 28.2. As a dimer, the protein reads the two palindromic
DNA sequences with two helix–turn–helix motifs that are arranged virtually
symmetrically to one another at a separation of 36 Å. The wedging by tetracycline
causes a broadening of the separation of the helix–turn–helix motif to 40 Å.

[Fig. 32.15 image labels: 36 Å (left), 40 Å (right); structure drawing of 6.13 Tetracycline]

Fig. 32.15 Crystal structure of the Tet repressor with a bound sequence segment of DNA (left)
and an intercalating tetracycline 6.13 (right). The protein is a dimer constructed exclusively from
helices (red and green cylinders). It grasps the repressor palindrome DNA sequence segments with
its C2-symmetrical helix–turn–helix motif. A tetracycline molecule pushes between each of the
two monomers of the repressor like a wedge and causes a conformational change in the helical
protein. This increases the relative distance between the two reading motifs from 36 to 40 Å, which
is too far to still be read. The repressor–DNA binding does not occur.

The DNA base sequence can no longer be correctly read. The repressor loses its
affinity for the gene segment, and the production of the transport protein is initiated.
Tetracyclines practically act as a switch and can specifically regulate the gene
expression. This property is used in molecular biology to purposefully turn on gene
expression.

32.7 Macrolides: Microbial Warheads as Potential Cytostatics, Antimycotics, Immunosuppressants, or Antibiotics

Biomolecules do not only control and regulate the function of organisms, but can also
be used as chemical weapons in the fight against competitors for survival. Microor-
ganisms such as bacteria and fungi in particular produce a multitude of unusual
substances that they use against their opponents in competition. These enemies,
which are often other bacteria and fungi, should be destroyed to win the continual
battle for limited resources. Microorganisms also endanger the health of humans.

In the time before modern drug research, infectious diseases were the main cause of
death (▶ Sect. 1.3). This makes it all the more obvious that the structure and modes of
action of these microbial weapons should be examined in detail to sound out their
potential for a drug therapy against, for example, bacterial pathogens.
Microorganisms have a unique multienzyme complex that does not exist in
humans for the synthesis of these complex, often macrocyclic substances with
peptidic character; their synthesis is independent of the ribosomal peptide
and protein synthesis described below. The compounds produced range in size from
a few hundred up to a thousand daltons. In addition to the 20 proteinogenic amino
acids, the multienzyme complex (so-called nonribosomal peptide synthesis machinery)
uses many additional amino acids and low-molecular-weight synthetic building
blocks, often with unusual stereochemistry, as starting materials. Moreover,
peptide construction and ring closures are not only accomplished by the formation of
amide bonds; ester bonds can also be formed. The multienzyme complex for these
syntheses is modular and assembled from multiple function-specific domains.
Depending on the product formed, these domains are compiled in the complex with
the necessary multiplicity. An individual module is composed of domains for the
recognition, activation, and incorporation of particular substrate components into the
desired product. They represent the basic function for the extension of the nascent
peptide. Additionally, new synthetase domains are continually being discovered that
allow deviation from a simple linear synthesis sequence. Synthetic products that result
from the use of such multienzyme complexes often display variations in the peptide
backbone that allow branching and finally macrocyclization. Another synthetic route
that also produces similarly complex and pharmacologically interesting natural prod-
ucts is the polyketide synthetic pathway. It does not use amino acids, but rather
it represents a modification of the fatty acid biosynthesis. The C2 units of
decarboxylated malonyl-CoA are used as starting materials.
Many of the compounds that are synthesized in this manner are macrocycles of
variable ring size. Relatively small rings with nine members all the way to 30- or
40-atom rings have been discovered. Macrolides with 14–16-membered rings are
especially used as antibiotics for the treatment of bacterial infections.
However, macrocycles can also intervene in entirely different mechanisms that
influence, for example, the cell cycle or the integrity of the cell membranes, or
that modulate the immune system. The macrocyclic undecapeptide ciclosporin
(▶ Sect. 10.1) made organ transplantation possible. Its administration prevents
the rejection of the donor organ as foreign tissue in the recipient. Ciclosporin acts
as an immunosuppressant in that it inhibits both the humoral and cellular immune
response and suppresses the release of interleukin-2 (IL-2) from T cells. The
absence of IL-2 release prevents the maturation of the T cells to cytotoxic killer
cells (▶ Sect. 31.7). After penetration, ciclosporin binds to the cytosolic protein
cyclophilin. The ensuing binary complex inhibits the calcium-dependent phospha-
tase activity of the calcineurin–calmodulin complex responsible for the dephos-
phorylation of an activating nuclear factor. As a consequence the migration of this
transcription factor into the cell nucleus does not occur, and the IL-2 synthesis is
blocked. Macrolides such as nystatin, natamycin, or amphotericin B associate with

ergosterol in the cell membrane of fungi. Via this antimycotic principle they
influence the membrane integrity and make the cell membrane permeable for
potassium ions. This can lead to cell demise in the corresponding fungus.
Rhizopodin, sphinxolide B, kabiramide C, and jaspisamide A interfere with actin
polymerization. In doing so they disrupt the development of the cytoskeleton and
demonstrate cytostatic effects. Zearalenone was discovered in the group of mold
toxins and shows an effect comparable to that of an estrogen.
The largest group of macrolide compounds exerts its effect against ribosomal
function. In this synthetic machinery, the genetic hereditary information is converted
into the production of new proteins. Because of its central importance for the maintenance
of all life, the ribosome has been the focus of intensive research for many years.
This large and multilayered natural complex was discovered in the 1950s, and more
than 25 years ago, work toward its crystallization and structure determination began
in the group of Ada Yonath at the Weizmann Institute in Israel. In small steps,
increasing information could be deciphered from the diffraction data about the spatial
construction of this ribonucleoprotein complex. However, the real breakthrough
came in 2000, when the crystal structure of the large 50S subunit was elucidated at
a resolution of 2.4 Å in the group of Tom Steitz at Yale University in New Haven,
CT. The group of Venkatraman Ramakrishnan at MRC in Cambridge, England, was
successful with the smaller 30S subunit and could contribute to the structure eluci-
dation of the total ribosome. The three researchers were awarded the Nobel Prize in
chemistry in 2009 for this grandiose tour de force. The first high-resolution structure
analyses were accomplished with ribosomes from the very robust microorganisms Thermus
thermophilus and Haloarcula marismortui. Recently, the ribosome from the eubacterium
Deinococcus radiodurans has proved to be an obliging and easily crystallized
workhorse. Many structure determinations of complexes with antibiotic macrolides
have been accomplished using this system (see below). It shows a high sequence
homology to the ribosomes of important pathogenic organisms.
The surprise was great after the first high-resolution structure determination.
Indeed, the ribosome is a molecular complex of proteins and nucleic acids, but
because of its catalytic function it must not be termed an “enzyme” but rather
a “ribozyme.” Proteins do not catalyze the decisive synthesis steps. It is the RNA
molecules that take on this function. This fact provides evidence that the ribosome
is evolutionarily one of the oldest catalysts in living Nature. Despite its stately size of
over two million daltons, it is highly conserved and occurs in archaea,
bacteria, and highly developed eukaryotes with great similarity. The organisms
from the three domains of life have a common origin that reaches back over
3.5 billion years! Because of its central importance for the production of proteins,
it is not surprising that the ribosome in particular has become a prominent target
structure for the chemical weapons of microorganisms. They bind to a few vulner-
able points on the ribosome, and in doing so, turn its function off. These binding
sites are in the vicinity of the mechanistic active sites.
To understand the importance of these sites in detail, the working procedure of
the ribosome must next be considered. The blueprints for our proteins are stored as
the genome on DNA (▶ Sect. 12.3). To translate this information into proteins, in

Fig. 32.16 In principle, 64 triplets can be formed with four bases: guanine (G), uracil (U),
adenine (A), and cytosine (C). In the diagram, these are oriented from inside to outside. To decode
an amino acid, begin with the central quadrant, for example, U, and then take a base from the
first ring, for example, a U again. The third base is chosen from the second, dark-gray ring. If it is
also a U, then the code is UUU for phenylalanine. Three triplets are interpreted as stop codons
(UAG, UAA, UGA). Because only 20 proteinogenic amino acids have to be encoded, up to six codons can
encrypt a single amino acid (e.g., Arg or Leu). Tryptophan (UGG) and methionine (AUG) are
encoded by a single triplet only. In a few enzymes such as glutathione peroxidase, a selenocysteine
is found in the active site. This 21st proteinogenic amino acid is encoded by the UGA codon in
certain contexts; UGA usually serves as a stop codon.

eukaryotes, a transcription onto an RNA strand must initially be undertaken, from
which the non-coding areas are then cut out. The resulting transcript, the
so-called mRNA, migrates out of the cell nucleus to be translated into protein in
the ribosome. In prokaryotes the protein synthesis can begin directly. At first
glance, the genetic code on the DNA is a pure series of four nucleobases: guanine,
adenine, thymine, and cytosine. Uracil takes on the role of thymine in RNA. Each
group of three bases codes for an amino acid, whereby multiple so-called codons can
code for the same amino acid (Fig. 32.16).
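The counting behind the genetic code can be checked with a short sketch. It builds the standard codon table from the four RNA bases and tallies how many triplets encode each amino acid. The compact one-letter string, in which `*` marks a stop codon, and the variable names are conventions chosen here for illustration; they do not come from the text.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes of the standard genetic code, ordered by
# first, second, and third base in the order U, C, A, G; "*" is a stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map every triplet, e.g. "UUU", to its amino acid, e.g. "F".
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (the degeneracy of the code).
degeneracy = Counter(CODON_TABLE.values())
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print("number of triplets:", len(CODON_TABLE))
print("UUU encodes:", CODON_TABLE["UUU"])
print("stop codons:", stop_codons)
print("codons for Arg, Leu, Trp, Met:",
      degeneracy["R"], degeneracy["L"], degeneracy["W"], degeneracy["M"])
```

The table reproduces the statements of Fig. 32.16: four bases give 4 × 4 × 4 = 64 triplets, three of them are stop codons, arginine and leucine are each encoded by six codons, and tryptophan and methionine by only one.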
The translation of base triplets on the mRNA and the synthesis of a protein take
place in the ribosome (Fig. 32.17). If a particular triplet reaches the catalytic site of
the ribosome, its tRNA counterpart molecule is recruited. The tRNA carries the so-called
anticodon in an exposed loop, where it binds complementarily to the
nucleobases of the mRNA triplet (Fig. 32.18). Each triplet in the anticodon loop
unambiguously encrypts one of the 20 proteinogenic amino acids, which is loaded
onto the 3′-end of the tRNA. Each new protein begins with a methionine. For this,

[Fig. 32.17 image labels: nascent peptide chain; AS1 and AS2-tRNA (amino acids 1 and 2); tRNA; A2451; large subunit; small subunit; E-, P-, and A-sites; mRNA]

Fig. 32.17 mRNA carries the genetic translation procedures for the synthesis of new proteins in
the ribosome on a single strand. The tRNAs are loaded with one of the 20 proteinogenic amino
acids according to the triplet in the anticodon loop. The ribosome has three tRNA-binding sites, the
A-, P-, and E-sites. The A-site picks up the aminoacylated tRNA, the P-site binds the peptidyl–tRNA,
and the discharged tRNA leaves the ribosome via the E-site. The energy required for the formation of
the polypeptide chain is supplied by coupled GTPase activity. To be correctly recognized, the
tRNA in the A- or P-site must display a complementary base triplet in its anticodon loop. A new
amide bond is formed in the ribosome’s peptidyl transferase center between the amino acids in the
P- and A-sites. The amino group of the amino acid AA2 on the aminoacylated tRNA performs the
nucleophilic attack on the carbonyl group of the amino acid AA1 of the peptidyl–tRNA.
A trigonal geometry is formed at the carbonyl carbon atom via an intermediate tetrahedral
transition state. The surrounding nucleosides, for example, A2451, are responsible for the polar-
ization and stabilization of the temporarily charged transition state.

the starting point on the mRNA is the base sequence AUG. As a result, the so-called
P-site of the ribosome has a tRNA with the pattern UAC in the anticodon loop.
This tRNA carries the amino acid methionine. The next triplet code on the mRNA
is, for example, CGC. This leads to a tRNA with the sequence GCG being taken in
at the A-site, next to the P-site. Such a tRNA is loaded with the amino acid arginine.
The two amino acids at the end of the loaded tRNAs orient in the catalytic peptidyl
transferase center (Fig. 32.19). There, peptide bond formation is catalyzed
between the two amino acids, and the first connection in the backbone of the new
protein is formed. The individual steps of the reaction mechanism are reminiscent
of the reaction sequence in proteases. However, it occurs in the opposite direction,
and the substrate recognition in the catalytic center occurs exclusively by the
nucleic acids (Fig. 32.17). After the methionine transfer, the discharged tRNA

[Fig. 32.18 image labels: clover-leaf tRNA with 3′- and 5′-ends, D loop, T loop, and anticodon loop (C U G); structure drawing of aspartyl-tRNA with the 3′-terminal adenosine]

Fig. 32.18 The schematically displayed, clover-leaf-like tRNA is composed of 80 nucleotides
that are paired into double strands over several sequence segments. Additionally, the
folded tRNA loop exhibits the spatial shape of an L, with the bases oriented to the outside.
The anticodon loop, where the encoded base triplet is found, is of particular importance. In
the example shown, the anticodon sequence is CUG, which fits to the GAC codon
and codes for aspartic acid. The base at the 3′-end is always an adenosine. The
2′-OH group of the ribose moiety is bound as an ester to the amino acid that is to be
transferred.

leaves the P-site via the neighboring E-site. The tRNA from the A-site migrates
into the neighboring P-site. This corresponds to a progression of the sequential
information on the mRNA. The emptied A-site is now occupied by a new tRNA
whose base triplet in the anticodon loop is complementary to the next triplet
sequence of the mRNA. The new protein grows according to this synthesis
sequence and leaves the ribosome via the so-called ribosomal tunnel
(Fig. 32.19). If the ribosome comes to a triplet sequence that corresponds to
a stop codon, the protein synthesis is terminated.
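The decoding cycle described above can be sketched in a few lines. The example starts at the first AUG, pairs every codon with its complementary anticodon, looks up the amino acid, and stops at a stop codon. It is a deliberately reduced illustration: the function names and the short test sequence are hypothetical, and the movement of tRNAs through the A-, P-, and E-sites is not modeled. Anticodons are written position by position under the codon, as in the text (AUG pairs with UAC).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter notation; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Watson-Crick pairing in RNA: A with U, G with C.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def anticodon(codon):
    """Base-by-base complement of a codon, in the notation of the text.

    Written in the conventional 5' to 3' direction, the same anticodon
    would appear in reverse order.
    """
    return codon.translate(COMPLEMENT)


def translate(mrna):
    """Return (codon, anticodon, amino acid) for each decoding step."""
    start = mrna.find("AUG")  # every new protein begins with methionine
    if start < 0:
        return []
    steps = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon: synthesis is terminated
            break
        steps.append((codon, anticodon(codon), amino_acid))
    return steps


# Hypothetical mini-mRNA: start codon, CGC (arginine), stop codon.
for codon, anti, amino_acid in translate("AUGCGCUAA"):
    print(codon, anti, amino_acid)
```

For the example from the text, the first step pairs AUG with UAC and delivers methionine (M), and the second pairs CGC with GCG and delivers arginine (R).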
The biosynthesis is carried out with breathtaking speed. No more than 50 ms are
needed for one synthesis cycle. As mentioned, the ribosome is a mixed complex
made from two thirds RNA and one third protein and is organized in two subunits.
The small subunit (30S in prokaryotes) is responsible for the interpretation of the
genetic code. The large subunit (50S in prokaryotes) adds the individual amino
acids to the nascent peptide chain according to the blueprints on the mRNA.
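The stated cycle time allows a rough estimate of how long the synthesis of a whole protein takes. The following sketch assumes one cycle of at most 50 ms per added amino acid and ignores initiation and termination; the chain length of 300 residues is a hypothetical example, not a value from the text.

```python
CYCLE_TIME_S = 0.050  # upper bound per synthesis cycle, as stated in the text


def max_synthesis_time_s(n_residues, cycle_time_s=CYCLE_TIME_S):
    """Upper bound on the synthesis time, assuming one cycle per residue."""
    return n_residues * cycle_time_s


def min_rate_per_s(cycle_time_s=CYCLE_TIME_S):
    """Lower bound on the number of amino acids added per second."""
    return 1.0 / cycle_time_s


protein_length = 300  # hypothetical chain length
print(f"{max_synthesis_time_s(protein_length):.1f} s for {protein_length} residues")
print(f"{min_rate_per_s():.0f} amino acids per second or more")
```

Under these assumptions, a chain of 300 amino acids is completed in about 15 s or less, which corresponds to at least 20 amino acids per second.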
As already mentioned above, the huge ribosome is blocked by antibiotics at
a few vulnerable points. Although antibiotics show distinct structural differences
among themselves, they bind in overlapping regions that are composed of ribo-
somal RNA molecules. In addition to the large group of macrolides, other ligands
with a completely different chemical structure have been found to block this region
of the 50S subunit. Among these are chloramphenicol 32.31 and clindamycin 32.32
(Fig. 32.20). Both bind in the vicinity of the peptidyl transferase center and compete
with the tRNA for the A- and P-sites. Tetracycline 6.13 and the aminoglycoside

[Fig. 32.19 image labels: tRNA E-site, tRNA P-site, tRNA A-site, rRNA, protein]

Fig. 32.19 View of the 50S ribosome subunit; the RNA portion is white, the protein portion in
light-blue. The three tRNAs in the A- (violet), P- (orange), and E-sites (red) were fit into the model
based on the crystal structure data. The black frame surrounds the peptidyl transferase center in
which the tRNAs with their anticodon loops protrude. Moreover the amide bond of the nascent
polypeptide chain is formed there. This chain (brown) leaves the catalytic site via the ribosomal
tunnel. Macrolides bind in the front part of the peptide tunnel and stop the chain synthesis after
only a few steps. The binding of structurally diverse antibiotics (space-filled, in red, green, violet,
and blue) to the involved nucleotides is indicated (Figure from Hansen et al., Molecular Cell 10,
117–128 (2002), reprinted with the kind permission of the publisher).

6.14 (▶ Sect. 6.4, ▶ Fig. 6.3) also attack the ribosome, but they inhibit the function
of the 30S subunit. Macrocyclic substances 32.33–32.38 bind at the entry to the
ribosomal tunnel, which is not far from the peptidyl transferase center. Their
inhibitory effect is exerted by blocking the growth of the nascent polypeptide.
Depending on their size, they allow the synthesis of peptide fragments of 3–7
amino acids before synthesis comes to a halt.
The most important representative from this group of compounds is erythromycin
32.33, a macrolactone with a 14-membered ring. The Filipino scientist
Abelardo Aguilar sent soil samples from the province of Iloilo to Lilly in 1949.
There, a metabolic product was isolated that showed antibiotic effects. The natural

[Chemical structures: 32.31 Chloramphenicol, 32.32 Clindamycin, 32.33 Erythromycin, 32.34 Clarithromycin, 32.35 Roxithromycin, 32.36 Azithromycin, 32.37 Dalfopristin, 32.38 Quinupristin]

Fig. 32.20 Chemical structures of a few antibiotics that bind to the 50S subunit of the ribosome.
The substances 32.33–32.38 represent macrolides.

product was marketed in 1952 under the name Ilosone®. Its total synthesis was
a challenge for the synthetic chemists. Erythromycin’s total synthesis from simple
starting materials was first accomplished in 1981 in the research group of Robert
Woodward. The compound is well tolerated, but has inadequate acid stability. The
free OH group in the 7-position reacts with the 10-carbonyl group by intramolecular
ketalization. This step initiates the substance’s degradation to products that are
inactive as antibiotics. Therefore, erythromycin must be administered in the form of
gastric acid-resistant tablets. Clarithromycin 32.34 is derived from erythromycin by
ether formation at the 7-OH group. This suppresses the instability under acidic
conditions. Analogously, roxithromycin 32.35 achieves comparable stability by an
exchange of the 10-carbonyl group for an oxime. In azithromycin 32.36, the lactone
ring is expanded to 15 members, and the carbonyl group is replaced by
a methylamino group, which is not susceptible to attack by the OH group.
The sensitivity spectrum of Gram-positive pathogens against these macrolides is
somewhat different, and this is also because of differences in bioavailability.
Erythromycin can be used well topically. Therefore it is often used for skin
diseases. Clarithromycin, roxithromycin, and azithromycin are acid-stable and
have better tissue penetration. They are often used to treat respiratory infections,
and infections of the ears, nose, and throat. Erythromycin and clarithromycin are
potent cytochrome P450 CYP 3A4 inhibitors (▶ Sect. 27.6). Therefore the metab-
olism of numerous other drugs that are metabolized by this enzyme can be blocked.
If this fact is overlooked in the dosing, a dangerous increase in the concentration of
simultaneously applied drugs can result (Fig. 32.20).
The binding modes of erythromycin 32.33 and roxithromycin 32.35 are shown in
Fig. 32.21. As mentioned, they obstruct the exit tunnel of the nascent peptide chain in
the vicinity of the peptidyl transferase center in the ribosome. The region is formed
exclusively by RNA building blocks and the binding takes place largely through
pronounced van der Waals contacts with the tunnel wall. A decisive interaction
is found in the form of an H-bond between the nucleoside adenosine 2058 and the
2′-OH group of the amino sugar moiety. The development of resistance plays an
important role in the use of these antibiotics too. Exchanging the adenine base for
a guanine causes the inhibitory potential of erythromycin to be reduced by five orders of
magnitude. For steric reasons, a guanine at position 2058 leads to a repulsive interaction
with the bound macrolide (Fig. 32.21). It is precisely this exchange that is observed in
resistant mutants of clinical pathogens. Interestingly, eukaryotes also display a guanine at
this position. This explains the good selectivity of the 14-membered macrolide for
bacterial ribosomes, which display an adenine there.
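The loss of five orders of magnitude can be translated into a free-energy penalty with the standard thermodynamic relation ΔΔG = RT ln(K_mutant/K_wild type). The sketch below assumes that the reported drop in inhibitory potential can be treated as a ratio of equilibrium dissociation constants and that a temperature of 298.15 K applies; both are assumptions made for illustration, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def ddg_from_affinity_ratio(fold_change, temperature_k=298.15):
    """Free-energy penalty in kJ/mol for a given fold loss in binding affinity."""
    return R * temperature_k * math.log(fold_change) / 1000.0


penalty = ddg_from_affinity_ratio(1e5)
print(f"A 100,000-fold affinity loss corresponds to about {penalty:.1f} kJ/mol")
```

On this estimate each order of magnitude costs a little less than 6 kJ/mol, so a single base exchange that removes one hydrogen bond and adds a steric clash can plausibly account for the observed resistance.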
In many examples in this book it has been shown how small molecules find
exactly the intended site of action among many macromolecular target molecules
based on their appropriate steric construction and also their correct placement of
interacting functional groups. Perhaps the question has occurred to some readers
as to whether the situation might occur in which two ligands exert their influence
on a target structure by synergistic binding. In fact, these cases exist. Many of
these have probably not been recognized yet, above all those in which the affinity
of both components differs strongly. The mode of action of such potentiating

[Figure labels: A2057, A2058→G, A2059, U2609, erythromycin, roxithromycin; distances 2.30, 2.99, and 3.02 Å]

Fig. 32.21 Crystallographically determined binding geometry of erythromycin 32.33 (gray) and
roxithromycin 32.35 (brown) at the beginning of the peptide tunnel near the peptidyl transferase
center. An essential hydrogen bond is formed between the 2′-OH group of the amino sugar moiety
and adenosine 2058 (green, 2.99 Å). A mutation of A2058 to guanosine (orange), causing
resistance, brings an amino group into the direct vicinity of the macrolide. A repulsive distance
of 2.30 Å (violet) indicates an unfavorable interaction. At 3.02 Å, the distance between the amino
group and the ether oxygen atom is not favorable. As a result, the binding affinity of the macrolide
to the A → G resistant mutant is reduced by five orders of magnitude.

effects has only been characterized in very few cases. One such example shall be
discussed as a final case. The macrocyclic streptogramins A and B, dalfopristin
32.37 and quinupristin 32.38, bind to the ribosome in close proximity to one
another (Fig. 32.22). Quinupristin 32.38 is positioned, like erythromycin, in
the front part of the ribosomal tunnel. In this way, very short peptides can still be
synthesized by the ribosome. Dalfopristin 32.37 additionally prevents even these
synthesis steps by its binding in the peptidyl transferase center, and the accom-
modation of the tRNA molecule does not occur. If the binding position of
dalfopristin is compared with that of chloramphenicol 32.31 (Fig. 32.23), a very
similar volume segment is occupied. The mutually enhanced binding of the two
macrolides is explained by a pronounced hydrophobic contact surface that
reduces the solvent-accessible surface. Furthermore, an altered conformation is
observed for the highly conserved, catalytically important U2585 residue. This
causes a stable distortion in the peptidyl transferase center. This additional effect
contributes to the synergistic inhibition of the ribosome when both macrolides are
bound simultaneously. Both compounds came to market as a 70:30 mixture of
dalfopristin/quinupristin in 2000 under the brand name Synercid®. The drug
represents a potent antibiotic against highly resistant bacterial strains.

Fig. 32.22 The binding sites of dalfopristin 32.37 and quinupristin 32.38 are very close to the
peptidyl transferase center (black frame). This obstructs the passage through the ribosomal
tunnel. Both macrolides are in contact with one another via a hydrophobic surface patch
(Figure from J. M. Harms et al., BMC Biology 2, 4 (2004); reprinted with the kind permission of
the publisher). [Figure labels: tRNA P-site, dalfopristin 32.37, quinupristin 32.38, tunnel]

Fig. 32.23 Dalfopristin 32.37 (red) and quinupristin 32.38 (green) are shown in their crystallo-
graphically determined binding geometries with transparent surfaces. Analogously oriented bind-
ing modes are shown for erythromycin 32.33 (gray), clindamycin 32.32 (yellow), and
chloramphenicol 32.31 (light-blue), superimposed on the structural data. Quinupristin binds in
the ribosomal tunnel analogously to erythromycin. Dalfopristin orients in the peptidyl transferase
center and blocks the uptake of tRNAs in the A- and P-sites, in an analogous manner as
chloramphenicol.

32.8 Synopsis

• Recombinantly produced proteins are used in substitution therapy, particularly if
the endogenous protein is insufficiently or non-functionally produced by the
organism.
• Diabetes is caused by a deficiency in the hormone insulin. Nowadays gene-
technologically produced insulin has also been improved in its properties by
mutational changes for either longer action or as a quickly absorbed form for
shorter action.
• Antibodies specifically recognize foreign substances via their surface properties,
bind them efficiently, and deliver them to phagocytic cells such as macrophages
for degradation. They share a common architecture roughly comparable to the
form of the letter Y. The branches and trunk are made of barrel-like pleated-
sheet geometries. The antigen-recognizing regions are found at the tips of the
branches and are composed of several hypervariable loops that form the com-
plementarity-determining regions to bind the antigen.
• The ability of antibodies to bind efficiently to nearly any chemical structure
makes them ideal for the detection and culling of disease-causing foreign sub-
stances and malignant or degenerated cells. Recombinantly manufactured mono-
clonal antibodies specifically tailored for the recognition of protein surface
determinants are used to selectively capture antigens from the organism and
to deliver them to the usual degradation pathway of the immune system.
• Antibodies can be raised against surface proteins on tumor cells or to compete
with endogenous macromolecular ligands for cell-surface receptors; once gen-
erated, these antibodies block and interfere with subsequent steps of signaling
cascades. Their specific recognition properties can be exploited for targeting
because the antibody scaffold can be chemically linked to other therapeutic
principles, for example, the local exposure of unstable nuclides for
tissue-destroying radiation therapy.
• Protein biosynthesis requires reading single-stranded mRNA. Hybridization
with short sequences of antisense oligonucleotides results in base pairing and
leads either to digestion of the resulting double strand by RNase H, or the
double strand simply cannot be read during protein biosynthesis.
• Due to their polar character, antisense oligonucleotides are insufficiently bio-
available and suffer from low chemical stability. Chemical modifications either
of the phosphate backbone, conformational locking of the ribose moiety, or
chemical functional group replacements improve their properties for successful
drug applications.
• Nucleosides and nucleotides with crucial chemical modifications can still be
recognized by enzymes as false substrates. Once bound to the catalyst, they can
be covalently attached to the active site to irreversibly block the protein, or they
can be incorporated as a building block in a polymer chain reaction. Due to
sophisticated changes in their scaffold, for instance, by placing an azide group in
the 3′-position of the ribose ring, the chain reaction is terminated and reproduction
of the viral genome is stopped.

• Aside from chemically modified nucleotides as false substrates of the polymerase
reaction, the HIV reverse transcriptase can be blocked allosterically by inhibitors
that fix the enzyme in a broadly opened conformation; this prevents recognition
of the nascent RNA–DNA hybrid strand.
• The enzyme gyrase catalyzes the over-spiralization of bacterial DNA. The DNA
has to wrap around the enzyme and through intermediate cutting and
reconnection of the backbone additional turns are introduced. The quinolones
intercalate as a kind of wedge into the cut DNA and prevent reassembly of the
cleaved ends of the double strands.
• Tetracyclines inhibit ribosomal function; however, by high-affinity binding to
the Tet repressor, they can initiate gene expression of a transport protein that
expels foreign substances including tetracyclines themselves from the bacterial
cell. As a consequence, resistance occurs because the tetracycline concentration
in the bacterial cell falls below the level needed to block ribosomal function.
• Macrolides have been developed by microorganisms to fight other bacteria and
fungi by blocking ribosomal function. The ribosome, a large ribonucleoprotein,
operates as a ribozyme and, according to the base triplets read from the single-
stranded mRNA, assembles the nascent polymer chain of the protein to be
synthesized in its peptidyl transferase center.
• Several classes of antibiotics are known that block the ribosome at a few vulnerable
points, such as in the peptidyl transferase center or the ribosomal peptide tunnel.
• Resistance to potent ribosomal inhibitors results in many cases from the
exchange of nucleosides at positions where the ribosome performs crucial
interactions with the bound inhibitors. A single adenine to guanine exchange
can cause the inhibitory potential to drop by several orders of magnitude. For
similar reasons high species selectivity can be achieved with respect to the
inhibition of bacterial or human ribosomes.
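The base-pairing principle behind the antisense approach summarized above can be illustrated with a short sketch. It derives the DNA oligonucleotide that is complementary to a given stretch of mRNA. The target sequence is invented for the example, the function name is hypothetical, and chemical modifications of the backbone or ribose are not represented.

```python
def antisense_dna(mrna_target):
    """Return the DNA oligonucleotide (5'->3') complementary to an mRNA segment."""
    pairs = {"A": "T", "U": "A", "G": "C", "C": "G"}
    # Strands pair antiparallel, so the target is read in reverse.
    return "".join(pairs[base] for base in reversed(mrna_target.upper()))


print(antisense_dna("AUGGCUUAC"))
```

The reversal reflects the antiparallel orientation of the two strands in the hybrid duplex; the complement alone, without reversal, would describe the oligonucleotide in the 3'→5' direction.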

Bibliography

General Literature

Aboul-Fadl T (2005) Antisense oligonucleotides: the state of the art. Curr Med Chem
12:2193–2214
Banting A (2004) The bittersweet science. In: Pinker S (ed), Folger T (Series ed) The New York
Times Magazine, 16 Mar 2003 or in The best American science and nature writing 2004,
Houghton Mifflin Company
Brekke OH, Sandlie I (2003) Therapeutic antibodies for human diseases at the dawn of the twenty-
first century. Nat Rev Drug Discov 2:52–62
Das K, Lewi PJ, Hughes SH, Arnold E (2005) Crystallography and the design of anti-AIDS drugs:
conformational flexibility and positional adaptability are important in the design of non-
nucleoside HIV-1 reverse transcriptase inhibitors. Prog Biophys Mol Biol 88:209–231
Dürfahrt T, Marahiel MA (2005) Peptidantibiotika vom molekularen Fließband. Nachr Chem
53:507–513
Kurreck J (2003) Antisense technologies: improvement through novel chemical modifications.
Eur J Biochem 270:1628–1644

Milenic DE, Brady ED, Brechbiel MW (2004) Antibody-targeted radiation therapy. Nat Rev Drug
Discov 3:488–498
Poehlsgaard J, Douthwaite S (2005) The bacterial ribosome as a target for antibiotics. Nat Rev
Microbiol 3:870–881
Vivet-Boudou V, Didierjean J, Isel C, Marquet R (2006) Nucleoside and nucleotide inhibitors of
HIV-1 replication. Cell Mol Life Sci 63:163–186
Yonath A, Bashan A (2004) Ribosomal crystallography: initiation, peptide bond formation, and
amino acid polymerization are hampered by antibiotics. Annu Rev Microbiol 58:233–251

Special Literature

Graham J, Muhsin M, Kirkpatrick P (2004) Cetuximab. Nat Rev Drug Discov 3:549–550
Hansen JL, Ippolito JA et al (2002) The structures of four macrolide antibiotics bound to the large
ribosomal subunit. Mol Cell 10:117–128
Harms JM, Schlünzen F, Fucini P, Bartels H, Yonath A (2004) Alterations at the peptidyl
transferase centre of the ribosome induced by the synergistic action of the streptogramins
dalfopristin and quinupristin. BMC Biol 2:4
Laponogov I, Sohi MK et al (2009) Structural insights into the quinolone–DNA cleavage complex
of type II topoisomerase. Nat Struct Mol Biol 16:667–669
Saenger W et al (2000) The tetracycline repressor—a paradigm for a biological switch.
Angew Chem Int Ed 39:2042–2052
Schlünzen F, Zarivach R et al (2001) Structural basis for the interaction of antibiotics with the
peptidyl transferase centre in eubacteria. Nature 413:814–821
Appendix

Structures, Three-Letter and One-Letter Codes for Amino Acids

[Figure: structural formulas of the amino acids; general formula +H3N–CH(R)–COO– for the Cα-substituted amino acids, with the side chain R drawn for each entry listed below. Glycine and proline are drawn in full.]

Glycine (Gly) G
Proline (Pro) P
Alanine (Ala) A
Valine (Val) V
Leucine (Leu) L
Isoleucine (Ile) I
Methionine (Met) M
Phenylalanine (Phe) F
Tyrosine (Tyr) Y
Tryptophan (Trp) W
Histidine (His) H
Serine (Ser) S
Threonine (Thr) T
Cysteine (Cys) C
Asparagine (Asn) N
Glutamine (Gln) Q
Aspartate (Asp) D
Glutamate (Glu) E
Lysine (Lys) K
Arginine (Arg) R
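For readers who work with sequence data, the codes listed above can be captured in a small lookup table. The following sketch converts a list of three-letter codes into the one-letter notation; the names `THREE_TO_ONE` and `to_one_letter` are chosen for this example only.

```python
# Three-letter to one-letter codes of the 20 proteinogenic amino acids.
THREE_TO_ONE = {
    "Gly": "G", "Pro": "P", "Ala": "A", "Val": "V", "Leu": "L",
    "Ile": "I", "Met": "M", "Phe": "F", "Tyr": "Y", "Trp": "W",
    "His": "H", "Ser": "S", "Thr": "T", "Cys": "C", "Asn": "N",
    "Gln": "Q", "Asp": "D", "Glu": "E", "Lys": "K", "Arg": "R",
}


def to_one_letter(residues):
    """Convert three-letter codes (any capitalization) to a one-letter string."""
    return "".join(THREE_TO_ONE[name.capitalize()] for name in residues)


print(to_one_letter(["Met", "Ala", "His", "Glu", "Thr"]))
```

An unknown code raises a `KeyError`, which is usually preferable to silently skipping a residue.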


[Figure, panels (a)–(f). Labels: loop, helix, sheet, folded protein, ligand, Zn2+, His94, His96, Glu106, Thr200, H-bond, surface, catalytic site]

This figure explains how protein structures with bound ligands are represented
on many images of this book. (a) The protein is schematically represented by the
folding of its main chain. Parts of the polymer chain with β-sheet structure (arrows)
are shown in cyan, helical segments (cylinders) in red and loop regions in green.
(b) Amino acids in the active site are displayed as stick model. If not mentioned
differently, carbon atoms of the protein are shown in orange, those of the ligands in
gray. Oxygens are displayed in red, nitrogens in blue, sulfur in yellow, phosphorus
in orange, fluorine in turquoise, chlorine in green, bromine in brown, iodine in violet
and metal ions in gray-blue. Hydrogen atoms are shown in white; for clarity, however,
they are mostly omitted. (c) Amino acids are labeled by a three-letter code (as
indicated in the beginning of this book) along with their position in the sequence
(e.g., His94). Hydrogen bonds formed between a ligand (here: p-fluorophenylsul-
fonamide) and amino acid residues of the protein are indicated as thin green lines.
(d) Next to the binding site, the solvent accessible surface is displayed (cf. section
15.6) as white opal surface. (e) Analogous representation with opal surface, this time
together with the adjacent amino acids of the binding pocket. (f) Overall view of the
protein (here Carbonic Anhydrase II, section 25.7) with the sketched binding pocket
around the catalytic center which is blocked by an inhibitor. The latter molecule binds
to the zinc ion of the protein and forms three hydrogen bonds. The trace of the
polymer chain is shown as a contiguous ribbon, color coding is the same as in (a) for
the different segments (The images were generated with the program DS Visualizer
V2.0.1.7347 of Accelrys Inc., Copyright 2005–2007)

Most of the images used in this textbook will be made available as computer animations via the
homepage of the author (www.agklebe.de). Interested readers are advised to consult this
homepage to obtain access to the images as rotatable 3D objects.
Illustration Source References

Fig. 1.4 From Noe CR, Bader A (1993) Chem Britain 29:126–128
Fig. 4.1 Segment of the crystal structure of the complex of the retinol-binding
protein with retinol (PDB code: 1RBP)
Fig. 4.7 From Andrews PR et al (1984) J Med Chem 27:1648–1657
Fig. 5.8 Crystal structure of Candida antarctica lipase with two enantiomers of
the transition-state-analogue inhibitor from Bocola et al (2003) Protein Eng
16:319–322
Fig. 5.12 From Caner H et al (2004) Drug Discov Today 9:105–110
Fig. 5.15 Segment from the crystal structure of a complex of trypsin with DX9065a
(PDB codes: 1MTS and 1MTU)
Fig. 5.16 Segment from the crystal structure of an inhibitor complex of
carbonic anhydrase II (PDB code: 1CIL and Greer J et al (1994) J Med Chem
37:1035–1054)
Fig. 5.17 Segment from the crystal structure of the complex of the retinoic acid
receptor hRARγ with BMS270394/5 (PDB codes: 1EXX and 1EXA)

Figure before Chap. 6: Announcement poster from the research group of the
author on the occasion of a conference in 2003 in Rauischholzhausen, Marburg,
Germany.

Fig. 7.7 Segments from the NMR structure of stromelysin and two fragments 7.1
and 7.2, and the common product 7.3 (Hajduk PJ et al (1997) J Am Chem Soc
119:5818–5827, the coordinates were kindly provided by P. Hajduk at Abbott)
Fig. 7.8 Segment from the crystal structures of thermolysin with different bound
molecular probes (PDB codes: 1FJQ (acetone), 1FJU (acetonitrile), 8TLI
(isopropanol), 1FJW (phenol)), and with bound benzyl succinic acid (PDB code:
1HYT)
Fig. 7.12 Superposition of the crystal structures of thymidylate synthase with
N-tosyl-D-proline derivatives (PDB codes: 1F4C, 1F4D, 1F4E)
Fig. 10.10 Segment of the NMR structure of the BCL-XL complex with a 16-residue
peptide from the BAK protein (PDB code: 1BXL)
Fig. 10.14 From Bartlett PA (1992) Caveat user manual, San Francisco


Figure before Chap. 11: © Dr. Dirk Bossemeyer, German Cancer Research
Center, Heidelberg, Germany.

Fig. 11.3 From Christen HR, Vögtle F (1992) Organische Chemie, 2nd edn, vol II,
Fig. 24.5, p 131. Otto Salle & Sauerländer
Fig. 11.4 From Gallop MA et al (1994) Fig. 2, Applications of combinatorial
libraries. J Med Chem 37:1233–1251
Fig. 11.9 From Ramström O, Lehn J-M (2002) Fig. 1, Nat Rev Drug Discov
1:27–36
Fig. 11.10 Superposition of the crystal structures of acetylcholinesterase with a syn
and anti click chemistry reaction product (PDB codes: 1Q83, 1Q84)
Fig. 12.4 Fig. 6 from Lottspeich F (1999) Angew Chem 111:2630–2647; reprinted
with the kind permission of the author and publisher
Fig. 12.5 From an illustration of Fonds der Chemischen Industrie im Verband der
Chemischen Industrie e. V., Mainzer Landstraße 55, 60329 Frankfurt am Main,
Biotechnologie – kleinste Helfer – große Chancen
Fig. 13.2 Crystal packing of the structure with the reference code FUXBIJ
(Cambridge crystallographic database)
Fig. 13.3 Taken from Hargittai I, Hargittai M (1995) Symmetry through the eyes of
a chemist, 2nd edn, Figs. 8–23. Springer, New York, p 363; reprinted with the kind
permission of the author and publisher
Fig. 13.4 Taken from Pohl RW (1983) Einführung in die Physik, 18th edn, vol 1,
Mechanik, Akustik und Wärmelehre, Fig. 380, p 198; reprinted with the kind
permission of the author and publisher
Fig. 13.5 Taken from Glusker JP, Trueblood KN (1972) Crystal structure analysis,
a primer, Fig. 5. Oxford University Press, New York, p 19
Figs. 13.6 and 13.7 From Keller E (1982) Chem unserer Zeit 16:71–88, Figs. 7 and
25; reprinted with the kind permission of the author and publisher
Fig. 13.9 Electron density of the crystal structure of aldose reductase (PDB code:
1US0)
Fig. 13.10 Reprinted with the kind permission of the Siemens company (b), the
author and publisher, taken from Boese R (1989) Chem unserer Zeit 23:77–85,
Fig. 11, (c), (d), (e), Crystal packing of the structure with the reference code
OXACDH06 (Cambridge Crystallographic database; f)
Fig. 13.11 Reprinted with the kind permission of Bruker AXS GmbH (b), Crystal
structure of TNF (c–f), (PDB code: 1TNF)
Fig. 13.14 NMR structure of a domain of the guanine nucleotide exchange factor
(PDB code: 1B64)
Fig. 13.15 From Montgomery JA, Niwas S (1993) Chem Tech 30–37: 34, Fig. 4
Fig. 14.1 Stevens ED (1978) Acta Crystallogr B34:544–551, Fig. 1; reprinted with
the kind permission of the publisher
Fig. 14.2 From Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.7, p 66 and Fig. 2.10,
p 68. MacMillan, New York
Fig. 14.4 From Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.12, p 70 and
Fig. 2.15, p 73. MacMillan, New York

Fig. 14.5 Taken from Lesk A (1991) Protein architecture, Fig. 4.1, part b and c,
Oxford University Press, Oxford; reprinted with the kind permission of the
publisher
Fig. 14.7 Kindly provided by Prof. R. Zimmer, LMU Munich (prepared with the
Molscript program; protein structures with the PDB codes: 1TIM, 4FXN, 1I1B,
3MBA, 2RHE, 2STV, 1UBQ, 1APS, 256B)
Fig. 14.8 From Branden C, Tooze J (1991) Introduction to protein structure,
Fig. 5.2, p 60, Figs. 5.14, 5.15, p 69, Fig. 5.17, p 71, Fig. 5.19, p 72. Garland,
New York, and Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.26, p 82. MacMillan,
New York
Fig. 14.9 Crystal structures of triosephosphate isomerase (PDB code: 1TIM) and
flavodoxin (PDB code: 3FXN)
Fig. 14.10 From Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.12. MacMillan,
New York, p 70, and Illustration of a crystal structure of a Fab fragment with
phosphocholine (PDB code: 2MCP)
Fig. 14.13 Taken from Vyas K, Monahar H, Venkatesan K (1990) J Phys Chem
94:6069–6073, Fig. 1; reprinted with the kind permission of the publisher
Fig. 14.14 From a template, the source is unknown
Fig. 14.15 Taken from Bürgi HB, Dunitz JD (1994) Structure correlation, vol 2,
Fig. 13.24. Wiley, p 585; reprinted with the kind permission of the publisher
Fig. 14.16 Distribution of H-bond donor and acceptor groups around an imidazole
moiety; entry from the IsoStar database. Cambridge Crystallographic Data Centre.
http://www.ccdc.cam.ac.uk/products/csd_system/isostar/
Fig. 14.17 Crystal structures of trypsin (PDB code: 3PTB) and subtilisin (PDB
code: 1SBC)
Fig. 14.20 Crystal structures of DNA oligonucleotide strands with cisplatin and
daunorubicin (PDB code: 1A2E and 1AL9)
Fig. 15.1 (1994) Discover manual, Part 1, Fig. 3.5, San Diego
Fig. 16.1 Taken from Christen HR, Vögtle F (1992) Organische Chemie, vol I, 2nd
edn, Fig. 2.3, p 71. Otto Salle & Sauerländer

Figure before Chap. 17: Announcement poster from the research group of the
author on the occasion of a conference in 2005 in Rauischholzhausen, Marburg,
Germany

Fig. 17.1 Taken from Mackay MF, Sadek M (1983) Aust J Chem 36:2111–2117,
Fig. 1; reprinted with the kind permission of the publisher
Fig. 17.7 Superimposition of the crystal structure of dihydrofolate reductase with
dihydrofolate and methotrexate (PDB codes: 1DHF, 3DFR)
Fig. 17.9 Taken from Seidel W, Meyer H, Kazda S, Dompert W (1984) Fig. 6. In:
Seydel J (ed) QSAR and strategies in the design of bioactive compounds. Wiley,
pp 366–369; reprinted with the kind permission of the publisher
Fig. 17.10 Segment of the crystal structures of thermolysin with different, bound
molecular probes (cf. Fig. 7.8) superimposed with “hot spots” from a calculation
from DrugScore

Fig. 17.11 Distribution of hydrogen-bond donor groups around a carboxylic
acid, ester, keto, and ether grouping from the IsoStar database
http://www.ccdc.cam.ac.uk/products/csd_system/isostar/, Cambridge Crystallographic Data Centre
Fig. 17.12 Superposition of the crystal structures of DHFR with MTX (PDB code:
3DFR) with the distribution of hydrogen-bonding geometries from IsoStar
http://www.ccdc.cam.ac.uk/products/csd_system/isostar/, Cambridge Crystallographic
Data Centre
Fig. 18.4 From Cramer RD, Patterson DE, Bunce JD (1988) J Am Chem Soc
110:5959–5967, Fig. 1
Figs. 18.9 and 18.10 From Weber A et al (2006) J Chem Inf Model 46:2737–2760,
Figs. 7 and 8
Fig. 20.6 Superposition of the crystal structures of three cytochrome c enzymes
(PDB code: 3C2C, 5CYT, 155C)
Fig. 20.7 Taken from Verlinde CLMJ, Hol WGJ (1994) Structure 2:577–587
Fig. 20.8 Taken from Böhm HJ (1993). In: Wermuth CG (ed) Trends in QSAR and
molecular modelling, vol 92. ESCOM Science, Leiden, Fig. 3, p 30
Fig. 21.2 Crystal structure of TGT with bound tRNA (PDB code: 1Q2S)
Fig. 21.6 Crystal structure of TGT with bound 21.3 (PDB code: 1ENU)
Fig. 21.8 Crystal structure of TGT with bound 21.9 (PDB code: 1N2V)
Fig. 21.13 Crystal structure of TGT with bound 21.14 (PDB code: 1Y5V)

Figure before Chap. 22: Announcement poster from the research group of the
author on the occasion of a conference in 2007 in Rauischholzhausen, Marburg,
Germany

Fig. 22.1 From Hopkins AL, Groom CR (2002) Nat Rev Drug Discov 1:727–730
Fig. 22.4 Segment from the crystal structure of a complex of creatinase with
carbamoyl sarcosine (PDB code: 1CHM)
Fig. 23.2 Binding pocket from the crystal structures of trypsin (PDB code: 1PPC),
thrombin (PDB code: 1DWD), factor VIIa (PDB code: 1W7X) and factor Xa (PDB
code: 2P93)
Fig. 23.5 Segment of the crystal structure of the complex of thrombin with
cyclotheonamide A, an inhibitor from the marine sponge Theonella sp. (PDB
code: 1TMB)
Fig. 23.6 Modeled geometry from a crystal structure of thrombin with fibrinopep-
tide (PDB code: 1FPH)
Fig. 23.7 Superposition of the crystal structures of thrombin with fibrinopeptide
(PDB code: 1FPH) and PPACK (PDB code: 1PPB)
Fig. 23.10 Segment from the crystal structure of the complex of thrombin with
NAPAP (PDB code: 1DWD)
Fig. 23.12 Comparison of the crystal structures of NAPAP with trypsin and
thrombin (PDB code: 1PPC and 1DWD)
Fig. 23.17 Segment from the crystal structure of the complex of elastase with
a pyridone-like inhibitor (PDB code: 1EAT)

Fig. 23.18 Segment from the crystal structure of the complex of factor Xa with
rivaroxaban (PDB code: 2W26)
Fig. 23.24 Segment from the crystal structure of the complex of β-lactamase (PDB
code: 1TEM)
Fig. 23.26 Crystal structure of the yeast proteasome with bortezomib (PDB code: 2F16)
Fig. 23.27 Segment from the crystal structure of calpain II with leupeptin (PDB
code: 1TL9)
Fig. 24.3 Crystal structures of the aspartic protease cathepsin D (PDB code: 1LYB),
endothiapepsin (PDB code: 4ER1), HIV protease (PDB code: 5HPV), plasmepsin
(PDB code: 1SME), and renin (PDB code: 4APR)
Fig. 24.8 Superposition of the crystal structure of renin with CGP-38560 (PDB
code: 1RNE) and aliskiren (PDB code: 2V0Z)
Fig. 24.11 Segment from the crystal structure of renin with a piperidine-like
inhibitor (PDB code: 1UTH)
Fig. 24.12 Crystal structure of HIV protease with a peptide substrate (PDB code:
1MT9)
Fig. 24.18 Superposition of the crystal structures of HIV protease with a urea-like
(PDB code: 1HVR) and coumarin-like inhibitor (PDB code: 1UPJ)
Fig. 24.24 Crystal structures of HIV protease with inhibitors with a secondary
amine nitrogen atom (PDB codes: 1XL2, 3BHE, 2PQZ, 3BGB)
Fig. 24.25 Superposition of the ligands in the crystal structure with HIV
protease ritonavir (PDB code: 1HXW), atazanavir (PDB code: 2AQU), darunavir
(PDB code: 1T3R), amprenavir (PDB code: 1HPV), indinavir (PDB code: 1HSG),
nelfinavir (PDB code: 1OHR), saquinavir (PDB code: 1HXB), lopinavir (PDB
code: 2O4S), tipranavir (PDB code: 2O4P), and 24.58 (PDB code: 2QQN)
Fig. 25.2 Segment from the crystal structure of the complex of matrix metalloproteinase
MMP-12 with the cleavage product of the protease reaction (PDB code: 2OXZ)
Fig. 25.5 Segment from the superposed crystal structures of the complexes of
thermolysin with the inhibitor Cbz-GlyP-Leu-Leu (PDB code: 5TMN) and
a cyclized inhibitor (PDB code: 1PE5) derived from it
Fig. 25.6 Segment from the crystal structure of the complex of carboxypeptidase
with benzylsuccinate (PDB code: 1CBX)
Fig. 25.12 Segment from the crystal structure of the complex of lisinopril with
t-ACE (PDB code: 1O86)
Fig. 25.14 Segment from the crystal structure of the complex of fibroblast collage-
nase with Ro 31–4724 (PDB code: 2TCL)
Fig. 25.16 Segment from the crystal structures of the complexes of fibroblast
collagenase with a peptidic (25.49) and a non-peptidic inhibitor (25.50, PDB
codes: 1HFC and 966C)
Fig. 25.17 Segment from the crystal structure of the complex of carboanhydrase II
with p-fluorophenylsulfonamide and modeled geometries of a carbonylation in
CA II (PDB code: 1IF4)
Fig. 25.20 Segment from the crystal structure of the complex of phosphodiesterase
5 with sildenafil (PDB code: 1UDT)
Fig. 25.22 Segment from the crystal structure of the complex of peptide
deformylase from Escherichia coli with actinonin (PDB code: 1G2A)
Fig. 26.3 Crystal structure of the cAMP-dependent protein kinase (PDB code:
1L3R)
Fig. 26.4 Modeled geometries of the transition state on the coordinates of the
crystal structure of the cAMP-dependent protein kinase (PDB code: 1L3R)
Fig. 26.7 Crystal structure of the complex of MAP kinase p38 with SB203580 (PDB
code: 1A9U)
Fig. 26.9 Superposition of the inactive and active form of the tyrosine kinase
domains of the human insulin receptor (PDB codes: 1IRK and 1IR3)
Fig. 26.10 From a figure out of Fabian MA et al (2005) Nat Biotechnol 23:329–336;
reprinted with the kind permission of the author and publisher
Fig. 26.12 Superposition of the crystal structures of BCR-ABL protein kinase with
bound imatinib (Gleevec®) and tetrahydrostaurosporine (PDB codes: 2HYY and
2HZ4)
Fig. 26.14 Segments from the crystal structures of Src kinase with
ANP and the mutated Src-kinase with N6-benzyl-ADP (PDB codes: 1KSW
and 2SRC)
Fig. 26.17 Superposition of the crystal structures of Ser/Thr-Kinase PIM-1 with
staurosporine and a ruthenium complex (PDB codes: 1YHS and 2BZH)
Fig. 26.19 Segment from the crystal structure of human tyrosine phosphatase
PTP-1B (PDB code: 1PTY)
Fig. 26.20 Segments from the crystal structures of human tyrosine phosphatase
PTP-1B with different inhibitors (PDB codes: 1PTY, 1NO6, 1NNY, 1N6W)
Fig. 26.22 Segment from the crystal structure of human tyrosine phosphatase
PTP-1B with an allosteric inhibitor (PDB code: 1T4J)
Fig. 26.23 Crystal structure of COMT with a substrate-analogue inhibitor and
S-adenosyl-L-methionine (PDB code: 1VID)
Fig. 26.25 Superposition of the crystal structures of COMT (PDB codes: 1VID and
1JR4)
Fig. 26.26 Superposition of the crystal structures of FTase with farnesyl diphos-
phate and the farnesylated tetrapeptide CAAX (PDB codes: 1FT2 and 1D8D)
Fig. 26.28 Superposition of the crystal structures of FTase with BMS-214662 and the
farnesylated tetrapeptide CAAX (PDB codes: 1SA5 and 1D8D)
Fig. 27.3 Segment from the crystal structure of dihydrofolate reductase from
Lactobacillus casei with methotrexate (PDB code: 3DFR)
Fig. 27.4 Segment from the crystal structure of horse liver alcohol dehydrogenase
with bound NADPH (PDB code: 1HET)
Fig. 27.7 Segment from the crystal structure of cytochrome P450 14α-sterol
demethylase (CYP51) from Mycobacterium tuberculosis in complex with
fluconazole 27.6 (PDB code: 1EA1)
Fig. 27.11 Superposition of the binding pocket of Pneumocystis jiroveci and murine
DHFR (PDB codes: 2FZI and 2FZJ)
Fig. 27.14 Segment from the crystal structure of HMG-CoA reductase with bound
HMG-CoA and mevalonic acid (PDB codes: 1DQA, 1DQ9)
Fig. 27.15 Segment from the crystal structure of HMG-CoA reductase with bound
inhibitors simvastatin and atorvastatin (PDB codes: 1HW9, 1HWK)
Fig. 27.19 Segment from four crystal structures from aldose reductase with sorbinil,
tolrestat, IDD594, and 27.46 (PDB codes: 1AH0, 2FZD, 1US0, 2NVD)
Fig. 27.20 Binding pocket from the crystal structure of aldose reductase with
sorbinil (PDB code: 1AH0)
Fig. 27.24 Crystal structure of human 11β-HSD1 with carbenoxolone
superimposed with the complex of murine 11β-HSD1 with bound corticosterone
(PDB codes: 2BEL, 1Y5R)
Fig. 27.25 Segment from the crystal structures of human 11β-HSD1 in complex
with two inhibitors and the complex of murine 11β-HSD1 with bound corticoste-
rone (PDB codes: 2ILT, 2RBE, 1Y5R)
Fig. 27.27 Crystal structures of human CYP 3A4 uncomplexed and in complex with
metyrapone, erythromycin, and ketoconazole (PDB codes: 1W0E, 1W0G, 2J0D,
2V0M)
Fig. 27.30 From Fig. 3 in Weinshilboum and Wang (2004) Nat Rev Drug Discov
3:739–748
Fig. 27.33 Crystal structures of human MAOB in complex with tranylcypromine
and L-deprenyl (PDB codes: 1OJB and 2BYB)
Fig. 27.34 Crystal structures of human MAOA in complex with clorgyline (PDB
code: 2BXR) and MAOB with L-deprenyl (PDB code: 2BYB)
Fig. 27.37 Superposition of the crystal structures of cyclooxygenase-1 and 2 in
complex with arachidonic acid (PDB codes: 1PRH and 1CVU)
Fig. 27.39 Segment from the crystal structures of cyclooxygenase with arachidonic
acid (PDB code: 1DIY) and prostaglandin PGH2 (PDB code: 1DDX)
Fig. 27.41 Segment from the crystal structures of cyclooxygenase-1 with a bromine
analogue of acetylsalicylic acid (PDB code: 1PTH)
Fig. 27.42 Segment from the crystal structures of cyclooxygenase-2 with a bromine
analogue of celecoxib (PDB code: 6COX)
Fig. 28.2 Crystal structure of the DNA-binding domain of the estrogen receptor
with a bound oligonucleotide strand (PDB code: 1BY4)
Fig. 28.4 Segment from the crystal structures of the ligand-binding domain of the
estrogen receptor with bound estradiol (PDB code: 1ERE)
Fig. 28.5 Segment from the crystal structure of the ligand-binding domain of the
progesterone receptor with bound progesterone (PDB code: 1A28)
Fig. 28.6 Comparison of the crystal structures of the estrogen receptor with bound
estradiol and raloxifene (PDB codes: 2J7X and 1ERR)
Fig. 28.8 Segment from the crystal structure of the ligand-binding domain of the
estrogen receptor with bound estradiol and the LxxLL binding motif (PDB code: 2J7X)
Fig. 28.13 Superposition of the crystal structures of the ligand-binding domain of
the PPARγ receptors with a bound agonist (PDB code: 1K7L) and antagonist (PDB
code: 1KKQ)
Fig. 28.16 Schematic course of the secondary structural elements in the crystal
structures of the estrogen receptors (PDB code: 2J7X) and three examples for the
PXR receptor (PDB codes: 1NRL, 1M13, 1SKX)
Fig. 29.1 Folding patterns as they are found in the crystal structures of bacteriorho-
dopsin (PDB code: 1BRD), bovine rhodopsin (PDB code: 1U19), and the human
β2-adrenergic receptor (PDB code: 2RH1)
Fig. 29.2 Segment of the crystal structure of the human β2-adrenergic receptor
(PDB code: 2RH1)
Fig. 29.4 Superposition of the crystal structures of the β1-adrenergic
receptor with bound cyanopindolol (PDB code: 2VT4) and isoprenaline (PDB
code: 2Y03)
Fig. 29.5 Crystal structures of a mutant of the inactive (PDB code: 1GZM) and
active (PDB code: 2X72) rhodopsin
Fig. 29.13 Crystal structure of the erythropoietin receptor with bound erythropoi-
etin (EPO; PDB code: 1CN4)
Fig. 30.3 Schematic representation of the crystal structure of the bacterial potas-
sium channel KcsA (PDB code: 1K4C)
Fig. 30.4 Segment of the crystal structure of the bacterial potassium channel KcsA
(PDB code: 1K4C)
Fig. 30.5 Segment of the crystal structures of a selective and unselective bacterial
potassium channel (PDB codes: 1K4C, 2AHY)
Fig. 30.9 Crystal structure (electron diffraction) of the nicotinic acetylcholine
receptor in the closed state from the electric organ of an electric ray (PDB code:
2BG9)
Fig. 30.11 Crystal structures of the ligand-binding domain of the nicotinic acetyl-
choline receptors from the California sea slug (Aplysia californica) with bound
α-conotoxin (PDB code: 2BYP) and epibatidine (PDB code: 2BYQ)
Fig. 30.12 Segment from the crystal structures of complexes of the ligand-binding
domains of the nicotinic acetylcholine receptor from the California sea slug
(Aplysia californica) with bound α-conotoxin (PDB code: 2BYP), methyllyca-
conitine (PDB code: 2BYR), α-lobeline (PDB code: 2BYS), and epibatidine (PDB
code: 2BYQ)
Fig. 30.15 Crystal structure of the bacterial ClC channel from Escherichia coli
(PDB code: 1OTS)
Fig. 30.16 Crystal structure of the porin from the bacterium Rhodobacter capsulatus
(PDB code: 2POR)
Fig. 30.17 Crystal structure of gramicidin A (PDB code: 1GRM)
Fig. 30.18 Model of nonactins based on a crystal structure (CSD Refcode:
NONKSC)
Fig. 30.19 Crystal structure of bovine aquaporin 1 (PDB code: 1J4N)
Fig. 31.3 Segment from the crystal structure of the αIIbβ3 integrin receptor with
a cyclopeptide (PDB code: 1L5G)
Fig. 31.5 Superposition of the segment from the crystal structure of the αIIbβ3
integrin receptor with eptifibatide (PDB code: 1TY6/2VDN) and tirofiban (PDB
code: 1TY5/2VDM)
Fig. 31.6 From a drawing of the research group of Prof. B. Ernst, University of
Basel (http://www.pharma.unibas.ch/molpharm/index.html)
Fig. 31.7 Segment from the crystal structure of sialyl-LewisX and a selectin (PDB
code: 1G1R)
Fig. 31.9 Schematic drawing analogous to Doranz et al (1999) J Virol
12:10346–10358
Figs. 31.10 and 31.11 Segment from the crystal structure of the gp41 protein (PDB
code: 1AIK)
Fig. 31.15 Segments from the crystal structures of neuraminidase with zanamivir
(PDB code: 1A4G) and oseltamivir (PDB code: 2HT8)
Fig. 31.17 Superposition of the crystal structures of the capsid proteins of HRV-14
in complex with pleconaril (PDB code: 1NA1) and the cryoelectron-
microscopically determined complex with domains of the adhesion protein (PDB
code: 1D3I)
Fig. 31.19 Crystal structure of the HRV-14 capsid protein in complex with pleconaril
(PDB code: 1NA1)
Fig. 31.21 Crystal structure of the ternary complex of an MHC I molecule with
a nonapeptide and the T-cell receptor (PDB code: 1BD2)
Fig. 31.23 Modeled complex of 31.42 (brown) and 31.43 (green) based on the
crystal structure of a ternary complex (PDB code: 1AO7); from Douat-Casassus
C et al (2007) J Med Chem 50:1598–1609; coordinates were kindly provided by the
author
Fig. 32.1 Crystal structure of a complete IgG antibody (PDB code: 1IGT)
Figs. 32.2 and 32.3 Comparison of the crystal structures of the Fab domain of an
antibody with phosphocholine (PDB code: 2MCP) and lysozyme (PDB code: 1FBI)
Fig. 32.5 From Fig. 3 in Milenic DE et al (2004) Nat Rev Drug Discov
3:488–498
Fig. 32.6 NMR structure of an oligomeric double strand of RNA and PNA (PDB
code: 176D)
Fig. 32.8 Crystal structure of the HIV reverse transcriptase with a bound
RNA–DNA hybrid double strand (PDB code: 1HYS)
Fig. 32.9 Structure comparison of the crystal structures of HIV reverse transcriptase
with a bound DNA double strand and bound thymidine-5′-triphosphate (PDB code:
1RTD) and AZT (PDB code: 1N5Y)
Fig. 32.10 Superposition of the crystal structures of HIV reverse transcriptase in an
uncomplexed and a nevirapine-complexed state (PDB codes: 1DLO and 1VRT)
Fig. 32.12 Crystal structures of HIV reverse transcriptase with two allosterically
acting triazines (PDB codes: 1S9E and 1S9G)
Fig. 32.13 Crystal structure of topoisomerase IV from Streptococcus pneumoniae
with bound moxifloxacin (PDB code: 3FOF)
Fig. 32.15 Crystal structure of the Tet-repressor with bound DNA–oligonucleotide
(PDB code: 1QPI) and tetracycline (PDB code: 1BJY)
Fig. 32.19 Taken from Hansen JL et al (2002) Mol Cell 10:117–128, Fig. 4;
reprinted with the kind permission of the author and the publisher
Fig. 32.21 Crystallographically determined binding mode of erythromycin and
roxithromycin in the ribosome (PDB codes: 1JZY and 1JZZ)
Fig. 32.22 Figure taken from Harms JM et al (2004) BMC Biol 2:4, Fig. 3; reprinted
with the kind permission of the author and the publisher
Fig. 32.23 Superposition of the crystallographically determined binding modes of
dalfopristin, quinupristin (PDB code: 1SM1), erythromycin (PDB code: 1JZY),
chloramphenicol (PDB code: 1K01), and clindamycin (PDB code: 1JZX)

Appendix, Fig. 2: Different illustrations of the crystal structure of carboanhydrase
II with p-fluorophenylsulfonamide (PDB code: 1IF4) with the computer graphic
program DS Visualizer V2.0.1.7347 from Accelrys Inc., copyright 2005–2007.
Name Index

A Bocola, Marco, 96
Abraham, Donald, 382 Bode, Wolfram, 503, 504
Agre, Peter, 772 Böhm, Hans-Joachim, 443
Aguilar, Abelardo, 843 Bossenmeyer, Dirk, 210
Ahlquist, Raymond, 721 Böttcher, Jark, 555
Aires, Buenos, 42 Boyer, Herbert, 234, 235
Alarich, 42 Bragg, William, 316
Aldrich, Thomas Bell, 9 Bragg, William Henry, 265
Alex, Richard, 736 Bragg, William Lawrence, 265
Alexander, 42 Brenner, Sydney, 134
Amgen, 815, 822 Brodie, 406
Anderson, E.S., 234 Buck, Linda, 735, 736
Andromachus, 4 Bürgi, Hans-Beat, 305
Ariëns, Everhardus J., 101
Arnold, Edward, 832
C
Cahn, Arnold, 23
B Cahn, R.S., 93
Babbage, Charles, 233 Capecchi, Mario, 245
Bajusz, Sándor, 502, 503 Capote, Truman, 38
Baltimore, David, 828 Carson, Rachel, 44
Banting, Frederick, 816 Carter, Paul, 496
Bartlett, Paul, 79, 204, 571 Caruso, Enrico, 38
Bayer, Adolf v., 9 Chain, Ernst Boris, 28
Beddell, Chris, 430 Christie, Agatha, 38
Bentham, Jeremy, 423 Cinchon, 42
Berger, Arieh, 302 Clement, Bernd, 509
Bernard, Claude, 372 Clinton, Bill, 239
Bernays, Martha, 53 Cobbe, Frances Power, 423
Berney, W., 31 Cohen, Stanley, 234
Bertini, Ivano, 566 Corey, 323
Besler, Basilius, 2 Craig, Paul, 157
Best, Charles, 816 Cramer, Friedrich, 63
Biot, Jean Baptist, 89 Cramer, Richard, 381
Black, James W., 7, 54 Cruciani, Gabriele, 675
Bloch, Felix, 265 Crick, Francis, 316
Blow, David, 494 Crum-Brown, Alexander, 372
Blundell, Tom, 438, 541 Cushman, David, 574, 575, 577

G. Klebe, Drug Design, DOI 10.1007/978-3-642-17907-5, 863


# Springer-Verlag Berlin Heidelberg 2013
D Gerber, Hans-Dieter, 454


da Vinci, Leonardo, 233 Geysen, H. Mario, 216
Danielson, Helena, 169 Gohlke, Holger, 388
Davy, Humphry, 24 Goodford, Peter, 14, 364, 382, 430, 795
de la Vega, Garcilaso, 52 Grädler, Ulrich, 454
de Vega, Juan, 42 Greene, Graham, 38
DeSilva, Ashanti, 259 Grenouille, Jean-Baptiste, 737
Diederich, François, 449, 510 Groom, Colin, 472
Dioskurides, 4 Grütter, Markus, 542
Dixon, Scott, 368 Guareschi, Giovanni, 38
Djerassi, Carl, 708
Domagk, Gerhard, 27, 121
Dominik, Hans, 233 H
Dreser, Heinrich, 176 Hamilton, Andrew, 200
Duisberg, Carl, 23 Hammett, Louis P., 373
Dunitz, Jack, 265, 305 Hansch, Corwin, 374, 375
Dürer, Albrecht, 42 Hasek, Jaroslaw, 38
Heinrich IV, 42
Henderson, Richard, 721
E Hepp, Paul, 23
Ehrlich, Paul, 27, 61, 120 Hillebrecht, Alexander, 394
Elizabeth II, 38 Hoffman, August Wilhelm v., 25
Ellman, Jonathan, 305 Hoffman, Carl, 653
Endo, Akira, 655 Hoffmann, Albert, 29
Ernst, Richard, 282 Hoffmann, Felix, 37
Hofmann, Albert, 421
Hogben, 406
F Höltje, Hans-Dieter, 380
Farbenfabriken, Bayer, 23 Hopkins, Andrew, 472
Fedorov, Alexander, 533 Hörner, Simone, 460
Ferreira, Sergio Henrique, 39, 573 Howe, Jeffrey, 443
Fersht, Alan, 76 Huber, Robert, 474
Fesik, Steven, 143 Hungerford, David, 609
Fink, Tobias, 215 Huxley, Aldous, 233
Fire, Andrew, 247
Fischer, Emil, 61, 63, 315
Fleckenstein, Albrecht, 31 I
Fleming, Alexander, 28, 34, 521 Imming, Peter, 472
Florey, Howard, 28 Ingold, C.K., 93
Folkers, Karl, 653
Fraser, Claire, 239
Fraser, Thomas, 372 J
Free, S.R., 377 James, Michael, 541
Freire, Ernesto, 167 Janssen, Paul, 11, 12, 832
Freud, Sigmund, 52 Jirtle, Randy, 258
Friedrich, Walter, 265 Jones, Gerrith, 441
Fujita, Toshio, 374, 375 José Ortega y Gasset, 38
Jotereau, Francine, 806

G
Galvani, Luigi, 5 K
Ganellin, Robin, 403 Kafka, Franz, 38
Gasteiger, Johann, 318 Karplus, Martin, 366
Gates, Marshall, 49 Kearsley, Simon, 359
Kekulé, Friedrich August, 14 Mietzsch, Fritz, 27


Kendrew, John, 316 Milne, M., 381
Kent, Stephan, 106 Milstein, César, 819
Kessler, Horst, 196 Moon, Joseph, 443
Kier, Lemont B., 380 Morton, William T., 25
Kirst, Hans Helmut, 38 Mullis, Kary, 235
Klarer, Josef, 27
Klibanov, Alexander, 432
Knipping, Paul, 265 N
Kobilka, Brian, 723 Napoleon, 37
Koch, Oliver, 296 Nicholls, Anthony, 360
Koch, Robert, 27 Nowell, Peter, 609
Köhler, Georges, 819
Koller, Carl, 53
Koltun, 323 O
Koshland, Daniel E., 64 Olson, Art, 441
Kramer, Peter, 19 Olson, P.N., 197
Kraut, Joseph, 649 Ondetti, Miguel, 574
Kubinyi, Hugo, 401 Oprea, Tudor, 410, 472
Kuntz, Irwin, 445 Otto II, 42
Kuschinski, Gustav, 421 Overton, Charles Ernest, 373

L P
La Roche, Hoffman, 31 Paracelsus, 5, 420
Lacassagne, Antonie, 706 Pasteur, Louis, 89
Le Bel, Joseph Achille, 91, 315 Pauling, Linus, 64, 316, 323
Lehn, Jean-Marie, 227, 229 Pearlman, Robert, 318
Lemmen, Christian, 360 Pemberton, John S., 52
Li Shizhen, 5 Perkins, William Henry, 25
Liebreich, Oskar, 25 Perutz, Max, 316
Lipinski, Chris, 215, 410 Petsko, Greg, 143
Lippold, Bernard, 399 Pincus, Gregory, 708
Lipscomb, William, 565 Popper, Karl, 153
Loewi, Otto, 9 Pravaz, Charles G., 49
Long, Crawford W., 25 Prelog, V., 93
Loschmidt, Joseph, 14 Priestle, John, 542
Purcell, Edward, 265

M
MacKinnon, Roderick, 765 Q
Mally, Josef, 677 Quideau, Stéphane, 806
Mann, Thomas, 8, 38
Mares-Guia, Marcos, 499
Mariani, Angelo, 52 R
Marquardt, Fritz, 503 Ramakrishnan, Venkatraman, 838
Marshall, Garland, 356 Ramström, O., 229
Martin, Yvonne, 731 Rarey, Matthias, 368, 441
Meggers, Eric, 618 Reymond, Jean-Louis, 215
Mello, Craig, 247 Richet, Charles, 372
Merrifield, Robert Bruce, 216 Ringe, Dagmar, 143
Meyer, Emanuel, 460 Ritschel, Tina, 460
Meyer, Hans, 677 Röntgen, Wilhelm, 265
Meyer, Hans Horst, 373 Roosevelt, Theodore D. Jr., 28
Rossmann, Michael, 799, 800 U


Ruzicka, Leopold, 266 Uclaf, Roussel, 708
Umezawa, Hamao, 521, 537
Unwin, Nigel, 758
S
Sadowski, Jens, 318
Sakel, Manfred, 12 V
Šali, Andrej, 438 Vámossy, Zoltán von, 26
Schanker, 406 van de Waterbeemd, Han, 399
Schechter, Israel, 302 Vane, John Robert, 39, 573
Schertler, Gebhard, 726 van’t Hoff, Jacobus Henricus, 90–91, 315
Schiller, Friedrich, 420 Varrus, Marcus Terrentius, 42
Schmiedeberg, Oswald, 25 Venter, Craig, 239, 255
Schulz, Georg, 769 Verne, Jules, 233
Seidel, Wolfgang, 362 von Hohenheim, Theophrastus Bombastus, 5
Seiler, P., 402 von Laue, Max, 265, 269
Sertürner, Friedrich Wilhelm Adam, 49
Sharpless, Barry, 227
Shaw, Elliott, 499 W
Shoichet, Brian, 369 Wade, Rebecca, 388
Shokat, Kevan, 615 Waksman, Selman A., 8
Singer, Peter, 424 Walkinshaw, Malcolm, 286
Smith, Graham, 359 Wallace, Edgar, 38
Specker, Edgar, 555 Walpole, Horace, 33
Steinbeck, John, 38 Warzecha, Heribert, 593
Steitz, Tom, 838 Watson, James, 316
Stengl, Bernhard, 454, 460 Wells, Horace, 24
Sternbach, Leo, 31 Wells, James, 146, 496
Stevens, Ray, 726 Welte, Wolfram, 769
Stevenson, Robert Louis, 52 Wermuth, Camille G., 349
Stroud, Robert, 146 Willett, Peter, 441
Stubbs, Milton, 503 Williams-Smith, H., 234
Sturrock, Edward, 577 Wilson, J.W., 377
Stürzebecher, Jörg, 503 Withering, William, 114
Sumner, James B., 275 Wood, Alexander, 49
Superti-Furga, Giulio, 251 Woodward, Robert, 844
Süskind, Patrick, 737

Y
T Yonath, Ada, 838
Takamine, Jokichi, 9
Temin, Howard, 828
Topliss, John, 154, 157
Tschudi, Gilg, 50 Z
Tucholsky, Kurt, 38 Zentgraf, Matthias, 328
Subject Index

A Actin–myosin cytoskeleton, 720


A-64662, 540 Actinonin, 594, 595
Abbott, 143, 200, 432, 540, 624, 626, Action potential, 748
637, 667, 730, 731 Activation of the immune system, 487
Abbott Laboratories, 822 Activator protein AP-1, 698
ABC cassette transporters, 767 Active-analogue approach, 356, 368, 369
ABC transporters, 768 Activity–activity relationship, 412
ab initio calculations, 322 Activity spectrum, 158
ABL tyrosine kinase, 610, 611 Acute lymphatic leukemia, 649
Absinthe extract, 372 Acyl–enzymes, 521, 529
Absorption, 173, 397 complex, 96, 518, 520, 521, 523, 622
profile, 408 form, 493
spectroscopic assays, 132 intermediate, 493, 495, 522, 526
ABT-737, 200 Acyl form, 96
ABT-839, 637 Adalimumab, 822
Accelrys, 368 ADAM family (a disintegrin and
Acceptor groups, 358 metalloprotease), 584
Accuracy of the structure determination, 276 Adamlysines, 584
ACD/pKa, 411 Adaptation of Fields for Molecular
ACE. See Angiotensin-converting enzyme Comparison (AFMoC), 388, 389
(ACE) Adaptive structure, 671
Acetaldehyde, 125 Ada Yonath, 838
Acetaminophen, 24, 674, 690 Adenosine, 123
Acetanilide, 23, 24 Adenosine-3′,5′-cyclophosphate (cAMP), 720
Acetazolamide, 587–589 Adenosine deaminase, 826
Acetohydroxamic acid, 143 Adenosine monophosphate, 337, 339, 340,
Acetylcholine, 9, 53, 162, 181, 481, 518 342, 343
Acetylcholinesterase (AChE), 228, 230, 518 Adenosine triphosphate (ATP), 474, 480,
Acetylcholinesterase inhibitor, 115 599–601, 603–607, 609, 613, 615, 617,
Acetylsalicylic acid (ASA), 17, 18, 24, 37–41, 618, 638, 720, 752, 768. See Adenosine
156, 177, 478, 688, 690, 783 triphosphate (ATP)
AChE. See Acetylcholinesterase (AChE) synthesis, 770
Aciclovir, 185, 486, 828 S-Adenosyl methionine, 474
Acquired immune deficiency syndrome Adenoviruses, 260
(AIDS), 8, 260, 450, 489, 546, 554, Adenylate cyclase, 480, 720, 754
814, 820 Adipocytes, 709
infection, 788, 789 ADME, 433
patients, 825 parameters, 397
therapy, 549, 554 properties, 410
Acrolein, 181 tox properties, 398

ADP, 752 Allopurinol, 121, 122


Adrenal glands, 9, 11, 706 Allosteric binding sites, 601, 833
Adrenaline, 9, 13, 158, 159, 185, 678, Allosteric effector, 62
691, 719, 721, 725 Allosteric effector molecules, 14
Adrenal insufficiency, 646 Allosteric enzyme blockade, 831
β-Adrenergic agonists, 30 Allosteric inhibitors, 478, 831
β-Adrenergic antagonists, 161 Allosteric regulation, 763
β2-Adrenergic receptor, 723, 727, 729, 732 Allosteric regulators, 763
Adrenergic receptors, 222 Alternative RNA splicing, 249
Adrenocortical obesity, 668 Alternative splicing, 241, 471
Adrenoleukodystrophy, 767 Altitude sickness, 589
Affymax, 222 Alzheimer mouse, 7
AFMoC. See Adaptation of fields for molecular Alzheimer’s disease, 115, 245, 257, 562, 677
comparison (AFMoC) Amanita muscaria, 480
African sleeping sickness, 120, 638 Amantadine, 792, 793
Aggrastat ®, 783 AMBA. See 3-Aminomethylbenzoic acid
Aggregation behavior, 132 (AMBA)
Agkistrodon rhodostoma, 118 Ameba, 240
Agonists, 9, 62, 113, 719, 720 American Antivivisection Society, 424
β-Agonists, 414 American Patent Office, 245
Agouti mice, 258 Amgen, 815, 822
Agranulocytosis, 45 Amide bond, 291, 292
AH 18665, 56 Amide group, 292
AH 18801, 56 p-Amidinophenylpyruvic acid, 504
AIDS. See Acquired immune deficiency N-Amidinopiperidine, 507
syndrome (AIDS) Amino acids, 9, 189
αIIb/β3 integrin receptor, 180, 778, 780, 784 Amino acids, 9, 189
ALADDIN, 731 β-Amino acids, 195
β-Alanine, 807 D-Amino acids, 197, 503, 509
D-Alanine, 519 Amino acid transporter, 182
D-Alanine transporter, 486 g-Aminobutyric acid, 183, 184
Alanine scan, 192, 757 γ-Aminobutyric acid, 183, 184
Albumin, 164 807, 808
Alcohol, 3, 488 Aminopeptidases, 493, 538
dehydrogenase, 674 Aminophenazone, 32
metabolism, 673 p-Aminophenol, 24
Aldehyde dehydrogenase, 124 Aminopterine, 649
Aldehyde reductase, 663 Aminotransferases, 474
Aldo–keto reductases, 644 Amitryptiline, 673
Aldolases, 299, 474 Amodiaquine, 45
Aldose reductase, 168, 328, 660, 661, 665 Amoeba dubia, 240
Aldosterone, 709, 710 AMP, 591
Aldosterone release, 590 Amphetamine, 102, 158
Alfaxalone, 763 Amprenavir, 548, 556
Aliskiren, 18, 418, 419, 543, 544 Amyl alcohol, 372, 403
Alkaloids, 25, 114, 116, 118, 175, Amyloid precursor protein (APP), 562
608, 759, 761 β-Amyloid protein, 562
Alkylating agents, 486 Anakinra, 741
Alkylating compounds, 185, 609 Analytical–deterministic mindset, 18
Alleles, 254 Anamirta cocculus, 350
Allen and Hansburys, 56 Ancrod, 118
Allergic rhinitis, 54 Androgen receptor, 702, 706, 709
Allergies, 756 Androgens, 413
Andromachus, 4 Antigen-presenting cells, 802


Androstenone, 737, 738 Antigens, 301, 818
Anemia, 815 drift, 792
Anesthesia, 423 shift, 792
Angina pectoris, 33, 592, 654 Antihelmintic, 761
Angioedema, 580 Antihistamines, 162
Angiogenesis, 581, 584, 774 Antimalarials, 490
Angiotensin, 719 Antimetabolites, 62, 486, 609
Angiotensinase A, 538 Antimycotics, 416, 646
Angiotensinases, 538 Antiparasitic drugs, 416
Angiotensin-converting enzyme (ACE), 14, Antiparasitic therapy, 594
222, 354, 537, 565, 566, 569, Antipodes, 91, 106
572–578, 596 Antisense DNA, 823
enzymes, 732 Antisense–DNA technology, 825
inhibitors, 56, 117, 176, 222, 357, 537, 565, Antisense drug, 825
574, 577, 732 Antisense nucleotides, 740, 825
Angiotensin I, 537, 572, 732 Antisense oligonucleotides, 486, 823
Angiotensin II (ATII), 190, 537, 538, 572, Antisense RNA, 823
732–734 Antisense strand, 247
antagonists, 732 Antisense therapy, 825
receptor antagonists, 733 Antitarget, 165, 754
Angiotensin III, 538 Antithrombotics, 512
Angiotensinogen, 537, 538 Antitussive effect, 49
Angle deformation, 319 Antivirals, 416
Animal experiments, 412, 424 Antra ®, 58
Animal husbandry practices, 793 Anxiety, 722
Animal Liberation, 424 Anxiolytics, 763
Animal models, 126 AO. See Atomic orbital (AO)
Ankylosing spondylitis, 590, 691, 694 APE (Ala-Pro-Glu) motif, 602
Anomalous scattering, 274 Apixaban, 516
Anopheles, 42 Aplysia californica, 759
Antabuse ®, 124 Apolipoprotein B, 654
Antacid, 488 Apolipoprotein B-100, 654
Antagonists, 9, 62, 113, 479, Apoptosis, 199, 527, 739, 803
719, 720 APP. See Amyloid precursor protein (APP)
Anterograde amnestics, 763 Aprepitant, 18, 203, 204
Anthranilic acids, 690 Aprotinine, 118
Antiallergic, 709 AQP1, 772–774
Antiandrogens, 709 Aquaporins, 243, 746,
Antibacterial therapy, 594 772, 773
Antibiotics, 3, 416, 486, 841 Aqueous environment, 72
Antibiotic therapy, 526 Aqueous humour, 589
Antibodies, 301, 488, 820 β1-AR, 726
Antibody–antigen interactions, 132 β2-AR, 726, 727
Antibody-coupled drugs, 185 Arabidopsis thaliana, 239, 240
Antibody-producing cells, 817 Arachidonic acid, 39, 684, 686, 687, 690,
Anticodon loop, 451, 839, 841 691, 694, 720
Anticonvulsives, 12, 763 Arachidonoylethanolamide, 691
Antidepressants, 12, 124, 162, 414 N-Arachidonoyl-p-aminophenol, 691
Antidiabetics, 125, 160 Arcanum, 3
Antiepileptics, 589 Argatroban, 509
Antifebrin, 23 Arg-Gly-Asp (RGD)
Antigen-binding site, 302, 818 motif, 487, 778, 780
Attacks on scientists, 424
pharmacophore pattern, 783 AutoDock, 441
sequence, 779 Autoimmune disease, 591
Ariëns, 101 Automated parallel synthesis, 211
Arildone, 800 Autophosphorylation, 738, 821
Aromas, 736 Autumn crocus, 4
Aromatic hydrocarbons, 673 Azathioprine, 121, 122
Arrhythmias, 45, 755 Azidothymidine, 830
Arsphenamine, 17, 27 Azithromycin, 844
Artemisia annua, 47 Azoles, 486
Artemisine, 48 AZT, 552, 830
Artemisinin, 17 AZT-monophosphate, 830
Artemisinine, 48
Arterial hypertension, 486, 623
Arteriosclerosis, 654 B
Artesunate, 48 BACE-1 and 2, 562
ASA. See Acetylsalicylic acid (ASA) Bacillus cereus, 750, 752
A-site, 840, 841 Back pocket, 605, 614, 615
Aspartame, 33 Bacteria, 62, 489, 650
Aspartic peptidases, 493 Bacterial KcsA channel, 757
Aspartic-protease inhibitors, 536, 537 Bacterial rhodopsin, 722
Aspartic proteases, 243, 418, 533, 537, 541, Bacteriophage M13, 213
543, 546, 556 Bacteriorhodopsin, 721
Aspergillus alliaceus, 119 Baculovirus, 246
Asperlicin, 119 Bajusz, 503
Aspirin ®, 18, 37, 38, 40, 41, 49, 61, BAK, 199
594, 783 Baker’s yeast, 239
Assemblines, 523 Ball-and-stick models, 324
Association constant Ka, 66 Balloon catheter treatment, 260
Association rate (kon), 169 Bambuterol, 177, 179
Astacin, 566 Barbital, 25, 763
Astatine, 822 Barbiturates, 17, 488, 673, 763
Astemizole, 755, 756 β-Barrel, 299, 766, 770
Asthma, 257, 591, 759, 801, 825 Barry Sharpless, 227
Astra, 19 Basal membrane, 527
AstraZeneca, 19, 508, 512, 514, 516 Base-exchange reaction, 452
Asymmetric center, 89, 91 Base triplets, 839
Asymmetric unit, 267 BASF, 443
Asymmetrical center, 91 Basiliximab, 821
Atesunate, 47 Batimastat, 582, 584
Atherosclerosis, 653, 659, 784 Batrachotoxin, 117
Atomic orbital (AO), 322 Baxter, 815
Atorvastatin, 19, 658 Bayer, 23, 27, 120, 516, 585, 658
Atovaquone, 48 Bayer cross, 38
ATP. See Adenosine triphosphate (ATP) Bayer-Schering, 19
ATP-dependent channel, 752 Bay K 8644, 102
ATP-dependent K+ channels, 754 B-cells, 802
AT1 receptor, 732, 734 BCL, 199
AT2 receptor, 734 BCL-2, 200
AT1 receptor antagonist, 732 BCL-2 protein, 825
Atrial fibrillation, 502 BCL-XL, 199–201
Atropine, 50, 114 BCL-XL (B-cell lymphoma), 199
Atropisomers, 91 BCL-XL protein, 790
BCR-ABL Blood coagulation, 501


fusion gene, 609 B lymphocytes, 802, 803
kinase, 614 BMS, 516
receptor tyrosine kinase, 611, 612 BMS-214662, 635
Bechterew’s disease, 590, 691 Body temperature regulation, 688
Beecham, 655 Boehringer Ingelheim, 508, 509, 552, 553, 815
Behringwerke, 505, 507 Boltzmann distribution, 326
Beilstein, 211, 216 Bond-breaking transition state, 475
Beilstein database, 318 Bond stretching, 319
Benserazide, 183, 629 Bone density, 706
Benzene, 175, 214 Bone resorption, 527
Benzocaine, 53 Bortezomib, 18, 524, 525
Benzodiazepines, 12, 763–765, 781 Boscutan, 18
Benzoic acid, 175, 406 Bothrops jararaca, 573
Benzylpenicillin, 521 Bovine insulin, 814
Benzylsuccinic acid, 145, 574, 575 Bovine rhodopsin, 722–724
Bernhard Stengl, 454, 460 BPP. See Bradykinin potentiating peptide
Betamethasone, 710 (BPP)
Beta-site-APP-cleaving enzymes, 562 Bradycardia, 9
Betaxolol, 725, 726 Bradykinin, 573, 579, 596, 732
Bicarbonate, 586 Bradykinin potentiating peptide (BPP), 573
Bile acids, 488, 653, 697, 709 Breast cancer, 561, 707, 708
Bilharziosis, 33 Brian Kobilka, 723–724
Biliary elimination, 410 Bristol–Myers–Squibb, 19, 514, 613, 614, 635
Biliary excretion, 540 British Biotech, 584
Bilinear model, 401 Brodie, 406
Bimosiamose, 786, 787 Bulk water, 71, 381
Binding affinity, 165, 380, 386 Bump and hole method, 614, 615, 617, 638
Binding constant, 66 Bundesinstitut für Arzneimittel und
Binding enthalpy, 141 Medizinprodukte (BfArM), 659
Bioavailability, 173, 174, 180, 813 Burimamide, 55, 56
Biochemical pathways, 134 Butaclamol, 100
Biogenic amines, 9, 124 Butane, 336, 340
Bioinformatics, 242 n-Butane, 336
Bioisosteric replacement, 155 Butterflies, 115
Biologically active conformation, 730 BVT-2733, 667, 669
Biologicals, 302 BX5633, 104, 105
glue, 529
half-life, 411
macromolecules, 61 C
substances, 121 CAAX sequence, 633
Biopharmaceuticals, 62 Ca2+/calmodulin-dependent phosphatase, 487
Biotin, 252, 474 Caco-2, 409
Biozentrum Basel, 438 Caco-2 models, 400
Bipolar disorder, 33 Cadherins, 243
Bis-β-chlorethylsulfide, 180 Cadmium, 565
Bismuth, 822 Caenorhabditis elegans, 134, 240
Blastocytes, 245 Caffeine, 114, 673
Blindness, 659 Cahn–Ingold–Prelog rule, 92
β-Blockers, 7, 61, 157, 161, 185, 419, 420, 725 CAI, 590
Blood–brain barrier, 158, 162, 173, 176, CAII, 587, 590
182–184, 401, 403, 629, 730 Ca2+ intracellular, 483
Blood–brain barrier penetration, 409 Ca2+ ions, 720, 748
CA IX, 590 Carboxy serine peptidases, 524


Calcineurin, 487 Carbutamide, 160
Calcium Carcinogenic effects, 398
channel blockers, 56, 61, 102, 483 Cardiac arrest, 756
channels, 31, 483, 590, 748 Cardiac arrhythmias, 412, 749, 756
ions, 748 Cardiac glycosides, 709
California sea slug, 759 Cardiovascular diseases, 3
Calorimeter, 140 Carfilzomib, 525
Calpain II, 529 Carica papaya, 526
Calpains, 527, 528, 530 Carpet viper, 779
Cambridge Crystallographic Database, 317, Cartilage tissue, 581
318 CAs. See Carbonic anhydrases (CAs)
Cambridge Crystallographic Data Centre, 284, Caspase inhibitors, 530
308, 365 Caspases, 243, 527, 528
Cambridge database, 364, 368, 369, 551 Catalyst, 368
cAMP. See Cyclic adenosine monophosphate Catalytic serine, 523
(cAMP) Catalytic sites, 299, 302
cAMP-dependent kinase, 603 Catalytic triad, 308, 309
Cancers, 245, 257, 524, 697, 823 Cataracts, 623
therapeutics, 515 Catechol-O-methyltransferase (COMT), 479,
therapy, 581 628–633, 637
Candesartan, 734, 735 Cathepsin D, 534, 536, 543, 546, 555,
Candida albicans, 562, 638 561, 562
Candida antarctica, 96 Cathepsin E, 546
Capecitabin, 181, 182, 826 Cathepsins, 527, 528
Captopril, 14, 17, 56, 222, 575, 576, 580, 673 Cathepsins B, L, K, M, 527
CAR. See Constitutive androstane receptor Cats, 793
(CAR) Cattle trypanosomiasis, 120
Carazolol, 724–726 Caudate nucleus, 413
Carbamazepine, 673 Cavbase, 435
Carbamoyl–enzyme complex, 518 CAVEAT, 204, 205
Carbamoyl–esterase complex, 519 CA VI, 590
Carbamoylsarcosine, 474, 475, 477 CA V inhibitor, 590
Carbapenem, 523 CA XII, 590
Carbenoxolone, 668 CB1 receptor antagonist, 691
Carboanhydrase II, 228 CCK receptor, 120
Carbonic anhydrase I (CAI), 389, 391–394 CCR5-receptor, 488, 789
Carbonic anhydrase II (CAII), 104, 389–394, CD20 antigen, 822, 823
587, 589 Cdc28 gene, 617
Carbonic anhydrase II inhibitors, 589 Cdc28 protein kinase (cyclin dependent
Carbonic anhydrase inhibitor, 589 kinase), 617
α-, β-, γ-, and δ-Carbonic anhydrases, 586 CDK2, 617
Carbonic anhydrases (CAs), 106, 243, 565, CD8+ lymphocytes, 803
569, 585 CDR. See Complementarity determining
Carbon monoxide (CO), 646, 670 regions (CDR)
Carboxypeptidase A, 565, 569, 574, 575 CDR3α, 806, 808
Carboplatin, 311 CDR3β, 806, 808
Carboxylases, 474 CD4 receptor, 789
Carboxylesterase, 181 CDR3 loop, 808
Carboxypeptidase–benzylsuccinate CDR loops, 818
complex, 574 CD8+ T-cell receptors, 803
Carboxypeptidase G2, 187 Celecoxib, 18, 178, 588, 590, 691, 692
Carboxypeptidases, 14, 187, 493, 538, 566 Celera Genomics, 239
Celiac disease, 529 Chloroquine, 17, 45


Cell adhesion, 778 Chlorproguanil, 48
Cell–cell contacts, 243 Chlorpromazine, 12, 17, 32, 162, 163, 414,
Cell–cell recognition, 487, 599 415, 673
Cell cycle, 837 CHO cells, 822
Cell cycle regulation, 134 Cholecystokinin (CCK), 119
Cell death, 604 Cholesterol, 653, 709
Cell differentiation, 604, 697 Cholesterol biosynthesis, 653
Cell growth, 604, 697 Cholestyramine, 653
Cell membranes, 697, 719, 837 Cholinergic, 9
Cell proliferation, 590 Cholinesterase inhibitors, 518
Cellular immune defense, 803 Cholinesterases, 177
Cellular immune response, 837 Chorea Huntington, 233
Cell-wall biosynthesis, 486, 519 Chromane, 572
Central nervous system (CNS) Chromane group, 572
acting drugs, 414 Chromophoric reactions, 132
active substances, 401 Chronic alcoholics, 674
bioavailability, 403 Chronic anxiety, 12
Cephalosporins, 28, 118, 119, 479, 486, 520 Chronic inflammatory diseases, 3
Cerivastatin, 658, 673 Chronic myeloid leukemia, 251, 609
Cetus, 235 Chronic obstructive lung disease, 591
Cetuximab, 821 Chronic obstructive pulmonary disease
cGMP, 590–592 (COPD), 801
CGO-38560, 542 Chronic polyarthritis, 691
CGP-38560, 543 Chugai, 815
CGS 27023A, 584, 585 Chymosin, 534
Chagas disease, 8, 638 Chymotrypsin, 303, 494, 495, 498, 512, 540
Charge-assisted hydrogen bond, 69 Cialis ®, 591
CHD. See Coronary heart disease (CHD) Ciba, 542
Cheese effect, 13, 681 Ciclosporin, 191, 487, 673, 837
Chemical abstracts, 211, 216 Ciclosporin A, 118
Chemical promiscuity, 64 Cigarette smoke, 510
Chemical shift, 282 Ciglitazone, 711, 713
Chemiluminescence, 134 Cimetidine, 17, 55, 56, 673
Chemoenzymatic synthetic strategy, 191 Cinchona, 8
Chemokine receptor CCR5, 790 Cinchona bark, 42, 43
Chemokines, 740 Cinchona officinalis, 8
Chemotherapeutics, 3, 826 Ciprofloxacin, 673, 834, 835
Chemotherapy of tumor disease, 649 Cisapride, 755, 756
Chicken DHFR, 651 Cisplatin, 311, 313
Chimeric proteins, 820 11-cis-Retinal, 723
Chinese Pharmacopeia, 5 Citalopram, 683
Chiral, 89, 95, 99, 101, 106, 108 Citric acid, 340
centers, 91 Civilization diseases, 623
pool, 94 c-Kit receptor kinase, 611
Chirality, 89 Cladribine, 826
Chiron, 222 Clarithromycin, 673, 844
Chloralhydrate, 25 Claviceps purpurea, 118
Chloramphenicol, 177, 179, 486, 841, 845, 846 Clavulanic acid, 523
Chlordiazepoxide, 12, 17, 31 ClC channel, 765, 766
Chloride channels, 350, 483, 747 ClC-0 channel, 767
Chloroform, 25 ClC-chloride channel, 243
Chlorophenylmethoxybenzyloxypiperidine, 543 Clenbuterol, 160
Click chemistry, 227 Complementarity determining regions


Clift ®, 2 (CDR), 818
Clindamycin, 841 Complementary DNA (cDNA), 246, 252, 253
Clinical trials, 397, 421 Complementary mRNA, 823
Clobutinol, 754, 755 Composite receptor profiles, 737
Clofibrate, 33, 176, 653, 711, 713 Computational method, 355
CLOGP, 411 CoMSIA. See Comparative molecular
Clomiphene, 706 similarity indices analysis (CoMSIA)
Clonidine, 33 COMT. See Catechol-O-methyltransferase
Clopidogrel, 478 (COMT)
Clorgyline, 678, 682 Concentration gradients, 484, 746
Clotrimazole, 673 CONCORD, 318
Clozapine, 414, 415 Conformational, 353
ClpP-protein, 526 analysis, 343
CNS. See Central nervous system (CNS) analysis, 343
CO. See Carbon monoxide (CO) flexibility, 338
Coactivators, 698 restriction, 203
Coagulation cascade, 478, 494, 514 transformations, 778
Coagulation process, 8 Congestive heart failure, 2, 4
Coagulopathy, 257, 478 Coniine, 114
Coal tar, 23 Conjugation, 174
Cobalt, 565 Connective tissue, 581
Coca Cola, 52 α-Conotoxin, 760–762
Cocaine, 5, 17, 51–53, 59, 61, 108, Constitutionsformel der Organischen Chemien, 14
114, 478 Constitutive androstane receptor (CAR), 714
Cocktail, 146 Construct, 260
Codeine, 49, 113, 673, 675 Contergan ®, 99, 102, 423
Coding triplet, 254 Contour maps, 386, 391
Codons, 213, 839 Convergent synthesis strategy, 226
Cofactors, 342, 641, 644, 699 Copper, 641, 822
Colchicum autumnale, 4 Corepressors, 698
Collagen, 581 CORINA, 318
Collagenase inhibitors, 582 Coronary heart disease (CHD), 623, 653,
Collagenases, 120, 566, 581 706, 708
Combination drugs, 490 Corona viruses, 528
Combination preparations, 490 Corpus luteum hormone, 708
Combination therapy, 554 Correlation coefficient (r), 377
Combinatorial chemistry, 136, 211, 221, COR therapeutics, 781
224, 225 Corticoids, 709
Combinatorial substance libraries, 126 Corticosteroids, 29, 709
COMBINE method, 388 Corticosterone, 666, 668, 669, 710
CoMFA. See Comparative molecular field Cortisol, 666, 710
analysis (CoMFA) Cortisone, 666
Compactin, 655 Coulomb fields, 387
Comparative molecular field analysis Coulomb potentials, 380, 382, 384, 387
(CoMFA), 381, 382, 384, 386, 387 Coulomb’s law, 319
method, 388 Counterion, 404
models, 386, 387 Countess powder, 42
Comparative molecular similarity indices COX. See Cyclooxygenases (COX)
analysis (CoMSIA), 387, 389 COX-1, 40, 684–686, 688, 690–692, 694
Compartmentalization, 413 COX-2, 40, 684, 685, 687–689, 691, 692, 694
Compensation of enthalpy and entropy, 168 CP 96 345, 418
Competitive inhibitors, 491 CPK models, 323
3C proteases, 528 CYP2R1, 670


Crack, 53 Cyproterone acetate, 709
CRC220, 505, 507 Cysteine-protease inhibitor, 305
Creatinase, 474, 475 Cysteine proteases, 243, 493, 526, 528, 562
Creatine, 474–477 Cysteine residue, 147
Crohn’s disease, 245, 825 Cystic fibrosis, 245, 257
Crossvalidation, 385 Cytidine deaminase, 181, 182
Cruzipain, 527 Cytochrome C, 435
Cryo-electronmicroscopic, 799 Cytochrome CYP 3A4, 659
Cryoelectron microscopy, 279 Cytochrome P450 enzymes, 165, 641, 675
Crystal lattice, 339 Cytochrome P450s (CYPs), 56, 243, 412, 646,
Crystalline hemoglobin, 286 669, 670, 714–716, 844
Crystallization, 266 Cytochromes, 437
Crystal packing, 364, 369 Cytokine receptor, 791
Crystal structure analysis, 265 Cytokines, 243, 584, 741, 803
Cubane, 214 Cytomegaloviral retinitis, 825
Curare, 115 Cytomegalovirus, 523
Cushing syndrome, 668 Cytoskeleton, 243
Cutter/Miles (Bayer), 815 Cytosol, 482
CVX15, 726 Cytotoxic killer cells, 837
CXCR4, 789 Cytotoxic T killer cells, 803
CXCR4 chemokine receptor, 726
Cyanide, 646
Cyanopindolol, 726, 727 D
Cyclamate, 33 Dabigatran, 509
Cyclic adenosine monophosphate (cAMP), Daclizumab, 821
474, 480, 590, 591, 720 D1 agonists, 730
Cyclic guanosine monophosphate (cGMP), 33 Dalfopristin, 845, 846
Cyclin-dependent kinase inhibitors, 526 Danger of misuse, 19
Cyclines, 601 Danio rerio, 136
Cycloguanil, 179, 180 Dapivirine, 832
Cyclooxygenase II inhibitor, 590 Dapsone, 48
Cyclooxygenase inhibitors, 688 Daptomycin, 771
Cyclooxygenases (COX), 39, 40, 58, 66, 156, Darunavir, 166
178, 646, 684 Dasatinib, 613, 614
Cyclophilin, 487 Database search, 367
Cyclophosphamide, 180, 181 Data set, 381
D-Cycloserine, 486 Daunorubicin, 313
Cyclosporin A, 17 DBD. See DNA-binding domain (DBD)
Cyclotheonamide, 500 DCCI. See Dicyclohexylcarbodiimide (DCCI)
Cyclotheonamide A, 500 3D database, 550, 731
Cyclosporin, 113 DDI, 553
Cylinder, 299 DDT, 490
Cyp3, 286 Deacetylating enzymes, 596
CYP 3A, 716 Deaminases, 827
CYP 3A4, 56, 671–673, 675, 693, 826 Debrisoquine, 675, 676
CYP5A1 (thromboxane synthase), 670 Decahydroisoquinoline, 546
CYP19A1 (aromatase), 670 Decanol, 403
CYP 3A4 inhibitors, 547, 671, 844 Decarboxylases, 474
CYP 2C9, 671, 675, 693 Decarboxylase inhibitor, 183
CYP2D6, 671, 675, 676, 693 Deconvolute, 215
CYP2E1, 673, 674, 676 Deconvolution, 220, 225
CYP2J2, 670 Deformylase, 599
Degrees of freedom, 74, 75 Diels–Alder reactions, 227


11-Dehydrocorticosterone, 666 Diethylene glycol, 423
Dehydrogenases, 299, 474, 642 Diethyl ether, 372
Deinococcus radiodurans, 838 N,N-Diethyl lysergamide, 29
Delyside ®, 30 N,N-Diethyl nicotinamide, 29
De Materia Medica, 4 Diethylstilbestrol, 703, 705, 706
Dementia, 677 Diffraction experiment, 280
Dendritic cells, 488, 803 Diffraction pattern, 269
de novo design, 433, 444 Diffractometer, 278
de novo drug design, 306 Diffusion character, 133
Density functional theory, 323 Digestive enzymes, 494
Depolarization, 748 Digitalis purpurea, 114
L-Deprenyl, 678–680 Digitoxin, 115
Depression, 12, 125, 677, 717, 721, 722 Digoxin, 113, 115, 769
Design, 157 Dihydrofolate reductase, 646
Desipramine, 14 Dihedral angles, 279, 294, 319
Desolvation, 71, 82 Dihedral angles φ and ψ, 293
Desolvation processes, 381 Dihydroartemisinine, 48
Desoxyuridylate, 646 Dihydrofolate (DHF), 357–360, 367, 649
Detergents, 133, 266 Dihydrofolate reductases (DHFRs), 121, 358,
Devazepide, 119 647, 649, 650
De viribus electricitatis in motu musculari, 6 inhibition, 649
Dexamethasone, 673, 710 inhibitors, 8, 491, 649
DFG loop, 603, 612 Dihydrofolate synthesis, 156
DFG (Asp-Phe-Gly) motif, 602 Dihydrofolic acid (DHF), 28, 486, 644
DFP. See Diisopropylfluorophosphate (DFP) Dihydroimidazoles, 350
DHF. See Dihydrofolate (DHF); Dihydrofolic Dihydropyridine ring, 363
acid (DHF) Diisopropylfluorophosphate (DFP), 494
DHFRs. See Dihydrofolate reductases Diketopiperazine, 176, 177
(DHFRs) Diltiazem, 749
Diabetes, 3, 245, 622, 697, 815, 816, 847 N,N-Dimethyl-β-bromophenethylamines, 376
mellitus, 739, 814 Dioskurides, 4
therapy, 517 Dipeptide isostere, 539
Diabetics, 813 Dipeptidylaminopeptidase IV (DPP IV), 516
angiopathy, 784 Diphenhydramine, 54, 162
neuropathy, 664 Diphosphoglyceric acid, 430, 431
Diacylglycerol, 720 Dipivefrin, 186
3,4-Diaminopyrrolidine, 557 1,3-Dipolar cycloadditions, 222, 227
Diastereomeric mixture, 93 Diprotic acid, 533
Diastereomers, 93 Direct methods, 274
Diazepam, 19, 763 Dirty drugs, 416
Diazoxide, 752, 753 Dirty ligands, 417
Dibenzyl-4,4′-dialdehyde, 431 Dirty test models, 417
Dicer protein, 248 Disoxaril, 800
Dichlorodiphenyltrichloroethane (DDT), Dissociation constant Kd, 66
43–45, 59 Dissociation rate (koff), 169
Dichlorodiphenyldichloroethylene (DDE), Dissociation rate constants, 138
44, 45 Distance–geometry calculations, 283
Diclofenac, 690, 694 Distribution, 397, 433
Dicoumarol, 124, 125 coefficient D, 408
Dicyclohexylcarbodiimide (DCCI), 217 systems, 403
Didanosine, 552 Disulfide bridges, 246
Dielectric constant, 405 Disulfiram, 124
Diuretics, 124, 589, 774 D2-type dopamine, 417


D/L nomenclature, 92 Ducks, 793
Δlog P value, 403 Duke University, 788
DMP-323, 551 Duodenal ulcers, 3
DMP-412, 551 DuP 753, 732
DNA, 41, 61, 147, 236 Dupont, 732, 734
ligase, 235 Dupont–Merck, 368, 550, 551
polymerase, 236, 827, 828 Dynamic combinatorial chemistry, 227
synthesis inhibitors, 609
DNA-binding domain (DBD), 481, 699
DNA-response element, 699 E
DNA–RNA hybrid strand, 829 Eating disorders, 721
Dobutamine, 160 ECE. See Endothelin-converting enzyme
Docetaxel, 717 (ECE)
DOCK, 441 Echis carinatis, 779
Docking programs, 441 Eclipsed, 335
Docking techniques, 429 Ecstasy, 158, 159
Dolabella auricularia, 116 Ecuadorian poison dart frog, 116, 759
Dolastatine, 116 Edatrexate, 649
Doorkeeper, 765 Edman degradation, 220
L-DOPA, 17, 158, 182, 183, 484, 490, 629, EEEE locus, 754
631, 729 Efegatran, 503, 504
Dopamine, 12, 13, 158, 159, 162, 181–183, Effector proteins, 480
324, 414, 415, 417, 484, 485, 677, 678, EGF. See Epidermal growth factor (EGF)
694, 719, 729–730 EGR-1, 257
antagonist, 163 Eicosanoid metabolism, 670
D2 receptor, 51 Elaspol ®, 512
receptor ligands, 729 Elastase, 495, 498, 510
receptors, 415, 729 Elastase inhibitors, 510–512
Dopaminergic, 9 Electric ray, 758
D-orbitals, 594 Electrochemical potential, 745
Dorzolamide, 17, 588, 589 Electrochemical redox cells, 746
Double-prodrug strategy, 180 Electrolyte homeostasis, 537, 709
Double-stranded RNA, 247 Electron microscopy, 481, 721, 768
DP178, 788 Electron-transport protein, 106
DPG, 430 Electrophiles, 175
DPP IV, 495, 516, 517 2D Electrophoresis, 250
3D-QSAR, 388 Electroshock, 12
D1 receptor, 417, 729, 730 Electrostatic interactions, 72
D2 receptor, 414, 730 Electrostatic potential, 358
D3 receptor, 414 Elementary unit cell, 267
D4 receptor, 415 Eli Lilly, 79, 814–816, 843
Dreiding models, 323, 324 ELISA. See Enzyme-linked immunosorbent
Dropsy, 2, 4 assay (ELISA)
Drosophila melanogaster, 136, 239, 240 Elixir, 3
Drug–drug interactions, 674 EMBL, 388
Drug market, 472 Embryonal stem cells, 258
Drug metabolism, 165, 670, 671 Embryonic development, 697
DrugScore, 366, 388 Embryonic morphogenesis, 243
Drug targeting, 183, 185 Emesis, 204, 722
3D structure–activity analysis, 381 Emetin, 114
3D structure–activity relationships, 571 Emphysema, 510, 823
D1-type dopamine, 416 Empty hydrophobic pocket, 458
Enalapril, 17, 56, 176, 177, 577, 580 Epilepsy, 765


Enalkiren, 540 Epinephrine, 9
Enamelysin (MMP-20), 581 Epipedobates tricolor, 116
Enantiomeric excess (ee) value, 94 Epitope mapping, 218
Enantiomers, 89–92, 94–96, 99, 102–108 Eplerenone, 709, 710
Enantiopreference, 98 Epothilones, 487
Enbrel ®, 822 Eprosartan, 735
Encephalitis, 798 Eptifibatide, 780, 781
Endocrine adenocytes, 706 Equieffective molar dose, 375
Endocytosis, 488 α-ER, 706
Endogenous antioxidant systems, 660 β-ER, 706
Endogenous peptides, 804 Erectile dysfunction, 591
Endogenous proteins, 813, 814 Ergoline alkaloids, 29, 34
Endopeptidases, 493 Ergosterine biosynthesis, 32
Endopeptidase 24.11, 565 Ergosterol, 486
Endoperoxide PGG2, 685 Ergot, 118
Endoplasmic reticulum (ER), 803 Ergot alkaloids, 119
β-Endorphin, 192 Ergotamine, 118, 119
Endothelin, 190, 719 Erythrocyte deformation, 432
Endothelin-converting enzyme (ECE), Erythrocytes, 772
565, 566 Erythromycin, 672, 673, 843–846
Endothiapepsin, 536, 541 Erythropoietin (EPO), 113, 118, 738, 740,
Enfuvirtid, 18, 553 741, 815
Enkephalins, 192, 732 Erythropoietin receptor (EPOR), 740
Entacapone, 631 Escherichia coli, 135, 213, 240, 246, 450, 595,
Enterobiasis, 761 650, 651, 814
Enthalpic, 380 DHFR, 651
Enthalpically driven binders, 166 K12, 234
Enthalpy (ΔH), 67, 68, 141, 166, 442 E-selectins, 487, 784
Enthalpy-driven binding, 75 E-site, 840
Entropic, 380 Esterases, 94, 413, 474, 493, 517, 565
Entropically driven binders, 165 Esterification, 178
Entropic optimization, 82 Estradiol, 698, 700, 701, 703, 706, 716
Entropy (TΔS), 68, 74, 141, 166, 381, 442 Estrogen receptor, 700, 703, 706, 708
Entropy driven, 75 Estrogens, 413
Enzymatic kinetic resolution, 90 Estrone, 17
Enzymatic reaction, 132 Etanercept, 822
Enzyme-inhibitor assays, 11 ETH, 449, 460, 508, 510
Enzyme-linked immunosorbent assay Ethanol, 25, 125, 403
(ELISA), 132, 252 Ether anesthesia, 25
Enzymes, 62, 113, 484 Ethinylestradiol, 673
dihydrofolate reductase, 358 p-Ethoxyacetanilide, 24
inhibitors, 122 Ethylcarbamate, 25
kinetics, 453 Ethylenediaminetetraacetic acid (EDTA), 565
substrates, 113, 121 Etoricoxib, 691
Epalrestat, 663, 664 Etorphine, 50
Ephedrine, 113, 114, 158, 159 Etravirine, 833
(1R,2S)-(–)-Ephedrine, 95 Eukaryotes, 240
Epibatidine, 116, 117, 759–761 Euphorbiaceae, 175
Epidermal growth factor (EGF), 482, 739 Evolution, 256
Epidermal growth factor receptor, 821 Evolutionary strategy, 157
Epigenetics, 258 Exanta ®, 509
Epigenome, 258 Excretion, 397
Exocytosis, 777 Fischer convention, 91


Expression pattern, 252 Fishberry, 350
Extensive metabolizers, 675 FK 506, 118
Extracellular ligand-binding domains, 758 Flap region, 541, 544
Extracellular matrix, 527, 778 Flavonoids, 671
Extravasation, 784 Flavin adenine dinucleotide (FAD), 474, 642,
644, 645
Flavinmononucleotide (FMD), 474
F Flavinucleotides, 642
Fab branch, 302 Flavones, 114, 350
Fab domains, 817–819 Flavoproteins, 644, 645
Fab regions, 817 Flavr-Savr tomatoes, 825
Factor VII, 501 FlexX, 441
Factor VIIa, 495, 497, 501, 502, 512, 514 Flip-books, 305
Factor VIII, 113, 118, 502, 529, 814, 815 Fluconazole, 646, 671, 673
Factor VIII deficiency, 814 Fluid mosaic membrane, 65
Factor X, 501 Flumazenil, 764, 765
Factor Xa, 303, 494, 495, 497, 502, Fluorescence
512, 514–516 anisotropy, 133
Factor Xa inhibitors, 514, 516 marker, 305
FAD. See Flavin adenine dinucleotide (FAD) measuring techniques, 133
FAD/NAD(P)-binding domains, 243 Fluorescence correlation spectroscopy (FCS),
Falcilysin, 562 133
Falcipain, 527 Fluorescence resonance energy transfer
Falcipaines, 562 (FRET), 133
False-negative, 136 Fluorine NMR, 143
False-positive, 136 Fluorophores, 133
False substrates, 826 5-Fluorouracil, 181, 182, 826
Famciclovir, 17 Fluoxetine, 14, 17
Familial adenomatous polyposis (FAP), 590 Flurbiprofen, 690
Farnesyl anchor, 633 Flu vaccine, 793
Farnesyldiphosphate, 634–6367 Fluvastatin, 33
Farnesyltransferase inhibitors, 637 Flu virus, 488
Farnesyl transferases (FTases), 487, 633–635, FMD. See Flavinmononucleotide (FMD)
637 FMN, 644, 645
Fatty acids, 697, 698, 709 Foam cells, 654
Fc domain, 302, 817 Folding patterns, 297, 299
FDA. See Food and Drug Administration Folding problem, 430
(FDA) Folic acid analogues, 650
Feature-Trees method, 368 Follicle hormone, 708
Fexofenadine, 162 Fomivirsen, 825
Fibrates, 659, 710, 711 Food and Drug Administration (FDA), 423,
Fibrin, 118, 502 655, 801
Fibrinogen, 118, 190, 502, 653 Foot-and-mouth disease, 798
Fibrinogen-receptor antagonist, 781 Force field, 319, 337, 342
Fibrinopeptide, 503 Force-field calculations, 14, 318–319, 321
Fibronectin, 244 Formamide, 291, 292
Fick’s law of diffusion, 400 N-Formylmethionine, 599
Fidarestat, 76, 77, 664 Fosamprenavir, 548
First pass, 173 Fosmidomycin, 48
First to file, 655 Fourier transform, 274, 275, 280
First to invent, 655 FP domain, 788
Fischer assignment, 92 Fragment, 139
Frances Power Cobbe, 423 Gene family, 158


Francine Jotereau, 806 Genentech, 235, 496, 815
Free energy, 141 Gene products, 493
Free enthalpy, 442 General anesthesia, 3
Free–Wilson analysis, 378, 380, 394 Gene regulation, 697
Fructose, 660 Gene silencing, 4, 135
Fruit fly, 136, 241 Gene targeting, 245
FTase inhibitors, 637 Gene technology, 11, 118, 130, 261,
FTases. See Farnesyl transferases (FTases) 741, 814
Fucose, 785 Gene technology methods, 11, 813
Fugu fish, 117, 483 Gene therapy, 259
Fulvestrant, 707, 708 Genetic diseases, 256
Functional antagonist, 62 Genetic polymorphisms, 254, 661, 737
Functional assay, 452 Genome, 252
Fungal infections, 486 Genzyme Transgenics, 815, 820
Furin, 495, 515, 523 Geranylgeranyl anchor, 633
Furosemide, 17, 160 Geranylgeranyl groups, 634
Fusion inhibitor, 790 Geranylgeranyl transferases I and II
Fusion process, 788 (GGTase I and II), 633
Fusion protein, 252 Germanin ®, 120
Fuzeon ®, 790 Gestagens, 413
GF protein, 135
GGTases, 635, 637
G Gi, 720
G12/13, 720 Gibbs free energy, 67
GABA. See γ-Aminobutyric acid (GABA) Gi/0 proteins, 480
GABAA receptors, 757, 762, 763 GIP, 516
Gag-pol gene, 828 Gi/0 proteins, 480
Galactose, 785 Glaucoma, 589, 774
β-Galactosidase, 134 Glaucoma inhibitors, 589
Gallic acid, 631 Glaxo, 19, 56
Galvus ®, 517 GlaxoSmithKline (GSK), 19, 56, 711, 795
γ-Aminobutyric acid (GABA), 183, 184, 483, Gleevec ®, 611
489, 757, 762, 763 Glibenclamide, 161
Gastric calculus stones, 5 GLP-1, 516, 754
Gastric ulcer disease, 589 GLP1R, 754
Gastrin, 53 Glucocerebrosidase, 815
Gastroduodenal ulcers, 53 Glucocorticoid receptor, 706, 708, 717
Gastrointestinal diseases, 756 Glucocorticoids, 673, 709
Gastrointestinal flora, 256 Glucocorticoid steroids, 413
Gastrointestinal motility disorders, 722 Glucose, 92, 660
Gatekeeper residue, 605, 614, 615 Glucose-1,6-bisphosphate, 474
Gauche, 335, 336 Glucose metabolism disorders, 659
Gaucher’s disease, 815 Glutamate receptors, 483
Gaussian function, 384, 387 Glutathione, 175
G-CSF. See Granulocyte colony-stimulating Glutathione transferase, 175
factor (G-CSF) D-Glyceraldehyde, 92
Gefitinib, 739, 740 L-Glyceraldehyde, 92
Gelatinases, 569, 581 Glycine, 174, 763
Gelatinases (MMP-2, -9), 581 Glycine receptor, 762
Gemfibrozil, 659 Glycopeptide antibiotics, 489
Geminal-diol transition state, 534 Glycopeptide transpeptidase, 519, 520
Gene expression, 240 Glycoprotein 170, 490
Glycoprotein αIIbβ3, 487 Guanosine, 287


Glycoprotein gp120, 726 Guanosine-3′,5′-cyclophosphate
Glycoprotein GP 170, 64, 412, 485 (cGMP), 720
Glycoprotein matrix, 581 Nα-Guanylhistamine, 55
Glycoproteins, 784 Guanylylcyclase, 592
Glycosides, 114 Guinness Book of World Records, 41
Glycosylated proteins, 237 Gyki 14766, 503, 504
Glycosylation, 599 Glycosylation, 246
Glycyrrhiza glabra, 666 Gyrase, 486, 833, 848
Glycyrrhizin, 666, 667
GMD, 441
GMP, 591 H
Go, 720 H 261, 547
GOLD, 441 HAART therapy, 554, 790
Gonorrhea, 650 Haemophilus influenzae, 239
Gout, 4, 121 Haloarcula marismortui, 838
gp41, 788, 789 Halofantrine, 45
gp120, 788, 789 Haloperidol, 32, 51, 414, 415, 417, 673
GP170, 490 Hammett equation σ, 373
GPCRs. See G protein-coupled receptors Hansch analysis, 375, 378
(GPCRs) Hansch equation, 380
GPCR-type olfactory receptors, 737 Hansch model, 376
gp120 protein, 21 H2 antagonists, 7
GPR119, 754 H1 antihistamines, 54
G protein-coupled receptors (GPCRs), 65, 119, Hantzsch synthesis, 31
222, 243, 244, 254, 438, 472, 489, 677, Hartree–Fock equation, 322
719–721, 736, 789 Hartree–Fock method, 321
G proteins, 479, 480 HATs. See Histone acetyltransferases
Gq/11 protein, 480, 720 (HATs)
Gramicidin A, 770, 771 Hay fever, 54
Gram-negative bacteria, 769, 833 H-bond donor, 358
Gram-positive bacteria, 771 HCV-NS3/4A, 523
Granulocyte colony-stimulating factor HDACs. See Histone deacetylases
(G-CSF), 815 (HDACs)
Granulocytopenia, 55 HDL cholesterol level, 623
Grapefruit juice, 673 Heart attack, 502, 653, 654
Greek key, 300 Heart disease, 726
Greek-key barrel, 300 Heat of reaction, 140
Green-fluorescent protein (GFP), Heat shock protein, 699
134, 135 Hedonal ®, 25
Grepafloxacin, 755, 756 Helena Danielson, 169
GRID program, 364, 382, 441, Helicobacter pylori, 9, 58
443, 795 α Helix, 196, 293, 295, 297, 316
GROW, 443 Helix–turn–helix motif, 835, 836
Growth disorders, 815 Hemagglutinin, 792
Growth factors, 472 Heme, 685
Growth hormone, 815 group, 641, 646
Growth hormone receptors, 816 proteins, 670
Grünenthal, 180 Heme-containing proteins, 646
Gs, 480, 720 Hemiacetal bond, 478
GSK. See GlaxoSmithKline (GSK) Hemagglutinin, 488
GTPase, 840 Hemoglobin, 14, 430, 431, 562, 646
Guanidines, 350 Hemophilia, 815
Hemophiliacs, 814 Hormone contraceptives, 717


Heparin, 118 Hormones, 113, 413
Hepatitis A, 798 Hortus Eystettensis, 2
Hepatitis B, 798, 820 Hot spot analysis, 457
Hepatitis B vaccine, 815 Hot spots, 137, 199, 364–367, 369, 432
Hepatitis C virus, 523 HPLC separation, 140
Hepatitis viruses, 528 HR1 domain, 788, 790
hERG channel. See Human Ether-à-go-go HR2 domains, 788, 790
Related Gene channel (hERG channel) H2-receptor inhibitors, 3
hERG channel binding, 412 HR1 helices, 788
hERG ion channels, 165, 412 HR2 peptide, 788
hERG potassium channel, 754, 755 HR2 peptide strand, 791
Herman the bull, 815 11β-HSD, 667, 668
Heroin, 5, 49, 53, 176, 177 HSD1, 666
Heroin addiction, 49 11β-HSD1, 666–669, 693
Heroine, 108 HSD2, 666
Herpes simplex virus, 523 11β-HSD2, 666
Herpes viruses, 260, 523 11β-HSD2 inhibitor, 666
Heterotrimeric G protein, 720 HT29, 409
hGH. See Human growth hormone (hGH) 5-HT1B, 419
High blood pressure, 653 5-HT1Dβ, 419
High-density lipoprotein, 654 5-HT3 receptor, 757
High-resolution DNA chips, 255 5-HT receptors, 420
High-resolution NMR spectroscopy, 265 HUGO. See Human Genome Organization
High-throughput screening (HTS), 130, (HUGO)
136, 432 Huisgen reaction, 227, 228
Hinge region, 602 Human β2-adrenergic receptor, 722
HINT, 382 Human colon carcinomas, 409
Hippuric acid, 175 Human cranium powder, 5
Hirudin, 113, 117 Human Ether-à-go-go Related Gene channel
Histamine, 53, 54, 57, 161, 162, 163, 719 (hERG channel), 8
Histone acetyltransferases (HATs), 259 Human fibroblast collagenase, 583
Histone deacetylases (HDACs), 259, 596 Human genome, 238, 429, 601
Histones, 258 Human Genome Organization (HUGO), 238
Hits, 129 Human growth hormone (hGH), 118,
HIV. See Human immunodeficiency 489, 815
virus (HIV) Human immunodeficiency virus (HIV), 240,
HI-9.2 virus, Phage λ, 240 488, 489, 546, 788, 789, 791, 814
HIV-RT inhibitor, 830 infection, 726
H+/K+-ATPase, 57, 185 inhibitors, 561
H+/K+ pump, 769 integrase, 267
HMG-CoA. See Hydroxymethylglutaryl- protease, 106, 167, 534, 536, 546, 547, 551,
coenzyme A (HMG-CoA) 552, 555, 556, 561, 829, 830
H1N1 (swine flu), 792 protease inhibitors, 80, 81, 166, 169, 489,
H5N1(bird flu), 792 546–548, 553, 717
Hoechst, 27, 814 protease substrate analogue, 546
Hoffmann-La Roche, 507 replication, 553
Hogben, 406 reverse transcriptase, 828, 829, 833
Homeobox, 244 Humanization, 820
Homologous recombination, 245 Humanization of animals, 260
Homoproline, 546 Humanized antibodies, 302
Homo sapiens, 240 Human leukocyte elastase, 510
Homoserine dehydrogenase, 643 Human rhinovirus, 528
Humira ®, 822 I
Humoral, 9 Ibritumomab, 822
Humoral complement system, 802 Ibuprofen, 104, 690, 692, 694
Humoral immune response, 837 ICAM-1, 799
Humoral system, 816 ICAMS. See Intercellular adhesion molecules
Humulin ®, 235 (ICAMS)
Humulin insulin, 235 ICI, 7, 510, 707
Huperzin A, 115 ICI 200880, 510, 511
Hybridoma cells, 819 IC50 value, 67
Hydantoinases, 94 IDD594, 664, 665
Hydration enthalpies, 749 Idea generators, 445
Hydride ion, 649 I.G. Farbenindustrie, 24
Hydrochlorothiazide, 160 IgG antibody, 817
Hydrocortisone, 710 IH values, 403
Hydrogen-bond acceptor, 69 IL-2, 837
Hydrogen-bond donor, 68 Iloilo, 843
Hydrogen bond network, 72 Iloson ®, 844
Hydrogen bonds (H-bonds), 68, 168 Image-mirror-image pair, 91
Hydrolases, 474 Imatinib, 18, 611–613
α/β-Hydrolases, 243 Imide bond, 306
Hydrophobic contact, 73 Imipenem, 523
Hydrophobic interactions, 71 Imipramine, 13, 32, 162, 163, 673
Hydrophobic protein–ligand interaction, 74 Immune reactions, 486, 823
Hydrophobic test compounds, 132 Immune response, 777
Hydroxamic acids, 569 Immune response modulation, 688
p-Hydroxyacetanilide, 24 Immune stimulation, 777, 805, 837
4-Hydroxycyclohexanone, 551 Immune system, 816
Hydroxydebrisoquine, 676 Immunoassays, 132
4-Hydroxydebrisoquine, 675 Immunoglobulins, 301, 302
Hydroxylases, 474 Immunosuppressants, 260, 837
Hydroxymethylglutaric acid (HMG), 655 Immunosuppressive, 709
Hydroxymethylglutaryl-coenzyme Imperial University in Dorpat, 7
A (HMG-CoA), 176, 178, 653, 655 Incretin hormones, 516
inhibitors, 166 Indinavir, 166
reductase, 176, 643, 653, 655, 693 Indolylthioureas (ITU), 832
reductase inhibitor, 655 Indometacin, 690, 692
3-Hydroxy-3-methylglutaryl-coenzyme-A Induced fit, 64, 350
reductase (HMG-CoA reductase), 653 Induced-fit adaptations, 460
p-Hydroxypropiophenone, 406 Industrialized countries, 3
11β-Hydroxysteroid dehydrogenase INF-α, 741
(11β-HSD), 643, 665, 666 Infantile paralysis, 798
4-Hydroxy-tamoxifen, 703 INF-β, 741
Hygiene, 3 Infections, 756
Hyperforin, 673, 714–717 Infectious disease, 450
Hyperpolarization, 748, 762 Infertility, 697, 707
Hypertension, 722, 749, 823 Inflammation, 113, 527
Hypertensive crises, 681 Inflammatory cascade, 488, 784
Hypervariability loops, 818 Inflammatory diseases, 825
Hypnotics, 763 Inflammatory mediators, 687
Hypoglycemic coma, 12 Inflammatory modulation, 590
Hypoglycemics, 160 Inflammatory processes, 783
Hypotensive effect, 537 Infliximab, 18
Hypothermia, 116 1918 Influenza, 8
Influenza virus, 791, 792 β-Ionone ring, 728


Inhalation anesthetics, 488 Ions, 745
Inhibitor, 62 channels, 62, 65, 243, 472, 480, 482, 484,
Inhibitory concentration, 66 485, 518
Inhibitory glycine, 757 pair, 404
Inhibitory G proteins (Gi and Go), 720 transporters, 485
Inhibitory neuroreceptors, 762 Ipecac, 8
Inkretins, 517 Iproniazid, 13, 124, 125, 677, 678
Inosine, 123 Irbesartan, 735
Inositol-1,4,5-triphosphate (IP3), 720 Iron, 565, 594, 595, 641, 646, 670
Insecticides, 518 Iron–sulfur cluster, 641
In Silico design, 445 Iron-triggered cluster bomb, 48
Insulin, 17, 113, 118, 246, 481, 482, 659, 660, Irreversible enzyme inhibitor, 18
673, 719, 738, 813–816 ISI 113715, 628
mimetic, 739 Isis Pharmaceuticals, 628
receptor, 739 Isoamylcarbamate, 25
release, 590 Isoelectric point, 249
resistance, 623 Isomerases, 299, 474
secretion, 752, 754 Isoniazid, 121, 125, 677, 678
shock, 12 Isonicotinic acid, 121
Insulin-like growth factor receptors Isopeptide bond, 529
(IGFRs), 739 Isoprenaline, 159–161, 725–727
Integral membrane proteins, 746 Isostar database, 365
Integrase, 828 Isosteres, 546
Integrase inhibitor, 554 Isosteric replacement, 155, 156
Integrilin ®, 781 Isothermal titration calorimetry, 141, 405
Integrin, 243 IT1t, 726
Integrin α, 243
Integrines, 200
Integrin receptors, 180, 778 J
Interaction kinetics, 169 Janssen Pharmaceuticals, 11, 635, 832
Intercalating tumor therapeutics, 487 Januvia ®, 517
Intercalation, 311 Jaspisamide A, 838
Intercellular adhesion molecules (ICAMS), 783 Jelly roll, 300
Interference, 269 Jesuit powder, 42
α-Interferon, 815 Jiggling, 832
Interferons, 740 Jiggling, 832
Interleukin-2 (IL-2), 837 Johnson & Johnson, 815
Interleukin-1 receptor antagonist, 741 Jun, 5
Interleukins, 472, 719, 740 Juvenile hormone, 800
Internal energy (ΔU), 67
Internationale Gesellschaft zur Bekämpfung
der Wissenschaftlichen Thierfolter, 424 K
Interstitial water molecule, 450 Kabiramide C, 838
Intestines, 410 Kala-azar, 638
Intracellular adhesion molecules (ICAM), 785 Kans™, 226
Intrinsic effect, 62 KcsA, 751
Introductory screening, 129 channel, 757
Invasin, 450 potassium channel, 749
Invasion process, 777 keto-ACE, 579, 580
Inverse agonists, 62, 479, 719, 720 Ketoconazole, 646, 671–673
Ion-exchange resin, 653 Ketoprofen, 690
Ionic interactions, 69 k1-Glycoprotein, 164
Kidney damage, 659 Leu-Enkephalin, 50, 190


Kidney insufficiency, 654 Leukemia, 121, 815, 826
Kidney perfusion, 688 Leukocytes, 62
Kinase-dependent intracellular signaling infiltration, 488, 784
pathways, 719 rolling adhesion, 783
Kinase-dependent signal cascade, 739 Leukotrienes, 709
Kinase inhibitors, 414 Leupeptin, 528, 529
Kinases, 299 Leuprilid, 191
Kinedak ®, 664 Leuprolid, 191
Kinetic equilibrium constants, 399 Levamisole, 33
Kinetic resolution, 95, 518 Levcromakalim, 752, 753
Kir6.1, 752 Levitra ®, 591
Kir6.2, 752 Levomethadone, 50, 51
Kir channel proteins, 752 Lexiva ®, 548
Ki value, 67 LH. See Luteinizing hormone (LH )
Knock-in and knock-out animal models, 4 LHRH. See Luteinizing hormone releasing
Knock-out method, 245 hormone (LHRH )
Knowledge-based concept, 442 Liberation of water molecules, 71
Library/libraries, 129, 213, 215
Librium ®, 12, 31
L Licorice, 666
L-783281, 739 Lidocaine, 53
Labetalol, 93, 101 Ligand, 62
Lab-on-a-chip, 227 Ligand-based pharmacophore, 349
Lactam antibiotics, 520 Ligand-binding domains (LBDs), 481,
β-Lactam antibiotics, 176 697, 699
β-Lactamase inhibitors, 489 Ligand efficiency, 130, 139, 165
β-Lactamase-resistant antibiotics, 523 Ligand-gated ion channels, 481, 491, 757
β-Lactamase-resistant β-lactams, 523 Ligand–receptor complex, 64
Lactamases, 521 Ligand–receptor interactions, 14
β-Lactamases, 489, 493, 521–523, 565 Ligases, 474
Lactam ring, 489 Linear synthetic route, 226
β-Lactam ring, 518, 520 Linker, 220
β-Lactams, 118 Lipid double-layer, 65, 87
Lactobacillus casei, 651 Lipases, 94, 474, 493, 518
Lactoferrin, 815 Lipid double-layer, 65, 87
Laminin EGF, 244 Lipid homeostasis, 709
Langerhans islet cell tumors, 752 Lipidic arrays, 280
Larkspur, 760 Lipidic cubic phases, 280
Laue technique, 286 Lipid layer, 408
Laughing gas, 24 Lipid membranes, 375, 401
LBDs. See Ligand-binding domains (LBDs) Lipid theory of anesthesia, 373
LD0.00001, 422 Lipitor ®, 658
LD50, 421 Lipobay ®, 658
LDL. See Low-density lipoprotein (LDL) Liponic acid, 474
Lead discovery, 116 Lipopeptide, 771
“Lead-like” molecules, 215 Lipophilic contacts, 86
Lead optimization, 129 Lipophilic electron-donating substituents, 377
Lead structures, 113, 129, 136, 138, 173 Lipophilicity, 408, 409, 433
Lee–Richards surface, 324, 325 Lipophilicity parameter π, 374
Leishmania, 638 Lipscomb, 574
Membrane-bound receptors, 479 Lipstatin, 518, 520
Lennard–Jones potential, 319, 382, 384, 387 Lisinopril, 577, 579, 580
Lithium ureate, 33 Major histocompatibility complex (MHC)
LNA. See Locked nucleic acid (LNA) molecules, 802, 816
Lobelia inflata, 759 Malaria, 8, 25, 27, 42–45, 59, 257, 490,
α-Lobeline, 759, 760, 762 Malaria, 8, 25, 27, 42–45, 59, 257, 490,
Local anesthetics, 3, 488 parasite, 562
Locilex ®, 771 therapy, 562
Lock-and-key principle, 349 Malate dehydrogenase, 643
Locked nucleic acid (LNA), 824 Malathione, 518, 519
log p, 398 Malayan pit viper, 118
log P value, 405 Malondialdehyde, 41
Lonafarnib, 637 Malonyl-CoA, 837
Loperamide, 51 Manganese, 565, 620
Losartan, 17, 675, 732, 733, 735 Mannose, 785
Losec ®, 58 M1 antagonist, 53
N-Lost, 180, 185 Manufacture of monoclonal antibodies, 819
Lovastatin, 17, 19, 119, 176, 178, 655, 657 MAO. See Monoamine oxidase (MAO)
Loviride, 831, 832 MAOA, 646, 677, 678, 680–682, 694
Low-density lipoprotein (LDL), 654 MAOB, 646, 677, 678, 680, 682, 694
Lozaar ®, 734 MAOB inhibitors, 680
LSD. See Lysergic acid diethyl amide (LSD) MAP kinases. See Mitogen-activated protein
L-Selectin, 787 kinases (MAP kinases)
Luciferase, 134 Maraviroc, 18, 488, 553, 790, 791
LUDI, 441, 443, 454, 455, 458 Marimastat, 582, 584
Lumefantrine, 48 Marsilid ®, 677
Luteinizing hormone (LH), 191 Mass-to-charge ratio, 140
Luteinizing hormone releasing hormone Materia Medica, 5
(LHRH), 191 Matrilysin (MMP-7), 581
Lutetium, 822 Matrilysin (MMP-7), 581
LxxLL motif, 703–705, 717 Matriptase, 495, 515
LxxLL recognition motif, 713 Matrix degradation, 581
Lyases, 474 Matrix metalloproteases (MMPs), 565, 566,
Lymphocyte maturation, 301 569, 581, 584, 596
Lysergic acid diethyl amide (LSD), 29, 34, inhibitors, 584
108, 118, 421 Matrix metalloproteinase, 143
Lysozyme, 29, 819 Matrix synthesis, 581
Mauveine, 26
Max Planck Institute for Biochemistry,
M Martinsried, Germany, 474, 504
M13, 213 MCSS, 366
MACCS keys, 394 MD. See Molecular dynamics (MD)
Macrocyclic substances, 837 MDM2, 200, 201
Macrocyclization, 837 MDMA. See
Macrolide antibiotics, 191, 486, 526 3,4-Methylenedioxymethamphetamine
Macrolides, 837 (MDMA)
Macromolecular biocatalysts, 472–473 MDR. See Multidrug resistance (MDR)
Macrophage metalloelastases (MMP-12, -19), MDR1/ABCB1, 768
581 MDR-ABC transporter, 768
Macrophages, 488, 654, 802 Measure of similarity, 361
Magainin, 771 Medical diagnostics, 820
The Magic Mountain (Zauberberg), 8 Mefloquine, 17, 45
Magnesium, 591, 603, 620, 829, 830, 835 Melagatran, 508, 509
Main chain, 194 Melamine-contaminated baby formula, 423
Major groove, 310, 699, 700 Melan-A, 806
Melan-A/MART-1 antigens, 806 Methotrexate (MTX), 121, 357–360, 649
Melinda and Bill Gates Foundation, 8 Methylases, 259
Melting temperature, 139 N-Methylated amide bonds, 195
Membrane-bound proteins, 279 Methylene blue, 27
Membrane-bound receptors, 220 3,4-Methylenedioxymethamphetamine
Membrane-residing enzymes, 65 (MDMA), 158, 159
Membranes, 65, 406 Methylenetetrahydrofolate, 646
barriers, 176 Methylenetetrahydrofolic acid, 149
permeability for ions, 746 Methyllycaconitine, 760–762
potential, 748 Methyl transferases, 258, 628
transporters, 767 Metiamide, 55, 56
Meningitis, 798 “Me-too” research, 157
Menopause, 706 Metoprolol, 161, 673
Menstrual cycle, 706 Metyrapone, 646, 672
Mepivacaine, 53 Mevaldehyde, 654
Meproscillarin, 2 Mevalonic acid, 176, 178, 653
β-Mercaptoethanol, 565 Mevastatin, 655, 657
Mercaptopurine, 121, 122, 826 MFCH, 409
Merck, 653, 655, 691, 783 MHC class-I molecules, 804, 805
Merck & Co, 577, 653, 739, 815 MHC-I, 244, 802
Merck, Sharp and Dohme (MSD), 203, complex, 806
204, 589 molecule, 23
Meropenem, 523 MHC molecules. See Major histocompatibility
MEROPS, 473 complex (MHC) molecules
Merrel, 706 M2 helices, 758
Merrifield peptide synthesis, 217 Miasma, 42
Merrifield solid-phase synthesis, 216 Michael-acceptor group, 530
Merrifield synthesis, 191 Microarray technology, 252
Mesencephalons, 413 Micro RNA (mRNA), 246–248, 252, 253,
Messenger molecules, 719 450, 486
messenger RNA (mRNA), 823, 825, Microtiter plates, 139
839, 840 Microtubule disruptors, 609
Metabolic syndrome, 623, 668, 825 Microtubuli, 487
Metabolism, 397, 745 Microwave spectrum, 291
Metabolites, 411 MIDAS. See Metal-ion-dependent adhesion
Metabolome, 251, 252 site (MIDAS)
Metabolomics, 248, 251 Mifepristone, 708
Metacholine, 100 Migraine, 721, 722
Metal-ion-dependent adhesion site (MIDAS), Millennium Pharmaceuticals company, 524
778, 779 Mineralocorticoids, 709
binding site, 781 receptors, 706
Metallopeptidases, 493 steroids, 413
Metalloprotease (“Zincins”), 243 Minor groove, 310
Metalloprotease–inhibitor complexes, 569 Mirror image, 89
Metalloproteases, 69, 537, 565 Mirror-image world, 107
Metastasis, 581, 584, 784 Mithridates VI, 4
Met-enkephalin, 190, 222, 224 Mithridatum, 4
Methamphetamine, 102 Mitogen-activated protein kinases
Methanol, 403 (MAP kinases), 604
Methaqualone, 91 Mitosis, 617
Methazolamide, 588, 589 Mitotic cyclins, 617
Methemoglobin, 24 MK499, 755, 757
Methionine aminopeptidase, 599 MK 927, 589
MM-GBSA, 321 Multidimensional NMR, 282, 284
MMP-1, 584, 586 Multidrug resistance (MDR), 490
MMP-1, -8, -13, 9, 581 Multidrug resistance-associated protein
MMP-2, -9, 581 (MRP), 490
MMP-3, -10, -11, 9, 581 Multienzyme complexes, 191, 478, 837
MMP-7, 581 Multipin synthesis, 216
MMP-12, 566, 568, 569 Multiple drug resistance (MDR), 767
MMP-12, -19, 581 Multiple myeloma, 525
MMP-20, 581 Multiprotease complex, 524
MM-PBSA, 321 Mummy dust, 5
MMPs. See Matrix metalloproteases Muscarine receptors, 415
(MMPs) Muscarinic acetylcholine receptor, 481
m-nitrobenzoic acid, 406 Muscle atrophy, 527
m-nitrophenol, 406 Muscle contraction, 720
MO. See Molecular orbital (MO) Muscle metabolism, 591
Moclobemide, 681 Muscle relaxants, 763
Modeller, 438 Muscular dystrophy, 561
Modular synthesis, 837 Mus musculus, 239
Molecular database, 137 Mustard gas, 180, 181
Molecular dynamics (MD), 338 Mutual compensation of enthalpy and
calculations, 326 entropy, 84
simulations, 83, 325, 662 Mycoplasma genitalium, 239
Molecular mechanics, 318 Myocardial contractility, 590
Molecular modeling, 316 Myocardial infarct, 527
Molecular orbital (MO), 322 Myocarditis, 798
Molecular replacement method, 274 Myoglobin, 646
Molecular weight, 174 Myopathy, 659
Monacolin K, 655
Monoamine oxidase (MAO), 13, 413, 484
enzymes, 679 N
inhibitors, 102, 103, 124, 183, 479, N-acetylglucosamine, 785
677, 678 N-acetylneuraminic acid, 794
substrates, 683 nAChR. See Nicotinic acetylcholine receptor
Monoamineoxidases (MAOA), 646 (nAChR)
Monooxygenases, 670 Nachwein, 26
Monte Carlo methods, 338 NAD+, 474
Montelukast, 18 NAD(P)+, 132
Mood, 590 NADH, 121
Morpheum, 49 NADH/NADPH, 642, 643
Morphine, 17, 48–50, 59, 113, Nα-(β-naphthylsulfonylglycyl)-D,L-p-
114, 192 amidinophenylalanylpiperidide
Mother’s Little Helper, 13 (NAPAP), 504, 505, 507
Moxifloxacin, 834, 835 NAD+/NADP+, 642, 643
M2 proton channel protein, 792 NADP+, 474, 660
MRC, 723, 838 NADPH/NAD(P)H, 132, 644, 649, 660
mRNA. See Micro RNA (mRNA) dependent enzymes, 642
MRP. See Multidrug resistance-associated reductase, 670
protein (MRP) Nafoxidine, 707
MSD. See Merck, Sharp and Dohme (MSD) Naftifine, 32
M2 transmembrane helices, 758 Nagana Red, 120
MTX. See Methotrexate (MTX) Na+/K+ ATPases, 485, 747
Mugwort, 47 Nalidixic acid, 834, 835
Multicompartment systems, 401 Naloxone, 49
NAPAP. See Nα-(β-naphthylsulfonylglycyl)- Nicotine, 116, 759, 760
D,L-p-amidinophenylalanylpiperidide Nicotinic acetylcholine receptor (nAChR),
(NAPAP) 116, 481, 483, 518, 757–759, 761, 762
Naphthalene, 23 Nicotinic acid, 121
Napoleon, 37 Nifedipine, 17, 31, 56, 362, 363, 483, 749
Napsagatran, 508, 509 NIH. See National Institutes of Health (NIH)
Naringenin, 646, 671 Nil nocere, 175
National Institutes of Health (NIH), 239 Nilotinib, 613, 614, 638
Native American Indians, 759 Nitecapone, 631
Natural products, 113 Nitrogen monoxide (NO), 592
Naturwein, 26 biosynthesis, 488
Nausea and vomiting, 721 p-Nitrophenol, 23
NBDs. See Nucleotide-binding domains (NBDs) 5-Nitrosalicylic acid, 406
Nebicapone, 631 Nitrous oxide (N2O), 24
Nebularine, 123 Nizatidine, 56
Needle in a haystack, 138 NK1, 202, 417
Neisseria gonorrhoeae, 650, 651 NK2 receptor, 202, 203, 417
Nelfinavir, 166, 546 NK3 receptor, 202, 417
Neomycin, 245 NK1 receptor antagonists, 203
NEP 24.11, 101, 361, 566 NK2 receptor antagonists, 203
Nernst equation, 746 NMR. See Nuclear magnetic resonance (NMR)
Nero, 4 NO. See Nitrogen monoxide (NO)
Nerve growth factor (NGF), 482 Nobel Prize, 8, 39, 316, 772, 819
Nervous coughing, 756 Nobel Prize in medicine, 736
Neuraminidase, 488, 792, 796 NOE. See Nuclear Overhauser effect (NOE)
Neuraminidase inhibitors, 488 Nonactin, 770, 772
Neurodegenerative diseases, 524 Non-bonding interactions, 364, 369
Neurokinin, 719 Non-classical antifolate inhibitors, 650
Neurokinin A, 418 Noncoding segments, 254
Neurokinin B, 418 Non-competitive inhibition, 491
Neuroleptanalgesia, 3 Non-Hodgkin lymphomas, 649
Neuroleptics, 12, 162, 414, 416 Non-linear lipophilicity-activity relationships,
Neuropathy, 765 375
Neuropeptide Y, 190 Nonribosomal peptide synthesis, 191
Neuroprotective agents, 528 Nonribosomal peptide synthesis machinery,
Neurotoxicity, 403 837
Neurotoxins, 485 Non-steroidal anti-inflammatory drugs
Neurotransmission, 759 (NSAIDs), 39, 690
Neurotransmitters, 9, 113, 121, 413, 479, 484, Noradrenaline, 9, 12, 13, 159
677, 729 Noradrenaline transporter, 485
Neutral endopeptidase 24.11, 101 Norepinephrine, 9
Neutral endopeptidases, 565 Novartis, 19, 542, 611, 734
Neutral zinc endopeptidases, 581 Novartis ophthalmics, 825
Nevirapine, 17, 553, 831 Novo Nordisk, 456, 625, 627, 815
Newtonian equations, 326 Nuclear hormone receptors, 472
Nexium ®, 58 Nuclear magnetic resonance (NMR)
NF-κB, 526, 698 spectroscopic, 358
NGF. See Nerve growth factor (NGF) spectroscopic techniques, 14
Nicotiana tabacum, 480 spectroscopy, 136, 142, 143, 200, 215, 265,
Nicotinamide adenine dinucleotide, 643 279, 280, 405, 432, 780
Nicotinamide adenine dinucleotide phosphate spectrum, 282
(NAD(P)+), 641 Nuclear Overhauser effect (NOE), 142,
Nicotinamide moiety, 642 283, 285
Nuclear receptors, 482, 697 Osteoporosis, 706, 708
Nucleic acid binding proteins, 242 Otto II, 42
Nucleic acids, 61 Ovaries, 706
Nucleophile, 96 Ovulation, 708
Nucleoside analogues, 486 Oxalic acid, 405
Nucleotide-binding moiety, 643 N-Oxalylanthranilic acid, 624
Nucleotide-binding domains (NBDs), 768 Oxicams, 690
Nucleotide-phosphodiesterase cycl., 243 Oxidases, 299, 474
Oxidation, 180
Oxidation reactions, 642
O Oxidative stress, 660
Obesity, 518, 622, 623, 653, 697, 774 Oxidoreductases, 474
Obesity therapy, 590 Oxyanion hole, 496, 524, 526
n-Octanol, 374 Oxygenases, 474
Octanol/water partition coefficient P, 374, Oxyphenbutazone, 32
375, 401 Oxytocin, 191
Octanol/water system, 400
Olfactory cells, 736
Oligoglycine strand, 824 P
Oligoglycine peptide strand, 824 p53, 200, 201
Oligomeric membrane-bound receptors, 719 p66, 829
Oligonucleotides, 246, 823 Paclitaxel, 115, 178, 714, 715, 717
Oligopeptide-binding protein A, 64, 361 Pain, 683
Oligopeptide transporter, 176 Pain therapy, 691
Omeprazole, 17, 19, 57, 68, 185, 186, 478, Palindromic DNA sequences, 835
673, 769 Pallas/pKa, 411
Oncogenes, 260, 487 Palmitate, 177
Ondansetron, 17 PAMPA. See Parallel artificial membrane-
Ondetti, 574, 575 permeability assay (PAMPA)
One-bead-one-compound technique, 220 Pancreas lipase, 518
Onglyza ®, 517 Pancreatic b-cells, 752
Onl2 gene, 524 Pancreatic β-cells, 752
ONO-4056, 513 Pancreatitis, 486, 823
ONO-5046, 512 Pandemics, 793
ONO-6818, 512, 513 Pantoprazole, 769
ONO Pharmaceuticals Co, 512, 664 Pan troglodytes, 239
Oocyte maturation, 708 Papain, 526–528, 818
Open-flap conformation, 544 Papain-type proteases, 528
Open-sheet structure, 299 Papaverin, 113, 115
Opiate receptor, 116, 222, 224 Papyrus Ebers, 4
Opium, 3, 48 Paracelsus, 5, 420
Opium Wars, 48 Paracetamol, 673, 674, 690
Optically active, 89, 91, 92, 94, 99, 108 Paracoccus denitrificans, 436, 437
Optimally diverse, 221 Parallel artificial membrane-permeability assay
OR7D4 receptor, 737 (PAMPA), 400, 410
Organ transplantation, 803 Paraoxone, 519
Orlistat, 18, 518, 520 Parasites, 62, 489, 527
Oropharyngeal infections, 770 Parasitic infections, 485, 774
Oseltamivir, 488, 795–797 Parathione (E605), 518, 519
Osmotic gradient, 771 Pargyline, 677, 678, 681
Osmotic stress, 660 Parke-Davis, 551, 553
Osteoarthritis, 691 Parkinson’s disease, 12, 103, 158, 181, 183,
Osteopetrosis, 765 413, 414, 518, 629, 677, 681, 729
Paroxone, 518 Peroxisomal proliferator-activated receptor
Partial agonist, 62 (PPAR/PPARα), 699, 709, 711
Partition function, 326, 442 agonist, 33, 654
Patch–clamp technique, 134 receptor, 710, 712
Patents, 246 Personalized medicines, 256
Paternity tests, 236 Pethidine, 32, 50, 51
Paul Scherrer Institute, 726 Pfizer, 19, 410, 591, 658, 790
PCR. See Polymerase chain reaction PGE2, 687, 688
(PCR) PGG2, 686, 694
PDB. See Protein database (PDB) PGH2, 39, 684, 686, 687, 694
PDE 4, 591 PGI2, 687, 688
PDE 5, 591, 593 P-glycoprotein GP170, 768
PDE 6, 593 pH-absorption profile, 406
PDE 11, 593 Phagocytotic cells, 488
PDE 5 inhibitors, 591, 593 Phagosome, 562
PDFs. See Peptide deformylases (PDFs) Pharmaceutical catastrophes, 422
PDGF receptor kinase, 611 Pharmacia, 815
PEG-Paclitaxel, 178 Pharmacokinetic properties, 164
Penem, 523 Pharmacokinetics, 397
Penicillamine, 124, 125 Pharmacology, 416
Penicillic acid, 521, 522 Pharmacophore pattern, 368, 550
Penicillins, 17, 28, 118, 119, 157, 478, Pharmacophores, 163, 164, 349, 350, 353, 356,
486, 520 357, 360, 361, 364, 367–369
Penicillium chrysogenum, 28 Phase-I reactions, 670
Penicillium notatum, 28 Phase problem of crystal structural
Penile erectile function, 591 determination, 274
Pentostatine, 123, 826 pH conditions, 320
Pen-Ts’ao school, 5 pH distribution, 405
P450 enzymes, 671 Phenacetin, 17, 23, 24, 423
Pepsin, 543, 546, 561 Phenelzine, 677, 678
Pepsin–pepstatin complex, 537 Phenethylamine, 378
Pepstatin, 537, 546 Phenobarbital, 673, 714
Peptidases, 413, 474, 493 Phenolphthalein, 26
Peptide deformylases (PDFs), 594 Phenothiazines, 350–354
Peptide-nucleic acid (PNA), 824 Phenotypes, 254
Peptides, 189 Phenoxymethylpenicillin, 521
antibiotics, 99 Phenylaminopyrimidine, 611
bond, 456 Phenylbutazone, 32
conformation, 196 Phenylketonuria, 245, 257
Peptidoglycan, 519 Phenyl sulfonamide, 587
Peptidoglycan strand, 520 Phenytoin, 178, 179
Peptidomimetics, 50, 191, 192, 194, 202, 213, φ, 294
487, 510, 545, 806, 807 PHI2, 688
Peptidyl transferase site, 840 Philadelphia chromosome, 609
Phosphatase assay, 133
Peptoids, 222 Phosphatases, 178, 180, 474, 599, 601,
Peramivir, 795–797 620, 621
Perforines, 803 Phosphate, 287
Periodic array, 266 Phosphinates, 569
Periodic translation, 267 Phosphocholine, 818, 819
Peripheral vascular disease, 659 Phosphodiesterase inhibitor, 33
Periplasmatic space, 769 Phosphodiesterases (PDEs), 480, 565, 590, 621
Peroxide, 677 Phospholipase C-β, 720
Phospholipids, 697 Polyacrylic acid gel, 249
Phosphonamides, 569 Polyene antibiotics, 486
Phosphonates, 569 Polyethylene glycol (PEG), 178
Phosphonic acids, 569 Polyketide synthetic pathway, 837
Phosphoramidone, 569, 570 Polymerase, 829
Phosphorylation, 599 Polymerase chain reaction (PCR), 235, 236,
Phosphoserine, 622 253
Phosphothreonine, 622 techniques, 246
Phosphotransferases, 474 Polymorphisms, 255, 726, 806
Phosphotyrosine, 622, 625, 626 Polyol pathway, 660
Phosphotyrosine mimics, 624 Polyvalent vaccines, 491
pH-partition profile, 404, 408 Ponalrestat, 664
pH-partition theory, 406 Piperaquine, 48
pH shift, 408 Poppy, 4
Phytopharmacon, 94 Porcine insulin, 814
Picornavirus 3C-proteinase, 527 Pores, 62, 65
Picornaviruses, 528, 797, 798 Portal vein, 173
Picrotoxinin, 350 Postsynaptic neuron, 479
Pigs, 793 Posttranslational modification, 241, 471, 528,
Pilocarpine, 114 599, 633
PIM-1 kinase, 620 Potassium channels, 483, 747, 752
Pinacidil, 752, 753 Power brake boosters, 763
Pindolol, 419, 420, 725 PPARβ/δ, 709
Pinworm, 134, 241 PPARγ, 709, 711, 713
Pin worm infections, 761 PPAR/PPARα. See Peroxisomal proliferator-
Pioglitazone, 711, 717 activated receptor (PPAR/PPARα)
Pirenzepine, 53 P1 position, 498
Pitavastatin, 656 Practolol, 161
pKa values, 70, 405, 406 Pradaxa ®, 509
PKC inhibitor, 611 Pravastatin, 166, 656
Placenta, 9, 687 Praziquantel, 33
Plasmapepsins, 414 Precession, 281
Plasma proteins, 164 Pregnane-X receptor (PXR), 714–717
Plasmepsins, 534, 536, 562 Prenyl groups, 249, 600
Plasmids, 235, 260 PreQ1, 451
Plasmin, 303, 304, 414 Pressure–volume work, 67
Plasmodium, 42, 638 Presynaptic nerve cell, 767
Plasmoquine, 43 Presynaptic neuron, 479
Platelet aggregation, 590 Prilosec ®, 58
Platelets, 39, 40, 58, 778 Princeton, 615
β-Pleated-sheet, 293, 295, 296, 510 Procain, 17
Pleconaril, 798–801 Prodrug, 173, 528, 707
PLS analysis, 385 Progabide, 183, 184
PNA. See Peptide-nucleic acid (PNA) Progesterone, 698, 700, 708, 709
Pneumocystis carinii, 8 Progesterone receptor, 702, 706, 708
Pneumocystis jirovecii, 8, 652 Proguanil, 48, 179, 180
PNP. See Purine nucleoside phosphorylase Prokaryotes, 240
(PNP) Proline, 195
Podophyllotoxin, 114 Prolyl cis–trans isomerase, 487
Point mutations, 237 Promethazine, 162, 163
Polarizability, 375 Pronethalol, 161
Poliomyelitis, 528 Propafenone, 673
Poliovirus, 798 Propanolol, 100
Propoxur, 519 Ptp-1b gene, 624, 626
Propoxyphene, 101, 102 Pull-down experiment, 251
Propranolol, 17, 419, 420, 725 Pulmonary emphysema, 486
Pro-prodrug, 185 Pulmonary hypertension, 721
Pro-protein, 546 Purdue University, 799
Proscillaridin, 2 Purine, 826
Prostacyclin, 39, 688, 691 Purine nucleoside phosphorylase (PNP), 286,
Prostacyclin PGI2, 687 287
Prostaglandin G/H synthase, 39 PXR. See Pregnane-X receptor (PXR)
Prostaglandins, 39, 350, 684, 687, 697, 698, 709 Pyrantel, 761
Prostate cancer, 709 Pyridoxal phosphate, 474
Protease inhibitors, 118, 550, 554 Pyrimethamine, 45
Proteases, 94, 299, 565, 622, 732, 803 Pyrimidine base, 826
Proteasome, 524, 803 Pyrogallol, 631
Protein-based pharmacophore, 350 Pyroglutamic acid, 573
Protein biosynthesis inhibitors, 486 Pyronaridine, 48
Protein catabolism, 561 Pyruvic acid, 95
Protein conformer, 662
Protein database (PDB), 433
Protein–DNA complex, 834 Q
Protein–DNA recognition, 835 QSAR. See Quantitative structure–activity
Protein kinase A, 210, 480 relationships (QSAR)
Protein kinases, 243, 599 QT interval, 756, 757
Protein–ligand complexes, 68, 69, 321, 364, Quadratic-antiprismatic water shell, 750
365, 369, 379, 388, 433, 434 Quantitative activity–activity relationships,
Protein–ligand interactions, 68, 71, 72, 86 416
Protein–ligand–water system, 74 Quantitative analysis, 384
Proteinogenic amino acids, 192, 212 Quantitative structure–activity relationships
Protein–protein surface contacts, 147 (QSAR), 371, 373, 374, 376, 377, 384,
Protein-shredding machine, 524 385, 388, 389, 394, 395
Protein synthesis, 213 equation, 442
Proteome, 249, 250, 252 models, 380, 411, 412
Proteomic analysis, 4 Quantum mechanical calculations, 71
Proteomics, 248, 251 Quantum mechanics, 281
Proton pump, 57, 58, 485 Quaternary structure, 297
Proton-pump inhibitors, 3, 53 Queuine, 450, 451
Prontosil ®, 27 Quinidine, 673
Protoporphyrin system, 33 Quinine, 25, 43, 114
Protopterus aethiopicus, 240 Quinquina, 42
Proxopur, 518 Quinupristin, 845, 846
P-selectins, 487, 784
(1S,2S)-(+)-Pseudoephedrine, 95
PSGL-1, 785 R
PSGL-1 protein, 785, 786 R115777, 635
ψ, 294 Racemates, 90, 93, 96, 102–104, 106
P-site, 840, 841 Rain worm oil, 5
p51 subunit, 828 Raloxifen, 703, 704, 707
p66 subunit, 829 Raltegravir, 18
Psychiatric diseases, 3 Ramachandran plot, 293
Psychotria ipecacuanha, 8 Ranirestat, 664
PTP-1B, 623–627, 629, 639 Ranitidine, 17
inhibitors, 624, 626 Rapamycin, 118
tyrosine phosphatase, 624 Ras genes, 244, 487
RAS proteins, 243, 487, 635 Retro–inverso exchange, 195
Raltegravir, 553 Retro–inverso peptide, 99
Ranitidine, 19, 56 Retro-thiorphan, 101, 361
α-Receptor, 721 Retroviruses, 259, 260, 489, 828
β-Receptor, 721, 726 Reverse transcriptase (RT), 246, 252, 552,
Receptor-binding studies, 132 827, 828
Receptor-bound conformation, 571 Reverse-transcriptase inhibitors, 489, 554, 832
Receptor combinations, 737 Revotar Biopharmaceuticals, 786
Receptors, 9, 62, 484 RFC. See Reduced folate carrier (RFC)
Recognition properties, 357 RGD. See Arg-Gly-Asp (RGD)
Recombinantly produced antibodies, 820 Rhabdomyolysis, 659
Recombinant proteins, 237 Rhenium, 822
Recombination, 234 Rheumatism, 691
Redox reactions, 641 Rheumatoid and chronic polyarthritis, 3
Reduced folate carrier (RFC), 649 Rheumatoid arthritis, 37, 245, 528, 581,
transporter, 650 741, 784
Reductases, 180, 642 Rhinoviruses, 797
Reductions, 641 Rhizopodin, 838
Reflections, 274 Rhizopus arrhizus, 29
Refractive index, 138, 139 Rhizopus nigricans, 29
Regression analysis, 375 Rhodobacter capsulatus, 766, 770
Regression-based scoring functions, 442 Rhodopsin, 727
Relenza ®, 795 Rhodospirillum rubrum, 436, 437
Relibase, 434, 435 Rho proteins, 720
Remikiren, 418, 419, 540, 543 Ribose-1-phosphate, 287
Renin–angiotensin–aldosterone-system, 732 Ribosomal function, 838
Renin–angiotensin system, 538 Ribosomal protein synthesis, 823
Renin inhibitors, 540, 542 p90 Ribosomal S6 kinase (RSK), 607
Renins, 418, 419, 534, 536–538, 542, Ribosomal tunnel, 841, 842, 845, 846
545, 546, 732 Ribosomes, 599, 823, 844
Repolarization, 748 Ribozyme, 838
Repolarizes, 748 Rifampicin, 673, 714–716
Reporter, 143 Rimantadine, 793
Reporter gene, 134 Ringe, 432
Reserpine, 12, 114 RISC. See RNA-induced silencing complex
Residence time, 169 (RISC)
Resistance-breaking mechanism, 830 Risperidone, 17
Resistance mutations, 832 Ritonavir, 17, 547, 673
Resistance to reverse transcriptase Rivaroxaban, 514–516
inhibitors, 553 (S)-Rivastigmine, 518, 519
Respiratory depression, 176 RNA, 61, 77
Resting membrane potential, 747 interference, 135, 247, 823
Restriction enzymes, 234, 235 silencing, 823
Retinal, 728 viruses, 553, 797
Retinal degeneration, 767 RNA–DNA hybrid double strand, 831
Retinoblastoma protein, 260 RNA–DNA hybrid strand, 828, 831
Retinoic acid, 697 RNA-induced silencing complex (RISC), 247
Retinoic acid receptor, 105 RNAse H, 823
Retinoid-X receptor (RXR), 699 Ro 31-4624, 583
Retinol, 63, 64, 80 Ro 31-4724, 583
Retinol-binding protein, 63, 64, 80 Ro 31-8959, 548
Retinopathy, 623 Ro 31-9790, 583
Retro–inverso configuration, 99 Ro 45-5892, 540
Robot system, 133 SARS-main proteinase, 527
Roche, 19, 540, 543, 545, 783, 815 Saturation point, 266
Roentgen rays, 265 Saturation transfer difference (STD), 142
Rofecoxib, 689, 691, 692 Saw-scaled viper, 779
Rolling Stones, 12 Saxagliptin, 517
Rosiglitazone, 711, 713, 717 SC-52012, 781
Rossmann fold, 643, 645, 661 Scaffold mimic, 197
Rossmann folding pattern, 666 Schanker, 406
Rosuvastatin, 166, 656 Schering-Plough, 637, 801, 815
Rotational isomers, 91 Schizophrenia, 12, 51, 414, 722
Round worm infection, 761 Schrödinger equation, 321
Roxithromycin, 844, 845 Scilla alba, 2
RU486, 708 Scillaren, 2
Rubredoxin, 106 Scoring function, 429
Rule of five, 215, 411 Screening library, 429
Rule of three, 215 Src family kinases, 614
Rutgers University, 832 Scripps, 441
Ruthenium, 619 Scripps Research Institute, 723
R-Verapamil, 768 SDS-PAGE, 250
RXP 407, 579, 580 Searle, 33, 781
RXP A380, 579, 580 Sea snail, 760
RXR retinoic acid receptor, 699 Secale cornutum, 118
Second messenger, 480, 601, 720
β-Secretases, 534, 562
S Sedatives, 12, 763
S 8307, 732 Selectins, 487, 784
S 8308, 732 receptors, 785
Saccharin, 588, 590 surface domain, 786
Saccharine, 33 Selection pressure, 489
Saccharomyces cerevisiae, 95, 239, Selection process, 116
240, 615 Selective estrogen receptor modulator
s-ACE, 577 (SERM), 708
S-adenosyl-L-methionine (SAM), 628–631, Selectivity differences, 389
633 Selectivity filter, 765
Salbutamol, 160 Selegilin, 102, 103, 183, 678
Salicin, 37, 177 Selenomethionine, 274
Salicylic acid, 17, 37–39, 177, 405, 406 Self-consistent field, 321
Saligenin, 37 Sense of smell, 590, 735
Salting out, 266 Septic shock, 486, 823
Salvarsan ®, 27 Serendipity, 23, 33
SAM. See S-adenosyl-L-methionine (SAM) Serine aminopeptidase, 517
Sandoz, 29, 31, 118, 610 Serine peptidases, 523
Sankyo, 19, 655 Serine protease inhibitors, 498, 500
Sanofi-Aventis, 19 Serine proteases, 309, 310, 493–495, 501,
SAP. See Secretory aspartic proteases (SAP) 510, 537
Saquinavir, 17, 166, 546–548 Serine transhydroxymethylase, 646
Saralasin, 732 SERM. See Selective estrogen receptor
SAR-by-NMR, 143, 144, 146 modulator (SERM)
method, 626, 627 Steroids, 350
technique, 625 Serotonin, 12, 13, 677, 678, 694
Sarcomas, 649 receptors, 415, 419, 721
Sarcosine, 475, 732 transporters, 485
SARS, 528 Serotonin (5-hydroxytryptamine), 721
Serotypes, 797 Society Against Vivisection, 423
Sertindole, 755, 756 Sodium channels, 117
Sertraline, 683 Sodium/potassium, 769
Seveso accident, 421 Solanaceae alkaloids, 3
Seveso dioxine, 421 Solid-phase synthesis, 216
SH2, 244 Solid support, 226
Shigella bacteria, 449 Solubility, 409, 433
Shigella dysentery, 8, 449 Soluble receptors, 481
Short-chain dehydrogenases/reductases, 666 Solvation, 381
Shotgun method, 238 Solvation shell, 72
Sialic acid, 488, 785, 794 Solvent-accessible surface, 323, 324
Sialosylcation, 795, 796 Solvent-exposed glycines, 438
Sialyl-Lewisx, 785–787 Sorangium cellulosum, 241
Sibrafiban, 180, 782, 783 Sorbinil, 77, 78, 664, 665
Sibutramine, 18 Sorbitol, 660
Sickle cell anemia, 42, 256, 257, 430 D-Sorbitol, 660
Si face, 528 Sorbitol dehydrogenase, 660
Signal processing, 778 Sortis ®, 658
Signal transduction, 243 Sostril ®, 56
Signal transmission, 484 South American pit viper, 573
Sildenafil, 18, 19, 33, 591, 592, 597 Space-filling models, 324
Silencing RNA molecules (siRNAs), 4, 248, 740 Space groups, 269
Silent Spring, 44 Spanish flu, 8, 793
S-(2-imidazoyl-4-yl-ethyl)isothiourea, 55 Specialization, 697
Similarity considerations, 361 Specific ion channels, 746
Similarity fields, 388 Specificity of the effect, 413
Simvastatin, 9, 10, 33, 657 Spectrin, 244
Single nucleotide polymorphisms (SNPs), 254, Spermatogenesis, 709
255, 737 Sphinxolide B, 838
siRNAs. See Silencing RNA molecules Spin, 280
(siRNAs) Spirodihydronaphthalenes, 31
Sitagliptin, 18, 517 Spironolactone, 709, 710
Site-directed mutagenesis, 237 Split-and-combine technique, 218, 219, 223
Sivelestat, 512, 513 S1 pocket, 498
SKF 38393, 730 Sprycel ®, 614
Sleeping sickness, 527 Spy ligands, 143
sLeX, 785 SQ 14225, 575
sLeX binding epitope, 785 Squibb, 574
Slow metabolizers, 675 Squill, 2, 4
Smear infections, 797 SR12813, 714–716
Smell, 735 30S subunit, 841
SmithKline Beecham, 781, 815 Staggered, 335
Smith, Kline & French, 7, 54 Standard deviation, s, 377
Smoking, 373, 653 Stanford University, 723
SMON. See Subacute myelo-optic-neuropathy Statins, 166, 255, 536, 539, 546, 653, 657, 717
(SMON) Staurosporin, 610, 619, 620
Smooth-muscle contraction, 14 Staurosporine, 606
Sniffing party, 24 Stereogenic center, 91
snoRNAs, 242 Steric and electrostatic interactions, 380
SNPs. See Single nucleotide polymorphisms Steric parameters, 375
(SNPs) Steric properties, 391
snRNA, 241 Sterling–Winthrop, 38, 799, 800
Soaking, 146, 286 Steroid hormones, 10
Steroid receptors, 699 Suramin, 120
St. John’s wort, 673, 714, 717 Surface-active substances, 266, 488
β-Strands, 296, 297, 299, 300 Surface plasmon resonance (SPR), 139, 169
Streptavidin, 252 Surface plasmon resonance techniques, 138
Streptococcus pneumoniae, 834 Surface receptors, 65, 777
Streptogramine A, 845 S-Warfarin, 675
Streptogramine B, 845 Symmetry operations, 268
Streptokinase, 120 Symobol only, 372
Streptomyces lividans, 749, 752 Synapses, 479
Streptomyces sp., 537 Synaptic gap, 413, 479, 484, 677, 767
Streptomycin, 8, 17, 118, 119, 486 Synchrotron, 274
Strokes, 653, 654 Synercid ®, 845
Stromelysin, 143, 144, 566, 569, 581 Synergistic binding, 844
Stromelysins (MMP-3, -10, -11), 581 Synthases, 299, 474
Structurally conserved water molecule, 551 Synthetases, 474
Structure–activity analysis, 377 Syphilis, 124
Structure–activity relationship analysis, 15
Structure–activity relationships, 221, 371, 372,
378, 380, 412, 422, 432, 433, 460, 757 T
Structure-based design, 412, 445 TACE. See TNF-α-converting enzyme
Structure-based drug design, 429 (TACE)
Saturation transfer difference spectrum t-ACE, 577
(STD), 142 Tachycardia, 756
Subacute myelo-optic-neuropathy (SMON), Tachykinin receptors, 202
423 Tachykinins, 202, 417
Substance P, 190, 418, 732 Tadalafil, 591–593
Substance transport, 399 Tafenoquine, 45
Substantia nigra, 181, 413, 629 Tagamet ®, 55
Substitution therapy, 113, 814 Takeda, 711, 713, 732, 733
Substrate, 62 Tamiflu ®, 796
Substrate specificities, 498 Tamoxifen, 17, 707–709
Subtilases, 515 Tanomastat, 584, 585
Subtilisin, 243, 309, 310, 494, 496, 524 TAP transporter, 804
Succinic acid, 178 Target protein, 237
Sulfamates, 587 Target structures, 471
Sugar–phosphate scaffold, 823 Tartaric acid, 90
Sulfadoxine, 45 Tasigna ®, 614
Sulfamides, 587 Taxol, 487
Sulfamidochrysoidine, 17, 27, 178 TCCD, 421
Sulfanilamide, 156, 486 TCCS, 421
Sulfonamides, 61, 121, 160, 350, 486, 587 T-cell protein tyrosine phosphatase (TCPTP),
Sulfonamidochrysoidine, 486 625–627, 629, 639
Sulfonylureas, 752, 753 T-cells, 260, 726, 804, 837
Sulindac, 179, 180, 690 protein tyrosine kinase, 626
Sulpiride, 414, 415 receptor, 805–808, 816
(S)-Sulpiride, 415 TCPTP. See T-cell protein tyrosine
Sumatriptan, 17 phosphatase (TCPTP)
Sumitomo, 815 tcptp gene, 626
SUMO, 600 T-effector cells, 803
Sunesis, 146, 628 Tegafur, 826
Superpositioning technique, 359 Telaprevir, 524
Telmisartan, 735
TEM-1 β-lactamase, 521, 522
Tenofovir, 830 Thrombolysis, 815
Teprotide, 573 Thromboxane, 39, 40, 58
Teratogenic effects, 397 receptor, 688
Terbinafine, 32 TXA2, 687, 688
Terbutaline, 177, 179 Thymidine, 147, 830
Terfenadine, 161, 162, 673, 755, 756 kinase, 185, 828
Terpenes, 114 phosphorylase, 181, 182
Testes, 706 Thymidine-5′-triphosphate (TTP), 474, 830
Testosterone, 698, 700–702, 709 Thymidylate synthase, 146, 150, 181,
Test systems, 129 826, 827
TetA, 835 Thymidylate synthetase, 646
Tetrachlorodibenzodioxine, 421, 422 Thymine (TMP), 646
Tetracyclines, 118, 119, 486, 835, Thymine biosynthesis, 181
836, 848 Thyreotropine-releasing hormone (TRH), 197
Tetrahedral configuration, 496 receptor, 198
Tetrahydrofolate, 647 receptor ligand, 198
Tetrahydrofolic acid, 474 Thyroid hormone T3, 113
Tetrahydrostaurosporin, 612 Thyroxine, 156
Tetraodon nigroviridis, 240 TIBO, 831
Tet repressor, 835, 836 TIGR. See The Institute for Genomic Research
Tetrodotoxin, 117, 483 (TIGR)
TGF-β2. See Transforming growth factor β2 TIM-barrel, 298, 299, 643
(TGF-β2) fold, 452
TGT. See tRNA-guanine transglycosylase geometry, 661
(TGT) structure, 299
Thale cress, 239 Timolol, 185, 186
Thalidomide, 99, 102, 423 T264I mutant, 726
The Institute for Genomic Research (TIGR), Tipifarnib, 635, 637
239 Tipranavir, 549, 553
T helper cells, 802 Tirofiban, 782, 783
Theonella sp, 500 Tissue plasmin activator tPA, 118
Theophylline, 673 Titratable groups, 70, 87
Theory of tetrahedral carbon, 90 Titratable groups, 70, 87
Theriak, 4 T-lymphocytes, 788, 789, 791, 802, 803
Thermodynamic binding characteristics, 141 T4 lysozyme, 724, 726
Thermodynamic parameters, 141 TMDs. See Transmembrane domains (TMDs)
Thermolysin, 144, 361, 364, 365, 566, 571, 572 TNF. See Tumor necrosis factor (TNF)
Thermus aquaticus, 235 TNF-α, 584, 709
Thermus thermophilus, 838 TNF-α-converting enzyme (TACE), 584
Thiacetazone, 121 TNF-α receptor, 741
Thiamine pyrophosphate (TPP), 474 Tobacco, 759
Thioamide, 195 Tolbutamide, 17, 160
6-Thioguanine, 826 Tolcapon, 18
Thiol groups, 569 Tolcapone, 631
Thiophene-2-sulfonamide, 587 Toloxatone, 682
Thioridazine, 755, 756 Tolrestat, 664
Thiorphan, 101, 361 Toluene, 175
Threonine proteases, 493, 524 Tom Steitz, 838
Thrombin, 80, 82, 167, 303, 494, 495, 497, Topiramate, 588, 590
500–502, 505, 507, 512 Topoisomerases, 833, 834
inhibitors, 83, 84, 167, 180, 499, 500, 510 Torpedo, 758
NAPAP complex, 504, 506, 507 Torsade de pointes tachycardia, 756
Thrombocytes, 40, 501, 778 Torsion-angle histograms, 341
Torsion angles, 292, 336–342, 344 Trimeris, 788


φ and ψ, 196 Trimethoprim, 107, 650, 651
ω, φ, ψ, and χ, 190 Triosephosphate isomerase, 299
Tositumomab, 822 Trioxyglutaric acid, 315
N-Tosyl-D-proline, 149 Tripos, 368
Toxicology studies, 397 Triptanes, 683
TP53, 41 Triturus vulgaris, 240
tPA. See Tissue plasminogen activator (tPA) tRNA-guanine transglycosylase (TGT), 450,
TPP. See Thiamine pyrophosphate (TPP) 451, 455
Traceless linkers, 226 tRNAs, 241, 450, 452, 839, 841
Traditional medicines, 116 Troglitazone, 715, 716
Trajectory, 326 Trojan horses, 185, 486
Tramadol, 675 Tropolone, 631
Tranquilizers, 483 Trusopt ®, 589
Transactivation domain, 699 Trypanosoma, 638
Transcriptase, 829 Trypanosoma cruzi, 120
Transcription, 4, 839 Trypanosomiasis, 120
factor PXR, 673 Trypsin, 80, 105, 250, 303, 304, 309, 310, 494,
factors, 450, 697, 699, 814 495, 497–499, 505, 507, 509
Transcriptome, 249, 252 Trypsin inhibitors, 284, 499
Transcriptome analysis, 252 Trypsin-like proteases, 494
Transferases, 474, 676 Trypsin-like serine proteases, 243, 244, 497, 502
Transforming growth factor β2 (TGF-β2), 825 Tryptase, 495, 515
Transgenic animals, 244 Tryptase inhibitors, 515
Transgenic sheep, 815 TTP. See Thymidine-5′-triphosphate (TTP)
Transglutaminase-2 (TGT2), 529 Tuberculosis, 3, 8, 121, 489, 677
Transglutaminases, 502, 528, 599 Tuberculostatics, 121
Transition metals, 641 Tubocurare, 372
Transition state, 473, 496 Tubocurarine, 115
analogues, 96, 97, 569 Tubulin, 487
isosteres, 536 TU Darmstadt, 593
mimetics, 123 Tudor Oprea, 410, 472
Translational freedom, 74 Tumor growth, 584
Transmembrane domains (TMDs), 768 Tumor-inhibiting substances, 486
Transpeptidase reaction, 518 Tumor metastasis, 527
Transpeptidases, 493, 521 Tumor necrosis factor (TNF), 278, 284, 285, 482
Transplant surgery, 487 Tumor necrosis factor (TNF-α), 821
Transport, 164 Tumor-suppressor protein p53, 200, 526
antibiotics, 770 Tumor therapy, 528
systems, 62 β-Turn, 196, 197, 503
Transporters, 62, 65, 472, 483, 484, 745 mimetics, 781
Tranylcypromine, 677–680 mimic, 197
Traxler model, 605 TVGYG motif, 750, 752
Trester wine, 26 TXA2, 688
TRH. See Thyreotropine-releasing hormone Type-II diabetes, 517, 623, 659
(TRH) Type-II diabetes mellitus, 754
Trichloroethanol, 25 Trypsin-like serine proteases, 514
Trichomonas vaginalis, 241 Tyramine, 13, 677, 678, 681
Triglyceride, 623 Tyramine oxidases, 677
Trigonal bipyramidal phosphorus, 603 Tyr-Lys-Ser triad, 666
Trigonal-planar configuration, 496 Tyrosine kinase, 481, 706
Triiodothyronine, 697, 698 cascade, 821
Triiodothyronine T3, 156 domain, 738
Tyrosine kinase (cont.) Vanilloid receptor antagonist, 691


inhibitor, 740 Vardenafil, 591–593
Tyrosine receptor kinase, 739 Varicella zoster virus, 523
Tyrosyl radical, 685, 686 Vascular disease, 623
Tyrosyl-RNA synthase, 76 Vectors, 205
Vegetative nervous system, 518
Velcade ®, 524
U Venom, 573, 760
Ubiquitin, 524, 600 Ventricular fibrillations, 756
UCSF in San Francisco, 369, 441, 615 Verapamil, 17, 30, 56, 102, 483, 673, 749
Ulcerating colitis, 825 Veronal ®, 25
Ultrafast metabolizers, 675 Verteporfin, 18
Unfolding process, 139 Vertex, 524, 528
Unicorn powder, 5 Viagra ®, 19, 33, 34, 591, 594
UNITY, 368 Vibrational degrees of freedom, 68
University of Florence, 566 Vibrational motion, 306
University of Kiel, 509 Vibrational movements, 292
University of Marburg, 394, 454, 555 Vietnam War, 45
University of New Mexico, 472 Vildagliptin, 517
University of Perugia, 675 Vinblastine, 114
University of Prague, 677 Vinca alkaloids, 114
Unwin, 758 Vincristine, 114
Upjohn, 29, 443 Vin Mariani, 52
Uracil (UMP), 648 Vioxx ®, 691
Urea, 475 Viral disease, 790
Urease, 275 Viral hepatitis, 815
Urethane, 25 Viral mRNA, 825
Urginea maritima, 4 Viral proteases, 523
Uric acid, 33, 485 VirF, 450
Uric acid transporter, 485 Virostatics, 486
Uroberos, 14 Virtual screenings, 136, 368, 445, 458
Urokinase, 118, 495, 515 Virtual screening techniques, 136
US FDA, 41 “Virtual” spring, 356, 361
US National Institutes of Health (NIH), 58 Viruses, 62, 489
Vitamin B2 (Riboflavin), 644
Vitamin B12, 474
V Vitamin D, 697, 698
Vaccines, 3, 815, 820 Vitravene ®, 825
Vaccine therapy, 806 Vitreous humour, 825
Vagal nerve, 9 Voltage-gated calcium channels, 483
Vagus stoff, 9 Voltage-gated chloride channels, 765
Valaciclovir, 185 Voltage-gated ion channel, 483
Valdecoxib, 691 Volume comparisons, 351
Valinomycin, 770 von Willebrand factor, 501
Valium ®, 12, 19 VP1, 798, 801
Valsartan, 734, 735 VP2, 798
van der Waals VP3, 798
contacts, 308, 844 VP4, 798
energy, 340
interactions, 296, 319
potentials, 380 W
radius, 323 Walter Reed Army Institute of Research, 45
surfaces, 323, 325 Warfarin, 124, 125
Warheads, 528 Ximelagatran, 18, 180, 508, 509


Warner-Lambert, 658 X-rays, 269
Water and octanol phases, 398 counter, 272
Water channels, 485 crystallography, 14, 437
Water flea, 240 crystal structure, 104
Water homeostasis, 485, 537, 772 diffraction, 272
Water molecules, 380 structure analysis, 136, 432
Water pore, 772
Wedge sea hare, 116
Weizmann Institute, 838
Wellcome Research Laboratories, 430 Y
WHO. See World Health Organisation (WHO) Yale University, 838
Whole-genome shotgun sequencing, 239 Yellowstone National Park, 235
Wiggling, 832 Yttrium, 822
Wilson analysis, 378
Wilson’s disease, 124, 125
WIN54954, 800 Z
Wisconsin Alumni Research Foundation, 125 Zanamivir, 488, 795–797
Wobble position, 451 Zantac ®, 56
World Health Organisation (WHO), 43, 48, 793 Zearalenone, 838
Worm, 135 Zebra finches, 257
Wound closure, 502 Zebra fish, 136
WPD loop, 621, 628 Zenarestat, 665
WWII, 45 Zeneca, 19, 512
Zidovudine, 552, 830
ZINC, 369
X Zinc, 565, 568, 569, 583, 587, 589, 594, 617, 620
Xamoterol, 161 finger domain, 242
Xanthene, 214 fingers, 243, 481, 482
Xanthine oxidase inhibitor, 121 hydrolase, 104
Xarelto ®, 514 proteases, 14, 562, 565, 569
Xemilofiban, 782, 783 Zinc-dependent metalloenzymes, 521
Xenical ®, 518 Zinc-finger motifs, 699
Xenobiotics, 65, 165, 301, 670, 768 Zopolrestat, 665
Xenopus laevis, 734 Zustandssumme, 326
X-gal, 134 Zymogens, 494
