Drug Design
Methodology, Concepts,
and Mode-of-Action
Gerhard Klebe
Translator
Leila Telan
Düsseldorf, Germany
This work is based on the second edition of “Wirkstoffdesign”, by Gerhard Klebe, published
by Spektrum Akademischer Verlag 2009, ISBN: 978-3-8274-2046-6
Library of Congress Control Number: 2013933987
The present handbook on drug design builds on the German version first written by
Hans-Joachim Böhm, Hugo Kubinyi, and me in 1996. After 12 years of success on
the market, the German version of this handbook was entirely rewritten and
significantly extended, this time by me as the sole author. The new edition particularly
considers novel approaches in drug discovery and many successful examples
reported in the literature on structure-based drug design and mode-of-action analysis.
This new version appeared on the German market in 2009. Several attempts were
made to translate this book into English to make it available to a wider audience.
This intention was driven by the fact that the author was repeatedly approached
with the question as to why such a successful book is not available in the English
language. An analysis of the textbook market made apparent that no similar
compendium was (and still is) available covering the same field of interest. Finally,
Springer agreed to the translation project, and Dr. Leila Telan, a gifted bilingual
medicinal chemist and physician, was willing to take on the task of producing
a first draft of a cover-to-cover translation of the German original. This version was
corrected, and some chapters extended, by the author. The book is meant for
students of chemistry, pharmacy, biochemistry, biology, chemical biology, and
medicine interested in the design of new active agents and the structural foundations
of drug action. It is also tailored to experts in the drug industry who want to
obtain a more comprehensive overview of various aspects of the drug discovery
process.
Such a book project would not have been possible without the help of many
friends and colleagues. First of all, I want to express my sincere thanks to Dr. Leila
Telan, Düsseldorf, Germany, who produced the first version of this translation. Her
version and the modifications of the author have been carefully proofread by many
colleagues in the field. Their help is highly appreciated. Furthermore, I would like
to acknowledge the help of Prof. Dr. Hugo Kubinyi, Heidelberg, Germany, who
assisted in correcting the first version of the English translation. Particular thanks
go to Dr. Simon Cottrell, Cambridge, England, and to Dr. Nathan Kilah, Hobart,
Tasmania, Australia, for their excellent and very thorough proofreading of the
different chapters. The project was ideally guided by Dr. Daniel Quinones and
Dr. Sylvia Blago, Springer, Heidelberg, Germany. The author is grateful to the
publisher for their assistance and technical support in producing the electronic and
printed version of this handbook.
Drug design is a science, a technology, and an art all in one. An invention is the
result of a creative act, and a discovery is the detection of an already-existing
reality. Design encompasses the two processes with emphasis on a targeted
approach based on the available knowledge and technology. Furthermore, the
creativity and intuition of the researcher play a decisive role.
Drugs are all substances that affect a system by inducing a particular effect. In
the context of this book, drugs are substances that exhibit a biochemical or
pharmacological effect, in most cases medications, that achieve a therapeutic result
in humans.
The idea of rational drug design is not new. Organic compounds were prepared
more than a century ago with the goal of attaining new medicines. The sedatives
chloral hydrate (1869) and urethane (1885), and the antipyretics phenacetin
(1888) and acetylsalicylic acid (1897) are early examples of how compounds
with favorable therapeutic properties can be made in a targeted way by starting from
a working hypothesis. The fact that the hypotheses in all four cases were more or
less incorrect (▶ Sects. 2.1, ▶ 2.2, and ▶ 3.1) simultaneously demonstrates one of
the main problems of drug design.
In the case of the artistic design of a poster or commodity, or, in the case of
engineering, the design of an automobile, a computer, or a machine, the result is
usually predictable. In contrast, the design of a drug is even today not completely
foreseeable. The consequences of the smallest structural changes of a drug on its
biological properties and target tissue are too multifaceted and at present too poorly
understood.
Until modern times, scientists worked on the principle of trial and error to
find new medicines. In doing so, they derived mostly empirical rules that have contributed
to a knowledge base for rational drug design, and that individual researchers have
translated more or less successfully into practice. Today new technologies
are available for drug research, for instance, combinatorial chemistry, gene
technology, automated high-throughput screening methods, protein crystallography
and fragment screening, virtual screening, and the application of bio-
and chemoinformatics.
In many cases the molecular mechanisms of the mode of action of medicines are
fairly well understood, but in other cases we are at the threshold of comprehension.
Many of these mechanisms will be discussed in this book. Progress in protein
crystallography and NMR spectroscopy allows the determination of the three-
dimensional structure of protein–ligand complexes on a routine basis. As is
shown in many of the illustrations in this book (for a general explanation of
“reading” these illustrations, see the appendix at the end of this book) these
structures make a decisive contribution to the targeted design of drugs. 3D struc-
tures with up to atomic resolution are known for approximately 550,000 small
molecules and more than 85,000 proteins and protein–ligand complexes, and the
numbers are increasing exponentially. Methods for the prediction of the 3D struc-
tures of small molecules are now mature, and semiempirical and ab initio quantum
chemical calculations on drugs are now routinely performed. The sequencing of the
human genome is complete, and the genomes of other organisms are reported nearly
every week, including those of important human pathogens. The age of structural
genomics has begun, and it is only a matter of time before the 3D structures of entire
gene families are available. Given enough sequence homology, modeling programs
can nowadays achieve an impressive reliability. In the meantime, the composition of
entire genomes is being processed with structure-prediction programs. There are
already interesting approaches for the de novo prediction of 3D protein structures,
and the first correct 3D structural predictions have been successfully accomplished.
Structure-based and computer-aided design of new drugs is here to stay in
practical drug research. Computer programs serve the search for, modeling of,
and targeted design of new drugs. In countless cases these techniques have assisted
the discovery and optimization of new drugs. On the other hand, a too-strict and
one-sided focus on the computational results bears the danger of losing sight of the
available knowledge of the relationship between the chemical structure and bio-
logical activity. Another danger is the limited consideration of an active agent only
with respect to its interaction with one single target without considering the other
essential requirements for a drug, for instance, the pharmacokinetic and toxicolog-
ical properties. In the last decade, intensive research effort has gone into the
compilation of empirical guidelines to predict bioavailability, toxicological pro-
files, and metabolic properties (ADME parameters). The ability to predict the
metabolic profile for a given xenobiotic by the arsenal of cytochrome P450
enzymes or to predict for each individual patient the metabolic peculiarities is
still a dream. Nonetheless, just such an individually adjusted therapy and dosing
regime is within the realm of possibilities. It is also conceivable that in the
foreseeable future, gene sequencing of each of us will be financially feasible and
will require a manageable and justifiable amount of time and effort. This will open
entirely new perspectives for drug research. Whether this pushes open the gate to
individualized personal medicines will be a question of cost. The aim of this book
is to introduce the methods required for drug design, particularly those based on structural
and mechanistic evidence. Using well-selected examples, the route to the
discovery and development of new medicines is discussed and considered
in light of constantly changing conditions.
employees of “big pharma” have established their own small companies with an
innovative idea. If the idea was good and successful, after a few years these
innovators find themselves once again incorporated into the organization of
a “big pharma” company.
At the same time the prescribing practices in all areas of health care have
changed. Formerly it was the physician alone, occasionally in consultation with
a pharmacist, who was responsible for the pharmacological therapy of the patient.
Today cost-cutting measures, “negative lists,” health insurance, the purchasing
departments of hospitals and pharmacies, the ubiquitous Internet, and even public
opinion influence therapies to an ever larger extent.
The drug market, with a volume of US $600 billion, is extremely attractive.
Furthermore, this market is characterized by dynamic growth, which is decidedly
stronger than in other markets. The best-selling drug in 2005, Lipitor® (Sortis® in
Europe; atorvastatin), achieved US $12.2 billion in annual sales. Only illegal
narcotics like heroin and cocaine have higher sales figures.
Tailored medications – Will the latest technologies really deliver on this prom-
ise? What makes drug research so difficult? To use a parable, it is something like
playing against an almighty chess computer. The rules are known to both sides, but
it is very difficult to comprehend the consequences of each individual move during
a complicated middle game. A biological organism is an extremely complicated
system. The effect of a drug on the system and the effect of the system on the drug
are multifaceted. Every structural change made with the goal of optimizing one
particular characteristic simultaneously changes the finely tuned equilibrium of the
other characteristics of the drug.
The knowledge of the interplay between the chemical structure and the biolog-
ical effect must be united with the newest technology and results of genetic research
to purposefully develop new medicines. It is also necessary to define the range of
applications and the limitations of new technologies. Theory and modeling cannot
exist detached from experiment. The results of calculations depend strongly on the
boundary parameters of the simulation. The results collected for one system are only
conditionally transferable to other systems. Only an experienced specialist is in
a position to fully exploit the special potential of theoretical approaches. The claims
that some software and venture capital companies make, that their results automatically
lead to success, should be considered with some skepticism. This book should
be helpful in these situations too, to separate the wheat from the chaff and to
identify the application range of these methods as well as their limitations.
This book is about drug research and the mode of action of medicines. It is
different from classical textbooks on pharmaceutical chemistry in its structure and
goals. The principles, methods, and problems associated with the search for new
medicines are the themes. Classes of drugs are not discussed, but rather the way that
these drugs were discovered and some insights into the structural requirements for
their action on a particular target protein. As the title suggests, the book is meant for
students of chemistry, pharmacy, biochemistry, biology, and medicine who are
interested in the art of designing new medicines and the structural fundamentals of
how drugs act on their targets.
In the first section, after an introduction to the history of medicines and to
serendipity as an unpredictable but always very successful concept in
drug research, examples from classical drug research will be presented.
A discussion about the fundamentals of drug action, the ligand–receptor interaction,
and the influence of the three-dimensional structure on the efficacy of a drug rounds
out the section. In the second section, the search for lead structures and their
optimization and the use of prodrug strategies are introduced. New screening
technologies but also the systematic modification of structures by using the concept
of bioisosteres and a peptidomimetic approach are discussed. In the third section,
experimental and theoretical methods applied in drug research are described.
Combinatorial chemistry has afforded access to a wide variety of test substances.
Gene technology has produced the target proteins in their pure form, and has helped
to characterize these proteins’ properties and function from the molecular level to
the cellular assembly, all the way to the organism level. It has built a bridge between
understanding the effects of a drug therapy on the complex microstructure of a cell
and in the systems biology of an organism. The spatial structures of proteins and
protein–ligand complexes are accessible through NMR spectroscopy and X-ray
crystallography. Their structural principles are becoming better understood and are
increasingly allowing us access to the binding geometry of the drugs. The computer
methods and molecular dynamics simulations of complex conformational analysis
have also sharpened our understanding of targeted drug design. The fourth section
introduces design techniques such as pharmacophore and receptor modeling, and
discusses the methods of, and uses for, quantitative structure–activity relationships
(QSAR). Insights into the transport and distribution of drugs in biological systems
are given, and different techniques for structure-based design are presented. A drug-
design case study from the author’s research closes the section. The fifth section of
this book focuses on the core question of pharmacology: How do drugs actually work?
Enzymes, receptors, channels, transporters, and surface proteins are divided into
individual chapters and discussed as groups of target proteins. The spatial structure
of the protein and modes of action are used to elucidate in detail why a drug works
and why it must exhibit a particular geometry and structure to work. By way of example,
the contributions of structure-based and computer-aided design to the discovery of
new drugs are presented in these chapters, and other aspects are also brought into the
spotlight.
Because of the concept of this book, many important drugs are not considered or
are only fleetingly mentioned. The same is true of receptor theory, pharmacokinet-
ics and metabolism, the basics of gene technology, and statistical methods. The
biochemical, molecular biological, and pharmacological fundamentals of the mode
of action of drugs, which are important for the understanding of the theme of drug
design, are only commented upon in outline form. Other disciplines that are critical
for the development of an active substance to a medicine and application to
patients, such as pharmaceutical formulations, toxicological testing, and clinical
trials, are not themes that are covered in this book.
The selection of examples from therapeutic areas was made subjectively and for
didactic reasons based on case studies and to bring other aspects of drug research to
the foreground. A balanced presentation of the methods of drug design and their
practical application was attempted. The interested reader does not have to read the
book chronologically. If the reader’s interest is purely in drugs and their mode of
action, then they can also begin with ▶ Chap. 22. There are many cross references
in the text to help the reader to find the passages in other parts of the book that are
necessary for an exact comprehension of what is being discussed at any given part.
The references and literature suggestions that follow cite particularly recommend-
able monographs and are ordered alphabetically; journals and series on the themes
that are discussed in later chapters are not mentioned specifically again.
Literature
Monographs
Brunton L, Lazo J, Parker K (2005) Goodman & Gilman’s the pharmacological basis of thera-
peutics, 11th edn. McGraw-Hill, Europe
Ganellin CR, Roberts SM (eds) (1993) Medicinal chemistry. The role of organic chemistry in drug
research, 2nd edn. Academic Press, London
King FD (ed) (2003) Medicinal chemistry: principles and practice, 2nd edn. The Royal Society of
Chemistry, Cambridge
Krogsgaard-Larsen P, Bundgaard H (eds) (1991) A textbook of drug design and development.
Harwood Academic Publishers, Chur, Switzerland
Lednicer D (ed) (1993) Chronicles of drug discovery, vol 3. American Chemical Society,
Washington, DC and earlier volumes from this series
Lemke TL, Williams DA (2008) Foye’s principles of medicinal chemistry, 6th edn. Williams &
Wilkins, Baltimore
Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry. Wiley-
VCH, Weinheim, Series with Guest Editors
Maxwell RA, Eckhardt SB (1990) Drug discovery. A casebook and analysis. Humana Press,
Clifton
Mutschler E, Derendorf H (1995) Drug action, basic principles and therapeutic aspects. CRC
Press, Boca Raton/Ann Arbor/London/Tokyo
Silverman RB (2004) The organic chemistry of drug design and drug action, 2nd edn. Elsevier/
Academic Press, Burlington
Wermuth CG, Koga N, König H, Metcalf BW (eds) (1992) Medicinal chemistry for the 21st
century. Blackwell Scientific, Oxford
Nature
Nature Reviews Drug Discovery
Perspectives in Drug Discovery and Design
Pharmacochemistry Library
Progress in Drug Research
Quantitative Structure-Activity Relationships
Reviews in Computational Chemistry
Science
Scientific American
Trends in Pharmacological Sciences
Nowadays the Internet, discussion platforms, and the tremendously valuable tool of Wikipedia are
available to everyone and provide access to an enormous source of information.
Part I
Fundamentals in Drug Research
This colored copper plate engraving from arguably the most beautiful plant book,
the Hortus Eystettensis by Basilius Besler, Eichstätt, 1613, shows the squill, Scilla
alba (modern name: Urginea maritima L.). This plant was known to the ancient
Egyptians, Greeks, and Romans as a remedy for many ailments, but especially
dropsy (today: congestive heart failure). It was venerated faithfully as a general
defense against harm. It was not until our century that the active components of
squill, the glycosides scillaren and proscillaridin, were isolated in their pure form,
and a derivative with improved bioavailability, meproscillarin (Clift®), became available
for pharmaceutical therapy.
1 Drug Research: Yesterday, Today, and Tomorrow
The targeted route to medicines is an old dream of humanity. Even the alchemists
sought after the Elixir, the Arcanum that was meant to heal all disease. It still has
not been found today. On the contrary, drug therapy has become even more
complicated as our knowledge of the different disease etiologies has become
more complex.
Nonetheless, the success of drug research is impressive. For hundreds of years,
alcohol, opium, and Solanaceae alkaloids (from thorn apples) were the only preparatory
measures for surgery. Today general anesthesia, neuroleptanalgesia, and
local anesthetics allow absolutely pain-free surgical and dental procedures to be
carried out. Until this century, plagues and infectious diseases have killed more
people than all wars. Today, thanks to hygiene, vaccines, chemotherapeutics, and
antibiotics, these diseases have been suppressed, at least in industrialized countries.
The dangerously increasing numbers of therapy-resistant bacterial and viral path-
ogens (e.g., tuberculosis) have presented new problems and make the development
of new medications urgently necessary. The H2-receptor inhibitors and proton-
pump inhibitors have drastically reduced the number of surgical procedures to treat
gastric and duodenal ulcers. Combinations of these inhibitors with antibiotics have
brought even more advances in that they allow a causal therapy (▶ Sect. 3.5).
Cardiovascular diseases, diabetes, and psychiatric diseases (diseases of the central
nervous system, CNS) are treated mostly symptomatically, that is, the cause of the
disease is not addressed, but rather the negative effects of the disease on the
organism. Often the therapy is limited to slowing the progression of these diseases
or increasing the quality of life. Synthetic corticosteroids have led to significant
pain reduction and retardation of the pathological bone degeneration associated
with chronic inflammatory diseases (e.g., rheumatoid and chronic polyarthritis).
The spectrum of cancer therapy ranges from healing, particularly in combination
with surgical and radiation therapy, all the way to complete failure of all therapeutic
measures.
The history of drug research can be divided into several sequential phases:
• the beginning, when empirical methods were the only source of new medicines,
• targeted isolation of active compounds from plants,
• the beginning of a systematic search for new synthetic materials with biological
effects and the introduction of animal models as surrogates for patients,
• the use of molecular and other in vitro test systems as precise models and as
a replacement for animal experiments,
• the introduction of experimental and theoretical methods such as protein
crystallography, molecular modeling, and quantitative structure–activity rela-
tionships for the targeted structure-based and computer-supported design of
drugs, and
• the discoveries of new targets and the validation of their therapeutic value
through genomic, transcriptomic, and proteomic analysis, knock-in and knock-
out animal models, and gene silencing with siRNA.
Each preceding phase loses its importance with the arrival of the next phase.
Interestingly, in modern drug research individual phases run in the opposite direc-
tion. That is, first a target structure is discovered in the sequenced genome of an
organism and its function is modulated to validate it as a candidate for drug therapy.
Then the structure-based and computer-aided design of an active substance is
undertaken in close cooperation with multiple in vitro tests to clarify the activity
and the activity spectrum. Next, the animal experiments substantiate the clinical
relevance, and in the final step clinical trials confirm a test substance’s suitability as
a medicine for patients.
The beginnings of drug therapy can be found in traditional medicines. The narcotic
effect of the milk of the poppy, the use of autumn crocus (Colchicum autumnale)
for gout, and the diuretic effect of squill (Urginea maritima) for dropsy (today:
congestive heart failure) have been known since antiquity. The dried herbs and
extracts from these and other plants have served as the most important source of
medicines for more than 5,000 years. The oldest written records of these uses are
from 3000 BC.
Around 1550 BC the ancient Egyptian Papyrus Ebers listed approximately 800
prescriptions, of which many contained additional rituals to invoke the help of the
gods. The five-volume book De Materia Medica of Dioscorides (Greek physician,
first century AD) is the most scientifically rigorous work of antiquity. It contains
descriptions of 800 medicinal plants, 100 animal products, and 90 minerals. Its
influence reached into the late Arabic medicine and the early modern age.
The most famous medicine of antiquity was undoubtedly Theriac. Its precursor,
Mithridatum, served the King of Pontus, Mithridates VI (120–63 BC) as an antidote
for poisonings of all kinds. Theriac can be traced to Andromachus, the private
physician of the emperor Nero, and originally contained 64 ingredients. This
preparation remained very widespread even into the eighteenth century. It was
prepared in many variations with up to 100 ingredients. In some cities it was even
prepared under state control to ensure that none of the ingredients were left out! Its
use evolved into a panacea for all diseases. In addition, every imaginable wonder
drug was in use; examples include earthworm oil, unicorn powder, gastric
calculus stones, human cranium powder (Lat. cranium, skull), mummy dust, and
many more.
Traditional Chinese medicine was very advanced even in ancient times.
A special feature of its formulations was, and is, that four different
qualities are responsible for the effect. The chief (jun) is the carrier of the effect,
the adjutant (chen) supports the effect or induces a different effect. The assistant
(zuo) can also support the main effect or can serve to ameliorate side effects, and
one or more messengers (shi) moderate the desired effect. The Chinese Pen-Ts’ao
school (first and second century AD), whose goal it was to live for as long as
possible without aging (!), recommended the following dosing regime:
When treating a disease with a medicine, if a strong effect is desired, one should begin with
a dose that is not larger than a grain of millet. If the disease is healed, no more medicine
should be given. If the disease is not healed, the dose should be doubled. If that does not
heal the disease, the dose should be increased tenfold. When the disease is healed, the
therapy should always be discontinued.
[Structural formulas of compounds 1.1–1.9: morphine, caffeine, quinine, colchicine, cocaine, ephedrine, coniine, atropine, and reserpine]
Fig. 1.1 Many important natural products were isolated in the nineteenth century, and a few were
synthesized. Morphine 1.1 was isolated from opium by Friedrich Wilhelm Adam Sertürner in
1806, caffeine 1.2 was isolated from coffee, and quinine 1.3 was isolated from cinchona bark by
Friedlieb Runge in 1819. Quinine was discovered independently by Pierre Joseph Pelletier and
Joseph Bienaimé Caventou, who one year later isolated colchicine 1.4 from autumn crocus.
Cocaine 1.5 was extracted from coca leaves by Albert Niemann in 1860, and ephedrine 1.6 was
extracted from the Chinese plant Ma Huang (Ephedra vulgaris) by Nagayoshi Nagai. In 1886 the
first alkaloid, coniine 1.7, which is found in hemlock, was synthesized by Albert Ladenburg; in
1901 atropine 1.8 from Deadly Nightshade was synthesized by Richard Willstätter. Reserpine 1.9,
from Rauwolfia serpentina was first prepared in the middle of the twentieth century, and its
structure was elucidated.
1.2 Animal Experiments as a Starting Point for Drug Research
Bologna, which was first described in his book De viribus electricitatis in motu
musculari in 1791, has become famous. In 1780 his students had already observed
how frog thighs would twitch when the nerve was dissected while a static electricity
generator was simultaneously in use; such devices were standard equipment
in many laboratories at the time. He wanted to demonstrate in standardized
experiments whether the twitching was also caused by thunderstorms. He hung the
legs on an iron window grill with a copper hook — they twitched simply upon
contact with the grill. The voltage difference between the two metals was enough to
stimulate the nerve, even without an electrical discharge.
The systematic investigation of the biological effects in animals of plant
extracts, animal venoms, and synthetic substances began in the nineteenth century.
In 1847 the first pharmacology department was founded at the Imperial University
in Dorpat (today: Tartu, Estonia). The famous pharmacologist Sir James W. Black,
who developed the first β-blocker (an antihypertensive, ▶ Sect. 29.3) at ICI and
later took part in the development of the first H2 antagonists (see gastrointestinal
ulcer medications, ▶ Sect. 3.5) at Smith, Kline & French, compared pharmacological
testing to a prism: what pharmacologists see in their substances’ properties
directly depends on the model that was used to test the substances.
Just as a prism would, the models distort our vision in different ways. There is no
such thing as a depressed rabbit or a schizophrenic rat. Even if there were such
animals, they would not be able to share their subjective perceptions and emotions
with us. Gene-modified animals (▶ Sect. 12.5), such as the Alzheimer mouse, are also
approximations of reality that have been distorted through a different prism, to use
Black’s analogy. This fact is often underestimated in industrial practice. Scientists
tend to optimize their experiments on a particular, isolated model. In doing so,
many factors and characteristics that are essential for a medicine, for instance the
selectivity or bioavailability, are inadequately considered.
There is no way out of this dilemma. We need simple in vitro models (Sect. 1.5)
to be able to test large series of potentially active compounds, and we need the
animal models to correlate the data and to make predictions about the therapeutic
effects on humans. In the past, therapeutic progress was preferentially achieved
when a new in vivo or in vitro pharmacological model was available for a new effect
(see the H2 receptor antagonists, ▶ Sect. 3.5).
Typical mistakes in the selection of models and interpretation and comparison of
experimental results arise from different modes of application and the correlation of
results obtained in different species of animals. It does not make sense to optimize
the therapeutic range of a substance in one species, and the toxicology in another.
Further, comparing effects after a fixed dose, without determining an effective dose,
also distorts the results because very strong and very weak substances fall outside the
measurement range. Measuring the effect strictly according to a schedule is also
questionable because neither the latency period, that is, the time before an effect is
seen, nor the time of maximum biological effect is recorded. In whole-animal
models, auxiliary medications are usually applied, which can also influence the
experimental results. Anesthetized animals often give entirely different results than
conscious animals.
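The distortion caused by fixed-dose comparisons can be made concrete with a small numerical sketch. It assumes a simple Hill-type dose–response curve, which is an illustrative assumption and not a model taken from this chapter; the compound names, ED50 values, and the fixed screening dose are all hypothetical.

```python
def hill_effect(dose, ed50, hill=1.0, e_max=100.0):
    """Percent of maximal effect for an assumed Hill-type dose-response curve."""
    return e_max * dose**hill / (ed50**hill + dose**hill)


# Hypothetical compounds with ED50 values (mg/kg) spanning six orders of magnitude.
COMPOUNDS = {"A": 0.01, "B": 0.1, "C": 10.0, "D": 1000.0, "E": 10000.0}

FIXED_DOSE = 10.0  # mg/kg, hypothetical single screening dose


def fixed_dose_effects(ed50_by_name, dose=FIXED_DOSE):
    """Effect of every compound when all are tested at the same single dose."""
    return {name: hill_effect(dose, ed50) for name, ed50 in ed50_by_name.items()}


if __name__ == "__main__":
    for name, effect in fixed_dose_effects(COMPOUNDS).items():
        print(f"compound {name}: ED50 = {COMPOUNDS[name]:>8} mg/kg, "
              f"effect at {FIXED_DOSE} mg/kg = {effect:6.2f} %")
```

Under these assumptions, compounds A and B differ tenfold in potency, yet both sit at the upper plateau of the curve and look almost identical at the fixed dose. The same is true of D and E at the lower end. Only an effective dose such as the ED50 separates them.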
Plagues and infectious diseases, and at the top of this list are malaria and tubercu-
losis, have killed more people over the ages than all of the wars in the history of
humanity. Twenty-two million people died during the first wave of the 1918
influenza (“Spanish flu”). Up until the middle of the twentieth century, millions
of people died every year of malaria, and unfortunately, today these numbers are
shooting up again (▶ Sect. 3.2). Until the turn of the twentieth century, ipecac
(Psychotria ipecacuanha) and cinchona (Cinchona officinalis L.) were the only
therapeutic approaches to this disease. The impressive successes in the fight against
plagues came in large part from the last 80 years of drug research. We have the
sulfonamides (▶ Sect. 2.3) and their combinations with dihydrofolate reductase
inhibitors (▶ Sect. 27.2), the antibiotics (▶ Sects. 2.4, ▶ 6.4, and ▶ 32.6), and
the synthetic tuberculostatic medicines (▶ Sect. 6.5) to thank for this. When
Selman A. Waksman (1888–1973) received the Nobel Prize for the discovery of
streptomycin (▶ Sect. 6.4), a little girl congratulated him with a bouquet of
flowers. She was the first patient with meningeal tuberculosis to be healed with
streptomycin. Today we know the atmosphere of a tuberculosis hospital not from our
own experience, but solely from Thomas Mann’s The Magic
Mountain (German: Zauberberg).
However, the infectious diseases, including tuberculosis, are on the advance
again. In the past many antibiotics were too broadly used. This and the spread of
resistant pathogens in hospitals have led to the situation that many cases are only
treatable with very specific antibiotics. If resistance develops to these antibiotics, all
of our weapons are dull. New viral infections are looming. Before the advent of the
immune disease AIDS (acquired immune deficiency syndrome), there were very
few cases of pneumonia caused by the fungus Pneumocystis jirovecii (formerly
Pneumocystis carinii); nowadays the numbers have increased tremendously. This
type of pneumonia is the primary cause of death of AIDS patients and
immunosuppressed patients after organ transplantation. A great effort has been
made to find drugs for AIDS and its complications. On the other hand, many
widespread tropical diseases, for instance malaria and Chagas disease, have been
inadequately researched, and expanding resistance to the currently available med-
ications represents an increasing worldwide problem. Because these diseases are
rampant in parts of the world where people lack the economic resources to finance
chemotherapy, more and more pharmaceutical companies have withdrawn from
these research areas for financial reasons. The chances of recovering the development
costs from this social stratum are poor. Here global politics must establish
some structure so that these people are able to benefit from the technological
progress made by modern drug research. An example of this is the Bill and Melinda
Gates Foundation, which is dedicated to the treatment and eradication of diseases
around the entire world, but with particular emphasis on developing countries.
Improved hygiene has also helped to reduce the risk of infection, for
instance traumatic fever or Shigella dysentery (discussed in ▶ Chap. 21, “A Case
1.4 Biological Concepts in Drug Research
Acetylcholine 1.10 (Fig. 1.2), which was synthesized in 1869 by Adolf v. Baeyer, is
a neurotransmitter, that is, a transfer agent for nerve impulses. In 1921 Otto
Loewi, a pharmacologist, proved its biological effect in an elegant experiment.
Two isolated frog hearts were perfused with the same solution. The vagal nerve of
one of the hearts was stimulated, leading to a slowing of the heart rate, a so-called
bradycardia. Shortly afterward, the second heart also began to beat more slowly,
which was a clear indication of a humoral (Lat. humor, umor, fluid) signal transfer.
Soon after that, acetylcholine was recognized as the responsible “Vagusstoff.”
Acetylcholine is itself not usable as a therapeutic because it is metabolized too
quickly by acetylcholine esterases (▶ Sect. 23.7).
In 1901 Thomas Bell Aldrich (1861–1938) and Jokichi Takamine isolated the
first human hormone, adrenaline 1.11 (Fig. 1.2). This hormone and its N-desmethyl
derivative, noradrenaline 1.12 (Fig. 1.2), are produced in a central location, the
adrenal glands, and are released under stress conditions into the entire system with
the exceptions of the CNS and the placenta, which have their own barriers against
most polar compounds. These substances cause different reactions in different parts
of the organism, where they react with the relevant receptors. The specificity is
poor, and a plethora of pharmacodynamic effects result: pulse and blood pressure
rise, and the organism is prepared for “flight” – which has been an exceedingly
important function over the course of evolution.
Noradrenaline and adrenaline (also called norepinephrine and epinephrine,
respectively) are also neurotransmitters (▶ Sect. 29.3), just like acetylcholine, the
biogenic amines 1.13–1.15, the amino acids 1.16–1.19, and peptides, such as 1.20
and 1.21 (Fig. 1.2). Neurotransmitters are produced locally in the nerve cells,
stored, and upon stimulation of the nerve, released. After interaction with receptors
on the neighboring nerve cell, they are quickly metabolized or taken up again by the
same neuron that released them. Depending on the name of the neurotransmitter,
one speaks of the adrenergic, cholinergic, and dopaminergic (etc.) systems. The
effect that adrenaline invokes is referred to as adrenergic, and an antagonist to this
system is called antiadrenergic. However, this nomenclature is not always strictly
adhered to. It is common to see combinations of the name of the neurotransmitter
with the term agonist or antagonist, or sometimes blocker instead of antagonist, for
instance a dopamine agonist, a histamine antagonist, or a β-blocker for antagonists
of β-adrenergic receptors. A plethora of drugs have arisen from the structural
variations of neurotransmitters.
Fig. 1.2 The natural hormones and neurotransmitters acetylcholine 1.10, adrenaline 1.11,
noradrenaline 1.12, dopamine 1.13, histamine 1.14, and serotonin 1.15, the excitatory amino acids
glutamic acid 1.16 and aspartic acid 1.17, the inhibitory amino acid glycine 1.18 and
γ-aminobutyric acid (GABA) 1.19, and several peptides, such as the enkephalins 1.20 and 1.21,
substance P and others serve as lead structures for drugs for a variety of cardiovascular and CNS
diseases (see ▶ Chaps. 3, “Classical Drug Research”; ▶ 29, “Agonists and Antagonists of
Membrane-Bound Receptors”; and ▶ 30, “Ligands for Channels, Pores, and Transporters”).
At the end of the 1920s the steroid hormones were isolated, and their structures
were determined in short order (▶ Sect. 28.5). Altogether the discoveries of the
mid-twentieth century heralded the “golden age” of drug research. The systematic
variation of the principles responsible for biological activity and our increasing
knowledge of the mode of action have led to the synthesis of enzyme inhibitors and
receptor agonists and antagonists, which together with natural product derivatives
from plants make up the largest part of our modern pharmacy.
1.5 In Vitro Models and Molecular Test Systems
Around 40 years ago, we began to think about testing substances in simple in vitro
models. With these models biological testing takes place in test tubes rather than
animals. There are many compelling reasons to avoid animal experiments. They
increasingly provoke public criticism and are time and cost intensive. In the
beginning cell culture models were preferentially employed, for example tumor
cell cultures for testing cytostatic therapies, or embryonic chicken heart cells for
cardio-active compounds. Later these were joined by receptor-binding studies. The
first molecular test models were enzyme-inhibitor assays in which the inhibitory
activity of a molecule could be evaluated on one particular target protein in the
absence of interfering side effects (▶ Chap. 7, “Screening Technologies for
Lead Structure Discovery”). With the progress of gene technology methods
(▶ Chap. 12, “Gene Technology in Drug Research”), not only is the preparation
of the enzyme simplified, but also receptor-binding studies can be carried out on
standardized materials. Today it is possible to achieve an exact evaluation of the
entire activity spectrum of any substance on any enzyme, receptors of all types and
subtypes, ion channels, and transporters. In the meantime, in industrial drug dis-
covery this procedure has become routine. Before biological screening begins, the
following questions have to be answered: what therapeutic goal should be achieved
and is this goal achievable? Therapeutic concepts are established based on the
pathophysiology and the causes of its alteration. Regulatory interventions with
drugs should re-establish the normal physiological conditions as closely as possible.
In doing so, a distinct problem occurs. Nature works on two orthogonal principles: the
specificity of the mode of action and a pronounced spatial separation of effects,
the compartmentalization. Adrenaline that is produced in the adrenal glands works
on the entire body except for the brain. If it is released there, it works only in the
synapse between two nerve cells. As far as specificity goes, chemists can beat
nature most of the time, but when it comes to spatial separation they fail by a wide
margin.
Through the progress made in gene technology (▶ Chap. 12, “Gene Technology
in Drug Research”) we can investigate active substances much more exactly than
before; but by using isolated enzymes and binding studies we are a long way away
from the reality of animal models, and even further away from humans. In analogy
to the difference between an animal experiment and an isolated-organ experiment,
a well-established correlation between the results obtained in cell culture and an
in vitro test and the desired therapeutic effect is a prerequisite to successfully using
the in vitro model. The quantitative relationship between different biological effects
(▶ Chap. 18, “Quantitative Structure–Activity Relationships”) establishes the con-
nection between animal models and humans.
One modern researcher stands out in the area of CNS-active compounds
especially, but also in areas of cardiovascular-active compounds and antihista-
mines. Paul Janssen (1926–2003) was the director of the company Janssen
Pharmaceuticals in Beerse, Belgium. In the years after World War II, his company
discovered over 70 new active substances, carried out the preclinical and clinical
Up until the middle of the last century psychiatric hospitals were purely custodial care
facilities; they were almost indistinguishable from prisons in terms of the restriction of
personal freedom of the individual. The discovery of neuroleptics, antidepressants,
anticonvulsants, and sedatives revolutionized psychiatry. Typical examples of these
classes of drugs are depicted in Fig. 1.3. With the repertoire of drugs that are available
today, schizophrenia, chronic anxiety, and depression are predominantly treated in open-ward
psychiatry. Many patients can be treated in an ambulatory setting. In 1933 Manfred Sakel
(1901–1957), who worked at the psychiatric university hospital in Vienna, noticed that
when schizophrenics were given insulin to stimulate their appetites, they became
calmer. Encouraged by this result, he increased the dose to the point of hypoglycemic
coma, which is a form of deep unconsciousness induced by too little blood sugar. Insulin
shock, pentetrazole, and electroshock became the standard treatment over the next two
decades for psychotic illness, an impressive and frightening proof of the absence of
therapeutic alternatives.
This situation changed in the 1950s with the discovery of reserpine 1.9 (Fig. 1.1,
Sect. 1.1), a herbal natural product. This substance exerts its effect by emptying the
reserves of the neurotransmitters noradrenaline, serotonin, and dopamine in nerve
cells. Reserpine was the first substance to display a prominent neuroleptic effect,
that is, it is sedating and calming, and it was the first compound to be used for
psychotic illness, for which the biological effect could be explained by a mode of
action. In addition, reserpine was used as an antihypertensive medication. Because
of its very broad and unspecific effect it is rarely used today for psychiatric illness
or arterial hypertension.
The role of dopamine 1.13 (Fig. 1.2, Sect. 1.4) in the etiology of schizophrenia
became clear with the discovery of chlorpromazine 1.22 (Fig. 1.3, ▶ Sects. 8.5 and
▶ 19.10), a substance that showed a favorable clinical effect. In contrast to the
unspecific reserpine, chlorpromazine is a pure dopamine antagonist. The applica-
tion of chlorpromazine and analogous tricyclic neuroleptics caused symptoms that
occur in Parkinson’s disease. This was the first indication that an endogenous
dopamine deficiency is the cause of that disease.
Chlordiazepoxide (Librium®, ▶ Sect. 2.7), the first tranquilizer of the group of
benzodiazepines, was found by accident. Only one year after its introduction and
for many years after that, the chemically closely related medication diazepam 1.23
(Valium®, Fig. 1.3) was the worldwide best-selling drug. The Rolling Stones
Fig. 1.3 A revolution in the therapy of psychiatric illness was brought about by the discovery of
potent neuroleptics such as chlorpromazine 1.22, tranquilizers such as diazepam 1.23, and
antidepressants such as imipramine 1.24. For the first time, these compounds allowed
a purposeful treatment of schizophrenia, chronic anxiety, and depression. Examples of newer
antidepressants with specific modes of action on transport systems (▶ Sect. 4.6) for noradrenaline
and serotonin are desipramine 1.25 and fluoxetine 1.26, respectively.
reuptake of these neurotransmitters from the synaptic gap. Desipramine 1.25 and
fluoxetine 1.26 are even more selective in that they inhibit only the noradrenaline or
the serotonin transporter of nerve cells.
An extremely capable tool is available for modeling the properties and reactions of
molecules, and particularly their intermolecular interactions: the computer. In
addition to processing complex numerical problems, it translates the
results into color graphics, which suits the human ability to
grasp pictures faster and more easily than text or columns of numbers. That is not
a surprise. Our brains process text sequentially, but pictures are comprehended in
parallel. X-ray crystallography and multidimensional NMR spectroscopic tech-
niques (▶ Chap. 13, “Experimental Methods of Structure Determination”) contrib-
ute to our understanding of molecules as much as quantum mechanical and force
field calculations (▶ Chap. 15, “Molecular Modeling”).
Is molecular modeling an invention of modern times? Yes and No. Friedrich
August Kekulé (1829–1896) supposedly derived his cyclic structure for benzene
from a vision of a snake that circled upon itself and bit its own tail (incidentally, the
snake Ouroboros is an age-old alchemical symbol). This now-famous dream may,
however, be traced to a memory of the book Constitutionsformeln der Organischen
Chemie by the Austrian schoolteacher Joseph Loschmidt (1821–1895; Fig. 1.4).
Loschmidt would certainly take pleasure in contemplating today’s pictures of models that
are quite similar to his own. More and more today we place the three-dimensional
structure, the steric dimensions, and the electronic qualities of molecules in the
foreground. Advances in theoretical organic chemistry and X-ray crystallography
have made this possible. The first structure-based design was carried out on
hemoglobin, the red blood pigment, in the research group of Peter Goodford.
Hemoglobin’s affinity for oxygen is modulated by so-called allosteric effector
molecules that bind in the core of the tetrameric protein. From the three-
dimensional structure he deduced simple dialdehydes and their bisulfite addition
products. These substances bind to hemoglobin in the predicted way and shift the
oxygen-binding curve in the expected direction.
The first drug developed by using a structure-based approach is the antihy-
pertensive agent captopril, an angiotensin-converting enzyme (ACE) inhibitor
(▶ Sect. 25.4). Although the lead structure was a snake venom, the decisive
breakthrough was made after modeling the binding site. For this, the binding site
of carboxypeptidase, another zinc protease, was used because its three-dimensional
structure was known at the time.
The road to a new drug is difficult and tedious. A nested overview of the interplay
between the different methods and disciplines from a modern point of view is
illustrated in the scheme in Fig. 1.5. In the last few years molecular modeling
(▶ Chap. 15, “Molecular Modeling”) and particularly the modeling of ligand–
receptor interactions (▶ Chap. 4, “Protein–Ligand Interactions as the Basis
Fig. 1.4 Loschmidt’s book Constitutionsformeln der Organischen Chemie (1861) contains structures
that anticipate both the formulation of the benzene ring as well as the modern modeling
structure. Kekulé must have known about this book because he disparaged it in a letter to Emil
Erlenmeyer in January 1862 in that he referred to it as “Confusionsformeln.” Loschmidt did not
become famous for his book, but rather because he carried out an experiment in 1865 that
determined the number of molecules in a mole to be 6.02 × 10^23, a constant that was later to be
named after him.
Fig. 1.5 The way to a drug is long. The upper part of the figure shows routes to lead structures.
The middle part describes the design cycle, which in practically all cases must be repeatedly
reiterated. Each of these phases is described in detail in the following chapters. The result of
iterative optimization is candidates for further development such as preclinical and toxicological
studies. It is from these studies that the actual candidates are selected. Formulation, clinical trials,
and registration then lead to a new medicine. The last phases are not presented in this book.
[Flowchart labels, as far as recoverable: identification of a biological target, proof of principle, and molecular test system; literature, patents, and competitor products (“me too” research); natural products, synthetics, peptides, and combinatorial chemistry; screening; biological concept and clinical side effects → lead structures → design cycle (experimental design and synthetic design; synthesis; biological testing; structure–activity relationships, QSAR, and molecular modeling; computer-aided design: protein crystallography, NMR, 3D database searches, de novo design) → candidate for further development → developmental substance → formulation → drug.]
The development of different methods in drug research has already been described
in the last section. Table 1.1 gives a short historical overview of the most prominent
results.
1.8 The Results of Drug Research and the Drug Market
The assessment of the efficacy and safety of a drug has reached an extraordinarily
high standard today. To some extent this development supports our
goal of finding new medicines, but it is also a hindrance. Acetylsalicylic acid
(Aspirin®) is without any doubt a valuable drug. Today this compound would
have great difficulty passing clinical trials. Acetylsalicylic acid is an irreversible
enzyme inhibitor, it has relatively weak efficacy, it causes gastric bleeding in high
doses, and it has a very short biological half-life. Each of these problems would be
a profound argument against its continued development today. It probably would
have already failed in screening. In a risk–benefit analysis however, it is better than
most of the alternatives. Where is the problem? It probably lies in the analytical–
deterministic mindset that dominates science, and therefore also drug research. It is
often overlooked that a system as complicated and complex as a human being, to whom
we apply a drug therapy, cannot always be adequately addressed by such an approach.
Despite public healthcare systems that constitute a barrier between the supplier
and the consumer, the drug market, with worldwide sales of more than US$880
billion, has strong competition. Two forces affect this market: the state of science
and technology and the needs of patients. A few drugs command a large portion of
sales. Constantly changing “hit lists” of the best-selling drugs can be found on the
internet. Because of mergers among established pharmaceutical companies in
recent years, the market has contracted to fewer, bigger companies. It is frequently
the case that a single drug can make or break a company. Often only two to three
drugs make up more than 50% of a large company’s sales. A historical example is
Glaxo. This company made its way out of the midfield to the top with ranitidine.
Astra experienced a similar boom with omeprazole. Today, after the merger with
Zeneca, it is among the biggest representatives of this field. Sankyo also had
a single drug, lovastatin, that greatly boosted sales. With its drugs sildenafil
(Viagra®) and atorvastatin (Sortis®/Lipitor®), Pfizer’s profits shot to unimaginable
highs. In recent years we have seen an increasing concentration of
pharmaceutical companies, so that the market is making a transition to an oligopoly,
dominated by multinational corporations. Keep in mind that sales giants such
as GlaxoSmithKline (GSK), Novartis, Sanofi-Aventis, Bayer HealthCare, Bristol-
Myers Squibb or AstraZeneca have only originated in the last 10 years through
mergers. Companies such as Pfizer and Roche have significantly grown from acqui-
sitions. The role research plays for pharmaceutical companies is apparent when one
considers that typically 15–20% of turnover is invested in this area. It is certain that
the concentration of the pharmaceutical market is not complete. We can only wait and
see how the landscape continues to shift and adapt at an almost annual pace.
Drugs remain in the focal point of public interest. Whereas for decades it was the
physician alone who prescribed medication, today it is the patient, frightened by the
lay press or better informed through labeling or reputable literature, who wants to
take control of, or at least share in the decision making.
The issues can be illustrated by one example. Psychotropic pharmaceuticals
exert an impressive effect on personality and behavior. At least since the introduction
of Valium® (diazepam), these drugs have been in the media spotlight.
They are invaluable for the treatment of psychiatric illness. On the other hand, the
danger of misuse and addiction is particularly high. Some of these drugs are even
used as self-medication, without strict adherence to the indication guidelines.
Fluoxetine 1.26 (Prozac®, Fig. 1.3, Sect. 1.6) was introduced in 1988 by Eli Lilly,
and brought unequivocal progress in the treatment of depression. On this one
medication alone there are now over ten popular science books with controversial
content. Peter Kramer’s book Listening to Prozac takes an overall sympathetic
tone with the assertion that depressed patients feel better and more “in harmony”
with their personality after treatment with fluoxetine. This book was on the New
York Times bestsellers’ list for over 21 weeks. Peter Breggin’s book Talking Back
to Prozac criticized fluoxetine, the company Eli Lilly, and the U.S. Food and Drug
Administration (FDA) polemically. The side effects, risks, and particularly the
addictive potential were placed in the foreground. Both books contain correct
assertions, and both books lead to the wrong conclusions. Prozac® is a valuable
medicine for the treatment of clinically manifest depression; for the treatment
of mundane unhappiness or as a general stimulant, however, it is a drug with
many risks.
To make a risk–benefit analysis of a medication, it is important to consider not
only the desired effect but also the severity of the illness and the objective and
subjective side effects. In oncology one accepts even severe side effects for the
possibility of improving the patient’s condition. If an end-stage cancer patient is
refused an effective pain therapy because of the risk of addiction, then that must be
seen as malpractice. On the other hand many people handle highly potent medica-
tions recklessly. The misuse of antibiotics, the faith in the almighty power of
tranquilizers and antidepressants, or the chronic use of analgesics and laxatives
do more damage than good.
1.10 Synopsis
• Drug research can be divided into several sequential phases starting with empirical
observations of the uptake of natural products from food, the development of
in vitro test systems, increasing understanding of structures and modes of action,
to in vivo models and gene technology.
• It all started with traditional medicines. The first prescriptions date back to the
ancient Egyptians and to traditional Chinese medicine.
• Paracelsus founded scientific medical research and understood humans to be
a “chemical laboratory.” The ingredients of drugs were first held responsible for
healing effects.
• With the advent of organic chemistry, the first therapeutic principles based on
pure organic compounds became available. The great age of natural products
from plants and their active ingredients began.
• Systematic studies on animals began in the nineteenth century and can be
seen as a starting point for drug research. In vitro models are needed to
test large series of potentially active compounds, but animal models are required
to correlate the data and make predictions about the therapeutic effects
in humans.
• Our present life expectancy would not be possible without the successful
fight against infectious diseases. The broad application of antibiotics and
the spread of resistant pathogens, however, have led to situations in which
the best weapons against infectious diseases are becoming increasingly dull.
Research against widespread tropical diseases has been neglected, and
the currently increasing resistance to available medications represents a
worldwide problem.
• The elucidation of biological concepts, pathways, and regulatory cycles by
endogenous compounds has strongly stimulated drug research. Many developed
drugs have arisen from structural variations of neurotransmitters, hormones,
steroids, or natural substrates.
Bibliography
General Literature
Barondes SH (1993) Molecules and mental illness, Scientific American Library. W. H. Freeman
and Company, New York
Beddell CR (ed) (1992) The design of drugs to macromolecular targets. Wiley, Chichester
Fischer D, Breitenbach J (eds) (2003) Die Pharmaindustrie. Spektrum Akademischer Verlag,
Heidelberg/Berlin
Friedrich C, Müller-Jahncke W-D (2005) Von der Frühen Neuzeit bis zur Gegenwart, vol 2.
GOVI-Verlag, Eschborn
Higby G (ed) (1997) The inside story of medicine. A Symposium. Madison, Wi
Herrmann EC, Franke R (eds) (1995) Computer-aided drug design in industrial research, Ernst
Schering research foundation workshop 15. Springer, Berlin
Müller K (ed) (1995) De novo design. Persp Drug Discov Design, vol 3. Escom, Leiden
Müller-Jahncke W-D, Friedrich C (2005) Arzneimittelgeschichte. Wissenschaftliche
Verlagsgesellschaft, Stuttgart
Perun TJ, Propst CL (eds) (1989) Computer-aided drug design. Methods and applications. Marcel
Dekker, New York
Porter R, Teich M (eds) (1995) Drugs and narcotics in history. Cambridge
Restak RM (1994) Receptors. Bantam Books, New York
Schmitz R (1998) Geschichte der Pharmazie, vol 1. GOVI-Verlag, Eschborn
Special Literature
Beddell CR, Goodford PJ, Norrington FE et al (1976) Compounds designed to fit a site of known
structure in human hemoglobin. Br J Pharmac 57:201–209
Breggin PR, Breggin GR (1994) Talking back to Prozac. St. Martin’s Press, New York
Kramer P (1993) Listening to Prozac. Viking, New York
Mutschler E (1987) Arzneimittel – Erfolge, Misserfolge, Hoffnungen. Deutsche Apoth-Ztg
127:2025–2033
Newman DJ, Cragg GM (2007) Natural products as sources of new drugs over the last 25 years.
J Nat Prod 70:461–477
Noe CR, Bader A (1993) Facts are better than dreams. Chem Brit 29:126–128, Kekulés and
Loschmidts Formeln
2 In the Beginning, There Was Serendipity
“A lucky accident dropped the medicine into our hands”; this is how a publication on
August 14, 1886, from Arnold Cahn and Paul Hepp in the Centralblatt für Klinische
Medizin began. The history of drug research is punctuated by lucky accidents.
As a general rule, detailed knowledge of biological systems was absent. So it is not
surprising that the working hypotheses were often wrong, and the obtained results
differed from the expectations. The case of accidental success fell into the back-
ground over time. Today happenstance as a strategy has been replaced by the arduous
and ambitious goal of preparing drugs by using a straightforward approach. The only
exception to this is the kind of shotgun-style testing of large and diverse chemical
compound libraries, including microbial and plant extracts that is done with the goal
of finding new lead structures. In this case, serendipity is desired to find as large and
diverse a palette of lead structures as possible (▶ Chaps. 6, “The Classical Search for Lead
Structures” and ▶ 7, “Screening Technologies for Lead Structure Discovery”) with
potential for further optimization (▶ Chaps. 8, “Optimization of Lead Structures” and
▶ 9, “Designing Prodrugs”).
Back to Cahn and Hepp. What happened? There are several legends about this
lucky accident. The most plausible version is that the antipyretic effect of naph-
thalene, which was widely available from coal tar, was tested. The substance indeed
showed fever-lowering qualities. The responsible substance, however, was not naphthalene
but rather something entirely different: acetanilide 2.1 (Fig. 2.1). Further
experiments confirmed the efficacy. Shortly thereafter, the company Kalle & Co.
introduced it to the market with the name “Antifebrin.”
Fig. 2.1 By starting with the accidentally discovered acetanilide 2.1, Carl Duisberg planned
the synthesis of phenacetin 2.2 from p-nitrophenol 2.3. In contrast to the toxic metabolite 2.4, the
main metabolite, paracetamol (Amer. acetaminophen) 2.5 is well tolerated.
Phenacetin 2.2 (Fig. 2.1) was subsequently developed based upon a targeted
approach. At the time, Bayer in Elberfeld had 30 t of p-nitrophenol, a side product
from dye production, on their waste heap. The then 25-year-old Carl Duisberg, who
later became the chairman of Bayer Farbenfabriken AG and who also took a leading
role in the foundation of I.G. Farbenindustrie in 1924, wanted to use it for the
preparation of acetanilide as it could easily be reduced to p-aminophenol. The
known toxicity of phenol groups led to the design of p-ethoxyacetanilide 2.2 (phen-
acetin), which actually did have the desired qualities and served as an analgesic for
headaches and as an antipyretic for a century. Unfortunately its metabolite 2.4, which
still contains the ethoxy group, leads to the production of methemoglobin, an
oxidized form of the red blood pigment that is incapable of carrying oxygen.
Furthermore, chronic misuse by, for instance, taking kilogram quantities of phenac-
etin over a lifetime, leads to kidney damage. Paradoxically, the main metabolite of
phenacetin, p-hydroxyacetanilide 2.5 (Fig. 2.1, acetaminophen in American English,
or paracetamol in UK English) is actually responsible for the effect, and it is less toxic
and better tolerated. In the USA alone, paracetamol achieved over US$1.3 billion in
annual sales. This is even more than for acetylsalicylic acid.
In 1799 Humphry Davy (1778–1829) discovered the euphoric effect of nitrous oxide
(N2O), which was appropriately named “laughing gas.” The dentist Horace Wells
(1815–1848) saw a traveling theater production of a “sniffing party” with N2O in
1844 in which a participant suffered a flesh wound, apparently without pain. To
test this effect, Wells had one of his own teeth extracted, also without pain. He then
repeated the procedure on many people, with success. A public demonstration went
wrong though, and this drove him to suicide four years later. The same effect was
observed in 1842 by Crawford W. Long (1815–1878) with ether, but he did not report
it immediately. After administering ether, he was able to remove an ulcer from the
neck of a volunteer. William T. Morton (1819–1868) successfully carried out the first
ether anesthesia in the same hospital as Wells. Starting in 1847, chloroform was used
as an anesthetic. A few years later anesthesia became standard for surgical procedures,
a real blessing for the suffering of humanity.
Fig. 2.2 The anesthetic chloroform 2.6 is formed upon treatment of chloral hydrate 2.7 with base.
This reaction does not work in vivo, however. The active metabolite of 2.7 is trichloroethanol 2.8.
Fig. 2.3 The hypothetical “prodrug” of ethanol, urethane 2.9 (R = −CH2CH3), led to the
development of isoamylcarbamate 2.10 (R = −CH2CH2CH(CH3)2), which in turn led to the first
barbiturate, barbital 2.11.
Oskar Liebreich (1839–1908) wanted to develop a depot form of chloroform 2.6
in 1868. Because chloral hydrate can be cleaved with base in an aqueous milieu, he
hoped that this could also happen in the body. Chloral hydrate is in fact a sedative,
but this is because of its active metabolite, trichloroethanol 2.8 (Fig. 2.2), and not
because it releases chloroform.
In 1885 Oswald Schmiedeberg (1838–1921) tested urethane 2.9 (ethylcarbamate,
Fig. 2.3) because he thought that it would release ethanol in the organism. Urethane
itself is the active agent. Its optimization later led to isoamylcarbamate 2.10
(Hedonal ®, 1899). Based on this, open and cyclic carbamates and ureas were
investigated. In 1903 the first barbiturate sedative, barbital (Veronal ® ) resulted.
In the decades that followed, a wealth of better-tolerated barbiturates with a broader
pharmacokinetic spectrum was introduced.
Dyes and pharmaceuticals have stimulated each other. The first synthetic dye was
the result of a failed drug synthesis. In 1856 August Wilhelm von Hofmann assigned
the task of synthesizing quinine, an alkaloid used for treating malaria (▶ Sects. 1.1
and ▶ 3.2), to the then 17-year-old William Henry Perkin (1838–1907). With
only the molecular formula to go on, it was anticipated that the oxidation of an allyl-
substituted toluidine would deliver the desired product. Now that the structural
formula is known, we understand that this could not possibly have worked! Upon
oxidation of aniline that was contaminated with o- and p-toluidine, Perkin isolated
a dark precipitate. It contained a dye, mauveine 2.12 (Fig. 2.4), that colored silks
a brilliant mauve. Other dyes were prepared in rapid succession. The development
and later proliferation of the dye industry in England and Germany in the second
half of the nineteenth century can be traced back to this accidental discovery.

Fig. 2.4 An unsuccessful quinine synthesis founded the dye industry. The structures of many
organic compounds were still entirely unknown in the middle of the nineteenth century. The
attempt to prepare quinine via a simple route (upper reaction) could not have worked. The
oxidation of an impure aniline (below) gave mauveine 2.12 in 1856, which was used to dye silk
a brilliant mauve color. It was the first synthetic dye!

Toward the end of the nineteenth century, increasing competition and a difficult
economic situation in the dye market prompted an expansion into
industrial pharmaceutical research. In 1896 a pharmaceutical research laboratory
was founded in the 33-year-old Bayer Farbenfabrik. At that time innumerable
synthetic dyes were known, so it is not surprising that these substances
were tested for pharmacological effects.
Of all people, wine adulterators played an important role in the discovery of the
first synthetic laxative. To stop people from selling Trester wine (so-called
Nachwein) as a natural wine (Naturwein), in 1900 the dye phenolphthalein was
added as an easily detectable indicator. The Hungarian pharmacologist Zoltán
von Vámossy (1868–1953) investigated the effects of this compound. Back then,
the conventions of the pharmacologists were still rather primitive. The intravenous
application of 0.01–0.03 g to rabbits caused death “with loud shrieking, convul-
sions, and paralysis”. Vámossy then decided to feed 1–2 g to a rabbit and 5 g to a
4 kg lap dog. Because these oral doses were all well tolerated, Vámossy took 1.5 g
of phenolphthalein himself, and a friend took 1.0 g. The effects were explosive:
rumbling in the bowels, diarrhea, and for two additional days loose stools. It was
later established that 150–200 mg would have been a therapeutic dose.
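
The size of Vámossy's overdose can be made concrete with a little arithmetic. The following is only an illustrative sketch: the helper name `fold_over_range` is hypothetical, and the only inputs are the doses quoted above (1.5 g and 1.0 g taken, 150–200 mg later established as therapeutic).

```python
def fold_over_range(dose_mg: float, low_mg: float, high_mg: float) -> tuple[float, float]:
    """Return the (smallest, largest) multiple of a therapeutic range that a dose represents."""
    return dose_mg / high_mg, dose_mg / low_mg


# Therapeutic range later established for phenolphthalein (from the text), in mg.
THERAPEUTIC_LOW_MG, THERAPEUTIC_HIGH_MG = 150, 200

# Self-experiment doses reported in the text, converted from grams to mg.
SELF_EXPERIMENT_MG = {"Vámossy": 1500, "friend": 1000}

for person, dose in SELF_EXPERIMENT_MG.items():
    smallest, largest = fold_over_range(dose, THERAPEUTIC_LOW_MG, THERAPEUTIC_HIGH_MG)
    print(f"{person}: {smallest:.1f} to {largest:.1f} times the therapeutic dose")
```

On these figures Vámossy took between 7.5 and 10 times what would later be regarded as a therapeutic dose, which goes some way toward explaining the "explosive" outcome.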

Fig. 2.5 The laxative effect of phenolphthalein 2.13 became apparent while testing it as an additive for
cheap wines. The antisyphilis compound arsphenamine 2.14 (Salvarsan®, shown here as the monomer)
is simply an azo dye in which the —N═N— group was exchanged for an —As═As— group.

Fig. 2.6 The red azo dye sulfamidochrysoidine 2.15 is effective only after cleavage to the
colorless sulfanilamide 2.16, which is a bacterial antimetabolite of p-aminobenzoic acid 2.17.

An entire range of antibacterial and antiparasitic dyes are based on the work
of Robert Koch (1843–1910). He showed that bacteria and parasites accumulate
dyes specifically. Based on this, Paul Ehrlich (1854–1915) hoped to kill pathogens
selectively with suitably chosen dyes. In 1891 he cured two mild cases of malaria by
treating the patients with methylene blue. In the following years he tested hundreds of
different pigments, and thousands more analogues were later synthesized in the
laboratories of Bayer and Hoechst. In 1909 Paul Ehrlich pursued a rational design
when he exchanged both of the nitrogen atoms of the —N═N— group of an azo dye
for arsenic atoms. Arsphenamine 2.14 (Salvarsan®, Fig. 2.5) was the first effective
compound to treat syphilis and the first chemotherapeutic. It became an extraordinary
economic success for the company Hoechst.
The breakthrough with chemotherapeutics was made by the physician Gerhard
Domagk (1895–1964). At the age of 31, he took over the newly formed department
of experimental pathology at Bayer in Elberfeld. Azo dyes bearing sulfonamide
groups had already been designed by the chemists Fritz Mietzsch and Josef Klarer,
but they showed no in vitro activity; Domagk tested these substances in strepto-
cocci-infected mice. By using this model, he found the first active substances in
1932. Sulfamidochrysoidine 2.15 (Prontosil®, Fig. 2.6), a dark-red dye that could

Fig. 2.7 Fleming’s accidental discovery of the antibiotic effects of a fungus has delivered a wide
palette of penicillins 2.18 and cephalosporins 2.19, each with different R groups.

killed. This “experiment” led to the discovery of lysozyme, an enzyme that hydrolyzes
the bacterial cell wall. As a therapy it is unfortunately unsuitable because it does
not attack most human pathogens.

Chance and a fungus played an important role in the industrial synthesis of
corticosteroids. An important step in the synthesis is the introduction of an oxygen
atom at a particular position in the steroid scaffold, position 11. In 1952 chemists at
the Upjohn company searched for a soil bacterium that could hydroxylate a steroid in
this position. Just when they finally decided to set an agar plate on the windowsill
of the laboratory, Rhizopus arrhizus landed exactly there. This fungus transforms
progesterone (▶ Sect. 28.5) to 11α-hydroxyprogesterone. With its help the yield
could be increased to 50%. The closely related fungus Rhizopus nigricans even
afforded 90% of the desired product.

2.5 The Discovery of the Hallucinogenic Effect of LSD

In the 1930s Albert Hofmann (1906–2008) was working on the partial synthesis of
ergoline alkaloids at Sandoz. In 1938 he wanted to find a way to transfer the
respiratory and cardiovascular stimulatory effect of N,N-diethyl nicotinamide
2.20 onto this class of compounds. In analogy to 2.20, he prepared N,N-diethyl
lysergamide 2.21 (Fig. 2.8) with the hope of maintaining the stimulatory circulatory
and respiratory effects. Apart from the observation that the experimental animals were restless
under anesthesia, the substances showed no particular effect. Therefore they were
not pursued at first. Hofmann prepared the substances a second time five years
later because he wanted to investigate them more thoroughly. During the purification
and recrystallization he reported feeling “a strange agitation combined
with a slight dizziness.” At home he fell into “a not-unpleasant inebriated condition
that was characterized by extremely animated fantasies . . . after about 2 hours, the
condition went away.” Hofmann suspected a connection to the compounds he
had prepared and conducted a self-experiment with 0.25 mg a few days later. That
was the smallest dose with which he expected to see an effect. The outcome was
dramatic: the experience was the same as the first time, but much more intense. He
had a technician accompany him home on his bicycle. During the ride, his condition
took on a threatening form, and he fell into a severe crisis dominated by dizziness
and anxiety. The world took on a grotesque form. Later it was determined that
0.02–0.1 mg is enough to cause hallucinations. The substance was temporarily
marketed as Delysid® for use in psychotherapy and to treat anxiety and compulsive
disorders.

Fig. 2.8 N,N-Diethyl nicotinamide 2.20 is a centrally active derivative of nicotinic acid.
Hofmann wanted to synthesize a general stimulant analogously by preparing the N,N-diethyl
amide of lysergic acid. The result was the hallucinogen lysergic acid diethyl amide 2.21 (LSD).

The structure of the first calcium channel blocker, verapamil 2.22, was determined
by its synthesis (Fig. 2.9). Verapamil counteracts the effects of β-adrenergic
agonists, but it is not a β-blocker. It was only after its introduction to the market
that Albrecht Fleckenstein clarified its mode of action: it blocks the inward,
membrane-voltage-dependent flow of calcium ions through the calcium channels
(▶ Sect. 30.1) in heart and endothelial cells. The hypotensive effect was initially
seen as a side effect, but in the following years it became the most important reason
for use. The second group of therapeutically important calcium channel blockers,
represented by nifedipine 2.23, was inspired by a synthetic principle: a reaction from 1882,
the Hantzsch synthesis of dihydropyridines (Fig. 2.9). Remarkably, the pharmacological
experiments on nifedipine had to be carried out in a darkened room because
of its photosensitivity. Its development into a medicine despite this characteristic
deserves all the more credit.

Fig. 2.9 Ferdinand Dengel, a chemist at the former Knoll AG, wanted to prepare a cardiovascular
therapeutic by alkylating a nitrile. To avoid a double substitution, he started with the sterically
demanding isopropyl group. The result was the first calcium channel blocker, verapamil 2.22. The
isopropyl group is the optimal alkyl group because it stabilizes the biologically active conformation.
The synthetic route played an important role in the development of the second calcium
channel blocker, nifedipine 2.23. In 1948, Friedrich Bossert at Bayer was given the task of finding
new substances that dilate the coronary arteries. After years of work, in 1964 he turned to the easily
prepared dihydropyridines, which surprisingly displayed the desired effects. In this case, the
space-filling nitro group promotes the biologically active conformation (▶ Sect. 17.9).

2.7 Surprising Rearrangements Lead to Medicines

Fig. 2.10 Treatment of 2.24 with methylamine delivers the rearrangement product
chlordiazepoxide 2.25 (Librium®) instead of the expected one. This first test compound became the first of
the benzodiazepine class to be marketed.

Fig. 2.11 Instead of showing CNS activity, naftifine 2.27, prepared from spiro-compound 2.26, is an
antimycotic. A comparison with the more potent terbinafine 2.28 shows that the phenyl group can
advantageously be replaced with a tert-butylethynyl group.

introduced as naftifine 2.27, and later a more potent analogue, terbinafine 2.28
(Fig. 2.11), followed. Both substances showed a previously unknown mode of
action. They damage the fungal membrane by blocking ergosterol
biosynthesis at a very early step, through inhibition of the
enzyme squalene epoxidase.
The list of accidental discoveries, of which only a few are described here, could be
extended ad infinitum. A few more examples are briefly mentioned without
chemical formulae.
• Pethidine (▶ Sect. 3.3), the first fully synthetic opiate analgesic, was synthesized
in the 1930s as part of an anticonvulsives research program, by starting from
atropine.
• The suitability of antihistamines for the prevention of motion sickness was
discovered in Boston because of a treatment for a skin rash. A patient reported
that her motion sickness, which always occurred when riding a Boston streetcar,
went away. The “clinical trial” was carried out in 1947 on hundreds of sailors on
the transatlantic voyage of the USNS General Ballou.
• Haloperidol (▶ Sect. 3.3) was meant to be an analgesic, but it turned out to be
a neuroleptic.
• Imipramine is structurally very similar to the neuroleptic chlorpromazine
(▶ Sects. 1.6 and ▶ 8.5). Nonetheless it has the opposite effect and is an
antidepressant.
• Phenylbutazone was meant to be an additive used to dissolve the anti-
inflammatory aminophenazone. The substance turned out to be an anti-
inflammatory agent itself as did its metabolite, oxyphenbutazone.
• An attempt to isolate the causative agent of bipolar disorder from the urine
of patients afforded only uric acid. Because uric acid is poorly soluble,
lithium urate was tested. This led to the discovery of the antidepressant effect
of lithium salts.
• Clonidine was meant to be a local treatment for the runny nose that accompanies
the common cold. Instead of the expected effect, a profound hypotensive
effect was surprisingly found. Despite intensive structural variations, none of
clonidine’s analogues have surpassed its potency.
• Levamisole was developed as a broad-spectrum anthelmintic (anti-worm agent).
Instead, an immunomodulatory effect was accidentally found that now stands in
the therapeutic foreground.
• Praziquantel was originally meant to be an antidepressant. Because of its high
polarity, it cannot cross the blood–brain barrier. An outstanding suitability for
the treatment of the tropical disease bilharziosis was found through broad
biological testing.
• A chemist at Searle who was working on dipeptides licked his fingers while
flipping through the pages of a book. The sweet taste that he noticed turned out to
be caused by the artificial sweetener aspartame. Saccharin was also found in
a very similar way. In the case of cyclamate, a smoker noticed a sweet taste to his
cigarettes.
• Even today, when one would think that rational concepts dominate drug research,
the lucky accident still helps to make “blockbusters.” In the pursuit of a
phosphodiesterase inhibitor to hinder the degradation of cyclic guanosine
monophosphate (cGMP), an improved treatment for angina pectoris was not
found (▶ Sect. 25.8). Instead it became conspicuous that the male subjects in the
clinical trial did not want to give up the substance. After the side effect of
a stronger penile erection was recognized, the side effect became the main effect.
The compound sildenafil was marketed for the treatment of erectile dysfunction
as Viagra®, and developed into a billion-dollar product.

2.9 Where Would We Be Without Serendipity?

In the English-speaking world, a word is in use that is difficult to translate into other
languages: serendipity. This term, as an expression of a lucky accident, was coined
by Sir Horace Walpole in 1754. It is derived from a Persian fairytale in which three
princes of Serendip (earlier Ceylon, today Sri Lanka) have accidental and unex-
pected luck and make interesting discoveries entirely analogously to the many
examples in this chapter. Serendipity has played an exceedingly important role in
general in science, and especially in drug research. How would our modern
medicine supply look without all of these lucky accidents? By no means should
an arbitrary approach be taken, and an accidental discovery be counted upon. To the
contrary, chemists and pharmacologists have always developed concrete ideas as
to how and why particular structural variations on a lead compound should be
pursued. Some of these hypotheses were correct, and others were false. What the
successful researchers always had in common was that, when
a hypothesis failed or an unexpected result was found, they recognized the potential
consequences of the result, drew the correct conclusions, and did the right
things. The following chapters will show numerous examples of successful targeted
drug design in cases in which the correct working hypothesis was realized. The
search for a new active substance is, however, not a process that can be pushed
through by a purely technically oriented management. As a general rule, short-term
planning and bureaucratic control have only negative consequences. On the other
hand the search for new medicines requires a concerted effort from many different
groups of specialists, who must work together in a suitable organizational structure.
The subsequent preclinical and clinical development of a newly found active
substance is an extremely expensive and time-consuming process that must be
carefully planned, carried out, and controlled. For this, other instruments are
necessary than are used for drug discovery.
2.10 Synopsis
• The history of early drug research is full of lucky accidents. Many active
principles of substances were discovered by serendipity, but mostly success
can be attributed to an outstanding researcher with a “prepared mind” who
observed important effects.
• Dyes and pharmaceuticals, both developed in the early stages of the emerging
chemical industry, stimulated each other in very fruitful synergies.
• The discovery by Alexander Fleming of the first antibiotic principle, the peni-
cillins, as a defense mechanism of a fungus against bacteria, is one of the most
famous examples of a serendipitous discovery.
• The partial synthesis of ergoline alkaloids led to the discovery of the hallucinogenic
effects of LSD. In those days, researchers frequently conducted self-experiments
to first test active principles in humans.
• Unexpected synthetic products, surprising structural rearrangements, and ini-
tially false working hypotheses produced new, pharmacologically interesting
substances with surprising or outstanding qualities.
• Even today, when rational concepts and the understanding of mode-of-action
dominate drug research, the lucky accident can still help to make “blockbusters,”
as proven recently by the example of sildenafil (Viagra®).
3 Classical Drug Research
The hundred years of pharmaceutical research from 1880 to 1980 were punctuated
by trial and error, but also by elegant ideas and their translation into therapeutically
valuable principles. Many lead structures were found by accident (see ▶ Chap. 2,
“In the Beginning, There Was Serendipity”), others came from traditional medi-
cines or from biochemical concepts. In contrast to modern drug research, classical
design was the result of rather limited knowledge of the pathophysiology and
cellular and molecular etiology of disease, and was restricted to animal experi-
ments. Nonetheless, this phase, and particularly the last 50 years, has been excep-
tionally successful. The targeted fight against infectious diseases and the successful
treatment of many psychiatric and other important diseases can be attributed to this
period in drug development. With this came a significant increase in quality of life
and life expectancy. In the following sections, selected examples are used to
demonstrate different aspects of classical pharmaceutical research.
3.1 Aspirin: A Never-Ending Story

The history of acetylsalicylic acid (ASA, Aspirin®) reflects the progress of pharmaceutical
research like no other example. This is especially true for the elucidation
of the mode of action, and the newly found targeted therapies that resulted.
Willow bark extracts have been used since antiquity for the treatment of inflammation.
When Napoleon marched across Europe between 1806 and 1813, the bark was
even used as a substitute for cinchona bark (Sect. 3.2). Salicin 3.1, a glucoside of the
o-hydroxybenzyl alcohol saligenin, is responsible for the effect. Upon hydrolysis
and oxidation, the actual active compound, salicylic acid 3.2 (Fig. 3.1), is formed.
In 1897 the then 29-year-old Bayer chemist Felix Hoffmann began a systematic
search for derivatives of salicylic acid. His father, who suffered from severe
rheumatoid arthritis, had asked him to. High doses of salicylic acid caused unpleasant
gastric irritation and vomiting. Hoffmann prepared simple derivatives of
salicylic acid, and was successful within the year. On October 10, 1897 he synthesized
acetylsalicylic acid 3.3 (ASA, Fig. 3.1) for the first time in a pure form.

Fig. 3.1 Salicylic acid 3.2 is the oxidation and cleavage product of salicin 3.1, which is isolated
from willow bark. Acetylsalicylic acid (ASA) 3.3 is not simply a prodrug of salicylic acid, but
rather a drug with its own mode of action.

It was a lucky strike. Although ASA has a very short half-life in plasma, it is
strongly analgesic, antipyretic, and anti-inflammatory. The clinical trial was
carried out at the Diakonissenkrankenhaus in Halle an der Saale on 50 patients. On
February 1, 1899 Bayer registered ASA as Aspirin® (A for acetyl and “spir” for Spiraea,
another plant that contains salicylic acid) as a trademark under the number 36 433.
From then on it was sold as 1 g of powder in envelopes, and shortly thereafter as
tablets. Detractors alleged that it was only developed in tablet form so that Bayer
could emboss their famous Bayer cross onto it. Aspirin quickly gained a leading
place in drug therapy. One hundred years after its market introduction, 40,000 t of
ASA are produced and pressed into tablets every year, worldwide. At the end of
1994 the Bayer plant in Bitterfeld produced 400,000 Aspirin® tablets per hour, 3.5
billion per year. The importance that the trademark Aspirin had for Bayer became
clear in 1994 when the company paid US$1 billion to take over the self-medication
business from Sterling Winthrop, including the trademark rights for
Aspirin, which had been lost in 1918.
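
The production figures can be checked with simple arithmetic. The sketch below assumes continuous operation (24 hours a day, 365 days a year) for the Bitterfeld plant and, purely for illustration, a 500 mg tablet for the worldwide estimate; neither assumption is stated in the text, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: the plant runs around the clock all year


def tablets_per_year(tablets_per_hour: int) -> int:
    """Annual output of a plant running continuously at a fixed hourly rate."""
    return tablets_per_hour * HOURS_PER_YEAR


def tablets_from_mass(tonnes: float, mg_per_tablet: float) -> float:
    """Number of tablets that a mass of active ingredient would yield (1 t = 1e9 mg)."""
    return tonnes * 1e9 / mg_per_tablet


bitterfeld = tablets_per_year(400_000)
print(f"Bitterfeld: {bitterfeld:,} tablets per year")

# Assumption: 500 mg of ASA per tablet (not given in the text).
worldwide = tablets_from_mass(40_000, 500)
print(f"Worldwide: {worldwide:.1e} tablets per year")
```

Under the continuous-operation assumption, 400,000 tablets per hour amounts to about 3.5 billion per year, in agreement with the figure quoted above. With the assumed 500 mg tablet, 40,000 t of ASA would correspond to roughly 80 billion tablets annually.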
The Spanish philosopher José Ortega y Gasset called the previous century the
“age of Aspirin.” In his book The Revolt of the Masses, he wrote:
The ordinary person lives today more easily, comfortably and safely than the most powerful
of the past. Why should he care that he is not richer than others when the world is and
roads, trains, hotels, telegraphs, personal safety, and Aspirin ® are at his disposal.
The compliment was wonderful, but one must also consider that all of these
scientific discoveries are slightly more than 100 years old! ASA was considered to
be a prodrug of salicylic acid and a drug of unknown mode of action until John Robert
Vane (Nobel Prize 1982) and Sergio H. Ferreira discovered in 1971 that salicylic
acid and other nonsteroidal anti-inflammatory drugs inhibit prostaglandin G/H
synthase (cyclooxygenase, COX). COX, a ubiquitously present, membrane-bound
enzyme, transforms arachidonic acid 3.4 via a cyclic endoperoxide into PGH2 3.5,
which in turn is transformed into prostacyclin 3.6, thromboxane A2 3.7, and other
prostaglandins. Large quantities of prostaglandins are produced in inflamed tissue, so
that the inhibition of cyclooxygenase intervenes in the cause of the process
itself (Fig. 3.2).

Fig. 3.2 Arachidonic acid 3.4 undergoes an oxidative cyclization and a peroxidase reaction in the
prostaglandin biosynthesis to give the primary product PGH2 3.5. Finally prostacyclin synthase
transforms PGH2 into prostacyclin 3.6, which protects the gastric mucosa, dilates blood vessels,
and inhibits platelet (thrombocyte) aggregation. The platelet thromboxane synthase transforms
PGH2 into thromboxane A2 3.7, which promotes aggregation. ASA irreversibly inhibits cyclooxygenase.
By using low ASA doses, the thromboxane A2 synthesis in the platelets is more strongly
inhibited than the production of prostacyclin in the vascular walls.

ASA is in fact a metabolic precursor of salicylic acid. In contrast to other anti-
inflammatory drugs, including salicylic acid, however, it has an astonishing mode
of action (▶ Sect. 27.9). It has been known for some time that ASA selectively
acetylates the hydroxyl group of the amino acid serine 530 of cyclooxygenase. In
1995 the three-dimensional complex structure of a bromine analogue was solved
for the first time. This drives the point home that ASA, analogously to other COX
inhibitors, docks near the arachidonic acid binding site (▶ Sect. 27.9). Therefore
despite its relatively weak binding, ASA is in an outstanding position to acetylate
this serine. Serine 530 is not involved in the catalytic mechanism, but the additional
volume of the acetyl group impedes arachidonic acid’s entrance to the binding site
and therefore the synthesis of the prostaglandin precursors. A COX mutant that
carries an alanine instead of a serine at position 530, is enzymatically fully active
but is inhibited by all other anti-inflammatory compounds. This mutant is, as
expected, only weakly inhibited by ASA.
Stimulation for the continued research on nonsteroidal anti-inflammatory drugs
was generated by the discovery in 1991 of a second cyclooxygenase, COX-2. All
anti-inflammatory drugs until then were unselective, or they exerted their effect
overwhelmingly via COX-1 and only slightly via COX-2. The most important
side effect of ASA and other anti-inflammatory drugs is the gastrointestinal damage
that can occur at high doses; this results from the inhibition of the COX-1-
dependent synthesis of prostacyclin 3.6, which protects the gastric mucosa. In
contrast to the ubiquitously occurring COX-1, COX-2 is responsible for the fast
synthesis of prostaglandins in inflamed tissue. It has been possible to bring many
drugs to the market that are more than 1,000-fold more selective for COX-2 than
COX-1, for instance, 3.8 and 3.9 (Fig. 3.3 and ▶ Sect. 27.9).
But do not worry, Aspirin® will live forever. Its success is growing in another
market. Even at low doses ASA inhibits the synthesis of thromboxane A2 3.7, which
initiates the aggregation of platelets (thrombocytes). Because of its irreversible
inhibition of cyclooxygenase, and the inability of platelets to synthesize new
enzyme, a one-time contact with the substance is enough to suppress the synthesis
for the lifetime of the thrombocyte, that is, for about a week. In tissues other than
thrombocytes, the enzyme is replaced. Therefore the physiological antagonist
of thromboxane, the aggregation-inhibiting prostacyclin that is produced in the
walls of the vasculature, can be replenished (Fig. 3.2).
With regard to the condition of increased coagulation tendency, ASA adjusts the
biosynthesis away from the “bad” thromboxane in the direction of the “good”
prostacyclin. This effect is the basis for the therapeutic use of ASA in cases of
thrombosis susceptibility, for instance, before and after a heart attack or stroke.
Considering the now-known mechanism of the effect, the dose can be decreased
tenfold! That reduces the risk of gastrointestinal bleeding as a possible side effect.
Based on these observations, it is now recommended that ASA be taken
prophylactically before long-haul flights. The constricted sitting and lack of move-
ment coupled with the dry air and reduced pressure in the cabin lead to dehydration
and cause a “thickening” of the blood. The economy-class syndrome typically leads
to jet legs and increases the risk of embolism and vein thromboses. Here ASA can
offer a measure of protection. On the other hand, its use before surgical procedures
is not recommended. No surgeon wants an increased bleeding risk for the patient as
a result of diminished coagulation competence during a procedure.
Felix Hoffmann’s approach of using simple derivatization to improve the
tolerability of a substance led to a new therapeutic principle 100 years ago, the
value of which cannot be appreciated enough. The victory lap of ASA was, and
is, unstoppable. A German/Austrian study on 13,300 patients showed that ASA
therapy reduces the mortality of a heart attack by 17%, and the number of non-fatal
repeat attacks by 30%. On October 9, 1985 the US FDA, a normally
conservative organization, announced that the daily consumption of ASA can
reduce the chances of a recurrent heart attack by 20%, and in some high-risk
populations by even more than 50%. A further study on 22,000 physicians inves-
tigated the influence of regular ASA use on the chances of heart attack. Here, the
physicians were not the experimenters but the patients. The study was prematurely
ended when it was established that the control group had 18 lethal and 171 non-
lethal heart attacks, whereas the ASA-treated group had 5 lethal and 99 non-lethal
heart attacks: altogether a reduction of 50%. A study on 90,000 nurses showed the
same protective effect in women. The risk of a first heart attack was reduced by
30%. This marked the introduction of ASA as a “preventive medicine.”
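The reduction quoted for the physicians' study can be checked with a few lines of arithmetic. The following Python sketch assumes, hypothetically, that the ASA and control groups were of equal size (the text gives only the total of 22,000 participants), so that the ratio of event counts approximates the ratio of risks. The function name is illustrative only.

```python
def relative_reduction(control_events: int, treated_events: int) -> float:
    """Fractional reduction in events, assuming equally sized groups."""
    return 1.0 - treated_events / control_events


# Event counts as quoted in the text above.
control = {"lethal": 18, "non_lethal": 171}
treated = {"lethal": 5, "non_lethal": 99}

total_control = sum(control.values())
total_treated = sum(treated.values())

overall = relative_reduction(total_control, total_treated)
lethal = relative_reduction(control["lethal"], treated["lethal"])

print(f"all heart attacks:    {overall:.0%} fewer in the ASA group")
print(f"lethal heart attacks: {lethal:.0%} fewer in the ASA group")
```

Under this assumption the overall reduction comes out at about 45%, which the text rounds to 50%; the reduction in lethal heart attacks alone is considerably larger.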
A six-year study of 600,000 volunteers is worth an entry in the Guinness Book of
World Records. After the results were in, it appeared that ASA reduces the risk of
lethal colon cancer by 40%. Even this effect has a plausible explanation.
Malondialdehyde, a metabolite of prostaglandins, damages DNA. Mutations in
the so-called tumor-suppressor gene TP53 occur in human colon tumors particu-
larly frequently. This causes the cancer cells to lose the ability to regulate their
growth, and they grow uncontrollably. It could also be entirely different. As a result
of gastrointestinal bleeding, a possible side effect of ASA, the treated group was
probably more frequently examined than the control group. It is entirely conceiv-
able that the colon cancer was therefore found in an earlier stage in which it was
more easily operable.
Since 1992, Aspirin® has been available as a chewable tablet. In this form it is buffered
with calcium carbonate, the absorption is much faster, and the side effects are
reduced. ASA has had an unbelievable career, particularly if one considers that it
would never have had a chance under modern criteria to be approved. Its short
plasma half-life, the irreversible protein inhibition, and the high doses would have
met today’s exclusion criteria. A definitive end point in its hypothetical modern
development would be the teratogenicity seen in rats. A pathological result in
toxicity studies with this animal model will definitely lead to discontinuation,
because who would dare to wager that a teratogenic effect occurs in rodents but
not in humans? Aspirin® is really a never-ending story.
The therapy of malaria begins with the discovery of cinchona, around which there
are numerous legends. The nicest and most frequently cited version is that of the
fever-stricken Countess Cinchon, the wife of the Spanish viceroy in Lima, Peru,
who was healed by the doctor Juan de Vega in 1638. On the advice of the town
magistrate of Loja, quinquina, the “bark of barks” (hence the confusing name
“cinchona bark”), was brought in from 800 km away. The Countess was allegedly
healed and from then on distributed the powder herself. In the older works, the
cinchona bark was also called “Countess powder” or “Jesuit powder.” Perhaps it
was also true that the Indians, who were forced into compulsory service in the silver
mines by their Christian conquerors, chewed the bark to fight off shivering in the
cold. The clever Jesuits took note of these observations, and thought that chewing
the bark would also help with the shivering that comes from a malarial fever
episode. Cinchona then came back to Europe with the Jesuits.
Malaria, the remittent fever, is a widespread tropical and sub-tropical disease.
Because it is transmitted by the anopheles mosquito, it occurs particularly in
wetlands. Even the city of Buenos Aires (Span. “good airs”) was badly hit by malaria
(Ital. mala aria = “bad air”). Alexander the Great, the Gothic King Alaric, and
the German Emperors Otto II and Henry IV died of it. Even Albrecht Dürer
(1471–1528) apparently suffered from malaria. He sent his private physician
a drawing of himself in which he was wearing only a loincloth. His right hand is
over his spleen, with the added text do der gelb Fleck ist vnd mit dem Finger
drawff dewt, do ist mir we (there where the yellow spot is and where the finger
points, is where it hurts). In Europe malaria was still widespread until the middle of
the last century. In the north of Germany, the last epidemics were in the years 1896,
1918, and 1926.
The miasma, emissions from the ground, swamps, and corpses, were long seen
as the source of malaria and other epidemics. The Roman author Marcus Terentius
Varro (116–27 BC) suspected even then that small invisible organisms might be
responsible. Toward the end of the nineteenth century, the anopheles mosquito was
identified as the vector, and a plasmodium was recognized as the cause of malaria.
Around 1930 about 700 million people were infected, and in 2003 the number was
estimated to be 300–500 million. Up to 1.2 million people die every year, mostly
children under the age of 5, and many others retain permanent damage. Psychiatric
changes are also a consequence. The term “spleen” for eccentricity originally came
from the enlarged spleen that malaria causes.
It should not go unmentioned that heterozygous (i.e., genetically mixed) carriers
of sickle cell anemia are protected from malaria. This genetic form of anemia was
the first disease for which the molecular cause could be identified (▶ Sect. 12.12).
A single amino acid in the hemoglobin of those afflicted is mutated. This causes
hemoglobin to aggregate, and the erythrocyte shrivels. The malaria parasite
cannot adequately reproduce in such an erythrocyte. This partial protection from
malaria has abetted the spread of sickle cell anemia in malaria-endemic areas, but
not in other areas.
3.2 Malaria: Success and Failure 43
[Chemical structures: 3.10 Quinine, 3.11 Plasmoquine, 3.12 Mepacrine, 3.13 Chloroquine, 3.14 Mefloquine, 3.15 Amodiaquine]
Fig. 3.4 Simple synthetic analogues with antimalarial effects were derived from quinine 3.10.
Plasmoquine 3.11 still contains the methoxyquinoline ring of quinine, but it is in a different
position. The later-developed analogues mepacrine 3.12 and chloroquine 3.13 show strong
similarity to quinine. The newer derivatives mefloquine 3.14 and amodiaquine 3.15 are also
structurally closely related to quinine.
The active substance in the cinchona bark, the alkaloid quinine 3.10 (Fig. 3.4),
was isolated in 1820. Aside from its positive therapeutic effects, it also had
considerable side effects; nonetheless, up until a few years ago it was the most
important antimalarial, particularly for the parenteral treatment of severe malaria.
The first synthetic alternative, plasmoquine 3.11, became available in 1927, but it is
seldom used due to its side effects. The later-developed, more potent analogues
3.12–3.14 show a clear structural relationship to the lead structure quinine
(Fig. 3.4). It was only through the protection from malaria that the exploitation of
the colonies was possible. The World Health Organization, WHO, initiated a global
malaria-eradication program in 1955 mainly through the use of the insecticide
dichlorodiphenyltrichloroethane 3.16 (DDT, Fig. 3.5).
The success was overwhelming: the number of cases and fatalities was reduced
to practically zero (Table 3.1). In 1953 it was estimated that five million lives had
[Chemical structures: 3.16 DDT, 3.17 DDE]
Fig. 3.5 The insecticide p,p′-dichlorodiphenyltrichloroethane 3.16 (DDT) saved more human
lives than all antimalarials put together. The latest investigations show, though, that the
antiandrogenic effect of the main metabolite p,p′-dichlorodiphenyldichloroethylene 3.17
(DDE) is possibly the main culprit responsible for reproductive disorders found in animals,
perhaps including humans.
Table 3.1 Number of malaria cases in different countries before and after the introduction
of DDT 3.16 (Fig. 3.5). The numbers in parentheses are the years (Jukes TH (1974) Naturwiss
61:6–16)

Country       Cases of malaria (year)
              Before DDT               After DDT
Italy         411,602 (1946)           37 (1969)
Spain         19,644 (1950)            28 (1969)a
Yugoslavia    169,545 (1937)           15 (1969)a
Bulgaria      144,631 (1946)           10 (1969)a
Romania       338,198 (1948)           4 (1969)a
Turkey        1,188,969 (1950)         2,173 (1969)
India         75 million per year      750,000 (1969)
Sri Lanka     2.8 million (1946)       110 (1961)
                                       31 (1962)
                                       17 (1963)
                                       2.5 million (1968/1969)b
Taiwan        >1 million (1945)        9 (1969)
Venezuela     817,115 (1943)           800 (1958)
Mauritius     46,395 (1948)            17 (1969)

a Imported cases
b After DDT spraying was discontinued in 1963
been saved since 1942. In India alone the number of cases went from 75 million to
750,000, and the number of annual fatalities was reduced to 1,500. DDT has saved
more lives than all antimalarial drugs put together! The acute toxicity of DDT is
actually not a problem for mammals and humans. Unfortunately, it turned out that
DDT decomposes extremely slowly in the environment, and it becomes enriched as it moves
up the food chain, especially in birds and fish. It also accumulates in human
fat and in breast milk. The chronic toxicity comes from long-term retention of
one year or more, and that is a serious problem.
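To put the phrase “practically zero” into numbers, the decline in reported cases can be computed directly from Table 3.1. The short Python sketch below does this for four of the listed countries; the selection of rows and the helper name are choices made for illustration, and the before and after figures refer to different years, so the result is only a rough measure.

```python
# Case counts copied from Table 3.1: (before DDT, after DDT).
cases = {
    "Italy": (411_602, 37),
    "Turkey": (1_188_969, 2_173),
    "Venezuela": (817_115, 800),
    "Mauritius": (46_395, 17),
}


def percent_reduction(before: int, after: int) -> float:
    """Decline in reported cases as a percentage of the earlier figure."""
    return 100.0 * (before - after) / before


reductions = {
    country: percent_reduction(before, after)
    for country, (before, after) in cases.items()
}

for country, value in reductions.items():
    print(f"{country:<10} {value:.2f}% fewer cases")
```

For each of these four countries the decline exceeds 99% of the earlier case number.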
The moving book, Silent Spring by Rachel Carson, was published in 1962.
Despite warnings from experts, DDT spraying for mosquitoes was stopped
in Sri Lanka in 1963, and the number of malaria cases raced to 2.4 million by
1968/1969. By then it was too late to use DDT again because the mosquitoes had
become resistant, and this was certainly also partially due to the residual DDT that
remained in the environment in the intervening years.
Further investigations showed that a DDT metabolite, dichlorodiphenyldichloroethylene
3.17 (DDE, Fig. 3.5), has surprisingly strong antiandrogenic effects, that
is, it blocks the effects of male hormones. DDE may therefore be responsible for the
DDT-dependent reproductive and developmental disorders that are seen in some
species, perhaps also in humans. It is remarkable that the effect of this metabolite
was only discovered 50 years after DDT was introduced.
Not only did the mosquitoes become resistant to DDT; the parasite also became
resistant to the drugs. For this reason, the history of the chemotherapeutic develop-
ments for malaria has been a rollercoaster ride of new promising compounds, and
the more or less quick development and distribution of resistant parasites.
Chloroquine 3.13 was prepared in 1934 in the Bayer laboratories, but was
judged to be “too toxic”; it was “rediscovered” by the Americans and deployed
as a malaria therapeutic par excellence. Efficacious, well tolerated, and above all
else inexpensive to produce, it, along with the above-described mosquito extermi-
nation with DDT and landscaping measures, brought us within reach of a victory
over malaria. But resistant parasites emerged almost simultaneously and indepen-
dently from one another in the 1960s in different parts of Southeast Asia, Oceania,
and South America. They possessed a mutated transport protein in the membrane of
their digestive vacuole that recognizes chloroquine as a substrate. By using this protein they
were able to expel chloroquine from its site of action. In the meantime, resistant parasites
have spread throughout almost the entire geographic range of malaria. Chloroquine
lost its once phenomenal status for the therapy of malaria tropica. Since then,
researchers have sought a malaria therapeutic with qualities similar to those of
chloroquine, so far without success. The structurally related
amodiaquine 3.15 (Fig. 3.4) is in fact effective against weakly chloroquine-resistant
strains, but it is largely ineffective against highly resistant strains (especially in
Southeast Asia). Moreover, upon long-term use as a prophylaxis, it carries the risk
of irreversible liver damage or a life-threatening agranulocytosis. In the short term,
it appeared that the antifolate combination of sulfadoxine/pyrimethamine 3.18/3.19
(Fansidar®) could replace chloroquine (Fig. 3.4), but the first resistance occurred
much faster than with chloroquine. Starting from the point of origin in Southeast
Asia, the resistance has spread over the entire world.
The wars of the last century have also promoted the search for new antimalarial
drugs. Tremendous effort was made at the Walter Reed Army Institute of
Research in the USA. Over the course of 40 years, and particularly during WWII
and the Vietnam War, more than 250,000 substances were tested for an antimalarial
effect. Measured against the effort expended, the success was modest:
the two aryl amino alcohols halofantrine 3.20 and mefloquine 3.14, and the
8-aminoquinoline tafenoquine 3.21, which still has not completed clinical trials,
were the result of strenuous labor. After its introduction, halofantrine was with-
drawn from the market because it caused lethal arrhythmias (▶ Sect. 30.3). In
Southeast Asia the resistance to mefloquine developed so quickly that it can only
be used in combinations with artesunate 3.22. Because mefloquine has been used
sparingly due to its price, most of the parasite strains are still sensitive to it. For this
reason, today mefloquine is one of the most important malaria prophylactics for
Western tourists. Artesunate is a semisynthetic derivative of dihydroartemisinin
3.24, which is isolated from annual mugwort (Artemisia annua). Artemisinin’s
very unusual endoperoxide structure is essential for its activity. Intense research
is currently devoted to clarifying whether the iron(II)-catalyzed production
of radicals, which then react with the neighboring cell structures (iron-triggered
cluster bomb), or a specific calcium pump inhibition is its mode of action. At any
rate, these are the most potent medicines to fight malaria to date. Scientists consider
it to be only a matter of time until resistance to artemisinin develops.
The artemisinin-based combination therapy is the current recommendation of the
WHO. It is combined with whatever is available, even with substances that have
already established massive resistance. At the moment it is combined with the
Chinese-developed aryl amino alcohol lumefantrine 3.23, which is usually
still effective. The combinations of dihydroartemisinin/piperaquine 3.24/3.25
and artesunate/pyronaridine 3.22/3.26 (Fig. 3.6) are in advanced stages of clinical
trials.
Both combination partners were developed in China in the 1960s and 1980s,
respectively. They belong to the same class as chloroquine, even though
pyronaridine has an azaacridine instead of a quinoline scaffold. Resistance to both
of these compounds is already widespread in Southeast Asia. The combination of
dapsone/chlorproguanil (LapDap®) 3.27/3.28 was introduced only a few years ago
and both compounds are representatives of a long-used class: the antifolates. Even
in this case, the majority of Southeast Asian strains are already resistant. True
novelties in the mode of action are rare. In 1997 a very expensive combination
medication, atovaquone/proguanil 3.29/3.30 (Malarone®), was introduced that
synergistically inhibits the mitochondrial respiratory chain. Fosmidomycin 3.31, an
inhibitor of the parasite-specific mevalonate-independent isoprenoid synthesis
pathway, is currently in clinical trials. Increased efforts are necessary to find new
substances. Ideally, modes of action that have not been exploited yet should be
pursued. It is only in this way that we can be armed and ready for the time that
resistance to artemisinin spreads.
Research on the opiates has taught us how complex natural products can be
systematically simplified, and structurally abbreviated analogues can be prepared
that have the identical effect, but sometimes with even better specificity. It has also
shown that there is sometimes no obvious solution for a specific problem. The
separation of the analgesic and addictive qualities could not be achieved, or only
inadequately.
The narcotic, analgesic, and euphoric effects of opium, which is isolated from
poppies, have been known for at least 5,000 years. Opium was used for operations,
3.3 Morphine Analogues: A Molecule Cut to Pieces 47
[Chemical structures: 3.18 Sulfadoxine, 3.19 Pyrimethamine, 3.20 Halofantrine, 3.21 Tafenoquine, 3.22 Artesunate, 3.23 Lumefantrine, 3.24 Dihydroartemisinin, 3.25 Piperaquine, 3.26 Pyronaridine, 3.27 Dapsone, 3.28 Chlorproguanil, 3.29 Atovaquone, 3.30 Proguanil, 3.31 Fosmidomycin]
Fig. 3.6 The latest research in antimalarials shows that many products can be used in
combination. First Fansidar®, a combination of sulfadoxine 3.18 and pyrimethamine 3.19, was the
drug of choice. The development of rapid resistance has made this once-promising treatment
useless in the meantime. To date, hopes rest on the artemisinin derivatives 3.22 and 3.24. A new
beacon of hope is found in fosmidomycin 3.31, which has a novel mode of action in that it inhibits
the mevalonate-independent biosynthetic route to isoprenoids.
but is also a traditional drug of abuse. The importance of its abuse in the cultural
history of humanity is illustrated, among other places, in the “Opium Wars” of the
nineteenth century. In 1840 the Chinese wanted to stop the English from importing
opium and burned 20,000 cases of it; this led to a 2-year-long war between the two
countries.
[Chemical structures: 3.32 Morphine (R1 = R2 = H), 3.33 Codeine (R1 = Me, R2 = H), 3.34 Heroin (R1 = R2 = acetyl), 3.35 Naloxone]
Fig. 3.7 Morphine 3.32 and codeine 3.33 served as lead structures for heroin 3.34, which has
better CNS bioavailability, and naloxone 3.35, a morphine antagonist.
[Chemical structures: 3.36 Pethidine, 3.37 Atropine, 3.38 Levomethadone, 3.39 Etorphine]
Fig. 3.8 The architecture of morphine was dissected in many ways. The highly potent pethidine
3.36 was the first fully synthetic opiate analgesic, but it was discovered in the 1930s in a search for
spasmolytics by varying the structure of atropine 3.37. It is recognizable, however, that pethidine
retains the benzene ring of morphine as well as its piperidine ring. Levomethadone 3.38 is derived
from pethidine. The addition of another ring led to substances whose potency surpasses that of
morphine by orders of magnitude. Etorphine 3.39 is 2,000–10,000 times more potent than
morphine in animals. Since 1963 it has been used in African wildlife preserves to immobilize large
animals such as elephants and rhinoceroses.
and Gilg Tschudi. Morphine contains five rings: an aromatic benzene ring, two
unsaturated six-membered rings, the nitrogen-containing piperidine ring, and an
oxygen-containing five-membered ring. Systematic structural modifications had the
goal of simplifying the structure, for example, by opening one or more rings, or
removing them altogether.
In 1939, the potent analogue pethidine 3.36 (Fig. 3.8) was the first fully synthetic
analgesic, though it was originally based on the spasmolytic atropine 3.37. Despite
this, it is recognized to be a morphine analogue. In levomethadone 3.38 the
piperidine ring of pethidine is opened, an oxygen atom from the ester group is
removed, and another aromatic ring is added. There are thousands of other ana-
logues, some of which have been introduced to therapy. Aside from the decon-
struction of morphine, the construction of additional rings has surprisingly led to
analogues with more potency, for example, etorphine 3.39 (Fig. 3.8).
For a long time it was a complete mystery why our bodies would have extra
receptors for the contents of poppy plants, so-called opiate receptors. The solution
came with the discovery of the endogenous morphine-like peptides Met- and Leu-
enkephalin (▶ Sect. 10.2), which are the natural ligands for these receptors. The
discovery stimulated an intensive search for orally active peptides or
peptidomimetics devoid of addictive potential. The result of the work was more
3.4 Cocaine: Drug and Valuable Lead Structure 51
[Chemical structures: 3.40 Loperamide, 3.41 Haloperidol]
Fig. 3.9 Structural derivatives of morphine and its analogues have led to selective antidiarrheal
agents, loperamide 3.40, for instance, as well as neuroleptics such as haloperidol 3.41.
than sobering. Although orally active analogues were found, their addictive poten-
tial was identical to that of morphine and most morphine-derived analogues.
A few synthetic analogues have, in addition to agonistic activity, a weak antag-
onistic effect as well. The potential for these substances to be abused by addicts is
less than with the classical morphine analogues. Combination preparations of
agonists and antagonists are also available. With appropriate use, the analgesic
effect of the agonist dominates because it is present in excess. If the medicine is
injected intravenously, the more-strongly binding antagonist displaces the agonist,
and the desired euphoric effect never sets in.
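The competition between agonist and antagonist described above can be illustrated with the simple mass-action model for two ligands that bind reversibly to the same receptor site. The Python sketch below is only an illustration under stated assumptions: all concentrations and dissociation constants are hypothetical values chosen for clarity, not data for any real preparation, and the model says nothing about why the antagonist reaches the receptor in larger amounts after injection.

```python
def occupancies(agonist_nM: float, kd_agonist_nM: float,
                antagonist_nM: float, kd_antagonist_nM: float) -> tuple[float, float]:
    """Equilibrium fractions of receptors bound by agonist and antagonist
    when both compete for the same site (simple mass-action model)."""
    a = agonist_nM / kd_agonist_nM
    b = antagonist_nM / kd_antagonist_nM
    denominator = 1.0 + a + b
    return a / denominator, b / denominator


# Hypothetical dissociation constants: the antagonist binds tenfold more tightly.
KD_AGONIST_NM = 10.0
KD_ANTAGONIST_NM = 1.0

# Scenario 1 (assumed): agonist in large excess at the receptor.
agonist_excess = occupancies(100.0, KD_AGONIST_NM, 0.1, KD_ANTAGONIST_NM)

# Scenario 2 (assumed): antagonist reaches the receptor at a comparable concentration.
comparable = occupancies(100.0, KD_AGONIST_NM, 100.0, KD_ANTAGONIST_NM)

for label, (agonist, antagonist) in [("agonist in excess", agonist_excess),
                                     ("comparable levels", comparable)]:
    print(f"{label}: agonist {agonist:.0%}, antagonist {antagonist:.0%}")
```

With these invented numbers the agonist occupies most receptors in the first scenario, whereas the more tightly binding antagonist takes over in the second.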
The work on improved selectivity was also successful. Today cough
medicines and antidiarrheal medicines, for example, loperamide 3.40 (Fig. 3.9), are
available that have no central morphine-like effects. Loperamide is able to pass
through the blood–brain barrier but is immediately expelled by an active transporter.
When this transporter is inhibited, for instance, by co-administered
quinidine, loperamide also has classical opiate effects. Its structure unites elements
of pethidine 3.36 and levomethadone 3.38.
In this section only a few representatives of the many thousand structural
modifications of morphine can be discussed. The approach of Paul Janssen should
not remain unmentioned though; he started with pethidine 3.36 with the goal of
preparing a strong analgesic, but instead experienced an unexpected success in
another area. The result was the neuroleptic haloperidol 3.41 (Fig. 3.9), a drug for
the treatment of schizophrenia, the mode of action of which is mediated by an
antagonistic effect at the dopamine D2 receptor (▶ Sect. 29.4).
[Chemical structures: 3.42 Cocaine, 3.43 Benzocaine, 3.44 Lidocaine, 3.45 Mepivacaine]
Fig. 3.10 The local anesthetic effect of cocaine 3.42 was recognized early on. The independently
found lead structure benzocaine 3.43 and the basic moiety of cocaine were models for synthetic
local anesthetics. The structural relationship is clearly recognizable in lidocaine 3.44, which also
acts as an antiarrhythmic, and in mepivacaine 3.45.
The translation of the quite positive central effects of cocaine onto analogues
devoid of addictive potential is still in progress. The example of morphine leads
one to fear that this goal might not be attainable.
Coca leaves and cocaine 3.42 (Fig. 3.10) belong to the oldest known drugs.
Chewing dried coca leaves has a long tradition in Peru and Bolivia. In 1744
Garcilaso de la Vega wrote that coca “satisfies hunger, gives new energy to the
tired and exhausted, and lets the unhappy forget their troubles”. The Scottish
author, Robert Louis Stevenson (Treasure Island) wrote in his novella The Strange
Case of Dr. Jekyll and Mr. Hyde about a personality split that a doctor undergoes
under the influence of drugs; he wrote the first draft of this novella in only three
days and nights while under the influence of cocaine. In 1863 the French chemist
Angelo Mariani (1838–1914) patented a mixture of coca extract and wine as
Vin Mariani. It made him a rich man. In 1886 the pharmacist John S. Pemberton
developed a coca-containing stimulant and headache remedy that he named
Coca-Cola. He sold the rights in 1891 to a colleague, Asa G. Candler, who founded the
Coca-Cola Company one year later. Up until 1906 Coca-Cola indeed contained
a small amount of cocaine, but today it only contains the harmless stimulant
caffeine. Back at the turn of the last century, cocaine was already fashionable,
particularly in artistic circles. The Viennese psychiatrist Sigmund Freud (1856–
1939) experimented with cocaine intensively and rather uncritically. He considered
it to be a wonder drug, took it himself regularly, and recommended it generously for
use in therapy, for the treatment of stomach aches, and for a depressed mood. Later,
after massive criticism from his colleagues he turned away from it.
Cocaine causes the release of dopamine from its transporter (see ▶ Sect. 30.7).
Usually it is sniffed, occasionally it is intravenously injected, or it is mixed in drinks
or taken orally. Sniffing delivers it quickly to the brain where it displaces dopamine
3.5 H2 Antagonists: Ulcer Therapy Without Surgery 53
from the binding site of the transporter and this causes increased dopamine release
into the synaptic gap. The free base (crack), which is made by mixing the salt with sodium
bicarbonate, is absorbed very quickly through the lungs when smoked, and
causes a euphoria that is even more pronounced than when the salt (coke,
powder, snow) is sniffed. Because cocaine does not bind for long, the transporter
is quickly reloaded with dopamine. The same effect can be induced again after
a little while. Other cocaine analogues that bind for longer do not allow the effect to
be repeated for hours. Psychological dependence occurs very quickly, even after the
first use in the case of crack cocaine. Physical withdrawal symptoms, as seen with
heroin addicts, usually do not occur.
The credit for discovering the local anesthetic effect of cocaine does not go to
Freud but rather a friend of his, the ophthalmologist Carl Koller (1857–1944).
Freud had planned to investigate this effect but in 1884 he wanted to visit a friend of
his, Martha Bernays, in New York quickly first. Koller picked up on Freud’s
suggestion and carried out the decisive experiment on the eye in his absence. The
synthetic benzoic acid esters and anilides that were initially used as local anes-
thetics were not derived from cocaine 3.42, but rather from p-aminobenzoic acid
esters; benzocaine 3.43 was already in use in therapy in 1902. A structural rela-
tionship to cocaine is, however, easily seen in modern local anesthetics such as
lidocaine 3.44 and mepivacaine 3.45 (Fig. 3.10).
The history of the treatment of gastroduodenal ulcers is long and educational. Basic
research clarified the important mechanisms without providing a new drug. The
development of the therapy occurred in several phases. Again and again, better was
the enemy of good. In the beginning the treatment consisted of antacids, and later
anticholinergics. In severe cases only surgery helped. The H2 antagonists made the
breakthrough to purely pharmaceutical treatment. Now we are experiencing the victory
lap of the proton-pump inhibitors, which are used in different combinations with
antibiotics. Perhaps in the future this will be augmented or even replaced by a vaccine.
Gastric and duodenal ulcers are usually chronic illnesses and are widespread in
the general population. Any damage to the mucosal membrane of the stomach leads
to damage to the underlying cells through proteolytic enzymes and gastric acid.
Acetylcholine 3.46, histamine 3.47, and gastrin, a mixture of peptides with 17 (little
gastrin) and 34 (big gastrin) amino acids, stimulate the production of acid.
For decades the treatment of gastroduodenal ulcers was based on reducing the
amount of acid, for instance, with sodium bicarbonate, calcium carbonate, magne-
sium salts, and aluminum oxide hydrate. Advanced ulcers had to be treated surgi-
cally. Anticholinergics, antagonists of the acetylcholine receptor should, in
principle, have been suitable for ulcer treatment; however, unspecific antagonists
are out of the question because of their severe side effects. It was not until
pirenzepine 3.48 (Fig. 3.11), a selective so-called M1 antagonist, was developed
[Chemical structures: 3.46 Acetylcholine, 3.47 Histamine, 3.48 Pirenzepine, 3.49 Diphenhydramine]
Fig. 3.11 Acetylcholine 3.46 and histamine 3.47 stimulate the acid production in the stomach.
The acetylcholine receptor antagonist pirenzepine 3.48 was the first drug specifically for ulcer
therapy. Classical H1 antihistamines such as diphenhydramine 3.49 cannot antagonize histamine in
the stomach.
that this class could be used in therapy. Here the undesirable side effects of
unspecific anticholinergics are only apparent at relatively high doses.
The role of histamine in acid secretion was initially called into question because
the classical antihistamines, later defined as H1 antihistamines, did not reduce acid
secretion. These substances, for instance, diphenhydramine 3.49 (Fig. 3.11) antag-
onize histamine in the intestines, lungs, and in allergic reactions. Today a wide
palette of different histamine antagonists is available for the treatment of allergic
rhinitis (hay fever). The most important side effect, particularly with the older
substances, is a more or less pronounced sedation. Histamine-induced gastric acid
secretion, the effect on the heart, and uterus contractions are not inhibited by
diphenhydramine and other analogues. It was first suspected in 1948 that there
might be two different histamine receptors, H1 and H2. The H1-type is inhibited by
diphenhydramine, but the H2-type, which is responsible for the above-mentioned
effects is not. Both belong to the family of G protein-coupled receptors
(▶ Sect. 29.1). In the meantime two additional members of the family, the H3 and
H4 receptors, have been discovered. In 1964 James W. Black (1924–2010) at Smith
Kline & French in England began to develop three models to test the inhibition of
these other, H2-mediated effects of histamine. One was an in vivo model
measuring gastric perfusion in anesthetized rats, and two were in vitro models
evaluating the histamine-induced stimulation of a guinea pig heart and a rat uterus.
James Black later received not only the Nobel Prize, but was also knighted by Queen
Elizabeth II, two rather unusual honors for an industrial pharmaceutical researcher.
Despite all strategies that were available for the development of receptor antag-
onists, the search for an H2 antagonist was to no avail for years. The American
management in Philadelphia became impatient and wanted to end the program. The
first promising result came just in the nick of time. Because all lipophilic analogues
[Fig. 3.12 Chemical structures: 3.50 Nα-guanylhistamine, 3.51 S-(2-imidazol-4-yl-ethyl)isothiourea, 3.52 Burimamide, 3.53 Metiamide, 3.54 Cimetidine]
were ineffective, the earlier more polar compounds that had already been investi-
gated were reinvestigated. A compound that had already been synthesized in 1928
and determined to be ineffective, Nα-guanylhistamine 3.50 (Fig. 3.12), now
turned out to be a weak antagonist. The effect had been overlooked because 3.50
is actually a partial agonist and therefore shows a weak histamine-like effect.
Within a few days the first lead structure, S-(2-imidazol-4-yl-ethyl)isothiourea
3.51, with interesting activity was identified (Fig. 3.12).
The extension of the side chains of both of these compounds delivered partial
agonists, the antagonistic effects of which were too weak. Only in 1972, after
the hypothesis that a basic nitrogen in the side chain was necessary for activity
had been abandoned, did chain elongation and an N-methyl substitution
of the thiourea lead to the first clinically useful H2 antagonist, burimamide 3.52.
Human trials confirmed the efficacy, but the bioavailability was poor. The next
milestone was achieved with the development of metiamide 3.53 (Fig. 3.12), which
is 5–10-times more potent than burimamide and clinically demonstrated the desired
ulcer-healing effect. In some patients, however, a granulocytopenia occurred,
which is a dangerous suppression of the white blood cells and cannot be tolerated.
The medical need was great. It was not foreseeable whether the observed effect
was a result of H2 antagonism. We have the company to thank for taking on the risk
of further research. The sulfur atom of the thiourea was suspect. An isosteric
exchange for an oxygen atom delivered a less-potent urea analogue. Exchange for
an =NH group led back to a guanidine, which was strongly basic, but a potent
antagonist nonetheless. Substitution of the imino group with an NO2 or a CN group
led to less-basic analogues, the antagonistic potency of which was comparable to
metiamide. The somewhat more active of the two analogues, cimetidine 3.54
(Fig. 3.12), was clinically tested. It was introduced in England in November 1976
and in the USA in August 1977. By 1979 it was available in over
100 countries. Shortly thereafter, in 1983, cimetidine (Tagamet®) became the
most-prescribed drug in many countries, and its sales reached about US $1 billion.
Such a successful drug makes other companies restless. There are many cases
in the history of pharmaceutical research in which a major new concept was adapted
by developments in other companies. Other examples of this are the structurally
entirely different calcium channel blockers verapamil and nifedipine (▶ Sect. 2.6)
and the angiotensin-converting enzyme inhibitors captopril and enalapril
(▶ Sect. 25.4).
The same happened in the development of the H2 antagonists. Ulcer therapy had
been researched since 1960 at Allen and Hanburys, a subsidiary of Glaxo. One of
the first lead structures 3.55 (Fig. 3.13), an aminotetrazole with about the same
potency as burimamide, was systematically varied without success. Their research
management also wanted to stop the project to concentrate on the anticholinergics.
The breakthrough came upon replacement of the tetrazole ring with a furan. It was
not exactly an obvious idea because the previously synthesized compounds always
had at least one nitrogen atom in the ring. The —CH2SCH2CH2— chain was taken
over from metiamide 3.53, and a dimethylaminomethylene group was added to
improve water solubility; the result was AH 18665 3.56 (Fig. 3.13).
The chemists also synthesized a cyanoguanidine AH 18801 3.57 that was
comparable to cimetidine 3.54 in terms of potency. The substance’s characteristics
were, however, unsatisfactory: the melting point was too low. The nitrovinyl
analogue 3.58 brought success in this respect. It was synthesized and was an oil!
That was not seen as a prohibitive problem because it was redeemingly 10-times
more potent than cyanoguanidine 3.57 in the rat. Ranitidine 3.58 (Fig. 3.13) was
developed as a drug and introduced in 1981 as Zantac ® and Sostril ®. Compared to
cimetidine, ranitidine was 4–5-times more efficacious in humans and had the
advantage that it was more selective. In 1987 ranitidine overtook cimetidine. In
1994 with US $4 billion in sales, it became the most economically successful drug
in annual sales at that time. Within a few years, Glaxo was catapulted to the
pinnacle of the world rankings of pharmaceutical corporations. Glaxo used this
opportunity. The research of this company and its strategy in drug development
belong to “the finest” in the branch today. Through mergers and acquisitions with
competitors, Glaxo, “GSK” as it is known today, has become one of the largest
pharmaceutical corporations in the market.
In the meantime, an antitumor effect in colon, gastric, and renal cancer has been
reported for cimetidine. Apparently it suppresses the tumor-mediated interleukin-1-
induced selectin activation (▶ Sect. 31.3).
It is understandable from the chemical structure that cimetidine has a high
affinity for cytochrome P450 enzymes, particularly CYP 3A4 (▶ Sect. 27.6). As
a consequence, interactions with other drugs that depend on CYP 3A4 for metabolism
are common. What was first seen as an indispensable imidazole moiety in
3.54 blocks the catalytic iron center in the P450 enzymes. Ranitidine 3.58 carries
a furan ring in the same position and lacks the P450 inhibition. After cimetidine and
ranitidine, very few other drugs have made their way to the market. Nizatidine 3.59
and famotidine 3.60 contain a thiazole ring as a heterocycle (Fig. 3.13). In 3.60, the
electron-withdrawing group of the guanidine moiety is replaced by a sulfonamide
group.
[Figure: structures of 3.56 AH 18665 (X = S), 3.57 AH 18801 (X = N-CN), 3.58 ranitidine (X = CH-NO2), 3.59 nizatidine, 3.60 famotidine, and 3.61 omeprazole]
It is true even for the H2 blockers that good drugs are replaced by better ones.
After being stimulated to secrete acid, the cells use the enzyme H+/K+-ATPase
to pump protons out of the cell in exchange for potassium at the cost of
energy. If “the faucet is turned off ” at this step, not only the histamine-induced acid
production, but also the acetylcholine and gastrin-mediated acid production is
stopped. Omeprazole 3.61 is a prodrug that has been developed, which, upon
rearrangement, acts as an irreversible inhibitor of this proton pump (▶ Sect. 9.5).
The effect of omeprazole therefore lasts longer, and the reduction in acid secretion
is stronger than with the H2 antagonists. Gastric and duodenal ulcers heal more
quickly and reliably. These substances also hit it big. At the end of the last century,
Losec ®, Antra ® (both from Astra), and Prilosec ® (Merck & Co., USA) had com-
bined global sales of over US $6 billion despite the fact that they were introduced to
the market much later than ranitidine. The enantiomerically pure form
esomeprazole (Nexium ®) even reached US $7 billion in sales in 2007.
That is not even the end of the story. Although in principle it had been known
since 1983, the relevance of the bacterium Helicobacter pylori for the etiology of
ulcers was first discussed in 1994 at a conference of the US National Institutes of
Health (NIH). This bacterium infects a large portion of the population in childhood.
Frequently it is spread within a family; a kiss can be enough to infect someone. It
causes gastrointestinal damage in a portion of those infected, which can lead to an
ulcer. In the meantime it is held responsible not only for ulcers but also for at least
two different forms of gastric cancer. It survives assault by many antibacterial
agents as well as the acidic milieu of the stomach. It has a urease that releases
ammonia in its immediate vicinity, which in turn neutralizes the gastric acid.
The drugs of choice to treat such infections are combinations of H2 blockers,
proton pump inhibitors, and antibiotics. H. pylori seems to quickly develop antibiotic
resistance though. Since the beginning of 1995 the first animal model has been
available, a mouse with a sustained H. pylori infection; this should promote further
research in this important area. There is a vaccine currently in development.
A portion of the vaccinated patients mounted enough of an immune response to
defend themselves from the bacteria. For practical use however, its reliability must
be improved. Perhaps in the foreseeable future we will have an ulcer therapy that is
completely different, for instance, a swallowed vaccine that delivers life-long
protection. The revolution is in sight: a one-time treatment without repeated
gastroscopy. The patients will be delighted. Others will see this dramatic change
in therapy with mixed emotions.
3.6 Synopsis
• Even though the period of classical drug research was strongly governed by trial
and error, it has been exceptionally successful. Many leads were found by
accident or from traditional medicine, though limited knowledge of pathophys-
iology or molecular disease etiology was available.
• Acetylsalicylic acid or Aspirin ® is one of our oldest but also most prototypical
drugs. Originating from bark extracts and chemically modified to improve taste
and tolerance, it achieves its actual potency and mode of action by irreversibly
inhibiting cyclooxygenase.
• Since then two isoforms of cyclooxygenase have been characterized, one is
constitutively present, and the other is induced in inflamed tissue.
Acetylsalicylic acid inhibits both unselectively, giving rise to some undesirable
side effects.
• Due to irreversible inhibition of COX in platelets, Aspirin exerts an influence on
the ratio of synthesized thromboxane and prostacyclin, which has a depressing
4 Protein–Ligand Interactions as the Basis for Drug Action
In the early 1890s, Emil Fischer investigated the cleavage of glucosides with
different enzymes that only differed in the stereochemistry of the glycosidic carbon
atom. He noticed that particular glucosides could only be cleaved with one group of
enzymes. Other glucosides, on the other hand, could only be cleaved with another
group of enzymes. He drew the correct conclusions from his observation and in
1894 formulated them in an article in the Berichte der Deutschen Chemischen
Gesellschaft (Reports of the German Chemical Society):
The limited effect of enzymes on the glucosides can also be explained by the assumption that
a chemical process can be initiated only by those [enzymes] that have a similar geometric
construction that approximates that of the molecule [substrates]. To use a picture, I want to
say that enzymes and glucosides must fit together like a lock and key to be able to exert
a chemical effect upon one another. This idea has gained plausibility and value for
stereochemistry research after the phenomenon was transferred from the biological to the
chemical field.
Emil Fischer did not pursue this image any further, and later even complained that it
is often quoted out of context. The configuration of the sugars interested him, that of
the isomeric glucosides did not. He expressed a rather distanced attitude to purely
theoretical considerations. In 1912, he wrote in a letter “I myself take not so much
pleasure in theoretical things.” This is remarkably modest for a man who exerted such
a great influence with his image of a lock and key! Emil Fischer would have certainly
been pleased and proud if he had seen the results of the X-ray structural analysis of
protein–ligand complexes, for instance, of retinol (vitamin A) bound to the retinol-
binding protein, which is the transport protein for this molecule (Fig. 4.1).
Many binding sites can exceedingly specifically discriminate between analogues
that are chemically closely related. Even the smallest mishap must not occur in
protein biosynthesis. Friedrich Cramer more closely investigated the recognition
mechanism for the incorporation of the amino acids valine and isoleucine. These
amino acids differ in their side-chains only in that a methyl group is exchanged
for an ethyl group. The smaller valine residue should easily fit into the “lock” for
the adaptability of the protein is related to its function. Proteins often have to be
adequately flexible to fulfill their biological functions.
For the rational design of ligands, there are two fundamentally different starting
points that differ in the informational content of the system. Either the exact
three-dimensional structure of the binding site is known or it is unknown. In the first
case, the lock is known, and the key “only” has to be cut (▶ Chap. 20, “Protein
Modeling and Structure-Based Drug Design”). In the other case, the active and
inactive analogues represent the fitting and ill-fitting keys. It is through the comparison
of the keys and systematic variations that better-fitting keys can be designed
(▶ Chap. 17, “Pharmacophore Hypotheses and Molecular Comparisons”). In the
following section, the binding of a low-molecular-weight drug (“ligand”) and
a macromolecular receptor will be more precisely illuminated. These target
structures for drugs can be outside or inside the cell, or they can be embedded in the
cell membrane. Therefore we will briefly address the construction and function of the
cell membrane before the protein–ligand interaction is brought to the foreground.
The majority of biological processes in our body take place inside cells. These cells
are surrounded by a membrane that protects the cellular content from “leaking”.
The membrane also hinders undesirable xenobiotics from entering the cell and
mediates the contacts between cells. Membranes are also found within the cell,
where they form substructures (so-called compartments) and separate individual
cellular components from one another. In mammalian cells, the outer membrane is
made up of a lipid double-layer, in which proteins and cholesterol molecules are
embedded (Fig. 4.2). All molecules can move relatively freely, therefore it is called
a “fluid mosaic membrane.”
Lipid membranes of this type function as barriers for polar substances and as
permeable layers for non-polar molecules. The importance of membranes for the
transport and distribution of drugs is presented in detail in ▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”. Here,
only the important function that the lipid membrane has for the activity of
drug molecules is discussed. Membrane-embedded proteins belong to entirely
different classes. Among them are the membrane-anchored and membrane-residing
enzymes, the large class of G protein-coupled receptors (▶ Chap. 29, “Agonists and
Antagonists of Membrane-Bound Receptors”), ion channels, pores and transporters
(▶ Chap. 30, “Ligands for Channels, Pores, and Transporters”), and surface recep-
tors (▶ Chap. 31, “Ligands for Surface Receptors”).
Due to the phosphate and ethanolamine head groups, both of the outer layers
of the lipid double layer are very polar. The alkyl chains are found on the inside,
where the membrane is non-polar. Many drugs are also non-polar and accumulate
here in higher concentration than in solution. Amphiphilic (soap-like) molecules,
that is, substances that have both non-polar and polar character, arrange themselves
in the membrane so that the non-polar portion is on the inside (Fig. 4.2).
Fig. 4.2 Membranes from mammalian cells are constructed from a lipid double layer, in which
proteins (yellow) and individual cholesterol molecules (black) are embedded. The individual lipid
molecules (orange) point their polar groups to the exterior of the membrane, and their alkyl chains
to the interior. Therefore polar drugs (light blue) accumulate on the outside of the membrane.
Non-polar drugs (red) are enriched in the interior of the membrane. Amphiphilic drugs (violet) are
oriented into the membrane according to their structure. Despite this, all of the molecules can
move relatively freely. Therefore this is called a “fluid mosaic membrane”.
This orientation within the membrane plays a particularly important role when the
polar group is a positively charged nitrogen atom that can form additional electro-
static interactions with the phosphate group of the lipids.
In the meantime, this concept has been proven experimentally with numerous
independent methods. For many receptors it is accepted that the ligand binds at
a site in the protein that is only accessible from the inner layer of the membrane
(e.g., lipases, ▶ Sect. 23.7; or cyclooxygenases, ▶ Sect. 27.9). Therefore the
enrichment and arrangement of an active molecule in the membrane plays an
important role for the optimal approach to the binding site. If the molecule, on
the other hand, assumes an incorrect orientation, its docking is hindered.
4.3 The Binding Constant Ki Describes the Strength of Protein–Ligand Interactions
The binding of a ligand to its target protein is measurable. The extent of the binding
is characterized by the binding constant (Eq. 4.1). Strictly speaking, the dissociation
constant Kd is the reciprocal of the association constant Ka. With enzymes,
the so-called inhibition constant Ki is determined in a kinetic assay (▶ Sect. 7.2). At
low substrate concentration, it corresponds to the inhibitor concentration that is
necessary to reduce the rate of an enzyme reaction by one half. Although Ki is
therefore not exactly defined as a dissociation constant, the two quantities are
usually used interchangeably. In the following, the abbreviation Ki is used
in the same sense as a dissociation constant, which indicates the strength of the
interaction between protein and ligand. It is a thermodynamic equilibrium measure
that indicates what portion of the ligand is bound to the protein, on average. The law
of mass action can be expressed as:
$$K_i = \frac{[\text{ligand}]\,[\text{protein}]}{[\text{ligand--protein complex}]} \qquad (4.1)$$
Ki has the dimensions of a concentration with the units of mol/L (M). The smaller
the Ki value is, the more strongly the ligand binds to the protein. If the concentration
of the ligand is significantly lower than Ki, only a very small portion of the protein
molecules are occupied by ligand molecules. A biological effect like that of the
inhibition of an enzyme cannot be observed. If the ligand concentration is equivalent
to Ki, half of the available protein molecules are occupied by a ligand. The Gibbs free
energy can be derived from the binding constants by a thermodynamic relationship
(which is valid for equilibria under so-called standard conditions; Eq. 4.2):
$$\Delta G = RT \ln K_i \qquad (4.2)$$
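As a minimal numerical sketch of Eqs. 4.1 and 4.2, the following Python snippet computes the fraction of protein occupied at a given free ligand concentration and converts a Ki value into a free energy. The function names, the temperature of 298.15 K, the example Ki of 1 nM, and the simplification that the free ligand concentration is not noticeably depleted by binding are assumptions made for illustration only.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def fraction_bound(ligand_conc, ki):
    """Fraction of protein occupied by ligand (both arguments in mol/L).

    Rearranging Eq. 4.1, Ki = [L][P]/[LP], gives
    [LP] / ([P] + [LP]) = [L] / ([L] + Ki), with [L] the free ligand concentration.
    """
    return ligand_conc / (ligand_conc + ki)


def delta_g(ki, temperature=298.15):
    """Free energy of binding in kJ/mol from Eq. 4.2 (Ki in mol/L)."""
    return R * temperature * math.log(ki)


if __name__ == "__main__":
    ki = 1e-9  # hypothetical nanomolar inhibitor (assumption)
    for conc in (1e-11, 1e-9, 1e-7):
        print(f"[L] = {conc:.0e} M -> occupancy {fraction_bound(conc, ki):.3f}")
    print(f"Ki = {ki:.0e} M -> dG = {delta_g(ki):.1f} kJ/mol")
```

Under these assumptions, each tenfold decrease in Ki makes ΔG more negative by roughly 5.7 kJ/mol at room temperature.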
cold object and not the other way around? This has something to do with the tendency
of all natural processes to distribute energy evenly. The metal atoms in a hot metal
block vibrate very strongly around their resting positions. Therefore the piece of
metal is hot. Some vibrational degrees of freedom are strongly activated. If the cold
metal block is brought into contact with the hot metal, these vibrations are transmitted.
In the end, the metal atoms in both blocks vibrate around their resting positions, but on
average not as vigorously as the atoms in the hot block moved before. The sum of the
energy content has remained constant; it is, however, now distributed over many more
degrees of freedom. The system can be described as having gone into a more disor-
dered state (many more atoms are now vibrating on average than in the beginning).
This happens for all spontaneously occurring processes. The entropy, S, is used as
a measure to describe the uniform distribution or random disorder. To correctly
describe the formation of a protein–ligand complex (Eq. 4.3), we need not only the
enthalpy (ΔH) that is exchanged between the two binding partners; we must also
consider how the distribution over the degrees of freedom changes and whether the
system migrates into a more disordered state. Therefore the term free energy (ΔG)
has been introduced: it considers not only the energy balance of the process but
also the changes in entropy (TΔS) that reflect the spontaneous distribution of
energy over the degrees of freedom of the system. Spontaneously occurring
processes are characterized by a negative value for ΔG.
$$\Delta G = \Delta H - T\Delta S \qquad (4.3)$$
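Combining Eq. 4.2 with Eq. 4.3 makes explicit how enthalpy and entropy together determine the binding constant. The rearrangement below is only a sketch that follows from the two equations as written; it assumes that ΔH and ΔS refer to the same standard conditions as ΔG in Eq. 4.2.

```latex
\begin{aligned}
RT \ln K_i &= \Delta H - T\Delta S \\
\ln K_i &= \frac{\Delta H}{RT} - \frac{\Delta S}{R}
\end{aligned}
```

A more negative ΔH and a more positive ΔS both lower Ki, that is, both strengthen binding.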
4.4 Important Types of Protein–Ligand Interactions
Organic molecules can bind to proteins either by forming covalent chemical bonds
between ligand and protein or through non-covalent interactions. For example, a chemically
modified product of omeprazole reacts with its target protein and forms a covalent
bond (▶ Sect. 9.5). In this section, we want to limit ourselves to ligands that bind to
the protein by forming non-covalent interactions. For the following discussion, it
is helpful to classify protein–ligand interactions into different categories. The
different types of interactions are summarized in Fig. 4.3.
Hydrogen bonds (H-bonds) are very frequently observed between protein and
ligand. The proton-carrying partner in a biological system is usually an NH or
OH group, which is termed hydrogen-bond donor. The opposite group is an electro-
negative atom with a partial negative charge and is termed hydrogen-bond acceptor.
[Fig. 4.3: overview of the types of protein–ligand interactions, including hydrophobic interactions and cation–π interactions]
Fig. 4.4 Geometry of a hydrogen bond. The atoms N, H, and O adopt an almost linear orientation
to one another. The N···O distance is between 2.8 and 3.2 Å. The angle N–H···O is practically
always larger than 150°. A large variability is observed for the C═O···H angle. It is typically
between 100° and 180°.
the attractive interactions between the metal ion and the opposite charge in the
ligand that makes a decisive contribution to the affinity in these structures. Fur-
thermore, there are a few groups that are particularly well suited to forming
complexes with transition metals. Among these are the thiols RSH, hydroxamic
acids RCONHOH, acid groups, and many nitrogen-containing heterocycles.
Whether the charge can increase the affinity contribution of hydrogen bonds
depends strongly on the protonation state in which the involved functional groups
are found. Drugs are usually weak acids or bases, that is, they contain so-called
titratable groups (▶ Sect. 19.4). Whether these groups, for example, a carboxylic
acid, an acidic sulfonamide, or a nitrogen-containing heterocycle, can release
or accept a proton and transform into a charged state depends strongly on the pH.
The same can happen with functional groups of the acidic or basic amino acid
residues. Then these groups can form charge-assisted hydrogen bonds that provide
a higher contribution to the binding affinity (Sect. 4.8).
The pKa value is considered to estimate whether a group is in the protonated or
deprotonated state. It indicates at which pH value the two forms, which are in
equilibrium with one another, are present in equal amounts. The situation might
become even more complicated because the pKa value can be shifted by the local
environment. In a hydrophobic environment, adopting a charged state is less
favorable for acidic and basic groups, that is, a shift to less acidic or basic character
is the consequence. If an already-protonated, positively charged group in the ligand
faces an amino acid of the protein with the same charge, its protonation becomes
even more difficult to accomplish. The group therefore behaves less basic.
The opposite is the case when putative positively charged basic groups bind in
a protein environment with a negative charge. Here, the charged state is even more
easily formed, which corresponds to having stronger basic character. Entirely
analogous considerations result for acidic groups, just with opposite signs.
Here a positively charged protein environment shifts acidic groups toward higher
acidity, and a negatively charged environment makes them behave less
acidic. In this way the protein environment can induce a significant pKa shift
of the titratable groups of the ligand. Uncharged H-bonds can become
charge-supported contacts that significantly contribute to the binding affinity
(▶ Sect. 21.9). With the help of electrostatic calculations an attempt can be made
to estimate the pKa shift upon complex formation (▶ Sect. 15.4).
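The dependence of the protonation state on pH and pKa can be sketched with the Henderson–Hasselbalch relationship. The snippet below is an illustration only: the function name, the assumed pH of 7.4, the example pKa of 9.0 for a basic amine, and the shifts of two pKa units in either direction are hypothetical values, not data from this chapter.

```python
def fraction_protonated(ph, pka):
    """Fraction of a titratable group present in its protonated form.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    For a base the protonated form is the charged one; for an acid the
    deprotonated form (1 - fraction) carries the charge.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    ph = 7.4  # assumed physiological pH
    # Hypothetical basic amine in three environments (all pKa values are assumptions)
    cases = (
        ("in water", 9.0),
        ("next to a positive charge (less basic)", 7.0),
        ("next to a negative charge (more basic)", 11.0),
    )
    for label, pka in cases:
        share = fraction_protonated(ph, pka)
        print(f"{label}: pKa {pka:.1f} -> {share:.1%} protonated")
```

At pH equal to the pKa the function returns 0.5, which matches the definition of the pKa value given above.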
The results show that electrostatic interactions are the dominating energetic
factor. The interaction between a cation and an anion in a vacuum is more than
400 kJ/mol. This corresponds to the strength of a covalent bond! This amount is
enormous compared to the typical protein–ligand interactions in water that are
summarized in Sect. 4.4. The binding of an ion pair in the gas phase, therefore, is
much larger than the typical strength of a protein–ligand interaction in water.
Two water molecules bind to each other with 22 kJ/mol. This interaction is also
overwhelmingly of electrostatic nature in that the large dipole moment is respon-
sible for the strong binding. Interactions between small, non-polar molecules are
much weaker. Two methane molecules bind to each other with about 2 kJ/mol. This
is less than 10% of an H2O···H2O interaction. Correspondingly, methane boils at
about 112 K whereas water is a liquid at room temperature. The direct interactions between
polar groups are therefore orders of magnitude stronger than those between non-
polar groups.
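The order of magnitude of the ion-pair energy can be checked with Coulomb's law for two point charges. This is a rough sketch, not the calculation behind the values quoted above: the separation of 3 Å, the treatment of water as a uniform dielectric with a relative permittivity of about 80, and the function name are simplifying assumptions.

```python
# Coulomb constant e^2 * N_A / (4 * pi * eps0), expressed in kJ * Angstrom / mol
COULOMB_KJ_ANGSTROM_PER_MOL = 1389.35


def coulomb_energy(q1, q2, distance, rel_permittivity=1.0):
    """Interaction energy in kJ/mol of two point charges.

    q1 and q2 are given in units of the elementary charge, distance in Angstrom.
    rel_permittivity = 1 corresponds to vacuum.
    """
    return COULOMB_KJ_ANGSTROM_PER_MOL * q1 * q2 / (rel_permittivity * distance)


if __name__ == "__main__":
    r = 3.0  # assumed cation-anion distance in Angstrom
    print(f"vacuum:           {coulomb_energy(+1, -1, r):.0f} kJ/mol")
    print(f"water (eps ~ 80): {coulomb_energy(+1, -1, r, 80.0):.1f} kJ/mol")
```

Under these assumptions the attraction in vacuum exceeds 400 kJ/mol, whereas dielectric screening reduces it to a few kJ/mol.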
4.6 Blame It All on Water!
The data that were presented in the previous section could suggest that
protein–ligand interactions are mainly determined by H-bonds and ionic interactions.
All the more astonishing is the fact that the acetate ion, CH3COO−, does not
form a dimer with the guanidinium ion H2NC(═NH2+)NH2 in water. Likewise,
amides practically do not associate in water at all, even though hydrogen bonds
often occur between two amide groups in protein structures. How can that be? The
answer is: water is to blame for everything!
All biochemical reactions take place in water, and they only occur at all because
of this reason! The binding of a ligand to a protein occurs in an aqueous environ-
ment. At first, the “empty” binding pocket of the protein is filled with water. A few
water molecules form hydrogen bonds to the protein and are found in an energet-
ically favorable orientation. Other water molecules are in contact with lipophilic
areas on the protein surface and cannot build a perfect hydrogen-bond network.
The ligand is also solvated. When it diffuses into the binding pocket it displaces the
water molecules that are there and must additionally strip off its own solvation
shell. At the same time, the “cave” in which the ligand was situated in the water
phase collapses. Therefore not only are direct interactions between protein and
ligand formed, numerous H-bonds to water molecules are broken.
Fig. 4.6 The influence of water molecules on the strength of protein–ligand interactions.
(a) Upon formation of an H-bond between protein and ligand, water molecules must be displaced.
These form hydrogen bonds to both protein and ligand prior to complex formation. The balance of
hydrogen bonds, that is, the number of H-bonds before and after binding, remains unchanged.
(b) Upon formation of hydrophobic contacts, water molecules are released from an environment
that was unfavorable for them into the bulk water phase. The number of H-bonds increases.
each other (Fig. 4.6). Because previously H-bonds were possible neither to the
protein nor to the ligand, the total number of H-bonds now increases. Moreover, the
strength with which the water molecules were fixed in the binding pocket before
their release is decisive. If they were strongly fixed, the newly gained degrees of
translational freedom increase the disorder and therefore boost the entropy, which
is thermodynamically favorable for the free energy ΔG. If the displaced water
molecules were already severely disordered, their displacement causes very little
entropy gain. Newer findings have shown that the binding pocket does not need to
always be uniformly packed with water molecules. Narrow hydrophobic pockets in
particular are not perfectly solvated. This has consequences for the free energy
balance during binding because it is just this displacement of water molecules that
is decisive for the hydrophobic interactions.
Fig. 4.7 Illustration of the thermodynamic contribution to the free energy ΔG. Before binding, the
ligand can move freely; this gives rise to a certain translational and rotational entropy. Moreover,
the ligand is usually flexible, and adopts different conformations. Protein and ligand are solvated in
that H-bonds to water molecules are formed. Some water molecules are in loose contact with the
protein or the ligand without forming H-bonds. Translational and rotational degrees of freedom
are lost upon binding. The concomitant loss in entropy is unfavorable for the binding. Furthermore,
both the protein and the ligand must shed their water shells, which is also an unfavorable process for
the binding. The binding of the ligand leads to the formation of direct interactions to the protein and
it releases water molecules. Both of these are contributions that are favorable for the binding.
H-bonds are indicated by dashed lines and hydrophobic interactions by dotted lines.
The entropy gain occurs, as mentioned, because of the release of fixed water
molecules. This, however, is not the only entropy contribution that changes upon
ligand binding. The protein changes too. For example, many side chains in proteins
are distributed over multiple conformational states. Upon binding a ligand, this
distribution can change. According to the total balance, the entropy can increase or
decrease through this change. The same is true for the rotation of side chains,
especially methyl groups. If the rotational behavior changes, the total entropy of the
ligand-binding process is influenced. The picture can even be complicated in that
some areas of the protein transform into a more ordered state, and others become
less ordered. In this way the entropic contribution is partially compensated. It is
often assumed that the changes in the entropic portion of the binding within a series
of very similar ligands are the same. Then such contributions can be neglected in
a relative comparison of ligands. Unfortunately, this simplified picture has proven
to be a fallacy. Just such an example is introduced in Sect. 4.10.
Fig. 4.8 Numerous intermolecular hydrogen bonds are formed in the complex between tyrosyl-tRNA
synthetase and the substrate tyrosyl adenylate. The exchange of amino acid Tyr34 for Phe or
Tyr169 for Phe leads to the situation that in each case the hydrogen bond can no longer be formed.
This results in a loss of binding affinity.
Fig. 4.9 Fidarestat 4.1 (left) forms a hydrogen bond with its carboxamide group to the NH
function of Leu300 (blue). By exchanging Leu for Pro (red), the H-bond can no longer be formed.
This leads to a ΔΔG loss of 7.8 kJ/mol, which is paid for mostly by the enthalpy (ΔΔH: 6.9 kJ/mol).
The carboxamide group is missing in sorbinil 4.2 (right). The exchange leucine → proline
leaves the free energy of binding ΔΔG practically unchanged. Sorbinil, however, binds to the wild
type (leucine, blue) enthalpically more favorably and entropically less favorably than to the
proline mutant (red). An entrapped water molecule mediates an H-bond between sorbinil and
Leu300. This brings an enthalpic advantage to the wild type of about 5 kJ/mol. At the same time,
the entrapment of a water molecule is entropically disadvantageous for the wild type (−TΔΔS: 6 kJ/mol)
and compensates the enthalpic advantage.
[Fig. 4.10: plot of −lg Ki (vertical axis) against the number n of hydrogen bonds (horizontal axis)]
H-bond with the NH group of Leu300 is missing in sorbinil, the loss of the NH
function in the protein is hardly noticeable. This explains the practically unchanged
free energy of binding. Nonetheless, the sorbinil complexes with the wild-type
protein and the mutant are different. The binding with the wild type is enthalpically
more favorable, but it is entropically more expensive than with the mutant.
The crystal structure indicates that a water molecule mediates an H-bond between
the ether group of sorbinil and the NH function of Leu300 (Fig. 4.9). This yields
an enthalpy gain of about 5 kJ/mol. At the same time, the uptake of water
is entropically disfavored. This contribution of nearly 6 kJ/mol just compensates
the enthalpic gain so that there is practically no affinity gain in ΔG in the balance.
The proline mutant cannot form a water-mediated contact to sorbinil because of the
missing NH function. Therefore the enthalpic gain from the H-bond is lost. There is
also no entropic loss from capturing a water molecule.
The three-dimensional structures of a large number of protein–ligand complexes
have been elucidated. Many of these complexes contain hydrogen bonds between
the protein and ligand. The entire issue of the contribution of hydrogen bonds
to the binding affinity becomes apparent in Fig. 4.10. Here the experimentally
determined binding constants for 80 protein–ligand complexes are plotted against
the number of hydrogen bonds. The measured binding constants spread over a
considerable range for a given number of hydrogen bonds. The contribution of
a single H-bond is therefore by no means constant, but rather it varies significantly.
The contribution of an H-bond can even reduce the binding affinity due to an
unfavorable desolvation effect. If two ligands are compared that are only different
4.8 Contribution of a Hydrogen Bond 79
Table 4.3 Binding constants Ki for the thermolysin inhibitors 4.3, which contain either
a phosphonamide (X = ─NH─), a phosphonate (X = ─O─), or a phosphinate (X = ─CH2─)
group. The phosphonamide group ─PO2NH─ complexes the zinc ion and simultaneously forms an
H-bond with Ala113
[Structure 4.3: phosphorus-containing inhibitor with the variable groups X and R; the phosphoryl oxygens coordinate Zn2+, and X faces the carbonyl oxygen of Ala113]
Binding constant Ki in µM
R          X = ─NH─    X = ─O─    X = ─CH2─
OH         0.76        660        1.4
Gly-OH     0.27        230        0.3
Phe-OH     0.08        53         0.07
Ala-OH     0.02        13         0.02
Leu-OH     0.01        9          0.01
in the functional group that forms the H-bonds with the protein, the affinity can
increase, remain the same, or even decrease.
An impressive example of the importance of hydrogen bonds is displayed by the
inhibitors 4.3 of the metalloprotease thermolysin, which were synthesized in the
research group of Paul Bartlett. There, a phosphonamide ─PO2HN─ was replaced
by a phosphinate ─PO2CH2─ or a phosphonate ─PO2O─. The results of these
exchanges are summarized in Table 4.3. Although the X-ray structure shows that
the NH groups form an H-bond, it can nonetheless be replaced with a CH2 group
without loss of binding affinity. This result is understandable if we consider the
number of hydrogen bonds before and after ligand binding for the phosphonamide
and for the phosphinate, as we did in Fig. 4.6. In both cases the number of H-bonds
is unchanged. If the NH group is replaced by an oxygen atom, the binding affinity
decreases by a factor of 1,000. In water, the oxygen atom that is in the place of the
NH group can form a hydrogen bond to the bulk water. In the protein–ligand
complex of the phosphonate ─PO2O─, the electronegative oxygen atom is found
exactly opposite the oxygen of the carbonyl group of Ala113. Two acceptor groups
are directly facing one another. A hydrogen bond cannot be formed here. The
inventory of hydrogen bonds remains unbalanced. Furthermore, the two groups
repel one another, which results in poorer binding. A similar case is
illustrated in Table 4.4. Here the binding affinities of three thrombin inhibitors
4.4 that were synthesized at Eli Lilly are compared with each other. The amine
(X = ─NH─) can form an H-bond with Gly219 and binds the most strongly.
The ether (X = ─O─) binds 5,000 times more weakly because of an electrostatic repulsion
Table 4.4 Binding of 4.4 to the serine proteases thrombin and trypsin
[Structure 4.4 with the variable group X = ─NH─, ─O─, or ─CH2─]
between the ether oxygen atom and the carbonyl group of the protein. The aliphatic
compound (X = ─CH2─) shows remarkable binding compared to X = ─NH─ that
is merely reduced by a factor of eight (thrombin) and two (trypsin).
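The affinity ratios discussed above can be translated into free-energy differences with the standard relation ΔΔG = RT ln(Ki,a/Ki,b). The following minimal sketch does this for the Ki values of Table 4.3. The temperature of 298.15 K and the function name are assumptions made for illustration; they are not taken from the original study.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def ddg_kj_per_mol(ki_weaker, ki_stronger, temperature_k=298.15):
    """Free-energy penalty (kJ/mol) of the weaker binder relative to the stronger one.

    Both constants must be given in the same unit; only their ratio enters.
    The default temperature is an assumed value, not one stated in the text.
    """
    return R * temperature_k * math.log(ki_weaker / ki_stronger) / 1000.0


# Ki values from Table 4.3, ordered as (X = NH, X = O, X = CH2)
table_4_3 = {
    "OH": (0.76, 660, 1.4),
    "Gly-OH": (0.27, 230, 0.3),
    "Phe-OH": (0.08, 53, 0.07),
    "Ala-OH": (0.02, 13, 0.02),
    "Leu-OH": (0.01, 9, 0.01),
}

for r_group, (ki_nh, ki_o, ki_ch2) in table_4_3.items():
    penalty_o = ddg_kj_per_mol(ki_o, ki_nh)      # NH replaced by O
    penalty_ch2 = ddg_kj_per_mol(ki_ch2, ki_nh)  # NH replaced by CH2
    print(f"{r_group:7s} NH->O: {penalty_o:5.1f} kJ/mol   NH->CH2: {penalty_ch2:5.1f} kJ/mol")
```

Under these assumptions the exchange of NH for O costs roughly 16 to 17 kJ/mol in every row, whereas the exchange of NH for CH2 changes the free energy by less than 2 kJ/mol.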
We have seen that the direct attractive forces between lipophilic groups are
considerably smaller than those between polar groups. Hydrophobic interactions
are mainly based on the displacement of water molecules. It has been shown in
many experiments that their contribution to the binding affinity is, as a first approx-
imation, proportional to the size of lipophilic surface that is buried upon ligand
binding and therefore no longer accessible to water. Typically the
contribution is found to be between 50 and 200 J/mol per Å² of lipophilic
contact area. An example of this is retinol. It binds to the retinol-binding protein
(Fig. 4.1) with a binding constant of 190 nM, exclusively through lipophilic
contacts. This corresponds to a free energy of 39.8 kJ/mol. As a result of the
binding, a lipophilic area of 250 Å² is buried. The contribution per Å² amounts
to 39,800/250 = 159 J/mol Å².
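The retinol estimate can be retraced numerically. The sketch below uses the standard relation ΔG = RT ln Kd. The text does not state the temperature, so two assumed values (25 °C and 37 °C) are tried; the function name is likewise only illustrative.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def binding_free_energy_kj(kd_molar, temperature_k):
    """Gibbs free energy of binding (kJ/mol) from a dissociation constant in mol/L."""
    return R * temperature_k * math.log(kd_molar) / 1000.0


kd_retinol = 190e-9   # 190 nM, from the text
buried_area = 250.0   # buried lipophilic surface in square angstroms, from the text

for temperature_k in (298.15, 310.15):  # assumed temperatures
    dg = binding_free_energy_kj(kd_retinol, temperature_k)
    per_area = -dg * 1000.0 / buried_area  # J/mol per square angstrom
    print(f"T = {temperature_k:.2f} K: dG = {dg:.1f} kJ/mol, {per_area:.0f} J/mol per A^2")
```

Both assumed temperatures give a free energy of about 38 to 40 kJ/mol in magnitude, in line with the 39.8 kJ/mol quoted above, and a contribution per unit area that falls inside the stated range of 50 to 200 J/mol per Å².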
Six HIV protease inhibitors (▶ Sect. 24.6) are listed in Fig. 4.11. During the
course of a lead structure optimization, the hydrophobic surface of 4.5 was enlarged
by adding hydrophobic groups. It could be confirmed crystallographically that the
binding mode did not change. If the changes in the molecular volume in this series
are plotted against the affinity, a linear relationship is obtained. The binding affinity
increases by 65 J/mol Å².
In many cases, the hydrophobic interactions are a dominant contribution to the
free energy of binding. In Fig. 4.12 the lipophilic surface area that is buried upon
4.10 Binding and Mobility: Compensation of Enthalpy and Entropy 81
[Fig. 4.11: scaffold 4.5 with the variable substituent X on the N-benzyl group; the substituents shown include X = H, Cl, CH3, CF3, Br, and I]
Fig. 4.11 The scaffold of the HIV protease inhibitor 4.5 was enlarged during the course of a lead
structure optimization by adding hydrophobic groups to the aromatic N-benzyl group. An
unchanged binding mode was evidenced crystallographically. The additional molecular volume
improved the binding affinity in a linear manner by about 65 J/mol Å².
Fig. 4.12 In analogy to Fig. 4.10, a plot of the binding constants Ki of the 80
crystallographically investigated protein–ligand complexes against the buried
hydrophobic surface area shows that there is no simple function for this measure
either.
[Plot axes: −lg Ki (0 to 16) against the buried hydrophobic surface area X/Å² (0 to 400)]
According to Eq. 4.3, enthalpy and entropy are in a close physical relationship, and
their sum results in the free energy of binding. If the formation of protein–ligand
[Fig. 4.13: structures of the thrombin inhibitors 4.6, 4.7, 4.8, and 4.9, with thermodynamic data for 4.6 and 4.7]
Compound   ΔG (kJ/mol)   ΔH (kJ/mol)   −TΔS (kJ/mol)
4.6        −31.7         −13.6         −18.1
4.7        −46.7         −40.6         −6.1
Fig. 4.13 Replacement of a phenyl group in 4.6 by a para-benzamidinophenyl group in 4.7 leads to
a significant improvement in the affinity of this thrombin inhibitor, which is largely because of an
enthalpic gain. This is because of the formation of a salt bridge to Asp189 (▶ Sect. 23.3). The
homologous ligands 4.8 and 4.9 bind equally strongly to thrombin, but the binding affinity is divided
into the enthalpic and entropic contributions entirely differently. Compound 4.9 has a significantly
higher residual mobility in the binding pocket than 4.8, which results in an entropic advantage for
this derivative, even though the poorer contacts to the protein cause an enthalpic disadvantage.
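The partitioning into enthalpy and entropy can be checked with simple arithmetic, since ΔG = ΔH − TΔS. The sketch below uses the two data sets printed in Fig. 4.13. Their assignment to 4.6 and 4.7 is an assumption inferred from the caption (4.7 is the stronger, enthalpically favored binder), and the dictionary keys and function names are only illustrative.

```python
# Thermodynamic data printed in Fig. 4.13, all in kJ/mol.
# Assumption: the weaker binder is 4.6 and the stronger one is 4.7.
profiles = {
    "4.6": {"dG": -31.7, "dH": -13.6, "minus_TdS": -18.1},
    "4.7": {"dG": -46.7, "dH": -40.6, "minus_TdS": -6.1},
}


def residual(profile):
    """Deviation from dG = dH + (-T dS); should be zero within rounding."""
    return profile["dG"] - (profile["dH"] + profile["minus_TdS"])


def difference(reference, improved):
    """Changes in dG, dH, and -T dS on going from reference to improved ligand."""
    return {key: improved[key] - reference[key] for key in ("dG", "dH", "minus_TdS")}


for name, profile in profiles.items():
    print(name, "residual:", round(residual(profile), 2))

delta = difference(profiles["4.6"], profiles["4.7"])
print({key: round(value, 1) for key, value in delta.items()})
```

With this assignment the affinity gain of about 15 kJ/mol results from an enthalpic gain of about 27 kJ/mol that is partly offset by an entropic penalty of about 12 kJ/mol.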
[Fig. 4.14: thermodynamic cycle of the thrombin inhibitors 4.10 to 4.13]
4.10 (ΔG = −19.9 kJ/mol) → 4.12 (ΔG = −23.0 kJ/mol): ΔΔG = −3.1 kJ/mol
4.12 → 4.11: ΔΔG = −15.5 kJ/mol
4.10 → 4.13: ΔΔG = −9.6 kJ/mol
4.13 → 4.11: ΔΔG = −9.0 kJ/mol
4.10 → 4.11 (diagonal): ΔΔG = −18.6 kJ/mol
Fig. 4.14 Optimization of the thrombin inhibitor 4.10 to 4.11 increases affinity by ΔΔG = 18.6
kJ/mol. This is achieved by increasing the size of the hydrophobic side chain (red) from n-propyl
to phenyl and attaching an amino group (blue). The changes can also be accomplished in stepwise
fashion. Increasing the hydrophobic surface to give 4.12 enhances affinity by only 3.1 kJ/mol; the major
contribution of 15.5 kJ/mol is provided by the subsequently introduced amino
group. Adding the amino group first, to give 4.13, contributes 9.6 kJ/mol, and the subsequent
enlargement of the hydrophobic substituent increases affinity by another 9 kJ/mol. The explanation
for the lack of additivity is found in the complex interplay of residual mobility, desolvation, and
strength of the formed enthalpic interactions.
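The bookkeeping behind Fig. 4.14 can be written out as a thermodynamic cycle. The sketch below only re-adds the ΔΔG values given in the figure; the variable and function names are illustrative assumptions.

```python
# Stepwise changes in binding free energy from Fig. 4.14 (kJ/mol)
steps = {
    ("4.10", "4.12"): -3.1,   # enlarge hydrophobic group, no amino group present
    ("4.12", "4.11"): -15.5,  # add amino group, large hydrophobic group present
    ("4.10", "4.13"): -9.6,   # add amino group, small hydrophobic group present
    ("4.13", "4.11"): -9.0,   # enlarge hydrophobic group, amino group present
}


def path_sum(path):
    """Sum the free-energy changes along a sequence of compounds."""
    return sum(steps[(a, b)] for a, b in zip(path, path[1:]))


via_412 = path_sum(["4.10", "4.12", "4.11"])
via_413 = path_sum(["4.10", "4.13", "4.11"])

# Non-additivity: the same modification is worth a different amount
# depending on whether the other modification is already in place.
coupling_amino = steps[("4.12", "4.11")] - steps[("4.10", "4.13")]
coupling_hydrophobic = steps[("4.13", "4.11")] - steps[("4.10", "4.12")]

print(round(via_412, 1), round(via_413, 1))
print(round(coupling_amino, 1), round(coupling_hydrophobic, 1))
```

Both routes add up to the same total of 18.6 kJ/mol, as they must in a closed cycle, while each modification is worth about 5.9 kJ/mol more when the other one is already present.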
This chapter should not give the impression that a quantitative prediction about the
strength of protein–ligand interactions is impossible. Despite the complex character
of protein–ligand interactions, some simple rules should always be consulted first.
4.12 Synopsis
that allow both binding partners to change conformations and mutually adapt to
one another to optimally interact.
• The cells are surrounded by a lipid double-layer membrane with polar
head groups on the exterior and hydrophobic alkyl chains in the interior.
This membrane is a barrier for polar substances, but sufficiently lipophilic
compounds can penetrate and even pass through the membrane.
• The strength of protein–ligand interactions is measured by the binding constant,
which quantifies the stability of a protein–ligand complex as a dissociation
constant according to the law of mass action for complex formation.
• The binding constant is logarithmically related to the Gibbs free energy of
binding. The free energy is composed of an enthalpic and entropic contribution.
The enthalpic part summarizes all terms that relate to the interaction energy
of the binding partners. The entropic part considers the order of the system and
how its energy content is distributed over the degrees of freedom of the system.
• Protein–ligand complexes usually form through non-covalent interactions, pre-
dominantly through hydrogen bonds. The strength of hydrogen bonds strongly
depends on the distributions of charges among the interacting functional groups.
Whether a group is charged or not depends on its protonation state, which is
defined by the pKa value of the titratable groups involved in the protein–ligand
interactions.
• Depending on the local environment in a binding pocket, the pKa values of
titratable groups can vary significantly and can, by this, transform a normal
H-bond into a much stronger charge-assisted H-bond.
• Hydrophobic interactions form through the close proximity of non-polar
functional groups of the binding partners. As direct interactions, they are rather
weak. Nevertheless they can afford a significant contribution to binding affinity
through the release of water molecules from either the lipophilic environment of
the binding pocket or from the ligand surface next to a lipophilic surface patch.
• The strength of protein–ligand interactions is strongly influenced by the water
environment. Both the protein binding pocket and the ligand are solvated before
complex formation and functional groups of protein and ligand will form
H-bonds to water molecules. The total balance of the hydrogen-bond inventory
before and after complex formation matters for binding affinity considerations.
Only if the newly formed hydrogen bonds in the complex are increased in
number and/or stronger than those previously formed to water, a net affinity
increase results.
• The release of water molecules from hydrophobic surface patches can increase
affinity by enthalpy and entropy. Release of fixed water molecules increases the
degrees of freedom and boosts entropy. Replacement of highly disordered water
molecules into the bulk water environment can contribute to an enthalpic gain.
• Entropic contributions to binding arise from an increase of the degrees of
freedom of the protein–ligand–water system and, as a first approximation,
correlate with the size of the hydrophobic surface buried in the formed complex.
• Free energy variations are observed over a window of about 30–55 kJ/mol in
protein–ligand complexes. Variations in enthalpy (ΔH) and entropy (TΔS) can
Bibliography
General Literature
Andrews PR (1993) Drug-receptor interactions. In: Kubinyi H (ed) 3D-QSAR in drug design.
Theory, methods and applications. Escom, Leiden, pp 13–40
Andrews PR, Craik DJ, Martin JL (1984) Functional group contributions to drug-receptor
interactions. J Med Chem 27:1648–1657
Böhm HJ, Klebe G (1996) What can we learn from molecular recognition in protein-ligand
complexes for the design of new drugs? Angew Chem Int Ed Engl 35:2588–2614
Böhm H-J, Schneider G (2003) Protein-ligand interactions. From molecular recognition to drug
design. In: Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in
medicinal chemistry. Wiley-VCH, Weinheim
Creighton TE (1992) Proteins: structures and molecular properties, 2nd edn. W.H. Freeman,
New York
Gohlke H, Klebe G (2002) Approaches to the description and prediction of binding affinity of
small-molecule ligands to macromolecular receptors. Angew Chem Int Ed Engl 41:2644–2676
Kuntz ID, Chen K, Sharp KA, Kollman PA (1999) The maximal affinity of ligands. Proc Natl Acad
Sci USA 96:9997–10002
Special Literature
Ehrlich P (1913) Chemotherapeutics: scientific principles, methods and results. Lancet 182:445–451
Fersht AR, Shi JP, Knill-Jones J et al (1985) Hydrogen bonding and biological specificity analysed
by protein engineering. Nature 314:235–238
Gerlach C, Smolinski M et al (2007) Thermodynamic inhibition profile of a cyclopentyl- and
a cyclohexyl derivative towards thrombin: the same, but for deviating reasons. Angew Chem
Int Ed Engl 46:8511–8514
Lichtenthaler FW (1994) 100 Years “Schluessel-Schloss-Prinzip”: what made Emil Fischer use
this analogy? Angew Chem Int Ed Engl 33:2364–2374
Mason RP, Rhodes DG, Herbette LG (1991) Reevaluating equilibrium and kinetic binding
parameters for lipophilic drugs based on a structural model for drug interaction with biological
membranes. J Med Chem 34:869–877
Morgan BP, Scholtz JM, Ballinger MD, Zipkin ID, Bartlett PA (1991) Differential binding energy:
a detailed evaluation of the influence of hydrogen-bonding and hydrophobic groups on the
inhibition of thermolysin by phosphorus-containing inhibitors. J Am Chem Soc 113:297–307
Petrova T, Steuber H et al (2005) Factorizing selectivity determinants of inhibitor binding toward
aldose and aldehyde reductases: structural and thermodynamic properties of the aldose reduc-
tase mutant Leu300Pro-Fidarestat complex. J Med Chem 48:5659–5665
5 Optical Activity and Biological Effect
The decisive experiment was carried out by the then 26-year-old Louis Pasteur in
Paris in 1848. Several literature reports were inconsistent with his theory that an
obvious relationship must exist between crystal forms and their optical properties.
During a careful investigation of the sodium–ammonium salt of the optically
inactive tartaric acid, he discovered that the crystals had different forms. They
were either right- or left-symmetrical and could be sorted by hand. The crystals of
the enantiomers 5.1 and 5.2 (Fig. 5.1) gave solutions that had an opposite rotational
direction. This confirmed his suspicion. Before Pasteur could present his results to
the Academy of Science, he had to repeat the experiment publicly (!) in Biot’s
[Fig. 5.1 structures: 5.1 D-(−)-tartaric acid and 5.2 L-(+)-tartaric acid, related by a mirror plane, and 5.3 meso-tartaric acid, which has inversion symmetry]
Fig. 5.1 Optical isomerism in tartaric acid. The enantiomers (−)-tartaric acid 5.1 (mp.
168–170 °C, [α]D20 = −12°) and (+)-tartaric acid 5.2 (mp. 168–170 °C, [α]D20 = +12°) cannot be
superimposed upon each other either in the plane of the paper or in 3D space. They have only
a twofold rotational axis (orange axes) that bisects the central C–C bond. The two mirror images
rotate the plane of polarized light in opposite directions. In contrast, meso-tartaric acid
5.3 (mp. = 140 °C) has an inversion center of symmetry (the purple center on the central C–C
bond). Solutions of meso-tartaric acid have no optical activity because the contribution from each
stereogenic center compensates for the other. Racemic tartaric acid (mp. = 206 °C, no rotation) is
a 1:1 mixture of both enantiomers of tartaric acid 5.1 and 5.2. Such mixtures are optically inactive
and are called racemates (Lat. racemus, the grape; tartaric acid is found in grapes and wine).
presence at the Collège de France. He was lucky. It was only because his solutions
were allowed to slowly evaporate at room temperature that his experiment was
successful. Above the critical temperature of 28 °C, a stoichiometric 1:1 mixture of
both enantiomeric forms, a racemate, would have homogeneously crystallized
(Sect. 5.4).
A few years later Pasteur managed another important observation: mold con-
tamination of a racemic tartaric acid solution caused optical activity to develop.
One enantiomer of tartaric acid is metabolized significantly faster than the other.
With this, he discovered two important methods to separate racemates into enan-
tiomers. Whereas mechanical sorting is limited to a very few examples, enzymatic
kinetic resolution of enantiomers has found broad applications (Sect. 5.4).
An explanation for optical isomerism was possible with the help of the theory of
tetrahedral carbon, which was independently developed in 1874 by Jacobus
5.2 Structural Basis of Optical Activity 91
[Fig. 5.2 structures: 5.4 twistane, 5.5 methaqualone, 5.6 dibenzocycloheptadiene derivative]
Fig. 5.2 Even molecules without stereogenic centers can form an image–mirror-image pair
because of their spatial construction; an example is twistane 5.4. If rotation around the bonds is
limited, as in the case of the sedative methaqualone 5.5, enantiomers are separable (so-called
atropisomers). In non-planar fused ring systems like the dibenzocycloheptadiene derivative 5.6,
the enantiomeric separation depends on the barrier of inversion for the ring system.
Henricus van’t Hoff and Joseph-Achille Le Bel. When a carbon atom carries four
different substituents, an asymmetric or, as it is sometimes called, a stereogenic
center is produced. This property is not limited to carbon. Nitrogen atoms (in ammonium
salts) or silicon atoms with four different substituents, phosphorus atoms (for instance, in
phosphonic or phosphoric acid esters), and even sulfur atoms in sulfoxides (with two
different substituents, oxygen, and the lone electron pair) can also be asymmetric. The
spatial arrangement of the substituents gives rise to two mirror-image isomers, which
rotate polarized light to the same degree but in opposite directions. These forms
are called enantiomers (formerly antipodes). With the exception of their optical
activity, enantiomers are identical in all of their chemical and physicochemical
properties, but only as long as they are in an achiral environment.
Compounds with two chiral centers that are configured as an image and mirror
image within the same molecule do not exhibit optical activity macroscopically.
meso-Tartaric acid 5.3 (Fig. 5.1), an inversion-symmetrical molecule, exists as
a racemic mixture of chiral conformers. Each conformer exists as an “internal”
racemic mixture because in one energetically favored conformation the molecule
exhibits inversion symmetry. Its left part can be inverted by point reflection through
the center of the central C—C bond into its right part. Optical activity is also present in
other forms of molecular asymmetry. An example is any regular or irregular tetrahe-
dral orientation of different substituents on any other scaffold than a single carbon
atom. Another case can be found in compounds in which two groups are strongly
rotationally hindered around a common bond. An asymmetrical center results, giving
rise to optically active rotational isomers, so-called atropisomers (Fig. 5.2).
The experimentally determined rotational value (+) or (−) (previously called
d or l) is used to characterize enantiomeric compounds. The spatial configuration
of a stereogenic center in a molecule is described as D or L (Lat. dextro, levo).
This notation is based on the Fischer convention and is related to the absolute
92 5 Optical Activity and Biological Effect
[Fig. 5.3 structures, shown as Fischer projection and stereoprojection: 5.7 D-glyceraldehyde, 5.8 L-glyceraldehyde, 5.9 D-glucose]
Fig. 5.3 The rotation (+ or −) and the Fischer assignment (D or L) are reported as part of the
characterization of optically active compounds. To determine the Fischer assignment, the longest
carbon chain is drawn vertically with the highest-oxidized carbon atom on top (e.g., 5.9). The
standard is set by the asymmetric carbon (red) of the D- and L-glyceraldehyde pair (5.7 and 5.8).
With sugars (e.g., glucose 5.9) or amino acids (e.g., alanine 5.10), the carbon that is marked with
the arrow decides whether the molecule is D or L.
configuration of D- and L-glyceraldehyde, 5.7 and 5.8 (Fig. 5.3). Most sugars, for
instance glucose 5.9, can be traced back to D-glyceraldehyde 5.7, and the natural
amino acids of proteins, for instance alanine 5.10, can be traced back to
L-glyceraldehyde 5.8. For this reason, today the D/L nomenclature is still frequently
applied to sugars and amino acids. The enantiomers of tartaric acid correspond to
the D-(−) or L-(+) form.
The Cahn–Ingold–Prelog rule allows an unambiguous stereochemical assign-
ment (Fig. 5.4). According to the convention, the optical center is oriented so that
the substituent with the smallest atomic number is at the back (e.g., a hydrogen
atom or a lone pair of electrons). To use an intuitive explanatory model, we want to
assign this substituent to be the column of a steering wheel. Then the other sub-
stituents lie in the plane of the steering wheel. If these substituents are regarded in
descending order according to the atomic number, and this sequence follows
a rotation to the right, the stereogenic center has an R configuration; the opposite
direction is the S configuration (from the Latin: rectus and sinister). The only
disadvantage to this nomenclature system is that the assignment of the stereocenter
can change just because of the atomic number, valency, or oxidation state. The
homologous L-amino acids serine and cysteine, which have the same spatial arrangement
at the stereogenic center and differ only in that an oxygen is exchanged for a sulfur atom, are
classified as (S)-serine and (R)-cysteine.
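The steering-wheel picture can be turned into a small calculation: with the lowest-priority substituent at the back, the sense of rotation of the other three is given by the sign of a signed volume (a triple product). The sketch below is a minimal illustration and not a full implementation of the Cahn–Ingold–Prelog rules. The priority order is supplied by hand, the coordinates are idealized tetrahedral directions rather than experimental data, and all names are hypothetical.

```python
def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def cip_descriptor(coords, ranked):
    """Return 'R' or 'S' for four substituents listed from highest to lowest priority.

    Seen with the lowest-priority substituent pointing away from the viewer,
    a clockwise sequence of the other three corresponds to a negative signed volume.
    """
    a, b, c, d = (coords[name] for name in ranked)
    volume = dot(sub(a, d), cross(sub(b, d), sub(c, d)))
    if abs(volume) < 1e-9:
        raise ValueError("substituents are coplanar; no descriptor is defined")
    return "R" if volume < 0 else "S"


# Idealized geometry of an L-amino acid; CH2X stands for CH2OH or CH2SH.
l_amino_acid = {
    "H": (0.0, 0.0, -1.0),
    "N": (0.9428, 0.0, 0.3333),
    "COOH": (-0.4714, 0.8165, 0.3333),
    "CH2X": (-0.4714, -0.8165, 0.3333),
}

# Serine: COOH (O,O,O) outranks CH2OH (O,H,H).
serine = cip_descriptor(l_amino_acid, ["N", "COOH", "CH2X", "H"])
# Cysteine: CH2SH (S,H,H) outranks COOH (O,O,O) because S has the higher atomic number.
cysteine = cip_descriptor(l_amino_acid, ["N", "CH2X", "COOH", "H"])
print(serine, cysteine)
```

The geometry is identical in both calls; only the priority order of two substituents changes, and with it the descriptor.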
If one stereogenic center is present in a molecule, there are two enantiomers.
Each additional symmetry-independent stereocenter increases the number of
Cahn–Ingold–Prelog Rules
• Large atomic numbers have priority over low ones (e.g., Br > Cl > F > O > N > C > H)
• Free electron pairs always have the lowest priority
• Larger atomic masses have priority (e.g., for isotopes D > H)
• In case the first sphere is identical (i.e., C), the next sphere is considered:
  C(CH3)3 > CH(CH3)2 > CH2CH3 > CH3, that is,
  C[C+C+C] > C[C+C+H] > C[C+H+H] > C[H+H+H]
  [further series of carbon substituents that carry F, OH, NH2, or CH3 in addition to CH3 and H, in descending priority]
• Multiple bonds are considered as multiple single bonds, e.g., aldehyde
  CHO = C(O+O+H) > CH2OH = C(O+H+H)
• If the substituents are chiral, then R > S, R,R > R,S, and S,S > S,R
• In the case of differently configured double bonds Z > E
  (Z = zusammen = together and E = entgegen = apart for the configuration of double bonds)
[Structures: (R)-glyceraldehyde 5.7 and (S)-glyceraldehyde 5.8]
Fig. 5.4 The R/S nomenclature that was proposed by R. S. Cahn, C. K. Ingold, and V. Prelog is
unambiguous. Priority rules for each of the four different substituents on the tetrahedral
stereogenic center were established. The substituent with the lowest priority is placed in the
back, and the remaining substituents, followed in order of decreasing priority, determine the
direction of rotation.
Racemic acids and bases can often be separated by using enantiomerically
pure, optically active bases and acids, because the diastereomeric salts that form
have different solubilities. The chemical reaction of racemic acids, amines, and
alcohols with optically active alcohols or acids results in diastereomeric reaction
products. Because of their different characteristics, it is possible to separate them
and finally isolate the desired optically active product by chemical cleavage.
Syntheses that do not start with optically active starting materials, and that use
no optically active auxiliaries, always lead to racemic mixtures, that is, an exact
50:50 mixture of both enantiomers. Access to optically active compounds can be
obtained when synthetic reaction components are taken from the “chiral pool”.
Here, all optically active natural products, their derivatives, and degradation prod-
ucts that are available in an optically pure form can be used as easily accessible
synthetic building blocks. Syntheses with chiral catalysts are particularly elegant. In
most cases the optimization of the yield and enantiomeric purity, which is
expressed as the ee value (ee = enantiomeric excess), requires considerable process
development. The chromatographic separation of racemates on optically active solid
supports is more appropriate for semipreparative or analytical purposes.
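The ee value mentioned above follows from the amounts of the two enantiomers by the usual definition, ee = |major − minor| / (major + minor) × 100 %. A minimal sketch with an illustrative function name:

```python
def enantiomeric_excess(amount_major, amount_minor):
    """Enantiomeric excess in percent from the amounts of the two enantiomers.

    Any consistent unit (moles, peak areas, percentages) can be used
    because only the ratio of the amounts matters.
    """
    total = amount_major + amount_minor
    if total <= 0:
        raise ValueError("the total amount must be positive")
    return 100.0 * abs(amount_major - amount_minor) / total


print(enantiomeric_excess(50, 50))   # racemate
print(enantiomeric_excess(99, 1))    # 99:1 mixture
print(enantiomeric_excess(1, 0))     # enantiomerically pure
```

A racemate therefore has an ee of 0 %, and a 99:1 mixture has an ee of 98 %, not 99 %.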
Enzymatic and biotechnological techniques have increasingly gained favor in
recent years. Proteases, esterases, lipases, or hydantoinases react more or less
selectively: the two enantiomers react at distinctly different rates, so that preferentially only one
enantiomer of a racemic mixture is transformed to the product. The selectivity and yield
of such a reaction can be optimized through the careful selection of the medium and
other reaction conditions.
The production of optically pure ephedrine is an example of an industrial
application of biotechnological synthesis that has been in use for decades. This
phytopharmacon is found in combination preparations for the adjuvant therapy of
5.4 Lipases Separate Racemates 95
rhinitis, bronchitis, and asthma. The synthetic intermediate 5.12 (Fig. 5.6) is
obtained from a mixture of benzaldehyde, sugar, and yeast. It is then transformed
to (1R,2S)-(−)-ephedrine 5.13, which is identical to the natural product in both of
its optical centers. The C1 isomer (1S,2S)-(+)-pseudoephedrine 5.14 is
a diastereomer of ephedrine. Its optical rotation, melting point, and biological
characteristics are different from ephedrine’s.
Innumerable other microbial syntheses deliver optically pure products with or
without the use of achiral, racemic, or enantiomerically pure starting materials. The
biotechnological syntheses of a variety of antibiotics, above all the penicillins and
cephalosporins (▶ Sects. 2.4 and ▶ 23.7), are of particular economic importance.
Even the biotechnological preparation of synthetic intermediates for chiral drugs is
gaining increasing importance.
Because of their asymmetric architecture, lipases are well suited to separate race-
mates. This can either happen if one of the two enantiomers binds as a substrate
better and reacts faster, or if a chemical reaction takes place in the binding pocket of
the protein with disparate efficiency. Lipases are often used for kinetic resolution
because their architecture and their lipophilic surface allow them to sustain their
reactivity in organic solvents. They belong to a larger family of hydrolyzing
enzymes (▶ Chap. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme
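[Fig. 5.7: energy diagram along the reaction coordinate for the (S)-amine 5.16 and the (R)-amine 5.15, showing the acyl–enzyme E–A, the transition states E–S and E–R, the products E+S and E+R, and the differences ΔR-SG, ΔR-SH, and TΔR-SS; ΔR-SΔG = −19.4 ± 6 kJ/mol]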
Fig. 5.7 The reaction of (R)- and (S)-phenylethylamine, 5.15 and 5.16, with Candida antarctica
lipase begins with the formation of an acyl–enzyme complex, E–A. The faster-reacting R-amine
5.15 (red) forms a lower-energy transition state that leads to the free enzyme and the R-amide
(E+R). Analogously the S-amide (E+S) forms from the S-amine 5.16 via the higher-energy E–S
transition state (blue). The difference in ΔG‡ is 19.4 kJ/mol and favors the R form. The ΔG‡
difference is based on a combined enthalpic and entropic contribution in which the R form is
enthalpically favored and entropically disfavored. The S form is enthalpically disfavored but has
an entropic advantage.
Interestingly the transition-state analogue of the faster-reacting R form fits into the
binding pocket well (Fig. 5.8). On the other hand, the S form demonstrated great
residual mobility in the catalytic center. Computer simulations and molecular dynamics
with both forms confirmed the picture: whereas the R analogue has a well-defined
and temporally stable geometry, which is ideal for the reaction, the S analogue is very
mobile and rarely adopts an orientation that is productive for the catalytic reaction in
the lipase. Therefore a successful reaction of this substrate occurs much less often. On
the other hand, the R analogue, fixed in a vice-like clamp and waiting for its reaction,
forms good enthalpic contacts with the enzyme. It takes on a form that is practically
complementary to the enzyme pocket. This results in a large enthalpic advantage. The
fixation has its entropic price though. The methyl group on the stereogenic center
embeds itself in a small niche in the binding pocket. The S analogue does not have
this possibility because its methyl group is oriented in the mirrored direction. In this
case, the anchor that can be embedded in the binding pocket is missing. It has a high
mobility in the catalytic center and does not lose as many degrees of freedom
compared to the situation before enzyme binding. Entropically this is advantageous.
Enthalpically, however, the substrate loses a good interaction and the complementary
fit is rarely achieved. In the end, the enthalpic component prevails so that the
(R)-amine is transformed significantly faster. This is more than enough to ensure
that, in practice, only the (R)-amide is formed in high yield. This lipase can also be
immobilized onto a solid support and loaded into a glass column. After the acyl form
is prepared on the column, a racemic mixture of the amine only needs to be poured
onto the column. The (S)-amine and (R)-amide must then simply be collected in
a flask. If the solvent is well chosen, the amide crystallizes directly from the solution,
and can be mechanically separated.
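To get a feeling for what a difference of 19.4 kJ/mol between the two transition states means, the value can be converted into a ratio of rate constants with the transition-state relation k_R/k_S = exp(ΔΔG‡/RT). The sketch below is only an estimate under stated assumptions: a temperature of 298.15 K, which is not given in the text, and the simplification that both amines react from the same acyl–enzyme state. The function names are illustrative.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def rate_ratio(ddg_activation_kj, temperature_k=298.15):
    """Ratio k_fast/k_slow for a given difference in activation free energy (kJ/mol)."""
    return math.exp(ddg_activation_kj * 1000.0 / (R * temperature_k))


def initial_product_ee(ratio):
    """Enantiomeric excess (percent) of the product formed at the very start of the
    reaction of a racemate, when both enantiomers are still present in equal amounts."""
    return 100.0 * (ratio - 1.0) / (ratio + 1.0)


ratio = rate_ratio(19.4)
print(f"k_R / k_S is about {ratio:.0f}")
print(f"initial product ee is about {initial_product_ee(ratio):.2f} %")
```

Under these assumptions the R-amine reacts a few thousand times faster than the S-amine, which is consistent with the statement that in practice only the (R)-amide is formed.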
Interestingly, the enantiopreference of the kinetic resolution is lost with
increasing temperature or enlargement of the enzyme pocket. An enlargement can
be achieved by exchanging a tryptophan along the rim of the catalytic pocket for
a histidine. The higher temperature or increased space in the binding pocket
increases the mobility of both substrates in the lipase. The enthalpic advantage of
the faster-reacting R-amine is lost. The entropic difference of both substrates levels
out under these conditions.
This example shows on a molecular level how a lipase achieves kinetic resolution.
With knowledge of the energetic parameters and structural information, an attempt
can be made to tailor lipases for other transformations. Because of the importance of
such reactions, the targeted design of enzyme catalysts has developed into an ever
more important theme for the synthesis of chiral building blocks in new drugs.
Flora and fauna stand out because of their symmetry. Consider the face, the arms and
legs, the ribs, or an orchid flower. The exceptions, for instance a snail shell, are rare or
occur, as in the case of the flounder, only under special evolutionary conditions. The
inner organs of vertebrates are oriented partially paired and partially asymmetrically.
5.5 Differences in the Activity of Enantiomers 99
[Fig. 5.9, structural formulas with eudismic ratios]
5.19 Propranolol: β-blockade 100; membrane effect 1
5.20 Methacholine: cholinergic effect 320
5.21: ester group center 50–100
5.22 Butaclamol, (+)-enantiomer: α1 receptor 73; D2 receptor 1250; 5-HT1 receptor 8; 5-HT2 receptor 73; muscarinic receptor 0.5
Fig. 5.9 Enantiomers have different biological effects. The eudismic ratio of propranolol 5.19 is
100 for β-antagonism, and for unspecific membrane interaction it is, as expected, 1. Identical
partial structures can have entirely different eudismic ratios; compare, for instance, the optical
center of the alcohol moiety of the cholinergic compound methacholine 5.20 with the identical
center in the anticholinergic compound 5.21. Compound 5.21 also shows that the eudismic ratios
of different centers in a compound are independent of each other. The example of butaclamol 5.22
also shows that the same substance can have different eudismic ratios at different receptors.
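The eudismic ratio used in Fig. 5.9 is the potency of the more active enantiomer (eutomer) divided by that of the less active one (distomer). The following sketch shows how such a ratio could be computed from two binding constants and converted into the logarithmic eudismic index. The function names are illustrative; the butaclamol ratios are the values listed in Fig. 5.9.

```python
import math


def eudismic_ratio(ki_distomer, ki_eutomer):
    """Eudismic ratio from two binding constants given in the same unit.

    A smaller Ki means higher affinity, so the ratio is Ki(distomer) / Ki(eutomer).
    """
    return ki_distomer / ki_eutomer


def eudismic_index(ratio):
    """Eudismic index: decadic logarithm of the eudismic ratio."""
    return math.log10(ratio)


# Eudismic ratios of (+)-butaclamol 5.22 at different receptors (from Fig. 5.9)
butaclamol = {
    "alpha1": 73,
    "D2": 1250,
    "5-HT1": 8,
    "5-HT2": 73,
    "muscarinic": 0.5,
}

for receptor, ratio in butaclamol.items():
    print(f"{receptor:>10}: ratio = {ratio:7.1f}, index = {eudismic_index(ratio):+.2f}")
```

A ratio below 1, as for the muscarinic receptor, gives a negative index and indicates that the other enantiomer is the more potent one at that receptor.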
effects. If drug testing were then what it is today, this catastrophe would certainly
have been recognized earlier and probably largely avoided. This would not have
been prevented by the administration of only one enantiomer. Both enantiomers
racemize in vitro, that is, one converts into the other even in a test tube.
[Fig. 5.10, structural formulas with inhibition data: 5.23 thiorphan (ACE 0.14); 5.24 retro-thiorphan (ACE >10); thermolysin 2.3]
Fig. 5.10 Thiorphan 5.23 inhibits the metabolism of enkephalins and contains
a β-mercaptopropionic acid, the absolute configuration of which is analogous to L-phenylalanine.
Application of the retro–inverso concept gives aminothiol 5.24, the absolute configuration of
which corresponds to D-phenylalanine. The identical binding mode to the zinc protease was
determined for both thiorphan 5.23 and retro-thiorphan 5.24. Thermolysin and neutral endopeptidase
24.11 (NEP 24.11, previously referred to as enkephalinase) are inhibited by both compounds to the
same extent. On the other hand, angiotensin-converting enzyme (ACE), another zinc protease,
discriminates decidedly between these substances.
Accordingly, the effect was confirmed in vivo after administration of the suppos-
edly safe enantiomer led to teratogenic effects in an animal model.
The “other” enantiomer can also open new therapeutic opportunities. The
enantiomer of a synthetic opiate, for instance propoxyphene 5.27 (Fig. 5.11) has
weak analgesic and narcotic effects, but good cough-suppressing effects. Enantio-
mers can also influence each other in their effects, and even cancel one another out.
In the case of the calcium channel ligand 5.28, one enantiomer is an agonist and the
other is an antagonist.
In the time period between 1983 and 2002, 38% of all approved drugs were
achiral, 39% were enantiomerically pure, and 23% were racemic or diastereo-
meric mixtures. The fact is that racemic mixtures of chiral drugs were much more
easily accepted in earlier decades than they are today. This was certainly not
caused by a stereophobia on the part of the chemical industry. It was more an
expression of inadequate understanding of the stereospecificity and side effects,
and perhaps also because economic considerations were in the foreground; kinetic
resolution and/or enantiomerically pure syntheses are very expensive. You can
certainly see that the proportion of enantiomerically pure drugs is gaining in the
marketplace (Fig. 5.12).
In the 1970s, Ariëns was the first to decisively come out against the use of
racemic mixtures in therapy. Racemates are, in his view, compounds with 50%
impurity. The non-active or less-active enantiomer is seen as enantiomeric ballast.
He used the diastereomeric mixture labetalol 5.11 (Fig. 5.5, Sect. 5.2) as a showcase
102 5 Optical Activity and Biological Effect
[Fig. 5.11, structural formulas: 5.25 N-methyl-5-phenyl-5-propylbarbiturate, 5.26 thalidomide, 5.27 propoxyphene, 5.28 Bay K 8644]
Fig. 5.11 Enantiomers also differ in their mode of action. The (R)-(–)-enantiomer of barbiturate
5.25 is a hypnotic agent, whereas the (S)-(+)-enantiomer causes seizures. In rats and mice only the
(S)-(–)-enantiomer of thalidomide 5.26 (Contergan®) is teratogenic, that is, it causes
embryopathies. Thalidomide 5.26 racemizes in vitro as well as in rabbits. Therefore even the
(R)-(+)-enantiomer is teratogenic in rabbits. Propoxyphene 5.27 is a potent analgesic, the effect of
which depends on the (2S,3R)-(+)-enantiomer, dextropropoxyphene. The (2R,3S)-(–)-enantiomer
is a cough suppressant. The (R)-(+)-enantiomer of Bay K 8644 5.28 is a weak calcium channel
blocker. The (S)-(–)-enantiomer stabilizes calcium channels in the open form and is therefore an
agonist, that is, a calcium channel opener.
Fig. 5.12 The proportion of achiral, enantiomerically pure, and racemic or diastereomeric drugs
approved in the period from 1983 to 2003. In the meantime, the proportion of newly approved
drugs has shifted decidedly in the direction of enantiomerically pure compounds.
Fig. 5.13 Upon metabolism of the monoamine oxidase inhibitor selegiline 5.29, which is used to
treat Parkinson’s disease, the more potent (R)-(–)-enantiomer is converted to methamphetamine
5.30 and amphetamine 5.31. The less-active (S)-(+)-selegiline has less severe side effects because it
is not metabolized to CNS-active stimulants.
Fig. 5.14 The (R)-(–)-enantiomer of ibuprofen 5.32 undergoes a metabolic inversion of its
stereocenter, and the (S)-(+)-enantiomer is formed. As a cyclooxygenase inhibitor in vitro, the
(S)-(+)-form is more potent than the (R)-(–)-form. The less-active form is converted to the
more-active enantiomer in vivo. Therefore both compounds exhibit equal anti-inflammatory properties
in animal models.
According to the result, in special cases the continued use of the racemate or the
development of an achiral analogue can be considered. At any rate, today these data
must be complete before the drug can receive approval.
bind almost identically to the catalytic zinc. Further, the endocyclic SO2 group
forms very similar hydrogen bonds to Gln92. The hydrophobic isobutyl side chains
are in similar parts of the binding pockets. The six-membered ring, however, must
adopt a highly strained conformation in the case of the more weakly binding
enantiomer. The price for taking on this strained conformation is paid in
reduced binding affinity to the enzyme.
The enantiomeric agonists 5.36 and 5.37 bind in the ligand-binding domain of
the retinoic acid receptor with a difference of a factor of 1,000 (▶ Sect. 28.2). The
receptor itself adopts the same geometry (Fig. 5.17). The alcohol function in the
middle of the molecule is at the stereogenic center. In both cases, the hydrogen
bond to Met272 is formed. As a result, the neighboring amide must take on
a deviating orientation in the binding pocket. On the “right” side, the tetralin
moiety of both stereoisomers is in a similar place. On the “left” side, the benzoic
acid moiety of both enantiomers forms a hydrogen-bond network with Arg278,
Ser289, and Leu233. The fluorine-substituted benzene ring adopts orientations in the
two cases that are flipped by 180° relative to each other. These different orientations,
together with the divergently oriented amide bond, are responsible for the severe difference in the binding
affinity of the mirror-image agonists.
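A factor of 1,000 in binding affinity can be translated into a free-energy difference through ΔΔG = RT ln(ratio). The short sketch below carries out this conversion. The temperature of 298.15 K is an assumption, as the text does not state the measurement conditions, and the function name is illustrative.

```python
import math

R = 8.314             # gas constant in J/(mol*K)
T = 298.15            # assumed temperature in K (not given in the text)
KJ_PER_KCAL = 4.184   # conversion factor


def ddg_from_affinity_ratio(ratio, temperature=T):
    """Free-energy difference in kJ/mol for a given ratio of binding constants."""
    return R * temperature * math.log(ratio) / 1000.0


ddg = ddg_from_affinity_ratio(1000.0)
print(f"1,000-fold difference: {ddg:.1f} kJ/mol = {ddg / KJ_PER_KCAL:.1f} kcal/mol")
```

Under this assumption each order of magnitude in affinity corresponds to roughly 5.7 kJ/mol, so the two differently oriented groups of the weaker enantiomer together cost on the order of 17 kJ/mol.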
Fig. 5.16 The enantiomeric sulfonamides 5.34 (gray) and 5.35 (beige) bind in a similar way to
the enzyme carbonic anhydrase. Because the protein adopts practically the same geometry with
both inhibitors, only one structure is shown. The zinc ion in the catalytic center (purple sphere) is
coordinated to the sulfonamide groups. The SO2 groups in the six-membered ring form a hydrogen
bond to Gln92 (green). The hydrophobic isobutylamino moieties on the chiral centers project into
a hydrophobic pocket and fill this out to the same extent. In doing this, the six-membered ring must
adopt a deviating conformation in both enantiomers. In one stereoisomer this conformation is
much more strained than in the other, and causes a loss in binding affinity.
Some naturally occurring peptides form ion channels in lipid layers. Their
synthetic antipodes are also able to do this. The more interesting question is: how
does the mirror image of an enzyme behave? In 1992 Stephen Kent and co-workers
prepared HIV protease, a homodimer made up of 2 × 99 amino acids, entirely from
D-amino acids. The naturally occurring protein was also prepared in parallel. The
L-enzyme reacts only with L-peptide substrates and the D-enzyme reacts only with
the all-D enantiomer. The same is true for chiral inhibitors of the HIV-1 protease.
An achiral inhibitor, on the other hand, inhibits both enzymes in the same way.
Rubredoxin, an electron-transport protein, was prepared as the D-protein for the sole
purpose of mixing it with the naturally occurring L-protein and to make the racemate!
If the effort involved is considered, this is certainly an approach that takes some getting
used to. The reward for the work was very high-quality crystals. The racemate
crystallized in a centrosymmetric space group (▶ Sect. 13.2), which allowed a better
resolution of the 3D structure than was possible with the natural, all-L enantiomer.
5.7 An Excursion in the World of Antipodes 107
[Fig. 5.17, structural formulas of (R)-5.36 and (S)-5.37, with Met272 indicated]
Fig. 5.17 Both enantiomers of the agonists 5.36 (beige) and 5.37 (gray) bind the retinoic acid
receptor with 1,000-fold difference in affinity. Because the protein adopts practically the same
geometry with both ligands, only one structure is shown. Both ligands form H-bonds with their OH
groups to the sulfur in Met272. In doing so, the fluorine-substituted aromatic ring of the benzoic acid
moiety on the left with its central amide bond has to adopt a deviating orientation. The
tetrahydronaphthalene (tetralin) moiety, on the other hand, is positioned in the same way in both enantiomers.
What does a visit to the mirror-image world look like? Achiral drugs would have
an identical potency and mode of action. On the other hand, many enantiomerically
pure drugs would be useless. We would have to watch out for chiral barbiturates
such as 5.25. They would sooner cause a seizure than act as a sedative. In cases in
which chiral antibiotics were used to treat bacterial infections, it would first have to
be established whether the infecting bacteria came from the mirror-image world or
the normal world. The administration of trimethoprim (▶ Sect. 37.2) and
a sulfonamide (both achiral) would help at any rate.
There would be tremendous problems with nutrition. The carbohydrate and
protein metabolism would not work anymore, nor would the resorption of mono-
mers from the gastrointestinal tract. We would not be able to recognize some plants
by their smell. (R)-Carvone smells of spearmint, (S)-carvone smells of caraway
seeds. Our beloved sugar would have lost its sweet taste, and fruit juices and
lemonade would taste sour. Coffee, tea, and cola would retain their stimulatory
effects because caffeine is achiral. Diet drinks would have to be sweetened with
saccharin or cyclamate (both achiral) because aspartame is chiral.
Let us return to the normal world! But first, let us have a quick glass of vodka. It
could also be cognac, whisky, or a dry red wine. The taste would be the same as in
the normal world, or would it not? Despite the many hundred flavor components of
wine, the exchange of a single chiral center could have the consequence that
a connoisseur might no longer recognize the chateau. The euphoric effects would
be the same, though this would not be the case for the hard, optically active drugs
such as heroin, cocaine, or LSD.
5.8 Synopsis
Bibliography
General Literature
Ariëns EJ, Soudijn W, Timmermans PBMWM (1983) Stereochemistry and biological activity of
drugs. Blackwell Scientific, Oxford
Brown C (ed) (1990) Chirality in drug design and synthesis. Academic, London
Caner H, Groner E, Levy L (2004) Trends in the development of chiral drugs. Drug Discov Today
9:105–110
Eichelbaum M, Testa B, Somogyi A (2002) Handbook of experimental pharmacology, stereo-
chemical aspects of drug action and disposition. Springer, Heidelberg
Holmstedt B, Frank H, Testa B (1990) Chirality and biological activity. Alan R. Liss, New York
Klebe G (2004) Differences in binding of stereoisomers to protein active sites. In: Pifat-Mrzljak
G (ed) Supramolecular structure and function 8. Kluwer Academic/Plenum, New York,
pp 31–53
Smith DF (ed) (1989) CRC handbook of stereoisomers: therapeutic drugs. CRC Press, Boca Raton
Special Literature
The starting point in the development of a new drug is the search for an appropriate
lead structure for a target protein. Next such a target structure within the genome or
proteome must be validated as being relevant as a therapeutic principle. The
production of the pure target structure is possible by using gene technology
methods. After a high-throughput screening assay is established, thousands of test
molecules can be evaluated for binding to the target protein. The X-ray crystal
structure is solved and serves both the search for, and optimization of lead struc-
tures. Without techniques such as bio- and chemoinformatics, molecular modeling,
and computational chemistry this type of search and optimization is unthinkable
(announcement poster from the research group of the author on the occasion of
a conference in 2003 in Rauischholzhausen, Marburg).
6 The Classical Search for Lead Structures
The starting point in the search for a new drug is the lead structure. Such
a substance already has a desirable biological effect, but some specific characteristics
are still inadequate for its therapeutic use. The definition of the term “lead
structure” also means that analogues can be prepared by targeted chemical varia-
tions which produce compounds better than the lead structure in, for instance, their
potency or selectivity. The goal is the optimization of all characteristics until a final
substance is ready for therapeutic use.
The largest part of our pharmacopoeia originates directly or indirectly from natural
products, that is, from plants, animals, or microbial sources, or from endogenous
substances such as hormones and neurotransmitters. Only a few natural products
have become drugs themselves. Examples include morphine, codeine, papaverine,
digoxin, ephedrine, ciclosporin, and hirudin, the latter of which was isolated from
leeches. Examples of endogenous drugs are the thyroid hormone T3, insulin, coag-
ulation factor VIII, erythropoietin, and further proteins for substitution therapy.
Most naturally occurring compounds serve as lead structures. They are chemically
manipulated with the goal of optimizing their desirable characteristics and mini-
mizing their side effects (▶ Chap. 8, “Optimization of Lead Structures”). Examples
are found in the many natural products and endogenous receptor agonists that have
been modified into selective agonists and antagonists (▶ Sects. 6.2, ▶ 6.3, ▶ 6.4, and
▶ 6.6). Drugs are also derived from enzyme substrates (▶ Sect. 6.6 and ▶ Chaps. 23,
“Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic
Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”;
▶ 26, “Transferase Inhibitors”; ▶ 27, “Oxidoreductase Inhibitors”). These are either
substrates of endogenous enzymes that play a role in, for instance, blood pressure
regulation or inflammation, or substrates of enzymes from viruses, bacteria,
or parasites whose metabolism is to be specifically shut down.
In the last 100 years preparative organic chemistry has played a decisive role not
only in the systematic variation of lead structures but also in lead structure discovery.
The search for new active substances has delivered many drugs that have no structural
relationship to endogenous examples. In other cases, the relationship between the
biological effect and the mode of action was clarified long after their discovery.
The first example of discovering an active principle through testing occurred in the
eighteenth century, and is found in the effects of digitalis. The Scottish physician,
William Withering, while working in England, was consulted by a patient who
suffered from an extremely weak heart. After the doctor was unable to help him, the
patient consulted a gypsy woman, who prescribed a herbal therapy. Impressed by
the recovery of the patient, Withering sought out the woman and asked for the
recipe. He received it in exchange for a handsome fee. The mixture contained an
extract of the (poisonous) purple foxglove, Digitalis purpurea. The physician
investigated the potency of different preparations of this plant by giving
the medicines to 163 patients! With this experiment, he established that the best
formulation was made up of the dried, powdered leaves. After the observation was
made that a toxic dose is quickly reached, he recommended that diluted
preparations be administered in repeated doses until the desired effect was
achieved. Even though digitalis is still used today for congestive heart failure,
no one would recommend that Withering’s experimental technique be used to
establish the therapeutic potential of a substance. This approach was neither
ethical nor practical.
The example of the previous section shows that nature has furnished plants with
highly potent substances. A plethora of secondary metabolites, for example, alka-
loids, terpenes, flavones, and glycosides are also available. The contents of about
a hundred different plant species have either directly or indirectly, in the form of
analogues, found their way into human therapy. Traditional medicines use about
5,000–10,000 of the several hundred thousand already known species from the rich
plant kingdom. Morphine, caffeine, quinine, cocaine, ephedrine, coniine, atropine,
and reserpine were already mentioned in ▶ Sect. 1.1. Further plant-based pharma-
ceuticals that are used in therapy, or that have served as lead structures for the
development of medicines are compounds 6.1–6.7 (Fig. 6.1), and, in addition,
emetine, pilocarpine, podophyllotoxin, and the vinca alkaloids vinblastine and
vincristine.
Why do plants contain so many valuable therapeutic compounds? There is not
a human-related answer because plants did not evolve so that they could become
human medicines. The plants, however, had to respond to their environment, and
a competition with other species occurred. The decisive disadvantage of being
a plant is that it cannot run away! That is not a disadvantage when it comes to
reproduction. Bees take care of the first part, and aerodynamic seeds help with the
rest. An effective protective mechanism against, for instance, fungal infection and
pests such as caterpillars, sheep, and cattle served as a selection advantage for
some plants. The substances that offer an advantage taste bitter, hot, or are
toxic. They exert their effects by interacting with the enzymes or receptors
6.2 Lead Structures from Plants 115
[Fig. 6.1, structural formulas: 6.1 tubocurarine, 6.2 papaverine, 6.3 digitoxin (R = H), 6.4 digoxin (R = OH), 6.6 artemisinin]
Fig. 6.1 Natural products from plants that have been introduced to therapy or have served as lead
structures include, in addition to the substances introduced in ▶ Sect. 1.1, tubocurarine (curare)
6.1, papaverine 6.2, digitoxin 6.3, digoxin 6.4, and the related cardiac glycosides. Newer natural
products from plants with great therapeutic potential include paclitaxel (Taxol®) 6.5 for tumor
therapy, artemisinin 6.6 for malaria therapy (▶ Sect. 3.3), and the acetylcholinesterase inhibitor
huperzine A 6.7 for the potential treatment of Alzheimer’s disease.
of the “enemy.” The stronger the effect, the better the protection. A successful
principle of evolution is the development of defensive substances that do not kill,
but cause an unpleasant experience for the predator, which in turn teaches the
enemy to stay away. That is how butterflies survive that accumulate poisonous
plant-based substances in their bodies, and even those others that just imitate the
appearance of these butterflies. After the first experience with the poisonous
species, birds give both species a wide berth.
Plant substances have already undergone a selection process on biologically
relevant proteins; during the course of evolution they have “seen” receptors and
binding sites. Further, the course of their biosynthesis takes place in the binding site
of a protein, that is, they have functionality that mediates affinity to a protein.
Certainly, there are many plant substances that coincidentally have a biological effect
in humans. Morphine contains a basic nitrogen, a phenolic hydroxyl group, an ether
bridge, and a hydrophobic domain: a medicinal chemist would also choose such
a mixture of functional groups, without the complicated ring structure, in the
conception of an active substance.
The isolation of natural products from plants for lead discovery has gone in and
out of favor over the last decades. Large pharmaceutical companies
have repeatedly started ambitious programs to elucidate the mechanism of action of
traditional medicines, only to abandon the area again disappointed. The disappoint-
ments are a result of an unfavorable relationship between effort and reward. All too
often only a toxin is isolated instead of a valuable lead structure, and all too often an
already-known principle is found. Nonetheless, the search continues. Nature offers
structural variation that the chemist can only dream of.
In contrast to the plants, the evolution of animal venoms occurred with the objective
of subduing prey or defending against an enemy. Many of these substances are
proteins, peptides, and alkaloids. They function as potent poisons that can quickly
paralyze or kill a victim. Because of this, many active substances from animals are
unsuitable for therapy, but others, for the exact same reason, are interesting lead
structures. Animal products offer many surprises, as illustrated in the following two
examples.
Despite its simple structure, epibatidine 6.8 (Fig. 6.2), which was isolated from
the Ecuadorian poison dart frog Epipedobates tricolor, is a 100-fold more-potent
analgesic than morphine! It does not affect the opiate receptor, but rather it is an
agonist at the nicotinic acetylcholine (nACh) receptor (▶ Sect. 30.4). That comes as
no surprise when its structural similarity to nicotine 6.9 is considered. Epibatidine
has a binding constant of 0.04 nM on the nACh receptor, which is 50-fold stronger
than nicotine. Unfortunately, its analgesic effects are coupled with a pronounced
body temperature reduction (hypothermia).
Dolastatin-15 6.10 (Fig. 6.2) was isolated from the wedge sea hare, Dolabella
auricularia, a marine snail. It is an interesting lead structure for antitumor com-
pounds. Synthetic analogues of 6.10 cause the complete disappearance of tumors in
some animal models. The diversity of marine animals in particular has historically
been a rich source of new and interesting lead structures and modes of action.
6.3 Lead Structures from Animal Venoms and Other Ingredients 117
Fig. 6.2 Epibatidine 6.8, a non-opiate analgesic that binds 50-fold more potently to the nicotinic
acetylcholine receptor than nicotine 6.9, comes from a South American frog (▶ Sect. 30.4).
Dolastatin-15 6.10, which was isolated from a marine snail, is an interesting lead structure for
cancer therapeutics. The toxin of the fugu fish, tetrodotoxin 6.11, is not a lead structure but rather
a sodium channel blocker for experimental (in vitro) use. The steroid alkaloid batrachotoxin 6.12 is
the most potent animal venom known. The LD50 value in mice, that is, the dose necessary to kill
50% of the experimental animals within 24 h, is 200 ng/kg.
Fig. 6.3 Penicillins, cephalosporins (▶ Sects. 2.4 and ▶ 23.7), and tetracycline 6.13 were impor-
tant lead structures for even better antibiotics. In contrast, streptomycin 6.14 is used in therapy
itself. Ergotamine 6.15 is a typical representative of the ergot alkaloids, from which a plethora of
different drugs have been derived. Likewise, asperlicin 6.16 is a structurally complex microbial
natural product. The 10,000-fold more potent derivative devazepide 6.17 was derived from it.
Lovastatin and some analogues (▶ Sects. 9.2 and ▶ 27.3) are exceedingly
important therapeutic substances that were isolated from microorganisms; they
interfere with the biosynthesis of cholesterol. Cholecystokinin (CCK) is a peptide
hormone that acts at a G protein-coupled receptor (▶ Sect. 29.1). It induces
multifaceted effects in the central nervous system and gastrointestinal tract. The
non-peptide CCK antagonist asperlicin 6.16 (IC50 = 1.4 µM) originated from
extracts of Aspergillus alliaceus. After intensive structural variation, the much
simpler devazepide 6.17 (IC50 = 80 pM) was designed, which has more than
10,000-fold better affinity to the CCK receptor (Fig. 6.3). This antagonist is orally
bioavailable and is an appetite stimulator.
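The stated gain of more than 10,000-fold can be checked directly from the two IC50 values, reading the asperlicin value as 1.4 micromolar and the devazepide value as 80 picomolar. The sketch below converts both to molar units before dividing; the helper names and the unit table are illustrative.

```python
# Conversion factors from common concentration units to mol/L
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def to_molar(value, unit):
    """Convert a concentration given in the named unit to mol/L."""
    return value * UNIT_TO_MOLAR[unit]


def fold_improvement(ic50_lead, ic50_optimized):
    """How many times lower the IC50 of the optimized compound is."""
    return ic50_lead / ic50_optimized


asperlicin = to_molar(1.4, "uM")   # lead structure 6.16
devazepide = to_molar(80, "pM")    # optimized antagonist 6.17

fold = fold_improvement(asperlicin, devazepide)
print(f"devazepide is about {fold:,.0f}-fold more potent than asperlicin")
```

Converting to a common unit first avoids the most frequent mistake in such comparisons, namely dividing numbers that carry different prefixes.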
The enzyme streptokinase for the dissolution of blood clots, and bacterial
collagenase for wound treatment are examples of therapeutically important proteins
that were isolated from microorganisms.
In 1903, Paul Ehrlich investigated hundreds of dyes in mice that had been infected
with trypanosomes. The result of this research was Nagana Red, the first drug against
Trypanosoma brucei, the causative agent of cattle trypanosomiasis. Other
dyes followed, as did colorless compounds that contained amide instead of azo
groups. It was only in 1916, after Ehrlich’s death, that Bayer, after having
investigated more than a thousand analogues, produced its wonder drug suramin
(Germanin® ) 6.18 (Fig. 6.4). The work in this area led to the discovery of the
antibacterial sulfonamides in the 1930s (▶ Sect. 2.3). Thousands, if not tens of
thousands, of analogues were synthesized and tested. Many were introduced to the
market. Depending on the structure, they cover an extraordinarily broad spectrum
of different pharmacokinetic characteristics.
No actual biological activity was expected from the synthetic intermediates.
They were seen merely as starting material for the desired end product. Despite this,
many intermediates were routinely tested for biological activity, and it was a good
thing too!
Fig. 6.4 Bayer’s suramin 6.18, which is also known as E 205 or Germanin®, had strategic
importance for the colonies. An English engineer who was suffering from African sleeping
sickness (trypanosomiasis) and was near death despite aggressive treatment with diverse
antimony and arsenic preparations was cured after a few injections of this substance. The solvent for
the preparation of the intravenous injection solution was rain water in the tropical clinical trials(!).
After a short time, suramin was considered to be a “wonder drug.” Despite the fact that the
structure was kept secret, French researchers worked out their own synthesis within a short time.
Suramin is still used for the treatment of trypanosomiasis because it has good efficacy and a
long-lasting effect.
6.6 Mimicry: How to Copy Endogenous Ligands 121
Fig. 6.5 Thiacetazone 6.19 and isoniazid 6.20 are tuberculostatics that originated as synthetic
intermediates. Isoniazid penetrates the cell wall and irreversibly binds to the enzymatic cofactor
NADH after radical generation. The originally accepted hypothesis that, upon metabolic
degradation to isonicotinic acid 6.21, it acts as an antimetabolite for nicotinic acid 6.22, proved
to be incorrect.
Fig. 6.6 Simple synthetic intermediates to methotrexate 6.23 turned out to be new drugs.
Mercaptopurine 6.24 and azathioprine 6.25 are immunosuppressants, and allopurinol 6.26 is
used to treat gout.
[Fig. 6.7, structural formulas: substrate, transition state, and transition-state mimics; X = -CH2-, -NH-, -O-]
Fig. 6.7 Examples of substrate, transition state, and groups that imitate the enzymatic transition
state of an amide hydrolysis reaction. A few of the groups reversibly form covalent bonds to the
serine in the catalytic pocket of a serine protease (see ▶ Sect. 23.2).
archetypes for new medicines. The directed design of drugs from these lead
structures led to the “golden age” of pharmaceutical research (▶ Sect. 1.4).
The basic approach is illustrated here using the example of enzyme inhibitors.
Enzymes catalyze chemical reactions by stabilizing the transition state
of the reaction. In doing so, they decrease the activation energy, and the reaction
can proceed at a lower temperature (▶ Sect. 22.3). This specificity can be exploited
particularly well for the optimization of enzyme inhibitors. By starting with knowl-
edge of the reaction mechanism, substrate groups are assembled that are structurally
analogous to the transition state (Fig. 6.7). They imitate it but do not lead to
Fig. 6.8 Pentostatin 6.29 and nebularine 6.30 inhibit the enzymatic transformation of adenosine
6.27 to inosine 6.28. Compound 6.29 binds 7 orders of magnitude more strongly than the substrate
adenosine (Ki = 2.5 pM), and the active form of 6.30 binds even 10 orders of magnitude more
strongly (Ki = 0.3 pM). The structures of pentostatin and of the active form of nebularine correspond to
the transition state of the enzymatic reaction.
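The picomolar inhibition constants quoted for the two transition-state analogues can be expressed as standard binding free energies through ΔG° = RT ln(Ki / 1 M). The sketch below does this for both values. The temperature of 298.15 K and the 1 M standard state are assumptions that are not stated in the caption, and the function name is illustrative.

```python
import math

R = 8.314    # gas constant in J/(mol*K)
T = 298.15   # assumed temperature in K


def binding_free_energy(ki_molar, temperature=T):
    """Standard binding free energy in kJ/mol from an inhibition constant in mol/L.

    Uses dG = RT ln(Ki / 1 M); more negative values mean tighter binding.
    """
    return R * temperature * math.log(ki_molar) / 1000.0


inhibitors = {
    "pentostatin 6.29": 2.5e-12,               # Ki = 2.5 pM
    "nebularine 6.30 (active form)": 0.3e-12,  # Ki = 0.3 pM
}

for name, ki in inhibitors.items():
    print(f"{name}: Ki = {ki:.1e} M, dG = {binding_free_energy(ki):.1f} kJ/mol")
```

Binding free energies of this size are far beyond what a typical substrate achieves, which illustrates how strongly an enzyme binds a molecule that resembles the transition state.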
Many drugs came from the observation of side effects during clinical or practical
use (see ▶ Sect. 2.8). The diuretic effects of mercury compounds were discovered
purely by accident (▶ Sect. 30.9). In 1919 physicians in the First Medical Univer-
sity Hospital in Vienna were testing a new treatment for syphilis. It was observed in
a 21-year-old woman that her urine production increased from 200–500 mL a day to
1.2–2.0 L on the third day of treatment with the test substance. This result led to the
development of the first effective diuretic (medicine to increase urine production).
Fortunately, we are no longer dependent on extremely toxic mercury compounds
for the therapy of venereal disease or as diuretics!
In 1948 it was observed in vulcanization factories that the antioxidant disulfiram
6.31 (Fig. 6.9) caused workers to become intolerant of alcoholic drinks. This
discovery led to the use of the substance for the treatment of chronic alcoholism.
[Fig. 6.9, structural formulas, including 6.33 dicoumarol, 6.34 warfarin, and 6.35 penicillamine]
The approaches that are described in the previous sections are still used in industrial
pharmaceutical research today. Because of the enormous costs associated with the
development of drugs, the search for original lead structures is an increasingly
important goal. Large sums are paid for novel therapeutic approaches, test models,
or 3D structures of target proteins. This information can lead to an advantage over
the competition that indeed takes time to realize, but must be zealously defended
and brought to fruition.
According to the principle of risk diversification and the maximal exploitation of
all imaginable resources, today pharmaceutical companies subscribe to a strategy of
broadly established screening of huge substance libraries of plant extracts,
6.9 Synopsis
• Many active substances originate from natural products found in plants, animals,
and microbial sources. Their mode of action has been copied as an active
principle for the development of drugs.
• Endogenous substances such as hormones and neurotransmitters also served as
references for drug development.
• Only a few natural products became drugs themselves.
• Usually targeted chemical variations are required to optimize a lead for meta-
bolic stability, half-life, or selectivity to be ready for therapeutic use.
• Plants contain many valuable therapeutic compounds usually developed as an
effective protective mechanism against all sorts of enemies.
• Nature offers a tremendous body of structural variations; however, ambitious
programs to elucidate mechanisms of action of traditional medicines all too often
only isolate toxins and discover already-known principles.
• Animals have developed venoms as offensive or defensive mechanisms to be
used as predators or against enemies. They are mostly proteins, peptides, or
alkaloids that either kill or paralyze a victim.
7 Screening Technologies for Lead Structure Discovery
In the last chapter, examples were presented of how lead structures can be discovered
by purposefully searching, particularly by using examples from nature or compounds
with known modes of action. Even if a large number of natural products and synthetic
substances are available, it is not always easy to filter the active molecules out and to
assess their value for a given indication. This requires a time- and cost-intensive sorting
or screening of enormous substance libraries. “Screening” means the more or less
specific biological testing of compounds. Although molecular test systems and
cell culture models are used almost exclusively today, testing a single compound still costs
between US $2 and US $5. Because typically millions of compounds are tested,
a screening campaign can easily cost several million dollars!
The screening process can be divided into three phases. First there is an
automatic introductory screening, which is usually carried out by robots and
encompasses libraries of millions of compounds. The first substances that show
an interaction are identified as “hits” that have to be validated by repeated testing.
Next, a more detailed screening follows, with which the chemical space around the
identified compounds is explored. The goal is to establish a structure–activity
relationship (▶ Chap. 18, “Quantitative Structure–Activity Relationships”) and to
improve the pharmacological and physicochemical properties (▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”). Along the
way, lead structures (so-called “leads”) are discovered. Then in the last phase the
lead optimization takes place through detailed biological testing, through which
a drug candidate is selected for clinical testing (▶ Chap. 8, “Optimization of Lead
Structures”). How can we find appropriate hits from the enormous amount of test
candidates that have the potential to be developed into a medicine? The question is
answered by screening for biological effects.
The prerequisite for a large-scale screening was the development of in vitro test
systems as a surrogate for animal experiments. The first were carried out on
Important target proteins for drug development are proteases and esterases, which
are enzymes that cleave peptide and ester bonds (▶ Chaps. 23, “Inhibitors of
Hydrolases with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”;
▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”). How can their enzymatic
activity be visualized? One prepares synthetic substrates that are similar to
the natural substrate. They carry, however, a para-nitroanilide or a para-nitrophenolate
group coupled by a peptide or ester bond (Fig. 7.1). When the
enzyme cleaves this substrate, yellow nitrophenolate or nitroanilide is released,
and the absorption properties of the released anion are measurably changed. This
is observed spectroscopically. If, during screening, a compound acts as an
inhibitor, the enzymatic cleavage of the synthetic substrate is more or less
suppressed, and the yellow color is diminished. In this way the inhibitory potency
of test substances can be determined (Fig. 7.1).
Fig. 7.1 A p-nitrophenolate or a p-nitroanilide group is added to the terminus of a natural protease
or esterase substrate. The enzyme cleaves the p-nitrophenolate or p-nitroanilide, which becomes
visible as a yellow-colored mesomerically stabilized anion (absorption maximum at 405 nm). If
a competitive inhibitor is added along with the substrate to the enzyme, the cleavage reaction rate
is suppressed depending on the binding strength. This is apparent by the more or less strong yellow
color of the solution and can be quantitatively measured.
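The colorimetric read-out of Fig. 7.1 can be converted into a number with a few lines of arithmetic. The following is only a minimal sketch in Python. It assumes that the absorbance at 405 nm is proportional to the amount of released product, and that each plate carries an uninhibited control well and a blank well without enzyme. The function name and the example readings are hypothetical and not taken from the text.

```python
def percent_inhibition(a_sample, a_control, a_blank):
    """Percent inhibition from absorbance readings at 405 nm.

    a_sample  : well with enzyme, substrate, and test compound
    a_control : well with enzyme and substrate only (full activity)
    a_blank   : well with substrate only (no enzyme, background)
    """
    window = a_control - a_blank
    if window <= 0:
        raise ValueError("control must absorb more strongly than the blank")
    # Fraction of the enzymatic activity that remains in the sample well.
    remaining_activity = (a_sample - a_blank) / window
    return 100.0 * (1.0 - remaining_activity)


# Hypothetical readings: the weaker the yellow color, the stronger the inhibition.
readings = {"compound A": 0.25, "compound B": 0.85}
for name, a405 in readings.items():
    value = percent_inhibition(a405, a_control=0.90, a_blank=0.05)
    print(f"{name}: {value:.1f} % inhibition")
```

Repeating the measurement at several inhibitor concentrations would then give the dose-response curve from which the inhibitory potency is derived.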
which can appear as though the protein is well inhibited. The addition of detergents
can reverse this effect.
By using a sophisticated robot system, 100,000 assays a day can be carried
out. This leads to an enormous flood of data to be evaluated. The reduced test
volume has the advantage that much less material is consumed. Furthermore, the
measurements can be carried out quickly. At the same time the sample manipu-
lation has become ever more difficult. One only has to consider the evaporation
of such small amounts of solution, the enormously increasing logistics of
comprehending so much data in parallel, or the reproducibility of the results,
and the necessary sensitivity to measure weak signals with certainty to appreciate
the difficulty.
In order to improve this last aspect, ever more sensitive detection procedures are
used. Fluorescence measuring techniques are particularly sensitive. In the sim-
plest case, a fluorescing substrate such as coumarin (▶ Sect. 14.6) is incorporated in
the place of para-nitroanilide. The protein–ligand binding can also be followed by
fluorescence anisotropy (or polarization). A known ligand is coupled to
a fluorophore and excited with polarized light. The emitted fluorescence is in this
case also polarized. During the time in which the excited molecule can diffuse freely in
solution, the extent of the induced polarization decreases. Because a small molecule
can diffuse much faster than a big one, the polarization signal of the free ligand decays much faster
than that of a ligand bound to a large protein. This difference in diffusion behavior
between the free and the protein-bound ligand can be measured.
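How such a polarization measurement is evaluated can be illustrated with the standard definition of the steady-state anisotropy, which compares the emission intensities parallel and perpendicular to the polarization of the exciting light. The sketch below is an assumption-laden illustration in Python, not a procedure from the text. The instrument correction factor `g_factor`, the helper `fraction_bound`, and the requirement that free and bound ligand fluoresce equally brightly are all assumptions.

```python
def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Steady-state fluorescence anisotropy r from two polarized intensities."""
    i_perp = g_factor * i_perpendicular  # instrument-corrected perpendicular channel
    return (i_parallel - i_perp) / (i_parallel + 2.0 * i_perp)


def fraction_bound(r_observed, r_free, r_bound):
    """Fraction of labeled ligand bound to the protein.

    Assumes that the free and the bound ligand are equally bright, so that
    the observed anisotropy is a simple weighted average of both states.
    """
    return (r_observed - r_free) / (r_bound - r_free)


# Hypothetical intensities: the fast-moving free ligand is almost depolarized,
# the protein-bound ligand retains much of its polarization.
r_free = anisotropy(1.10, 1.00)
r_bound = anisotropy(2.00, 1.00)
r_mixture = anisotropy(1.50, 1.00)
print(r_free, r_bound, fraction_bound(r_mixture, r_free, r_bound))
```

A test compound that displaces the labeled ligand from the protein lowers the observed anisotropy toward the value of the free ligand.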
Even better sensitivity can be achieved with so-called FRET measuring
techniques (fluorescence resonance energy transfer). A resonance energy transfer
can occur between donor and acceptor fluorophores with overlapping donor emission
and acceptor absorption spectra if both
are separated by no more than 50 Å. If, for example, a phosphatase assay is desired,
a phosphorylated peptide substrate is coupled to a covalently bound donor
fluorophore. The substrate is added together with the test compound. Depending on how
potent the inhibiting test compound is, the enzyme’s activity is reduced, and less
substrate is cleaved. Then an antibody is added that binds to the phosphorylated
substrate. The antibody is coupled to a fluorescence acceptor, the absorption
maximum of which overlaps with the emission spectrum of the donor fluorophore.
If a fair amount of phosphorylated substrate is still present, that is, the test
compound is a potent inhibitor, the spatial proximity of the donor and acceptor
leads to a strong FRET signal. This can be quantitatively measured.
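The reason why FRET reports so sharply on spatial proximity is the steep distance dependence of the transfer efficiency, which is usually described by the Förster relation. The following Python sketch evaluates this relation. The Förster radius of 50 Å is an assumed illustrative value chosen to match the distance quoted above; real donor and acceptor pairs have their own characteristic radii.

```python
def fret_efficiency(distance, forster_radius=50.0):
    """Transfer efficiency E = 1 / (1 + (r / R0)**6).

    distance and forster_radius must be given in the same unit (here Å).
    The default R0 of 50 Å is an assumption for illustration only.
    """
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# The efficiency drops from nearly complete to negligible within a few
# tens of Ångström around the Förster radius.
for r in (25.0, 50.0, 75.0, 100.0):
    print(f"{r:5.0f} Å  E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power dependence, only acceptor-labeled antibody that is actually bound to the donor-labeled substrate contributes appreciably to the signal.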
In the meantime, progress in assay miniaturization allows the detection of single
molecules. This is possible by using fluorescence correlation spectroscopy (FCS).
A confocal laser microscope irradiates approximately a femtoliter of test solution.
If a single fluorophore diffuses through the volume of interest, it causes a time-
resolved fluctuation in the fluorescence signal. An exact analysis of these signals
delivers information about the concentration and diffusion constants. The diffusion
velocity, on the other hand, depends on whether the fluorescence-marker-labeled
substance is bound to a protein or not. If the proteins as well as the ligands are
tagged with different markers, the association and dissociation can even be
followed.
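That single molecules are observed in such a small volume follows from simple arithmetic. The sketch below, written in Python, estimates the average number of fluorophores in the observation volume. The volume of one femtoliter is taken from the text; the concentrations are hypothetical values chosen for illustration.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def mean_molecules(concentration_molar, volume_liters=1e-15):
    """Average number of molecules in the observation volume.

    The default volume of 1e-15 L corresponds to one femtoliter.
    """
    return concentration_molar * volume_liters * AVOGADRO


# Hypothetical concentrations of the labeled species.
for conc in (1e-10, 1e-9, 1e-8):
    print(f"{conc:.0e} M -> {mean_molecules(conc):.2f} molecules on average")
```

At nanomolar concentrations the volume therefore contains about one molecule on average, so every fluorophore that enters or leaves produces a visible fluctuation in the signal.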
The binding of a ligand to a protein says nothing about the concomitant function or
change in function. Often it is easy to relate the observed inhibition in an enzyme
assay to a function. The correlation is less obvious with receptors and ion channels
(▶ Chaps. 28, “Agonists and Antagonists of Nuclear Receptors”; ▶ 29, “Agonists
and Antagonists of Membrane-Bound Receptors”; ▶ 30, “Ligands for Channels,
Pores, and Transporters”). If the biochemical pathways and cell cycle regulation
are considered, it becomes even more complex to assign function for enzymes. This
correlation is not so easily reproduced in a test tube. Therefore assays must also be
developed to study function that allow the response of an entire cell to be measured
upon ligand binding. It is possible to culture cells from many different tissues, which
then allows the study of tissue-specific receptors.
Typically the activity of ion channels can be investigated by using binding tests or
radioactive assays. The so-called patch–clamp technique allows the influence of
a drug candidate to be even better characterized. An electrode is attached to the surface
of a cell, and a voltage or current is applied. In this way the opening or closing of single
channels can be registered, particularly when a test molecule is added. This technique
certainly does not reach the throughput of high-throughput techniques. It is
better suited to elucidating the function of hits from a prescreening. Fluorescence methods
are more popular for the first step. As an example, Ca2+-channel function can be
assessed by measuring an increase in intracellular calcium levels by using a dye that
fluoresces in the presence of calcium ions.
Other tests employ the coupling to a reporter gene. Receptor stimulation
initiates a signaling cascade that, for some receptors, leads to the transcription of
genes that are controlled by the relevant promoters (▶ Sect. 28.1). If the
sequence of the relevant gene is replaced with that of a reporter, such as β-galactosidase,
luciferase, or green fluorescent protein (GFP), then these proteins are
produced by the cell instead. This can subsequently be observed as an easily
detectable signal (Fig. 7.2). For example, β-galactosidase cleaves
X-gal to release a blue dye, luciferase produces an ATP-dependent chemiluminescence,
and the green fluorescent protein is detectable because of its own intrinsic
fluorescence.
Fig. 7.2 Genes are controlled by promoters. Promoter-initiated gene activation leads to the
synthesis of the relevant protein. By using green fluorescent protein (GFP), an easily observed
assay can be constructed based on this principle. For this the gene promoter that is activated by
agonist binding is coupled to the GFP gene. Activation of the promoter then delivers not the
original gene product, but rather GFP. The presence of GFP is easily observed
because of its fluorescence upon excitation with ultraviolet light.
disease. Aside from macroscopic changes in the body form, changes in the gene
expression pattern can also be analyzed (▶ Sect. 12.9). Are mutations in proteins
apparent? Certainly the worm does not have the same metabolic pathways as we do.
Even its disease models only partially represent the pathophysiology that is seen in
human disease. Nonetheless the direct testing of compounds on the pinworm seems
to afford a new perspective for screening substance libraries. As an alternative, the
fruit fly (Drosophila melanogaster) or the zebra fish (Danio rerio) are also available
as test organisms. They help to test the validity of a therapeutic approach early in
a program.
Fig. 7.3 The spatial structure of a protein is the starting point for virtual screening (a). The binding
pocket is explored with a variety of different probe atoms, for instance, for hydrogen bond acceptors
or donors (b). Regions that are particularly favorable for such interacting groups are highlighted on
the computer graphics. If the “hot spots” in these areas are summarized, a spatial pattern of properties
that a potential ligand should have becomes apparent (c). This pattern is called a “pharmacophore” and
serves as the search criterion for a database retrieval (d). Potential ligands from a large database are
filtered and energetically evaluated by docking (e). The found hits are either commercially available
or synthesized in the laboratory (f). Next biological testing takes place (g), and if the binding is
successful, the lead structure is crystallized with the protein. The subsequent structural determination
(h) serves as a starting point for further design cycles.
In case a hit from the latter group is found, the compound can be subsequently
synthesized. The search is divided into multiple filtering steps that become increas-
ingly stringent and sophisticated with successive reduction of the search quantity.
With the help of fast docking programs (▶ Sect. 20.8), molecules are fitted into the
binding pocket and a binding geometry is generated, from which the expected
binding affinity can be estimated. This step is the decisive one, but unfortunately it
is also the most difficult (▶ Sect. 20.9). In ▶ Chap. 21, “A Case Study: Structure-
Based Inhibitor Design for tRNA-Guanine Transglycosylase”, examples are
presented that were found by virtual screening.
The evaluation of the generated binding geometries is accomplished with suffi-
cient accuracy in about 70% of cases nowadays. An improvement in predictive
power requires that we understand the ligand–protein recognition process better
(▶ Chap. 4, “Protein–Ligand Interactions as the Basis for Drug Action”). The role
of water in the binding, the induced steric and dielectric adaptation, the plastic
behavior and residual mobility of proteins and bound ligands, and the dynamic
changes during complex formation are still poorly understood. The composition of
the databases themselves plays a decisive role in the search’s success. Enlarging the
database alone is not enough. The enrichment of the compounds that could fulfill
the requirements is crucial. Screening is often compared to the search for a needle
in a haystack. When looking for such a needle, it is not helpful to simply double the
size of the haystack! The haystack must be spiked with more promising needles. To
achieve this, all available knowledge about the structure, function, and dynamic
behavior of the target protein must be used to define the database search. Compar-
isons between proteins and protein binding pockets, especially among members of
the same protein family, can offer decisive information (▶ Sects. 20.3, ▶ 20.4,
▶ 20.5, ▶ 20.6). In principle, all of the data that are needed about the composition
of a suitable compound library for a virtual screening are already intrinsically coded
in the structure and geometric interaction properties of the binding pocket. It is only
a question of applying it correctly. Another decisive criterion for a hit is an adequate
pharmacokinetic profile so that satisfactory bioavailability can be achieved
(▶ Chap. 19, “From In Vitro to In Vivo: Optimization of ADME and Toxicology
Properties”).
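The haystack argument can be made quantitative. A measure that is commonly used for this purpose, although it is not named in the text, is the enrichment factor: the hit rate in a preselected subset divided by the hit rate in the entire library. The Python sketch below uses purely hypothetical library sizes and numbers of actives.

```python
def hit_rate(n_actives, n_compounds):
    """Fraction of active compounds in a collection."""
    return n_actives / n_compounds


def enrichment_factor(n_actives_selected, n_selected, n_actives_total, n_total):
    """Hit rate in the selected subset relative to the whole library."""
    return hit_rate(n_actives_selected, n_selected) / hit_rate(n_actives_total, n_total)


# Hypothetical library: 100 "needles" in a haystack of one million compounds.
base = hit_rate(100, 1_000_000)

# Doubling the haystack without adding needles halves the hit rate.
doubled = hit_rate(100, 2_000_000)

# A knowledge-based filter that keeps 1,000 compounds, 10 of them active,
# enriches the actives considerably over random picking.
ef = enrichment_factor(10, 1_000, 100, 1_000_000)
print(base, doubled, ef)
```

An enrichment factor of 1 corresponds to random selection; the more knowledge about the target goes into composing the library, the further above 1 this value should lie.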
Surface plasmon resonance techniques are being increasingly used to screen for
new lead structures. For this a target molecule is anchored onto the gold-coated
surface of a sensor chip. The underside of a glass carrier is irradiated with light
(Fig. 7.4). Changes in the refractive index, which are measured as a shift in the resonance angle of the
totally internally reflected light, are a measure of the mass change on the sensor surface.
If a compound binds, the resulting change in mass on the gold surface can be
registered. Because the technique is fast and time resolved, other kinetic parameters
such as the association or dissociation rate constants of the binding event can be
measured in addition to the stoichiometry. One problem associated with screening
Fig. 7.4 The principle of surface plasmon resonance (SPR). The method registers changes in the
refractive index on the surface of a sensor chip (green). The extent of the changes on the gold surface
that are caused by the binding of the substrate molecule (yellow) onto an anchored receptor (red) leads
to a shift in the resonance angle of the reflected light (I and II). That way, not only the binding affinity
but also the kinetic association (kon) and dissociation (koff) parameters are measured.
in microtiter plates is the huge amount of time that is needed to load the plate with
compounds. One way around this bottleneck is to apply the entire compound library
to a sensor chip in a microarray format by using spraying techniques. This means
now all the low-molecular weight ligands are anchored on the chip. If a test receptor
protein is added to such a chip, a mass difference is detected where the protein
binds. Because of the spatial resolution of the chip, it can easily be determined
which library compound is responsible for the interaction. The disadvantage of the
method is that the test compounds must be attached with a chemical anchor that
allows them to be immobilized on the chip surface. Surface plasmon resonance has
meanwhile achieved a sensitivity that allows the detection of even very small test
compounds with a mass as small as 100 Da. Therefore the approach can be
reversed: Now the protein is immobilized at the surface and ligand binding from
solution can be recorded.
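How the association and dissociation rate constants shape a sensorgram can be sketched with the simplest kinetic model, a reversible 1:1 binding event. The following Python example is an illustration under that assumption only; effects such as mass transport or heterogeneous surfaces are ignored. The function names and all numerical values are hypothetical.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant Kd = koff / kon (in mol/L)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Signal during ligand injection for a reversible 1:1 binding model."""
    k_obs = k_on * conc + k_off                   # observed rate constant
    r_eq = r_max * conc / (conc + k_off / k_on)   # plateau at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_off):
    """Signal decay after the ligand is washed out with buffer."""
    return r_start * math.exp(-k_off * t)


# Hypothetical parameters: kon = 1e5 1/(M s), koff = 1e-3 1/s, Kd = 10 nM.
k_on, k_off, r_max = 1e5, 1e-3, 100.0
print(dissociation_constant(k_on, k_off))
for t in (0, 100, 500, 2000):
    print(t, round(association_response(t, 1e-8, k_on, k_off, r_max), 2))
```

Two ligands with the same affinity can thus differ markedly in how quickly they bind and how long they stay bound, which a pure equilibrium assay would not reveal.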
In Sect. 7.1, the concept of “ligand efficiency” was introduced. To take the latter
aspect into consideration, test libraries are being increasingly supplied with com-
pounds that have a molecular weight of less than 250 Da. In the meantime the term
chemical “fragment” has become popular for these search candidates. The term is
a bit unfortunate because the molecules are actually “complete” small molecules
and not, as the term might suggest, merely a “fragment”, that is, an
additional building block to be attached to a lead structure.
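Why small fragments are attractive despite their weak binding can be illustrated numerically. The sketch below assumes the commonly used definition of ligand efficiency as the free energy of binding per non-hydrogen atom; the definition given in Sect. 7.1 may differ in detail. The dissociation constants and atom counts are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Free energy of binding in kJ/mol from Kd, relative to a 1 M standard state."""
    return R_KJ * temperature * math.log(kd_molar)


def ligand_efficiency(kd_molar, n_heavy_atoms, temperature=298.15):
    """Binding free energy contributed per non-hydrogen atom (kJ/mol per atom)."""
    return -binding_free_energy(kd_molar, temperature) / n_heavy_atoms


# Hypothetical comparison: a millimolar fragment with 12 heavy atoms
# and a 10 nM screening hit with 35 heavy atoms.
print(ligand_efficiency(1e-3, 12))
print(ligand_efficiency(1e-8, 35))
```

In this example the weakly binding fragment contributes more binding energy per atom than the much more potent but larger molecule, which leaves room for structural optimization.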
Proteins denature when they are heated. The “melting temperature” is defined
as the temperature at which the unfolding process (▶ Sect. 14.2) occurs. This temperature can be
measured very sensitively with a thermal sensor. The binding of a ligand to a protein
Fig. 7.5 In isothermal titration calorimetry, a solution of a ligand is added dropwise to a solution
of a protein. The binding to the protein leads to an exothermic or an endothermic reaction.
The heat that evolves upon the addition of each drop is the area under the single signal peaks.
The total integral of all signal peaks is the binding enthalpy $\Delta H$. With increasing amount of
ligand the protein becomes saturated so that the signal intensity of the heat signal decreases.
The binding constant (dissociation constant) can be derived from the shape of the curve and
the free energy $\Delta G$ can be obtained from the relationship $\Delta G = RT \ln K_d$. The stoichiometry
of the reaction is simultaneously obtained. The entropy is calculated by using the equation:
$\Delta G = \Delta H - T\Delta S$.
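The two relationships quoted in the caption of Fig. 7.5 are sufficient to derive a complete thermodynamic profile from a single titration. The Python sketch below shows the arithmetic. The dissociation constant and the binding enthalpy are hypothetical example values, and the temperature of 298.15 K is an assumption.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def itc_profile(kd_molar, delta_h_kj, temperature=298.15):
    """Return (delta_g, minus_t_delta_s, delta_s) for a measured Kd and delta H.

    delta_g and minus_t_delta_s are given in kJ/mol, delta_s in J/(mol K).
    """
    # Free energy from the dissociation constant: dG = RT ln Kd
    delta_g = R * temperature * math.log(kd_molar) / 1000.0
    # Entropic term from dG = dH - T dS, rearranged to -T dS = dG - dH
    minus_t_delta_s = delta_g - delta_h_kj
    delta_s = (delta_h_kj - delta_g) * 1000.0 / temperature
    return delta_g, minus_t_delta_s, delta_s


# Hypothetical titration result: Kd = 1 µM, delta H = -40 kJ/mol.
dg, mts, ds = itc_profile(1e-6, -40.0)
print(f"dG = {dg:.1f} kJ/mol, -TdS = {mts:.1f} kJ/mol, dS = {ds:.1f} J/(mol K)")
```

In this example the binding is driven by enthalpy and partly compensated by an unfavorable entropic term.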
Fig. 7.6 To determine the saturation transfer difference (STD) with NMR spectroscopy, a library
of test ligands is added to a target protein (ellipse). Potential binders reside for
a finite time span bound to the protein. If the nuclear spin of one type of nucleus in the protein is
selectively saturated (red) by using a suitable resonance frequency (RF), the protein magnetization
can be transferred (nuclear Overhauser effect, see ▶ Sect. 13.7) to the ligand that was bound in the
meantime. These ligands become apparent in that their spectrum is altered even though they
have already dissociated from the protein. If the difference between the spectra in the presence of the
saturated and unsaturated protein is displayed, it is possible to determine which ligands were
transiently bound to the protein. Many variations and sophisticated experimental protocols have
been developed for the principle of magnetization transfer.
7.9 Crystallographic Screening for Small Molecular Fragments
Crystal structure analysis delivers the most exact spatial position of a molecule in
the binding pocket of a protein. Even the geometry of small, very weakly binding
molecules is easily recognized. In structures that have a resolution better than
2–2.5 Å (▶ Sect. 13.5), water molecules are usually still recognizable as discrete
density maxima. Often, they indicate sites in the binding pocket that can be
equivalently accommodated by polar functional groups of ligands (Fig. 7.8). In
the early 1990s, Dagmar Ringe in the research group of Greg Petsko exposed
protein crystals intentionally to solvent molecules to allow the solvent to diffuse
into the crystals (▶ Sect. 20.2). The solvent molecules can act as probes in that they
populate binding regions of the protein pockets. As an example, the areas where
[Fig. 7.7, panels a–e: (a) 7.1 bound at the Zn2+ ion of stromelysin, Kd = 17 mM; (b) 7.1 (Kd = 17 mM) together with 7.2 (Kd = 0.02 mM); (c) linked compound 7.3, IC50 = 25 nM; (d, e) views of the binding site with Zn2+, the S1′ pocket, His205, His211, and Val163]
Fig. 7.7 In the “SAR by NMR” method, ligands with weak affinity to a protein, in this case
stromelysin, are sought from a large complex mixture. 15N-labeled protein is used and so-called
1H-15N HSQC spectra are measured. If a ligand such as acetohydroxamic acid 7.1 becomes
apparent through a shift in the resonance of specific amino acids that protrude into the binding
pocket, the binding geometry can be deduced (a, d). Later the binding site is saturated with these
ligands. Further NMR measurements are carried out to identify ligands for neighboring binding
positions. These are revealed by the shift in the resonances of neighboring amino acids. That is how
4-cyano-4′-hydroxybiphenyl 7.2 was discovered (b, d). A chemical coupling of both hits 7.1 and
7.2 with a –CH2CH2O– linker produced 7.3, which is a nanomolar inhibitor of the protease
stromelysin (c, e).
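Why the coupling of two weak binders can produce such a large gain in potency becomes plausible with a rough additivity estimate: if the binding free energies of the two fragments simply add, their dissociation constants multiply. The Python sketch below applies this idea to the values shown in Fig. 7.7 (17 mM for 7.1 and 0.02 mM for 7.2). Strict additivity relative to a 1 M standard state is an assumption that neglects the contributions of the linker, strain, and entropy. The value reported for 7.3 is an IC50 and not a Kd, so only the order of magnitude can be compared.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_g(kd_molar, temperature=298.15):
    """Binding free energy in kJ/mol, relative to a 1 M standard state."""
    return R_KJ * temperature * math.log(kd_molar)


def additive_linked_kd(kd_a, kd_b):
    """Kd expected for the linked molecule if both free energies simply add.

    Adding free energies corresponds to multiplying the dissociation
    constants (each expressed in mol/L relative to the 1 M standard state).
    """
    return kd_a * kd_b


kd_71 = 17e-3    # acetohydroxamic acid 7.1, value from Fig. 7.7
kd_72 = 0.02e-3  # 4-cyano-4'-hydroxybiphenyl 7.2, value from Fig. 7.7

estimate = additive_linked_kd(kd_71, kd_72)
print(f"dG(7.1) = {delta_g(kd_71):.1f} kJ/mol, dG(7.2) = {delta_g(kd_72):.1f} kJ/mol")
print(f"additivity estimate for the linked compound: {estimate * 1e9:.0f} nM")
```

The estimate lies in the submicromolar range, which is consistent in order of magnitude with the nanomolar inhibition reported for 7.3.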
[Figure residue: chemical structure of benzylsuccinic acid]
a binding pocket. A creative scientist will directly exploit their position for the design
of new drug candidates. From there, it was obvious to use crystal structure analysis as
a method to screen small molecules or “fragments” (MW <250 Da).
Even today a crystal structure determination is fairly laborious. All the same, it
can be largely automated so that a few hundred molecules can be processed. In
addition, the tendency of small molecules to diffuse into mature protein crystals can
also be used (so-called “soaking”; ▶ Sect. 13.9). If a “cocktail” of multiple test
substances is used, the screening can be accelerated. A protein crystal can be
exposed to up to 10 compounds at once. The composition of the cocktails is
designed so that a mixture of different shapes (long and stretched, angular, spherical,
etc.) is present. This makes it easier to distinguish them later in the electron
density (see ▶ Sect. 12.5). To optimize the effort-to-yield ratio for the crystallo-
graphic screening, often a different screening method is carried out first to pre-filter
possible hits. Only compounds that have been identified as hits in the first screening
are used in the subsequent crystallographic screening. However, only a few tech-
niques that have been described in the previous section are really suitable to find
a small, weakly binding candidate from a fragment library. Frequently this concerns
only millimolar-binding candidates.
The hits from the crystallographic fragment screening can be further developed
(▶ Sect. 20.7). One possibility is to probe the different regions of the binding
pocket and then connect the pieces with a linker, analogously to what was
described in Sect. 7.6 in the “SAR by NMR” method. In another, usually more
successful variation, the fragment hits are chemically elaborated upon. For this
approach additional moieties are added on the basis of the crystal structure. In this
way the original hit, which serves as a seed, can be enlarged to bind more strongly to
the protein.
Ligands bind with very poor affinity to flat pockets that are open to the surrounding
solvent. Therefore, it is extremely difficult to detect their binding or obtain
a crystal structure with a ligand bound in such an area. James Wells and his
colleagues at the Sunesis company in San Francisco developed the idea to tether
ligands for this type of binding. From a chemical point of view, this means that
a reaction is carried out with the exposed thiol of a cysteine residue on the protein’s
surface. Such a cysteine must be available in the native protein, or it is appropriately
introduced by mutagenesis (▶ Sect. 12.2). Under suitable reaction conditions, the
ligand is anchored with a disulfide bond, which is formed through the thiol group
of the exposed cysteine (Fig. 7.9). Only those test candidates from the compound
library will react that are able to form an interaction with the surface in the vicinity
of the cysteine thiol group. For all intents and purposes, they explore the surround-
ing region, react with the cysteine, and remain coupled to the surface by the
disulfide bridge. Successfully formed complexes are then detected by mass
spectrometry. James Wells and Robert Stroud chose thymidylate synthase as their
Fig. 7.9 The thiol group of the exposed cysteine is used as an anchor group for the formation of
disulfide bonds with ligand candidates from a compound library. Only those ligands react that
are also able to interact with the surface region in the vicinity of the cysteine thiol. A crystal
structure was determined from just such a covalently linked complex (Fig. 7.12). After optimization
of the initially discovered hit, the disulfide anchor can be discarded and a non-covalent
inhibitor can be developed.
first test example. This enzyme plays an important role in the de novo synthesis of
thymidine, an essential building block for DNA. Cells with a high division rate
especially need this building block, so that inhibitors of this enzyme might represent
potent anti-infective agents or antitumor compounds (▶ Sect. 27.2).
Thymidylate synthase has a cysteine residue in position 146, in the vicinity of
the catalytic site. From a library of 1200 disulfides, compounds 7.4–7.7 proved to be
binders whereas the very similar derivatives 7.8–7.11 were not selected (Fig. 7.10).
Accordingly, the phenylsulfonamide together with the proline moiety seemed to be
essential for binding. Next the disulfide anchor was removed, and the binding
constant for N-tosyl-D-proline 7.12 was measured to be 1.1 mM (Fig. 7.11). To
further test the concept, Cys146 was exchanged for a serine (Fig. 7.12). When no
binding was apparent with this mutant, the neighboring His147 was mutated to
a cysteine, but this mutant could not fish out the N-tosylproline moiety either. In
contrast, the position-143 mutant was successful (Fig. 7.12). In that case a leucine
was exchanged for a cysteine. The subsequently determined crystal structure
showed that the N-tosylprolyl moiety was bound almost identically in both covalently
anchored complexes, just as it is without an S–S anchor (Fig. 7.12). This
is convincing proof that the covalent coupling is not responsible for the binding
geometry. In fact, the technique allows small, initially weakly binding ligands to be
fished out of a large library. From the original millimolar hit 7.12, the side chain of
the natural cofactor methylenetetrahydrofolic acid could be transferred to give 7.13,
which was developed into a nanomolar inhibitor 7.15 in two steps.
The method of “tethering” can be fairly generally applied. It has especially
achieved success in the search for ligands that disrupt the formation of protein–
protein surface contacts (▶ Sect. 10.6). A great advantage of the technique is that
it is not necessary to develop an additional biochemical binding assay. Weakly
Fig. 7.10 From a library of 1,200 disulfides, the compounds on the left side 7.4–7.7 proved to be
binders, whereas the structurally similar derivatives 7.8–7.11 (right) did not bind
to the protein.
binding ligands are covalently “tethered” and cannot be washed away as happens in
the case of simple complex formation. Further, the covalently bound chemical
probes allow the adaptive capacity of the surface region to be explored.
7.11 Synopsis
• Large substance libraries are screened for biological effects to filter out active
molecules and assess their value for a given indication.
• Three phases are distinguished: a broad automatic introductory screening for
hits, a more detailed screening of chemical analogues around a hit to establish
the first structure–activity relationship, and a lead optimization to find candidates
for clinical testing.
• A prerequisite for high-throughput screening was the development of in vitro
test systems using pure proteins produced by gene technology along with the
entire arsenal of biochemical methods in the test tube so that the function of
single-gene products can be recorded.
• As a disadvantage, high-throughput screening does not assess the entire effect
spectrum and ignores effects such as transport, distribution, metabolism, and
excretion.
• Screening libraries are frequently assembled from molecules from other drug
development projects; as such, they are rather inefficient with regard to their
molecular size and their modest screening hit activity in the micromolar range.
Small substances with high ligand efficiency and sufficient space for structural
optimization are particularly promising.
Fig. 7.11 By transferring a side chain from the natural cofactor methylenetetrahydrofolic acid
(to give 7.13), N-tosyl-D-proline 7.12, a millimolar inhibitor, could be transformed into the
nanomolar inhibitor 7.15 in two steps.
• Enzymatic function and its inhibition can be recorded by the production of
chromophoric reaction products.
• Radioactively labeled compounds or enzyme-linked immunosorbent assays are
versatile techniques to record protein function on the molecular level.
• Progress in assay miniaturization calls for sophisticated robotic systems, ever-
improving sensitivity of the read-out, including fluorescence measuring tech-
niques, and reliable logistics to handle the enormous data flow.
• Aggregate formation of hydrophobic test compounds can exert significant influ-
ence on the assay read-out or even cause false positive or negative hits.
• Testing on cell-based assays is performed to study changes in cellular or
organism-related function beyond pure binding of a test compound to a given
protein target.
Fig. 7.12 Superposition of crystal structures of the enzyme thymidylate synthase with two
tethered ligands, one bound to Cys143 (C atoms of ligand 7.4 are green) and the other to
Cys146 (C atoms of ligand 7.4 are violet), both of which are N-tosyl-D-proline derivatives
covalently anchored through S–S bridges. Upon cleavage of the disulfide anchor, the
free N-tosyl-D-proline (C atoms are gray, 7.12) proved to be a ligand with an affinity of 1.1 mM. Its
binding geometry is very similar to that of both covalently anchored derivatives.
• Primary animal testing in vertebrates has been abolished today for ethical
reasons and is increasingly being replaced by whole-animal screening
using nematodes, the simplest multicellular organisms, to record synergistic
and side effects.
• As a complementary and alternative method, virtual computer screening has
been developed to screen large compound libraries by docking ligand candidates
into the known spatial structure of a target protein.
• Binding events are recorded by biophysical methods such as surface plasmon
resonance, thermal stability shifting, mass spectrometry, or microcalorimetry.
They are used to detect ligands as potential binders.
• NMR spectroscopy can be used to detect ligand binding by magnetization
transfer. Multiple binders can be chemically linked to more strongly binding
ligands according to the SAR by NMR technique.
• Exposure of small molecular probes and fragments to protein crystals allows for
the structural characterization of the binding modes of weakly binding fragments
as a versatile starting point to lead optimization.
• Small-molecule fragments tethered to a protein through covalent attachment to
the exposed thiol group of a cysteine residue allow the exploration of the binding
properties of flat, solvent-exposed surface depressions and serve as a starting
point to develop antagonists to perturb the protein–protein interface in complex
formation.
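The synopsis mentions ligand efficiency without defining it. The following minimal sketch uses one widely used definition, which is an assumption here and not taken from this chapter: the binding free energy divided by the number of non-hydrogen atoms, with ΔG = RT ln Kd. As an illustration it takes the 1.1 mM affinity quoted for N-tosyl-D-proline 7.12, treats it as a dissociation constant, and uses a heavy-atom count of 18 derived from the formula C12H15NO4S. The function names and the temperature of 298.15 K are likewise assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Return dG = R*T*ln(Kd) in kcal/mol for a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temperature=298.15):
    """Return LE = -dG / N_heavy in kcal/mol per non-hydrogen atom."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return -binding_free_energy(kd_molar, temperature) / heavy_atoms


if __name__ == "__main__":
    # Assumptions: 1.1 mM is treated as Kd; 18 heavy atoms for C12H15NO4S.
    le = ligand_efficiency(1.1e-3, 18)
    print(f"LE(7.12) = {le:.2f} kcal/mol per heavy atom")
```

Under these assumptions even a millimolar fragment reaches roughly 0.2 kcal/mol per heavy atom, which illustrates why small, weakly binding molecules can still be attractive starting points.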
Bibliography
General Literature
Blundell TL, Jhoti H, Abell C (2002) High-throughput crystallography for lead discovery in drug
design. Nat Rev Drug Discov 1:45–54
Hajduk PJ, Greer J (2007) A decade of fragment-based drug design: strategic advances and lessons
learned. Nat Rev Drug Discov 6:211–219
Jahnke W, Erlanson DA (2006) Fragment-based approaches in drug discovery. In: Mannhold R,
Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry, vol 34. Wiley-
VCH, Weinheim
Jones AK, Buckingham SD, Sattelle DB (2005) Chemistry-to-gene screens in Caenorhabditis
elegans. Nat Rev Drug Discov 4:321–330
Klebe G (2006) Virtual ligand screening: strategies, perspectives and limitations. Drug Discov
Today 11:580–592
Löfås S (2004) Optimizing the hit-to-lead process using SPR analysis. Assay Drug Dev Technol
2:407–415
Siegel MM (2002) Early discovery drug screening using mass spectrometry. Curr Topics Med
Chem 2:13–33
Sotriffer C (2010) Virtual screening. In: Mannhold R, Kubinyi H, Folkers G (eds) Methods and
principles in medicinal chemistry, vol 48. Wiley-VCH, Weinheim
Vogtherr M, Fiebig K (2003) NMR-based screening methods for lead discovery. In: Hillisch A,
Hilgenfeld R (eds) Modern methods of drug discovery. Birkhäuser Verlag, Boston, pp S183–
S120. ISBN 376436081X
Special Literature
Hajduk PJ, Sheppard G, Nettesheim DG, Olejniczak ET, Shuker SB, Meadows RP, Steinman DH,
Carrera GM Jr, Marcotte PA, Severin J, Walter K, Smith H, Gubbins E, Simmer R, Holzman
TF, Morgan DW, Davidsen SK, Summers JB, Fesik SW (1997) Discovery of potent nonpeptide
inhibitors of stromelysin using SAR by NMR. J Am Chem Soc 119:5818–5827
Erlanson DA, Braisted AC, Raphael DR, Randal M, Stroud RM, Gordon EM, Wells JA
(2000) Site-directed ligand discovery. Proc Natl Acad Sci USA 97:9367–9372
8 Optimization of Lead Structures
A lead structure is the starting point on the way to a drug. The potency, specificity,
and duration of effect must be optimized, and the side effects and toxicity must be
minimized in a usually elaborate, iterative process. Every change in the chemical
structure modulates the 3D structure of the molecule, its physicochemical prop-
erties, and the activity spectrum. The isosteric replacement of atoms or groups,
the introduction of hydrophobic building blocks, the dissection of rings or the
restriction of flexible molecular portions into cyclic structures, and the optimiza-
tion of the substitution pattern are all possibilities to purposefully modify a target
structure.
Creativity and luck are always important prerequisites for success in pharmaceu-
tical research. Nonetheless, there is a treasure chest of decades of accumulated
experience that can be exceedingly supportive to the rational optimization process.
The computer-aided methods can contribute to their full capability in this field in
particular. Several general considerations and approaches to lead optimization are
presented in the sections of this chapter. A discussion of the structure-based
and computer-aided optimization of lead structures is presented in ▶ Chaps. 17,
“Pharmacophore Hypotheses and Molecular Comparisons” and ▶ 20, “Protein
Modeling and Structure-Based Drug Design”; examples for its application to differ-
ent therapeutic areas are presented in ▶ Chaps. 23, “Inhibitors of Hydrolases with an
Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors
of Hydrolyzing Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; ▶ 27, “Oxidore-
ductase Inhibitors”; ▶ 28, “Agonists and Antagonists of Nuclear Receptors”; ▶ 29,
“Agonists and Antagonists of Membrane-Bound Receptors”; ▶ 30, “Ligands for
Channels, Pores, and Transporters”; ▶ 31, “Ligands for Surface Receptors”; ▶ 32,
“Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs”.
The truth is objective and absolute. But we can never be sure that we have found it. Our
knowledge is always an assumed knowledge. Our theories are hypotheses. We test for the
truth in that we exclude what is false. (Objective Knowledge, 1972)
a scheme for the variation of aromatic substituents that allows the biological activity
to be optimized in a minimum number of steps (Sect. 8.3). The application of
experimental design, simultaneously changing multiple parts of a molecule, and the
evaluation of the results by using quantitative structure–activity relationships
(▶ Chap. 18, “Quantitative Structure–Activity Relationships”) usually allows a fast
and effective optimization. In structure-based and computer-aided optimization, the
3D structure of the target protein and its complexes leads to directed structural
variations of the active substances. Here again, the aspects of total lipophilicity and
metabolism should not be neglected.
Fig. 8.1 A few possibilities for the isosteric replacement of atoms and/or groups.
Fig. 8.2 Isosteric replacement with retention, loss, and reversal of the biological activity. All
three iodine atoms of the thyroid hormone triiodothyronine (T3) 8.1 can be replaced with alkyl groups, and
compound 8.2 is still active. In the case of acetylsalicylic acid 8.3, the exchange of the –OCOCH3
for an –NHCOCH3 group led to the loss of the acylating ability and therefore a nearly complete loss of
the biological activity. The antimetabolite sulfanilamide 8.4 (R = SO2NH2) is derived from
p-aminobenzoic acid 8.4 (R = COOH), which is a critical intermediate in the bacterial dihydrofolate
synthesis; 8.4 (R = SO2NH2) is the result of the exchange of a carboxyl group for an isosteric
sulfonamide group.
For example, –COOH, an H-bond acceptor and donor, can be replaced with other
groups that have the same or modified properties, for instance, with the similarly
acidic tetrazole. Another example can be found in the exchange of a phenyl ring for
a thiophene or a furan building block (Fig. 8.1). The potential of isosteric replace-
ment is illustrated in the exchange of all three iodine atoms of triiodothyronine T3
8.1 for alkyl groups to give 3,5-dimethyl-3′-isopropylthyronine 8.2, which in turn
retains impressive affinity and agonistic activity on the thyroid hormone receptor.
In contrast to triiodothyronine, which is both iodinated and metabolized by
a deiodinase, the alkyl groups of 8.2 are no longer metabolically cleavable.
Bioisosteric replacement was and is one of the most important strategies in
pharmaceutical research. Nonetheless, surprises sometimes occur. The replacement
of an ester with an amide group in the local anesthetics (▶ Sect. 3.4) expectedly
improved the metabolic stability. In the case of acetylsalicylic acid 8.3 (Fig. 8.2)
this exchange cannot be made. An analogous exchange of the –COO– group for
a –CONH– group results in a complete activity loss because the amide can no
longer acylate the cyclooxygenase enzyme (▶ Sect. 27.9). In the case of
p-aminobenzoic acid (R = –COOH, Fig. 8.2) the exchange of a carboxyl group
for a sulfonamide group gives sulfanilamide 8.4 (R = –SO2NH2), which is an
antimetabolite of p-aminobenzoic acid (▶ Sect. 2.3).
8.3 Systematic Variation of Aromatic Substituents
The goal of lead structure optimization has an impact on the planning of the
relevant experimental series. If the biological consequences of structural changes
are to be evaluated with minimal effort, careful design must precede the synthesis
of the substances. Here an almost unsolvable problem emerges in that, as a general
rule, the exchange of a substituent or group leads to complex changes in multiple
properties. The exchange of an ethyl group for a methyl group changes only the
lipophilicity and size of the substituent. If a methyl group is exchanged for
a chlorine atom, the polarizability, the electronic properties, and moreover the metabolism
are altered. Other substituents could then change the H-bond donor and
acceptor properties as well as the ionization and dissociation.
In 1971, Paul Craig proposed the use of a simple diagram for the structural variation
of aromatic substituents, with which the important characteristics of these substitu-
ents, for instance, lipophilicity and electronic properties, are plotted against each
other. The selection of substituents from different quadrants of this diagram allows
an evaluation of different combinations of properties. The concept can be extended to
multiple dimensions, possibly with the aid of mathematical and statistical methods.
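The idea of picking substituents from different quadrants can be sketched in a few lines. The example below is only an illustration: the lipophilicity (π) and electronic (σ-para) constants are approximate literature values added here as assumptions, and the selection rule (the substituent farthest from the origin in each quadrant, using an unscaled distance) is a hypothetical choice rather than part of Craig's proposal.

```python
import math

# Approximate substituent constants (pi, sigma_para); illustrative values only.
SUBSTITUENTS = {
    "H": (0.00, 0.00),
    "CH3": (0.56, -0.17),
    "Cl": (0.71, 0.23),
    "CF3": (0.88, 0.54),
    "OCH3": (-0.02, -0.27),
    "OH": (-0.67, -0.37),
    "NH2": (-1.23, -0.66),
    "NO2": (-0.28, 0.78),
    "CN": (-0.57, 0.66),
}


def quadrant(pi, sigma):
    """Name the quadrant of a (pi, sigma) point; hydrogen sits at the origin."""
    if pi == 0 and sigma == 0:
        return "origin"
    lipophilicity = "+pi" if pi > 0 else "-pi"
    electronic = "+sigma" if sigma > 0 else "-sigma"
    return f"{lipophilicity}/{electronic}"


def pick_one_per_quadrant(table):
    """Return, for each quadrant, the substituent farthest from the origin."""
    best = {}
    for name, (pi, sigma) in table.items():
        label = quadrant(pi, sigma)
        if label == "origin":
            continue
        distance = math.hypot(pi, sigma)
        if label not in best or distance > best[label][1]:
            best[label] = (name, distance)
    return {label: name for label, (name, _) in best.items()}


if __name__ == "__main__":
    for label, name in sorted(pick_one_per_quadrant(SUBSTITUENTS).items()):
        print(label, name)
```

A set chosen this way covers all four combinations of lipophilic or hydrophilic and electron-withdrawing or electron-donating character with only four compounds; further property axes could be added in the same manner.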
In 1972, John Topliss made a suggestion that went further, which would be
called today an evolutionary strategy. One substituent at a time (e.g., hydrogen for
chlorine) is exchanged in the optimization of the substitution pattern of an aromatic
compound. The next compound is planned based on which of the first two com-
pounds demonstrated better effects. If the new substituent improves the effect,
a new substituent is chosen that has the same physicochemical properties, in larger
measure, or more of these substituents are added. If the new substituents impair the
biological activity, then a substituent is chosen that has the opposite physicochem-
ical properties. If two different substituents produce the same effect, it should be
evaluated whether changes in the physicochemical properties influence the activity
in the opposite direction. Despite its elegance, this strategy often fails for the
mundane reason that it is too time consuming to take such a stepwise approach.
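The stepwise rule described above can be written as a small decision function. This is a minimal sketch under stated assumptions: activities are taken to be on a logarithmic scale such as pIC50, the tolerance that defines "the same effect" is hypothetical, and the substituents in FIRST_BRANCH follow the first branching of the scheme as it is commonly reproduced in the medicinal chemistry literature, not this text.

```python
def topliss_decision(parent_activity, analogue_activity, tolerance=0.1):
    """Classify the outcome of a single substituent exchange.

    Activities are assumed to be on a log scale (for example pIC50),
    so larger numbers mean higher potency.
    """
    difference = analogue_activity - parent_activity
    if difference > tolerance:
        return "more active"
    if difference < -tolerance:
        return "less active"
    return "equally active"


# Next move for each outcome, paraphrasing the rule given in the text.
NEXT_MOVE = {
    "more active": "increase the same property or add more of the substituent",
    "less active": "choose a substituent with opposite properties",
    "equally active": "test whether the properties act in opposite directions",
}

# Commonly cited first branch for a phenyl ring after the exchange H -> 4-Cl
# (illustrative assumption, not taken from the text).
FIRST_BRANCH = {
    "more active": "3,4-Cl2",
    "less active": "4-OCH3",
    "equally active": "4-CH3",
}


def suggest(parent_activity, analogue_activity):
    """Return the outcome, the general rule, and an example next substituent."""
    outcome = topliss_decision(parent_activity, analogue_activity)
    return outcome, NEXT_MOVE[outcome], FIRST_BRANCH[outcome]


if __name__ == "__main__":
    # Hypothetical pIC50 values for the parent (H) and the 4-Cl analogue.
    print(suggest(6.0, 6.8))
```

Because each call depends on the measured result of the previous compound, the scheme is inherently sequential, which is the practical drawback noted above.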
As a consequence of the work of Craig and Topliss, further design methods were
developed. None of these methods should be applied too rigidly. Synthetic
planning must be oriented toward both the accessibility of the compounds and
the largest possible structural variation, that is, a diversity of physicochemical
properties and 3D structure. Since the introduction of combinatorial
chemistry (▶ Chap. 11, “Combinatorics: Chemistry with Big Numbers”), the ratio-
nal design of diverse substance libraries has taken on entirely new possibilities and
perspectives.
8.4 Optimizing the Activity and Selectivity Profile
The structural variation of a lead structure influences not only the activity strength
but also the activity spectrum. That can be thoroughly advantageous, but it also
brings with it the risk that the selectivity can deteriorate. A simple rule of thumb is
that enlarging the molecule, introducing optically active centers, and rigidification
improve the selectivity, assuming that the activity is not entirely lost. On the other
hand, removing a chiral center, establishing more flexibility, or reducing the size of
the molecule usually results in unspecific and weaker activity.
Because of the sequencing of the human genome, the gene family to which
a target protein belongs is known, as is the number of members of the gene family.
By using gene technology it is possible to construct single isoform test systems
(assays). As a result, today pharmaceutical research is in a position to make
a predictive selectivity profile. This has stimulated efforts to develop selective
drugs. An interesting corollary to these efforts is the fact that the molecular weight
of drugs has increased in recent years, as statistics show, a confirmation of the
above-mentioned rule of thumb.
For drugs that are meant to act on neuroreceptors in the brain, the polarity is critical
to whether they can cross the blood–brain barrier. Polar compounds are unable to do
this and act only in the periphery, for instance, on the circulatory system. Examples of
this are adrenaline 8.5 and dopamine 8.6 (Fig. 8.3). The stepwise removal or masking
of polar groups brings the central effects into the foreground. Ephedrine 8.7 acts in the
brain and in the periphery; it is centrally stimulating and raises the blood pressure.
Amphetamine 8.8 (“speed”) and the intoxicant MDMA 8.9 (the designer drug
“ecstasy”) are weak bases. Their relatively nonpolar neutral forms easily overcome
the blood–brain barrier and their CNS effects dominate (Fig. 8.3).
There are exceptions even here. L-DOPA 8.10 (Fig. 8.3) is an extremely polar
amino acid. It could never cross the blood–brain barrier by passive diffusion alone.
Instead it is recognized by an amino acid transporter and actively transported over
the membrane and into the brain. This simultaneously solves the problem of
bringing dopamine 8.6, which is used to treat Parkinson’s disease, into the brain
because L-DOPA is decarboxylated to dopamine there (▶ Sects. 9.4 and ▶ 27.8).
The decisive influence that even the smallest changes in the structure can have is
seen in the effect spectrum of the hormones and neurotransmitters noradrenaline and
adrenaline and their synthetic analogues. Whereas noradrenaline 8.11 (Fig. 8.4)
affects the α-adrenergic receptors, its N-methyl derivative adrenaline 8.5 (Fig. 8.3)
acts on α and β receptors as a mixed α/β agonist. This difference was used to
enlarge the N-alkyl group to arrive at the specific β-agonist isoprenaline 8.12
(Fig. 8.4). Further differentiation of the effects could be achieved within the class
of β-adrenergic substances. Dobutamine 8.13 is missing the alcoholic hydroxyl
group of adrenaline. Despite its structural relationship to dopamine 8.6 (Fig. 8.3) it
is a β1 agonist with cardioselective effects. Specific β2 agonists, for instance
salbutamol 8.14 and clenbuterol 8.15 (Fig. 8.4), are used to treat asthma because
they are bronchodilators without the cardio-stimulatory effects of the unspecific β
agonists (▶ Sect. 29.3).
Fig. 8.3 The polar compounds adrenaline 8.5 and dopamine 8.6 are cardiovascularly active in the
periphery after intravenous administration. Ephedrine 8.7 is more lipophilic and therefore shows
both peripheral and central effects. The more nonpolar compound amphetamine 8.8 (“speed”) has
an overwhelmingly stimulatory effect in the CNS. 3,4-Methylenedioxymethamphetamine 8.9
(MDMA; “ecstasy”) is hallucinogenic. Polar groups are red and neutral or lipophilic groups are
blue.
[Structure labels: 8.11 noradrenaline, R = H, predominantly α-mimetic; 8.5 adrenaline, R = CH3, α- and β-mimetic; 8.12 isoprenaline, R = –CH(CH3)2; 8.13 dobutamine, β1-mimetic]
Fig. 8.4 Noradrenaline 8.11, adrenaline 8.5, and isoprenaline 8.12 act to different extents on the
α and β receptors. Selective β1 and β2 agonists, for instance, 8.13, 8.14, and 8.15, act specifically as
cardiac stimulants or bronchodilators.
[Structure labels: 8.18 carbutamide, R = NH2; 8.19 tolbutamide, R = CH3; 8.20 glibenclamide]
Fig. 8.5 The sulfonamides hydrochlorothiazide 8.16, furosemide 8.17, and related diuretics are
different from most antibacterial analogues because of the unsubstituted sulfonamide group.
Carbutamide 8.18 and tolbutamide 8.19 were the first unspecific sulfonamides with hypoglycemic
effects that were later replaced with specific hypoglycemics of the glibenclamide-type 8.20.
The sulfonamides are a prime example of the targeted optimization of lead
structures for different therapeutic indications. The first antibacterial examples
gave rise to the diuretics as well as the hypoglycemics (antidiabetics). It had already been
noticed in 1940 that sulfanilamide (▶ Sect. 2.3) inhibits the enzyme carbonic
anhydrase, and therefore should lead to increased urine production (▶ Sect. 25.7).
Among other substances, hydrochlorothiazide 8.16, furosemide 8.17 (Fig. 8.5), and
structurally related compounds gained therapeutic importance. In the early 1940s,
the hypoglycemic effects of a few sulfonamides were clinically observed. The
antibacterial and simultaneously hypoglycemic carbutamide 8.18 was introduced
to therapy in 1955, the lipophilic and therefore more bioavailable tolbutamide 8.19
8.5 From Agonists to Antagonists
[Structure labels: 8.21 DCI; 8.22 pronethalol; 8.23 practolol, R = –NHCOCH3; 8.25 xamoterol]
Fig. 8.6 3,4-Dichloroisoprenaline 8.21 (DCI) and pronethalol 8.22, the first unspecific
β-blockers, were derived from isoprenaline 8.12. Practolol 8.23 and metoprolol 8.24 are specific
β1 antagonists. Xamoterol 8.25 is a partial β1 agonist, a combined agonist and antagonist.
[Structure labels: 8.28 terfenadine, R = CH3, polar H1 antagonist (non-sedating); fexofenadine, active metabolite, R = –COOH]
8.28 (R = H) can cross the blood–brain barrier because of its high lipophilicity, but
is immediately expelled by a transporter. Because of its cardiotoxicity, terfenadine
has been withdrawn from the market in the meantime and replaced by its active
metabolite fexofenadine 8.28 (R = COOH). The sedating side effects of antihistamines
also led to neuroleptics and antidepressants (▶ Sect. 1.6). Here, however, the
limits of rational drug optimization are apparent. Promethazine 8.29 is an antihis-
tamine with antiallergic action and sedating side effects. The neuroleptic chlor-
promazine 8.30 is a central depressant and therefore an antipsychotic; the
extraordinarily similar structure of imipramine 8.31 acts, on the other hand, as
a stimulant and is an antidepressant (Fig. 8.8). All three substances have different
mechanisms of action. The introduction of additional aromatic rings to other
receptor agonists, for instance, to the neurotransmitters acetylcholine and dopa-
mine, has led to antagonists (Fig. 8.9).
Fig. 8.8 Closely related structures of active substances can have very different qualitative
activity. Chlorpromazine 8.30, a dopamine antagonist with neuroleptic activity, and imipramine
8.31, a dopamine transporter inhibitor with antidepressant activity, are both derived from
promethazine 8.29, an H1 antagonist with antiallergic activity.
Fig. 8.9 The active substance histamine 8.26 (positively charged form at pH = 7) and
pharmacophores that are attributed to it (A acceptor, D donor, P positively charged group).
These correlations are discussed in detail in ▶ Sect. 19.5. The molecular size influences
the bioavailability insofar as substances with a molecular weight above
500–600 Da are captured by the liver on the sole grounds of the molecular size, and
are quickly excreted with the bile. Aside from this there are substances that penetrate
the membrane regardless of their polarity. These are taken up into the cell or are
eliminated from the cell by transporters (▶ Sect. 30.7). Among these are structural
analogues of amino acids and nucleosides. Classical strategies to extend the duration
of action are the conversion of free hydroxyl groups to ethers (see ▶ Sect. 9.2), the
replacement of esters with amides, and the replacement of metabolically labile amide
groups with isosteres. In a few cases, such structural changes are associated with
a reduction in potency, which is more than compensated for by a longer duration of
action. In the case of peptides the replacement of L-amino acids with D-amino acids,
the inversion of amide groups, and the replacement of larger structural elements with
peptidomimetic groups (▶ Sect. 10.4) have all proven successful.
The metabolism of aliphatic amino groups can be suppressed with alkyl substi-
tution or branching at the a carbon. Secondary alcohols can be converted to the
more bioavailable tertiary alcohols by introducing an ethinyl group at the same
carbon atom (▶ Sect. 28.5). The introduction of an isosteric fluorine atom in the
para position as a replacement for hydrogen prevents hydroxylation in this position.
If steric considerations do not play a role, the para position can also be blocked
with a larger group, such as a chlorine atom or a methoxy group. In the hydroxylated
3- and 4-position of the neurotransmitters dopamine, adrenaline, and noradrenaline,
the conversion to the monohydroxylated analogues, 3,5-dihydroxy compounds or to
the NH-isosteric indole group (Fig. 8.1, Sect. 8.2) led to metabolically more stable
and therefore longer-acting compounds.
Rational design is characterized by the fact that the common features of all
active compounds and the differences from less potent or inactive analogues
can be derived from the structure of the pharmacophore. A pharmacophore
(Fig. 8.9) is defined as a special arrangement of particular functionalities that
are common to more than one drug and form the basis of the biological activity
(▶ Sect. 17.1).
During the course of rational optimization the molecular scaffold and the substituents
at a pharmacophore are changed to maintain the principal function while
arriving at higher potency or better selectivity. Many computer methods have been
developed to generate ideas for the spatial isomorphic replacement of ligand scaf-
folds. By considering the conformational aspects of the molecules (▶ Chap. 16,
“Conformational Analysis”), they scan databases to find possible candidates that,
despite a different parent scaffold, can place the side chains and interacting groups
in the same spatial orientation. Examples of such approaches are presented in
▶ Sect. 10.8 and ▶ Chap. 17, “Pharmacophore Hypotheses and Molecular Com-
parisons”. But an indirect approach using the protein structure has also been tried.
For this, the spatial structure of the protein–ligand complex is the starting point
from which a part of the binding pocket is cut out, and new building blocks for the
ligand are sought. Subsequently the form and interaction properties of the cut-out
pocket are compared with a database of all known protein–ligand complexes
(▶ Sect. 20.4). If a subpocket is discovered that has similarities to the sought-
after pocket, then ligands that bind there provide an interesting design hypothesis.
The structure of the building blocks that occupy the newly discovered pocket can
generate ideas for isosteric structural elements in a modified ligand.
A different strategy that also considers the pharmacophore can be successful.
In this approach the pharmacophore is retained and only those groups are modified
that affect the pharmacokinetic properties, that is, the transport, distribution,
metabolism, and excretion of a molecule. An efficient and pragmatic strategy is
important. For this, it is essential that not too many changes are made at the same
time, and the changes should not be too biased. With little synthetic effort, a broad
spectrum of physicochemical properties and spatial arrangements should be
covered.
In the meantime it has been established that binding to human plasma proteins
such as serum albumin and the acidic α1-glycoprotein is of decisive importance for
the transport and pharmacokinetic properties of a drug. Therefore binding to these
proteins is considered even in the early phase of drug development (▶ Chap. 19,
8.8 Optimizing Affinity, Enthalpy, and Entropy of Binding and Binding Kinetics
[Bar charts: ΔG, ΔH, and −TΔS in kcal/mol for the individual HIV protease inhibitors (upper panel) and statins (lower panel)]
Fig. 8.10 Between 1995 and 2006, the profile of multiple development generations of HIV
protease inhibitors (upper, for formulae see ▶ Fig. 24.15) and statins as HMG-CoA reductase inhibitors
(lower, for formulae see ▶ Fig. 27.13) could be optimized for their thermodynamic signatures, that
is, the extent to which they are driven by entropy or enthalpy. The free energy ΔG is shown in red,
the enthalpy ΔH in blue, and the entropic contribution −TΔS in green. The more negative the
column becomes, the stronger the binding affinity and the more the profile is determined by
enthalpy or entropy. The initially developed compounds such as indinavir, saquinavir, nelfinavir,
and pravastatin were entropic binders; in contrast, the newer derivatives such as darunavir or
rosuvastatin have an improved enthalpic profile.
inhibitors (▶ Sect. 27.3) are displayed in Fig. 8.10. Notably, the profile could be
shifted from initially strongly entropically driven binders to
enthalpically driven ones. This observation suggests that it is initially simpler to
optimize a substance’s entropic binding contribution than its enthalpic contribution.
Most of the time this can be seen in the first lead structure, for which an
enlargement of the hydrophobic surface area leads to better binding. The affinity
that is gained is explained by the displacement of ordered water molecules
(▶ Sect. 4.6). Such contributions are assumed to be entropically favorable.
A strategy of introducing rigid rings can also be pursued. In doing so, the compound
loses degrees of freedom. If the geometry of the bound state is correctly
frozen, the binding is improved for entropic reasons. An example of this is the
binding of the largely rigid thrombin inhibitor 8.32, which binds in an almost
exclusively entropically driven manner to the protein (Fig. 8.11). In contrast, the
decidedly more flexible ligand 8.33 displays a large enthalpic binding contribution.
Compound 8.32 represents the result of an optimization that led to a substance
with single-digit nanomolar binding and an optimal shape complementarity for the
binding pocket of thrombin.
Fig. 8.11 The rigid thrombin inhibitor 8.32 only has a small number of rotatable bonds. It has an
optimal shape complementarity to the binding pocket of thrombin. Its binding is, for the most part,
entropically driven. On the other hand, the considerably more flexible ligand 8.33 has a higher
enthalpic binding contribution.
It seems that generally applicable concepts exist for entropy-driven optimization.
If one can “always win entropically,” then for theoretical reasons enthalpically
favored lead structures should be preferred as a starting point for optimization.
However, caution is called for here. Why a ligand has a particular thermody-
namic profile must be clarified. The inhibitors 8.34 and 8.35 were discovered in
a virtual screening as aldose reductase inhibitors (Fig. 8.12). The chemical struc-
tures of both ligands are very similar. Nevertheless one is an enthalpically driven
binder, and the other is an entropically driven binder. The crystal structure of both
ligands with the protein delivered the reason: the enthalpically preferred inhibitor
8.34 entraps a water molecule, which mediates binding between the ligand and the
protein, whereas the other one does not. The incorporation of a water molecule is
entropically disfavored, and therefore the profile appears to be that of an enthalpic
binder. A resistance profile for inhibitors against mutants of the viral HIV protease
was investigated in the research group of Ernesto Freire at The Johns Hopkins
University in Baltimore (▶ Sect. 24.5). Interestingly, the result was that resistance
to the entropically favored inhibitors could be developed much faster than to
inhibitors with enthalpic advantages. This observation indicates that it is worth-
while to concentrate on enthalpically favored binders in cases in which resistance
can be expected to develop. In the investigated example the enthalpically driven
8.34: ΔG = −35.4 kJ/mol; ΔH = −25.6 kJ/mol; −TΔS = −9.8 kJ/mol
8.35: ΔG = −31.3 kJ/mol; ΔH = −8.7 kJ/mol; −TΔS = −22.6 kJ/mol
Fig. 8.12 Compounds 8.34 and 8.35 were discovered in a virtual screening as lead structure for
the inhibition of aldose reductase. Although they are structurally similar, 8.34 is a stronger
enthalpic binder and 8.35 is an entropic binder. The subsequent crystal structure analysis of the
complex with the reductase showed that 8.34 traps a water molecule upon binding, whereas this
was not observed with 8.35. Because the entrapment of a water molecule is entropically unfavorable,
the binding of 8.34 is enthalpically preferred.
binder 8.33 had a less-rigid scaffold (Fig. 8.11). This allows it to more easily elude
changes that are caused by mutations. It is much more difficult for rigid ligands that
bind for entropic reasons to adapt to such steric modifications.
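The thermodynamic signatures given in Fig. 8.12 can be checked with a short calculation. The following Python sketch is only an illustration: it uses the Gibbs relation ΔG = ΔH − TΔS and, as an assumption that is not stated in the text, a temperature of 298.15 K to convert ΔG into a dissociation constant via ΔG = RT ln Kd. All function and variable names are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)
T = 298.15             # assumed temperature in K (not given in the text)

# Values from Fig. 8.12, all in kJ/mol: (dG, dH, -T*dS)
profiles = {
    "8.34": (-35.4, -25.6, -9.8),
    "8.35": (-31.3, -8.7, -22.6),
}

def dissociation_constant(dg_kj, temperature=T):
    """Kd in mol/L from dG = RT ln Kd (standard state 1 mol/L)."""
    return math.exp(dg_kj / (R_KJ * temperature))

def enthalpic_fraction(dg_kj, dh_kj):
    """Share of the free energy of binding that is contributed by enthalpy."""
    return dh_kj / dg_kj

for name, (dg, dh, minus_tds) in profiles.items():
    # Consistency check: dG must equal dH + (-T*dS) within rounding error
    assert abs(dg - (dh + minus_tds)) < 0.05
    kd = dissociation_constant(dg)
    share = enthalpic_fraction(dg, dh)
    print(f"{name}: Kd ~ {kd:.1e} M, enthalpic share {share:.0%}")
```

Under these assumptions, roughly 72% of the free energy of binding of 8.34 is enthalpic, compared with roughly 28% for 8.35, although the two ΔG values differ by only about 4 kJ/mol.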
On the other hand, entropic binders can also have an advantage in escaping
resistance. If a ligand is entropically favored because it adopts multiple binding
modes, and even exhibits large residual mobility in the binding pocket when bound,
this can prove to be beneficial! If the protein tries to change the shape of its binding
pocket through resistance mutations to this inhibitor, an incorporated ligand that is
able to adopt multiple binding modes is left with alternative orientations, which,
despite the mutation, still offer good binding.
If it is clear that a lead structure is an enthalpically driven binder, and
superimposed effects such as the entrapment of water molecules have not distorted
the profile, how is the binding of an enthalpically driven binder optimized? Let us
remember the consideration in ▶ Sects. 4.5 and ▶ 4.8: hydrogen bonds, electrostatic
interactions, and van der Waals contacts determine the binding enthalpy. However,
a change in such an interaction property of a molecule is often coupled with
a compensation of enthalpy and entropy. The result is that ΔG and the binding
affinity do not change at all! The optimization process can be compared to the act of
getting around the inherent enthalpy/entropy compensation. Enthalpically favorable
hydrogen bonds should have an optimal geometry and should not induce severe
structural changes in the protein environment. Otherwise this can lead to an entropic
compensation by causing a shift in the dynamic degrees of freedom. It seems to be
more favorable to strengthen the hydrogen bonds in structurally rigid regions of the
binding pocket. There, enthalpy is better gained because the compensatory shift in
dynamic parameters is less likely. Newly introduced hydrogen bonds should also not reduce
the degree of desolvation of a bound ligand, for example by inducing small structural
changes in the binding geometry of hydrophobic groups so that these become more strongly
exposed to the surrounding solvent environment. It is also important that the local
water structure in the binding pocket remains unchanged.
Another essential question has to do with the optimal interaction kinetics that
a ligand should have. Surface plasmon resonance was introduced in ▶ Sect. 7.7.
The question of whether a ligand binds quickly or slowly to a protein and with what
rate it is released again can be determined with this method. Ideally, how long
should a ligand stay bound to a protein, what is the optimal residence time? The
binding affinity is determined by the relative ratio of the association rate (kon) and
the dissociation rate (koff). It has been shown that structurally similar ligands can
have entirely different kinetic profiles. Which profile is optimal? A loss in affinity
can manifest itself as an increased dissociation rate, or a slower association rate, as
well as a combination of both effects. It was shown in the research group of Helena
Danielson in Uppsala that different binding profiles of therapeutically used HIV
protease inhibitors correlate with the development of resistance to mutants of the
protease. They also demonstrated that resistance forms more rapidly against drugs
that have a higher dissociation rate. This is a decisive criterion to direct drug
optimization in the correct direction. Certainly the kinetic binding profile must be
granted a greater priority in the future. Therefore, a more comprehensive correla-
tion between the structure and the binding is necessary so that this knowledge can
be used for targeted design. Until now, what differentiates a “fast” or “slow” binder
has only been understood in a very few cases. These are parameters that have to do
with the induced-fit adaptations of the protein. It can also involve the ease with
which the desolvation of the previously uncomplexed binding pockets takes place
or with the kinetics with which a ligand in the solvated state sheds its own water
shell. More attention must be paid to these protein and ligand-based properties.
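The relationship between affinity and kinetics described above can be made concrete with a minimal Python sketch. It assumes a simple one-step binding model in which the dissociation constant is Kd = koff/kon and the residence time is 1/koff. The two ligands and their rate constants are hypothetical and serve only to show that identical affinities can hide very different kinetic profiles.

```python
import math

def dissociation_constant(k_on, k_off):
    """Kd in mol/L from k_on in 1/(M*s) and k_off in 1/s."""
    return k_off / k_on

def residence_time(k_off):
    """Mean lifetime of the protein-ligand complex in seconds (1/k_off)."""
    return 1.0 / k_off

def half_life(k_off):
    """Dissociative half-life of the complex in seconds (ln 2 / k_off)."""
    return math.log(2) / k_off

# Hypothetical ligands with the same affinity but different kinetics
ligands = {
    "fast binder": {"k_on": 1e7, "k_off": 1e-1},
    "slow binder": {"k_on": 1e5, "k_off": 1e-3},
}

for label, rates in ligands.items():
    kd = dissociation_constant(rates["k_on"], rates["k_off"])
    tau = residence_time(rates["k_off"])
    print(f"{label}: Kd = {kd:.1e} M, residence time = {tau:.0f} s")
```

Both hypothetical ligands have the same Kd, yet the slow binder remains on the protein one hundred times longer. An equilibrium affinity measurement alone cannot distinguish these two cases.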
8.9 Synopsis
• A lead structure is only the starting point on the way to a drug; potency,
specificity, and duration of action have to be optimized concurrently to minimize
side effects and toxicity.
• The structure of an active substance is determined by its pharmacophore, which
is responsible for target binding. Its adhesion groups enhance potency and
biological activity, its lipophilicity is responsible for transport and distribution,
and groups to be cleaved or modified release the active form.
• Multiple concepts to modify the chemical structure of a lead can be planned,
however, optimization is multifactorial due to highly correlated influences of the
attempted changes.
• Bioisosteric functional group replacement attempts the exchange of groups on
a given skeleton for sterically and electronically related groups that maintain
activity but improve other drug properties.
• Me-too research follows the goal of modifying the competitor’s lead structures
to arrive at patent-free analogues with improved properties.
Bibliography
General Literature
Sneader W (1985) Drug discovery: the evolution of modern medicines. Wiley, New York
Taylor JB, Triggle DJ (eds) (2007) Comprehensive medicinal chemistry II. Elsevier, Oxford
Wermuth CG (ed) (2008) The practice of medicinal chemistry, 3rd edn. Elsevier-Academic,
New York
Special Literature
Copeland RA, Pompliano DL, Meek TD (2006) Drug–target residence time and its implications
for lead optimization. Nat Rev Drug Discov 5:730–740
Fokkens J, Klebe G (2006) A simple protocol to estimate protein binding affinity differences for
enantiomers without prior resolution of racemates. Angew Chem Int Ed Engl 45:985–989
Hansch C (1974) Bioisosterism. Intra-Science Chem Rept 8:17–25
Lipinski CA (1986) Bioisosterism in drug design. Ann Rep Med Chem 21:283–291
Ohtaka H, Freire E (2005) Adaptive inhibitors of the HIV-1 protease. Prog Biophys Mol Biol
88:193–208
Shuman CF, Markgren P-O, Hämäläinen M, Danielson UH (2003) Elucidation of HIV-1 protease
resistance by characterization of interaction kinetics between inhibitors and enzyme variants.
Antiviral Res 58:235–242
Steuber H, Heine A, Klebe G (2007) Structural and thermodynamic study on aldose reductase:
nitro-substituted inhibitors with strong enthalpic binding contribution. J Mol Biol 368:618–638
Thornber CW (1979) Isosterism and molecular modification in drug design. Chem Soc Rev
8:563–580
9 Designing Prodrugs
After the optimization of a lead structure there are still problems. Many substances
lack important characteristics that are required for therapy in humans, for instance,
adequate bioavailability, duration of action and metabolic stability, the ability to
penetrate the blood–brain barrier, selectivity, or good tolerability. Often it proves
impossible to address or improve these properties through structural variation. A
solution to this problem can be found through special preparations, for instance to
be used for poorly water-soluble substances, or via a derivatization to a prodrug.
This term refers to a non-active or poorly active precursor or derivative of an active
molecule. In the organism this form is converted to the actual active substance. In
most cases, this is achieved by enzymatic reactions; in a few cases it happens by
spontaneous chemical decomposition.
Aside from this, the metabolites of some drugs also show favorable therapeutic
properties. In some cases this has led to new and improved drugs, in other cases the
original substance was retained as a prodrug.
9.1 Foundations of Drug Metabolism
Multiple factors have crucial importance for the absorption, bioavailability, and
duration of action of an active substance. The most important are the solubility and
lipophilicity of the drug, which are nearly equal in importance, followed by the
molecular size and the metabolic stability. The terms absorption and bioavailability
have very different meanings. Absorption refers to the amount of active substance
that is taken up by the entire gastrointestinal tract. The bioavailability refers to just
the portion of the active substance that is available in the circulation after the first
pass through the liver.
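The difference between absorption and bioavailability can be illustrated with a small calculation. The Python sketch below assumes a common simplified model that is not taken from the text: the bioavailable fraction is the absorbed fraction multiplied by the fractions that escape metabolism in the intestinal wall and in the liver. The function name and all numbers are hypothetical.

```python
def oral_bioavailability(fraction_absorbed, gut_wall_extraction, hepatic_extraction):
    """Fraction of an oral dose that reaches the systemic circulation.

    Simplified model (an assumption): F = f_abs * (1 - E_gut) * (1 - E_liver).
    """
    for value in (fraction_absorbed, gut_wall_extraction, hepatic_extraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all fractions must lie between 0 and 1")
    return fraction_absorbed * (1.0 - gut_wall_extraction) * (1.0 - hepatic_extraction)

# Hypothetical drug: well absorbed, but extensively metabolized on the first liver pass
f = oral_bioavailability(fraction_absorbed=0.9,
                         gut_wall_extraction=0.1,
                         hepatic_extraction=0.8)
print(f"absorbed: 90%, bioavailable: {f:.0%}")
```

In this hypothetical case, 90% of the dose is absorbed but only about 16% reaches the circulation, which is the situation of good absorption combined with poor bioavailability.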
After oral administration, the metabolism of the substance by enzymes begins.
Ester and amide bonds are hydrolyzed, often already in the stomach and intestines,
or by passage through the stomach and intestinal wall. The entire blood volume
that flows through the intestines goes first to the liver via the portal vein (Fig. 9.1).
This passage is called “first pass”. Because of its rich spectrum of hydrolyzing,
Fig. 9.1 Schematic sketch of the “lifecycle” of a drug after oral administration. The drug is
already metabolized during the passage through the stomach or intestinal wall, and above all, at the
first pass through the liver. Lipophilic drugs and substances with a molecular weight of more than
500–600 Da are excreted with the bile. Polar substances and conjugated and/or metabolic products
(metabolites) are excreted by the kidneys.
oxidizing, reducing, and conjugating enzymes, the liver is the main site of drug
degradation, that is, metabolism. A drug can have poor bioavailability despite
good absorption because of fast and pronounced metabolism in the liver. For many
substances, the first pass is already ‘the end of the road’. They are well absorbed,
but are immediately metabolized or excreted in the bile. The “first-pass effect”
refers to cases of successful and extensive metabolism in the very first passage.
Lipophilic active substances and those with a molecular weight of more than
500–600 Daltons (Da) are susceptible to particularly intense first-pass effects. Of
course, blood flows continuously through the liver, and metabolism carries on. The
substances are no longer in the blood stream at as high a concentration as they were
before the first liver passage because they have been distributed to the tissue. In
general, the hydrolytic cleavage of ester or amide groups leads to highly water-
soluble metabolites that can be excreted by the kidneys. Conjugation, that is, the
coupling of the substance with native polar substances, for instance, with sulfate
groups, the amino acid glycine, or the glucose oxidation product glucuronic acid,
leads to easily excreted products. In humans, conjugation has great importance. The
situation is more critical if the substance has neither easily degradable functional groups nor
conjugation positions. Nonetheless, humans have enzymes that can metabolize
xenobiotics. Among these, the cytochrome P450 isoenzymes are particularly important
because they are able to chemically change a molecule oxidatively at various posi-
tions. Usually this leads to better water solubility and therefore better-excretable
substances. Because these enzymes cannot predict what properties the metabolites
of these biotransformations will possess, it can occasionally happen that toxic com-
pounds ensue that have mutagenic or carcinogenic properties (▶ Sect. 27.6).
Evolution has had time over millions of years to hone the degradation and
excretion of foreign substances. For many compounds however, the system fails.
Instead of detoxifying, the opposite happens, a “poisoning”. The carcinogenic
effect of polycyclic hydrocarbons is attributed to an oxidative assault, just as is
[Scheme: benzene 9.1 → epoxide → further metabolization or conjugation with macromolecules]
Fig. 9.2 The oxidation of benzene 9.1 leads to a reactive and toxic intermediate. In contrast, the
oxidation of toluene 9.2 affords benzoic acid 9.3, which can be excreted by the kidney as its
nontoxic glycine conjugate 9.4.
the bone marrow damage and blood disease that is caused by benzene 9.1. The
simplest alkyl homologue of benzene, toluene 9.2 is less toxic for this reason alone
because it can be oxidized to benzoic acid 9.3, which, after conjugation with the
amino acid glycine, can be excreted as hippuric acid 9.4 (Fig. 9.2). There are even
more conjugation possibilities available for the benzoic acid intermediate.
One can speculate as to why no multienzyme complexes have evolved to
immediately convert toxic intermediates into polar, nontoxic metabolites. In any
case, it is an almost unsolvable problem because the properties of the metabolites
would have to be predicted for each xenobiotic. A modification that leads to
improved water solubility in one compound can cause a mutagenic effect in
another. For their own protection humans have, in fact, mechanisms for trapping
reactive metabolites. Here glutathione and glutathione transferase must be men-
tioned because they can detoxify electrophiles particularly well (▶ Sect. 27.7).
Perhaps toxic or carcinogenic effects were not a particularly decisive theme for
evolution until now. Tumors play a secondary role for most animals because of their
short lifespan. Up until just a few generations ago, war and infectious diseases were
the primary causes of death in humans. It has only been in recent times that the
average life expectancy increased. In the sense of evolution, aging individuals play
only a secondary role. Once reproduction is complete, the parents are only neces-
sary for the care of their young until early adulthood. One only needs to think of
female spiders that consider their mates to be nothing more than their next prey
immediately after copulation!
From the above-described examples of toxic chemicals, the wrong conclusion
should not be drawn that only human-made substances can cause cancer. That is
true for a few natural products as well, for instance, aflatoxins. These microbial
secondary metabolites, which form in spoiled nuts and other foodstuffs are potent
carcinogens. Certain alkaloids, for example, from the Spurge family (Euphorbiaceae)
are also strongly cancer-promoting substances; they are so-called tumor promoters.
The principle of nil nocere (Lat. do not harm) is strictly applied to medicines,
and only slowly have these standards been applied to other materials in our
environment. For the testing and development of active compounds, this means that
particularly rigorous tests for carcinogenic, mutagenic, and teratogenic effects must
be conducted. The well-founded suspicion alone that a compound or one of its
possible metabolites displays such effects leads to the consequence that the com-
pound is not further developed.
[Structures: 9.5 Heroin; 9.6 Clofibrate, R = Et; 9.7 Clofibric acid, R = H]
Fig. 9.3 Heroin 9.5, the diacetyl derivative of morphine, acts reliably and quickly, “heroically.”
Like morphine, it is slowly and inefficiently absorbed, but after intravenous application it crosses
the blood–brain barrier 100 times faster than morphine. There, the ester is converted by the
enzyme pseudocholinesterase to morphine, which can no longer leave the brain because of its
higher polarity. The cholesterol-lowering drug clofibrate 9.6 is a prodrug of the actual active
compound, the free acid 9.7. The antihypertensive enalapril 9.8 is also a prodrug of the active
compound 9.9. Here the higher lipophilicity is not responsible for the absorption; rather, enalapril is
actively transported by binding to a dipeptide transporter. The diester of enalapril is unsuitable as a drug
because it spontaneously forms the inactive diketopiperazine 9.10.
Other ester prodrugs were developed for depot formulations to achieve a longer
duration of action after subcutaneous or intramuscular administration.
The phenolic hydroxyl group of bambuterol 9.15 is masked as a carbamate.
Terbutaline 9.16 (Fig. 9.5) is formed from this prodrug after hydrolysis by
unspecific cholinesterases (▶ Sect. 23.7). By using this prodrug strategy it was
possible to make a long-acting bronchospasmolytic that only needs to be adminis-
tered once daily in contrast to the actual active substance, which must be admin-
istered three times daily.
Occasionally, a prodrug can be used to improve the taste, for instance, in the case
of the extremely bitter chloramphenicol 9.17. By converting it to the palmitate 9.18
(Fig. 9.5) the water solubility is strongly reduced, but the substance no longer tastes
bitter. The concomitant reduction in the absorption is of no consequence. The
substance is hydrolyzed to the highly soluble and easily absorbed chloramphenicol
in the duodenum by the pancreatic lipase enzymes.
The glucoside salicin (▶ Sect. 3.1) represents a true prodrug that after hydrolysis
and oxidation is converted to the anti-inflammatory salicylic acid. In contrast,
acetylsalicylic acid (ASA) is a mixed type. It has its own activity through the
[Structures not reproduced. Compound labels: 9.11 HMG-CoA; 9.12 Mevalonic acid; 9.13 Lovastatin; 9.14 Active metabolite; 9.15 Bambuterol; 9.16 Terbutaline; 9.17, R = H; 9.18, R = CO(CH2)14CH3; 9.19 Fosphenytoin; 9.20, R = Methyl, and 9.21, R = Ethyl (celecoxib prodrugs); 9.22 Proguanil; 9.23 Cycloguanil]
[Structures: 9.28 Mustard gas; 9.29 N-analog, R = CH3; 9.30 N-Aryl-analog, R = Aryl; 9.31 Cyclophosphamide, with metabolic activation in the liver; 9.32 Active form; Acrolein]
Fig. 9.7 The cytostatic N-methyl and N-aryl compounds 9.29 and 9.30 are derived from mustard
gas 9.28. The first step in the activation of the prodrug cyclophosphamide 9.31 is a metabolic
hydroxylation of the carbon next to the nitrogen atom. The biologically active agent 9.32 and the
toxic side product acrolein come from a labile intermediate that is formed by enzymatic degrada-
tion and spontaneous decomposition.
In the case of the cancer therapeutic 5-fluorouracil 9.33, the activation occurs
through tumor-specific enzymes. The triple-prodrug capecitabine 9.34 is initially
activated to 9.35 by a carboxylesterase in the liver (Fig. 9.8). Then cytidine
deaminase cleaves an amino group to give 9.36 in the liver as well as in the
tumor. Lastly thymidine phosphorylase releases the active substance 9.33 in the
tumor cell. There, the compound unleashes its effect by blocking thymidylate
synthase, an enzyme that plays an important role in the thymine biosynthesis
(▶ Sect. 27.2) in that it delivers building blocks for DNA synthesis. Because cancer
cells divide more quickly than healthy cells, they are more dependent on the activity
of thymidylate synthase.
[Scheme: 9.34 Capecitabine → (carboxylesterase; liver) → 9.35 → (cytidine deaminase; liver, tumor) → 9.36 → (thymidine phosphorylase; tumor) → 9.33 5-Fluorouracil]
Fig. 9.8 The triple-prodrug capecitabine 9.34 is activated to 9.35 by a carboxylesterase in the liver,
then it is transformed into 9.36 by a cytidine deaminase in the tumor, and a thymidine
phosphorylase produces the cancer therapeutic 5-fluorouracil 9.33.
Fig. 9.9 Because dopamine 9.37 cannot enter the central nervous system, the metabolic precursor
L-DOPA 9.38 is used. To reduce the cardiovascular effects of dopamine, L-DOPA is combined
with a peripherally active decarboxylase inhibitor, benserazide 9.39. The administration of
a monoamine oxidase inhibitor, for example, selegiline 9.40, prevents the fast degradation of
dopamine.
L-DOPA with the peripheral decarboxylase inhibitor benserazide 9.39 and the
CNS-effective monoamine oxidase inhibitor selegiline 9.40 (▶ Sect. 27.8) largely solves
this problem. The peripheral side effects are reduced and the CNS effects are
extended (Fig. 9.9). Despite this tour de force of drug design, which has led to
significant therapeutic progress, the metabolically produced dopamine still acts in
too many places. Aside from the residual peripheral side effects, sudden changes
between excessive movement, normal movement, and rigidity, insomnia, agitation,
and hallucinations are all manifestations of the generalized CNS activity.
It has been speculated in conjunction with this observation, whether, in addition
to endogenous and genetic factors, environmental factors, for example, the meta-
bolic transformation of structurally analogous foreign substances, might be respon-
sible for triggering Parkinson’s disease.
9.5 Drug Targeting, Trojan Horses, and Pro-prodrugs
The design of active substances that exert their effect only in, or overwhelmingly in,
one particular organ is called drug targeting. Aside from general principles, for
example an optimal lipophilicity as a prerequisite for crossing the blood–brain
barrier, specific metabolic transformations are used. The Parkinson’s disease drug
L-DOPA, which was introduced in the previous section, is such a prodrug. The
anticonvulsive medicine progabide 9.41 is a double prodrug because both func-
tional groups of the neurotransmitter are masked. After crossing the blood–brain
barrier and release of the amino and carboxyl groups, the actual active compound,
γ-aminobutyric acid (GABA, Fig. 9.10), is formed.
Fig. 9.10 Because it is a lipophilic neutral molecule, progabide 9.41 can cross the blood–brain
barrier. It is transformed into the neurotransmitter γ-aminobutyric acid (GABA) 9.42 upon
metabolic release of the amino and carboxyl groups.
[Scheme labels: neutral, lipophilic; charged, polar; metabolic cleavage; free drug; fast elimination]
Fig. 9.11 Drug targeting in the brain is accomplished with a drug–dihydropyridine conjugate
9.43. This substance can easily enter the central nervous system. Metabolic oxidation leads to
a permanently charged pyridine 9.44, which cannot cross the blood–brain barrier. The active
compound is released in the brain, and the polar conjugate is quickly excreted from the periphery.
The ability of the blood–brain barrier to exclude polar substances can also be used
as a prodrug concept. For this an active compound with a metabolically labile group
can be coupled to a dihydropyridine. The neutral conjugate 9.43 can cross the blood–
brain barrier. Oxidation leads to a permanently charged compound 9.44, which can
no longer leave the brain. Upon metabolic cleavage the free active compound is
released in situ (Fig. 9.11). If oxidation takes place in the periphery, the highly water-soluble
complex is excreted before the actual active substance is released. As nice as
this principle seems, it has not found its way into therapy yet.
[Structures: 9.45 Aciclovir, X = H; 9.46 Valaciclovir, X = L-valyl]
Fig. 9.12 Aciclovir 9.45 is a Trojan horse. An enzymatic phosphorylation of its hydroxyl group
by a viral kinase affords its monophosphorylated form in virus-infected cells only, which is then
transformed to the triphosphate derivative by the cellular kinases. Valaciclovir 9.46 is a
pro-prodrug because it is first transformed to aciclovir by hydrolysis and subsequently activated.
Several analogues of nucleoside bases and nucleosides are Trojan horses. The
anti-herpes medicine aciclovir 9.45 enters the cell in its inactive form. The first
monophosphorylation occurs only in virus-infected cells by a virus-specific thymi-
dine kinase. Next cellular kinases carry out the formation of the triphosphate, the
actual active substance. Because of this aciclovir acts as a targeted antiviral. The
compound is, however, poorly absorbed. The more suitable valaciclovir 9.46
(Fig. 9.12) is understood to be a pro-prodrug. In the organism it is initially
hydrolyzed to aciclovir and then transformed into the active form by the viral
enzyme. Valaciclovir is more lipophilic than aciclovir, but despite this it is more
soluble in water and approximately 55% bioavailable.
Omeprazole 9.47 is the prodrug of an irreversible inhibitor of the H+/K+-ATPase,
the so-called proton pump. Only under strongly acidic conditions, in the
acid-producing cells of the stomach, is it transformed into the sulfenic acid 9.48, which
is in equilibrium with the cyclic sulfenamide 9.49 (Fig. 9.13). The latter reacts irreversibly
with an SH group of the enzyme to form a disulfide. Omeprazole is more effective
than the H2 antagonists (▶ Sect. 3.5) because it blocks not only the histamine-induced
acid secretion but all forms of acid secretion.
The different metabolic activity in different tissues can be used to achieve
a selective effect in one specific organ. In principle, adrenaline (▶ Sect. 1.4) as
well as some β-blockers are suitable for the treatment of glaucoma, because they
can normalize elevated intraocular pressure. However, they have substantial unde-
sirable side effects on the heart function and circulation. This can be avoided by the
administration of prodrugs that are metabolized more quickly in the eye, or only in
the eye, for example, a particularly robust ester 9.50 of adrenaline 9.51, or a ketone–
oxime ether 9.52 of timolol 9.53 (Fig. 9.14).
The area of drug targeting has developed into an exciting field in the last years.
Aside from the above-described prodrugs that release active compounds in the
target area, the concept of antibody-coupled drugs has been pursued especially
for the development of novel cancer therapeutics. Another approach is the
coupling of drugs to a cell-specific recognition sequence. The goal of this work
is to trick the membrane transporters of very specific cells so that the drug
conjugate gains entry. Tumor therapeutics derived from nitrogen mustard (N-lost) were
introduced in ▶ Sect. 9.3. These cytotoxic alkylating compounds, however, are very
reactive and should only be activated in the desired target tissue. For this, the
Fig. 9.13 In the presence of acids, omeprazole 9.47 is rearranged to a sulfenic acid 9.48, which is
in equilibrium with a cyclic sulfenamide 9.49. This reacts irreversibly with an SH group on the
H+/K+-ATPase, the so-called proton pump.
[Structure labels: 9.51 Adrenaline, R = H; ketone, X = O; 9.53 Timolol, X = H, OH]
Fig. 9.14 The metabolic peculiarities of the eye are exploited for drug targeting in glaucoma
therapy. After penetrating the cornea, the bis-pivaloyl ester, dipivefrin 9.50 of adrenaline 9.51 is
hydrolyzed 20 times faster than it is in the periphery. The oxime ether of timolol 9.52 is
metabolized through the ketone to the active form, timolol 9.53, only in the eye.
following strategies were developed. The aromatic nitrogen mustard derivative 9.55 (Fig. 9.15) is
released from prodrug 9.54 by specific peptide cleavage with carboxypeptidase
G2, an enzyme that only exists in bacteria. This enzyme was coupled to
a monoclonal antibody (▶ Sect. 32.3) that specifically recognizes human colorectal
cancer cells. In this way, the enzyme that “arms” the cancer drug is brought into
the immediate vicinity of the cancer cell. In the future, this antibody-guided
enzyme-activated prodrug therapy could make cancer therapy more tolerable
and less toxic by releasing the active substance locally and in a distinctly more
targeted way.
9.6 Synopsis
Bibliography
General Literature
Balant LP, Doelker E (1995) Metabolic considerations in prodrug design. In: Wolff ME (ed)
Burger’s medicinal chemistry, vol I, 5th edn. Wiley, New York, pp 949–982
Bodor N (1987) Prodrugs and site-specific chemical delivery systems. Annu Rep Med Chem
22:303–313
Bundgaard H (ed) (1985) Design of prodrugs. Elsevier, Amsterdam
Bundgaard H (1991) Design and application of prodrugs. In: Krogsgaard-Larsen P,
Bundgaard H (eds) A textbook of drug design and development. Harwood Academic, Chur,
pp 113–191
Ettmayer P, Amidon GL, Clement B, Testa B (2004) Lessons learned from marketed and investigational
prodrugs. J Med Chem 47:2394–2404
Gibson GG (1994) Introduction to drug metabolism. Blackie, London
Rautio J (2012) Prodrugs and targeted delivery—towards better ADME properties. In:
Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry,
vol 47. Wiley-VCH, Weinheim
Silverman RB (2004) The organic chemistry of drug design and drug action, 2nd edn. Elsevier
Academic, Oxford, Chapter 7, Drug metabolism, and Chapter 8, Prodrugs and drug delivery
systems
Stella VJ, Borchardt RT, Hageman MJ, Oliyai R, Maag H, Tilley JW (eds) (2007) Prodrugs:
challenges and rewards, vol 2. Springer, New York
Testa B (2007) Prodrug and soft drug design. In: Taylor JB, Triggle DJ (eds) Comprehensive
medicinal chemistry II, vol 5. Elsevier, Oxford, pp 1009–1041
Testa B, Mayer JM (2003) Hydrolysis in drug and prodrug metabolism – chemistry, biochemistry
and enzymology. Wiley-VHCA, Zürich
Special Literature
Bodor N, Buchwald P (2005) Ophthalmic drug design based on the metabolic activity of the eye:
soft drugs and chemical delivery systems. AAPS J 7:E820–E833
Brewster ME, Pop E, Bodor N (1993) Chemical approaches to brain-targeting of biologically
active compounds. In: Kozikowski AP (ed) Drug design for neuroscience. Raven, New York
Napier MP, Sharma SK et al (2000) Antibody-directed enzyme prodrug therapy: efficacy and
mechanism of action in colorectal carcinoma. Clin Cancer Res 6:765–772
10 Peptidomimetics
Peptides are open-chain polymers made up of amino acids (Fig. 10.1). The main
chain is constructed of alternating amide groups —CONH— and aliphatic
carbon atoms, which are labeled Cα. The side chains branch from the main chain
at the Cα atom. The amide group is barely flexible (▶ Sect. 14.1). In contrast,
a rotation around the Cα–Cβ bond is possible. The side chains are flexible as well.
Because of this, each amino acid can take on multiple conformations. As a
consequence, peptides are very flexible molecules with many rotatable bonds and
a multitude of possibilities to adopt different spatial configurations. Formally, there
is no difference between the construction of peptides and proteins. Nonetheless,
oligomers of amino acids up to a size of 30–50 monomer building blocks are called
peptides, and the term protein is preferred for any members of this substance class
that are above this limit.
[Structure: Tyr-Gly-Gly-Phe-Leu, with the torsion angles ω, φ, ψ, and χ and the atoms Cα, Cβ, and Cγ marked]
Fig. 10.1 The pentapeptide Leu-enkephalin as an example of a peptide structure. The left side
with the free NH2 group is the N terminus, and the other is the C terminus. Each amino acid
contributes three atoms to the peptide chain. Nature almost exclusively uses the 20 natural
(proteinogenic) L-amino acids for the construction of peptides (see Appendix 1). Depending on
the functional groups in the side chains, the distinction is made between hydrophilic acidic and
basic amino acids and those with hydrophobic aliphatic and aromatic side chains. The amino acids
are abbreviated with three-letter codes. A one-letter code is also used. The definition of the torsion
angles ω, φ, ψ, and χ is shown on the example of the amino acid phenylalanine. The angle ω is
practically always close to 180°. The spatial course of the peptide backbone is determined by the φ
and ψ angles (see ▶ Sect. 14.2). The first atom in the side chain is called the Cβ atom, and the next
is given the index γ.
[Figure 10.2: the peptide drugs oxytocin (H-Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly-NH2), ciclosporin (structural formula), and leuprolide (pGlu-His-Trp-Ser-Tyr-D-Leu-Leu-Arg-Pro-NHEt).]
Fig. 10.2 Peptides as drugs. Oxytocin is used to induce and strengthen contractions during labor.
The immunosuppressive ciclosporin prevents organ rejection after transplantation. Leuprolide
(pGlu = pyroglutamate) is an analogue of LHRH (luteinizing hormone releasing hormone), one
of the hypothalamic hormones that, via LH (luteinizing hormone), controls the synthesis of male
and female sexual hormones. Leuprolide is used to treat advanced-stage prostate cancer.
are replaced with isosteric building blocks so that the molecular recognition
properties of the peptide remain, but the undesirable characteristics are reduced.
Such peptidomimetics should have the following qualities:
• Few or no cleavable amide bonds to improve metabolic stability.
• Reduced molecular weight to improve oral bioavailability.
• The same spatial orientation of groups responsible for strong binding to the
receptor or enzyme as in the peptide.
Bacteria are the true masters of constructing peptide structures that frequently
achieve the desired metabolic stability. They incorporate amino acids that do not
belong to the typical 20 residues that are usually used for the construction of
proteins. Stereochemically inverted amino acids are also employed, and many of
these structures have a cyclic architecture. They have even evolved a dedicated
synthesis machinery for this: nonribosomal peptide synthesis (▶ Sect. 32.6). This
system of modular, coupled enzymes works like an assembly line. Depending on
the desired product, different enzymatic functional units are lined up, one after the
other, to successively assemble the amino acids, cyclizing the product in the final
step. The exchange of an enzymatic synthesis unit causes other amino acids to be
incorporated into the otherwise unchanged peptide. Even ester bonds can be
constructed with a very similar multienzyme complex. Many lead structures all the
way to complete drugs can be derived from these originally bacterial peptides, such
as ciclosporin in Fig. 10.2, which is one of the most important immunosuppressants. A large
number of macrolide antibiotics (▶ Sect. 32.6) are also synthesized in this way.
Recently a so-called chemoenzymatic synthetic strategy has been developed for
the construction of such macrolides. As discussed in ▶ Sect. 11.6, linear oligopeptides
can easily be synthesized by using the Merrifield synthesis. Non-natural amino acids
with L and D configurations can also be used to generate high combinatorial diversity.
It is very difficult to cyclize these linear oligopeptides to the desired macrocycle by
using chemical-synthetic methods. Here the nonribosomal peptide synthetic machin-
ery is of service. The synthetically prepared peptides are then funneled into the
enzymatic process chain and the cyclization domain from the bacteria catalyzes
the ring closure of the peptide: a perfect symbiosis between synthetic chemistry
and enzyme biology!
In the beginning of the 1980s, there was only one generally accepted example for
a low-molecular-weight active substance that takes over the function of an endogenous
peptide: the opiates. It is assumed that morphine 10.1 is a mimetic of the
endogenous peptide β-endorphin 10.2 (Fig. 10.3). A comparison of both structures
makes it immediately clear that morphine cannot possibly simulate all of the
functional groups of the peptide. Obviously not all are necessary for the biological
activity. This underscores the suspicion that other peptides also bind to receptors
with only a few functional groups. If this hypothesis is true, it should be possible to
identify the essential functional groups and find a small organic molecule that has
the necessary functional groups in the correct relative orientation.
The starting point for the design of peptidomimetics is the identification of the
biologically active peptide, the function of which is to be imitated. In the first step,
single amino acids are excluded to determine whether a portion of the peptide
retains sufficient activity. Next the importance of the individual side chains is
investigated. In a so-called alanine scan (Sect. 10.7), each amino acid is succes-
sively replaced with alanine. A severe loss of activity is an indication that the
removed side chain is important. Up to this point, only peptides made up of the 20 natural
amino acids have been investigated. In the next step structural elements are
introduced that do not occur in the 20 proteinogenic amino acids. In principle, the
following are possibilities for peptide structure modification:
• The use of D- instead of L-amino acids.
• Modifications of the side chain of amino acids.
[Figure 10.3: structural formula of morphine 10.1 and the sequence of β-endorphin 10.2: Tyr-Gly-Gly-Phe-Met-Thr-Ser-Glu-Lys-Ser-Gln-Thr-Pro-Leu-Val-Thr-Leu-Phe-Lys-Asn-Ala-Ile-Ile-Lys-Asn-Ala-Tyr-Lys-Lys-Gly-Glu.]
Fig. 10.3 Morphine 10.1 is a peptidomimetic for the endogenous peptide β-endorphin 10.2 and
the enkephalins (▶ Sect. 1.4). It binds as an agonist to the opiate receptor.
10.3 First Step to Variation: Modifying Side Chains
Fig. 10.4 Sterically demanding, conformationally fixed, or metabolically stable analogues of the
amino acid phenylalanine; the structural enhancements are indicated in red.
binding pocket more completely. Rigid analogues lead to improved binding if the
biologically active conformation, the one that is adopted in or at the receptor site,
is immobilized.
The introduction of nonproteinogenic amino acids can increase the metabolic
stability. The hydroxylation of aromatic side chains can be suppressed by using
a substituent, for example fluorine or a methoxy group, in the para position.
Stability to cleavage by the digestive enzyme chymotrypsin can be improved by
adding substituents to the Cβ atom because the modified side chain no longer fits
into the active site of this protease. A peptide’s proteolytic stability can also be
improved by exchanging L- for D-amino acids. As described above, bacteria have
already recognized this trick. Distributing D-amino acids randomly in the peptide
can furnish active substances with astonishing metabolic stability.
[Figure 10.5: structural formulas of the amide bond and of its replacements: N-methyl amide, ketomethylene, hydroxyethylene, (E)-ethylene, carba, ether, reduced amide, retro-inverso amide, and the phosphorus-based groups (phosphonamides, phosphonates, and phosphinates with X = –NH–, –O–, and –CH2–, respectively).]
Fig. 10.5 Different functional groups that can serve as a replacement for amide bonds in
peptidomimetics.
10.4 A More Courageous Step: Modifying the Main Chain
possible replacement for the amide bond that is destined for cleavage (▶ Chap. 23,
“Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate”). The
hydroxyethylene group is especially suitable for aspartic protease inhibitors
(▶ Chap. 24, “Aspartic Protease Inhibitors”). Phosphonamides, phosphonates,
and phosphinates are often strong inhibitors of metalloproteases (▶ Chap. 25,
“Inhibitors of Hydrolyzing Metalloenzymes”).
Fig. 10.6 A β turn is a peptide conformation in which a hydrogen bond is
formed between the amino acids i and i + 3. Particular ranges for the values of the
torsion angles φi+1, ψi+1, φi+2, and ψi+2 are characteristic for the β turn.
10.5 Rigidifying the Backbone by Fixing Conformations
Fig. 10.7 Typical β-turn mimics. The amino acids are added onto the template at the colored
positions.
Fig. 10.8 The illustrated rings replace one or two amino acids and force a particular
conformation.
[Figure 10.9: structures of TRH 10.3, the derived pharmacophore, and the non-peptidic molecule 10.4.]
Fig. 10.9 By starting with the structure of tripeptide TRH 10.3 and a hypothesis for the functional
groups that are essential for binding, the non-peptidic molecule 10.4 was designed, which also
binds to the TRH receptor.
TRH is the tripeptide pGlu–His–Pro–NH2 10.3. The approach is shown in Fig. 10.9.
After deducing a pharmacophore hypothesis, a rigid scaffold molecule was sought
upon which the side chains could be appended in the correct relative orientation.
Cyclohexane was chosen as a scaffold. Compound 10.4 is a potent TRH receptor
ligand. The substance acts as an agonist and elicits the same effects as TRH. An
improvement in cognitive function could be seen in animal experiments after the
administration of 10.4.
10.6 Peptidomimetics to Interfere with Protein–Protein Interactions
Fig. 10.10 The NMR spectroscopic structure of the BCL-XL protein with the α-helical,
16-residue peptide fragment from the BAK protein (orange). The peptide binds in a deep groove
with the amino acids Ile85, Ile81, Leu78, Val74 (from left to right, side chains are in light blue).
The surface of the BCL-XL protein is shown in white. The hydrophobic amino
acids of the peptide all protrude into the cleft, and their contact surface is indicated by the light-blue net.
Proteins communicate with one another and transmit information and signals by
forming mutual complexes through commonly shared surfaces. The area of the
shared contact surface is usually larger than a few thousand square Ångströms (Å²).
This is a large value when compared to the surface that a small organic molecule of
typical drug size occupies upon binding. Furthermore, the contact area between
two proteins is, as a general rule, not very jagged. It hardly resembles the deep
binding pockets in enzymes that can host small ligands. Nevertheless it would open
entirely new perspectives for drug therapy if such protein–protein contact surfaces
could be blocked with low-molecular-weight compounds. At first glance, this task
seems almost impossible. How can a small molecule bind to a flat, barely structured
protein surface with an interaction that is strong enough not to be “washed away”
when the protein–protein contact forms? Furthermore, there is the problem that
amino acid residues on the convex surface of a protein have in general much more
space to flexibly adapt their conformation. A statistical analysis of the amino acid
composition across the contact surfaces in protein complexes showed a preference
for aromatic residues, aspartate, arginine and the aliphatic residues proline and
isoleucine. The selective exchange of amino acids in the contact surface also
showed that there are a few protruding residues that dominate the interaction
(so-called “hot spots,” ▶ Sect. 17.10). The search for possible binding sites of a
small molecule that can compete with the formation of the protein–protein interface
starts with a detailed analysis of the complementary geometry to the contacting
surfaces. Are there clustered areas with charged residues or does a structural
element such as a β turn or a helix penetrate a little more deeply into the opposite
contact surface? Next, the peptide sequence that corresponds to the contact surface
is synthesized. This can be portions that preferably adopt a helical structure or that
can be fixed in a turn pattern such as a cyclopeptide. If an active peptide is found, it
must be structurally characterized in complex with the opposite contact surface.
The complex of the BCL-XL (B-cell lymphoma) protein with a 16-residue
peptide that was cut from the BAK protein is shown in Fig. 10.10. BCL-XL belongs
to the proteins that prevent programmed cell death (apoptosis). Its function is
regulated by binding to pro- and antiapoptotic factors such as BAK. Inhibitors of
this contact formation might therefore deliver potential drugs for an anticancer
therapy. The binding of the helical peptide takes place in a stretched-out groove.
Small molecules have been discovered that fill this crevice (Fig. 10.11). The group
of Andrew Hamilton at Yale University has been searching for a basic scaffold that
can imitate the characteristics of a helix and simultaneously hold the side chains on
one side. Terphenyl derivatives 10.5–10.7 were found that can arrange the side
chains in a staggered conformation analogous to a helix. An alanine scan along the
BAK peptide showed that four hydrophobic residues (Val74, Leu78, Ile81, and
Ile85) are essential for binding. In addition, Asp83 forms a salt bridge to BCL-XL.
The terphenyl scaffold was therefore furnished with an acidic group at the end and
decorated with alkyl and aryl residues in the ortho positions. Compound 10.6 binds
to the BCL-XL protein with an affinity of 114 nM.
A different approach was taken at Abbott. Small molecules that interact
with the BCL protein were sought by NMR spectroscopy (▶ Sect. 7.8). The
millimolar inhibitors para-fluorobiphenylcarboxylic acid 10.8 (Fig. 10.11) and
1-hydroxytetraline 10.9 were discovered. They bind to distinct but neighboring
positions: 10.8 replaces Asp83 and Leu78 of the binding domain of the BAK peptide,
and 10.9 occupies the Ile85 position. From the two discovered fragments, the scientists
at Abbott developed compound 10.10, which had two-digit nanomolar affinity for the
protein. Further optimization led to 10.11 (ABT-737), a highly potent antagonist that blocks the
entire family of antiapoptotic BCL-2 proteins. The synergistic effect of ABT-737
together with radiation and chemotherapy was demonstrated in animal experiments.
An analogous case was studied with the MDM2 protein at Roche. MDM2 is
overexpressed in many tumors. It binds to the tumor-suppressor protein p53, which
protects cells from converting to a malignant state. p53 is therefore the protein that is
most often inactivated during carcinogenesis. Inhibition of complex formation
between the overexpressed MDM2 protein and p53 could thus represent an
approach to a possible cancer therapy. Here too, an α-helical p53 peptide stretch
binds to a hydrophobic groove on the MDM2 protein. A cis-imidazoline with an
affinity of 100–300 nM was found in screening. A co-crystal structure was
obtained with 10.12 (Fig. 10.11). The imidazoline scaffold imitates one side
of an α helix of the peptide from the p53 protein. The two p-bromophenyl rings
replace a Trp and a Leu. The ethyl ether group on the third aromatic ring orients into
the pocket that is filled with a phenylalanine in the peptide. The MDM2 protein is
blocked through this competitive binding, and the level of free p53 increases.
Through this, the p53 pathway in cancer cells is activated, and the cell cycle
comes to a complete stop. The cell may go into programmed cell death.
Tumor growth inhibition has already been demonstrated in animal models.
Another large class of proteins that is controlled by contacts with other proteins
is the integrins. Numerous low-molecular-weight inhibitors have been discovered
for this class. An example for the successful design of antagonists by starting from
cyclic peptides is presented in ▶ Sect. 31.2. Many G protein-coupled receptors
(▶ Sect. 29.1) are controlled by endogenous peptides or proteins. For this, the
peptide or protein binds to the receptor. The replacement of the peptide sequences
with an organic molecule that imitates the binding of the natural ligand has also
been attempted. An example of the design of such an active compound is given in
▶ Sects. 29.5 and ▶ 29.6. Although successful, the design concept that was followed
was wrong: the active peptide and the derived synthetic mimic do not bind in
an overlapping binding region of the receptor.
[Figure 10.11: structures of the inhibitors discussed in the text. Values recoverable from the figure: 10.8, Kd = 0.3 mM; 10.9, Kd = 4.3 mM; 10.10, Ki = 36 nM; 10.11 (ABT-737), Ki = 1 nM; 10.12.]
Fig. 10.11 Different inhibitors of protein–protein contacts that imitate the α-helical structural
building blocks in the contact surface. The terphenyl derivatives 10.5–10.7 bind to the BCL-XL
protein in a pronounced crevice and block the binding site of a helix. The small fragments 10.8 and
10.9, which led to the development of inhibitors 10.10 and 10.11, were discovered in the same area
in an NMR spectroscopic screening. Compound 10.12 is a different helix mimetic that prevents the
interaction between the MDM2 and p53 proteins.
Tachykinins are neuropeptides that all contain the same lipophilic C terminus:
–Phe–X–Gly–Leu–Met–NH2. A well-investigated representative of the tachykinins
is substance P, Arg–Pro–Lys–Pro–Gln–Gln–Phe–Phe–Gly–Leu–Met–NH2 (10.13,
Table 10.2). Tachykinins bind to at least three different tachykinin receptors, the
NK1, NK2, and NK3 receptors. All three belong to the class of G protein-coupled
receptors (▶ Sect. 29.1). They mediate a variety of biological effects, for example,
bronchoconstriction or pain transmission. Consequently a receptor antagonist could
be helpful for the treatment of asthma as well as to fight pain.
[Figure 10.12: structures of 10.22 (R = H, Ki = 2700 nM) and 10.23 (R = CH3, Ki = 327 nM), drawn on one scaffold, and of 10.26 (R = H, Ki = 17.2 nM) and 10.27 (R = CH2CONH2, Ki = 1.4 nM), drawn on a second scaffold.]
Fig. 10.12 Important intermediates on the way to the NK2 receptor antagonist 10.27.
The study that was carried out on the development of an NK2 receptor antagonist
at Parke–Davis in Cambridge is a classic example of the conversion of a peptide to
a peptidomimetic (Table 10.2 and Fig. 10.12). A compound was sought that binds
to the same receptor as substance P. The starting point of the work was a hexapeptide
known from the literature, Leu–Gln–Met–Trp–Phe–Gly–NH2 (10.14), which binds to
the NK2 receptor with an affinity of 11.7 nM. In the first step each amino acid was
systematically exchanged for alanine (10.15–10.20). In a few cases the
replacement with alanine resulted in only a weak decrease in the binding affinity.
As an example, the N-terminal leucine could be replaced with an alanine (10.15).
The conclusion was that the Leu side chain can only be of secondary importance for
receptor binding. The compounds in which tryptophan or phenylalanine was
replaced with alanine, however, showed very little affinity for the NK2 receptor.
This was the “smoking gun” showing that these two amino acids are essential for
binding. The removal of the C-terminal amino acid glycine (10.21) decreased the
affinity by a factor of 7. Obviously this amino acid also has some importance
for receptor binding. The testing of several N-terminally protected dipeptides led
to Z–Trp–Phe–NH2 (10.22, Ki = 2700 nM) as a lead structure for further work.
With this, the first stage of the project was accomplished. As a dipeptide, 10.22
represented an interesting lead structure for further work.
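The alanine scan described above is a simple enumeration: every position is replaced by alanine once, and each analogue is then synthesized and tested. The following minimal sketch only generates the sequences of the analogues for the hexapeptide 10.14; the function name is illustrative, and no binding data are implied.

```python
def alanine_scan(sequence):
    """Return (position, original residue, analogue) for each non-Ala position."""
    analogues = []
    for index, residue in enumerate(sequence):
        if residue == "Ala":
            continue  # replacing Ala with Ala gives no new information
        variant = list(sequence)
        variant[index] = "Ala"
        analogues.append((index + 1, residue, "-".join(variant)))
    return analogues

hexapeptide = ["Leu", "Gln", "Met", "Trp", "Phe", "Gly"]
scan = alanine_scan(hexapeptide)
for position, residue, analogue in scan:
    print(f"{residue}{position} -> Ala: {analogue}-NH2")
```

For the hexapeptide this yields six analogues, which matches the six compound numbers 10.15–10.20 quoted in the text.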
In the next stage, additional methyl groups were introduced at different
positions of the molecule. This limited the number of possible conformations.
A decrease in binding affinity was observed for many of the investigated compounds
with conformational restriction. A methyl group on the Cα atom of
phenylalanine increased the binding affinity by a factor of 8 (10.23, Ki = 327 nM).
A possible explanation for this finding is that the conformation that is adopted in the
receptor is stabilized by the additional methyl group. Then the N-terminal part
of the molecule was varied. The replacement of the terminal phenyl ring with
a 2,3-dimethoxyphenyl group further increased the binding affinity by a factor of 10
(10.24, Ki = 37.6 nM). This value refers to the compound with racemic α-methylphenylalanine.
The enantiomerically pure compound 10.26 with this building block in the
[Figure 10.13: structures of the NK1 receptor antagonists. Values recoverable from the figure for the benzyl ester series:
10.28, R = Et, X = H, IC50 = 3800 nM
10.29, R = H, X = H, IC50 > 10000 nM
10.30, R = H, X = 3,5-di-CH3, IC50 = 1533 nM
10.31, R = Ac, X = 3,5-di-CH3, IC50 = 67 nM
10.32, R = Ac, X = 3,5-di-CF3, IC50 = 1.6 nM
Also shown: ketone 10.33 (IC50 = 3 nM) and aprepitant 10.34.]
Fig. 10.13 The optimization of lead structure 10.28, which was found by screening, to selective
NK1 receptor antagonists 10.32 and 10.33. In contrast to the metabolically labile benzyl esters
10.28–10.32, ketone 10.33 is also active in animal experiments. The first NK1 receptor antagonist
aprepitant 10.34 was successfully brought to the market by MSD for the prevention of acute
emesis.
In the previous sections it was often highlighted that the side chains of the amino
acids are responsible for the binding to receptors. Usually the main chain merely
plays the role of a scaffold that serves to bring the side chains into the necessary
spatial alignment for binding. As such, a rigid, non-peptidic scaffold onto which the
side chains can be attached in the same spatial orientation should be suitable to
design molecules with similar properties as peptides. This idea was embedded in
a computer program in the group of Paul Bartlett at the University of California in
Berkeley. The program CAVEAT allows the search for rigid molecules that
imitate a particular segment of a peptide scaffold. For this, the bonds on the peptide
backbone are described with vectors (Fig. 10.14). The 3D structure of the peptide
for which a peptidomimetic is sought must be known as a prerequisite. The orientation
of the side chains is determined by the Cα–Cβ bond vectors. The relative
orientation of, for instance, three amino acid side chains is given by the positions
of the relevant Cα–Cβ bond vectors. With this spatial pattern of vectors, a 3D
database of molecular scaffolds is searched for entries that contain three substitutable
bonds oriented analogously to the three Cα–Cβ vectors. The result is a list of rigid,
usually cyclic molecular scaffolds, the free positions of which can be coupled to the
amino acid side chains.
[Figure 10.14: peptide lead structure with the Cα–Cβ vectors A, B, and C marked.]
Fig. 10.14 The principles of a 3D search for scaffold mimics with the CAVEAT program. First,
the relative orientation of the biologically active side chains in the peptide lead structure is defined
by the Cα–Cβ vectors. In this example the three essential amino acids Trp, Arg, and Tyr are taken.
The three vectors, A, B, and C, are the essential information used to search the 3D database for rigid
scaffold structures that bear substitutable bonds in the same relative orientation. A list of cyclic
structures that represent possible templates for peptidomimetics is the result.
10.9 Design of Peptidomimetics: Quo Vadis?
In this chapter the systematic approach to the design of peptidomimetics has been
described. The approaches have proven themselves in many cases and have led to
many attractive drugs. Nevertheless there are also difficulties. The first problem is
the stepwise approach. A peptide is systematically modified, and the synthesized
structures serve only to identify the essential functional groups. The synthesis of
the many resultant derivatives, that is, practically all in which an amide group was
10.10 Synopsis
• Proteins communicate with each other through the formation of large, mutu-
ally shared surface patches. Small molecules designed to bind to such flat
surfaces can antagonize complex formation and interfere with protein–protein
communication.
• Design of small molecules to block protein–protein interfaces exploits depres-
sions on the surface that accommodate spatial patterns such as turns or helical
portions of the penetrating contact surface of the partner protein.
• Peptides bind to receptors mostly via side chains, and the backbone provides the
scaffold for their attachment. Computer programs can be used to screen struc-
tural databases to retrieve alternative scaffolds that are able to orient substituents
in very similar fashion.
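The database screening mentioned in the last point can be illustrated with a strongly simplified sketch. It is not the CAVEAT algorithm itself: here each bond vector is a pair (base point, tip point), and two sets of vectors are considered equivalent if all pairwise base-point distances and all pairwise angles between the vectors agree within tolerances. The descriptor choice, the tolerances, and the two database entries are assumptions made only for illustration; among other limitations, this descriptor cannot distinguish mirror images.

```python
import itertools
import math

def pair_descriptor(bond_a, bond_b):
    """Distance between base points and angle (degrees) between two bond vectors."""
    dir_a = tuple(t - b for t, b in zip(bond_a[1], bond_a[0]))
    dir_b = tuple(t - b for t, b in zip(bond_b[1], bond_b[0]))
    dot = sum(x * y for x, y in zip(dir_a, dir_b))
    cos_angle = dot / (math.hypot(*dir_a) * math.hypot(*dir_b))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.dist(bond_a[0], bond_b[0]), math.degrees(math.acos(cos_angle))

def matches(query, bonds, dist_tol=0.5, angle_tol=15.0):
    """True if some ordered selection of bonds reproduces the query pattern."""
    pairs = list(itertools.combinations(range(len(query)), 2))
    target = [pair_descriptor(query[i], query[j]) for i, j in pairs]
    for choice in itertools.permutations(bonds, len(query)):
        if all(abs(d - d0) <= dist_tol and abs(a - a0) <= angle_tol
               for (d, a), (d0, a0) in
               zip((pair_descriptor(choice[i], choice[j]) for i, j in pairs), target)):
            return True
    return False

def search(query, database, dist_tol=0.5, angle_tol=15.0):
    """Names of all scaffolds whose substitutable bonds match the query vectors."""
    return [name for name, bonds in database.items()
            if matches(query, bonds, dist_tol, angle_tol)]

# Hypothetical query: three mutually perpendicular vectors A, B, C.
query = [((0, 0, 0), (0, 0, 1)), ((3, 0, 0), (3, 1, 0)), ((0, 3, 0), (1, 3, 0))]
# Hypothetical database: scaffold_1 is the query pattern shifted in space plus
# one extra bond; scaffold_2 has three parallel bonds.
database = {
    "scaffold_1": [((5, 5, 5), (5, 5, 6)), ((8, 5, 5), (8, 6, 5)),
                   ((5, 8, 5), (6, 8, 5)), ((9, 9, 9), (9, 9, 10))],
    "scaffold_2": [((0, 0, 0), (0, 0, 1)), ((3, 0, 0), (3, 0, 1)),
                   ((0, 3, 0), (0, 3, 1))],
}
hits = search(query, database)
print(hits)
```

Internal descriptors are used so that the comparison does not depend on how a scaffold happens to be positioned in space; a production tool would use a richer geometric description and indexed lookups instead of trying every permutation.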
Bibliography
General Literature
Ahn J-M, Boyle NA, MacDonald MT, Janda KD (2002) Peptidomimetics and peptide backbone
modifications. Mini Rev Med Chem 2:463–473
Gante J (1994) Peptidomimetics—tailored enzyme inhibitors. Angew Chem Int Ed Engl
33:1699–1701
Giannis A, Kolter T (1993) Peptidomimetics for receptor ligands—discovery, development, and
medical perspectives. Angew Chem Int Ed Engl 32:1244–1267
Hirschmann R (1991) Medicinal chemistry in the golden age of biology: lessons from steroid and
peptide research. Angew Chem Int Ed Engl 30:1278–1301
Marahiel MA (2009) Working outside the protein-synthesis rules: Insights into non-ribosomal
peptide synthesis. J Pept Sci 15:799–807
Special Literature
Howson W (1995) Rational design of Tachykinin receptor antagonists. Drug News Perspect
8:97–103
Lauri G, Bartlett PA (1994) CAVEAT: a program to facilitate the design of organic molecules.
J Comput Aided Mol Des 8:51–66
Lelais G, Seebach D (2004) β2-amino acids-synthesis, occurrence in natural products, and
components of β-peptides. Biopolymers 76:206–243
MacLeod AM, Merchant KJ, Cascieri MA et al (1993) N-Acyl-L-tryptophan benzyl esters: potent
substance P receptor antagonists. J Med Chem 36:2044–2045
Merchant KJ, Lewis RT, MacLeod AM (1994) Synthesis of homochiral ketones derived from
L-tryptophan: potent substance P receptor antagonists. Tetrahedron Lett 35:4205–4208
Olson GL, Bolin DR, Bonner MP et al (1993) Concepts and progress in the development of peptide
mimetics. J Med Chem 36:3039–3049
Part III
Experimental and Theoretical Methods
11 Combinatorics: Chemistry with Big Numbers
The search for new lead structures and the optimization of their activity profile by
systematic modification are among the most time- and cost-intensive steps in drug
research. The optimization of a small organic molecule can serve as an example.
Even if the number of different groups per position is limited to relatively few,
several million structures are possible, as shown for the
multisubstituted tetrahydroisoquinoline carboxylic acid amide 11.1 (Fig. 11.1). The
combinatorial explosion of all imaginable substitution possibilities can no longer be
realized with classical chemical techniques. The diversity increases even more
when the different stereoisomers are considered. The number is already consider-
ably larger than the number of all of the compounds referenced in Chemical
Abstracts (33 million) or in Beilstein (10 million compounds).
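The size of such a virtual library is the product of the numbers of groups allowed at each position. A short check with the group counts given for 11.1 in Fig. 11.1 (the variable names are illustrative):

```python
from math import prod

# Number of different groups allowed at positions R1-R10 (from Fig. 11.1).
groups_per_position = [5, 10, 10, 4, 5, 5, 5, 2, 2, 20]

n_building_blocks = sum(groups_per_position)   # building blocks needed in total
n_compounds = prod(groups_per_position)        # all combinations of substituents
n_stereocenters = 2                            # marked with * in Fig. 11.1
n_with_stereoisomers = n_compounds * 2 ** n_stereocenters

print(n_building_blocks, n_compounds, n_with_stereoisomers)
```

Only 68 building blocks are therefore enough to define 20 million constitutionally different compounds, or 80 million when both stereocenters are varied.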
In the days when substances were tested on whole animals or in complex
pharmacological in vitro models, the biological tests were the rate-determining
step. The introduction of molecular test models, for example, enzyme or
receptor-binding tests, and extensive automation of screening has fundamentally
changed this situation. Testing of many thousands of compounds per day is
technically unproblematic (▶ Sect. 7.3). To use the capacity of these methods
to their fullest extent, the synthesis of thousands or even tens or hundreds of
thousands of different molecules is desirable. The strategy can then shift either to
automated parallel synthesis to cover a large number of single compounds or
the simultaneous production of compound mixtures by using combinatorial
chemistry.
[Figure 11.1: scaffold of 11.1 with the substituent positions R1–R10, the number of groups available at each position, and the two stereocenters (*).]
Fig. 11.1 The tetrahydroisoquinoline carboxylic acid amide 11.1 is to be substituted in 10 positions.
The groups in these positions are drawn from a total of 68 building blocks
(R1–R10 = 5, 10, 10, 4, 5, 5, 5, 2, 2, 20 groups). Twenty million compounds can be constructed in
this way. If the structural diversity that results from the two stereocenters (*) is considered, this
number increases again by a factor of 4.
Nature has shown a way to achieve combinatorial diversity with the nucleic acids
and with proteins. A 600-base-pair DNA sequence codes for a protein with 200 amino
acids. From the “pool” of four nucleotides that code for the 20 proteinogenic
amino acids in triplet sequences, 4⁶⁰⁰ (a number with 362 digits!) different
DNA sequences are possible. This translates to 20²⁰⁰ (a number with 261 digits!)
different amino acid sequences for the resulting protein. Short peptides with
enormous structural variety can be constructed with just the 20 proteinogenic
amino acids. If instead of amino acid A, a manageable number of modified
amino acids M is used, the number of possible analogues increases even more
(Table 11.1).
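The orders of magnitude quoted above can be checked with exact integer arithmetic. The sketch below also counts short peptides built from the 20 proteinogenic amino acids plus a number of modified ones; the value of 10 modified amino acids is an arbitrary illustration and is not taken from Table 11.1.

```python
def n_sequences(alphabet_size, length):
    """Number of linear sequences of a given length over an alphabet."""
    return alphabet_size ** length

dna_sequences = n_sequences(4, 600)        # 600 base pairs, 4 nucleotides
protein_sequences = n_sequences(20, 200)   # 200 residues, 20 amino acids
dna_digits = len(str(dna_sequences))
protein_digits = len(str(protein_sequences))
print(dna_digits, protein_digits)

# Short peptides: 20 natural amino acids versus 20 + 10 (hypothetical) building blocks.
for length in (2, 3, 4, 5, 6):
    print(length, n_sequences(20, length), n_sequences(20 + 10, length))
```

Even for pentapeptides, the 20 natural amino acids already allow 3.2 million sequences, and a modest number of additional building blocks raises this figure several-fold.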
Peptides play an important role in biological systems. They are found as
protein ligands in the free form or as simple derivatives. Peptide sequences exposed
on the surface of a protein determine the recognition properties of the protein at
a receptor. Nature exhausts the full combinatorial diversity of the variable
sequences on the surface regions (epitopes) of proteins for their selective recogni-
tion. This principle of Nature can be adopted to generate huge compound libraries
with highly variable composition.
11.2 Protein Biosynthesis as a Tool to Build Compound Libraries
[Figure 11.2: structures of the acid chloride building blocks 11.2, 11.3, and 11.4, a generic library member with the amino acid positions AA1–AA4, and the isomers 11.5 and 11.6 carrying Val and Pro residues.]
Fig. 11.2 The oligofunctional acid chlorides of the central building blocks cubane 11.2, xanthene
11.3, and benzene 11.4 are treated with protected amino acids. A xanthene-containing library
inhibits the digestive enzyme trypsin. The active component of the library was deconvoluted and
characterized by targeted resynthesis. In the end the isomers 11.5 and 11.6 remained as the most
potent compounds. The derivative 11.5 inhibits trypsin with a Ki of 9.4 μM.
11.4 What Is Contained in Chemical Space?
strategy also has disadvantages. The coupling partners have different reactivities. As
a result, the products are not evenly distributed. The transformation of a particular
functional group on the central building block can depend upon which components
the central molecule has already reacted with, and how this influences the other
functional groups.
The thus-generated library is then tested. If binding to the target protein is found,
the active substance in the mixture is characterized, a task that is not particularly
simple. On the one hand, sophisticated analytical techniques such as liquid chro-
matography coupled with NMR spectroscopy and mass spectrometry can be used.
Moreover, an attempt can be made to “deconvolute” the library. For this, a targeted
resynthesis of the library is carried out in which a partial library is prepared by using
a defined selection of building blocks. This smaller library is then tested and the
composition of the active mixture is determined. This strategy must be followed
back to the level of a single defined reaction product.
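The deconvolution by targeted resynthesis can be sketched as an iterative procedure: one position is fixed at a time, a partial library is prepared for each building block at that position, and the block whose partial library is most active is kept. In the sketch below, the building blocks, the hidden active compound, and the assay (the fraction of active molecules in a mixture) are all hypothetical stand-ins for real synthesis and testing.

```python
import itertools

def iterative_deconvolution(blocks_per_position, assay):
    """Fix positions one after another, keeping the most active partial library."""
    fixed = []
    for position, blocks in enumerate(blocks_per_position):
        remaining = blocks_per_position[position + 1:]
        best_block, best_signal = None, float("-inf")
        for block in blocks:
            # "Resynthesize" a partial library: fixed prefix, this block,
            # and every combination at the positions not yet resolved.
            sublibrary = [tuple(fixed) + (block,) + rest
                          for rest in itertools.product(*remaining)]
            signal = assay(sublibrary)
            if signal > best_signal:
                best_block, best_signal = block, signal
        fixed.append(best_block)
    return tuple(fixed)

HIDDEN_ACTIVE = ("B", "D", "A", "C")  # hypothetical active library member

def toy_assay(sublibrary):
    """Hypothetical readout: fraction of active molecules in the mixture."""
    return sum(1 for compound in sublibrary if compound == HIDDEN_ACTIVE) / len(sublibrary)

positions = [["A", "B", "C", "D"]] * 4   # 4 positions x 4 blocks = 256 compounds
found = iterative_deconvolution(positions, toy_assay)
print(found)
```

In this toy case 16 partial libraries are tested instead of 256 single compounds. Real mixtures are less well behaved, for instance because of the uneven product distribution mentioned above or because several weakly active members add up to a signal.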
At this point the fundamental question must be asked: how many organic molecules
are principally possible from which medicinal chemists can create their candidates?
What is the content of this, at first virtual, chemical space? Much has been speculated
about this question. Numbers between 1020 and 10200 possible molecules have been
named. The last claim encompasses so many molecules that the entire mass of the
universe would not be enough to synthesize at least one molecule of every com-
pound! We have to thank Tobias Fink and Jean-Louis Reymond of the University of
Bern for forming a concrete idea of the principle occupancy of this chemical space.
Beginning with mathematical graphs that describe simple hydrocarbon scaffolds,
molecules with up to 11 C, N, O, and F atoms were generated on the computer.
Heteroatoms and unsaturated bonds were scattered throughout the generated molec-
ular graphs in a combinatorial fashion. Different filters that consider the chemical
stability of the functional groups, the strain of the ring systems, and the formation of
tautomers produced a database of 26.4 million structures. If all possible stereoiso-
mers are generated, an average of 4.2 isomers per entry is formed. The database
finally encompassed 110 million molecules. It is interesting to see that the number of
entries increases exponentially with the square of the number of atoms. As a result,
90% of the database already consists of molecules with 11 non-hydrogen atoms.
If the number of molecules that can be generated with 25 non-hydrogen atoms is
estimated, the result is 10^27 imaginable products. Twenty-five atoms represent
approximately the average size of a typical drug molecule.
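The bookkeeping behind these figures can be checked in a few lines. The following is only a minimal sketch that reuses the numbers quoted above; the variable names are chosen for illustration and do not come from the original work.

```python
# Figures quoted in the text for the enumerated database.
structures = 26.4e6        # entries that passed the stability, strain, and tautomer filters
isomers_per_entry = 4.2    # average number of stereoisomers per entry

stereoisomers = structures * isomers_per_entry

# Estimated number of molecules with 25 non-hydrogen atoms (from the text),
# compared with the size of the enumerated 11-atom database.
estimated_25_atoms = 1e27
ratio = estimated_25_atoms / structures

print(f"Stereoisomers: {stereoisomers / 1e6:.1f} million")
print(f"The 25-atom estimate exceeds the database by a factor of about {ratio:.1e}")
```

The product of entries and isomers per entry reproduces the roughly 110 million molecules mentioned above, and the ratio illustrates how small the enumerated database is compared with the estimated space of drug-sized molecules.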
It is worthwhile, however, to look at the database with entries of 11
non-hydrogen atoms more closely. The average molecular mass in this database is
153 ± 7 Da. Molecules of this size fall into the range of typical fragments or
“lead-like” molecules (▶ Sect. 7.9). Exclusion criteria were proposed that emphasize
promising candidates for drug development. The so-called “rule of three” is modeled on
the “rule of five,” which was established by Chris Lipinski at Pfizer (▶ Sect. 19.7).
If the database is filtered with these rules, approximately half of the entries remain. Of
these, ca. 15% are acyclic compounds, and about 43% contain one ring. It is very
enlightening to see that only about 55% of the ring systems in the virtual database
have been described in Chemical Abstracts or Beilstein. Comparison with a data
collection of already-synthesized molecules of the same size makes clear where the
chemical space has been only sketchily explored. It seems that very big gaps still
exist! Over 99.8% of the entries in the virtual database are waiting to be synthesized.
A comparison of the physicochemical properties of the molecules in both databases
suggests that very broad areas still remain that until now have not been explored.
If the chemical space is limited to compounds with 7, 8, or 9 atoms, it seems that the
chemical space is well covered with already prepared molecules. Approximately 2/3
of the molecules with 10 or 11 atoms in the virtual database are chiral. In this group
particularly, there are many candidates that meet the “lead-like” criteria. This is a real
challenge for synthetic chemists. Chiral fused carbo- and heterocycles are difficult to
make. Nevertheless, Nature has led the way: many biologically active natural prod-
ucts contain just these building blocks.
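Filtering a virtual database with such exclusion criteria amounts to a simple test on a handful of descriptors. The sketch below assumes the commonly cited form of the “rule of three” (molecular mass of at most 300 Da, at most three H-bond donors, at most three H-bond acceptors, and a calculated logP of at most 3); these thresholds are not stated in this section, and the three example entries are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    name: str
    mol_weight: float   # molecular mass in Da
    h_donors: int       # number of H-bond donors
    h_acceptors: int    # number of H-bond acceptors
    clogp: float        # calculated octanol/water partition coefficient


def passes_rule_of_three(d: Descriptors) -> bool:
    """Assumed thresholds of the 'rule of three' (not taken from this section)."""
    return (d.mol_weight <= 300.0 and d.h_donors <= 3
            and d.h_acceptors <= 3 and d.clogp <= 3.0)


# Hypothetical database entries, for illustration only.
database = [
    Descriptors("entry_1", 153.0, 1, 2, 1.2),
    Descriptors("entry_2", 149.0, 4, 3, -0.5),   # too many H-bond donors
    Descriptors("entry_3", 156.0, 0, 1, 3.6),    # too lipophilic
]

lead_like = [d.name for d in database if passes_rule_of_three(d)]
print(lead_like)
```

In a real application the descriptors would be calculated from the structures with a cheminformatics toolkit rather than entered by hand.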
[Reaction scheme for Fig. 11.3: chloromethylation of the resin (ClCH2OCH3 + SnCl4, or ZnCl2 + HCHO); attachment of Boc-NHCHR1COO−; Boc removal with HCl/CH3COOH; coupling with Boc-NHCHR2COOH (DCCI/DMF); cleavage with strong acid (HBr/CF3COOH).]
Fig. 11.3 The Merrifield peptide synthesis is assembled on a polymeric resin that is
functionalized in an appropriate way. The first N-terminal-protected amino acid is coupled to
the chloromethyl group (Boc = tert-butoxycarbonyl protecting group). Then the amino group
is released, activated with dicyclohexylcarbodiimide (DCCI), and coupled with a second amino
acid. The N terminus of the resulting dipeptide can be deprotected and elongated. It can also be
cleaved from the resin under strongly acidic conditions as a peptide.
[Scheme for Fig. 11.4: three rounds of split-and-combine synthesis with building blocks A, B, and C, giving 3, 9, and then 27 products.]
Fig. 11.4 The construction of a compound library according to the split-and-combine technique
starts with a certain amount of resin beads. These are evenly distributed among n reaction vessels.
Only three are considered here for the sake of simplicity. In the first flask reagent A (e.g., amino
acid A) is coupled to the resin. Reagents B and C are analogously added to flask 2 and 3. In the next
step a dipeptide is constructed. To solve the problem of different reaction rates between the
different amino acids A, B, and C, only one soluble reaction partner is added in excess to the
mixture of solid-phase-bound starting materials. After the first reaction step, the resin, which is
now loaded with an amino acid, is combined and mixed. It is again distributed among three (or more)
reaction flasks, and the next reaction is carried out. In the case of a peptide synthesis, amino acid A is
added to flask 1, B to flask 2, and C to flask 3. The resin is combined and mixed thoroughly. By
now all nine possible dipeptides are on the beads. After separating the beads again, the third
step follows. If the peptide chain is to be extended by another amino acid, amino acid A is
added to flask 1, B to flask 2, and C to flask 3. Now all 27 imaginable tripeptide sequences are on
the resin after three parallel reaction steps. A single, clearly identifiable compound is found on each resin
bead. The library can be tested directly on the polymer or it can be tested in solution after cleavage
from the support.
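The split-and-combine procedure can be mimicked on the computer. The following is a minimal sketch under simplifying assumptions: every coupling is taken to be complete, the beads are split into exactly equal portions, and the function name and the number of beads are chosen freely for illustration.

```python
import random
from collections import Counter


def split_and_combine(building_blocks, rounds, beads_per_vessel, seed=0):
    """Simulate split-and-combine synthesis; each bead is the string grown on it."""
    rng = random.Random(seed)
    n = len(building_blocks)
    beads = [""] * (n * beads_per_vessel)          # unloaded resin beads
    for _ in range(rounds):
        rng.shuffle(beads)                         # combine and mix thoroughly
        portions = [beads[i::n] for i in range(n)] # distribute evenly among n vessels
        # Each vessel receives exactly one building block, so every bead
        # in it is extended by that block only.
        beads = [chain + block
                 for block, portion in zip(building_blocks, portions)
                 for chain in portion]
    return beads


beads = split_and_combine(["A", "B", "C"], rounds=3, beads_per_vessel=300)
counts = Counter(beads)
print(len(counts), "different sequences on", len(beads), "beads")
```

Each bead carries exactly one sequence because it only ever sees one reagent per step, while the pool as a whole contains all combinations. With many more beads than sequences, every one of the 27 tripeptides is practically certain to be represented, although not in exactly equal amounts.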
The libraries that were generated on the solid support are biologically tested. This
can be done directly on the polymer-immobilized compounds. As with testing
the libraries from bacteriophages, there is a danger that the support material
influences the test, for example, through steric hindrance or unspecific interactions.
Furthermore, it is important that the test protein is in a soluble form. Membrane-
bound receptors therefore elude testing. Alternatively, the compound library can be
cleaved from the resin. For this, the coupling between the resin and the library
component must be made by using an appropriate “linker,” which allows the library
to be selectively released. This linker is cleaved off, for instance, at a low pH or
photochemically with UV light. It must not interfere, however, with the synthetic
assembly of the library, and must not be cleaved during the synthesis.
The final cleavage from the resin must not destroy the products. Testing the
cleaved products certainly correlates better with physiological conditions. Spread-
ing the cleaved compounds onto a large area or embedding them in a gel
achieves a spatial separation so that compounds interacting with the test protein
occur in high local concentrations. This way, binding to insoluble proteins (e.g.,
membrane-bound receptors) can be tested. However, the advantage of
the mechanical manipulation of a polymer-bound compound library is lost
upon release.
If biological activity is found in the test, it remains to be determined which
compound from the library is responsible. If the library is precisely defined through
the synthetic program, then it is known which compounds were tested. Active
components are narrowed down by deconvolution and the resynthesis of partial
libraries. Only one defined compound is produced on each resin bead with the one-
bead-one-compound technique. It is not known, however, which one. It is only
after activity is found that the compound characterization is attempted. There are
many ways to do this. The relevant resin beads can be separated and the compounds
on them analyzed directly. If the library consists of peptides or oligonucleotides,
peptides are sequenced by Edman degradation (which works even with 0.1 picomole!),
whereas oligonucleotides are amplified and enriched by the polymerase chain
reaction (▶ Sect. 12.1).
Even more elaborate techniques are also used. During synthesis, the library is
allowed to “grow” on multiple different linkers. The single library compounds can
be released from these linkers under different conditions (e.g., different pH values,
or photochemically at different wavelengths). First the compound is cleaved from
the first linker to carry out testing. The cleavage from the second linker is performed
after mechanical separation of the desired resin beads. This method serves to
practically “label” the resin beads. The technique is therefore an elegant variation
on library testing in a detached state. The different linker-bound compounds on the
resin bead need not be identical. Therefore a test library of peptides can be linked to
the resin bead by oligonucleotides, which are used as labels. Halogenated aromatics
were also proposed as labels because they can be easily identified by mass
spectrometry even in the smallest quantities. The labels can even be encoded based
on their sequence or the number of monomer building blocks with an appropriate
binary code.
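How a binary code can record the synthetic history of a bead is sketched below. This is only one conceivable encoding scheme, assumed here for illustration: every bit corresponds to one distinguishable tag molecule, and a tag is attached to the bead whenever the corresponding bit is set. The tag numbering and the function names are hypothetical.

```python
def bits_per_step(n_blocks):
    """Number of tags needed per step to encode the codes 1..n_blocks."""
    return n_blocks.bit_length()


def encode(history, n_blocks):
    """Return the set of tag numbers attached to a bead with the given history.

    history: list of building-block indices (0-based), one per synthesis step.
    """
    width = bits_per_step(n_blocks)
    tags = set()
    for step, block_index in enumerate(history):
        code = block_index + 1            # code 0 is reserved for "no reaction"
        for bit in range(width):
            if (code >> bit) & 1:
                tags.add(step * width + bit)
    return tags


def decode(tags, n_steps, n_blocks):
    """Reconstruct the reaction history from the tags read off a bead."""
    width = bits_per_step(n_blocks)
    history = []
    for step in range(n_steps):
        code = sum(1 << bit for bit in range(width) if step * width + bit in tags)
        history.append(code - 1)
    return history


history = [2, 0, 1]                       # e.g., blocks C, A, B with 3 building blocks
tags = encode(history, n_blocks=3)
print(sorted(tags), decode(tags, n_steps=3, n_blocks=3))
```

With this scheme the number of different tag molecules grows only logarithmically with the number of building blocks per step, which is the attraction of a binary code.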
The techniques of labeling the resin bead require considerable synthetic effort
for the library preparation. The transformation steps for the assembly of the library
and the labeling must not disturb one another. Even the final reading of the labels
can require multiple working steps. The alternative route using the programmed
synthesis concept with deconvolution and resynthesis also means increased effort
for the repeated construction of the library components. However, the same
working steps are always used; they are merely carried out with different reagent
compositions. With respect to automation, this is certainly an advantage.
Another aspect speaks for the last above-mentioned concept. In the meantime
a large number of organic reactions have been transferred to solid-phase synthesis.
For each solid-phase synthesis, a special strategy, a specific linker, and a suitable
cleavage method must be developed. Each single synthetic step must be compatible
with the protecting groups, the polymer support, and the linker. However, this
opens up a far greater dimension of chemical diversity than is possible with
peptides and nucleotides.
Careful design of the target molecules to be synthesized is indispensable for
combinatorial chemistry. Limitations arise from the accessibility, that is, the devel-
opment of an appropriate synthetic scheme, and furthermore from the desired
structural diversity of the resulting library. Computer methods help to find
a “reasonable” selection of synthetic components. How is the optimal composition
obtained? This highly depends on what the constructed library should be tested for.
A library can be developed for general-purpose screening. It should then be
“optimally diverse.” Its composition is outlined according to generally accepted
criteria such as molecular weight, total lipophilicity, an even distribution of H-bond
donors and acceptors, as well as the size of the hydrophobic surface area.
These characteristics are important for the similarity or diversity of active com-
pounds in the library (▶ Chap. 17, “Pharmacophore Hypotheses and Molecular
Comparisons”). The desired library diversity can also be considered in relation to
the biological properties of a receptor (target oriented). Criteria that make
a molecule “similar” or “diverse” for one receptor are not necessarily the same
for another receptor (▶ Sect. 17.7). In view of the broad palette of proteins for
which combinatorial libraries should be tested, there is no absolute measure of
diversity. Therefore, combinatorial chemistry plays an important role in
establishing structure–activity relationships for a target protein. For this purpose,
the substituents at different positions of a suitable lead structure, once discovered,
must be varied very quickly. The design and synthesis of such targeted compound
libraries opens the gateway quickly.
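A “reasonable” diverse selection of building blocks can be illustrated with a simple picking procedure. The sketch below uses a greedy MaxMin selection based on Tanimoto similarity of feature sets; this is one common approach and not necessarily the method meant in the text. The reagent names and their feature sets are invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two feature sets (1.0 means identical)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def maxmin_pick(fingerprints, k, first=0):
    """Greedy MaxMin selection: always add the candidate that is farthest
    from its nearest already-selected neighbor."""
    names = list(fingerprints)
    picked = [names[first]]
    while len(picked) < min(k, len(names)):
        candidates = [n for n in names if n not in picked]
        best = max(
            candidates,
            key=lambda n: min(1.0 - tanimoto(fingerprints[n], fingerprints[p])
                              for p in picked),
        )
        picked.append(best)
    return picked


# Hypothetical reagents described by sets of structural feature numbers.
reagents = {
    "amine_1": {1, 2, 3},
    "amine_2": {1, 2, 4},
    "amine_3": {7, 8, 9},
    "amine_4": {1, 2, 3, 4},
}

selection = maxmin_pick(reagents, k=3)
print(selection)
```

For a target-oriented library, the feature sets and the similarity measure would have to reflect the properties that matter for the receptor in question, in line with the remark above that there is no absolute measure of diversity.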
[Scheme for Fig. 11.5: general peptoid structure with end group X and side chains R1, R2, and R3; the six side-chain permutations ADO, AOD, ODA, OAD, DAO, and DOA; structures of the X, O, A, and D groups.]
Fig. 11.5 Peptoids are oligoglycines that are substituted at nitrogen. A library of di- and
tripeptoids was constructed according to the split-and-combine technique. Three X groups
were added to the N terminus. Three groups O with a hydroxy function, 4 groups A with an
aromatic ring, and 17 groups D with diverse groups were used as nitrogen side chains. Eighteen
mixtures (6 permutations of A, O, and D with 3 end groups) gave ca. 5,000 di- and tripeptoids.
The H-ODA-NH2 library showed activity on the α-adrenergic receptor. First, the hydroxy
groups O were deconvoluted. The compounds with p-hydroxyphenethyl groups were the most
active ones. In the next synthesis round, 17 partial libraries were composed with this O group
held constant, and defined groups were used from the diverse D group. Compounds with
a diphenyl or diphenyl ether group were particularly active. With these groups in the
D position, the work was continued. The division of the aromatic side chains A in the last
position led to eight individual compounds.
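The stepwise deconvolution described in this caption can be sketched as a small program. The assay model is a deliberate simplification and an assumption of this example: the signal of a mixture is taken to be that of its most potent member. The group labels and potency values are hypothetical; only the group sizes (3, 17, and 4) follow the caption.

```python
from itertools import product


def mixture_activity(mixture, potency):
    """Hypothetical assay: the mixture signal equals that of its best member."""
    return max(potency.get(compound, 0.0) for compound in mixture)


def deconvolute(groups, potency):
    """Fix one position per round, keeping the most active partial library."""
    positions = list(groups)
    fixed = {}
    assays = 0
    for position in positions:
        best_choice, best_signal = None, -1.0
        for choice in groups[position]:
            pools = []
            for p in positions:
                if p in fixed:
                    pools.append([fixed[p]])      # already deconvoluted
                elif p == position:
                    pools.append([choice])        # position under investigation
                else:
                    pools.append(groups[p])       # still a mixture
            partial_library = list(product(*pools))
            signal = mixture_activity(partial_library, potency)
            assays += 1
            if signal > best_signal:
                best_choice, best_signal = choice, signal
        fixed[position] = best_choice
    return tuple(fixed[p] for p in positions), assays


groups = {
    "O": [f"O{i}" for i in range(1, 4)],     # 3 hydroxy-containing groups
    "D": [f"D{i}" for i in range(1, 18)],    # 17 diverse groups
    "A": [f"A{i}" for i in range(1, 5)],     # 4 aromatic groups
}
potency = {("O2", "D5", "A1"): 1.0, ("O1", "D3", "A2"): 0.3}   # invented values

best, assays = deconvolute(groups, potency)
print(best, "found with", assays, "partial libraries")
```

In this idealized case 3 + 17 + 4 = 24 partial libraries suffice instead of testing all 204 single compounds. In practice, many weakly active members of a mixture can add up to a signal and mislead the selection, which the simple assay model above ignores.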
[Chemical structures of compounds 11.7, 11.8, and 11.9.]
Fig. 11.6 The derivative 11.7 is the most potent compound from the H-ODA-NH2 library with
a Ki = 5 nM on the α-adrenergic receptor. Testing on the opiate receptor gave compound 11.8 as the
candidate with highest affinity (Ki = 6 nM) from the H-ADO-NH2 library after deconvolution.
Met-enkephalin 11.9 is a potent opiate receptor ligand. The relationship between the p-hydroxyphenyl
group in 11.8 and the tyrosine side chain in 11.9, and between a phenyl portion of the diphenylmethane
group of 11.8 and the benzyl group of phenylalanine in 11.9, is obvious. Tyr and Phe are essential
for the activity of Met-enkephalin.
[Reaction scheme for Fig. 11.7: amino acid building blocks Aa (Gly, Ala, Leu, Phe), aromatic aldehydes Ar-CHO, substituted alkenes, and thiol building blocks (CH2SAc, CH2CH2SAc, CH(Me)CH2SAc) are combined in steps a to d; structure of compound 11.10.]
Fig. 11.7 The amino acids Aa = Gly, Ala, Leu, or Phe are coupled to the support resin (a). Next,
they are transformed to imines with four different aromatic aldehydes (Ar-CHO; b), which react
with alkenes under 1,3-dipolar cycloaddition conditions to give pyrrolidines (c). In the last step the
free NH group of the heterocycle is acylated with different thiol-containing acid chlorides (thio-COCl; d).
The library, which is built up with the split-and-combine technique, is cleaved from the polymer with release of
an acid function. Its ability to inhibit the angiotensin-converting enzyme was tested. By
resynthesis and renewed testing, the library was deconvoluted to the active compound. In doing
so, compound 11.10 was identified as a high-affinity inhibitor.
One strategy that avoids deconvolution of a library but still uses the advantages of
combinatorial chemistry is parallel synthesis in spatially separated reaction vessels. It remains
clear along the entire reaction sequence which reactant and product is in each
vessel. A laborious deconvolution is omitted. At first this strategy seems to be
impractical. How should a thousand reaction components be reasonably
transformed in a thousand reaction flasks? For this purpose, the reaction flasks
should not be thought of in the classical organic chemistry sense. Rather,
miniaturized reaction “automats” are developed in which all reaction steps are
carried out in parallel. Alternatively, methods have been developed in which the
resin beads are filled into many small reaction capsules (e.g., so-called teabags or
“Kans™”). These are permeable to the solution phase so that compounds can pass through, but the
beads are mechanically enclosed. Each capsule is fitted with a label that can be read
with a radio transmitter. All of the capsules are then placed in a classical round-
bottomed flask and the usual chemistry is carried out. The capsules can be mechan-
ically separated and brought into contact with different reagents. Which reaction
sequence is performed on which capsule is followed by the registration system with
the radio transmitter. In this way, exactly one defined compound can be prepared per
reaction capsule by combinatorial principles, practically as in parallel synthesis. The
single compounds are then available for testing.
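The registration of labeled capsules can be pictured as simple bookkeeping. The following sketch assumes that one capsule is assigned to each target compound in advance and that the capsules are sorted into shared flasks before every step according to their label; the function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def directed_sorting(steps):
    """Track which reagents each labeled capsule has seen.

    steps: one list of reagents per synthesis step.
    Returns the registry {capsule_id: reaction history} and the number
    of reaction batches that had to be run.
    """
    plan = list(product(*steps))                  # one capsule per target compound
    registry = {capsule_id: [] for capsule_id in range(len(plan))}
    batches = 0
    for step_index in range(len(steps)):
        flasks = defaultdict(list)
        for capsule_id, target in enumerate(plan):
            # The label is read and the capsule is sorted into the flask
            # with the reagent planned for it in this step.
            flasks[target[step_index]].append(capsule_id)
        for reagent, capsule_ids in flasks.items():
            batches += 1                          # all capsules in a flask react together
            for capsule_id in capsule_ids:
                registry[capsule_id].append(reagent)
    return registry, batches


registry, batches = directed_sorting([["A", "B", "C"]] * 3)
print(len(registry), "capsules,", batches, "reaction batches")
```

For three steps with three reagents each, 27 individually known compounds are obtained from only nine reaction batches, whereas fully separate parallel synthesis would require 27 vessels in every step.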
Synthesis on a solid support material has disadvantages compared to chemistry
in the solution phase. Usually transformations are slower and the analysis to follow
the reactions is considerably more laborious to carry out. Coupling to the solid
support requires a suitable linker. Such an anchoring group should be removed from
the library before testing as tracelessly as possible. Above all else, upon removal of
the linker (“traceless linkers”) no functional groups should remain in the library that
might unintentionally be part of the pharmacophore. The chemistry to attach
and remove the linker must be compatible with all of the other reactions in the
synthesis of the solid-supported library. This can lead to limitations in the usable
chemistry. In preparative chemistry, molecules are preferably constructed by using
a convergent synthesis strategy. For this, an approach is developed in which the
components of the final product are prepared in separate steps, each in parallel. In
the subsequent reaction steps, the previously prepared components are brought
together and coupled with one another to produce the final product. Such a strategy
is more efficient and leads to a higher yield than a linear synthetic route.
A convergent strategy, however, cannot be carried out by sequential construction
on a resin. Therefore, the tables have been turned for some syntheses. It is not the
libraries that are bound to the solid support, but rather the reagents with which they
are treated. The advantages of carrying out reactions on a solid support are retained:
good mechanical separation of reaction components, the simple use of large
excesses of reagent, and automated reactions. An additional
advantage is that it is now possible to carry out convergent syntheses. Even toxic reagents
can be used as their separation is ensured by their firm adhesion to a solid support.
The usual analytical methods that are typically applied for the solution phase can
also be used.
Some reactions, especially ring-closure reactions or condensations, are in
competition with intermolecular transformations. To avoid these, highly diluted
solution conditions are used. If a solid-supported reactant is used, the local con-
centration of the reactant will be reduced as it is fixed to the solid support and
spatially separated. Reactions that occur over a trapped reaction product can be
simplified if the trapping reagent is coupled to a solid support. Mechanical filtering
is enough to separate the trapped components. Similarly, the products can be
separated and purified by trapping them on a solid support. Acids and bases can be
separated for purification by treatment with an immobilized amine or a sulfonic
acid. Meanwhile, the adhesion of metal-complexing groups or hydrophobic
adhesion groups is also used for the purification of combinatorially produced
compound libraries.
How will combinatorial chemistry develop further? The miniaturization
of reaction vessels and synthetic automats seems to be a seminal perspective. The
“lab-on-a-chip” concept is already intensively used for bioanalytical methods. Small
reaction volumes, integrated separation columns, miniaturized valves, and pumps
that are controlled by piezo elements are integrated on small chip cards. We can only
wait and see whether such serial reaction automats are the laboratories of the future.
Could a protein simply produce its own best inhibitor itself? With the ideal
geometry, it should be able to form the optimal interactions directly in the binding
pocket of the target enzyme. Which chemical reactions might be best suited for
such a concept? It would have to be a reaction that can be conducted in aqueous
medium, is reliably enthalpically driven, is fast, and gives complete turnover.
Such reactions, termed “click chemistry,” have been investigated in detail in the group of
Barry Sharpless in La Jolla, California. Cycloadditions of
unsaturated compounds (1,3-dipolar cycloadditions, Diels–Alder reactions); nucleophilic
substitutions, particularly ring-opening reactions; non-aldol-like carbonyl reac-
tions; and additions to C─C multiple bonds fulfill these requirements. These
can be applied by using combinatorial principles. The 1,3-dipolar cycloaddition
(Huisgen Reaction) can be particularly well used to build five-membered triazole
and tetrazole heterocycles (Fig. 11.8). 1,4-Disubstituted 1,2,3-triazoles can be
regiospecifically produced by the reaction of an azide and alkyne in the presence
of Cu(I) salts at room temperature. 1,5-Disubstituted triazoles are formed when
copper ions are excluded or other ions such as ruthenium are added. The reaction
runs in a broad pH range between 4 and 12. The reaction type can be extended to
tetrazoles. For this, nitriles are needed as dipolarophile reaction partners in the
presence of zinc ions.
The research group of Jean-Marie Lehn in Strasbourg chose another way. They
developed “dynamic combinatorial chemistry,” which relies on the spontaneous
construction of molecules from suitable starting materials by reversible reactions
(Fig. 11.9). All imaginable combinatorial products form from a mixture of different
building blocks. A dynamic exchange equilibrium is established between them. The
target receptor (e.g., a protein) is added to such an equilibrium system. This way
the mixture components with the best protein-binding characteristic have an advan-
tage, as the protein captures the best binders and shifts the equilibrium. It leads to
a self-perpetuating choice of the ligands that fit best into the binding pocket. In this
way the added protein practically seeks its own best inhibitor.
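The shift of the equilibrium by the protein can be made plausible with a strongly simplified model. The sketch assumes that all library members are equally stable when free (so that their free concentrations are equal at equilibrium), that each binds the protein 1:1, and that the dissociation constants are known; all numbers are invented for illustration.

```python
def library_distribution(kd, library_total, protein_total, iterations=200):
    """Equilibrium shares of the library members (free plus protein-bound).

    kd: dissociation constants of the members (mol/L).
    Assumes interconverting members of equal stability and 1:1 binding.
    """
    n = len(kd)

    def total(free):
        # Mass balance: n free members plus everything bound to the protein.
        s = sum(free / k for k in kd)
        return n * free + protein_total * s / (1.0 + s)

    low, high = 0.0, library_total / n
    for _ in range(iterations):                 # bisection on the free concentration
        mid = (low + high) / 2.0
        if total(mid) > library_total:
            high = mid
        else:
            low = mid
    free = (low + high) / 2.0
    s = sum(free / k for k in kd)
    protein_free = protein_total / (1.0 + s)
    amounts = [free + free * protein_free / k for k in kd]
    return [a / sum(amounts) for a in amounts]


kd = [1e-9, 1e-5, 1e-5]                         # one strong and two weak binders
without_protein = library_distribution(kd, 30e-6, 0.0)
with_protein = library_distribution(kd, 30e-6, 10e-6)
print([round(x, 3) for x in without_protein])
print([round(x, 3) for x in with_protein])
```

Without protein, the three members are equally populated. After the protein is added, the share of the strongest binder rises at the expense of the others, because the bound fraction is withdrawn from the exchange equilibrium and is replenished from the remaining members.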
[Reaction scheme for Fig. 11.8: azide + alkyne giving the 1,4-triazole and the 1,5-triazole; azide + nitrile giving the 1,5-tetrazole.]
Fig. 11.8 The 1,3-dipolar cycloaddition (Huisgen reaction) is a typical click chemistry reaction
and leads to five-membered triazole and tetrazole heterocycles. In the presence of Cu(I) salts,
azides and alkynes react regiospecifically at room temperature to form 1,4-disubstituted
1,2,3-triazoles; in the absence of copper but with ruthenium ions, 1,5-disubstituted products are formed.
If a nitrile is used instead of the alkyne component, and the reaction is catalyzed by zinc ions, 1,5-
disubstituted tetrazoles are obtained as products.
[Scheme for Fig. 11.9. Panel labels: library components; library generation; dynamic exchange of library components; selection through the receptor; selection of the best binder.]
Fig. 11.9 A mixture of different library components is furnished that interact under equilibrium
conditions in dynamic combinatorial chemistry. Numerous products can form in the equilibria.
They represent potential “keys” that can fit in the “lock” of the target protein. The added receptor
protein binds to the best-fitting ligands from the compound mixture and shifts the equilibrium in
the direction of increased formation of this product. It is then removed from the equilibrium by the
protein binding (according to O. Ramström and J.-M. Lehn).
success has been achieved with HIV protease (▶ Sect. 24.3) and ACh-binding
protein (▶ Sect. 30.5). Goal-oriented combinatorial libraries are used as starting
materials for these reactions. Time will tell what significance this in situ inhibitor
synthesis will gain for practical drug research.
11.13 Synopsis
[Structure image for Fig. 11.10. Labels: phenylphenanthridine; Ser203.]
Fig. 11.10 The library is produced from alkynes bearing a tacrine side chain suited to AChE
and from phenylphenanthridine-substituted azides custom-made for AChE. In the presence of
acetylcholinesterase (AChE) the products 11.11 (green) and 11.12 (gray) are formed, which proved to
be potent enzyme inhibitors. They differ in the topology on the five-membered ring. Crystal
structure determinations were accomplished with both inhibitors. The surface around the
protein is shown with the bound ligand 11.12. Both ligands occupy the tube-shaped binding
pocket of AChE. Compound 11.11 binds via a water molecule (red sphere) to the hydroxy function
of Ser203.
Bibliography
General Literature
Madden D, Krchnak V, Lebl M (1994) Synthetic combinatorial libraries: views on techniques and
their application. Persp Drug Discov Design 2:269–285
Moos WH, Green GD, Pavia MR (1993) Recent advances in the generation of molecular diversity.
Annu Rep Med Chem 28:315–324
Nicolaou KC, Hanko R, Hartwig W (2002) Handbook of combinatorial chemistry. Drugs, cata-
lysts, materials. Wiley-VCH, Weinheim
Ramström O, Lehn J-M (2002) Drug discovery by dynamic combinatorial libraries. Nat Rev Drug
Discov 1:27–36
Seneci P (2000) Solid-phase synthesis and combinatorial technologies. Wiley-Interscience,
New York
Special Literature
Bourne Y, Kolb HC, Radic Z, Sharpless KB, Taylor P, Marchot P (2004) Freeze-frame inhibitor
captures acetylcholinesterase in a unique conformation. Proc Natl Acad Sci USA 101:1449–1454
Carell T, Wintner EA, Sutherland AJ, Rebek J, Dunayevskiy YM, Vouros P (1995) New promise
in combinatorial chemistry: synthesis, characterization, and screening of small-molecule
libraries in solution. Chem Biol 2:171–183
Dooley CT, Chung NN, Schiller PW, Houghten RA (1993) Acetalins: opioid receptor antagonists
determined through the use of synthetic peptide combinatorial libraries. Proc Natl Acad Sci
USA 90:10811–10815
Fink T, Reymond J-L (2007) Virtual exploration of the chemical universe up to 11 atoms of C, N,
O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new
ring systems, stereochemistry, physicochemical properties, compound classes, and drug dis-
covery. J Chem Inf Model 47:342–353
Geysen HM, Meloen R, Barteling S (1984) Use of peptide synthesis to probe viral antigens for
epitopes to a resolution of a single amino acid. Proc Natl Acad Sci USA 81:3998–4002
Murphy MM, Schullek JR, Gordon EM, Gallop MA (1995) Combinatorial organic synthesis of
highly functionalized pyrrolidines: identification of a potent angiotensin converting enzyme
inhibitor from a mercaptoacyl proline library. J Am Chem Soc 117:7029–7030
Zuckermann RN, Martin EJ, Spellmeyer DC et al (1994) Discovery of nanomolar ligands for
7-transmembrane G-protein-coupled receptors from a diverse N-(substituted)glycine peptoid
library. J Med Chem 37:2678–2685
12 Gene Technology in Drug Research
Engineers and writers have predicted many developments in science and technol-
ogy. In addition to other sophisticated machines, Leonardo da Vinci described the
principle of the helicopter. In the early 1820s, Charles Babbage designed an
automatic calculator long ahead of its time. Over 160 years later, the mechanical
precursor of a programmable computer was in fact built, and it worked! Jules Verne
described submarines and a journey to the moon, and Hans Dominik described
obtaining energy by splitting the atom. All of these visions have become reality.
Only a single application was preconceived for gene technology, the most seminal
invention of our time: the cloning of two genetically identical individuals in Aldous
Huxley’s Brave New World. It remains a hope that researchers will respect ethical
boundaries, and despite the feasibility, never actually use Huxley’s idea.
With the methods of gene technology it is possible to bring new genes into the cell,
multiply them, and exchange or remove them. If they are removed or changed, the cell
can no longer produce the original protein derived from that gene. By introducing a new
gene and using a clever choice of method, the cell manufactures a foreign product,
either a purposefully modified protein, or an entirely new one. For many diseases, the
molecular cause is known to be the absence of a protein, or a genetically caused
mutation in a protein. Only a few generally known examples are mentioned here:
• Diabetes as a result of insulin deficiency,
• Particular hereditary forms of cancer (e.g., familial colon cancer, malignant
melanoma),
• Huntington's chorea, a chronic form of brain atrophy,
• Sickle cell anemia, a genetic disease producing malformed red blood cells
(Sect. 12.14), and
• Bleeding disorders that are caused by the absence of particular coagulation
factors (see Sect. 12.14).
The possibility of purposefully producing arbitrary proteins has yielded the
following main applications of gene technology.
• The identification of genes and proteins that could play a role in the treatment of
a disease,
• The development of animal models to test a therapeutic principle,
The foundations of gene technology were first established in the middle of the
twentieth century. The starting shot was made in 1953. Back then, James Watson
and Francis Crick elucidated the three-dimensional structure of the hereditary
substance of all living things, deoxyribonucleic acid (DNA). The structure immediately
provided indications about the mechanism of hereditary transfer
and about the genetic code for the biosynthesis of proteins. A few years later,
Werner Arber discovered restriction enzymes, which attack a very specific
position on the double helix and cleave the DNA sequence-specifically. What was
initially seen as a curiosity proved to be an exceedingly important discovery for
gene technology. It is possible to selectively cleave DNA with these enzymes and to
introduce new fragments. Next, the merging of new information with the original
DNA, the recombination of the genetic constitution, is accomplished with ligases
from special viruses called bacteriophages. The techniques for DNA sequencing
have also made decisive progress. Soon afterward, the amino acid sequence of
a protein was no longer directly determined, but rather deduced from the analysis of
the corresponding DNA. Today, sequencing takes a detour via the cDNA,
which is complementary to the RNA (Sect. 12.6).
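The sequence-specific cleavage by a restriction enzyme can be illustrated on a DNA string. The sketch uses EcoRI as an example, an enzyme not named in the text, which recognizes the sequence GAATTC and cuts after the G. Only one strand is modeled, and the example sequence is invented.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut one DNA strand at every occurrence of a recognition site.

    cut_offset is the position of the cut within the site
    (1 for EcoRI, which cleaves between G and AATTC).
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        fragments.append(sequence[start:position + cut_offset])
        start = position + cut_offset
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


fragments = digest("TTGAATTCAAGGAATTCTT")
print(fragments)
```

Because the cut lies asymmetrically within the palindromic recognition site, the real double-stranded fragments carry short single-stranded overhangs. Fragments produced with the same enzyme therefore fit together, which is what makes the later insertion of new DNA possible.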
In 1973 Stanley Cohen and Herbert Boyer managed to recombine the genome of
a bacterium for the first time (Fig. 12.1). Then events followed in rapid succession:
two years later the bacterial strain Escherichia coli K12, which is still used today,
was developed. A part of its genetic constitution is missing so that it is only viable
under laboratory conditions. This bacterium can be genetically manipulated at will
without the worry that it could be harmful. The British scientists H. Williams-Smith
and E. S. Anderson independently carried out self-experiments
in which they orally ingested Escherichia coli K12. They proved that these
bacteria only survive in the GI tract for a short period of time, and that the K12
gene, which confers antibiotic resistance for the selection of transformed cells,
cannot be transferred to the normal Escherichia coli that is found in the intestinal
flora. Experts discussed the possible dangers of gene experiments at a conference in
Asilomar, California, and defined different risk and safety classes. In 1976 the
[Scheme for Fig. 12.1. Labels: vector; DNA; recombinant DNA.]
Fig. 12.1 The principle of gene technological recombination of hereditary information. In addition
to their “chromosome,” bacteria often contain further genetic material in the form of
ring-shaped plasmids; these are used in gene technology as vectors to introduce foreign genes. Plasmids
are removed from the cell and sequence-specifically cut with so-called restriction enzymes,
which come from bacteria. The target DNA that carries the desired genes, which were typically
also treated with the same restriction enzyme, is bound in vitro to the overlapping single-stranded
DNA ends. The DNA ends are coupled with the enzyme DNA ligase, and the modified, recom-
binant plasmid is brought into the bacteria cell. Plasmid vectors that are used in gene technology
carry in addition to the DNA segment that is necessary for replication, additional information that
allows the recognition and selection of the transformed cells (usually an antibiotic-resistance
gene). In the presence of the selecting agent, only plasmid-containing cells grow.
company Genentech was established. Its founder, Herbert Boyer had to borrow
US $500 as start-up capital! When in 1980 the company was initially traded on the
stock market, within a few minutes he became a millionaire because of the value of
his stock. As early as 1982 Genentech introduced the first medication to the market,
that was manufactured by using gene technology human insulin (Humulin ®).
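The cut-and-ligate principle of Fig. 12.1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: DNA is treated as a single top-strand string, the restriction enzyme is EcoRI (recognition sequence GAATTC, cut after the G), and the vector and donor sequences are invented toy strings, not real plasmids. The function names digest and ligate are hypothetical helpers.

```python
# Toy model of restriction digestion and ligation (top strand only).
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1         # EcoRI cuts after the G: G^AATTC

def digest(seq, site=ECORI_SITE, offset=CUT_OFFSET):
    """Cut a linear sequence at every occurrence of the recognition site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

def ligate(left, insert, right):
    """Join fragments; in the cell-free reaction this is the job of DNA ligase."""
    return left + insert + right

vector = "TTACGGAATTCCGATA"           # assumed toy vector with one EcoRI site
donor = "CCGAATTCATGGCTTAAGAATTCGG"   # assumed toy donor, gene flanked by two sites

left, right = digest(vector)          # opened vector
insert = digest(donor)[1]             # middle fragment carries the "gene"
recombinant = ligate(left, insert, right)
print(recombinant)
```

Because vector and insert were cut with the same enzyme, both junctions of the recombinant molecule regenerate the GAATTC site, which mirrors the matching single-stranded ends described in the figure legend.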
In 1983 Kary Mullis made a decisive contribution to gene technology when
he developed the polymerase chain reaction (PCR) while he was working at the
Californian company Cetus, which was founded in 1971. Heating melts double-
stranded DNA into its single strands. The four DNA nucleotides are then added, as
are two short single-stranded DNA pieces, the so-called primers, that are
complementary to the regions at the two ends of the DNA segment to be amplified.
A polymerase can then be used to synthesize new DNA in a test tube: starting
from the primers, a new double strand is formed (Fig. 12.2). A heat-stable DNA
polymerase from the bacterium Thermus aquaticus, which is native to the hot springs
in Yellowstone National Park, is used for the DNA synthesis. Each repetition of this
step doubles the DNA amount. Within a few hours, billions and trillions of DNA molecules
can be manufactured from a single starting molecule. This amount is enough for
a sequencing of the relevant DNA segment.
236 12 Gene Technology in Drug Research
[Fig. 12.2 labels: double-stranded DNA molecule; heating at 95 °C; two single strands; + two primers; + excess nucleotides; repeating the PCR cycle]
Fig. 12.2 Polymerase chain reaction allows unlimited identical copies of a DNA molecule to be
manufactured. For this the DNA is heated to separate the double-stranded DNA into complemen-
tary single strands. Synthetic oligonucleotides with approximately 20 bases, so-called primers,
which are complementary to these DNA strands hybridize with the corresponding strand. Each
primer must bind to one end of the two DNA strands. The primers set the boundaries of the
amplified DNA. Furthermore an excess of primer must be used because in each cycle one primer
pair is needed for each DNA double strand. They are not explicitly produced in later cycles. The
primers are necessary to effectuate the new synthesis of DNA in the presence of the DNA
polymerase and an excess of the four different nucleotides. This occurs in the reverse direction
(dashed arrow) because of opposite course of the DNA strands and the specificity of the
polymerase. The newly synthesized DNA segment can be a few hundreds to thousands of base
pairs long. The result is two identical double-stranded DNA molecules. After heating, single
strands are obtained and the above-described procedure is repeated. Because the DNA polymerase
is heat stable, it does not need to be repeatedly added. Each repeat of the above-described steps
leads to a doubling of the DNA molecule. Its number grows exponentially. Ten cycles lead to
about 1,000 DNA molecules, 20 to a million, and 30 already to a billion. In this way, a single DNA
molecule can be multiplied into a quantity that is biochemically analyzable.
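The exponential growth quoted in the legend follows directly from the doubling per cycle. The short sketch below reproduces the numbers; the function name pcr_copies and the efficiency parameter are assumptions added for illustration (an efficiency of 1.0 corresponds to the ideal doubling described in the text, while real reactions fall somewhat short of it).

```python
def pcr_copies(cycles, start_copies=1, efficiency=1.0):
    """Number of DNA molecules after a given number of PCR cycles.

    efficiency=1.0 is the idealized case in which every molecule is
    copied in every cycle (perfect doubling); this parameter is an
    illustrative assumption, not part of the original description.
    """
    return start_copies * (1 + efficiency) ** cycles

for n in (10, 20, 30):
    print(f"{n} cycles: {pcr_copies(n):,.0f} molecules")
# 10 cycles: 1,024 molecules
# 20 cycles: 1,048,576 molecules
# 30 cycles: 1,073,741,824 molecules
```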
PCR methods have diverse applications. The entire genetic information of an
individual can be derived from a single DNA molecule. In medical diagnostics, this
serves to detect genetic disorders, cancer, infectious diseases, and risk factors.
PCR methods are also used to establish a genetic fingerprint in paternity tests and in
forensic science.
12.2 Gene Technology: A Key Technology in Drug Design
New genetic information can be brought not only into bacterial cells, but also
into yeasts, virus-infected insect cells, and even mammalian cells. As a first
approximation, though, the more complex the organism is, from
bacteria all the way to mammalian cells, the more difficult it is to produce proteins
in these cells. On the other hand, insect and mammalian cells have the advantage
that they produce not only small proteins but also more complex ones (e.g.,
glycosylated proteins) in a functional form. In many cases, one must
therefore rely on such organisms.
The 1970s and 1980s were the grand age of receptor-binding tests with membrane
preparations. Radioactively labeled ligands were used to determine the specific
binding of new substances. The most important receptors for hormones and
neurotransmitters were known and in some cases, the difference between pre-
and postsynaptic receptors as well. The different subtypes and their amino acid
sequences were not known. Correspondingly, the results of the investigations
were inaccurate.
Gene-technology methods allow the preparation of homogeneous recombinant
proteins in practically unlimited quantities. They play an important role at the very
first step of drug design: the identification of a target protein. Progress with the
methodology led to the discovery of new receptors with partially unknown function
or specificity. The next step is the testing of the therapeutic concept
on genetically altered animals. Another important contribution is the preparation
of proteins for molecular test systems and the isolation of adequate material for
the elucidation of the 3D protein structure (▶ Chap. 13, “Experimental Methods of
Structure Determination”). With perhaps the exception of a very few proteins that
can be isolated from blood or other natural sources, the production of large
quantities of protein is dependent on gene technology. Nowadays the purification
of proteins from animals or human blood is done rather reluctantly. The risk of
transmitting viruses or infections is deemed to be too high.
Gene technology offers the possibility to selectively produce structural variants
of proteins. The generation of point mutations (site-directed mutagenesis) allows
particular properties in proteins to be improved, and the binding and catalytic
properties of enzymes to be purposefully changed. Membrane-bound receptors
can be probed position by position to establish which amino acids are responsible
for the maintenance of the 3D structure, the adoption of a particular conformation,
or are of critical importance with respect to binding of a ligand. Three-dimensional
structural models of receptors can be generated in this way, or their relevance can
be appraised.
In many cases, it has also proven worthwhile to introduce point mutations that
change the surface properties of proteins and help to elucidate the 3D structure of
proteins. Sometimes the charge on individual amino acids must be changed for the
sake of the protein crystallization. In the case of proteins in which a part of their
sequence is anchored in the membrane, the membrane anchor, which would impede
crystallization, is removed before the experiment. With soluble receptors it has
proven worthwhile to remove individual domains, crystallize them, and determine
their structure. Of course such modified proteins must still fulfill their particular
functions, that is, ligand binding or DNA docking. If the difficult crystallization
step is accomplished, then the actual structural elucidation is nowadays usually only
a matter of a few weeks (▶ Chap. 13, “Experimental Methods of
Structure Determination”).
If the contributions to humanity that come from all of this progress are
considered, the question inevitably arises: why are such broad segments of society
so afraid of gene technology? It takes a little effort to understand these prejudices.
With the use of gene technology, almost everything that is theoretically imaginable
is possible in the field of genetics. The trust that people have in science is, however,
not as unshakable as it was before the atom bomb. Now, when significantly more
chances than risks are at hand, the sins of our forefathers have come back to haunt us.
Scientists have all too often underestimated possible risks in the past and put their
ethical concerns on hold. Scientists have still not managed to assuage public fears.
We must take these fears seriously and build new trust by behaving responsibly.
usual systematic sequencing methods. Above all, it benefited from the development
of faster and faster sequencing machines and powerful bioinformatic programs. It
was of no disadvantage in the end that because of the high redundancy of the method,
the genome had to be sequenced multiple times with the shotgun method. Interest-
ingly, the shotgun method was also used at the end by the international consortium
that followed the systematic approach to elucidate local sequence areas. Because the
initial intent of the private enterprise was to patent the sequenced genome, the
competition between the two initiatives was great. In March 2000, the American
President Bill Clinton declared the human genome to be not patentable, and spoke for
its use by everyone for the common good.
How did it come about that a competing private initiative started to sequence the
genome? In spring 1995 Craig Venter and his group sequenced the entire genome of
the bacterium Haemophilus influenzae by using the shotgun method. The enormous
amount of 1,830,121 base pairs that code for 1,749 genes was sequenced. The
complete genomes of individual viruses were already known, but this was the first
decoding of the genetic information of a self-contained organism. The subsequent
decoding of the sequence of 580,067 base pairs of the Mycoplasma genitalium
genome by Venter’s wife, Claire Fraser, took only four months.
Venter and his group worked with the shotgun method on the entire genome, the
so-called “whole-genome shotgun sequencing.” The statistical approach that was
followed by Venter initially seemed so unusual and utopian that his application for
a research grant from the American National Institutes of Health (NIH) was
rejected. This brought about the founding of The Institute for Genomic Research
(TIGR) and the Celera Genomics company. There, Venter could pursue his
research according to his ideas and plans. Finally, the success proved the feasibility
of the proposed strategy.
Whose genome was actually sequenced? In both initiatives the DNA of multiple
individuals was mixed and the individual differences were purposefully calculated
out. In this way the “consensus sequence” of the human genome was determined.
But it did not stop with the human genome. The complete genome sequences of baker’s
yeast Saccharomyces cerevisiae, the common thale cress Arabidopsis thaliana,
the rice plant Oryza sativa, the nematode Caenorhabditis elegans, the fruit fly
Drosophila melanogaster, the chimpanzee Pan troglodytes, the mouse Mus
musculus, and many other organisms (Table 12.1) have been determined. In the
meantime new ones emerge weekly. This raises new questions: how should this
plethora of information be managed? How can the genetic information be trans-
lated into useful knowledge? The field of bioinformatics has been challenged.
Computer programs for the intelligent comparison of sequences and the analysis of
metabolic pathways and signaling cascades already exist. New initiatives were
founded that have the goal of determining the spatial structure of all or at least
many sequences. The structural space of all real, naturally occurring proteins is
filling slowly. The crystal structures of all representatives of some protein families
of the human genome have been determined. Therefore it is only a question of time
until we can lay spatial blueprints next to the catalogue of all sequences in our
genome.
After the human genome was sequenced, the exciting question arose as to which
gene products all of these DNA sequences code for. Initially it must be remarked
that the genome is not static, it is constantly changing. It is only in this way that the
genetic variations can occur that make up the diversity of all creatures. In the course
of evolution, the genetic constitution has expanded. Simple single-cell organisms
without cell nuclei (prokaryotes) have a circular genome that contains only coding
genes. Single-celled organisms with a cell nucleus (eukaryotes) such as yeast, have
a larger genome, of which about 20% represents coding genes. Multicellular
organisms such as humans have a genome that is 200-times larger than that of
yeast (Table 12.1). The number of coding genes, however, is not larger. There are
even organisms such as the amoeba that have a genome that is 200-times larger than
that of humans. Even the minuscule water flea numerically overshadows us with its
31,000 genes. So the alleged masterpiece of creation does not necessarily also have
the largest genome. Obviously only a small number of additional DNA sequences
have accrued during the course of evolution that in fact code for additional gene
products. Many genes in higher organisms are similar to those in simpler species. If
the number of coding genes has hardly grown from the single-cell organisms to
humans, and even the gene products that are coded for are similar, what is the
explanation for the massive increase in complexity of the genome in higher-
developed organisms? The answer is not in the diversity of the needed gene
products, but much more in the finely tuned regulation of gene expression
(Sect. 12.13). In higher organisms, it is of decisive importance where and at what
time particular genes are expressed and gene products are synthesized. The 95% of
human DNA that does not code for proteins contains numerous sequences and
signals that control this regulation. Therefore the total number of genes in higher-
developed creatures does not seem to increase, but rather the gene density
decreases. On average, 12 genes per one million base pairs are found in the
human genome, whereas this number is 118 in the fruit fly, 197 in the nematode
Caenorhabditis elegans, and 221 in the common thale cress. Furthermore the human genome is very
scattered. It seems that it is not the number of genes but rather how they are used
and how their activation is regulated that is decisive for the developmental state of
the organism. It must also be considered that multicellular organisms also need
a great deal of cell differentiation into different organs. These processes must be
reliably regulated and controlled. Moreover, higher organisms achieve a much
larger diversity in their protein composition by so-called alternative splicing.
Posttranslational modification after the biosynthesis also plays a role. This is
observed to a much smaller extent in, for instance, prokaryotes. The splicing
process cuts the portions that do not code for proteins out of the RNA after
the transcription of DNA into RNA. During alternative splicing, it is decided what
is cut out and what is translated. In this way, one DNA sequence can code for
multiple different proteins.
To date, one of the largest genomes found in a single-celled pathogen belongs to the
pathogenic protozoan Trichomonas vaginalis. It consists of 160 million base pairs.
This pathogen is usually transmitted in humans by sexual intercourse and causes
urinary tract infections. Its enormous genome takes on an over-proportional
dimension in the cell. This could create an advantage for the pathogen because its large
surface area adheres to the vaginal mucosa better. Furthermore, the immune system
has trouble attacking and destroying such an over-sized parasite. The genome of the soil
bacterium Sorangium cellulosum with 13 million bases and 10,000 genes is four times
as large as the average genome of other bacteria. This might have something to do
with the fact that this soil bacterium is able to carry out special tasks that make its
therapeutic use interesting. It is a versatile producer of complex natural products
such as the epothilones, which are potent chemotherapeutics that have great
potential in the treatment of cancer.
According to an analysis carried out in 2007, the human genome encompasses
3.25 billion bases. It contains around 25,000 genes, a few thousand of which are
recognized as RNA genes (even today the number is not exactly known because
only 92% has been fully sequenced). The earlier textbook knowledge that one gene
product is behind each DNA sequence, must be expanded. It must not be
overlooked that our genome contains many thousands of genes that are for non-
coding RNA segments. The resulting RNA molecules accomplish important func-
tions in our bodies. The large groups of tRNAs that serve as adapter molecules for
the reading and translation of base-pair triplets in the genome into the correct amino
acid sequence deserve special mention. Furthermore it has been shown that the
ribosome itself, which is the molecular machinery for protein synthesis, consists
largely of RNA. The spliceosome, the complex machinery for the removal of non-
coding segments of the genome, contains RNA molecules, so-called snRNAs.
There are even more small RNA molecules (snoRNAs) that are responsible for the
processing and modification of other RNA molecules.
Since then, it is known that over 21,500 genes in our genome are translated into
proteins. It is not known however, what functions all of these proteins fulfill.
Bioinformatics has contributed a great deal to classification of their biochemical
function, that is, whether the protein is an enzyme (e.g., a protease, kinase, or
oxidoreductase) or whether it is a receptor, ion channel, or transporter. The function
or to what protein class a new sequence belongs can be discovered by sequence
comparisons to already annotated proteins. Often by making so-called multiple
sequence comparisons within a protein family, a significant similarity can be recog-
nized. The information about the spatial architecture and folding (▶ Sect. 14.2) can
be analyzed through relationships because the spatial geometry of proteins has
been much more strongly conserved than the sequential composition of the folded
protein chain. Often, individual motifs or characteristic sequence segments
disclose a particular biochemical function of a protein. Another tool in this detective
tour de force of functional annotation has proven to be protein sequence comparisons
between the genomes of other species.
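The idea of annotating a new sequence by comparison with already annotated proteins can be illustrated with a deliberately simplified sketch. It assumes that the sequences are already aligned to equal length and scores them only by percent identity; real annotation pipelines use alignment and database-search programs instead. The sequences, the dictionary ANNOTATED, and the function names are invented toy examples for illustration only.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(a == b and a != "-" for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Toy "database" of annotated sequence segments (made-up strings).
ANNOTATED = {
    "protease (toy)": "GDSGGPLVCK",
    "kinase (toy)": "HRDLKPENIL",
}

def best_annotation(query):
    """Return the annotation with the highest identity to the query."""
    scored = {name: percent_identity(query, seq)
              for name, seq in ANNOTATED.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]

label, identity = best_annotation("GDSGGPLICK")
print(label, identity)  # protease (toy) 90.0
```

A single pairwise score like this is the crudest possible measure; the multiple sequence comparisons mentioned above gain their power from looking at which positions are conserved across a whole family.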
The assignment of a biochemical function to a protein sequence affords a
glimpse into its molecular function. It shows whether, for example, it cleaves
a peptide sequence as a catalyst, carries out a metabolic reduction, or transduces
a signal to the cell as a receptor. What this regulation and control mean for
the organism remains to be resolved. Whether a particular protein causes
a disease by either a defective function or by dysregulation is just as unclear.
The correction of such a defect could lead to a successful pharmaceutical
therapy.
In the Science publication from the Venter group in 2001, it was assumed that the
genome coded for more than 26,500 proteins. At that time, a definitive function
could not be assigned to 40% of the sequences. In the remaining part, about 10%
were detected to be enzymes. Another 12% proved to be involved in signal
transduction, and 13.5% are nucleic acid binding proteins. The large remaining
group was scattered across many different functions such as proteins of the cyto-
skeleton, surface receptors, ion channels, transporters, extracellular matrix proteins,
immune system proteins, or chaperones. Seven years later this picture could be
refined. The largest protein family with more than 7,000 members contains the zinc
finger domain (▶ Sect. 28.2). These proteins assume an important role in transcrib-
ing sequence segments of the DNA into RNA. Most zinc finger proteins belong
to the group of transcription factors. Another large protein family contains the
immunoglobulins. These domains (▶ Sect. 32.1), which are constructed from
β-pleated sheets, occur in antibodies. A few protein families are listed in Table 12.2
and are presented in more detail in ▶ Chaps. 23, “Inhibitors of Hydrolases with
an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”;
▶ 25, “Inhibitors of Hydrolyzing Metalloenzymes”; ▶ 26, “Transferase Inhibitors”;
▶ 27, “Oxidoreductase Inhibitors”; ▶ 28, “Agonists and Antagonists of Nuclear
Receptors”; ▶ 29, “Agonists and Antagonists of Membrane-Bound Receptors”;
▶ 31, “Ligands for Surface Receptors”; and ▶ 32, “Biologicals: Peptides, Proteins,
12.4 What Is Contained in the Biological Space of the Human Proteome?
Table 12.2 Examples of protein families in the human genome and the number of their members
Protein superfamily                                       Number
Zinc finger (C2H2 and C2HC)                                7,707
Protein kinase-like                                          876
G Protein-coupled receptor-like                              784
α/β-Hydrolases                                               151
Cysteine proteases                                           164
Trypsin-like serine proteases                                155
Metalloprotease (“Zincins”), catalytic domains               132
FAD/NAD(P)-binding domains                                    79
Cytochrome P450                                               79
Integrin α, N-terminal domains                                51
Cytokines                                                     52
Cyclic nucleotide phosphodiesterase, catalytic domains        50
Caspase-like                                                  39
Carbonic anhydrases                                           23
Aquaporin-like                                                20
Integrin domains                                              18
Aspartic proteases                                            16
ClC-chloride channel                                          16
Subtilisin-like                                               14
Source: http://hodgkin.mbu.iisc.ernet.in/human/
Fig. 12.3 The composition of protein families that are particularly often associated with human
diseases (GPCR: G-protein-coupled receptor; fibronectin: extracellular glycoproteins in tissue
construction; homeobox: proteins that influence morphogenetic development; spectrin:
cytoskeletal proteins; MHC I: major histocompatibility complex proteins that are involved in
immune-recognition processes; myosin: motor protein in muscle control; RRM: RNA-recognition motif
transcription factors; trypsin-like: serine proteases; laminin EGF: a growth factor in the
extracellular matrix; Ras: oncoprotein in tumorigenesis; SH2: protein domains in the phosphorylation
signal cascade).
Early on, pure or enriched enzymes were available for in vitro tests, but only in
those cases in which the material was easily available, for instance, human throm-
bin from blood. In other cases, animal material had to be used with all the risks that
come with it considering the relevance for rational design (see ▶ Sect. 19.11).
There are many proteins that cannot be isolated in adequate amounts or in
a homogeneous form. The sequence determination and the production of such
proteins are simple today. The unbelievably small amount of a few picomoles
(1 pmol = 10⁻¹² mol) is enough to determine the primary structure of a short
sequence. From the amino acid sequence thus determined, the corresponding gene
sequence can be reconstructed by back-translation with the genetic code. In doing
so, it must be considered that multiple base triplets can stand for a particular amino
acid (the so-called degenerate code, ▶ Sect. 32.7). A group of single-stranded
oligonucleotides is synthesized that together cover all theoretically possible coding
sequences of the original peptide segment. These molecules can be used to find
a complementary sequence in a cDNA library. cDNA (complementary DNA) is the DNA
that is complementary to the mRNA (messenger RNA). It is obtained from the mRNA,
which merely contains the sequence that is needed for the biosynthesis of proteins,
by reverse transcription with a reverse transcriptase (▶ Sect. 32.5). Finally the gene
is produced in larger quantities by using PCR techniques, and the amino acid sequence
is determined via its base sequence simply because oligonucleotides are much easier to sequence.
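The consequence of the degenerate genetic code for the design of the oligonucleotide pool can be made concrete with a small sketch. It builds the standard genetic code table and enumerates every DNA coding sequence that could encode a short peptide. The function names and the example peptides are assumptions chosen for illustration; the sequences are written as coding strands, so the actual hybridization probes would be their complements.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
CODE = dict(zip(CODONS, AMINO))  # codon -> one-letter amino acid ('*' = stop)

def codons_for(amino_acid):
    """All base triplets that stand for one amino acid."""
    return [codon for codon, aa in CODE.items() if aa == amino_acid]

def degenerate_pool(peptide):
    """All coding-strand DNA sequences that could encode the peptide."""
    choices = [codons_for(aa) for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]

# Met and Trp have one codon each, Lys has two: only 2 oligonucleotides.
print(degenerate_pool("MWK"))          # ['ATGTGGAAA', 'ATGTGGAAG']
# Leu, Ser, and Arg have six codons each: 6 * 6 * 6 = 216 oligonucleotides.
print(len(degenerate_pool("LSR")))     # 216
```

The comparison of the two peptides shows why segments rich in methionine and tryptophan are preferred when such probes are designed: the pool stays small.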
Next, the gene is brought into cells that are allowed to reproduce. There can be
difficulties in a few cases with this step. In bacteria, such as the intestinal bacterium
Escherichia coli, or in yeast cells, only soluble proteins can be produced. Some
proteins accumulate in inclusion bodies. They must be extracted, dissolved, and
refolded under specific conditions. The gene segment for a small protein is often
coupled with the information for another protein and both are then expressed
together. The large protein conjugate that forms in the cell is better protected from
metabolic degradation than small proteins. In the preparation, the non-essential part is
cleaved from the protein conjugate. There can be problems if the folding of the
protein is not correctly accomplished, or if multiple chains (as in insulin) must be
coupled over disulfide bridges. Larger proteins that must be furnished with sugar
groups to accomplish their function (glycosylation) must be produced in cells from
higher organisms, for example in mammalian cells. The manufacture of complex
proteins in insect cells has become particularly attractive. These cells are infected
with the so-called baculovirus, in which the desired information has been incorpo-
rated into its genome. The virus codes for the protein and insect cells provide the
production and subsequent glycosylation abilities. Not only enzymes, but also recep-
tors, ion channels, and entire signaling cascades can be produced in cells in this way.
12.7 Silencing Genes by RNA Interference
12.8 Proteomics and Metabolomics
The approaches that were described in Sects. 12.5 and 12.7 pursue the goal of
turning off a disease-causing gene, or a gene that plays a role in a disease. But how
is it to be recognized whether a particular gene or gene product is involved in
a disease process at all? Decisive indicators to answer this question can be extracted
from the protein composition in the cell. This composition changes dynamically. It
is termed proteome, and reflects the totality of all proteins in a cell, actually in the
entire organism, at a given time under entirely defined conditions. If we concentrate
on the protein pattern of a cell from a particular organ, important variables are the
metabolic state, the developmental stage of the organism, the time point in the cell
cycle, or the surrounding temperature. Disease processes and pharmaceutical therapy
also change this pattern. In the genome, all theoretically expressed
proteins are coded as static hereditary information. In contrast, the proteome
reflects the protein composition at a particular time point. The difference between
a butterfly in its caterpillar and adult phases serves as an impressive example of the
difference between the genome and the proteome. The genome is the same for both,
but the proteome is significantly changed, which is expressed in the form of
a completely different phenotype.
In view of disease processes or a pharmaceutical therapy thereof, the proteome
can be used to compare the state of cells that are healthy, diseased, as well as under
the influence of a drug therapy. Initially this seems like an extremely complex,
barely solvable task. A cell contains thousands of proteins, of which many are
modified after their expression. For example, the first amino acids in a sequence are
cleaved (▶ Sect. 25.9), phosphate groups are transferred (▶ Sect. 26.3), sugar
building blocks are added on, disulfide bridges are coupled, prosthetic groups are
added, and ubiquitin or prenyl groups are added (▶ Sect. 26.10). In addition,
alternative RNA splicing occurs, which is carried out as a mechanism of gene
regulation and further increases the diversity of the proteome on the basis of
a comparatively small number of genes. All of this dramatically increases the
diversity of the protein composition, probably by a factor of 5–10 compared to
the genome composition. Nevertheless, a sophisticated analytical method has been
developed with which it is possible to analyze the proteome of a cell at a particular
time point. First a cell must be denatured in a way that all modifying processes are
abruptly stopped so that conclusions can be drawn about the cell contents. The cell
lysate is then subjected to separation. Proteins contain many acidic and basic amino
acids so that an exactly defined pH value exists for each protein at which the
protonation or deprotonation arrives at a state at which the protein appears to be
overall electrically neutral. This pH value is specific for each protein and depends
on the amino acid composition (isoelectric point). The protein mixture is added to
a solid support (a polyacrylamide gel) as would typically be used for
chromatography purposes. Then voltage is applied. If the proteins carry a charge, they migrate
over this solid support in the direction of the oppositely charged pole. In this way, at
some point in their migration over the gel, which is constructed in such a way that
a continuous pH gradient from one end to the other is established, the applied
proteins reach a point where their exterior appears to be uncharged overall. If this
position is reached on the solid support, the proteins no longer migrate. Proteins are
then separated according to their isoelectric point by using this so-called isoelectric
focusing. All proteins with the same isoelectric point migrate the same distance and
occur as a mixture. Then the chromatography plate is turned 90°, and the proteins
are separated again, but now according to a different principle. For this, the
proteins are thermally denatured and their charges are masked with sodium
dodecyl sulfate, a strongly charged anionic surfactant, so that all are virtually
equally charged on the exterior (SDS-PAGE). The denatured proteins migrate
again by the application of an electrical field. Now the migration speed is, however,
dependent on the mass of the proteins. The direction of the migration, which is
perpendicular to the first isoelectric separation, causes the originally applied
proteome to be broadly distributed and well separated on the solid support in the end.
Fig. 12.4 2D-Gel electrophoresis for cellular proteome analysis: (a) proteome of a normal cell.
(b) Proteome of a pathologically altered cell. (c) Proteome of a pathologically altered cell after
treatment with a drug. Changes in the protein concentration are indicated by red circles. Above all,
the proteins at positions 3, 6, and 7 are clearly up-regulated in the diseased state. A few of the
pathological changes are corrected by the drug therapy, but new changes in the proteome (e.g.,
2, 8, and 10) might be induced by side effects (Figure taken from Lottspeich F (1999) Angew
Chem Intl Ed Engl 38:2476–2492).
By using this 2D electrophoresis, it is possible to separate many thousands of
proteins. The quantity and sequential composition of the separated proteins must be
characterized. Many different staining and fluorescence techniques have been
developed for the quantitative analysis. They allow quantitative determinations,
especially in comparison to the proteomes of analogous cells that are in a different
state. This is how the quantitative comparison of the protein composition in
a diseased and healthy state is accomplished. How the proteome changes under
the influence of a drug (Fig. 12.4) can also be determined. But how does one
recognize what is hidden in each individual protein spot on such a 2D gel? For
this the proteins are extracted from the plate and digested with trypsin. This
protease (▶ Sect. 23.3) cleaves the denatured proteins into small peptide fragments,
which are finally analyzed by mass spectrometry. Sophisticated technologies
together with computer analyses of precalculated fragmentation patterns of proteins
allow the proteins to be reconstructed and characterized with regard to their
sequence. Proteins in the proteome that have either been up- or down-regulated
due to a disease process can be detected in this way. Whether, however, the altered
expression pattern causes the pathological state or is a consequence of it is
a question that remains to be answered by independent experiments.
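The digestion and mass-matching step described above can be illustrated with a minimal Python sketch. It assumes the commonly used simplification that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), ignores missed cleavages and chemical modifications, and uses standard monoisotopic residue masses. The function names, the example sequence, and the matching tolerance are hypothetical and serve only as illustration; real identification software is considerably more elaborate.

```python
import re

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water per free peptide (N- and C-terminus)


def tryptic_digest(sequence):
    """Split after K or R unless followed by P (simplified rule, no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed, sequence, tolerance=0.01):
    """Return (observed mass, peptide) pairs that agree within the tolerance (Da)."""
    peptides = tryptic_digest(sequence)
    return [(m, p) for m in observed for p in peptides
            if abs(m - peptide_mass(p)) <= tolerance]


if __name__ == "__main__":
    candidate = "AKRPGRSLK"  # hypothetical database sequence
    for pep in tryptic_digest(candidate):
        print(pep, round(peptide_mass(pep), 4))
```

A measured peptide mass list would be compared in this way against the precalculated fragments of every candidate protein in a database; the protein with the most matching fragments is the most plausible identity of the spot.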
As described, the proteome of a cell can change upon therapy with a drug. What
are the interaction partners for a given drug? Are the induced effects always the
same if drugs from the same compound class are used? The properties of three
kinase inhibitors that were developed for the treatment of chronic myeloid leukemia
(▶ Sect. 26.5) were investigated in detail in the research group of Giulio Superti-
Furga in Vienna, Austria. For this, the drug first had to be equipped with
a chemically inert anchor group. It is certainly quite a sophisticated challenge to
find the correct position to place an anchor on such an active substance so that the
mode of action is not significantly perturbed. As a rule, multiple positions along the
molecular scaffold must be tried for this purpose. Finally the drug is irreversibly
and covalently coupled through the attached anchor group to a chromatography column.
Once the column has been equipped with these “baits,” the proteome from the lysate of a cell is added to
it. Proteins that have affinity to the immobilized drug stick to the column.
Finally the binding partners that were detected in this pull-down experiment must be
released from the column, separated, and characterized analogously to the above-
described technique. The composition of all proteins that have affinity to the active
substance is obtained. It is difficult to initially extract quantitative conclusions about
the affinity of the binding partners, above all because the protein quantities and their
composition in the lysate are highly variable. It is possible, however, to construct
a profile for each active substance according to its protein interaction partners. This
led to the surprising result that even drugs that belong to the same or similar substance
classes and were developed for the same therapeutic indication can well display
significantly different interaction profiles in the cell. This is an impressive observa-
tion, the evaluation and application of which will require great research effort. We
will see in the next section that the different efficacy, therapeutic deviations, and
variable side-effect profiles in patients can be explained by this.
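How such interaction profiles might be compared can be sketched in a few lines of Python. The example treats each profile as a set of identified binding partners and scores the overlap of two profiles with the Jaccard index. The drug labels and protein identifiers are purely hypothetical placeholders, and the Jaccard index is merely one plausible choice of similarity measure, not the method used in the study described above.

```python
def jaccard(profile_a, profile_b):
    """Shared binding partners divided by all binding partners of both drugs."""
    a, b = set(profile_a), set(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical pull-down results: drug label -> set of bound proteins.
profiles = {
    "inhibitor_A": {"P1", "P2", "P3"},
    "inhibitor_B": {"P1", "P4"},
    "inhibitor_C": {"P1", "P2", "P3", "P5"},
}

if __name__ == "__main__":
    names = sorted(profiles)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(profiles[first], profiles[second])
            print(first, second, round(score, 2))
```

Because the pull-down experiment yields no reliable affinities, a purely qualitative set comparison of this kind matches the information that is actually available.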
Proteome analytical techniques (proteomics) can also be used in clinical diag-
nostics. Without exactly resolving the analyte, significant changes in the form of
a mass fingerprint can be recognized. Tumor diseases are revealed by changes in
their protein composition. These can be recognized at a very early point, which
should hopefully still allow a curative treatment for the tumor.
Another technique that is analogous to proteomics is the analysis of metabolites
that are produced in an organism. The term metabolome comprises all metabolites
(e.g., metabolic degradation products) that are present at a specific time point. The
techniques of metabolomics try to quantify the metabolite composition and to
draw conclusions about the condition of the cell based on this information. This
is particularly valid when the cell is exposed to foreign substances. If the metabolite
profile at a particular time point is studied, especially in pathophysiologically
or genetically changed conditions, the term metabonomics is used. The goal of this
technique is to draw conclusions about the molecular composition in cells from
body fluids such as urine, serum, or cerebrospinal fluid. This can lead to an
improved and more sophisticated diagnostic procedure, and therefore an easier
early detection of diseases. These techniques also serve to characterize proteins
for drug therapy or to analyze the greater influence of an event in the cell that is
being treated with a drug. The hope remains that these techniques will allow for
a better understanding of the total effects of the use of pharmaceuticals, and finally
achieve a higher safety standard for therapy.
252 12 Gene Technology in Drug Research
[Figure 12.5 schematic. Panel labels: Gene 1, Gene 2, Gene 3; PCR; RNA isolation; reverse transcriptase; fluorescence labeling; mRNA; hybridization]
Fig. 12.5 Manufacture and testing of an expression pattern with microarray technology. Individ-
ual gene segments from an organism are cut out and amplified by using PCR (above left). Next
they are immobilized on a microchip support as single-stranded oligonucleotides (below left). In
addition to the isolated and amplified DNA, synthetically manufactured DNA building blocks or
cDNA molecules, which are obtained from reverse transcription can also be brought onto the
support. One sort of bait molecule is at each point on the support. RNA molecules are isolated from
the cells of healthy (green) and diseased tissue (red), translated into mRNA, and reverse-
transcribed into cDNA. The cDNA is provided with a colored fluorescence marker. Then the
test molecule is added in a single-stranded form to the microarray plate, and if it is complementary,
a hybridization (below middle) results. Finally the binding is analyzed under fluorescent light
(below right). Yellow areas indicate that mRNA molecules from the healthy as well as the diseased
cells have bound. The mRNA that binds there is expressed in healthy as well as diseased states.
Areas that remain dark indicate that the mRNA is up-regulated neither in a healthy nor in
a diseased state. Areas that are either only green or only red fluorescent indicate a difference in
the expression pattern between cells from healthy and diseased tissue.
equipped with different fluorescence dyes according to their origin. For example,
the mRNA from a healthy cell is labeled green and that from a diseased cell is red.
After the hybridization on the chip, there will be areas that fluoresce green, red, or
yellow upon excitation, and others that remain without fluorescence. Areas that
glow yellow under the fluorescent light indicate that mRNA molecules from
healthy as well as diseased tissue have been bound. Obviously the mRNA that
binds there is available equally in the diseased and healthy states. Areas where no
fluorescence is seen indicate that neither healthy nor diseased cells produced
mRNA that bound there. Areas that fluoresce either green or red are interesting
because they indicate differences in the expression pattern between healthy and
diseased cells. In this way, gene products can be discovered that are involved in
a disease process. If a misregulation is present, an attempt can be made to correct
this state with a pharmaceutical therapy.
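The color readout described above can be expressed as a simple decision rule. The following Python sketch classifies a spot from its green (healthy) and red (diseased) fluorescence intensities. The background level and the ratio cutoff are hypothetical values chosen for illustration; real microarray analysis additionally involves normalization and statistics that are omitted here.

```python
def classify_spot(green, red, background=100.0, ratio_cutoff=2.0):
    """Classify one microarray spot from its two fluorescence intensities.

    green: signal of labeled material from healthy cells
    red:   signal of labeled material from diseased cells
    background and ratio_cutoff are illustrative assumptions.
    """
    green_on = green > background
    red_on = red > background
    if not green_on and not red_on:
        return "dark"    # expressed in neither state
    if not red_on:
        return "green"   # expressed only in healthy cells
    if not green_on:
        return "red"     # expressed only in diseased cells
    ratio = red / green
    if ratio >= ratio_cutoff:
        return "red"     # clearly stronger in diseased cells
    if ratio <= 1.0 / ratio_cutoff:
        return "green"   # clearly stronger in healthy cells
    return "yellow"      # comparable expression in both states


if __name__ == "__main__":
    for g, r in [(50, 40), (1000, 950), (1000, 60), (200, 900)]:
        print(g, r, classify_spot(g, r))
```

Only the spots classified as green or red point to genes whose expression differs between the two tissues and that are therefore candidates for involvement in the disease process.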
What makes a single organism of a particular species different and leads to the
enriching diversity of a population? We speak of the human genome, but many
interesting deviations must be present so that we all look different and have
different features. Polymorphisms, that is, variations in the composition of the
genome, cause the observed diversity in or form the different phenotypes of
a species. The most obvious phenotypic difference is the division into male and
female individuals. Of course this is not the only difference that we recognize for
the human species. Many sequence variations occur within a population at the
genome level. If they occur in more than 1% of the population, they are referred to as
different alleles; otherwise they are attributed to mutations that have not yet become
established through evolution. Genetic polymorphisms are, for instance, observed as
insertions or deletions in which at least one nucleotide has been
incorporated or lost. However, single nucleotide exchanges occur as
the most common sequence variation. Here the term SNPs (spoken “snips”) is
used, which is an abbreviation of single nucleotide polymorphism. Compared to the
entire genome, polymorphisms encompass only a very small portion. They are
estimated to be about 0.1% of the roughly three billion bases of the genome, so about three million bases. Of these,
SNPs are the overwhelming portion with about 90% share. Therefore the largest
part of our genome is identical across the entire human species, even though enormous
diversity in the phenotype is observed between us.
Within the SNPs, coding and non-coding changes are differentiated according to
whether these observed exchanges are translated into proteins or not. In the coding
regions of the genome the single exchange of a nucleotide can lead to an altered
protein sequence. In ▶ Sect. 32.7 the translation procedure of a base triplet into
a protein sequence is introduced. If a base in a coding triplet is changed, the new triplet can either
be translated into the same amino acid, or it leads to the incorporation of a different
amino acid. This is because multiple triplets sometimes code for the same
amino acid. The incorporation of a different amino acid into a protein can change its
properties. For example the amino acid composition of a glycosyltransferase is
decisive for the blood group that we have. An example is introduced in ▶ Sect. 29.7
of how an altered incorporation of a few amino acids in a G protein-coupled
receptor can exert an influence on our sense of smell. Humanity is divided into
different alleles according to their ability to smell different intensities and qualities.
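The difference between a silent and an altering base exchange in a coding triplet can be illustrated with a short Python sketch based on the standard genetic code. The function name and the chosen example codon are assumptions made for illustration only; codons are written in the DNA alphabet of the coding strand.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_snp(codon, position, new_base):
    """Exchange one base (position 0, 1, or 2) and compare the encoded amino acids."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return mutated, before, after, kind


if __name__ == "__main__":
    # GAG codes for glutamic acid (E).
    print(classify_snp("GAG", 2, "A"))  # third base exchanged
    print(classify_snp("GAG", 1, "T"))  # second base exchanged
```

In this example the exchange in the third position still yields glutamic acid, whereas the exchange in the second position replaces glutamic acid with valine and thus changes the protein sequence.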
However, not only SNPs in coding regions lead to differences in our species.
SNPs in noncoding segments of the genome can lead to changes in gene regulation.
In the context of drug research and therapy, SNPs can also be relevant where they
have no immediate effect on the phenotype. It is assumed that some SNPs confer
12.11 The Personal Genome: Access to an Individual Therapy?
Genome sequencing and the analysis of SNPs and polymorphisms have impres-
sively uncovered the source of disease predisposition, and why drugs have attenu-
ated tolerability and different side-effect profiles. It has offered an explanation for
why undesirably high variations in the efficacy of drugs can occur in different
patients. All the more reason to ask whether the sequencing of the individual
genome of each person would provide options for a tailored individual and
personalized therapy. It is in no way a utopian idea that in a few years the full
sequencing of each individual person will be possible at manageable prices and
within an acceptable time frame.
It has long been known in medicine that the blood groups of donor and recipient must
match for blood transfusions. A genome analysis would make the search for a
matching donor organ easier for transplantations. A particularly high density of
SNPs has been discovered in the genome, especially in regions coding for proteins
that present antigens in the immune system on their surface to stimulate an immune
response (▶ Sects. 31.7 and ▶ 32.2). An SNP analysis of each individual could
indicate the probability of developing a particular disease. Here, early detection of
this risk and possible lifestyle modification could be better than any therapy.
Already today high-resolution DNA chips (Sect. 12.9) allow the simultaneous
determination of more than 500,000 genetic SNP markers. Discovered SNPs can
indicate an elevated disposition for, for instance, the development of Alzheimer’s
disease in old age. A simple screening of the individual DNA sequence would allow
a predisposition for a particular disease pattern to be recognized.
Craig Venter, who determined the human genome in his company by the whole-genome
shotgun method, had his own genome analyzed and published. From the gene
analyses of these data, a tendency for obesity and cardiovascular disease was
identified. His own father died at 59 years old of a heart attack. Based on this
analysis, Venter decided to take a lipid-lowering agent from the statin class
preventatively. A doctor could simply read from a personal genome whether the
patient displays an SNP pattern that would lead one to expect an intolerance for
a particular drug therapy. Moreover the doctor could see what type of metabolizer
category (▶ Sect. 27.7) the patient belongs to. This could reduce intolerance upon
the simultaneous treatment with multiple drugs, and would allow a safe adjustment
of individual dosing. It can also help to choose the right drug for a therapy,
particularly if multiple drugs are available for one indication.
Genetic diseases have a molecular origin. A gene is altered (allele), sometimes the
two genes originating from both parents. Each of us carries a large number of such
altered genes, which are a result of arbitrary base exchanges: the SNPs. The
principle of evolution is based on these random mutations. If a mutation causes
a better adaptability of an individual in the environment, the chances of survival and
reproduction increase. Those genes are then reproduced with increased probability.
So-called horizontal gene transfer exerts an accelerating effect on evolution in
asexually reproducing species. There, entire DNA fragments are exchanged between individuals
or even species. Crossover plays an important role in this sense in
sexual reproduction. In this case, neighboring gene sequences of both parents
arbitrarily cross over and form new combinations. Without mutations and crossover,
all species would remain absolutely constant. In individual cases, this mechanism of
evolution produces many errors. Some of these errors are the cause of genetic
disease. In sickle cell anemia a single amino acid in hemoglobin, which gives
blood its red color, is exchanged: a glutamic acid in position 6 of the β chain of
hemoglobin A (HbA) is replaced by a valine. The altered hemoglobin aggregates: it
“sticks” together in the red blood cells. The cells collapse and take on
a characteristic sickle form. Homozygous carriers, that is, individuals in whom
the “sick” gene is inherited from the father and the mother, are not able to survive.
Heterozygous carriers who carry one “sick” and one “healthy” gene produce
normal and altered hemoglobin alongside one another. These people indeed have
a shorter life expectancy, but usually achieve reproductive maturity. In areas in
which malaria is endemic, there is a selection pressure for the genetic disease.
Heterozygous carriers of sickle cell anemia are more resistant to malaria than
healthy people (▶ Sect. 3.2). Here we are witnesses to Nature’s great experiment.
How will it end? Even people intervene. If malaria is successfully treated, wild-type
HbA carriers are no longer disadvantaged, and the evolutionary advantage of sickle cell
anemia and the consequent selection pressure in the direction of this disease
disappear. This genetic disease could become “extinct” after a few generations.
On the other hand, if sickle cell anemia is treated either conventionally or gene
therapeutically, then these people would have entirely normal “healthy” red cells.
The malaria pathogen could reproduce well in them again. The protection from this
disease would disappear, and the susceptibility of these people to malaria would
rise to a normal risk level.
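The selection argument can be made concrete with the classical one-locus model of heterozygote advantage. The Python sketch below assumes random mating and constant relative fitness values: 1 - s for wild-type homozygotes (disadvantaged by malaria), 1 for heterozygous carriers, and 1 - t for homozygous sickle cell carriers. The selection coefficients s and t and the starting frequencies are purely hypothetical and are not taken from the text.

```python
def next_frequency(q, s, t):
    """One generation of selection on the sickle cell allele frequency q.

    Relative fitness (illustrative assumptions):
      wild-type homozygote 1 - s, heterozygote 1, sickle homozygote 1 - t.
    """
    p = 1.0 - q
    w_wild, w_het, w_sickle = 1.0 - s, 1.0, 1.0 - t
    mean_fitness = p * p * w_wild + 2 * p * q * w_het + q * q * w_sickle
    return (p * q * w_het + q * q * w_sickle) / mean_fitness


def simulate(q_start, s, t, generations):
    """Iterate the selection step for a number of generations."""
    q = q_start
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


if __name__ == "__main__":
    s, t = 0.1, 0.8  # hypothetical selection coefficients
    # With malaria present, q approaches the equilibrium s / (s + t).
    print(simulate(0.01, s, t, 500), s / (s + t))
    # With malaria treated (s = 0), the allele frequency declines.
    print(simulate(s / (s + t), 0.0, t, 200))
```

In this simplified model the sickle cell allele is maintained at a stable frequency as long as malaria penalizes wild-type homozygotes, and its frequency falls once that penalty is removed.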
In addition to sickle cell anemia, around four thousand other diseases and their
molecular causes are known. Some, for example cystic fibrosis, phenylketonuria,
and inherited coagulopathies occur relatively frequently. Many others are rare and
are sometimes only described once. In recent years a multifactorial genetic cause
has been established for an increasing number of diseases, for example for diabetes,
rheumatoid arthritis, some cancers, asthma, and Alzheimer’s disease. The occur-
rence of these diseases is brought about by the simultaneous coincidence of
multiple genetic alterations, or is at least fostered by them.
The mechanisms of evolution are also responsible for the development of
resistances (▶ Sect. 4.8). Here, the selection pressure is exerted by a drug or an
insecticide (e.g., to exterminate malaria-carrying mosquitoes). Because of their
rapid reproduction, bacteria and viruses adapt quickly to a “hostile” environment.
The true masters are the retroviruses, which can develop resistance particularly
quickly because of their high mutation rates, and can therefore annihilate the
success of a drug with one stroke (▶ Sect. 24.5).
For the development of an organism, it is not only the kind of hereditary infor-
mation stored in the DNA that can be translated into gene products that is critical,
it is just as important that particular genes are only read in particular cells at
particular times. Even social factors and environment influence the genes and
change their behavior. Scientists observed the following example with zebra
finches. If a male zebra finch hears the song of another male, the gene EGR-1 is
more strongly read. The unknown song of a potential rival leads to a much
stronger activity in EGR-1 than background bird song that the finch has already
heard. EGR-1 is itself a key gene in gene regulation so that a change in the social
surroundings of the finch leads to many shifts in the protein expression pattern of
the bird. This response helps the bird to adapt to the new changes because the
intrusion of a potential competitor into his own territory can be of essential
importance to him.
Pluripotent embryonal stem cells can differentiate into very different cell types.
For example, liver, brain, and muscle cells have the same chromosome set. They
are fundamentally different in their function. Many different phenotypes arise
from the identical genotype. This is true for the different cell types of an organism
at the same time as well as for different time-staged developmental steps in an
organism. Research on twins has produced remarkable results in this regard.
Comparative studies on identical twins, who are genetically identical, show that
with increasing age, and above all with different lifestyles, progressively larger
differences in the phenotype occur. There must therefore be mechanisms that lead
to changes in the phenotype that are passed along without changes in the genotype.
They regulate the transcription process and pass along this property to daughter
cells. This process is summarized under the term epigenetics. It leads to the
situation where an additional level of information is formed that regulates the
reading of the genes from the DNA.
The surroundings exert their effect on the genes through the epigenome.
Upbringing, childhood experiences, the effects of chemicals or intoxicants, and
stress are all epigenetic regulatory influences over which the gene activity is
temporarily or even permanently changed. As the following example of the Agouti
mice shows, such information can even be passed along to subsequent generations.
Normally, these rodents are small, brown, thin, and very agile animals. The so-called
Agouti gene is contained within their genome; after activation it causes the
animals to become ill: their coats turn yellow, and they become ravenous and fat.
The offspring of these ill mice are colored the exact same way and are just as frail as
their parents. The American molecular biologist Randy Jirtle at Duke University in
Durham, NC, fed pregnant Agouti females a special diet that was rich in dietary
supplements such as vitamin B12, folic acid, choline, and betaine. As a result, the
majority of the offspring of these females were brown, thin, and in the best of
health. The Agouti gene was turned off by the enriched diet, without requiring any
changes to the genome sequence of the rodents.
On the molecular level, it is in particular methylation and acetylation that
transmit the additional epigenetic information. In contrast to genetic changes
that cause mutations in the translated gene products, epigenetic changes have
a strong dynamic component and are, above all, reversible. In the stretched-out
state, there is more than two meters of DNA in the cell; this is wound into a highly
compact form onto small basic proteins: the histones. Lined up like pearls along
a string, they collectively make up the chromatin, which makes up the chromosomes
in its maximally packed form. Histones are the most strongly conserved
proteins in existence; for example, the 102-residue histone protein H4 from the pea
and from the cow differ in only two positions.
Epigenetic changes modify as one option the DNA in that methyl groups are
transferred to cytosine by methyltransferases (see ▶ Sect. 26.9) to give
5-methylcytosine. The base pairing with guanine in the DNA is not affected by
this modification, and the genetic code remains unchanged. If a methylation occurs
in a promoter region of the DNA, this leads to a silencing of the corresponding
gene. The methylation makes the DNA inaccessible to the reading apparatus, which
is somewhat similar to password-protected computer data. If the promoters in these
gene segments are demethylated again by demethylases, the translation into the
corresponding protein is possible once more. As a second epigenetic change,
histone proteins can be modified. Methyl, acetyl, and phosphate groups can be
enzymatically transferred to lysine and arginine residues of these basic proteins
with, for example, histone acetyltransferases (HATs). The added acetyl groups
neutralize the positive charge on the Lys and Arg residues (the so-called “histone
tails”). They can no longer interact as efficiently with the negatively charged
phosphate groups of DNA. Added phosphate groups have an even more repulsive
effect. These changes lead to less densely packed chromatin, which makes the DNA
reading in particular regions easier. The transcription and gene expression is
regulated in this way. Conversely, the cleavage of acetyl groups by histone
deacetylases (HDACs) or the methylation of the Lys and Arg residues of the
histone causes the packing density of the chromatin to increase, and this diminishes
the probability for the DNA to be read in the affected areas.
Misregulation of the described enzymes is associated with the development of
diverse cancers. Because epigenetic processes are fundamentally reversible, there is
a chance that a drug therapy could intervene in the misregulated function of these
transferases. For this reason, intensive research efforts are underway for inhibitors
of different methyltransferases and histone deacetylases, the latter of which are
mechanistically comparable to metalloproteinases (▶ Chap. 25, “Inhibitors of
Hydrolyzing Metalloenzymes”). The hope remains that these inhibitors can sup-
press disease-causing epigenetic changes and become potent drugs for cancer
therapy in humans.
In September 1990 the 4-year-old Ashanti DeSilva was the first patient to be treated
with a gene therapy. The alleles of both parents for the enzyme adenosine deam-
inase were defective. Because this enzyme is critical for the function of the immune
system, the little girl suffered from severe immune insufficiency that could no
longer be classically treated. As a therapy, the white cells of the patient were
repeatedly infected with a virus that carried the correct information for the missing
enzyme. The patient, who previously was hospitalized and in constant danger of
infection, has developed into a person with entirely normal health.
The term gene therapy refers to any technology with which a gene is introduced
into a cell of a patient to replace a defective or missing gene. In principle it is very
simple. Viruses demonstrate it for us daily: they bring their own genetic informa-
tion into a foreign cell and use it to code for a few key enzymes that are necessary
for their own reproduction. For the rest they use the biosynthesis machinery of the
infected cell. The retroviruses, the genetic information of which is coded in RNA,
translate this information into DNA and integrate it into the host’s DNA. In gene
therapy, a nucleic acid segment is inserted into the genome of a virus that codes for
the protein that is to be substituted in the patient. The construct, which is what
these modified viral genes are called, is surrounded by the virus capsid and is
introduced into the cells of the patient. This can either take place outside of the
body, that is, in bone marrow or white blood cells that have already been
aspirated, or within the body, for example by injection into tumor tissue or into
a particular organ. Adenoviruses, herpes viruses, or retroviruses are all well suited
as carriers of the genes because these viruses incorporate their own genetic information
into mammalian DNA. Whereas retroviruses only transfer their genes
during cell division, adenoviruses can cause non-dividing cells to incorporate and
use foreign genetic information. Plasmids, complexes of DNA and liposomes, and pure DNA
constructs are also being experimented with. The rates of transfer for the new
information into cellular DNA are, however, significantly lower here than for the viruses. In
the meantime over 1,000 gene therapy clinical studies are underway, most in the
USA and overwhelmingly for tumor therapy. Cancer is indeed not a hereditary
disease, but the genetic information that is inherited from cell to cell creates
a “local” genetic disease. Oncogenes code for a large group of proteins that are responsible
for the occurrence of cancer. Tumor-suppressor genes code for proteins that
interfere in the cell cycle and stop the division of cells. The quickly increasing
knowledge of the molecular structure of these proteins has afforded many
approaches for the gene therapy of tumors.
Other diseases can also be approached with gene therapy. The standard therapy
for cardiovascular diseases that are characterized by an excessive growth of endo-
thelial cells and consequent narrowing of the blood vessels is widening with
a balloon catheter. That helps, but only temporarily. After a few months the cells
proliferate anew and the blood flow in the downstream areas decreases threaten-
ingly. Here a gene therapy could be employed. Adenoviruses can be released
locally during the balloon catheter treatment. These carry the genetic information
for a protein that inhibits cell division, the so-called retinoblastoma protein. The
cells can then no longer proliferate.
AIDS patients die from infections because their immune systems are damaged.
The so-called T cells die. Bone marrow transplantation is a possible therapy. For this
it is decisive that the immunological properties of the donor and patient are as close as
possible. Many people are eliminated as possible donors, not to mention animals. Or
are they suited? A new approach for bone marrow transplantation and perhaps even
organ transplantation is the humanization of animals. For this, immature human
bone marrow cells (stem cells) are transplanted into an animal, for example, a baboon.
The rejection reaction of the foreign cells is prevented by treatment with immuno-
suppressants. The human recipient does not bear the risk of an immune reaction, but
rather the animal donor. After the proliferation of the human cells in the animal, the
cells can be safely transplanted into the human “pro-donor.”
Will gene therapy replace classical drug therapy? The answer is absolutely
certain: no. The technique is very laborious and each patient needs an individually
adapted therapy. Moreover, the results to date have been a bit disappointing and
12.15 Synopsis
Bibliography
General Literature
Cooper NG (ed) (1994) The human genome project. Deciphering the blueprint of heredity.
University Science, Mill Valley
Kiely JS (1994) Recent advances in antisense technology. Ann Rep Med Chem 29:297–306
Lander ES et al (2001) Initial sequencing and analysis of the human genome. Nature 409:860–921
Monastersky GM, Robel JM (eds) (1995) Strategies in transgenic animal science. Blackwell
Science, Oxford
Mullis KB, Ferré F, Gibbs RA (eds) (1994) The polymerase chain reaction. Birkhäuser, Boston
Pandit SB, Balaji S, Srinivasan N (2004) Structural and functional characterization of gene
products encoded in the human genome by homology detection. IUBMB Life 56:317–331
Post LE (1995) Gene therapy: progress, new directions, and issues. Ann Rep Med Chem 30:219–226
Slagboom PE, Meulenbelt I (2002) Organisation of the human genome and our tools for identi-
fying disease genes. Biol Psychol 61:11–31
Venter JC et al (2001) The sequence of the human genome. Science 291:1304–1351
Wolff JA (1994) Gene therapeutics. Methods and applications of direct gene transfer. Birkhäuser,
Boston
Special Literature
Adams MD et al (1995) Initial assessment of human gene diversity and expression patterns based
upon 83 million nucleotides of cDNA sequence. Nature 377(Suppl 6547):3–174 (85 authors
including JC Venter)
Carlton JM et al (2007) Draft genome sequence of the sexually transmitted pathogen Trichomonas
vaginalis. Science 315:207–212
Chang MW, Barr E, Seltzer J, Jiang Y-Q, Nabel GJ, Nabel EG, Parmacek MS, Leiden JM
(1995) Cytostatic gene therapy for vascular proliferative disorders with a constitutively active
form of the retinoblastoma gene product. Science 267:518–522
Craig C (1995) Bristol-Myers to Pay $2.7M for transgenic goats that make human antibodies.
BioWorld Today 6:1
Explore the Homo sapiens genome. http://www.ensembl.org/Homo_sapiens/index.html
Fleischmann RD et al (JC Venter et al) (1995) Whole genome random sequencing and assembly of
Haemophilus influenzae Rd. Science 269:496–512
Human genome database with functional predictions
Schneiker S et al (2007) Complete genome sequence of the Myxobacterium Sorangium
cellulosum. Nat Biotech 25:1281–1289
Seide RK, Giaccio A (1995) Patenting animals. Chem Ind 16:656–659
Sippl W, Jung M (2009) Epigenetic targets in drug discovery methods and principles in medicinal
chemistry. In: Mannhold R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal
chemistry, vol 42. Wiley-VCH, Weinheim
13 Experimental Methods of Structure Determination
of the 1950s, the famous natural product chemist Leopold Ruzicka dismissively
told him that crystals are a “chemical graveyard.” Nonetheless, Dunitz and his
research group showed over many years that a crystal in no way belongs in
a “graveyard,” but rather is the key to understanding the structure, dynamics, and
reactivity of molecules.
If a mineral is considered, the regular construction of the single crystals stands
out. Even organic materials have the ability to form shapely crystals. One must only
think of the fascinating crystals of candied sugar. Is this external regularity
a representation of the inner structure? Before this question is answered, the way
that crystals are obtained should be clarified. A mineralogist has it easy. Nature has
already provided well-formed crystals over thousands or millions of years. Organic
molecules and proteins rarely occur in Nature in a crystalline state. Conditions must
be found under which they crystallize.
In general, crystals are grown from a solution. For simple organic substances this
can also be accomplished from liquid material or by sublimation. Both crystalliza-
tion methods are known from water when a lake freezes to ice, or from beautiful
crystals of frost. For crystallization from solution a solvent is sought in which the
compound is adequately soluble. By changing the conditions, the saturation point
of the solution is exceeded. If this occurs slowly, small crystal nuclei form that can
grow to large crystals. As a rule the solubility of the compound decreases with
falling temperature. The saturation point of the solution can be exceeded by
changing the temperature. The solution can also be “thickened”, that is, some of
the solvent is removed. Another possibility is the addition of a second solvent in
which the compound is less soluble. If the ratio of the two solvents is correctly
chosen, the saturation point can be slowly approached. For compounds with acidic
or basic groups, pH conditions can be found under which the compound exists as
a salt. Because of strong ionic interactions the salts often form better crystals. They
can be “salted out.” For this, a salt, for example, sodium chloride, is added to an
aqueous solution of the compound. The salt “uses up” the water molecules as it goes
into solution. It becomes surrounded by a solvation sphere of water molecules. In
doing so, the water is removed from the organic compound, which also has a sphere
of water surrounding it, the solvent. The saturation point of the compound is
exceeded, and the crystallization begins.
Proteins are complex entities that, as a general rule, are only soluble in water.
Because of their amino acid composition, they carry charged ionic groups on their
surfaces. Even with proteins it holds true that conditions must be found under which
they associate in periodic array. This is accomplished by slowly changing the
amount of water in which the protein is dissolved. This can work in both directions.
Hydrophobic proteins begin to aggregate when the amount of water increases.
Proteins that have stronger polar groups on their surfaces aggregate when the
water molecules are removed from their surfaces. Adjusting pH to find the right
value, the choice of suitable salt for salting out, and different temperatures are the
conditions that must be optimized. In addition to salts, surface-active substances
(detergents) can also influence the solvent shell and support the crystallization.
Despite this, crystallization is a kind of fine art. The search for suitable conditions
requires creativity and diligence. Today, however, the crystallization methods are
so advanced that the tedious work of setting up thousands of different test conditions
is carried out by robots.
Sometimes considerable effort is invested into structure determination. In 1995,
the crystallization and structure determination of HIV integrase, one of the key
enzymes in the replication cycle of the virus, was accomplished only after the
40th point mutation of the original protein. This point mutation was made with the
goal of changing the surface properties of the protein so that an orderly aggregation
to a crystal could occur.
13.2 Just Like Wallpaper: Symmetries Govern Crystal Packings
Let us return to the original question of whether the orderly outward appearance of
a crystal is a reflection of the internal construction. Chemically, a crystal is
homogeneously composed. The organic molecule or the protein represents the basic
building block. It is only when these building blocks are spatially neatly organized
that a periodic array occurs that optimally fills the space. In daily life, many solutions
to these packing problems are easily seen, for example, sugar cubes that only fit into
the box if they are layered in the right direction, or paving stones that must be neatly
laid in a periodic fashion to completely cover the path without gaps (Fig. 13.1).
Fig. 13.1 Paving stones cover a surface without leaving holes (a). This is only possible if they are
derived from a particular basic geometric pattern, for instance a parallelogram, rectangle, square,
triangle, or hexagon. This basic pattern can be modulated by complementary bulges and recesses.
A path cannot be covered without holes if regular pentagons or octagons are used. If an
octagonal stone is combined with a square stone, however, the surface is completely covered. It
is immediately clear that if a square stone is cut along its two diagonals, four triangles result.
By adding four such pieces, an octagon can be completed to a square in this way (b).
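The paving-stone argument of Fig. 13.1 can be checked with simple arithmetic: identical regular polygons close up around a shared corner only if their interior angle divides 360°. The following is a minimal sketch of that corner condition only; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

def interior_angle(n):
    """Interior angle, in degrees, of a regular polygon with n sides."""
    return Fraction(180 * (n - 2), n)

def closes_around_corner(n):
    """True if a whole number of regular n-gons fits exactly around one corner."""
    return (Fraction(360) / interior_angle(n)).denominator == 1

# Regular polygons with 3 to 12 sides that can pave a surface on their own
single_shape = [n for n in range(3, 13) if closes_around_corner(n)]

# Two octagons and one square meeting at a corner: 135 + 135 + 90 degrees
octagons_plus_square = 2 * interior_angle(8) + interior_angle(4)
```

Under this condition only triangles, squares, and hexagons qualify as single regular shapes, whereas the octagon needs the square as a partner to fill the full 360°.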
A single paving stone, when correctly fitted to the next, represents a repeating unit
in the lattice. A crystallographer refers to this unit as the elementary cell (unit cell), and to the
orderly setting of one unit upon another as periodic translation. In the
simplest organic crystal structure, the elementary cell contains one molecule (Fig. 13.2).
Fig. 13.2 In the simplest case, the molecular packing is accomplished purely by
shifting the molecule in all three spatial directions. The resulting unit, the elementary cell, is
derived from an irregularly angled body, a parallelepiped (above right, violet). If a point near the
molecule is picked out and the equivalent points of all molecules in the crystal packing are connected,
a three-dimensional lattice results.
The contents of an elementary cell can also be more complexly composed, for
example, like a wallpaper pattern. A basic motif is repeated so that it fills the surface
area. Crystallographers call the basic motif the asymmetric unit. In Fig. 13.3 this
motif is a flower branch. Not all of the motifs can be generated simply by shifting
the branch; some must additionally be reflected. A pair of image and mirror-image
branches represents the elementary cell. The surface can now be filled with this
building block by simply shifting it. In addition to being reflected, basic motifs can also
be rotated. By using reflections and rotations, both so-called symmetry operations,
the contents of the elementary cell are generated from the asymmetric unit. This cell
is stacked on itself in all three spatial directions to give an orderly formed crystal lattice.
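The construction described above can be sketched in a few lines of Python. The two-dimensional example below is only an illustration under stated assumptions: the motif coordinates, the operation names, and the helper functions are hypothetical and not taken from any crystallographic program. Symmetry operations act on fractional coordinates to fill one cell from the asymmetric unit, and pure translations then repeat the filled cell.

```python
# Each symmetry operation is a 2x2 matrix plus a shift, acting on
# fractional coordinates (positions measured in units of the cell edges).
IDENTITY = (((1, 0), (0, 1)), (0.0, 0.0))
MIRROR = (((-1, 0), (0, 1)), (0.0, 0.0))    # reflection across the line x = 0
ROT_180 = (((-1, 0), (0, -1)), (0.0, 0.0))  # twofold rotation about the origin

def apply_op(op, point):
    """Apply one operation and wrap the image back into the cell [0, 1)."""
    (r, t), (x, y) = op, point
    return ((r[0][0] * x + r[0][1] * y + t[0]) % 1.0,
            (r[1][0] * x + r[1][1] * y + t[1]) % 1.0)

def fill_cell(asymmetric_unit, ops):
    """Generate the cell contents from the asymmetric unit."""
    return [apply_op(op, p) for op in ops for p in asymmetric_unit]

def build_lattice(cell_contents, nx, ny):
    """Repeat the filled cell nx by ny times by pure translation."""
    return [(x + i, y + j) for i in range(nx) for j in range(ny)
            for (x, y) in cell_contents]

def determinant(op):
    """+1 for rotations, -1 for reflections (which create mirror images)."""
    r = op[0]
    return r[0][0] * r[1][1] - r[0][1] * r[1][0]

motif = [(0.25, 0.125), (0.375, 0.25)]       # hypothetical two-point "branch"
cell = fill_cell(motif, [IDENTITY, MIRROR])  # image plus mirror image
lattice = build_lattice(cell, 3, 3)
```

The determinant distinguishes the two kinds of operation: a rotation keeps the handedness of the motif, whereas a reflection turns it into its mirror image.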
Even as a three-dimensional entity, the elementary cell must take on a particular
form to completely fill all of the space. If the basic types of elementary cells are
combined with all of the possible symmetry operations, 230 possibilities result for
the basic motif to fill the space. The crystallographer calls them the 230 space
groups. For enantiomerically pure chiral molecules, and proteins belong to this group, mirror reflection
does not occur. Therefore proteins crystallize in only 65 of these space groups.
13.3 Crystal Lattices Diffract X-Rays
Max von Laue used crystals to prove the wave nature of X-rays (Roentgen rays) by
diffracting them. For illustration, we shall consider a water wave. When a drop of
rain strikes a puddle, circular waves form that propagate from the center outward.
The drop generates a so-called elementary wave upon submersion. If two drops that
are separated by a particular distance simultaneously strike the water’s surface,
circular waves propagate outwardly from both submersion points. This experiment is
easier to observe if the water’s surface is constantly being “excited,” for
instance, with a constantly dripping tap. The circular outwardly spreading wave
fronts meet each other at some point. What happens? A lamellar pattern forms: parts
of the water’s surface remain at rest and other parts seem to move vigorously
(Fig. 13.4). In the cross section the water surface moves sinusoidally (Fig. 13.5).
How do two waves behave that collide and superimpose with one another? If the
wave peak and another wave peak or the wave trough and another wave trough
meet, the wave is amplified. If, on the other hand, a wave peak meets a trough, they
cancel one another out. The water surface remains calm. The lamellar pattern of
moving and still water surface between waves that are moving outwardly and
inwardly is caused by this superimposition. It is called interference. The band
density depends on the distance between the submersion points of the drops. The
ensuing interference pattern therefore contains information about the relative posi-
tion of the points from which the elementary waves were generated.
Fig. 13.5 The waves run in a sinusoidal manner in cross section. The distance between two wave
peaks is called the wavelength. The height of the water wave at the summit is called the amplitude.
The position at which the wave crosses the resting position determines the phase. (a) If two wave
trains with the same phase meet, they add to one another and the amplitude doubles. This situation
occurs at the places in Fig. 13.4 where the water’s surface moves more strongly. (b) If there is a phase
difference of exactly one half of a wavelength, the wave peaks meet with the troughs. Both waves
cancel one another out. This represents the parts of Fig. 13.4 where the water surface is very still.
(c) Any other superimposed phase shift causes a wave, the amplitude of which is somewhere
between the extremes in (a) and (b).
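The three cases of Fig. 13.5 follow from the addition theorem for sines: two waves of equal amplitude a and equal wavelength, offset by a phase shift d, add up to a single wave of amplitude 2a·|cos(d/2)|. The short sketch below evaluates this closed form and checks it against a sampled sum; the function names and the sampling density are illustrative choices, not part of the original text.

```python
import math

def superposed_amplitude(a, phase_shift):
    """Amplitude of the sum of two sine waves of equal amplitude a and equal
    wavelength that are offset by phase_shift (in radians)."""
    # a*sin(x) + a*sin(x + d) = 2a*cos(d/2) * sin(x + d/2)
    return abs(2 * a * math.cos(phase_shift / 2))

def sampled_amplitude(a, phase_shift, n=3600):
    """Numerical check: largest value of the summed wave over one period."""
    return max(a * math.sin(x) + a * math.sin(x + phase_shift)
               for x in (2 * math.pi * k / n for k in range(n)))

in_phase = superposed_amplitude(1.0, 0.0)            # case (a): amplitude doubles
half_wave = superposed_amplitude(1.0, math.pi)       # case (b): half a wavelength
in_between = superposed_amplitude(1.0, math.pi / 2)  # case (c): intermediate
```

A phase difference of half a wavelength corresponds to d = π, for which the closed form vanishes up to rounding error.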
If parallel water waves (e.g., a wave front at the coast) collide with a barrier that
has a small opening (e.g., a harbor entrance) semicircular waves spread outward
from the backside. If this barrier has two neighboring openings (double slit),
a semicircular wave develops behind each opening. The same picture as with the
two raindrops is achieved (Fig. 13.4). The waves interfere with one another behind
the double-slit barrier, and a diffraction pattern forms. The density of this pattern,
that is, the progression of the bands, depends on the geometry of the double slit.
Formally, diffraction by a crystal lattice is analogous. The same
principles are valid, but the superimposition is more complex. A very simple lattice
shall be considered that only has one type of atom. An X-ray beam travels as a plane wave
toward this crystal. It strikes an array of atoms and initiates an interaction that
is comparable to that between the raindrop and the puddle. Each atom generates
a spherical wave because of the interaction between the atom’s electrons and the
X-ray. The circular wave on the water’s surface represents therefore the spherical
wave in space. The spreading spherical waves superimpose on one another and
form a wave that leaves the crystal in a changed direction (Fig. 13.6). Formally
seen, the incoming and outgoing waves have an angular relationship to one another
that is equivalent to the reflection of the wave in a plane perpendicular to the
13.4 Crystal Structure Analysis
Fig. 13.6 If a wave front (blue) in one plane meets with a row of atoms (black points on the dotted
lines), each atom in this row becomes the starting point for a circular wave. This is analogous to
those created when the raindrop hits the surface of a puddle. The circular waves that formed from
the back row of atoms superimpose upon one another just as in the case with the water waves
(Fig. 13.4). All circular waves are generated with the same phase in the indicated direction of the
incoming wave (a). As a result of this superimposition, a new wave front forms (red) that leaves
the crystal in an altered direction. Relative to the direction of the incoming wave, it forms an
angle that is formally a reflection of the incoming wave front on the atom row that is marked with
the green line. If a different incoming direction is taken the circular waves are not generated from
the same place (b), that is, there is a phase difference between them. Their superimposition does
not lead to a new wave front.
Fig. 13.7 A cluster of parallel planes can be laid through the atoms of a crystal lattice (a, b, c).
Their relative distance from one another and their atomic occupation density varies. Each one can
give rise to “reflections” in an X-ray diffraction experiment. For this the crystal must be brought
into the correct orientation for the incoming beam each time. The X-ray counter is positioned so
that it captures the out-going X-ray beam. It is from this geometry that the spatial orientation of the
cluster of planes in the crystal is determined. The occupation density of the atoms decides how
“well” a particular plane cluster reflects. This information is contained in the intensity (amplitude)
of the outgoing wave. (d) Different types of atoms in a molecular crystal have different spatial
relationships to one another. A parallel cluster of planes can be placed through each atom in the
molecule (here a three-atom molecule). The amplitude of the outgoing beam results from the
superimposition of wave trains that are reflected at these planes.
In the first two masks the distance and symmetry of the pinhole pattern are changed.
In the third and fourth masks the repeating motif of three or five differently
sized holes represents a molecule that has two types of atoms. These motifs
produce a periodic lattice when lined up next to each other. The lattice has the same
dimensions as in the first image on the left. If the diffraction pictures are
compared, the distribution of the intensity of the light points is different. That is
what contains the information about the construction of the motif that generated the
lattice. It is just this information that is used to determine the crystal structure.
Fig. 13.8 A perforated mask can be used for a diffraction experiment with a laser pointer. For this the displayed hole patterns (above) must be scaled down to the
order of the wavelength of the laser light. The diffraction patterns below were generated from the masks. The holes in the two left masks are all the same size, which
is comparable to having only one type of atom. The hole pattern changes from wide-meshed squares to an angular orientation. The diffraction patterns reflect
the symmetry and distance of the holes to one another. In the two masks on the right, the distance between the repeating units is identical to the first mask. The
composition of the motif in the repeating unit, however, varies. It is made up of multiple holes and can be compared to the different atoms in a molecule. The
distance between the diffracted light reflections (lower row) is identical for the first, third, and fourth masks. The intensity of the diffracted radiation, however,
varies from reflection to reflection. It contains the information about the composition and the geometry of the original motif.
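A one-dimensional analogue of the mask experiment can be reproduced numerically. The sketch below is a simplification under stated assumptions: the row of 64 sample points, the period of 8, the hole sizes, and all function names are hypothetical, and the diffraction intensity is modeled as the squared magnitude of a discrete Fourier transform of the mask.

```python
import cmath

def diffraction_intensities(mask):
    """Squared magnitude of the discrete Fourier transform of a 1D mask."""
    n = len(mask)
    return [abs(sum(m * cmath.exp(-2j * cmath.pi * k * x / n)
                    for x, m in enumerate(mask))) ** 2
            for k in range(n)]

def periodic_mask(motif, period, repeats):
    """Repeat a motif {offset: hole size} along a row with the given period."""
    mask = [0.0] * (period * repeats)
    for r in range(repeats):
        for offset, size in motif.items():
            mask[r * period + offset] = size
    return mask

def spot_positions(intensities, eps=1e-9):
    """Indices at which a diffraction spot of non-negligible intensity appears."""
    return [k for k, value in enumerate(intensities) if value > eps]

# Same lattice (period 8), different motif in the repeating unit
one_hole = diffraction_intensities(periodic_mask({0: 1.0}, 8, 8))
two_holes = diffraction_intensities(periodic_mask({0: 1.0, 3: 0.5}, 8, 8))
```

In this toy model both masks give spots at the same positions because the period is the same. The single-hole mask yields spots of equal intensity, whereas the two-hole motif modulates the intensity from spot to spot.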
The reflections, that is, the intensity of the individual light points in the
diffraction pattern, contain the information about the form of the molecule. There
is a mathematical technique, the Fourier transform, which can be used to translate
the diffraction pattern back to the generating motif. A Fourier transform is the
superimposition of many sine and cosine functions. The intensity of the diffraction
reflections determines the contribution of the functions, as does the phasing. The
importance of these aspects was already underscored in the interference of the waves
(Fig. 13.5). Unfortunately just this information about the relative phasing is lost in the
diffraction experiment. The diffractometer only registers the intensity of the reflec-
tions. The missing information is referred to as the phase problem of crystal
structural determination. It must be reconstructed for the individual reflections
by computational methods and by using appropriate measuring conditions. Frequently,
large electron-rich elements (e.g., heavy-metal ions) are embedded in the
protein (e.g., by coordinating to histidine or cysteine). These heavy atoms dominate
the diffraction pattern, and in doing so, they reveal their position in the crystal lattice.
Another method takes advantage of so-called anomalous scattering. This effect is
based on the interaction of X-rays with the electrons of heavy atoms in particular.
As a result, a wave that strikes such an atom is
scattered with a phase shift. Simply stated, it is returned with a delay. The effect
depends on the wavelength and can be exploited to determine the phasing. The
crystal is measured at a synchrotron (a particle accelerator that also produces electromagnetic
radiation in a broad wavelength range, including X-rays), and the diffraction
experiment is carried out at several different wavelengths. Anomalous scattering
requires that a heavy atom is contained in the protein structure. This is already
the case for metalloproteins. Often another approach is taken. Proteins that are
produced in a special expression system (▶ Sect. 12.6) can be generated with
selenomethionine instead of methionine. The heavier selenium serves as an anomalous
scatterer in the diffraction experiment. There are methods for small molecules
that allow a straightforward reconstruction of the phase information from the inten-
sity distribution, the so-called “direct methods.” The development of such methods is
being worked on for protein structural determination. Often an already-solved,
related protein structure can be utilized as a starting model for a structure determi-
nation (molecular replacement method). The model is translated and rotated in the
elementary cell by computer simulations until a calculated diffraction pattern is
produced that matches the diffraction pattern of the unknown protein.
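The phase problem can be made concrete with a one-dimensional toy model. The sketch below is an illustration under stated assumptions: the eight-point "density" with three atoms and all function names are hypothetical. It computes the Fourier coefficients of the density, keeps only their amplitudes as a detector would, and compares a Fourier synthesis with the true phases against one in which all phases are set to zero.

```python
import cmath

def structure_factors(density):
    """Discrete Fourier coefficients F_h of a sampled 1D density."""
    n = len(density)
    return [sum(d * cmath.exp(-2j * cmath.pi * h * x / n)
                for x, d in enumerate(density)) for h in range(n)]

def fourier_synthesis(amplitudes, phases):
    """Rebuild a density from amplitudes and phases (inverse transform)."""
    n = len(amplitudes)
    return [sum(a * cmath.exp(1j * p) * cmath.exp(2j * cmath.pi * h * x / n)
                for h, (a, p) in enumerate(zip(amplitudes, phases))).real / n
            for x in range(n)]

density = [0, 4, 0, 0, 6, 1, 0, 0]         # hypothetical "molecule" with three atoms
F = structure_factors(density)
amplitudes = [abs(f) for f in F]           # available: square roots of the intensities
true_phases = [cmath.phase(f) for f in F]  # not recorded in the experiment

recovered = fourier_synthesis(amplitudes, true_phases)
no_phases = fourier_synthesis(amplitudes, [0.0] * len(F))
```

Both maps reproduce the same amplitudes, and therefore the same intensities, but only the synthesis with the correct phases returns the original density.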
The phasing obtained at the beginning of the structural analysis with this method
is only approximate. Altogether the regeneration of the phasing information is not
trivial. Even in the 1960s, phasing calculations kept one scientist busy for several
years. The methodical progress and the increased performance of computers now
allow this to be accomplished in a few minutes. Even today, however, this step can
still be very challenging for proteins. Nevertheless, the
structure determination of medium-sized proteins is becoming routine. Historically,
the time span from crystallization to structure determination could be quite long.
13.5 Diffraction Power and Resolution Determine the Accuracy of a Crystal Structure
A picture of the contents of the unit cell is the result of the Fourier transform. It
is portrayed in terms of the electron density in space (Fig. 13.9). The detail with
which the electron density can be determined depends on the spatial resolution
with which the diffraction pattern was measured. In relation to the Fourier trans-
form, this is a question of the number of different wave fronts that were
superimposed upon one another in the correct amplitude and phase. It can be seen
in the diffraction pattern created with the laser beam (Fig. 13.8) that the intensity
clearly weakens toward the edges. The extent to which the diffraction pattern is
perceivable at the edges limits the accuracy with which the generating motif can be
spatially resolved. For small organic molecules, a resolution is easily achieved at
which the atoms are visible as distinct maxima in the electron density. If the crystal’s
quality is diminished due to lattice defects or disorder, the resolution is poorer. The
resolution of protein crystals is usually between 1.5 and 3 Å. In the best case,
a resolution is achieved that is in the order of magnitude of a bond length. The
poorer end of this range corresponds to the cross section of a benzene ring. Resolutions
better than 1 Å, however, have also been achieved (Fig. 13.9). In those cases many details are
recognizable, such as single hydrogen atoms or multiple arrangements of side chains.
Fig. 13.9 View of a crystal structure of aldose reductase (▶ Sect. 27.4). The electron density (the
so-called 2Fo–Fc density at the 1σ level) is displayed as a blue mesh at the predefined contour level
around a tryptophan residue. In (a) the diffraction data were obtained at a resolution of 4 Å, and
a Fourier transform was used to calculate the electron density. The resolution increases from (a)
4 Å to (b) 3 Å, to (c) 2 Å, and to (d) 0.66 Å. The resolution in the last-shown contour density is so
high that hydrogen atoms can be recognized as single density peaks in the difference density map
(positive is yellow, negative is violet Fo–Fc difference density, 2σ level). The electron density is so
clearly structured at 2 Å that it is simple to fit the indole building block in place. At 4-Å resolution
this assignment is problematic and can easily lead to errors.
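How the number of superimposed waves limits the detail of the map can be illustrated with a truncated Fourier series. The sketch below is a deliberately crude one-dimensional model: the cell length of 12 Å, the two point atoms 1.5 Å apart, and the function names are assumptions, and effects such as thermal motion are ignored. Only waves with a spacing of at least d_min enter the synthesis.

```python
import math

CELL = 12.0            # hypothetical 1D cell length in Å
ATOMS = (5.25, 6.75)   # two point "atoms" 1.5 Å apart, roughly a bond length

def density(x, d_min):
    """Fourier synthesis at position x using only waves with spacing >= d_min."""
    h_max = int(CELL / d_min + 1e-9)
    total = 0.0
    for h in range(-h_max, h_max + 1):
        for atom in ATOMS:
            total += math.cos(2 * math.pi * h * (x - atom) / CELL)
    return total / CELL

def atoms_resolved(d_min):
    """True if the density dips between the two atoms."""
    midpoint = sum(ATOMS) / 2
    return density(midpoint, d_min) < density(ATOMS[0], d_min)

low_resolution = atoms_resolved(4.0)    # few waves: the two atoms merge
high_resolution = atoms_resolved(1.0)   # many waves: separate maxima
```

In this model the two atoms merge into a single maximum at 4 Å, while at 1 Å the density between them drops and each atom appears as its own peak.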
At higher resolution the electron density maxima are directly assigned to the atoms
in the molecule (Fig. 13.10). In the beginning this assignment is crude because the phases used
in the Fourier transform are only approximate. The position of the detected maxima
must still be optimized. This is called “refinement of the structure.” For this the
experimentally observed diffraction pattern is compared with the diffraction pattern
that is calculated from the atomic positions of the preliminary model. If the measure-
ment is very accurate, the density of a “pseudomolecule” with spherical atoms can be
subtracted from the observed electron densities at the end of the structure determina-
tion. What remains is the electron distribution of the bonds between the atoms in the
molecule (Fig. 13.10). This is, however, only possible with very high-resolution
measurements. At lower resolution, as is the case in moderately resolved protein
structure determinations, a direct assignment of the atoms of the protein to the
electron density maxima cannot be made (Fig. 13.11). More commonly the course
of the chains is fitted to the electron density. Because proteins are constructed from
20 different amino acids that prefer to take on typical geometries, the interpretation
of the electron density is simplified (Fig. 13.11). As with low-molecular-weight
structures the model is iteratively refined, and the structural data improved.
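The comparison between observed and calculated diffraction data that drives the refinement can be sketched for a one-dimensional toy structure. The agreement measure used here is the crystallographic R-factor, a commonly used residual that the text does not name. The atom positions, the search window, and the function names are assumptions made for illustration, and the simple grid search stands in for the least-squares procedures used in practice.

```python
import cmath

def amplitudes(positions, h_max=8):
    """|F_h| for equal point atoms at fractional positions, h = 1..h_max."""
    return [abs(sum(cmath.exp(-2j * cmath.pi * h * x) for x in positions))
            for h in range(1, h_max + 1)]

def r_factor(f_obs, f_calc):
    """Sum of amplitude differences relative to the sum of observed amplitudes."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

def refine(f_obs, fixed, start, window=0.05, steps=101):
    """Grid search around the preliminary position `start` for the lowest R."""
    candidates = [start - window + 2 * window * i / (steps - 1)
                  for i in range(steps)]
    return min(candidates,
               key=lambda x: r_factor(f_obs, amplitudes([fixed, x])))

f_obs = amplitudes([0.10, 0.40])   # stands in for the measured data
r_start = r_factor(f_obs, amplitudes([0.10, 0.42]))   # crude preliminary model
refined = refine(f_obs, fixed=0.10, start=0.42)
r_final = r_factor(f_obs, amplitudes([0.10, refined]))
```

Starting from a slightly misplaced atom, the search moves it to the position at which calculated and "observed" amplitudes agree, and the residual drops accordingly.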
Electrons scatter X-rays. Therefore, the number of electrons around an atom
determines how well it is detected in the resulting density. Hydrogen atoms have
only one electron in their shell. As a consequence, they are often not located or are
located with poor accuracy in the electron density. Hydrogen atoms can be recognized
as densities in the structure determination of small molecules, but this is only
possible in protein structures if the resolution is better than 1 Å. Placing them is unproblematic as
long as it only concerns hydrogen atoms at positions that correspond to spatially
fixed positions on a rigid molecular scaffold, for instance, hydrogen atoms on phenyl
rings. It is more difficult if the hydrogen atom is on a conformationally flexible
group or on groups that can be protonated or deprotonated. It is good to know if
a carboxyl group is ionized, or if it exists as the free acid, and in which direction
the hydrogen atom is oriented. This information can only be indirectly gleaned from
the protein structure through an exact analysis of the spatial orientation of the
surrounding hydrogen-bonding partners.
The accuracy of the structure determination depends on the resolution of the
data that was obtained from a crystal. Even if the structure of the protein is
displayed on the computer screen like that of an organic molecule, its geometry is
much less accurately determined. The error margins in small-molecule determinations
are approximately 0.01 Å for bond lengths, 0.1° for bond angles, and 1°–2° for
Fig. 13.10 Crystals with an edge of 0.1–0.3 mm are needed for the structure determination
of small organic molecules. (a) A diffraction pattern is obtained in an X-ray beam (compare
Fig. 13.8) that is displayed on a photographic plate or (b) is registered with a diffractometer
counter. The molecule that generated this diffraction pattern, which is periodically arranged
in the crystal, is back-calculated from the reflections. (c) A Fourier transform is carried out
with approximate phasing, and a map of the electron density in space is obtained that is
contoured according to its height. The maxima are assigned to the atoms in the molecule
(here oxalic acid). (d) The spatial blurring of the electron density is associated with thermal
motion of the atoms. It is displayed with ellipsoids that represent the 50% probability of the
occupancy of each atom. (e) Crystals that scatter well allow the determination of the electron
density in the bonds between atoms. (f) The application of symmetry operations generates the
molecular packing in the crystal lattice. It delivers information about noncovalent interactions
between molecules.
Fig. 13.11 (a) The diffraction pattern of a protein crystal clearly shows more reflections. As they
are made up by larger molecules the unit cells comprise a bigger volume and exhibit more lattice
planes and therefore reflections. However, due to high solvent content and inherent flexibility of
the more complicated macromolecules the crystals give rise to poorer diffraction quality and the
data are registered to a lower resolution. (b) The enormous data flood is registered with an area
detector on a diffractometer. This allows the simultaneous registration of many diffracted inten-
sities. (c) A Fourier transform performed with phases from the first model delivers the distribution
of the electron density in space (blue mesh). Because no atomic centers are resolved in this density,
the trace of the protein chain (here a segment from a β-sheet of tumor necrosis factor, TNF) is fitted
to the electron density distribution. (d) Similarly to small molecules, the obtained model is refined
until all of the atoms of the protein fit optimally into the density. (e) The color-coded thermal
motion of the molecule is shown over the entire molecule. Blue to yellow to red color changes
show the transition from mild to severe movement. (f) Symmetry operations generate the molec-
ular packing in the crystal lattice. There are “empty” areas that are occupied by numerous water
molecules. Because of the strong thermal motion and the disorder that it causes, they are not found
in the electron density map.
13.6 Electron Microscopy
are large enough for an X-ray structure analysis has only worked a few times and
requires very special additives for the crystallization.
In recent times crystallization of membrane proteins has been successful in
lipidic cubic phases. Sophisticated mixtures of lipid, water, and protein can form
structured three-dimensional lipidic arrays that are pervaded by water channels.
Protein molecules diffuse into this structured yet flexible matrix, which facilitates
crystal nucleation and growth.
In addition to the work with readily obtained crystals, electron radiation has
another advantage over X-rays. It can be used for a diffraction experiment as well as
for the direct visualization of an object. The microscopic visualization is unfortu-
nately not possible with X-rays because a convergent lens cannot be built for
X-rays. This is successful for electrons because they can be focused by using
magnetic fields. Why not use an electron microscope to visualize molecules in
general? Despite the reduced radiation, electrons still damage the samples con-
siderably. Furthermore the crystals that are used represent about a millionth the
sample size that is used for X-ray structure analysis. The data for an X-ray
structure can be collected on one single crystal. In contrast, several hundred to
thousand tiny, often only 5-μm large crystals are needed for electron microscopy.
They are shock-frozen under high vacuum and directly exposed to the electron
beam. Proteins can only withstand these conditions after special preparation.
A very low radiation dose is worked with. Because of this, the images are very
noisy and must be averaged over many observations. To obtain a detailed reso-
lution in the plane perpendicular to the crystal’s plane, the crystal must be
measured in many orientations. Fine structural details are lost in doing this. The
analogous patterns in the electron diffraction diagram, as would be obtained in an
X-ray experiment, can be corrected by computational methods. With the help of
the Fourier transform, an electron density map of the molecule is obtained. Its
interpretation or refinement is accomplished in the same way as for the X-ray
experiment. The phasing that is necessary for the transform can be determined
from the images in electron microscopy.
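Why averaging over many observations helps with noisy low-dose images can be shown with a small simulation. This is a statistical sketch only: the one-dimensional "image", the noise level, and the number of 400 observations are arbitrary assumptions, and the alignment of the individual images that real data require is left out. For independent noise, the error of the average shrinks roughly with the square root of the number of images.

```python
import random

def noisy_image(signal, noise_sd, rng):
    """One observation: the true signal plus independent Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in signal]

def average_images(images):
    """Pixel-wise mean over a list of equally sized images."""
    return [sum(pixel) / len(images) for pixel in zip(*images)]

def rms_error(image, signal):
    """Root-mean-square deviation of an image from the true signal."""
    return (sum((a - b) ** 2 for a, b in zip(image, signal)) / len(signal)) ** 0.5

rng = random.Random(0)   # fixed seed so that the run is reproducible
signal = [1.0 if 40 <= i < 60 else 0.0 for i in range(100)]  # hypothetical object
single = noisy_image(signal, 2.0, rng)
averaged = average_images([noisy_image(signal, 2.0, rng) for _ in range(400)])

error_single = rms_error(single, signal)
error_averaged = rms_error(averaged, signal)
```

With 400 observations the expected noise level drops by a factor of about 20, so an object that is buried in a single image becomes visible in the average.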
The technique is relatively young and the methods are developing further.
There is more work to be done. Structural determination still takes several years,
and only a few laboratories have adequately powerful microscopes. Nonetheless,
the knowledge that we have about the structure of membrane-bound receptors
today is often based on the results that were achieved with this method
(▶ Chap. 30, “Ligands for Channels, Pores, and Transporters”).
13.7 Structures in Solution: The Resonance Experiment in NMR Spectroscopy
Many atomic nuclei have an angular momentum, or spin. The nuclei that occur in
biological systems that have a nuclear spin are the hydrogen isotope 1H, the carbon
isotope 13C, the nitrogen isotope 15N, the fluorine isotope 19F, and the phosphorus
isotope 31P. Just as a top would, these nuclei rotate about their axes. As long as no
magnetic field is applied, the tops orient in all possible spatial directions. In
a magnetic field they are forced into alignment (Fig. 13.12). If a toy top is spun,
it moves in the gravitational field. This field has one preferred direction. If the
rotation axis of the top and the direction of the gravitational field,
which is oriented toward the center of the Earth, are not exactly aligned, the top
wobbles. The end of the rotation axis performs a circular movement, an arc, with
a very precise rotational speed. It depends on the mass and geometry of the top. In
physics this movement is known as precession.
Fig. 13.12 Atomic nuclei with an angular momentum behave like a spinning top. In the absence
of an external magnetic field, they orient in all possible directions randomly (a). Upon application
of a magnetic field, they orient their rotation axes parallel or antiparallel to the direction of the field
(b). The precession movement is oriented in an arc around the applied field direction. The two
orientations, parallel or antiparallel, with respect to the direction of the field are energetically
different. Because of this, there is a small difference in occupancy between the two states. By
applying an electromagnetic field with a frequency that corresponds to the rotational speed of the
top’s axis, the occupancy can be inverted. This resonance absorption, the exact frequency of which
depends on the type of nucleus and its immediate chemical environment, is registered with
a spectrometer.
Atomic nuclei with a spin behave in a very similar way. In contrast to the
macroscopic top, they obey the laws of quantum mechanics. This means that the
rotation axes that their precession movement takes on can only adopt very specific
angles with respect to the applied field direction. The result for the 1H, 13C, 15N, 19F,
and 31P nuclei is that the rotation axis for the precession arc can only be parallel
or antiparallel to the direction of the field. The orientation in the direction of the
field is energetically somewhat more favorable than the orientation antiparallel to
the direction of the field. Statistically, therefore, more nuclear spins in the substance
sample will align with the direction of the field. If an additional oscillating electromagnetic field
is applied on top of the outer magnetic field, and its frequency corresponds to the
precession frequency of the nuclear spin, the occupancy of “parallel” and “antiparallel”
spinning nuclei can be reversed and a resonance absorption for the sample can
be registered. After a particular time span, the original situation is restored
(relaxation).
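The size of the effect can be estimated with two textbook relations that the text does not spell out: the precession (Larmor) frequency ν = γB/2π and the Boltzmann distribution for the two orientations of a spin-1/2 nucleus. The sketch below uses the gyromagnetic ratio of 1H; the field strength of 11.74 T and the temperature of 298 K are example values chosen for illustration.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant in J s
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K
GAMMA_1H = 267.522e6         # gyromagnetic ratio of 1H in rad s^-1 T^-1

def larmor_frequency(gamma, b0):
    """Precession (resonance) frequency in Hz at a field strength b0 in tesla."""
    return gamma * b0 / (2 * math.pi)

def population_ratio(gamma, b0, temperature):
    """Boltzmann ratio N(antiparallel) / N(parallel) for a spin-1/2 nucleus."""
    delta_e = H_PLANCK * larmor_frequency(gamma, b0)
    return math.exp(-delta_e / (K_BOLTZMANN * temperature))

frequency = larmor_frequency(GAMMA_1H, 11.74)    # close to 500 MHz
ratio = population_ratio(GAMMA_1H, 11.74, 298.0)
excess = 1.0 - ratio   # relative surplus of the more favorable orientation
```

Under these assumptions the two orientations differ in occupancy by less than one part in ten thousand, which illustrates how small the excess of spins aligned with the field is.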
The precession frequency is characteristic
for each type of nucleus. It depends further on the
chemical environment in which the nucleus resides. A carbon atom of a phenyl
ring has a different resonance frequency than that of an aliphatic chain. The
relative position of the resonance absorption in relation to a standard reference is
called the chemical shift. Furthermore the individual nuclei can perceive the
spin orientation of the neighboring nuclei. An alignment in the same direction as
a neighboring nucleus is energetically different from that of an antiparallel
orientation. This influence also modulates the rotational speed of the spin on
the observed nucleus. The information transfer regarding the orientation or the
magnetic state of the nuclei in the vicinity can be transmitted over several bonds.
This transfer can even occur through space without any direct covalent
connection.
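The chemical shift described above is commonly reported in parts per million (ppm) relative to the reference frequency. The sketch below assumes this customary ppm convention, which the text itself does not spell out. The function name and both frequencies are hypothetical.

```python
def chemical_shift_ppm(signal_hz, reference_hz):
    """Position of a resonance relative to a reference, in ppm.

    Assumption: the usual convention (signal - reference) / reference * 1e6.
    """
    return (signal_hz - reference_hz) / reference_hz * 1e6


# Hypothetical example: a signal 3650 Hz above a 500 MHz reference.
shift = chemical_shift_ppm(500_003_650.0, 500_000_000.0)
print(f"chemical shift: {shift:.2f} ppm")
```

Expressing the shift as a ratio makes it independent of the field strength of the spectrometer, so that values measured on different instruments can be compared.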
To measure a nuclear magnetic resonance (NMR) spectrum, a solution of the
substance has to be placed in a strong magnetic field. In addition, a variable
electromagnetic field is applied to the sample. The frequencies at which the nuclei
in the sample have resonance, meaning when they flip from parallel to antiparallel,
are recorded. The resulting spectrum discloses information about the composition
and the chemical environment around the studied nuclei. It contains information
about the spatial structure of the molecules under investigation. Based on the work
of Richard Ernst, multidimensional NMR techniques have been developed in the
last 30 years. By using suitable measuring conditions and selectively irradiating
electromagnetic fields, information about the mutual influence of resonance
frequencies among individual nuclei is separated and analyzed. This mutual
transfer of information about the magnetic state of neighboring nuclei is
apparent from the signal form of multidimensional spectra, where it is registered
as cross peaks. Only the hydrogen isotope 1H occurs in nearly 100%
natural abundance. Therefore, it can be assumed that for statistical reasons, two 1H
nuclei will always be adjacent to each other in a molecule. In contrast, the 13C and
15N isotopes are scarce. As a result, statistically they are only very rarely found in
the direct vicinity of one another. Data on the mutual influence of the magnetization
of these nuclei are required for the spectra. Therefore it is necessary to enrich the
proteins with the appropriate isotopes. For this, bacteria are fed with isotopically
labeled substrates such as glucose or ammonium chloride and will then produce
proteins that are isotopically enriched. It is even necessary to produce deuterated
proteins for the structural investigation of very large proteins. Today, by using
numerous spectroscopic techniques, spectra from proteins of more than 800 amino
13.9 How Relevant Are Structures in a Crystal or NMR Tube to a Biological System? 283
acids have been successfully interpreted. The following questions can be addressed
by NMR analysis:
• Which atomic nuclei occur in which chemical environment?
• What is in the immediate, covalently connected neighborhood of these nuclei?
Information about the spatial orientation of atoms in the vicinity is also
contained within these spectral parameters.
• Which geometric relationships are given between different segments of the
polypeptide chain? This results from information transfer about magnetic states
of nuclei that are not directly connected by covalent bonds.
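The statistical argument for isotopic enrichment made earlier in this passage can be illustrated with a short calculation. The sketch assumes that the isotopes are distributed independently over the atomic positions. The natural abundances are approximate literature values and the function name is hypothetical.

```python
def pair_probability(abundance):
    """Chance that two given neighboring positions both carry the isotope.

    Assumption: isotopes are distributed independently of one another.
    """
    return abundance ** 2


# Approximate natural abundances (fractions), not taken from the text.
NATURAL_ABUNDANCE = {"1H": 0.9998, "13C": 0.011, "15N": 0.0037}

for isotope, fraction in NATURAL_ABUNDANCE.items():
    print(f"{isotope}: {pair_probability(fraction):.2e}")
```

For 13C the chance of finding two labeled neighbors is roughly one in ten thousand at natural abundance, whereas it approaches one in a protein produced from fully labeled substrates.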
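[Figure 13.13: schematic of a polypeptide chain running from H2N to COOH with the nuclei A and B marked, shown stretched out and folded, together with a two-dimensional NMR spectrum whose axis is labeled from 0.0 to 10.0.]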
Fig. 13.13 A multidimensional NMR spectrum contains information about the spatial vicinity of
atomic nuclei in a molecule (here, the trypsin inhibitor from bovine pancreas). It is expressed in so-
called cross peaks. Information can be extracted about the distance between non-covalently bound
atoms in a molecule. The individual signals of the spectra are assigned to atoms in the molecule
(e.g., A and B). The positions that these atoms have in the polypeptide chain are known from the
sequence of the protein (above left). The intensity of the cross peak indicates which spatial distance
is found between nuclei A and B in the folded polypeptide chain (above right). Just as was done for
A and B, the many other cross peaks are evaluated and translated into distance conditions.
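How a cross-peak intensity might be translated into a distance condition can be sketched as follows. The sketch assumes the widely used isolated spin-pair approximation, in which the intensity falls off with the sixth power of the distance, and a calibration against a reference pair of known distance. Neither is spelled out in the caption, and the intensities and the reference distance are made-up numbers.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a distance from a cross-peak intensity.

    Assumption: intensity is proportional to distance**-6, calibrated with
    a reference pair of known distance (isolated spin-pair approximation).
    """
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Hypothetical example: the reference pair is 1.8 Å apart with intensity 100;
# the cross peak between A and B is 64 times weaker.
distance_ab = noe_distance(100.0 / 64.0, 100.0, 1.8)
print(f"estimated A-B distance: {distance_ab:.2f} Å")
```

Because of the sixth-power dependence, a 64-fold weaker peak corresponds to only a doubled distance, which is why such data are usually treated as approximate distance ranges rather than exact values.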
Fig. 13.14 The accuracy of an NMR structure depends on the density of the experimentally
determined atomic distances. These come from experiments that deliver information about the
exchange of the magnetic state of spatially adjacent, but not directly connected atoms (so-called
nuclear Overhauser effect, NOE). With the connectivity list and the NOE conditions, multiple
structural models are generated. These models represent the low-energy geometries that agree with
the spectral parameters. In the left part of the figure (a) the experimentally measured NOEs (black
dashed lines) are distributed over the 3D structure of a domain of the guanine nucleotide exchange
factor. For the sake of clarity, only the long-range NOEs are shown. Most of the amino acid side
chains are also suppressed; many of these NOEs therefore indicate the positions of atoms that are
not shown. In areas in which very few distances could be determined (e.g., in the green loop areas
or at the termini), the model is ambiguously defined. Multiple models are consistent with the
experimental data (b). The main chain of the protein fans out. In areas where a large number of
NOE conditions are found (e.g., the helices and the central β strand), the structural models diverge
only slightly from one another.
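The fanning out of the models described in the caption can be quantified atom by atom. The following minimal sketch assumes that all models are already superimposed and list the same atoms in the same order. The function name and the coordinates are hypothetical.

```python
import math


def per_atom_spread(models):
    """RMS deviation of each atom from its mean position across models.

    models: list of models, each a list of (x, y, z) tuples in the same
    atom order. Assumption: the models are already superimposed.
    """
    n_models = len(models)
    spreads = []
    for positions in zip(*models):          # same atom in every model
        mean = [sum(axis) / n_models for axis in zip(*positions)]
        squared = sum(
            sum((p[i] - mean[i]) ** 2 for i in range(3)) for p in positions
        )
        spreads.append(math.sqrt(squared / n_models))
    return spreads


# Hypothetical example: atom 1 is well defined, atom 2 differs by 2 Å.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(per_atom_spread([model_a, model_b]))
```

Large values flag regions, such as loops or termini, where few distance conditions were available, and small values flag the well-determined parts.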
crystals can make up to 70% of the crystal’s mass! Therefore, the crystal can also be
considered as a highly concentrated, ordered solution. NMR measurements also
require high concentrations. They are considerably higher than in biological
systems, but are still 10–100 times lower than in protein crystals.
The high water content of protein crystals offers the possibility to allow small
molecules to diffuse into the crystals. In the water channels, they move as they
would in an aqueous solution. In favorable cases, the binding pocket of the protein
is directly accessible from one of these channels. By placing the protein crystal
directly in a solution of the active substance (soaking), the latter can penetrate the
crystal through the channels, diffuse into the binding pockets, and dock there. Then
a new diffraction experiment is carried out with the loaded crystal. The reflections
are measured, and, based on the known structure of the protein, the electron density
map is generated. The density of the uncomplexed protein is subtracted from that
map. The difference density of the incorporated ligand remains. This information is
of essential importance for understanding the interactions between small molecules
and proteins. This still leaves open the question of whether the experimental
structure is really relevant for the biological conditions. Crystalline hemoglobin is
able to reversibly take up and release oxygen. Experiments on crystals of
purine nucleoside phosphorylase (PNP) showed that the enzyme is still catalytically
active in the crystal (Fig. 13.15).
The research group of Malcolm Walkinshaw at the University of Edinburgh
even showed, using the enzyme Cyp3, a peptidylproline isomerase, as an example,
that there is a quantitative agreement between the crystalline and solution
states. Different concentrations of an inhibiting prolyldipeptide were allowed to
diffuse into the crystal. Afterward, the occupancy of this inhibitor obtained from
the differently concentrated soaking solutions was determined in a crystallo-
graphic experiment. The binding constants were then ascertained from this
occupancy data. They quantitatively agreed with the inhibition constants that
were determined in a functional assay in solution.
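How a binding constant might be derived from crystallographic occupancies can be sketched as follows. The sketch assumes a simple single-site binding isotherm, occupancy = c / (Kd + c), and that the free ligand concentration equals the soaking concentration. This is not necessarily the procedure used in the study, and the data points are invented for illustration.

```python
def kd_from_occupancy(concentration, occupancy):
    """Dissociation constant from one soaking experiment.

    Assumption: single-site model, occupancy = c / (Kd + c).
    """
    return concentration * (1.0 - occupancy) / occupancy


def mean_kd(experiments):
    """Average the estimates over several (concentration, occupancy) pairs."""
    estimates = [kd_from_occupancy(c, occ) for c, occ in experiments]
    return sum(estimates) / len(estimates)


# Invented soaking series: concentration in mM and refined occupancy.
soaks = [(0.5, 1.0 / 3.0), (1.0, 0.5), (2.0, 2.0 / 3.0)]
print(f"estimated Kd: {mean_kd(soaks):.2f} mM")
```

In this invented series the occupancy reaches one half at 1 mM, so every data point yields the same estimate. With real data the individual estimates would scatter, and a least-squares fit would be the more appropriate choice.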
The diffraction data can be very quickly collected with even more intense,
so-called white X-rays from a synchrotron source (the so-called Laue technique).
With this experiment, it was possible to observe stable intermediates of enzyme
reactions. Structural changes of the two-dimensional crystals of the acetylcholine
receptor (▶ Sect. 30.4) could be observed with electron microscopy after loading
with the natural ligand. These and other experiments have shown that proteins
exist in the crystal lattice in a form that must be, at the very least, very similar
to the biologically active form.
13.10 Synopsis
• The most powerful methods to determine the spatial structure of molecules are
X-ray crystallography and NMR spectroscopy. The former requires the bio-
molecules to be arranged in periodic arrays in a crystal, and the latter studies
them in solution, usually in an isotopically labeled form.
[Figure 13.15: reaction scheme in which PNP converts guanosine and phosphate into guanine and ribose-1-phosphate, and a plot of reaction rate against time with the phases “crystal soaked” and “crystal removed” marked.]
Fig. 13.15 The enzyme purine nucleoside phosphorylase (PNP) transforms guanosine and
phosphate to guanine and ribose-1-phosphate. If a protein crystal is placed in a solution of the
substrate, the reaction begins. This could also have been caused by a partial dissolution of the
enzyme crystal. If the crystal is removed from the solution, the reaction stops. If the crystal is
brought back into the solution, the reaction carries on. This experiment demonstrates that even
crystalline enzymes are catalytically active. Therefore, a geometry must be present in the crystal
that corresponds to the biologically active form.
• Crystals need special conditions to grow from saturated solutions. They spatially
arrange in periodic arrays, and the molecules pack through translational
symmetry in three dimensions. In addition to the pure shifting of basic motifs,
usually one molecule that represents the asymmetric unit, symmetry operations
such as mirror reflection, two-, three-, four-, and six-fold rotation, or inversion
can be applied.
• Crystal lattices diffract X-rays and the diffraction experiment can be understood
as a three-dimensional interference of elementary spherical waves generated
at the positions of the atoms in the lattice. The diffraction phenomenon
at a 3D lattice can be treated formally as multiple reflections at crystal planes
in the lattice.
• Because the relative phases of the generated elementary spherical waves,
superimposed in the various reflections, are not accessible by experiment, they
Bibliography
General Literature
Friebolin H (2010) Basic one- and two-dimensional NMR spectroscopy. Wiley-VCH, Weinheim
Glusker JP, Trueblood KN (1985) Crystal structure analysis, a primer, 2nd edn. Oxford University
Press, New York
Glusker JP, Lewis M, Rossi M (1994) Crystal structure analysis for chemists and biologists. VCH
Publishers, New York
Pellecchia M, Bertini I, Cowburn D et al (2008) Perspectives on NMR in drug discovery:
a technique comes of age. Nature Rev Drug Discov 7:738–745
Wüthrich K (1986) NMR of proteins and nucleic acids. Wiley, New York
Special Literature
In drug design, the focus is on the ligand, which is generally a small organic
molecule with a molecular weight of under 500 Da. It undergoes interactions with
a macromolecular receptor and exerts an influence on the receptor’s characteristics.
On the other hand, the surrounding receptor can also determine the properties of the
bound active ligand. Selective interference in these interactions requires an
understanding not only of the ligand but also of the receptor. After the methods for the
structural determination of biomolecules were introduced in the last chapter, we
want to take a look at what can be learned about the construction principles and
characteristics of these molecules. Proteins are made up of 20 basic building
blocks, the amino acids (see Appendix 1). A dipeptide is formed by coupling two
amino acids through an amide bond. Larger peptides and proteins are formed by the
addition of further amide bonds.
The simplest molecule with an amide bond is formamide 14.1. Its structure is
shown in Fig. 14.1. This connection occurs many hundreds of times in proteins,
for instance, over 50,000 times in the shell of the rhinovirus. The bond lengths
between the carbon, oxygen, and nitrogen atoms can be obtained from the crystal
structure of formamide. The microwave spectrum of gas-phase formamide also
affords bond lengths, but different values are obtained. In the gas phase, formamide
is “isolated,” that is, it does not “perceive” any neighbors in its immediate vicinity.
The C═O double bond is shorter, and the C–N single bond is longer than in
crystalline formamide. In the crystal assembly, the individual formamide molecules
are not “alone.” They are connected to neighboring molecules by hydrogen bonds.
A hydrogen bond is a non-covalent interaction. It couples a functional group
carrying a hydrogen atom (e.g., NH or OH) with an electronegative heteroatom
(e.g., N, O; ▶ Sect. 4.2). Obviously, incorporating a molecule into a network of
hydrogen bonds causes a change in its geometry. The electron density between the
atoms is shifted so that the C═O double bond becomes longer and consequently weaker.
Simultaneously, the C–N single bond becomes shorter and stronger. Twisting the
molecule around this bond away from planarity is therefore made difficult.
Formamide: bond length in Å    C═O      C–N
Crystal assembly               1.241    1.318
Gas phase                      1.219    1.352
Fig. 14.1 Formamide 14.1 is the smallest molecule that has an amide group. Its molecular
structure is shown in the lower part. Because of thermal motion in the solid state, the molecule
carries out vibrational movements. Its electron density is therefore distributed over a larger area.
This is described by using ellipsoids that encompass the 50% probability of occurrence of the atom.
Two hydrogen bonds are formed between the carbonyl group and the amide group of
a neighboring molecule in the crystal packing. An extended H-bond network stabilizes the crystal
structure and polarizes the amide group. The bond lengths (in Å) are different in the crystal
assembly and in the gas phase (upper part).
The amide bond is the fundamental building block of proteins. Every third bond
in the polymer chain is an amide bond. As we have seen in formamide, they have
a planar geometry, that is, a plane can be defined through its atoms. The folding of
the polymer chain and the concomitant spatial construction of the protein are
determined by the torsion angles of the planes of the amide bonds against one another
(Fig. 14.2). The rigidity and planarity of the amide bond are decisive for the stability
of the spatially folded protein. In proteins, the amide bonds are almost exclusively in
the trans configuration. Only the rotation of the planes of the amide bonds against one
another remains as a degree of freedom for the polymer chain. These torsions
(▶ Chap. 16, “Conformational Analysis”) occur around the bonds at the Cα carbon atoms. As
was shown in the bond-length comparison between the gaseous and crystalline
formamide, the decisive additional stiffening of the amide bond is caused by its
incorporation into a hydrogen-bonding network.
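A torsion angle of the kind discussed above can be computed from the coordinates of four consecutive atoms. The following is a minimal sketch using the standard vector formula. The function names and the test coordinates are illustrative, and the sign convention assumed here is the common one in which a clockwise twist, viewed along the central bond, counts as positive.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_angle(p1, p2, p3, p4):
    """Signed torsion angle in degrees about the bond p2-p3."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Idealized test geometries: cis gives 0 degrees, trans gives 180 degrees.
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
print(cis, trans)
```

Applied to the atoms Cα, C, N, and Cα of a peptide bond, such a function would be expected to return values close to 180° for the trans configuration described in the text.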
14.1 The Amide Bond: Backbone of Proteins 293
[Figure 14.2: upper part, section of the peptide backbone with the dihedral angles φ and ψ at the Cα atom; lower part, Ramachandran plot with both angle axes running from −180 to 180.]
Fig. 14.2 The spatial course of a polypeptide chain is determined by the relative orientation of the
planar peptide bonds (above). The twist of these planes against one another is measured on
the basis of the two twisting or dihedral angles φ and ψ. These cannot take on arbitrary values around
the bond axes, but rather are limited to a few combinations of value ranges. In the diagram, a
so-called Ramachandran plot, the values for both angles (below) along the peptide chain are
plotted. The angle combinations for an α helix are found in the middle left (Fig. 14.3), and those for
a β-pleated sheet in the top left (Fig. 14.4).
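The assignment of an angle pair to one of the regions of the Ramachandran plot can be sketched in a few lines. The rectangular boundaries used below are rough, hypothetical values chosen only to match the positions named in the caption (helix in the middle left, pleated sheet in the top left). They are not taken from the text, and real analyses use more finely shaped regions.

```python
def classify_backbone(phi, psi):
    """Very rough assignment of a (phi, psi) pair given in degrees.

    Assumption: simple rectangular regions with illustrative boundaries.
    """
    if -100.0 <= phi <= -30.0 and -80.0 <= psi <= -10.0:
        return "alpha-helical region"
    if -180.0 <= phi <= -45.0 and 90.0 <= psi <= 180.0:
        return "beta-sheet region"
    return "other"


# Hypothetical angle pairs along a peptide chain.
for phi, psi in [(-60.0, -45.0), (-120.0, 130.0), (60.0, 60.0)]:
    print(phi, psi, classify_backbone(phi, psi))
```

A run of consecutive residues falling into the same region is a first indication of a helix or a strand. A single residue in a region proves little.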
294 14 Three-Dimensional Structure of Biomolecules
[Figure 14.3: panels (a) and (b) showing an α helix; the residues are numbered 1 to 11 and the side chains are labeled R.]
Fig. 14.3 The α helix is a commonly found secondary structure. The polypeptide chain forms
a right-handed spiral with a pitch of 5.4 Å, and 3.6 amino acids per turn (a). All carbonyl groups
(oxygen is red) are oriented parallel to the helix axis in the same direction. The NH functionalities
(nitrogen is blue, hydrogen is light blue) are oriented in the opposite direction. The groups form
a pronounced hydrogen-bond network (violet dashed line) between themselves (b). The side chains
(R) on the Cα atoms are on the outside pointing away from the helix axis. This forms a typical
furrow pattern that runs in a spiral over the surface. This “ridge and groove” pattern determines the
mutual packing of α helices in proteins.
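Two consequences of the helix geometry named in the caption can be worked out directly: the rotation from one residue to the next, and the pairing of residues in the hydrogen-bond network. The sketch assumes the usual pairing of residue i with residue i + 4. The function names are hypothetical.

```python
RESIDUES_PER_TURN = 3.6   # value given in the caption


def rotation_per_residue():
    """Angle in degrees by which each residue is rotated about the axis."""
    return 360.0 / RESIDUES_PER_TURN


def helix_hbond_pairs(n_residues, offset=4):
    """Pairs (i, i + offset) of residues linked by backbone hydrogen bonds.

    Assumption: residue i pairs with residue i + 4, numbering from 1.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


print(f"{rotation_per_residue():.0f} degrees per residue")
print(helix_hbond_pairs(11))
```

For the eleven residues shown in the figure this gives seven pairs, from (1, 5) to (7, 11). The first four NH groups and the last four CO groups of an isolated helix therefore lack a partner within the helix.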
Typically, the angles named φ and ψ are used to describe the two dihedral angles
around the Cα carbon atom, and these angles usually take on value pairs from two
ranges. These ranges are related to a helical or sheet-like course of the polymer
chain (Fig. 14.2). In an α helix with a right-handed turn, all CO and NH groups
orient in the same direction (Fig. 14.3). They form a network of H-bonds among
themselves. Each amino acid in the helix is in contact with the fourth amino acid
that follows it in the sequence. This unidirectional orientation of the polar groups of the
14.2 Proteins Fold in Space to Form α Helices and β Strands 295
[Figure 14.4: β strands in a stretched conformation with side chains R, and the hydrogen-bonding patterns between neighboring strands in the panels “Antiparallel” and “Parallel”; a distance of 7 Å is marked.]
Fig. 14.4 A second important secondary structure, the β strand, is composed of multiple sections
of the polymer chain that exist in a stretched conformation (top). The strands can run parallel or
antiparallel. They are crosslinked to each other via hydrogen bonds (violet). The sheet-like
structure displays a zigzag wrinkle and is called a β-pleated sheet. The side chains (R) of the
amino acids point away from, and alternate above and below the pleated sheet.
Fig. 14.5 Within a β-pleated sheet of multiple strands, here shown with a parallel orientation,
a right-handed twist occurs. For simplification, the single β strands are indicated with an arrow.
The twist can be seen by the internal rotation of the arrow. The pleated sheet here is shown in two
perpendicular views.
chains alternate above and below the pleated sheet. The entire strand is slightly
twisted upon itself. Because of this a pleated sheet of multiple strands has a twist to
it when viewed from the side (Fig. 14.5).
Aside from these two common secondary structures, other typical combinations
of torsion angles occur. A polymer chain that folds to a globular structure in space
must reverse its direction. This is achieved in the so-called turn or loop region.
Turns can be classified according to the number of involved amino acids and the
type of interaction that closes the turn. Loops that form a C═O···H–N hydrogen
bond in the direction of the polymer chain, inverse turns with hydrogen bonds in the
reversed orientation, and open turns in the chain that are held together by van der
Waals interactions and polar interactions can be distinguished from one another
(Fig. 14.6). A total of 158 turn classes were summarized in a recent evaluation by
Oliver Koch.
What force drives the organization of a protein? Amino acids possess
hydrophilic and hydrophobic side chains. Hydrophobic groups avoid aqueous
environments (▶ Sect. 4.2). During the folding of the polymer chain in an aqueous
medium the hydrophobic amino acids aggregate to diminish their common hydro-
phobic surface. That is why the hydrophobic amino acids are predominantly
found in the inside of a folded protein. The polar groups of the amide bonds of
the main chain become saturated in the secondary structure by hydrogen bonds.
The side chains of polar amino acids are only found on the inside of a protein if
they can form a polar interaction with another amino acid in the vicinity. Other-
wise they orient themselves on the outside of a protein; they protrude into the
surrounding water. Proteins can also span a cell membrane. In those areas where
they have contact with the membrane they have a large, cohesive hydrophobic
surface (Sect. 14.7). If the packing density in the interior of the protein is
14.3 From Secondary to Tertiary and Quaternary Structure 297
Fig. 14.6 The polymer chain of a globular protein reverses its direction in the loop or turn area.
Numerous turn patterns have been found. They are made up of 2–6 amino acids. Normal turns
(left) form a C═O···H–N hydrogen bond (violet) in the direction of the polymer chain. This
hydrogen bond has a different order in inverse turns (middle). Another group of open turns
(right) is held together by van der Waals contacts and polar interactions.
Fig. 14.7 The course of the polypeptide chains is symbolized with spirals for α helices, with
arrows for β-pleated sheets, and with threads for different turn segments. Approximately 30% of
the structurally known proteins can be assigned to one of the nine shown folding classes. The first
folding pattern (bottom left) is a “TIM barrel,” and the one above is an open pleated-sheet structure.
14.4 Are the Fold Structure and Biological Function of Proteins Correlated? 299
groups that protrude into the binding pocket. The folding pattern in the vicinity
of the binding pocket, however, exerts an influence on the properties that are
found there. For example, a helix that is arranged toward the binding pocket
decisively determines the local electrostatic potential. Even this can be exploited
for the design of selective ligands that bind only to proteins of a particular
folding class.
Despite progress in structure determination techniques, it can
occur that the structure analysis of an important protein fails, but the structure of,
for example, a related protein can be solved. A model of the desired protein can be
built on this basis (▶ Sect. 20.5). Information about the construction and folding
principles of proteins is needed for this purpose. It allows an understanding of
which parts of the protein stabilize the scaffold, which parts determine function, and
which parts make up the differences between homologues.
An in-depth discussion of these principles would go too far here. As an example,
the folding pattern of the β barrel will be examined. A stretched-out sheet of
multiple β strands has an internal twist (cf. Fig. 14.5). If, as an example, eight such
strands are lined up next to one another, a cylinder is formed. This barrel-like
folding pattern of eight or more strands is often observed. Several variations of
this folding pattern are displayed in Fig. 14.8; they show how, and according to
which principles, a polypeptide can spatially fold.
A loop acts as a connecting element between the pleated sheet strands of the β
barrel in the example in Fig. 14.8. α Helices can also serve as connecting elements
(Fig. 14.7). A barrel-like structure forms, and the bridging α helices
align on its surface. This folding pattern was first discovered in triosephosphate
isomerase. It is therefore called a TIM barrel (Fig. 14.7). Another important
folding class that is made up of α-helical and β-pleated sheet segments is the
open-sheet structure (Fig. 14.7). In this class the pleated sheet does not close to
a cylinder but rather it remains open. Helices group above and below the sheet.
How is the structure of a protein coupled to its function? Do all proteases, for example,
display the same folding pattern? A large number of enzymes that have distinctly
different functions all belong to the TIM barrel type, or the open-sheet structure.
There are many oxidases, isomerases, kinases, aldolases, synthases, dehydrogenases,
or proteases that can be assigned to these two classes. Here, Nature started from
a common origin and developed divergently. Consequently, the function of a protein
is not necessarily coupled to a particular folding pattern. If the construction of the
enzyme is analyzed further, it turns out that the catalytic sites of the proteins of
a folding class are at the same position. This is found at the C-terminal end of the
barrel in the TIM-barrel structure, and at the topological switch of the connecting
helices from the upper to the lower side of the open-sheet structure (Fig. 14.9).
[Figure 14.8: topology diagrams (a) to (c) of β-barrel folds; the strands are numbered 1 to 4 or 1 to 8, and the chain termini are marked N and C.]
Fig. 14.8 The folding pattern of different β-barrel structures can be thought of as
a polymer chain with eight separate β strands (arrows). These are separated by loop areas.
(a) An up-and-down barrel forms when the folding of the polymer chain of eight β strands follows
a zigzag pattern. The antiparallel sections form hydrogen bonds between themselves that close
up to form a cylinder. (b) The four-β-strand polypeptide chains lie next to one another so that the
first chain interacts with the fourth, and the second interacts with the fifth. Then the double
strand folds and the first pair comes to lie next to the second. Because the course of the polymer
chain is reminiscent of the engravings on Greek vases, the pattern is called a Greek key. Two
such patterns can come together into a cylinder-like orientation and form a Greek-key barrel.
(c) Another folding pattern is formed from a double strand that is placed together with
an internal twist. The double strand wraps itself into a cylinder-like structure that is called a
jelly roll.
Fig. 14.9 The folding-pattern-determining and function-carrying amino acid groups are found in
proteins in different regions. (a) The catalytic site (yellow spheres), which binds and transforms
substrates, lies in a TIM-barrel-type structure (α helices: red cylinders, β strands: light-blue arrows)
at the end of the barrel where one would expect to find a lid. The loops of the polymer chain that
surround this “lid” (gray and green threads) carry the function-determining amino acids. (b) In the
open-pleated-sheet structure, the function-determining amino acids in the loop area occur
where the attached helices change from the top to the bottom of the pleated sheet.
The function-determining amino acids occur in the loop area between neighboring
pleated sheets and helices. Why would Nature follow this principle of separating the
folding structure from the function? The amino acids that enable the stable folding of
a domain are separated from those that induce a specific function. This approach is
a very efficient evolutionary strategy. Two areas were simultaneously optimized:
• The stability of the protein scaffold in special folding patterns
• The layout of the amino acid sequence to serve a special function.
Spatially separating and displacing the function-carrying groups in the structur-
ally less-committed loop areas allowed the two tasks to be optimized in parallel.
Exchanging a single amino acid in a secondary structure element could destabilize
the entire folding pattern and stop the folding. This is avoided if the amino acid
sequence that is to be functionally optimized is placed on a stable scaffold that does
not interfere with the optimization.
The immunoglobulins are a protein class that implements this principle to perfection.
As antibodies they recognize and bind to xenobiotics, the antigens. To remove
an antigen, immunoglobulins with highly specific binding pockets and high affinity
must be available within a few days. The recognized substances could be anything
from small organic molecules to large proteins. It is estimated that
about 10¹² different variable sequences are formed, based on only about 25,000
human genes. The difficult task of achieving such high diversity is solved by
immune-system cells by using a combination of different variable gene segments
and extensive amino acid exchange in these segments during lymphocyte maturation.
In this way, variable loop areas are formed that are set upon a stable scaffold of
Fig. 14.10 The immunoglobulins form a highly specific binding pocket in which they recognize
antigens, which are exogenous substances. The enormously large structural variety of these binding
pockets is achieved by variations in the amino acids in the loop areas. The immunoglobulins
have a Y-like form that is divided into a trunk (constant Fc domain) and two identical Fab branches
(a). The course of the polymer chain in these branches corresponds to the barrel type. The antigen-
binding site is indicated by an arrow. Picture (b) is an enlargement of the circled branch in
(a). Loops are found at the right end (colored) that are responsible for the recognition of exogenous
substances. They grasp the antigen (here dark red) like the fingers of two hands.
barrel-like pleated sheet structures (Fig. 14.10). The therapeutic value of such bio-
molecules (so-called biologicals) has been recognized. Many humanized antibodies
can be found in development as therapeutics (▶ Sect. 32.3).
[Figure 14.11: binding pockets S3, S2, S1, S1′, S2′, and S3′ of a protease.]
Fig. 14.11 The side chains of a peptide substrate and the binding pockets that they belong to
are classified on the N-terminal side of the peptide as P3, P2, P1… or S3, S2, S1… (left); on the
C-terminal side they are classified as P1′, P2′, P3′… or S1′, S2′, S3′… (right).
in the 3D structure. The S3 and S4 binding pockets in the serine protease thrombin
are really only one large pocket (▶ Sect. 23.3). It can also happen that a substrate
amino acid has no complementary binding pocket in the enzyme. It then pro-
trudes into the water.
Peptides are easily synthesized with enormous diversity (▶ Sect. 11.5). If the
peptide is attached to a probe that changes its color or fluorescence upon release
(▶ Sect. 7.2), the labeled peptide can be used to ascertain the substrate profile of
the protease. For this purpose a large library (▶ Sect. 11.1) of these peptides is
offered to the protease, and the members that are well cleaved are identified. In
Fig. 14.12 the amino acid composition of a labeled tetrapeptide is given that is
preferably cleaved by the proteases trypsin, factor Xa, plasmin, and chymotryp-
sin. Peptides with basic groups such as arginine or lysine are preferably cleaved
by trypsin, plasmin, and factor Xa. Factor Xa converts peptides with arginine in
the P1 position almost exclusively. Chymotrypsin behaves entirely differently. It
prefers to have aromatic amino acids such as tyrosine, phenylalanine, and tryp-
tophan in the P1 position. The selectivity at the positions P2 to P4 is not nearly as
pronounced. Trypsin transforms tetrapeptides that have branched groups at P2
such as Phe, Tyr, Trp, Ile, or Val much more poorly if an arginine is at the P1
position. Basic groups are also less preferred. Trypsin shows virtually no selec-
tivity at the P3 and P4 positions. Factor Xa has a particular preference for the small
glycine at position P2, but hardly any difference at all is seen for the groups in the
[Figure 14.12: (a) cleavage profiles of trypsin, factor Xa, plasmin, and chymotrypsin for the varied P1 position of the labeled tetrapeptide (positions P4 to P1); (b) profiles of trypsin and factor Xa for the positions P4, P3, and P2 with P1 held constant. The amino acids are plotted in one-letter notation (n, norleucine).]
Fig. 14.12 A tetrapeptide library, held constant in positions P2 to P4, was varied at position P1 with
19 amino acids (one-letter notation; n, norleucine). It is cleaved by trypsin after arginine and lysine,
by factor Xa after arginine, and by plasmin after lysine (a). If arginine is held in position P1 and the
remaining three positions are varied, trypsin shows practically no selectivity for the amino acids at
P2, P3, and P4. On the other hand, factor Xa prefers a glycine in position P2 (b).
14.7 When Crystals Learn to Walk 305
P3 position for this enzyme. On the other hand, different groups in the P4 position
are more strongly selected. The substrate-binding profile helps to expose the
selectivity characteristics of enzymes. They display the complementary proper-
ties in the binding pocket and help to inspire the first ideas about the design of
imaginable inhibitors.
This concept was applied to cysteine proteases in the research group of
Jonathan Ellman at the University of California at Berkeley. Substrate molecules
were synthesized that carried a fluorescence marker at the end of an amide bond
that was to be cleaved. Different organic building blocks were placed on the other
side. If such a substrate molecule is cleaved by the protease, the organic part
must be bound in the binding pocket of the enzyme. Therefore, the transformation
indicates the binding of a test molecule. The method can be optimally used
for screening. A hit that is discovered in this way can easily be chemically
transformed from a substrate molecule to an inhibitor. If the cleaved amide bond
is replaced with, for instance, an aldehyde function, a cysteine-protease inhibitor
(▶ Sect. 23.9) can be developed that has very little in common with the peptide
substrate.
What kind of information about the dynamics and reactivity of molecules can be
extracted from a crystal structure? Molecular vibrations are visible even in the solid
state. This is reflected in the blurriness of the electron density. If a molecule takes
part in a reaction, bonds are broken and new ones are formed. The formation and
cleavage of amide bonds is a central task in biochemical processes. The molecule
14.2 contains an amide and an ester group (Fig. 14.13). If a crystal of this compound
is exposed to thermal energy, a reaction takes place in the solid state to form 14.3.
The molecule is in a geometry in the incipient crystal structure that is conducive for
entry into the reaction pathway.
Having information about changes in the geometric orientation of functional
groups in the chemical reaction is decisive for understanding the concomitant
structural changes that occur. This knowledge is a prerequisite for the design of
transition-state-analogue inhibitors (▶ Sects. 6.6 and ▶ 22.3). In view of the for-
mation or cleavage of an amide bond, the question is posed: from which direction
does the amino group attack the carbonyl carbon in the course of the nucleophilic
addition to form a new bond?
In the early 1970s Hans-Beat Bürgi and Jack Dunitz began to extract information
about the geometric changes along such reaction steps from crystal structures. Before
there were movies and television, people developed creative ideas to set pictures in
motion, for example, with flip-books (Fig. 14.14). These impart the impression of
the dynamic sequence of a story. Let us imagine that because of frequent use, the
pages of the little book have fallen apart and are now in disarray. You must bring
them into the correct order again. Ordering criteria are needed in this case. A similar
task is posed for the organization of structural data to describe a reaction. Particular
crystal structures are sought from databases of known crystal structures (▶ Sect. 13.9)
in which an amino group is in the vicinity of a carbonyl group, as in the structure of
14.2. Finally they are brought into a logical order (Fig. 14.15).
Fig. 14.13 If thermal energy is applied to a crystal of 14.2, the carbonyl group of the ester
function reacts with the amide NH2 group and an imide bond is formed between N(1) and C(8) to give
14.3 (a). A vibrational motion must be implied (b) that leads into the reaction. Simultaneously
the ester bond between C(8) and O(2) is cleaved during the reaction steps.
The systematic comparison of crystal structure data affords a first understanding
of structural molecular properties, for instance, about the preferred conformation
(▶ Sect. 16.4). The geometry of non-covalent interactions can also be evaluated this
way. The side chain of the amino acid histidine contains an imidazole ring with its
two nitrogen atoms. In the neutral state one of these nitrogen atoms is a hydrogen-
bond acceptor, and the other is a donor. There are hundreds of molecules with an
imidazole ring in the database of low-molecular-weight crystal structures. In these
structures the imidazole ring has, in fact, acceptor and donor interactions, usually
with neighboring molecules. All these structures are superimposed upon one another
based on their common imidazole ring (Fig. 14.16). It shows in which spatial
direction the imidazole nitrogen atom’s hydrogen-bonding partner is found. The
task of estimating the possible interaction positions in the binding site of the protein
for the functional groups of a ligand is undertaken in the course of de novo drug
design (▶ Chap. 20, “Protein Modeling and Structure-Based Drug Design”). Fur-
ther, this information is needed for comparing the binding properties of molecules
(▶ Chap. 17, “Pharmacophore Hypotheses and Molecular Comparisons”) or for the
exploration of binding pockets for their preferred ligand-binding sites (hot spots).
Fig. 14.14 A story is shown in static pictures in flip-books. If the different pages of this story flip past the eyes quickly enough, the impression of a dynamic
process is given.
Fig. 14.15 The formation or cleavage of an amide bond occurs by nucleophilic addition.
A nucleophile, for instance, an oxygen or nitrogen atom, approaches the planar carbonyl carbon
atom. During the reaction the carbon atom rises out of the plane of its three neighbors and adopts
a tetrahedral configuration. Examples were sought from low-molecular-weight crystal structures in which
a nitrogen atom approaches a carbonyl group at a distance between that of a single bond and that of
a van der Waals contact in the crystal packing. By superimposing these data it is recognizable that the approach
of the nucleophilic nitrogen towards the carbonyl group is “performed from back and behind.” With
this approach the carbon migrates out of the plane in the direction of the nucleophile. The
geometry of this reaction step also determines the structural composition of the catalytic center
of a variety of hydrolases (▶ Sect. 22.3).
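The two geometric quantities used to order such crystal structures, the angle of approach of the nucleophile and the displacement of the carbonyl carbon out of the plane of its neighbors, can be computed directly from atomic coordinates. The following is a minimal sketch; the function names and the example coordinates are hypothetical, and the approach angle of about 107° used in the usage note is the commonly quoted literature value, not a number taken from the text above.

```python
import math

def sub(a, b):
    """Vector a - b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def attack_angle(n, c, o):
    """Angle N...C=O in degrees, measured at the carbonyl carbon."""
    v1, v2 = sub(n, c), sub(o, c)
    return math.degrees(math.acos(dot(v1, v2) / (norm(v1) * norm(v2))))

def out_of_plane(c, o, r1, r2):
    """Unsigned distance (same unit as the input) of the carbonyl carbon
    from the plane through its three neighbors O, R1, and R2."""
    normal = cross(sub(r1, o), sub(r2, o))
    return abs(dot(sub(c, o), normal)) / norm(normal)
```

Applied to a series of crystal structures, a decreasing N···C distance would be expected to go along with an approach angle near 107° and a growing out-of-plane displacement; sorting the structures by the N···C distance then plays the role of the ordering criterion for the flip-book pages.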
It was shown in Sect. 14.4 that the amino acids that determine the folding
and function of a protein occur in separate parts of the structure. For enzymes
with the same function, Nature has come to the same solution, however, by different
folding.
The function and therapeutic meaning of serine proteases will be discussed in
more detail in ▶ Chap. 23, “Inhibitors of Hydrolases with an Acyl–Enzyme Inter-
mediate”. A unit of three amino acids, the so-called catalytic triad, plays a key role
in accelerating the hydrolysis of amide bonds by these enzymes. The two amino
acids serine and histidine, and an acidic amino acid, such as aspartic or glutamic
acid, are found in a characteristic spatial orientation. They are defined by the narrow
boundaries that are established by the reaction geometry required for a nucleophilic
addition (Sects. 14.6 and ▶ 23.2). Their composition is ideally suited for the
cleavage of amide bonds.
The enzyme trypsin is constructed from two barrel-like subunits (Fig. 14.17a).
The catalytic site is located at the interface of these two subunits. Subtilisin is
another serine protease that belongs to the class of open-pleated-sheet structures.
The catalytic triad occurs in a loop area at the edge of the pleated sheet
(Fig. 14.17b). If the amino acids that are involved in the catalysis are removed
from the protein and superimposed in space, the identical geometry of the triad is
obvious. In addition to the mentioned enzymes, this catalytic triad is also encoun-
tered in lipases and esterases (▶ Sect. 23.7), which also cleave peptide or ester
bonds. Although they display divergent scaffold folding, the geometric orientation
of their triads is once again identical.
Fig. 14.17 Trypsin (a, red) and subtilisin (b, green) are serine proteases. They have the same
catalytic triad of serine, histidine, and aspartic acid. These function-determining amino acids are,
however, placed upon entirely different folding patterns. In the above-right picture, the course of
the chain of both proteins is superimposed upon one another (c). Despite this, the side chains of the
amino acids of the catalytic triad are in the same spatial position (d). The course of the polymer
chains is shown with colored ribbons, together with the spatial orientation of the side chains of the
three catalytic amino acids.
Fig. 14.18 The DNA molecule is built of single stair steps. A base pair forms each step.
The sugar phosphate chain suspends the steps like a double banister. It forms a major and
a minor groove on the surface. (a) A segment of DNA with 14 base pairs, (b) a schematic
representation with the sugar phosphate backbone as a gray arrow, thymine (light blue), adenine
(red), cytosine (violet), and guanine (light green). (c) A model of a DNA surface in which the
size difference between the minor and major grooves is emphasized. The individual bases
align according to their interaction properties (blue: H-bond donor, red: H-bond acceptor, gray:
hydrophobic contact).
from the major groove, from the side (cf., ▶ Sect. 28.2). Only there is it possible to
read the prescribed code (AT, TA, GC, CG) unambiguously. Due to the many
outwardly oriented phosphate groups, the DNA molecule is heavily charged.
This charge is neutralized by the formation of ion pairs, mostly with magnesium.
Because of its important role in the mediation of genetic information, several
important drugs act on DNA. Two examples are briefly mentioned here. Cisplatin
14.4 is a reactive metal complex that can react with the nitrogen atoms of
two nucleobases on two adjacent steps of the DNA by exchanging both chlorine
substituents (Fig. 14.20). This crosslinking distorts the DNA in such a way that the
sequence information is no longer readable. Cisplatin and analogous derivatives such
as carboplatin are used in cancer therapy as potent chemotherapeutics. Daunorubicin
14.5 is a representative with a somewhat different mode of action, but it also prevents
the reading of the DNA base pairs. By slightly spreading the DNA along the chain the
planar molecular part of 14.5 slips largely between two adjacent base pairs and causes
a structural distortion of the DNA (intercalation). This intravenously administered
cytostatic is used in combination regimens for the treatment of acute
leukemias. Many natural products also use this so-called intercalation mechanism for
Fig. 14.19 The DNA base pairs of cytosine (C) with guanine (G) and thymine (T) with adenine
(A) on the individual steps are formed by complementary hydrogen bonds. Each base carries a sugar
phosphate group that is coupled with the polymer chain. It affords a double-helical construction
with a minor (green) and major (yellow) groove (cf. Fig. 14.18). If viewed parallel to the steps,
four groups can be seen in the major groove that act as hydrogen-bond donors (blue) or
acceptors (red) or have hydrophobic properties (gray). Three such groups are aligned in the minor
groove. If an attempt is made to read the interaction pattern from this side, a GC and a CG pair, or an AT
and a TA pair, are recognized as identical. Here the orientation of the interaction pattern cannot be
distinguished. In the major groove, on the other hand, the pattern of exposed interactions is
unambiguous. Therefore, proteins read information about the DNA from the major groove.
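The argument of the caption can be checked with a few lines of code. The sketch below encodes the groups exposed in each groove as strings, using the three classes of the figure (A = acceptor, D = donor, H = hydrophobic). The letter codes and the concrete patterns are an assumption taken from the standard textbook description of the base-pair edges, not from the figure itself, and the names `MAJOR`, `MINOR`, and `distinguishable` are hypothetical.

```python
# Interaction patterns read across one base-pair step, written from the
# first-named base to the second (A = acceptor, D = donor, H = hydrophobic).
# Assumed from the standard description of the base-pair edges.
MAJOR = {"AT": "ADAH", "TA": "HADA", "GC": "AADH", "CG": "HDAA"}
MINOR = {"AT": "AHA", "TA": "AHA", "GC": "ADA", "CG": "ADA"}

def distinguishable(patterns):
    """True if every base pair shows a unique pattern in this groove."""
    return len(set(patterns.values())) == len(patterns)

def ambiguous_groups(patterns):
    """Group the base pairs that share one and the same pattern."""
    groups = {}
    for pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(pair)
    return [sorted(pairs) for pairs in groups.values() if len(pairs) > 1]
```

With these assumed patterns, `distinguishable(MAJOR)` holds, whereas `ambiguous_groups(MINOR)` pairs AT with TA and GC with CG, which mirrors the statement that the orientation of a base pair cannot be read from the minor groove.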
14.10 Synopsis
• Every third bond in the polymer chain of a protein is an amide bond. It is the
fundamental building block in the protein backbone and the mutual spatial
arrangement of the sequential planar amide bonds determines the overall archi-
tecture of a protein.
Fig. 14.20 Crystal structure of an oligomeric DNA segment after a reaction with cisplatin 14.4
(a) or intercalation with daunorubicin 14.5 (b). In both cases the DNA molecule is severely
distorted and the genetic information on the DNA cannot be read for cell division. Cisplatin reacts
with the nitrogen atoms of two nucleobases (here guanine) of the DNA on neighboring steps with
substitution of both chlorine atoms. With its planar tetracyclic ring system, daunorubicin interca-
lates between two neighboring base pairs by spreading the DNA along the helix axis. The
compound’s amino sugar is accommodated in the DNA minor groove.
Bibliography
General Literature
Branden C, Tooze J (1999) Introduction to protein structure, 2nd edn. Garland, New York
Bürgi HB, Dunitz JD (1994) Structure correlation, vol 1. VCH, Weinheim
Jeffrey GA, Saenger W (1991) Hydrogen bonding in biological structures. Springer, Berlin
Schulz GE, Schirmer RH (1978) Principles of protein structure. Springer, New York
Special Literature
Allen FA, Kennard O, Taylor R (1983) Systematic analysis of structural data as a research
technique in organic chemistry. Acc Chem Res 16:146–153
CSD Database: www.ccdc.cam.ac.uk/products/csd/
Klebe G (1994) The use of composite crystal-field environments in molecular recognition and the
de novo design of protein ligands. J Mol Biol 237:212–235
Koch O, Klebe G (2008) Turns revisited: a uniform and comprehensive classification of normal,
open, and reverse turn families minimizing unassigned random chain portions. Proteins: Struct
Funct Bioinform 74:353–367
Lario PI, Vrielink A (2003) Atomic resolution density maps reveal secondary structure dependent
differences in electronic distribution. J Am Chem Soc 125:12787–12794
Orengo CA, Jones DT, Thornton JM (1994) Protein superfamilies and domain superfolds. Nature
372:631–634
PDB Database: http://www.rcsb.org/pdb/home/home.do
Vyas K, Monahar H, Venkatesan K (1990) Thermally induced O to N acyl migration in
salicylamides. Thermal motion analysis of the reactants. J Phys Chem 94:6069–6073
Wood VJL, Patterson AW et al (2005) Substrate activity screening: a fragment-based method for
the rapid identification of nonpeptidic protease inhibitors. J Am Chem Soc 127:15521–15527
15 Molecular Modeling
Three-dimensional structure models have been used since Jacobus H. van’t Hoff
and Joseph Le Bel. Emil Fischer reported in his book Aus meinem Leben about
a vacation in Italy:
In the previous winter 1890/91 I was busy with the task of clarifying the configuration of
sugar, without entirely achieving my goal. Then the thought came to me in Bordighera that
the decision about the configuration of pentose has to do with its relation to trioxyglutaric
acid. Unfortunately for lack of a model I could not tell to what extent such acids are
possible according to theory and I therefore posed the question to Baeyer. He picks up such
things with great enthusiasm, and directly constructed carbon atoms from balls of bread
and toothpicks. But after many attempts he gave the cause up, ostensibly because it was too
hard. Later in Würzburg after considering good models at length, I managed to find the
conclusive solution.
Linus Pauling was the first to propose the α-helix as a secondary structure in
proteins.
The key to Linus’s success was his reliance on the simple laws of structural chemistry. The
α-helix had not been found by only staring at X-ray pictures. The essential trick, instead,
was to ask which atoms like to sit next to each other. In place of pencil and paper, the
main working tools for this work were a set of molecular models superficially resembling
the toys of pre-school children.
With these sentences the Nobel prize winner James Watson described the
approach of Pauling in his book The Double Helix. Pauling’s success was also
based upon well-founded proficiency in theoretical chemistry. That is how Pauling
knew that an amide bond is stiff and flat, whereas his rivals, William Bragg, Max
Perutz, and John Kendrew, mistakenly assumed that it was flexible.
James Watson and Francis Crick went the same way as Pauling in the search for the
DNA structure:
We could thus see no reason why we should not solve the DNA problem in the same way [as
Pauling]. All we had to do was build a set of molecular models and begin to play—with luck
the structure would be a helix.
Working with molecular models cannot have been pure pleasure back then.
In one place in the book, for example, Watson writes:
Our first minutes with the models, though, were not joyous. Even though only about fifteen
atoms were involved, they kept falling out of the awkward pincers set up to hold them the
correct distance apart.
Based on this background the achievement of Watson and Crick seems even
more impressive. They were awarded the Nobel Prize in 1962 for the elucidation of
the double-helix structure of DNA. This example should underscore the importance
of models in science. To end with a word from Francis Crick: “A good model is
worth its weight in gold.”
In contrast to the 1950s and 1960s, computers are available today with impressive
graphical performance and high computing speed. Accordingly, programs are
available for working with molecular models. The new field of molecular model-
ing has been established. This term encompasses the display and manipulation of
realistic three-dimensional molecular structures along with the calculation of their
physicochemical properties. The most important methods that are employed in the
context of molecular modeling are summarized in Table 15.1.
and the geometry of the resulting hits most closely resembling the query molecule
are used. In the next step the molecule is optimized by a force-field calculation.
There are also standard programs for the generation of starting models that
translate a 2D structure formula into a 3D spatial structure according to the
principle of a molecular model kit. These “electronic molecule-construction kits”
have lists of bond lengths and angles as well as preferred fragment geometries
stored, and build molecules according to a sophisticated system of rules. In fractions
of a second they determine the 3D spatial structure for the 2D structural
formula. The program CONCORD from Robert Pearlman in Austin, Texas, and
CORINA from Johann Gasteiger and Jens Sadowski at the University of Erlangen
are among the most important. Both programs are used to generate 3D structures of
small molecules. The 3D structure of a protein, however, cannot be built with these
programs. More sophisticated techniques are necessary for proteins (▶ Sect. 20.1).
Perhaps the most often used technique for molecular modeling is the so-called
knowledge-based approach. Here an attempt is made to exploit the enormous
accumulated knowledge from experimentally determined molecular structures,
crystal packings, protein structures, protein sequences, and structure–activity rela-
tionships from protein–ligand complexes, etc., to efficiently solve the relevant
problem. Basically nothing more is done here than to imitate the approach that
a conscientious scientist would take with a computer program. Initially as much
experimental data as possible is collected and analyzed. Important information
sources are the Cambridge Structural Database (CSD) with over 500,000 crystal
structures of small molecules as well as the Protein Data Bank (PDB) with more than
80,000 protein and DNA structures. Physicochemical properties are also available
in databases. The Beilstein database, with almost 10 million chemical structures,
contains, for example, pKa values for more than 20,000 compounds. The challenge
lies in the extraction of the necessary data for the question at hand from the
enormous plethora of electronically available information. Furthermore, it must
be considered that the data comes from different sources and could be partially
erroneous.
The largest growth in electronically available data recently has occurred in the
area of DNA sequences. Hundreds of genomes have been sequenced, and new ones
are added weekly. The nearly endless number of sequences can only be conquered
with intelligent searching protocols. Knowledge-based approaches play a central
role in this area and in the modeling of protein structures.
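$$
E = \frac{1}{2}\sum_{\text{bonds}} K_b\,(b - b_0)^2
  + \frac{1}{2}\sum_{\text{bond angles}} K_\Theta\,(\Theta - \Theta_0)^2
  + \frac{1}{2}\sum_{\text{torsion angles}} K_\Phi\,\bigl[1 + \cos(n\Phi - \delta)\bigr]
  + \sum_{\text{atom pairs } i<j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{C_{ij}}{r_{ij}^{6}} + \frac{q_i q_j}{\varepsilon\, r_{ij}} \right)
$$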
Fig. 15.1 E is the total energy of a molecule or a complex of several molecules. It is composed of
various contributions. The first term describes the energy change upon stretching or compressing
a chemical bond. In the example at hand, it is a so-called harmonic potential with the
force constant $K_b$ and the equilibrium bond length $b_0$ as parameters. The energy as a function of
the bond angle $\Theta$ is described in the second term. Here too, a harmonic potential is used with the
force constant $K_\Theta$ and an equilibrium angle $\Theta_0$. The third contribution describes the change in
the energy upon changing the dihedral angle $\Phi$, and the last term stands for non-covalent interactions.
The sum of three terms is used for this last contribution. The first term $A_{ij}/r_{ij}^{12}$ is always positive
and rises quickly with decreasing distance. It describes the repulsion between atoms that come too
close together. The contribution from $-C_{ij}/r_{ij}^{6}$ is always negative and approaches zero with
increasing distance $r_{ij}$, though not as fast as the repulsive term. It describes attractive interactions,
which are also called dispersion interactions. Other attractive interactions exist between polar
molecules that are also proportional to $1/r_{ij}^{6}$ (for a description of the potentials see ▶ Sect. 18.12,
▶ Fig. 18.5). The last term $q_i q_j/(\varepsilon r_{ij})$ describes the electrostatic interactions according to Coulomb’s law,
using a point-charge model. The dielectric constant is $\varepsilon$. The non-covalent
contribution to the total energy, without the electrostatic term, is called van der Waals energy.
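To make the individual contributions of Fig. 15.1 tangible, each term can be written as a small function. This is only a sketch of the functional forms described in the caption; the function names, the parameter values in the usage note, and the unit system are assumptions, and a real force field additionally needs atom typing, parameter tables, and exclusion rules for bonded neighbors.

```python
import math

def bond_energy(b, b0, kb):
    """Harmonic bond stretching: 1/2 * Kb * (b - b0)^2."""
    return 0.5 * kb * (b - b0) ** 2

def angle_energy(theta, theta0, ktheta):
    """Harmonic angle bending: 1/2 * K_theta * (theta - theta0)^2 (radians)."""
    return 0.5 * ktheta * (theta - theta0) ** 2

def torsion_energy(phi, kphi, n, delta):
    """Periodic torsion term: 1/2 * K_phi * (1 + cos(n*phi - delta))."""
    return 0.5 * kphi * (1.0 + math.cos(n * phi - delta))

def nonbonded_energy(r, a, c, qi, qj, eps=1.0):
    """Repulsion A/r^12, dispersion -C/r^6, and Coulomb term qi*qj/(eps*r).
    Any unit-conversion constant is assumed to be absorbed into the charges."""
    return a / r ** 12 - c / r ** 6 + qi * qj / (eps * r)

def total_energy(bonds, angles, torsions, pairs):
    """Sum over lists of argument tuples, one tuple per interaction."""
    return (sum(bond_energy(*t) for t in bonds)
            + sum(angle_energy(*t) for t in angles)
            + sum(torsion_energy(*t) for t in torsions)
            + sum(nonbonded_energy(*t) for t in pairs))
```

For two uncharged atoms with A = C = 4 (arbitrary units), `nonbonded_energy` has its minimum at r = 2^(1/6), where repulsion and dispersion balance; at shorter distances the r^-12 term dominates and the energy rises steeply.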
determined by its pKa value. This indicates how easily a group accepts or releases
a proton. This property, in turn, depends heavily upon the partial charge that the
group carries and what other charges are in the immediate vicinity of the group.
Thus, the pKa value shifts if a functional group comes into an altered environment.
For example, carboxylic acids become more acidic when they are brought near
a positive charge. Their acidity decreases, on the other hand, if a partially
negatively charged group is nearby. This effect must be considered in a reliable
force-field calculation. An attempt can be made to predict the protonation state in
protein–ligand complexes with such calculations. For this, the contribution to the
energy content of the complex is determined by evaluating all possible combina-
tions of states of titratable groups. In this way the shift in the pKa values of
functional groups can be estimated.
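The enumeration of all combinations of protonation states can be sketched for a handful of titratable groups. The model below is a deliberately simple assumption: each group has an intrinsic pKa, and charged groups interact through a constant pairwise energy for unit charges. The function name, the data layout, and all numerical values are hypothetical; in an actual calculation the interaction energies would come from the force field or from a continuum electrostatics model.

```python
import itertools
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def protonation_probabilities(groups, coupling, ph, temperature=298.15):
    """groups: list of (intrinsic_pKa, charge_when_protonated), e.g. (4.0, 0)
    for a carboxylic acid or (12.0, 1) for a basic group.
    coupling: {(i, j): W} with W the interaction energy in kJ/mol between
    unit charges of equal sign on groups i and j.
    Returns the probability of each group being protonated at the given pH."""
    rt = R * temperature
    n = len(groups)
    weights = {}
    for state in itertools.product((0, 1), repeat=n):  # 2**n combinations
        # Free energy of protonating each group relative to its deprotonated form
        g = sum(x * math.log(10.0) * rt * (ph - pka)
                for x, (pka, _) in zip(state, groups))
        charges = [qp if x else qp - 1 for x, (_, qp) in zip(state, groups)]
        for (i, j), w in coupling.items():
            g += w * charges[i] * charges[j]
        weights[state] = math.exp(-g / rt)  # Boltzmann weight of this state
    z = sum(weights.values())
    return [sum(w for s, w in weights.items() if s[i]) / z for i in range(n)]
```

An isolated acid with pKa 4 is half protonated at pH 4. If a permanently protonated basic group is placed nearby with an assumed coupling of 5.7 kJ/mol (roughly RT ln 10 at 298 K, that is, one pKa unit), the same acid is largely deprotonated at pH 4, which corresponds to the increased acidity described above. Because the number of states grows as 2^N, the full enumeration is only feasible for a small number of groups.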
The importance of water as a binding partner in the formation of protein–ligand
complexes was emphasized in ▶ Chap. 4, “Protein–Ligand Interactions as the Basis
for Drug Action”. Complex formation causes a change in the solvation conditions
for the involved molecules. This must be considered in the force-field calculations.
For this, a force-field is combined with estimations for the contribution from
solvation. Newer methods such as the MM-PBSA or MM-GBSA methods try to
sum up these contributions over the local environment in a surface-dependent way.
The choice of a relevant starting geometry is important for any force-field
calculation. A force-field calculation leads to an energy minimization. By starting
from an energetically unfavorable geometry, the force field drives “downhill” to the
next local minimum on the multidimensional energy surface (▶ Sect. 16.2). If one
starts with two different geometries, the resultant minimized structure can also
be different. Many molecules and especially protein–ligand complexes can adopt
numerous energetically favorable conformations. It is therefore recommended that
multiple force-field calculations are performed by starting from different
geometries.
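The dependence of the result on the starting geometry can be illustrated with a one-dimensional model. The following sketch uses a tilted double-well function as a stand-in for the multidimensional energy surface and a plain steepest-descent minimizer; the energy function, the step size, and the function names are assumptions chosen only for illustration.

```python
def energy(x):
    """Model energy surface with two minima, near x = -1 and x = +1."""
    return (x * x - 1.0) ** 2 + 0.2 * x

def gradient(x):
    """Analytical derivative of the model energy."""
    return 4.0 * x * (x * x - 1.0) + 0.2

def minimize(x, step=0.01, tol=1e-8, max_iter=100000):
    """Steepest descent: always walk downhill until the gradient vanishes."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x

left = minimize(-1.5)   # ends in the minimum at negative x
right = minimize(1.5)   # ends in the minimum at positive x
```

Both runs converge, but to different minima, and only the one started at negative x reaches the lower of the two. The minimizer has no way of crossing the barrier in between, which is why several starting geometries are recommended.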
function, the so-called atomic orbital (AO) or, in a molecule, molecular orbital
(MO). The wave function of the entire molecule is constructed as the antisymmetric
product of these many orbitals. The Hartree–Fock equation is obtained on the
condition that optimally chosen orbitals lead to minimal energy. The main
deficiency of the Hartree–Fock approach, namely, neglecting the electron correlation,
can be corrected with more elaborate methods, although the calculation time
then increases severely.
Quantum mechanical ab initio calculations allow the calculation of the molec-
ular structure and electron density distribution as well as molecular properties
without the assumptions that are necessary for force-field calculations. In many
cases it is difficult to make predictions a priori based on the hybridization state of
the atoms. In the case of amines and sulfonamides, it is often impossible to predict
whether the atoms that are bound to nitrogen are in the same plane or whether
nitrogen is in a pyramidal environment. In a force-field calculation one must specify
from the very beginning what atom type is to be assigned to which atom (i.e., for
the above case, whether it should be a planar or a pyramidal nitrogen atom). If
the wrong atom type is chosen, the resulting structure is, of course, meaningless.
Quantum mechanical calculations require no such assumptions.
The majority of currently applied force-fields use a point-charge model to
describe the electrostatic interactions. One possibility to derive the atomic charges
is to calculate the electrostatic potential of a small molecule that contains the group
in question by using quantum-mechanical methods. Subsequently, a set of partial
charges is assigned to the various nuclei so that the quantum mechanically calcu-
lated potential is depicted as accurately as possible. These charges can then be
transferred to force-field calculations to be used in a large system.
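The fitting step can be sketched for the simplest conceivable case, a neutral diatomic fragment that carries +q on one atom and -q on the other. The charge q is chosen so that the point-charge potential reproduces a reference potential on a set of grid points in the least-squares sense. Everything here is an assumption made for illustration: the function names, the grid, and the use of atomic units (potential of a point charge equals q/r). In practice the reference values come from the quantum-mechanical calculation and the fit involves many atoms and additional restraints.

```python
import math

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def fit_dipole_charge(atom_a, atom_b, grid_points, potentials):
    """Least-squares charge q for +q on atom_a and -q on atom_b.
    Minimizes sum_k (V_k - q * g_k)^2 with g_k = 1/r_ak - 1/r_bk,
    which gives q = sum(V_k * g_k) / sum(g_k^2)."""
    num = 0.0
    den = 0.0
    for point, v in zip(grid_points, potentials):
        g = 1.0 / distance(point, atom_a) - 1.0 / distance(point, atom_b)
        num += v * g
        den += g * g
    return num / den

def point_charge_potential(q, atom_a, atom_b, point):
    """Potential of the fitted charge pair at one point (atomic units)."""
    return q / distance(point, atom_a) - q / distance(point, atom_b)
```

The neutrality of the fragment is built into the model by using a single parameter. Grid points should lie outside the van der Waals spheres of the atoms, since the point-charge picture is not meant to describe the potential close to the nuclei.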
A further important application of quantum-mechanical calculations in drug
design is found in the calculation of conformational energies of small molecules
to calibrate force-fields. The force-fields that have been developed for proteins
and peptides are based on conformational energies that have been quantum-
mechanically calculated for small peptides.
In contrast to force-field methods, quantum-mechanical techniques are able to
consider the polarization of the electron density caused by the influence of neighboring
groups. For example, the amide bond dipoles in an α-helix are all oriented in the same
direction so that they sum up to a significant total dipole moment. As a consequence,
such large cumulative dipoles can polarize other groups that are located at the end of
the helix. Dipoles induced in this way are incompletely described by force-field
methods. For quantum-mechanical methods, this is not a problem. A further important
application area is chemical reactions for which force-fields are hardly parameterized
at all, with the exception of a few special cases. Here quantum mechanical methods
are the only possibility for theoretical description.
Quantum-mechanical methods are considerably more elaborate than force-field
methods. The most accurate methods, which also devour the most calculation time,
are the so-called ab initio methods. These techniques reach their limits, however,
with very large systems. Therefore other less computationally demanding methods
were developed. In these so-called semiempirical methods, certain integrals, the
Fig. 15.2 Different computer graphics representations of dopamine (▶ Sect. 1.4, Formula 1.13).
Carbon atoms are colored gray, hydrogen atoms are white, nitrogen atoms are blue, and oxygen
atoms are red. (a) Dreiding models. (b) Ball-and-stick models. (c) Space-filling models (CPK
representation). (d) Solvent-accessible surface. (e) Electrostatic potential projected on the surface
(positively charged areas are blue, negatively charged areas are red). (f) Highest occupied
molecular orbital (HOMO), calculated for the uncharged dopamine molecule. Blue and red
areas indicate opposite signs of the wave function.
is less misleading. It is generated by rolling a sphere with a radius of 1.4 Å, which
corresponds to the size of a water molecule, over the surface of the molecule. This
surface appears much smoother. Depressions that are still present mean that
small molecules – at least a water molecule – can really fit in there. The Lee–
Richards surface is less frequently used but very helpful. It is so chosen that ligand
atoms that come into contact with the examined surface lie directly on this surface
(Fig. 15.3c).
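The surface traced by the center of the rolling probe (called the Lee–Richards surface in the text) lends itself to a simple numerical estimate of its area. The sketch below follows the idea of the Shrake–Rupley procedure, which is named here as the technique swapped in for illustration: test points are spread over a sphere of radius (van der Waals radius + probe radius) around each atom, and a point counts as accessible if it does not fall inside the correspondingly enlarged sphere of any other atom. Function names, the number of test points, and the radii used in the usage note are assumptions.

```python
import math

def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points

def accessible_area(atoms, probe=1.4, n_points=500):
    """atoms: list of (x, y, z, vdw_radius) in Angstrom.
    Returns the accessible area per atom in square Angstrom."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big = ri + probe  # radius of the sphere traced by the probe center
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big * ux, yi + big * uy, zi + big * uz
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big * big * exposed / n_points)
    return areas
```

For an isolated atom with a radius of 1.7 Å all test points are exposed and the full sphere area 4π(1.7 + 1.4)² Å² is returned. A second atom at bonding distance removes part of this area, because the probe can no longer reach the region between the two atoms.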
The surface can be colored too. For example, a color can be assigned to each
atom type, and then the color of the nearest atom can be used for the surface.
A representation in which the molecule’s surface is colored according to other
properties, for example, electrostatic or hydrophobic potential, is very instructive.
Fig. 15.3 Definitions of molecular surfaces. (a) van der Waals surface. The arrow marks a place
where a crevice is found, but it is too small to accommodate a water molecule. (b) Solvent-
accessible area. (c) Lee–Richards surface.
None of the processes that are interesting to us run at 0 Kelvin, but rather at body
temperature, which is approximately 310 Kelvin. It is therefore clear that not only
the potential energy but also the kinetic energy must be considered. Molecules
move at room temperature. They diffuse and change their shape in that they adopt
different conformations. The flexibility and adaptability of both partners play a big
role in protein–ligand interactions. A prerequisite for protein binding is that the
ligand can take on a conformation that corresponds to the shape of the binding
pocket. On the other hand, the protein is flexible to a certain extent. For example,
side chains on the surface can adopt different conformations or entire domains can
move relative to one another. The mutual adaptation of protein and ligand shapes
plays an important role in the formation of protein–ligand complexes in particular.
The molecular dynamics simulation (MD) is a theoretical method to describe
these effects. In molecular dynamics simulations the movement of atoms and
molecules is followed under the influence of the chosen force-fields. It is assumed
in these calculations that the interactions between particles obey the laws of
classical mechanics. For this, the Newtonian equations of motion are solved
stepwise for all particles simultaneously. Usually it is assumed that the
force between two particles is not influenced by other particles (pairwise additivity).
In practical applications, a starting geometry is first generated (Fig. 15.4). If an
experimentally determined structure, for instance, the crystal structure of a protein–
ligand complex, is available, then that is the starting point. To take the surrounding
water shell into consideration, the complex is dipped into a “water bath,” that is,
a large number of water molecules enclose it. Further, an adequate number of ions is
added to keep the whole system in an electrically neutral state. To prevent boundary
effects on the “walls,” a trick called “periodic boundary conditions” is used on the
water bath. If the simulated protein complex approaches such a wall and wants to
leave the water bath, the process is handled on the computer as though the complex
had again entered from the opposite side. Formally, the boundary areas of the water
bath are eliminated.
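The re-entry trick can be sketched in a few lines. This is a minimal illustration, assuming a rectangular box; the function names and the box size are hypothetical and are not taken from any particular MD program.

```python
def wrap_coordinate(x, box_length):
    """Map a coordinate back into the interval [0, box_length)."""
    return x % box_length


def wrap_position(position, box):
    """Wrap a 3D position into a rectangular box with edge lengths `box`."""
    return tuple(wrap_coordinate(x, length) for x, length in zip(position, box))


def minimum_image_distance(a, b, box):
    """Distance between a and b using the nearest periodic image of b."""
    total = 0.0
    for xa, xb, length in zip(a, b, box):
        d = xa - xb
        d -= length * round(d / length)  # shift by whole box lengths
        total += d * d
    return total ** 0.5


box = (30.0, 30.0, 30.0)  # hypothetical box edge lengths in angstrom
print(wrap_position((31.5, -2.0, 10.0), box))
print(minimum_image_distance((1.0, 1.0, 1.0), (29.0, 1.0, 1.0), box))
```

A particle that leaves through one wall reappears at the opposite wall, and two particles close to opposite walls are treated as neighbors.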
In the beginning of the actual simulation each atom is assigned a random starting
velocity with an arbitrary orientation. The velocities are chosen so that on average
they correspond to the desired temperature (Boltzmann distribution). Then all
forces from all surrounding atoms acting on a particular atom are calculated.
At set time intervals the next position is calculated with the Newtonian equations of
motion, and so forth. The step width is typically a femtosecond (1 fs = 10⁻¹⁵ s). This
small step width is necessary because there are many extremely fast processes that
occur on the molecular level. The development of the movement is followed for
multiple nanoseconds, and is shown in terms of a trajectory. Ten nanoseconds are
enough to follow the movement of side chains and sometimes even of protein
domains. It is not enough, however, to describe the diffusion of an active compound
into the binding pocket. For this, longer simulation times are necessary. The folding
of a protein is also difficult to simulate with this technique. The necessary time for
protein folding is on the actual time scale between 20 ms and 1 h. The calculation of
one time step (1 fs) still requires seconds of processing time on even the fastest
computers. Nonetheless new algorithms and computers with more specific archi-
tectures are being developed that will make such simulations possible in the
foreseeable future.
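The stepwise propagation described above can be illustrated with a toy system. The sketch below assumes a single particle on a harmonic spring in reduced units and uses the velocity Verlet scheme, which is one common integrator; the text does not name a specific integrator, and all names and parameter values here are hypothetical.

```python
import math
import random


def force(x, k=1.0):
    """Force of a harmonic spring with force constant k (toy force field)."""
    return -k * x


def velocity_verlet(x, v, mass, dt, n_steps, k=1.0):
    """Propagate one particle and return the trajectory as (x, v) pairs."""
    trajectory = [(x, v)]
    f = force(x, k)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x, k)                    # forces at the new position
        v = v_half + 0.5 * dt * f / mass   # finish the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


random.seed(0)
mass, kT = 1.0, 1.0                            # reduced units, assumed
v0 = random.gauss(0.0, math.sqrt(kT / mass))   # 1D Maxwell-Boltzmann velocity
traj = velocity_verlet(x=1.0, v=v0, mass=mass, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0], mass)
e_end = total_energy(*traj[-1], mass)
print(abs(e_end - e_start))
```

The small time step keeps the total energy nearly constant, which mirrors the reason given in the text for choosing femtosecond steps: the fastest motions must be resolved.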
Another application of MD simulations, the calculation of binding affinity,
should be mentioned here. In principle the free energy ΔG for a given system can be
calculated. From the point of view of statistical thermodynamics the so-called
partition function (German: Zustandssumme) is determined for this, in which the
energetic contributions of all possible configurations of a system are considered.
The entropic component of the system is automatically calculated by determining
the distribution and relative population of the many states. Differences in the free
energy of binding of different ligands are of particular interest in the context of protein–
ligand interactions. Experience has shown, however, that only differences in the
binding free energy between two similar ligands can be reliably calculated. In
modern applications (e.g., for screening purposes, ▶ Sect. 7.4), particularly large
amounts of data are evaluated. Therefore, the effort associated with MD
[Flowchart of the simulation loop; recoverable steps: “Save Coordinates”, “Another Step?” (Yes/No), “End”]
applied to GPCRs in particular, which are introduced in ▶ Chap. 29, “Agonists and
Antagonists of Membrane-Bound Receptors”.
Matthias Zentgraf carried out extensive molecular dynamics simulations on
aldose reductase. The profile that resulted was consistent with the crystallographic
structure determinations. Amino acids that are repeatedly found in many protein–
ligand complexes with modified geometries were shown to be very flexible in MD
simulations as well. If the trajectory of such simulations is evaluated, it is apparent
that the protein flips between the above-mentioned parent conformations.
Additionally, many geometries occur that show only small but structurally critical
deviations from these parent conformations. Small areas in the binding pocket are
thus opened that are able to accommodate, for example, an additional methyl group
or a phenyl ring on a ligand. Such information can be directly used for the design of
new inhibitors.
To provide an overview of the flexibility of a protein, the variation of the atom
positions is calculated from one simulation state to the next along a trajectory. Just
as with photographic film, these momentary pictures of complexes are called
“snapshots.” Above all, it becomes apparent whether a protein fluctuates for
a particular time in one conformation before it flips into another geometry.
Subsequently it can either return to the original geometry or flip into another
parent geometry. Such an orientation map is shown in Fig. 15.5. From this map,
it can be seen that the protein spends time in multiple parent conformations.
If representative snapshots from these clusters of basis conformations are
superimposed upon one another, a very good picture of which groups in the binding
pocket show enhanced flexibility is obtained. In the example at hand, the side
chains from two neighboring phenylalanines (Phe121 and Phe122, Fig. 15.6) are
particularly implicated. These can swing out of the way to open a new, previously
closed cavity in the binding pocket. In the context of drug design, such information
can be translated into the design of new inhibitors that can occupy new binding
pockets. In this way an improved affinity or selectivity for the target protein can be
achieved. A ligand is shown in Fig. 15.7 that has been furnished with an additional
benzyl group (red), that optimally fills the newly opened cavity in the snapshot (in
light blue) in Fig. 15.6.
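A map of this kind rests on a matrix of pairwise deviations between snapshots. The following is a minimal sketch, assuming that the snapshots are already superimposed and are given as lists of atom coordinates; the tiny two-atom "trajectory" is hypothetical data.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate sets."""
    squared = sum((p - q) ** 2
                  for atom_a, atom_b in zip(a, b)
                  for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared / len(a))


def rmsd_matrix(snapshots):
    """Symmetric matrix of pairwise rmsd values between all snapshots."""
    n = len(snapshots)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(snapshots[i], snapshots[j])
    return matrix


# Hypothetical two-atom "trajectory": snapshots 0-1 and 2-3 form two clusters.
snapshots = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
    [(0.0, 2.1, 0.0), (1.0, 2.1, 0.0)],
]
for row in rmsd_matrix(snapshots):
    print(" ".join(f"{value:4.2f}" for value in row))
```

Blocks of small values along the main diagonal correspond to periods spent near one parent conformation; small values far from the diagonal indicate a return to a geometry visited earlier.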
To conclude this chapter, the terms “model” and “simulation” should be briefly
compared and contrasted. Molecular models are used to approach questions that are
experimentally difficult or impossible to address. What different conformations can
a molecule adopt? This question is currently difficult to answer experimentally.
Does a possible drug candidate fit into a protein’s binding pocket? Even this
question is only answerable with laborious experiments. The use of models is
an elementary component of every scientific discipline. Models have always played
a central role in chemistry. It is shown in ▶ Chaps. 23, “Inhibitors of Hydrolases
with an Acyl–Enzyme Intermediate”; ▶ 24, “Aspartic Protease Inhibitors”;
[Fig. 15.5 plot: 2D RMS diagram; both axes give the number of snapshots (0–600); color scale: rmsd from 0 to 1.8 Å]
Fig. 15.5 The development over time of the spatial deviations between snapshots along the
simulation trajectory is visualized on this map. Large deviations are color-coded red, medium-sized
deviations green, and small deviations blue. Green delineated square areas are
recognizable along the main diagonal. There the complex spends time near a parent conformation.
The transition to the next square represents a flip to a new geometry. If sectors outside the main
diagonal are colored increasingly red, the geometry deviates strongly from the previously adopted
conformation. If an area outside the diagonal is reached that is green, the newly adopted geometry is
not very different from a state that the system had reached earlier. With such a map it is possible to see
which of the many parent conformations a complex swings between.
Fig. 15.6 Representative snapshots were taken from the different square areas along the main
diagonal in Fig. 15.5 and superimposed upon one another. It can be seen that above all else, the
side chains of the phenylalanines Phe121 and Phe122 can undergo severe movements in the
binding pocket. In doing so, they can also adopt conformations (e.g., the light-blue geometry) that
open a new hydrophobic cavity in the binding pocket.
Fig. 15.7 Conformations occur along the trajectory of the protein that open a new hydrophobic
pocket when the side chain of a phenylalanine swings away (Fig. 15.6, i.e., light-blue geometry).
This pocket can be occupied by a ligand. For this a benzyl group was added to the scaffold of the
shown benzodiazepine-like inhibitor, which can occupy the opened pocket during the simulation.
15.10 Synopsis
• Models have been and still are used in chemistry in general, but in particular in
modern drug design. Computer graphics is a versatile tool to display structures
and models along with various properties assigned and/or geometrically
superimposed onto these molecules.
• Structures can be calculated by starting from first principles and by following
the physics as closely as possible. This is done with quantum mechanical
calculations. Because these methods easily become elaborate and computationally
intractable, empirical approaches offer an alternative. They are based on
much simpler physics, normally classical mechanics, and treat molecules as a set
of point charges in space interconnected by springs following harmonic
potentials.
• Empirical approaches can only be used if enough experimental data are available
to parameterize and calibrate the empirical concepts. Therefore large databases
assembling knowledge about molecular properties have been developed.
16 Conformational Analysis
Assembling a molecule with a modelling kit already makes it clear that rotations
around single bonds can be easily carried out. The molecule will adopt a different
shape, or as chemists say, it is transformed into a different conformation. In a real
molecule, rotations around these bonds are not fully free. They are subject to
a potential, and during the rotation the molecule adopts particular, energetically
favorable arrangements. n-Butane represents the simplest case (Fig. 16.1). The central
torsion or dihedral angle determines the relative orientation of the two bonds to the
methyl groups. If n-butane is rotated out of the arrangement with the two
bonds to the methyl groups in the 180° orientation (trans), the methyl group at the “front”
carbon and a hydrogen atom at the “back” carbon will directly coincide with each
other at rotation angles of 120° and 240°, an arrangement called “eclipsed”. In this geometry, they come
closer to one another, therefore this arrangement is unfavorable for steric reasons. At
rotation angles of 60° and 300° the groups are again in a staggered geometry, which is an
energetically more favorable situation. This arrangement is somewhat less favorable
than the staggered trans orientation because of the spatial vicinity of the methyl groups,
which are now said to be “gauche” to one another. Finally, along the rotation path an
orientation is adopted at 0° and 360° in which both methyl groups are exactly behind
one another. This is an even less favorable orientation.
Multiple energy maxima and minima can be passed through during the course of
a full rotation through 360°, depending on which atoms and groups are attached to the
rotatable bond. They are at different energy levels relative to one another. The
lowest minimum is called the global minimum, and the energetically higher minima
are called local minima. Knowledge about these minima is important because
molecules adopt geometries that correspond to such energy minima. Calculations
are necessary to find these minima. One possible method is the systematic rotation
of all rotatable bonds, for instance in 10° steps. At each step the energy of the
[Fig. 16.1 plot: energy (kJ/mol) versus torsion angle τ for n-butane; labeled energy differences: 3.8, 14.6, and 25.5 kJ/mol]
Fig. 16.1 Butane, CH3CH2CH2CH3, is made up of a linear chain of carbon atoms. If the terminal
methyl groups cover one another after rotation around the central C–C bond, the torsion
angle about the central bond is 0°. At a 60° angle the “back” methyl group is halfway between the
“front” methyl group and a hydrogen atom. This situation is called a “gauche” orientation. At 120°
a methyl group and a hydrogen atom are eclipsed. At 180° the terminal methyl
groups are exactly opposite one another. Here the energetically most favorable situation, the trans
orientation, is achieved. From here on, the course of the rotation is mirror symmetrical and ends in
the starting position at 360°. The orientations at 120° and 240° are energetically less favorable than
the 180° orientation by 14.6 kJ/mol. The gauche orientations at 60° and 300° are less favorable by
3.8 kJ/mol. The orientations at 0° and 360° are the least favorable ones and are 25.5 kJ/mol higher
in energy. If a minimization method is applied that
can only run “downhill,” the three minima on the potential curve can be reached by starting at the
110°, 130°, and 250° points.
It was shown in ▶ Chap. 15, “Molecular Modeling” that the energy and geometry
of a molecule can be calculated with the help of a force field or a quantum
mechanical method. In this way the combinations of torsion angles about the
rotatable bonds in a molecule can be found that correspond to energetically
favorable states. The mathematical method that is used to search for such
a minimum geometry can only move downhill on the potential energy surface
(▶ Sect. 15.5). To illustrate this, the potential of n-butane should be considered again
(Fig. 16.1). If an angle of 130° is used as a starting value, the minimization ends
with a trans geometry. If the starting angle is 110°, which is only 20° away, the
optimization will lead to a gauche orientation. In this way, two of the three
possibilities are detected. The third minimum, which mirrors the gauche conformation, is
reached if the minimization is started from an angle of 350°. In this way, all three conformations are
found for the simplest possible case.
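The dependence of the result on the starting angle can be reproduced with a small sketch. It assumes a three-term cosine series as a stand-in for the butane potential; the coefficients are not from the text but were chosen here so that the curve gives 3.8, 14.6, and 25.5 kJ/mol at 60°, 120°, and 0° relative to trans, which is one reading of the energy labels in Fig. 16.1. Plain steepest descent is used as the "downhill only" minimizer, and with this fit the gauche minima lie near, not exactly at, 60° and 300°.

```python
import math

# Cosine-series coefficients in kJ/mol (assumed fit, not from the source).
V1, V2, V3 = 9.8, -4.7333, 15.7


def energy(tau_deg):
    """Torsion energy relative to the trans minimum at 180 degrees."""
    t = math.radians(tau_deg)
    return (0.5 * V1 * (1 + math.cos(t))
            + 0.5 * V2 * (1 - math.cos(2 * t))
            + 0.5 * V3 * (1 + math.cos(3 * t)))


def gradient(tau_deg):
    """Derivative dE/dtau in kJ/mol per radian."""
    t = math.radians(tau_deg)
    return (-0.5 * V1 * math.sin(t)
            + V2 * math.sin(2 * t)
            - 1.5 * V3 * math.sin(3 * t))


def minimize(tau_deg, step=0.005, tol=1e-6, max_iter=10000):
    """Steepest descent: only ever moves downhill from the start angle."""
    t = math.radians(tau_deg)
    for _ in range(max_iter):
        g = gradient(math.degrees(t))
        if abs(g) < tol:
            break
        t -= step * g
    return math.degrees(t)


for start in (110, 130, 350):
    end = minimize(start)
    print(f"start {start:3d} deg -> minimum near {end:5.1f} deg, "
          f"E = {energy(end):4.1f} kJ/mol")
```

Starting points on opposite sides of the barrier near 120° end in different minima, which is why a set of well-distributed starting angles is needed.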
How are complex molecules to be approached? In principle, in exactly the same
way. Because it is not known which torsion angles of the individual single bonds
will give access to potential minima, that is, stable conformations, the minimization
must be started from numerous angles for each of the single bonds. From these
values the minimization always goes “downhill”. The minima on the potential
surface are found in this way. The art is to efficiently define the starting points
from which a given geometry is minimized. This is a very laborious task, particu-
larly with large molecules. It is akin to a hiker in the mountains searching for the
deepest valley.
Adenosine monophosphate 16.1 serves as an example (Fig. 16.2). The analysis
concentrates on the five-membered ribose ring, the bond to nitrogen in the adenine,
and the three bonds of the sugar phosphate side chain. What conformations can this
molecule adopt? Rotations are performed about the open-chain bonds in 10° steps.
In the systematic search for the ribose ring only those orientations are considered
that allow the ring to close. To get a rough overview of the hypothetically obtained
geometries, the distance between the center of the adenine scaffold and the
phosphorus atom is measured in each generated geometry. This falls between 4.5 and
9.3 Å for the more than 300,000 generated geometries. To estimate the energy content
of a molecule in an arbitrary geometry, its van der Waals energy (▶ Chap. 15,
“Molecular Modeling”) is calculated. Such a calculation is quickly accomplished.
The energies of the 300,000 geometries are between 0 and 64 kJ/mol. The
structures generated in this way are not yet in local potential minima. To achieve this, each
starting geometry must be minimized (cf. the potential energy curve of n-butane in
Fig. 16.1). The subsequently obtained conformations are compared to determine
whether the same local minima have been reached by starting from different points.
This is a rather laborious endeavor for 300,000 starting geometries! It is akin to letting
our hiker walk downhill from each grid square to find the deepest valley. Hopefully
he is granted great longevity so that he lives long enough to see the results of the
search! Can this search be structured more effectively?
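The growth of a systematic search can be made concrete with a short sketch. It enumerates a regular grid of torsion angles and counts the combinations before any ring-closure or steric filter is applied, so it does not reproduce the figure of 300,000 geometries quoted above; the function names are hypothetical.

```python
from itertools import product


def torsion_grid(n_torsions, step_deg=10):
    """Yield every combination of torsion angles on a regular grid."""
    angles = range(0, 360, step_deg)
    return product(angles, repeat=n_torsions)


def grid_size(n_torsions, step_deg=10):
    """Number of grid points without generating them."""
    return (360 // step_deg) ** n_torsions


for n in range(1, 5):
    print(n, grid_size(n))

first = next(iter(torsion_grid(4)))
print(first)
```

Each additional rotatable bond multiplies the number of starting geometries by 36 at a 10° step width, which explains why the unrestricted search quickly becomes laborious.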
Fig. 16.2 Adenosine monophosphate 16.1 exhibits the conformationally flexible ribose ring and
four open-chain torsion angles, t1–t4. Rotations are performed around these
torsion angles during the conformational analysis. To get a rough description of the attained
geometry, the distance between the phosphorus atom in the side chain and the center of the
adenine scaffold is measured.
Sometimes rolling the dice is better than systematic probing! The hiker could
choose random places in the mountains from which to descend into the next valley.
With a little luck he will find the deepest valley with significantly less effort. Such
Monte Carlo methods are very popular in conformational analysis. For this the
starting angles for the conformation search are chosen purely randomly. Molecular
dynamics serves as another approach. The hiker would have to climb into an
airplane that flies at high speed between the mountains and changes its direction
with each obstruction. After set time intervals, the hiker jumps from the airplane
and hikes to the base of the valley upon landing. The higher the airplane flies,
the fewer mountain peaks are encountered and the faster the mountains can
be crisscrossed. In the course of molecular dynamics a molecular trajectory
(▶ Sect. 15.8) is followed, and the geometries are saved at predefined time intervals
to use them as starting points for energy minimizations in a conformational
analysis. By increasing the temperature (i.e., flying higher) a larger area of
conformational space can be searched in a shorter period of time.
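The random choice of starting angles can be sketched as follows. The example assumes a toy threefold torsion potential with minima at 60°, 180°, and 300° and a simple fixed-step downhill walk; both are hypothetical stand-ins for a real force field and minimizer.

```python
import math
import random


def energy(tau_deg):
    """Toy threefold torsion potential with minima at 60, 180, 300 degrees."""
    return 1.0 + math.cos(math.radians(3 * tau_deg))


def minimize(tau_deg, step_deg=0.5):
    """Walk downhill in small fixed steps until neither neighbor is lower."""
    while True:
        best = min((tau_deg - step_deg, tau_deg, tau_deg + step_deg), key=energy)
        if best == tau_deg:
            return tau_deg % 360
        tau_deg = best


def random_search(n_starts, seed=0):
    """Minimize from random start angles and collect the distinct minima."""
    rng = random.Random(seed)
    minima = set()
    for _ in range(n_starts):
        start = rng.uniform(0.0, 360.0)
        minima.add(round(minimize(start) / 10) * 10)  # merge within 10 degrees
    return sorted(minima)


print(random_search(50))
```

In one dimension a systematic scan would do equally well; the advantage of random starts appears with many rotatable bonds, where a full grid becomes too large to enumerate.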
Until now molecules have been considered in an isolated state. How does their
flexibility change when they are brought into an environment like the binding
pocket of a protein? In principle nothing changes in their conformational flexibility.
It could be that minima are found at different positions that have different relative
energies because of electrostatic and steric interactions in the binding pocket. This
begs the question of whether the torsion angles in all areas must be sought for
[Fig. 16.3 plot: relative frequency (%) versus torsion angle t (0°–360°)]
Fig. 16.3 A value distribution for the torsion angles with clusters at 60°, 180°, and 300° is derived
from a database of small-molecule crystal structures for the C–CH2–CH2–C fragment. Most
values are found at 180°. Torsion angles between 0° and 360° are entered as the relative frequency
in percent. The maxima of the distribution are at the points where the potential curve of n-butane
(Fig. 16.1) shows its energy minima.
Unfortunately, the search cannot be narrowed here. The situation is better for the other
angles t1–t3. There, only specific values occur. If the systematic search is limited to
these areas, and a search in 10° steps is carried out around the average value, it would
only be necessary to generate 6,340 geometries. With 5.9–9.3 Å, almost the same distance
range between phosphorus and adenine is covered as in the unrestricted search.
If a van der Waals energy calculation is carried out on these geometries, values
between 0 and 16.3 kJ/mol are obtained. In contrast to the results from Sect. 16.2, all
the geometries that correspond to the energetically unfavorable areas are discarded.
How can it be confirmed that this restricted search also covers that part of the
conformational space that includes the receptor-bound conformations? Adenosine
monophosphate 16.1 often occurs as a substructure of cofactors in protein com-
plexes so that there is enough information about receptor-bound conformations for
this particular example. They come from crystal structures of proteins with these
bound cofactors. The distance range of 5.9–9.2 Å between the adenine scaffold and
the phosphorus in the receptor-bound structures covers the same range that was
detected in the enhanced systematic search. It can therefore be assumed that enough
geometries were generated that satisfactorily populate the local minima of the
bound state of adenosine monophosphate. Reflecting back to the initial butane
example (Fig. 16.1), this means that the starting points were well distributed so
that all minima were reached.
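The idea of restricting the search to frequently observed torsion angles can be sketched as follows. The observed angles, the bin width, and the population threshold are hypothetical; the sketch only shows how a histogram can prune a regular grid and is not the procedure used to obtain the numbers quoted above.

```python
from itertools import product


def preferred_angles(observed_deg, bin_width=30, min_fraction=0.10, step=10):
    """Return grid angles lying in histogram bins that are well populated."""
    n_bins = 360 // bin_width
    counts = [0] * n_bins
    for angle in observed_deg:
        counts[int(angle % 360) // bin_width] += 1
    total = len(observed_deg)
    allowed = []
    for angle in range(0, 360, step):
        if counts[angle // bin_width] / total >= min_fraction:
            allowed.append(angle)
    return allowed


# Hypothetical torsion angles "observed" in crystal structures.
t1_observed = [55, 62, 58, 65, 178, 182, 175, 185, 180, 298, 305, 301]
t4_observed = list(range(0, 360, 15))   # broad distribution, no preference

t1_grid = preferred_angles(t1_observed)
t4_grid = preferred_angles(t4_observed, min_fraction=0.0)
print(len(t1_grid), len(t4_grid))
print(len(list(product(t1_grid, t4_grid))), "instead of", 36 * 36)
```

A torsion with clear preferences contributes far fewer grid points, whereas a broadly distributed torsion such as t4 still has to be scanned over the full circle.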
[Fig. 16.4 plots: four histograms of relative frequency (%) versus torsion angle t (0°–360°) for the torsion angles t1–t4 of adenosine monophosphate 16.1]
Fig. 16.4 The frequency distribution of the torsion angles of the open-chain bonds of adenosine
monophosphate as found in the crystal structures of small organic molecules. The torsion-angle
histograms are constructed for fragments that are representative for corresponding portions of the
test molecule. There are clearly preferred values for the angles t1–t3, but a broad distribution of all
possible angles is found for t4. This knowledge is used in the conformational analyses and limits
the search for t1–t3 to the preferred value ranges.
hydrogen bonds are formed by its three carboxylate groups and the hydroxyl group
to three histidine and two arginine residues of the protein (Fig. 16.5). If the free
citrate molecule, not bound to the protein, is considered and its geometry is minimized in
an isolated state, it takes on a conformation with internally saturated hydrogen
bonds (▶ Sect. 15.5). Of course, a different starting geometry can be used, but in
all cases, conformations with intramolecular hydrogen bonds will result upon
minimization. Such hydrogen bonds rarely occur in the protein-bound state.
Therefore the conformation that was obtained after minimization in the isolated
state has no relevance for the conditions in the protein.
As a general rule, ligands rarely bind to proteins in a conformation exhibiting
intramolecular hydrogen bonds. The H-bond-forming groups are generally
involved in interactions with the protein.
To circumvent the problem of intramolecular H-bond formation, the minimization
of the generated starting structures can be omitted, and all geometries from the
systematic search can be used for further comparison (▶ Chap. 17, “Pharmacophore
Hypotheses and Molecular Comparisons”). Then, however, very many geometries
must be examined. This would severely limit the scope of such comparisons for
Many drug-like molecules are flexible. They can adopt markedly different confor-
mations depending on the surrounding environment. Usually the receptor-bound
geometry is not in the energetically most favorable conformation found for the
isolated state, but will fall in an energetically favorable area. For the conformational
analysis, this means that it is not necessarily the deepest minimum that is sought.
Rather, it should be the “relevant” minimum that corresponds to the bound state.
There is only a chance of finding it when the criteria for the search are known.
There is no difference in the difficulty of finding the energetically most
favorable conformation or the one that best “fits” the binding site. An important
tool in the search for novel lead structures is the docking of candidate molecules
into the binding pocket of a given protein. Programs that are able to use this
approach must be able to handle the conformation problem. Meanwhile, a large
variety of methods have been developed that allow efficient docking searches on
computer clusters, particularly for molecules of drug-like size.
16.8 Synopsis
The structure of the binding pocket determines which functional groups are neces-
sary for the ligand to bind. The spatial orientation of these functional groups in
ligands is referred to as the pharmacophore (▶ Sect. 8.7, Fig. 8.9). Because of its
importance for drug design and model hypothesis in medicinal chemistry, an
official IUPAC definition has been established by Camille G. Wermuth
(Table 17.1). The interacting groups that a ligand must possess to be able to
successfully interact with a protein define the pharmacophore in space and are
independent of the special molecular scaffold to which they are attached. The
hydrogen-bond-forming groups or hydrophobic parts are considered for this.
A more detailed examination differentiates between positively and negatively
charged groups in a molecule. When derived from a set of similarly binding ligands,
this generalized description is referred to as the ligand-based pharmacophore. On
the other hand, the protein structure can be the starting point. For this, an analysis is
made as to which amino acid functional groups are in the binding pocket. They
define the properties with which a ligand can bind to them. In this sense, the protein
structure determines how the pharmacophore of a ligand must be shaped to be
able to successfully bind to the protein. This description is referred to as the
Fig. 17.1 Picrotoxinin 17.1 is responsible for the centrally stimulating effect of the extracts of
fishberries. Its structure and spatial architecture were proven by X-ray structure analysis.
placed upon one another for this superimposition. The superposition of all active and
inactive derivatives along with the common volumes of both classes are shown in
Fig. 17.3. The difference between both volumes is computed. It describes those areas
in space that are only occupied by the inactive molecules.
Fig. 17.3 Superposition of the spatial structure of active (yellow) and inactive (blue) derivatives
of picrotoxinin. The united volumes around the active derivatives are shown by the red mesh. The
total volume around all inactive derivatives is shown in blue. A difference is formed between the
two volumes. The remaining volume (green) shows areas that are only occupied by inactive
derivatives. An explanation for the lack of activity of these derivatives can be that they try to
occupy volume areas that are already occupied by the receptor protein. This spatial clash does not
occur with the active derivatives.
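The volume difference described in the caption can be approximated on a grid. The sketch below assumes that the molecules are already superimposed and represents each atom as a sphere of fixed radius; the coordinates, the radius, and the grid spacing are hypothetical.

```python
from itertools import product


def occupied_cells(atoms, radius=1.7, spacing=1.0, extent=6):
    """Grid cells whose center lies within `radius` of any atom center."""
    cells = set()
    r2 = radius * radius
    for i, j, k in product(range(-extent, extent + 1), repeat=3):
        x, y, z = i * spacing, j * spacing, k * spacing
        for ax, ay, az in atoms:
            if (x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2 <= r2:
                cells.add((i, j, k))
                break
    return cells


def union_volume(molecules, **kwargs):
    """Union of the cells occupied by any molecule of the class."""
    cells = set()
    for atoms in molecules:
        cells |= occupied_cells(atoms, **kwargs)
    return cells


# Hypothetical, already superimposed molecules (atom centers in angstrom).
actives = [[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
           [(0.0, 0.0, 0.0), (0.0, 1.5, 0.0)]]
inactives = [[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]]

active_volume = union_volume(actives)
inactive_volume = union_volume(inactives)
forbidden = inactive_volume - active_volume   # reached only by inactives
print(len(active_volume), len(inactive_volume), len(forbidden))
```

The cells in `forbidden` mark the region that only inactive derivatives occupy, which is the candidate region for a clash with the receptor.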
To resolve the first problem, the role of the functional groups of the active
substance that form the contact with the receptor must be considered. They must
form hydrogen-bonding and hydrophobic interactions with the protein. In this
context, similarity of the functional groups means that they can form analogous
interactions with the protein. To define a pharmacophore in space, at least three
interacting groups are needed. This is immediately clear if one considers how many
fingers on a hand are needed to hold a randomly formed object (e.g., a potato) in
space. With only two fingers, the object can still rotate about an axis. In contrast, if
three anchor points are taken, its position is fixed in space. Practical experience
with a compound class is often helpful when assigning pharmacophoric groups.
For example, inhibitors of the angiotensin-converting enzyme (Fig. 17.4 and
▶ Sect. 25.5) need a terminal carboxylate group, a carbonyl group, and a group
that coordinates to the catalytic zinc ion.
[Fig. 17.4: structural formulas of inhibitors of the angiotensin-converting enzyme]
Fig. 17.5 “Virtual” springs are coupled to the atoms that are marked with numbers around the
steroid 17.2 and the three derivatives 17.3–17.5. The structural superposition (bottom) that is
shown is determined by the force of these springs and the simultaneous consideration of molecular
force-fields.
In the last chapter conformational analysis was the central topic. Could the tech-
niques described there, for example, the systematic rotation around particular
bonds, be used in the search for the pharmacophore? Garland Marshall developed
such a technique, called the active-analogue approach, at the end of the 1970s.
First a pharmacophore must be assigned to all molecules in a data set. Then the
equivalency of groups must be defined, that is, which groups are equivalent to
which other groups. Then a systematic conformational search is carried out for the
first compound in the data set. The distances between the functional groups of the
pharmacophore are determined for each geometry during the search. These distances
are saved. Because molecules cannot take on any arbitrary geometry, the distances
will occur in particular intervals. An analogous approach is taken for the second
molecule in the set. In principle, only the distance ranges of the first molecule must
be searched. It could be that all of the distances found with the second molecule
were already found with the first. It could also be that particular ranges are
excluded, and the “allowed” distance ranges are therefore limited. All of the
molecules in the data set are analyzed in this way.
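The successive narrowing of the allowed distance ranges can be sketched as a set intersection. The example is simplified to a single distance between two pharmacophoric groups; the conformers, the bin width, and the function names are hypothetical and do not reproduce Marshall's original implementation.

```python
import math


def distance_bins(conformers, bin_width=0.5):
    """Set of binned distances between two pharmacophore points.

    Each conformer is a pair of 3D points, one per pharmacophoric group.
    """
    bins = set()
    for point_a, point_b in conformers:
        bins.add(int(math.dist(point_a, point_b) // bin_width))
    return bins


def allowed_distance_bins(molecules, bin_width=0.5):
    """Intersect the distance bins of all molecules in the data set."""
    allowed = None
    for conformers in molecules:
        bins = distance_bins(conformers, bin_width)
        allowed = bins if allowed is None else allowed & bins
    return sorted(allowed)


# Hypothetical conformers: positions of two equivalent groups per geometry.
molecule_1 = [((0, 0, 0), (4.2, 0, 0)), ((0, 0, 0), (5.1, 0, 0)),
              ((0, 0, 0), (6.3, 0, 0))]
molecule_2 = [((0, 0, 0), (5.3, 0, 0)), ((0, 0, 0), (6.1, 0, 0)),
              ((0, 0, 0), (7.4, 0, 0))]

for b in allowed_distance_bins([molecule_1, molecule_2]):
    print(f"{b * 0.5:.1f}-{(b + 1) * 0.5:.1f} angstrom")
```

Each further molecule can only keep or shrink the set of allowed distance ranges, so rigid analogues are the most informative members of a data set.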
17.6 Molecular Recognition Properties and the Similarity of Molecules
It is fair to ask whether the concepts presented in the previous
sections to represent the properties of molecules were really considered
appropriately in the attempted comparisons. Deciding which functional groups belong to the
individual “teeth” of a pharmacophore is not easy. Analogous functional groups
must be oriented in a similar spatial direction in all molecules. In the case of the ACE
inhibitors (Fig. 17.4) conflict occurs already during the assignment of the functional
groups. Some analogues carry two carboxylate groups, which must be unambigu-
ously assigned to the pharmacophore prior to comparison with other inhibitors.
The binding of low molecular weight ligands to a protein is a mutual, targeted
recognition process. Both partners must fit together so that a strong interaction can
be formed. Parts of the ligand that have complementary recognition properties
determine the binding to the receptor. The term “recognition properties” refers to all
qualities that contribute to the specific interaction between molecules. Until now,
only properties and similarities have been considered that could be directly read
from the molecular scaffold. But is that sufficient? How would the world look if we
recognized ourselves only by our “scaffolds,” that is, only by the skeletons? Male
and female could not even be differentiated straightaway on these grounds! All of
the allure of interpersonal relationships, which rests on personal appearance and
charisma, would be lost. Until now, molecules have been considered on the grounds
of their “skeleton”. Why should ligand–receptor interactions be described at this
level? Even molecules recognize one another by the properties of their shapes and
surfaces exposed to their immediate vicinity to form contacts. The following
example should clarify this point. Methotrexate 17.6 (MTX) and dihydrofolate
17.7 (DHF) bind to the enzyme dihydrofolate reductase (Fig. 17.6 and ▶ Sect.
27.2). The side chains of both molecules are nearly identical, but the heterocycles
are different. It is known from NMR spectroscopic investigations that the proton-
ated form of MTX binds to the protein. When considering the chemical formulae, it
is tempting to overlay the two heterocycles directly upon one another. Good
scaffold equivalence is achieved, and the heteroatoms in both molecules fall on
top of one another. The receptor, however, does not care about the apparent
equivalence of molecular skeletons. The interaction with the molecular surface is
much more important. Polar molecules such as MTX or DHF are bound to the
protein through hydrogen bonds. The arrows in Fig. 17.6 characterize the H-bond
donor and acceptor groups. The arrows point toward the molecule where an
acceptor property is exposed, and away from it in the case of donor groups. At the start, the
molecules are oriented in space so that they correspond in terms of a direct atom–
atom matching. For the moment, the basic molecular skeleton should be ignored,
and only the distribution of H-bond donor and acceptor groups is considered. The
equivalence achieved is not very convincing. Another variant is taken into consid-
eration in which the heterocycle of DHF is flipped over along the bond between the
heterocycle and the side chain. The spatial overlap of both molecules is no longer
optimal, but the pattern of exposed donor and acceptor groups for both molecules
shows much better agreement (Fig. 17.6). If transformed into another conformation,
the molecule now has entirely different molecular recognition properties. This
difference can hardly be read from chemical formulae, even by a trained eye in
cases such as this one.
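The comparison of donor and acceptor patterns before and after the flip can be made concrete with a toy calculation. The sketch below uses invented 2D coordinates, not the actual geometries of MTX or DHF: each heterocycle is reduced to a list of labeled points ("D" for donor, "A" for acceptor), the bond to the side chain is assumed to lie along the x-axis, and the flip is modeled as a 180° rotation about that axis.

```python
from math import dist

def flip_about_x(features):
    """Rotate by 180 degrees about the x-axis; in a planar model y becomes -y."""
    return [(x, -y, kind) for x, y, kind in features]

def matched_features(ref, other, tol=0.5):
    """Count features in ref with a same-type partner in other within tol."""
    count = 0
    for x, y, kind in ref:
        if any(k == kind and dist((x, y), (u, v)) <= tol for u, v, k in other):
            count += 1
    return count

# Hypothetical donor/acceptor positions around two ring systems.
ring_a = [(1.0, 2.0, "D"), (2.5, 2.0, "D"), (1.0, -2.0, "A"), (2.5, -2.0, "D")]
ring_b = [(1.0, 2.0, "A"), (2.5, 2.0, "D"), (1.0, -2.0, "D"), (2.5, -2.0, "D")]

direct = matched_features(ring_a, ring_b)                  # atom-on-atom overlay
flipped = matched_features(ring_a, flip_about_x(ring_b))   # ring_b flipped over
```

In this constructed example the direct overlay matches only two of the four features, whereas the flipped orientation matches all four, even though the point positions coincide equally well in both cases.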
Models are nice, but are they also correct? Here only an experiment can provide an
answer. Luckily, in the present case, crystal structures are available for both ligands in
complex with DHFR. The observed binding geometries are shown in Fig. 17.7. One
aspartate and two carbonyl groups in the main chain and two water molecules are
responsible for recognition in the binding pocket. The water molecules mediate the H-
bonds between ligand and protein. The experimentally determined binding geometries
show that the considerations about the similarity of the hydrogen-bond properties led to
the correct conclusions. The orientation of the two ligands in the binding pocket, which at
first glance appears surprising and “non-equivalent,” is thus easily explained. The properties
that are responsible for the mutual recognition process must be compared to one
another. Only these count in the comparison! It is notable that this experimental
confirmation of the above-described ideas came eight years after the working hypothesis
was proposed. This is a nice example of the predictive power of a model hypothesis.
Other properties, apart from hydrogen bonds, can serve as additional criteria to
define similarities in the molecular-recognition process. The electrostatic poten-
tial (▶ Chap. 15, “Molecular Modeling”) computed for the heterocyclic ring
systems of DHF and MTX (Fig. 17.7) suggests very similar conclusions. In addition
to the previously mentioned H-bonding properties and electrostatic potential, steric
space filling and the distribution of hydrophobic properties on the surface of both
ligands play an important role. When molecules are superimposed to predict their
putative geometries in the binding pocket, their conformational flexibility must also
be considered.
Fig. 17.6 Methotrexate 17.6 and dihydrofolate 17.7 are ligands of dihydrofolate reductase. The
side chain R (see ▶ Sect. 27.2, Fig. 27.9) is identical for both except for a methyl group on the
nitrogen atom. The heterocycles are different. (a) Intuitively, a superposition of both heterocycles
directly upon one another appears reasonable when comparing the structures. The heteroatoms
match one another pairwise. (b) Arrows are distributed around the molecules to compare the
hydrogen-bonding properties. They point toward the molecule when an acceptor is present and
away from it for donor groups. If the molecular skeletons are masked out and attention is
focused on the distribution of H-bond donor and acceptor groups, the atom–atom overlap obtained
via the direct superposition of the rings shows rather unconvincing equivalence. (c) If instead the
heterocycle in 17.7 is flipped about the bond between the heterocycle and the side chain R, the
pattern of donor and acceptor groups that is obtained exhibits convincing equivalence.
17.7 Automated Molecular Comparisons and Superpositioning
Is it possible to consider all of the properties that were mentioned in the last section
in a method to superimpose molecules for a relative comparison? For this,
a measure of similarity for all properties must be calculated. This measure must
be related to a spatial distance function. Subsequently, an optimization of the spatial
superposition can be performed. At the same time, the maximum similarity of the
chosen properties is sought. The program SEAL from Simon Kearsley and Graham
Smith determines the spatial similarity of different properties distributed over
the molecular scaffold. It simultaneously ranks the similarity with respect to the
overlap volume of the molecules that were determined during superposition. In
this way the superposition of MTX and DHF is correctly predicted according to
experiment. The conformational flexibility is also considered in this analysis.
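A similarity measure tied to a spatial distance function can be sketched as follows. The functional form, a sum of Gaussian-weighted products of atomic properties, is loosely modeled on the kind of function used in SEAL; it is not the SEAL implementation. The weights, the exponent `alpha`, and the use of partial charges and atomic radii as the compared properties are assumptions chosen for illustration, and the optimization over rotations, translations, and conformations is left out.

```python
from math import exp

def overlap_score(mol_a, mol_b, alpha=0.3, w_charge=1.0, w_steric=1.0):
    """Gaussian-weighted property overlap; more negative means more similar.

    Each molecule is a list of atoms (x, y, z, charge, radius).
    """
    score = 0.0
    for xa, ya, za, qa, ra in mol_a:
        for xb, yb, zb, qb, rb in mol_b:
            r2 = (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
            weight = w_charge * qa * qb + w_steric * ra * rb
            score -= weight * exp(-alpha * r2)
    return score

def translate(mol, dx, dy, dz):
    """Shift every atom of a molecule by the same vector."""
    return [(x + dx, y + dy, z + dz, q, r) for x, y, z, q, r in mol]

# Two hypothetical two-atom "molecules" with similar charge patterns.
mol_a = [(0.0, 0.0, 0.0, -0.5, 1.5), (1.5, 0.0, 0.0, 0.3, 1.7)]
mol_b = [(0.0, 0.0, 0.0, -0.4, 1.5), (1.5, 0.0, 0.0, 0.4, 1.7)]

aligned = overlap_score(mol_a, mol_b)
shifted = overlap_score(mol_a, translate(mol_b, 4.0, 0.0, 0.0))
```

Because the score depends smoothly on the relative position of the two molecules, a standard optimizer can search for the superposition that minimizes it. No predefined atom-to-atom equivalence is needed, since every atom pair contributes according to its distance.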
Fig. 17.7 Experimentally determined binding geometries of methotrexate (green carbon atoms)
and dihydrofolate (gray carbon atoms) in dihydrofolate reductase. The heterocycles of the ligands
are bound through H-bonds to the carboxylate or carbonyl group of an amino acid that is oriented
into the binding pocket. Two water molecules (red spheres) mediate additional H-bonds between
the ligands and the protein. The difference in the binding mode that is discussed in Fig. 17.6, is
clearly recognized. On the right-hand side the electrostatic potentials around methotrexate (top)
and dihydrofolate are shown. The molecules are found in a spatial orientation that was determined
by crystal structure analysis. Considered qualitatively, the electrostatic potentials of both mole-
cules in this orientation have very similar form.
a comparable effect on the receptor. There is a toy with which children try to push
differently shaped pieces through preformed holes into a box, a so-called “shape
sorter.” For each block form, be it a cube, cuboid, round cylinder, or elliptical
cylinder, there is one preformed hole that it fits. In similarity considerations there
is a tendency to group cube and cuboid, or round and elliptical cylinder, into related
categories because of their similar form. If an attempt is made to push these parts
through the holes of the shape sorter, it is easily discovered that the cuboid will
fit not only through the square hole but also, with a bit of force, through the hole for the
elliptical cylinder. The cube fits through the square hole and is only slightly too big
to fit through the hole for the circular cylinder as well. Therefore, are the cuboid and
the elliptical cylinder or the cube and the circular cylinder not more similar to one
another? The measure of similarity that is to be used for a molecule is calibrated
with respect to the receptor to which the molecule should fit. It is therefore always
a relative measure!
Thiorphan and retro-thiorphan (▶ Sect. 5.5, formulae 5.23 and 5.24) differ only
in the spatial sequence of the amide bond. They bind with almost identical affinity
to the zinc protease thermolysin and to NEP 24.11. Therefore, one would classify
them as very similar. The zinc protease ACE, however, binds thiorphan at least
100 times more strongly than retro-thiorphan (▶ Sect. 5.5, Fig. 5.10). Relative
to this enzyme, both substances must be called dissimilar. Another extreme is
seen in the oligopeptide-binding protein A (▶ Sect. 4.1). It binds every tri- to
pentapeptide comprising a central Lys—Xxx—Lys moiety with almost equal
affinity. In principle, only information about the shape of the binding site is
needed for a similarity analysis. Only then can the requirements be adequately
defined. However, the structure of the receptor is still not known in many drug-
design projects. Here there is no choice: it is only through hypothesis and its
experimental testing in gradual steps that the structural requirements of the
receptor can be approximated.
Fig. 17.8 Superposition of the steroid 17.2 and three inhibitors 17.3–17.5 according to a spatial
comparison of their molecular properties. In contrast to methods with “virtual” spring forces, this
method does not require a predefined equivalence of molecular groups. It is automatically
generated by the similarity comparison of many different conformations.
17.9 If Rigid Analogues are Lacking
In the last example a largely rigid reference compound was available. How should
one proceed when no such reference compound is known? Only experiment can
help here. Rigidized analogues must be synthesized and tested for biological
activity. If they still exhibit affinity to the receptor, it can be assumed that the active
conformation was frozen.
An example should demonstrate how the receptor-bound conformation can be
probed by synthesizing rigid model compounds. The calcium channel blocker
nifedipine 17.8 (▶ Sect. 2.5) contains multiple rotatable bonds (Fig. 17.9). It can
therefore adopt numerous conformations. Which orientation does the phenyl ring,
for instance, take relative to the dihydropyridine ring? This question was very
elegantly clarified by Wolfgang Seidel at Bayer through the synthesis and crystal
structure determination of cyclized derivatives 17.9. An additional lactone ring
changes the biological activity of the derivative depending on the ring size. In
compounds with a six-membered lactone the phenyl and dihydropyridine rings lie
virtually in the same plane. Conversely, the phenyl ring stands perpendicular to the
dihydropyridine ring in the derivative with the twelve-membered ring. The affinity
of this compound is about five orders of magnitude higher than that of the derivative
with the six-membered lactone. Therefore it must be assumed that nifedipine exerts
its effect in a conformation in which the phenyl and dihydropyridine rings are
perpendicular to one another.
After this question has been answered, more compounds can be designed.
A relevant superposition that corresponds to the conditions in the protein’s binding
pocket then becomes possible. Such superpositions have gained decisive importance in the
context of 3D structure–activity relationships. An example is shown in ▶ Sect. 29.4
of how the structural fixation of the biologically active conformation of a ligand can
support the design process.
[Plot not reproduced, presumably part of Fig. 17.9; the caption is missing in this copy. Horizontal axis: Ki (nM), logarithmic from 1 to 100,000; vertical axis with ticks from 0 to 40; data points labeled 7 to 12.]
It was described in Sect. 17.1 that a pharmacophore can also be derived from the
protein structure. The computer program GRID from Peter Goodford is a tool that
is often used for this purpose. It calculates favorable positions for functional groups
on a putative ligand in the protein’s binding pocket. These could be, for instance,
a carboxylate group, a hydroxyl group, or an aliphatic carbon atom. The potential
function implemented in GRID has been calibrated on numerous functional
groups from crystal structures of organic molecules. The result of a GRID calculation
is a set of interaction energies assigned to the intersections of a regularly spaced grid
that is inscribed into the binding pocket. The energies are graphically displayed, for
instance, by contouring the spatial area at which the interaction energy reaches or
exceeds a certain predefined threshold. They indicate hot spots for the placement of
functional groups of a potential ligand. The areas in which the interactions with an
aromatic carbon atom or a hydroxyl oxygen atom are favorable are shown for the
enzyme thermolysin in Fig. 17.10. Such calculations are carried out with a set of
different probes, for instance, a water molecule, an aromatic carbon, a hydrogen-bond
acceptor or donor, or a positively or negatively charged group. The results provide
valuable information about the shape and electrostatic properties of the binding pocket.
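The idea of scanning a probe over a regularly spaced grid can be illustrated with a minimal sketch. It does not reproduce GRID's calibrated potential: a generic Lennard-Jones term plus a Coulomb term with a crude constant dielectric is assumed instead, and all atom and probe parameters are placeholders.

```python
from math import sqrt

def probe_energy(point, atoms, probe_charge=0.0, probe_eps=0.2, probe_rmin=3.4):
    """Lennard-Jones plus Coulomb energy (kcal/mol) of a probe at one point.

    atoms: list of (x, y, z, charge, eps, rmin); illustrative parameters only.
    """
    energy = 0.0
    for x, y, z, q, eps, rmin in atoms:
        r = sqrt((point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2)
        r = max(r, 0.5)  # avoid the singularity on top of an atom
        e_ij = sqrt(eps * probe_eps)
        r_ij = 0.5 * (rmin + probe_rmin)
        lj = e_ij * ((r_ij / r) ** 12 - 2.0 * (r_ij / r) ** 6)
        coulomb = 332.0 * q * probe_charge / (4.0 * r)  # dielectric constant of 4
        energy += lj + coulomb
    return energy

def grid_map(atoms, origin, n, spacing, **probe):
    """Evaluate the probe energy on an n x n x n grid; keys are grid indices."""
    ox, oy, oz = origin
    return {
        (i, j, k): probe_energy(
            (ox + i * spacing, oy + j * spacing, oz + k * spacing), atoms, **probe)
        for i in range(n) for j in range(n) for k in range(n)
    }

def hot_spots(energies, threshold):
    """Grid points whose interaction energy reaches or falls below the threshold."""
    return sorted(p for p, e in energies.items() if e <= threshold)

# Toy "pocket": one negatively charged oxygen-like atom at the origin,
# scanned with a positively charged probe on a 9 x 9 x 9 grid (1 Å spacing).
pocket = [(0.0, 0.0, 0.0, -0.8, 0.2, 3.4)]
energies = grid_map(pocket, origin=(-4.0, -4.0, -4.0), n=9, spacing=1.0,
                    probe_charge=1.0)
best = min(energies, key=energies.get)
favorable = hot_spots(energies, threshold=-20.0)
```

Repeating the scan with different probe parameters (charge, size, well depth) corresponds to the use of different probes described above. Contouring the `energies` dictionary at a chosen threshold yields the hot-spot regions.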
Another way of analyzing protein structures is based on the idea that the physical
nature of non-bonding interactions in protein–ligand complexes and in the crystal
packing of small organic molecules is identical. The latter are particularly inter-
esting for this purpose because the crystal structures of small organic molecules are
regularly determined with great precision. There are over 500,000 crystal structures
stored in the Cambridge Database (▶ Sect. 13.9). This collection is ideal to obtain
relevant and reliable data via a statistical analysis for ligand-design purposes
(▶ Sect. 14.7). Let us assume that there is a carboxylate group –COO⁻ on the
protein that protrudes into the binding pocket. Where must a partner group be
positioned to form a favorable interaction? To answer this question, the Cambridge
Database was searched first for compounds with carboxylate groups, and then for
each of the retrieved groups, the position of the counter group that forms an H-bond
to the carboxylate was saved. Finally, all of the H-bonds found were
superimposed so that the carboxylate groups of all examples lie
exactly on top of one another. The distribution of H-bond-donor groups (Fig. 17.11)
offers a valuable picture of the allowed area of the H-bond geometry. Subsequently,
such a distribution can be superimposed onto the protein structure by matching with
the carboxylate group of the protein. Areas in which the distribution overlaps with
other atoms of the protein are discarded. In this way the energetically most
favorable areas for a counter group in the binding pocket are found. In Fig. 17.12
these distributions are compared with a protein–ligand complex. As expected, the
hydrogen-bond geometries found in the complex coincide nicely with the range that
was found in the crystal packings of organic molecules. A system of rules for non-
bonding interactions in protein–ligand complexes was obtained from the statisti-
cal evaluations of all groups that are found in proteins. These rules are compiled at
17.10 The Protein Defines the Pharmacophore
[Fig. 17.10, graphic not reproduced. Residues labeled in the figure: Phe114, Asn112, Zn2+, Arg203. Probe molecules shown: water, acetonitrile, acetone, isopropanol, phenol, benzylsuccinic acid.]
Fig. 17.10 An analysis of the binding pocket of thermolysin. Areas of favorable interactions were
calculated for an aromatic carbon probe (white) and a hydroxyl oxygen atom (red). Also shown are
the fragments mentioned in Fig. 7.8, whose positions were determined by allowing the probe
molecules to diffuse into the protein crystals. The calculated hot spots correspond well with the
positions that were crystallographically determined with molecular probes.
Fig. 17.11 Hydrogen-bonding geometries (carbon is green, oxygen is red, and hydrogen is white)
around a carboxylate group (a), ester group (b), carbonyl group (c), and ether group (d). Structures
with these central groups that form hydrogen bonds with OH donor groups were extracted from the
Cambridge database. These examples were superimposed based on the geometry of the central
group. It is obvious that there is considerable variability in the interaction geometry, but also that
preferred orientations are to be found. It is also shown that, for instance, the interaction pattern
around an ester group (b) is not simply a superimposition of the distribution around a carbonyl
group (c) and an ether group (d).
state, an energy function can be calculated from it. In this function it is assumed that
contacts that occur more frequently than the average distribution are energetically
favorable. If they occur rarely, they are classified as unfavorable. These statistical
potentials have been integrated into the scoring function DrugScore. They can also
be used for the analysis of binding pockets and help to indicate hot spots for
ligand binding.
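The conversion of contact frequencies into an energy-like score can be sketched with an inverse-Boltzmann-type expression. This is a generic illustration of the principle stated above, not the DrugScore formulation: the reference distribution, the pseudocount, and the contact counts are hypothetical, and `kT` is set to roughly 0.593 kcal/mol (room temperature).

```python
from math import log

def statistical_potential(observed, reference, kT=0.593, pseudocount=1.0):
    """Score per distance bin: -kT * ln(p_observed / p_reference)."""
    n_obs = sum(observed) + pseudocount * len(observed)
    n_ref = sum(reference) + pseudocount * len(reference)
    potential = []
    for obs, ref in zip(observed, reference):
        p_obs = (obs + pseudocount) / n_obs
        p_ref = (ref + pseudocount) / n_ref
        potential.append(-kT * log(p_obs / p_ref))
    return potential

# Hypothetical contact counts for one atom-pair type in five distance bins.
observed = [2, 40, 25, 10, 8]
reference = [17, 17, 17, 17, 17]   # assumed average distribution
scores = statistical_potential(observed, reference)
```

Bins that are populated more often than the reference receive negative (favorable) scores, and rarely populated bins receive positive (unfavorable) ones. The pseudocount merely keeps empty bins from producing an infinite score.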
The MCSS method was developed in the group of Martin Karplus. In this method, several
thousand probe molecules such as acetone, water, methanol, or benzene
are placed at random in a binding pocket. A computer simulation is started in
which the individual probe molecules are moved into optimal positions. They are driven
by a calculation according to the underlying force field. The probe molecules
experience the interaction with the protein, but they do not “see” one another. At
the end of the calculation a frequency distribution for the probe molecules is
obtained. Evaluating this distribution highlights hot spots for interactions with the
protein. If the hot spots obtained in this way are compiled into a composite
picture, a protein-based pharmacophore is obtained.
Fig. 17.12 The distributions of H-bond-donor groups (carbon is white, oxygen is red, and nitrogen
is blue) around a carboxylate group or a carbonyl group are superimposed with the 3D structure of
the complex of methotrexate with dihydrofolate reductase (Fig. 17.7). The distributions are
imposed onto the acid group of Asp26 and the carbonyl groups of Leu4 and Ala97. The hydrogen
bonds formed between protein and ligand coincide geometrically with the ranges often found in
the crystal structures of small organic molecules. (Residues labeled in the figure: Ala97, Leu4,
Asp26, Tyr155.)
17.11 The Search for Pharmacophore Patterns in Databases
nitrogen atom, then numerous hits will be found. However, it is important which
relative spatial distances are given between these groups. Such information is not
taken into account in searching a 2D database. Matthias Rarey and Scott Dixon
developed the Feature-Trees method, which can screen large databases according to
topological criteria. However, the connectivities of the chemical formulae are not
compared. Rather, the database entries are initially classified by the topological
sequences of particular characteristics, for instance, the presence of an H-bond-
donor group or a hydrophobic cyclic molecular portion. Such a method can
compare molecules and find candidates that have pharmacophore properties in
a comparable topological sequence extremely quickly.
Databases that contain 3D molecular geometries allow the search for the spatial
pattern of the pharmacophore. For example, the Cambridge Database of crystal
structures of small organic molecules (▶ Sect. 13.9) can be used for such a search.
Molecules are found with experimental geometries that satisfy the pharmacophore.
In the search for ligands for HIV protease (▶ Sect. 24.3) a pharmacophore pattern
was derived from the known crystal structure of the enzyme, and the Cambridge
Database was searched for molecules that match this pattern. The result of this
search is presented in ▶ Sect. 24.4 (Fig. 24.16) in detail. It inspired the researchers
at Dupont–Merck with the first ideas that led to the development of an entirely new
class of non-peptidic HIV-protease inhibitors.
These days databases containing 3D structures of molecules generated from 2D
structural formulae are commonly used alongside experimental structural databases.
In other approaches, the molecule’s spatial structure is generated on the fly during the
search (▶ Sect. 15.2). Here, as with most entries in the Cambridge Database, each
molecule is present in only one conformation. Molecules can, however, adopt many
different conformations (▶ Chap. 16, “Conformational Analysis”). It is therefore
usually the exception that a flexible molecule exists in the “right” conformation
required for the search, and conformational flexibility must be considered
during the search. An elaborate search, for example, with the active-analogue approach,
would demand too much computational time. For this reason, fast algorithms have been
developed to figure out whether particular pharmacophoric groups on the molecules
could fall within predefined distances. It is enough to estimate the minimum and
maximum achievable distances. This concept has been realized, for example, in the program
UNITY from the company Tripos. Alternatively, one can start from a database holding multiple
precalculated conformers. Here it is critical that the stored conformers are distributed
as representatively as possible throughout the conformational space (▶ Sect. 16.6).
The single conformers are then checked to see whether they fit the defined
pharmacophore. This concept is followed by the program Catalyst from the company
Accelrys.
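The second strategy, checking stored conformers against a distance-based pharmacophore query, can be sketched as follows. This is a minimal illustration and not how UNITY or Catalyst work internally: the database layout, the feature names, the coordinates, and the tolerance are all hypothetical, and a purely distance-based test cannot distinguish mirror-image arrangements.

```python
from itertools import combinations
from math import dist

def conformer_matches(points, query, tol=0.5):
    """True if every pairwise feature distance agrees with the query within tol (Å).

    points: dict mapping a feature name to its (x, y, z) position.
    query:  dict mapping an alphabetically sorted pair of feature names
            to the required distance.
    """
    return all(
        abs(dist(points[a], points[b]) - query[(a, b)]) <= tol
        for a, b in combinations(sorted(points), 2)
    )

def screen(database, query, tol=0.5):
    """Names of molecules with at least one stored conformer matching the query."""
    return [name for name, conformers in database.items()
            if any(conformer_matches(c, query, tol) for c in conformers)]

query = {("acceptor", "donor"): 5.0, ("acceptor", "ring"): 4.0, ("donor", "ring"): 6.5}

database = {
    "mol_1": [
        {"acceptor": (0.0, 0.0, 0.0), "donor": (5.1, 0.0, 0.0), "ring": (0.0, 4.2, 0.0)},
    ],
    "mol_2": [
        {"acceptor": (0.0, 0.0, 0.0), "donor": (3.0, 0.0, 0.0), "ring": (0.0, 4.0, 0.0)},
        {"acceptor": (0.0, 0.0, 0.0), "donor": (7.5, 0.0, 0.0), "ring": (0.0, 4.0, 0.0)},
    ],
    "mol_3": [
        {"acceptor": (0.0, 0.0, 0.0), "donor": (2.0, 0.0, 0.0), "ring": (0.0, 4.0, 0.0)},
        {"acceptor": (0.0, 0.0, 0.0), "donor": (4.8, 0.0, 0.0), "ring": (-0.8, 3.9, 0.0)},
    ],
}
hits = screen(database, query)
```

Here `mol_3` is found only through its second conformer, which illustrates why the stored conformers need to cover the conformational space representatively.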
It is not to be expected that such database searches directly deliver candidates for
clinical trials. As an idea generator, however, they can guide the drug researcher to
novel lead structures and can drive synthetic plans down entirely different path-
ways. Today database searches are carried out on a large scale during the course of
virtual screenings (▶ Sect. 7.6). For this, proprietary compound libraries are
screened, or collections of commercially available compounds are searched.
John Irwin and Brian Shoichet at UCSF in San Francisco took the initiative
with the database ZINC, which collects currently commercially available compounds
and makes the collection available for database searches. Preset filters help to sieve out
the desired subsets for the search at hand from the millions of compounds in the
databases. As a major advantage, the found hits can be purchased and experimentally
tested in an assay. Many candidates for new lead structures have already been
discovered by using this “lead discovery by shopping” strategy (see ▶ Sect. 21.7).
17.12 Synopsis
• The structure of the binding pocket determines which functional groups are
necessary on the ligand side for successful protein binding. Either the ligand
or the protein structure can be used as the starting point from which
a pharmacophore is derived.
• The superposition of active and inactive small molecule ligands from a series of
related compounds upon one another can be used to define the allowed and
forbidden areas in a hypothetical binding pocket. Logical operations on these volumes,
such as volume differences, give guidance for the design of optimized ligands.
• Flexible molecules that can adopt different conformations present a special
challenge in superpositions. The molecules must be energy-minimized as part
of the superposition procedure or, alternatively, multiple conformations must be
evaluated.
• Alternatively, a set of molecules can be superimposed by assigning
pharmacophoric groups, and through systematic rotations about all open-
chain single bonds a common alignment is found in the active-analogue
approach.
• Care must be taken not to be deceived by molecules that look similar with
respect to their chemical formulae. It is the interacting functional groups, not the
scaffold itself, that are important for molecular recognition at the binding pocket.
The role of water in the binding must not be underestimated.
• Molecular recognition properties can also be considered to mutually superim-
pose molecules.
• The synthesis of a structurally rigid analogue (or analogues) can help to define
and validate the pharmacophore assignment and the determination of the bio-
logically active conformation.
• Binding “hot spots” can be found by mapping the binding pocket of the
protein with small molecules and probes with different properties. These
give some ideas as to what sort of molecule might bind successfully to the
target protein.
• The Cambridge Database of crystal structures provides valuable insights into
preferred interaction geometries and motifs. Such information is of high rele-
vance for protein–ligand complexes because the forces that are responsible for
crystal packing are the same as for non-bonding interactions between active
substances and proteins.
Bibliography
General Literature
Klebe G (1993) Structural alignment of molecules. In: Kubinyi H (ed) 3D-QSAR in drug design,
Theory, methods and application. ESCOM, Leiden, pp 173–199
Langer T, Hoffmann RD (2006) Methods and principles in medicinal chemistry. In: Mannhold R,
Kubinyi H, Folkers G (eds) Pharmacophores and pharmacophore searches, vol 32.
Wiley-VCH, Weinheim
Marshall GR (1989) Computer-aided drug design. In: Richards WG (ed) Computer-aided
molecular design. IBC Technical Services, London, pp 91–104
Special Literature
Bolin JT, Filman DJ, Matthews DA, Hamlin RC, Kraut J (1982) Crystal structure of Escherichia
coli and Lactobacillus casei dihydrofolate reductase refined at 1.7 Å resolution. J Biol Chem
257(13):13650–13662
Kearsley SK, Smith GM (1990) An alternative method for the alignment of molecular structures:
maximizing electrostatic and steric overlap. Tetrahedron Comput Methodol 3:615–633
Klebe G, Mietzner T, Weber F (1994) Different approaches toward an automatic structural
alignment of drug molecules: applications to sterol mimics, thrombin and thermolysin inhib-
itors. J Comput-Aided Mol Des 8:751–778
Klunk WE, Kalman BL, Ferrendelli JA, Covey DF (1983) Computer-assisted modeling of the
picrotoxinin and γ-butyrolactone receptor site. Mol Pharmacol 23:511–518
Kuster DJ, Marshall GR (2005) Validated ligand mapping of ACE active site. J Comput-Aided
Mol Des 19:609–615
Mackay MF, Sadek M (1983) The crystal and molecular structure of picrotoxinin. Aust J Chem
36:2111–2117
Marshall GR, Barry CD, Bossard HE, Dammkoehler RA, Dunn DA (1979) The conformational
parameter in drug design: the active analog approach. In: Olson EC, Christoffersen RE (eds)
Computer-assisted drug design, vol 112, ACS symposium series. American Chemical Society,
Washington, DC, pp 205–226
Martin YC (1992) 3D database searching in drug design. J Med Chem 35:2145–2154
Mayer D, Naylor CB, Motoc I, Marshall GR (1987) A unique geometry of the active site of
angiotensin-converting enzyme consistent with structure-activity studies. J Comput-Aided Mol
Des 1:3–16
Seidel W, Meyer H, Born L, Kazda S, Dompert W (1984) Rigid calcium antagonists of the
Nifedipine-type: geometric requirements for the dihydropyridine receptor. In: Seydel JK (ed)
QSAR as strategies in the design of bioactive compounds. VCH, Weinheim, pp 366–369
18 Quantitative Structure–Activity Relationships
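[Fig. 18.1, scheme not reproduced: a tertiary amine R1R2R3N, protonated at pH < 9 and unprotonated at pH > 9 (left); quaternization, e.g., with CH3I, to R1R2R3N+–CH3 (right).]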
Fig. 18.1 The protonation of a tertiary amine depends on the pH value of the medium (left). On
the other hand, the quaternization of a nitrogen atom leads to a permanently positively charged
compound (right).
The South American dart poison tubocurare (▶ Sect. 7.1) was the first therapeutic
principle for which the exact mode of action was elucidated. In 1852, Claude
Bernard recognized that this quaternary alkaloid causes muscle paralysis, but that
the nerve as well as the muscle remain independently excitable. Curare must
therefore act on the coupling between nerve and muscle. Scottish pharmacologists
Alexander Crum-Brown and Thomas Fraser occupied themselves somewhat more
exhaustively with the question of whether the quaternization of the nitrogen atom of
different alkaloids (Fig. 18.1) has an influence on their biological effects. In 1868,
from the entirely different effects observed before and after the transformation of the
alkaloids, they formulated a general equation to describe structure–activity
relationships (Eq. 18.1).
$$\Phi = f(C) \tag{18.1}$$
This equation is ingeniously simple, but it says only that Φ (Greek letter Phi), the
biological activity, is a function of C, the chemical structure. At that time, the
tetrahedral structure of the carbon atom had not been clarified, and the constitution
of many organic compounds, above all complex natural products, was entirely
unknown.
Around the turn of the twentieth century, pharmacologist Hans Horst Meyer and
botanist Charles Ernest Overton founded the lipid theory of anesthesia indepen-
dently, which unifies three important statements:
• All chemically unreactive substances that are lipophilic and can be distributed in
biological systems have anesthetic effects.
• The biological effect occurs in nerve cells because fat plays an important role in
their function.
• The relative potency of anesthetics depends on their partition coefficient
(▶ Sect. 19.2) in a mixture of fat and water.
The work of Crum-Brown, Fraser, and Richet, or the contribution of Meyer and
Overton can be seen as the origin of quantitative structure–activity relationships. In
fact after the formulation of the anesthesia theory, numerous other linear, and later
non-linear, dependencies on the lipophilicity, the “fat affinity” of active substances,
were found. But all of these activities were relatively unspecific “membrane”
effects.
In the middle of the 1930s Louis P. Hammett formulated a relationship between
the electronic properties of the substituents and the reactivity of aromatic
compounds. Accordingly, the relative contributions of electron-withdrawing and
electron-donating substituents to the electron density of the aromatic ring are always
constant. They are determined by the electronic parameter of the substituent, the
Hammett constant, σ. Electron-accepting substituents with positive σ values are,
among others, the nitro group, the cyano group, and the halogens.
Electron-donating substituents with negative σ values are hydroxyl and amino groups, the
methoxy group, and alkyl substituents. Acceptor substituents enhance the acidity of
benzoic acids and phenols, they reduce the basicity of anilines, and they accelerate
the basic hydrolysis of benzoic acid esters. Electron-donating substituents exert the
opposite influence.
However, an individual reaction constant ρ must be applied for each reaction
type of aromatic compounds. By using Eq. 18.2, later generally called the Hammett
equation, the equilibrium constant K for an arbitrary reaction can be calculated from
ρ and σ. R–X and R–H represent the relevant aromatic compounds substituted with
the group X, or unsubstituted, respectively.
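Eq. 18.2 itself is missing from this copy of the text. As a hedged reconstruction, the Hammett equation in its standard textbook form, written with σ for the substituent constant, ρ for the reaction constant, and K for the equilibrium constants of the substituted (R–X) and unsubstituted (R–H) compounds, reads:

```latex
\log \frac{K_{\mathrm{R\text{-}X}}}{K_{\mathrm{R\text{-}H}}} = \rho \, \sigma
```

The σ scale is fixed by setting ρ = 1 for the ionization of benzoic acids in water at 25 °C. With σ ≈ 0.78 for a para-nitro group, for example, p-nitrobenzoic acid is predicted to be about 0.78 pKa units more acidic than benzoic acid.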
Acceptor and donor substituents influence the electron density on the heteroatoms
and reduce or increase the ability to form hydrogen bonds. This, among other things,
explains the electronic influence of aromatic substituents on the biological activity of
drug molecules. The Hammett equation was therefore seen as a challenge to
pharmaceutical chemists and biologists to derive quantitative structure–activity
relationships from this concept. Many groups have made efforts to find relationships between
biological activity and the Hammett constants σ, or to derive σ- and/or ρ-analogous
substituent and test parameters for biological systems. Despite individually
interesting results, no generally valid concept could be established.
374 18 Quantitative Structure–Activity Relationships
It was Corwin Hansch and Toshio Fujita who in 1964 published a work that
established the fundamentals for quantitative structure–activity relationships. In
this, they describe:
• The definition of a lipophilicity parameter π, analogous to the electronic term σ
in the Hammett equation.
• The combination of different parameters in a model.
• The formulation of a parabolic model for the description of non-linear
lipophilicity–activity relationships.
n-Octanol was chosen for theoretical and practical reasons. It has a long aliphatic
chain and a hydroxyl group that is an H-bond donor as well as an acceptor. Its
structure therefore resembles the membrane lipids to some extent. It dissolves a
large number of organic compounds, it has a low vapor pressure, but can nonetheless
be easily removed. Its UV transparency over an extremely wide range is
particularly advantageous.
With the help of the lipophilicity parameter π, the log P values of new compounds,
and therefore their lipophilicity, can be calculated. For this the lipophilicity
of the basic scaffold and the π values of the substituents must be known. In this way
the biological activity can be correlated without the tedious experimental measurement
of each individual partition coefficient. In addition to the π values of all
important substituents, a very large number of experimentally determined octanol/
water partition coefficients are available in the literature.
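The additivity described here can be sketched in a few lines. The example assumes the usual definition π(X) = log P(R–X) − log P(R–H); the benzene log P and the π values in the dictionary are commonly tabulated literature numbers used purely for illustration, and the function name is hypothetical.

```python
"""Estimate log P from a parent scaffold and substituent pi values (additivity sketch)."""

# Illustrative octanol/water values for benzene derivatives (assumed literature
# numbers, not data from the text).
LOG_P_BENZENE = 2.13
PI_VALUES = {"H": 0.00, "CH3": 0.56, "Cl": 0.71, "NO2": -0.28, "OH": -0.67}


def estimate_log_p(log_p_parent, substituents):
    """Return log P of the parent plus the sum of the pi values of all substituents."""
    return log_p_parent + sum(PI_VALUES[name] for name in substituents)


if __name__ == "__main__":
    for substituents in (["Cl"], ["CH3"], ["Cl", "CH3"]):
        print(substituents, round(estimate_log_p(LOG_P_BENZENE, substituents), 2))
```

Simple summation of this kind ignores interactions between neighboring substituents, so the estimate is best read as a first approximation.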
In 1964 Corwin Hansch and Toshio Fujita derived, more intuitively than
theoretically, a mathematical model that can quantitatively describe structure–activity
relationships: the Hansch analysis (Eq. 18.4).

$$\log \frac{1}{C} = k_1 (\log P)^2 + k_2 \log P + k_3 \sigma + \dots + k \qquad (18.4)$$
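For the parabolic part of Eq. 18.4, the lipophilicity optimum follows from elementary calculus. The short derivation below is added as an illustration; it assumes that k1 is negative, so that the parabola has a maximum, and that the remaining terms do not depend on log P.

```latex
\frac{\partial}{\partial \log P}\left(\log \frac{1}{C}\right)
  = 2 k_1 \log P + k_2 = 0
\quad\Longrightarrow\quad
\log P_{\mathrm{opt}} = -\frac{k_2}{2 k_1}
```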
Table 18.1 The biological activity of meta- and para-substituted phenethylamines 18.1 (i.v.
application in the rat; C in mol/kg rat)

[Structure 18.1: N,N-dimethyl-β-bromophenethylamine hydrochloride (x HCl) carrying the
substituent X in the meta position and Y in the para position]

meta (X)   para (Y)   log 1/C
H          H          7.46
H          F          8.16
H          Cl         8.68
H          Br         8.89
H          I          9.25
H          Me         9.30
F          H          7.52
Cl         H          8.16
Br         H          8.30
I          H          8.40
Me         H          8.46
Cl         F          8.19
Br         F          8.57
Me         F          8.82
Cl         Cl         8.89
Br         Cl         8.92
Me         Cl         8.96
Cl         Br         9.00
Br         Br         9.35
Me         Br         9.22
Me         Me         9.30
Br         Me         9.52
between the measured biological data and the values that were calculated from the
model. The sum must show the smallest possible value over all of the investigated
compounds. It represents an important criterion for the judgment of the quality of
a model, or for the comparison of different models with different qualities.
The quantitative structure–activity relationship of the antiadrenergic effect of
N,N-dimethyl-β-bromophenethylamines 18.1 (Table 18.1) is considered as an
example. Depending on their structure, these compounds reverse the agonistic
effect of an adrenaline dose to a greater or lesser extent. The value C is the dose of
an antagonist that blocks the adrenaline effect by 50%. The data can be described
with the Hansch model, which is illustrated in Fig. 18.2.
The description of the entire data set is possible with a mathematical model by
using the derived equations. A carbocation is formed upon cleavage of bromine,
and the substances bind irreversibly to the adrenergic receptor. Accordingly, the σ⁺
term is found in the Hansch equation (Fig. 18.2), which describes such reaction
types particularly well. Lipophilic substituents increase the biological activity
(positive π term) and electron-withdrawing substituents decrease it (negative σ⁺
term). Therefore lipophilic electron-donating substituents, for example, large alkyl
substituents, should be optimal for the activity. Second, within certain limits, the
effect of further compounds can be predicted. Interpolations, that is, conclusions
that are drawn based upon very similar substituents, have a better reliability than
extrapolations, which are predictions made outside of the parameter space, for
instance, for considerably more lipophilic, more polar, or larger substituents. As
a first approximation, it can be said of the statistical parameters r, s, and F
(Fig. 18.2) that the correlation coefficient r should have values that are close to
1.00, the standard deviation s should be as small as possible, and the F value
should be as large as possible. The better these criteria are fulfilled, the better the
quantitative model will be; in other words, the better the experimental and
calculated values agree with one another.

[Figure: annotated QSAR equation. Labels on the terms: "the logarithm of the reciprocal
value gives the correct scaling", "lipophilicity parameter", "electronic parameter",
"constant term". Statistics: n = 22; r = 0.945; s = 0.196; F = 78.6]
Fig. 18.2 A QSAR equation delivers individual parameters for a quantitative model for the
prediction of biological activity, in this case from substituted N,N-dimethyl-β-
bromophenethylamines (Table 18.1).
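The three statistical criteria r, s, and F can be computed from observed and calculated activities as sketched below. The formulas are the usual ones for a least-squares model with an intercept; the function names and the tiny data set in the demonstration are hypothetical and serve only to show the arithmetic.

```python
import math


def f_from_r(r, n_compounds, n_variables):
    """Fisher F value from r, the number of compounds, and the number of variables."""
    explained = r ** 2 / n_variables
    unexplained = (1.0 - r ** 2) / (n_compounds - n_variables - 1)
    return explained / unexplained


def regression_statistics(observed, calculated, n_variables):
    """Return (r, s, F) for a least-squares model with an intercept.

    n_variables counts the fitted descriptor terms, not the constant term.
    """
    n = len(observed)
    mean_observed = sum(observed) / n
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    ss_residual = sum((y - c) ** 2 for y, c in zip(observed, calculated))
    r = math.sqrt(max(1.0 - ss_residual / ss_total, 0.0))
    s = math.sqrt(ss_residual / (n - n_variables - 1))
    return r, s, f_from_r(r, n, n_variables)


if __name__ == "__main__":
    # Hypothetical observed/calculated values, one descriptor
    print(regression_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9], 1))
    # Statistics quoted in Fig. 18.2: n = 22, r = 0.945, two descriptor terms
    print(f_from_r(0.945, 22, 2))
```

With the rounded r = 0.945 from Fig. 18.2, `f_from_r` returns roughly 79, which agrees with the reported F = 78.6 within the rounding of r.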
Also in 1964 and independently of Hansch and Fujita, S. R. Free and J. W. Wilson
developed an entirely different model for structure–activity analysis. Because the
original approach is confusingly formulated and awkward to use, here only a variant
shall be discussed that was later proposed by Fujita and T. Ban, the Free–Wilson
analysis. The Free–Wilson analysis assumes that within a set of chemically related
[Figure: an active substance whose basic scaffold carries the substituents X1, X2, …, Xn
with the group contributions a1, a2, …, an. Free–Wilson model:
$\log \frac{1}{C} = \sum_i a_i + \mu$]
Fig. 18.3 The Free–Wilson analysis uses the additive nature of the group contributions to
describe the biological activity. Accordingly, the biological activity in the displayed equation is
made up of the activity of the basic scaffold, μ, and the constant group contributions ai of the
substituents Xi.
F to Cl and Br to I, that is, the influence of the lipophilicity, is obvious. Despite having
almost the same lipophilicity, the methyl and chloro substituents are different. This is
explained by their different electronic properties. Differences in the meta and para
position on the electronic influence can also be followed. Therefore the Free–Wilson
analysis indeed has advantages for the analysis of substituent effects.
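As an illustration of the additive model in Fig. 18.3, the following sketch fits group contributions to the data of Table 18.1 by ordinary least squares. It assumes the Fujita–Ban convention that hydrogen is the reference substituent with a contribution of zero, and it uses NumPy; the function and variable names are hypothetical.

```python
import numpy as np

# Table 18.1: (meta substituent X, para substituent Y, log 1/C)
TABLE_18_1 = [
    ("H", "H", 7.46), ("H", "F", 8.16), ("H", "Cl", 8.68), ("H", "Br", 8.89),
    ("H", "I", 9.25), ("H", "Me", 9.30), ("F", "H", 7.52), ("Cl", "H", 8.16),
    ("Br", "H", 8.30), ("I", "H", 8.40), ("Me", "H", 8.46), ("Cl", "F", 8.19),
    ("Br", "F", 8.57), ("Me", "F", 8.82), ("Cl", "Cl", 8.89), ("Br", "Cl", 8.92),
    ("Me", "Cl", 8.96), ("Cl", "Br", 9.00), ("Br", "Br", 9.35), ("Me", "Br", 9.22),
    ("Me", "Me", 9.30), ("Br", "Me", 9.52),
]
SUBSTITUENTS = ["F", "Cl", "Br", "I", "Me"]  # H is the reference (contribution 0)
COLUMNS = [f"meta-{s}" for s in SUBSTITUENTS] + [f"para-{s}" for s in SUBSTITUENTS]


def design_matrix(data):
    """Indicator matrix: 1 where a substituent is present; last column is the constant."""
    matrix = np.zeros((len(data), len(COLUMNS) + 1))
    matrix[:, -1] = 1.0
    for row, (meta, para, _) in enumerate(data):
        if meta != "H":
            matrix[row, COLUMNS.index(f"meta-{meta}")] = 1.0
        if para != "H":
            matrix[row, COLUMNS.index(f"para-{para}")] = 1.0
    return matrix


def fit_free_wilson(data):
    """Return (group contributions, constant term, calculated activities, r)."""
    matrix = design_matrix(data)
    activity = np.array([value for _, _, value in data])
    coefficients, *_ = np.linalg.lstsq(matrix, activity, rcond=None)
    calculated = matrix @ coefficients
    ss_residual = np.sum((activity - calculated) ** 2)
    ss_total = np.sum((activity - activity.mean()) ** 2)
    r = float(np.sqrt(1.0 - ss_residual / ss_total))
    contributions = dict(zip(COLUMNS, coefficients[:-1]))
    return contributions, float(coefficients[-1]), calculated, r


if __name__ == "__main__":
    contributions, constant, _, r = fit_free_wilson(TABLE_18_1)
    for name, value in contributions.items():
        print(f"{name:8s} {value:+.2f}")
    print(f"constant {constant:.2f}   r = {r:.3f}")
```

Substituents that occur only once in the data set (meta-F, meta-I, para-I) are reproduced exactly by such a fit, because their contribution is determined by a single compound. This is a known weakness of the approach and one reason why group contributions from sparsely populated positions should be read with caution.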
parameters. In this form they are regarded in the Hansch equation as well as the
Free–Wilson analysis (Sect. 18.5). Moreover, indicator variables for different
configurations of substituents, for example, the configuration of stereoisomers,
are defined in classical QSAR models. An analogous orientation of the molecule
in a hypothetical binding pocket is assumed for the use of these parameters. For
example, it is assumed that all ortho substituents are oriented toward the “same
side” in a series of ortho-substituted derivatives. As a prerequisite, structure–activity
relationships that correlate the biological activity with properties of the 3D structure
need a spatial superposition of the active substances. This superposition should
approximate the relative orientation in the binding pocket as accurately as possible.
A technique was discussed in ▶ Chap. 17, “Pharmacophore Hypotheses and Molec-
ular Comparisons” that can be used for the calculation of these spatial
superpositions.
for amino acid side chains around the ligands. These assumptions can be dropped
once the molecules are embedded in a lattice and can be explored with an interac-
tion probe. Richard Cramer and M. Milne proposed such a model in 1978. It took
another 10 years until the generally applicable CoMFA method (Comparative
Molecular Field Analysis) was established. Despite many theoretical and practical
deficiencies in its application, the method was quickly accepted. Today it is
applied in many different variations.
Before such an analysis can be practically carried out, a few basic consider-
ations should be made. Do steric and electrostatic interactions consider all
contributions to ligand binding that lead to a correct relative ranking of binding
affinity? As already mentioned, the binding affinity is composed of enthalpic and
entropic contributions. A sampling of the properties via probes to map interac-
tions certainly affords a measure for how well a molecule can undergo energet-
ically favorable interactions. How well are the entropic contributions considered?
A considerable portion is made up of solvation and desolvation processes
(▶ Sect. 4.6). These processes change the local water structure around the ligand
and in the binding pocket. The water structure in the immediate vicinity of the
hydrophobic surfaces of the ligand is more ordered in the solvated state than it is
in bulk water. The transition of such ligands out of the bulk water into the
protein’s binding pocket immediately causes a certain number of water molecules
to adopt a less-ordered state. This increases the entropy of the system and pro-
motes spontaneity in the binding process. The number of water molecules that are
involved in this process depends on the size of the hydrophobic surface of
the ligand. Furthermore the displacement of the water molecules from the
binding pocket upon ligand binding increases the disorder of the examined
system and also increases its entropy. In the above-mentioned approximation it
is assumed that this water-related effect is the same for all molecules in the data
set. Therefore, it is not considered in a relative comparison. Additionally,
a molecule can move “freely” in an aqueous solution and adopt different confor-
mations. In the binding pocket, however, it is fixed predominantly in one partic-
ular conformation. Rotational, translational, and conformational degrees of
freedom are lost, and the system loses entropy. All of these influences are to be
taken into consideration for the correct treatment of affinities.
The most important and most often used method for 3D structure–activity analysis
is the CoMFA method. The execution of a CoMFA study first requires the choice of
a data set of suitable compounds. This data set should encompass around 50–100
compounds with related overall geometry. It should also be ensured that all sub-
stances bind to the same protein at the same site, and that a binding affinity is
known for all of them. The ligands must possess a given diversity with regard to
their structural variation. Their binding affinities should scatter over at least three
orders of magnitude. Conformations are generated for all of the molecules
Fig. 18.4 A grid is generated for the calculation of molecular fields that broadly encompasses
a molecule. The grid points are color-coded, with increasing distance from the ligand (red <
yellow < green < blue < gray). The contributions from the chosen fields are calculated at points of
the lattice, which have a grid spacing of 1–2 Å. The field contributions at each point in the grid (S1,
S2, …, Sn, E1, E2, …, En) are written into a table. The analysis is carried out for all molecules in the
data set. The binding affinities are incorporated into the table as, for instance, –log (Ki). The field
contributions are weighted with appropriate coefficients (a, b, …, z) and, using a special statistical
method, the PLS analysis, they are related to the affinity. A model is obtained in the form of an
equation that indicates at which grid points and with what weight the different field contributions
explain the biological activity.
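The table described in the caption of Fig. 18.4 can be sketched as follows. The code only builds the lattice and the data table; the field functions are passed in by the caller, and the PLS step is not shown. The 4 Å margin around the molecule, the 2 Å default spacing, and all function names are assumptions made for this illustration.

```python
import itertools


def build_grid(atom_coordinates, margin=4.0, spacing=2.0):
    """Return the lattice points of a box that encloses all atoms plus a margin (in Å)."""
    axes = []
    for axis in range(3):
        low = min(atom[axis] for atom in atom_coordinates) - margin
        high = max(atom[axis] for atom in atom_coordinates) + margin
        n_points = int((high - low) / spacing + 1e-9) + 1
        axes.append([low + i * spacing for i in range(n_points)])
    return list(itertools.product(*axes))


def field_table(molecules, field_functions, grid):
    """One row per molecule; columns hold the field values at every grid point,
    field by field (S1 ... Sn, E1 ... En for a steric and an electrostatic field)."""
    return [
        [field(atoms, point) for field in field_functions for point in grid]
        for atoms in molecules
    ]


if __name__ == "__main__":
    grid = build_grid([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)])
    print(len(grid), "grid points")
```

All molecules of a data set have to share one grid, so in practice the box is built around the superimposed molecules rather than around a single one.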
[Figure: plot of the potential E(r) against the distance r, showing the Lennard-Jones
potential, the Coulomb potential (also for opposite charges), the Gauss curve, and the
cut-off values]
Fig. 18.5 The Lennard-Jones potential (green) is a model for describing the intermolecular
interactions of two atoms without considering their charge. Negative potential values correspond
to mutual attraction, positive values to a repulsion of the particles. As the mutual distance
becomes infinite, the potential approaches zero. Upon approach it goes through a shallow minimum
due to mutual polarization. At even shorter distances it rises very steeply toward positive infinity
because of atom–atom repulsion. The Coulomb potential (blue) considers only electrostatic
interactions between charges that formally reside as point charges on the atomic nuclei. For
like-charged particles it also approaches infinity as the distance goes to zero. For oppositely
charged atoms, negatively infinite values result. The hyperbolic form of the Coulomb potential is
considerably less steep, so that the particles can still “feel” one another at larger distances.
Cut-off values are set for the potentials in a CoMFA analysis. A Gaussian function, which takes the
course of a bell-shaped curve (here only the right half of the “bell” is shown), describes the
distance dependence of the interaction potential between the particles in the context of the
CoMSIA model. As the distance between the particles goes to zero, the curve reaches its maximum
value, which remains finite.
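The three distance dependencies compared in Fig. 18.5 can be written down compactly. The sketch below uses a 12-6 Lennard-Jones form and Coulomb's law in kcal/mol; the well depth, the minimum distance, the 30 kcal/mol cut-off, and the Gaussian attenuation factor are assumed illustrative parameters, not values taken from the text.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal Å / (mol e^2)


def lennard_jones(r, well_depth=0.1, r_min=3.8):
    """12-6 potential with its minimum of -well_depth at the distance r_min (Å)."""
    ratio = r_min / r
    return well_depth * (ratio ** 12 - 2.0 * ratio ** 6)


def coulomb(r, charge_1, charge_2, dielectric=1.0):
    """Electrostatic energy of two point charges (in units of e) at distance r (Å)."""
    return COULOMB_CONSTANT * charge_1 * charge_2 / (dielectric * r)


def apply_cutoff(energy, cutoff=30.0):
    """Clip an energy to the interval [-cutoff, +cutoff], as done for CoMFA fields."""
    return max(-cutoff, min(cutoff, energy))


def gaussian(r, attenuation=0.3):
    """Bell-shaped distance dependence; equals 1 at r = 0 and stays finite."""
    return math.exp(-attenuation * r ** 2)


if __name__ == "__main__":
    for r in (1.0, 1.8, 3.8, 6.0, 10.0):
        print(
            f"r = {r:4.1f} Å  LJ = {apply_cutoff(lennard_jones(r)):7.2f}  "
            f"Coulomb(+1/-1) = {apply_cutoff(coulomb(r, 1.0, -1.0)):7.2f}  "
            f"Gauss = {gaussian(r):.3f}"
        )
```

With these assumed parameters the Lennard-Jones term goes from its minimum at 3.8 Å to the cut-off value at 1.8 Å, that is, within a single 2 Å grid step, whereas the Gaussian changes smoothly and never needs to be truncated.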
Let us assume that multiple molecular fields for each molecule in a data set have
been calculated, and a correlation of their differences with the binding affinity is
attempted. How are these differences expressed? For this we want to consider three
hypothetical examples of substituted phenyl derivatives.
• First, all of the substituents on the phenyl ring in a compound series should be
varied so that increasingly large field contributions result in the vicinity of the
substituent when being scanned with a positively charged probe. If the binding
affinities increase in the same way as the field contributions become larger, this will
be reflected in the quantitative analysis. It means that derivatives with increasingly
positively charged groups in this molecular region are more potent substances.
• The second example is set up somewhat differently. Now the phenyl
ring substituents are given positive or negative partial charges. Their variation
has no influence on the potency of the substances. The quantitative analysis
shows that the changes in the electrostatic field contributions have no correlation
with the biological activity. A possible explanation might be that this effect and
another property, for example, the size of the substituents, mutually cancel their
influences. It could also be that the biological activity is influenced through other
qualities of the substituents, for instance, their hydrophobic character.
• In the third case, the electrostatic properties of the substituents that are important
for binding to the receptor should be hardly varied at all at the
examined position. There might be different substituents present, however,
they all have comparable partial charges. The model that analyzes the field
contributions in the vicinity of these groups does not recognize differences and
therefore also does not correlate with the binding affinity. It can indeed be that
a class of substituents at a particular position on a molecular scaffold is actually
very important for binding but nonetheless remains insignificant in the analysis.
This is because a QSAR analysis only performs relative
comparisons within a data set.
These examples are still easily manageable. The question can be posed whether
a tedious correlation method with the “detour” via molecular fields is really needed.
The situation is more complicated in practice, above all if molecules with different
scaffolds are considered. The substituents do not fall exactly on top of one another
in the molecular superposition. Their contribution must be described as a field in
space, and only as such can they be evaluated. At any rate, these examples
underscore the importance of careful planning of the analysis. The structures in the data
set must be chosen so that they have the largest possible variation of substituents
and their properties.
One or more compounds are randomly extracted from the data set. A model is
constructed with the remaining derivatives, and the affinities of the removed
compounds are predicted with this model. The removal of compounds is repeated
several times, in the simplest case until every substance has been left out exactly
once. The quality of the prediction represents a measure for the reliability and
significance of the model. The achieved result is expressed with the q² value, which
is calculated from the squared deviations between the predicted and the measured
values. It cannot exceed +1, but it can become negative. A value of +1 indicates
that a perfect model was achieved: all predictions agree exactly with the measured
binding affinities, and there is no deviation. A value of q² = 0 indicates that the
predictions of the model are no better than no model at all; it is just as good as the
average of all affinities. If q² takes on negative values, the model is worse than the
average, that is, worse than no model. A model is therefore only to be trusted when
the q² value lies above 0.4–0.5.
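The leave-one-out procedure and the resulting q² can be sketched as follows. To keep the example short, ordinary least squares from NumPy stands in for the PLS analysis used in CoMFA, and q² is taken as 1 − PRESS/SS, with SS measured around the mean of all affinities; the function name and the demonstration data are hypothetical.

```python
import numpy as np


def q2_leave_one_out(descriptors, affinities):
    """Cross-validated q² = 1 - PRESS / SS for a linear model with a constant term.

    descriptors: array of shape (n_compounds, n_descriptors)
    affinities:  array of shape (n_compounds,)
    """
    descriptors = np.asarray(descriptors, dtype=float)
    affinities = np.asarray(affinities, dtype=float)
    n = len(affinities)
    design = np.column_stack([descriptors, np.ones(n)])
    press = 0.0
    for left_out in range(n):
        keep = np.arange(n) != left_out
        # Build the model without the left-out compound ...
        coefficients, *_ = np.linalg.lstsq(design[keep], affinities[keep], rcond=None)
        # ... and predict the affinity of exactly that compound.
        press += (affinities[left_out] - design[left_out] @ coefficients) ** 2
    ss = np.sum((affinities - affinities.mean()) ** 2)
    return float(1.0 - press / ss)


if __name__ == "__main__":
    x = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
    y = np.array([5.1, 6.0, 6.8, 8.1, 9.0])  # hypothetical affinities
    print(round(q2_leave_one_out(x, y), 3))
```

A model without any descriptor, which can only predict the mean of the remaining compounds, yields a slightly negative q² in this scheme, which illustrates why negative values mean "worse than no model".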
Another step must be performed to check the predictive value of a trained model.
For this, a test data set of molecules is needed that are similar to the molecules in the
training data set, but that were not used for the training. The binding affinities are
predicted for these molecules. It is only if the correlation coefficient for this set is of
similar size to that of the training set that the model possesses adequate predictive
power.
The derived model can be used to estimate the affinity of new compounds
that have not yet been synthesized. The conformations of these compounds
are calculated and superimposed on the other structures. They must fall within
the grid that was defined in the training set. Next their field contributions are
calculated. By using the correlation derived by CoMFA for the training set, it is
possible to compute which grid points are predictive with respect to the binding
affinity of new compounds.
CoMFA techniques establish a correlation between activity data and molecular
properties. A model can be derived that encompasses the properties of new mole-
cules, from the relative comparison within a training set. Relevant predictions are
only to be expected when the structural variations in the new molecule remain
within the scope of the model. In other words, the model cannot make predictions
about the influence of substituents that occur in areas in which there were no
structural variations in the training set. CoMFA models interpolate between field
contributions from molecules. An extrapolation to areas that were not covered by
the data set is not possible.
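A simple way to see whether a prediction would be an interpolation or an extrapolation is to compare the descriptors of a new molecule with the range spanned by the training set. The range check below is one possible, deliberately crude criterion chosen for illustration; it is not a procedure described in the text, and the function name is hypothetical.

```python
def descriptors_outside_training_range(training_rows, new_row):
    """Return the indices of descriptors for which new_row lies outside the
    minimum/maximum range covered by the training set."""
    outside = []
    for index, value in enumerate(new_row):
        column = [row[index] for row in training_rows]
        if value < min(column) or value > max(column):
            outside.append(index)
    return outside


if __name__ == "__main__":
    training = [[0.2, 1.0], [0.8, 3.0], [0.5, 2.0]]  # hypothetical field values
    print(descriptors_outside_training_range(training, [0.6, 5.0]))
```

An empty result means that every field value of the new molecule lies within the range spanned by the training set; any reported index marks a descriptor for which the model would have to extrapolate.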
The results of a CoMFA analysis can be graphically evaluated. From the model
it is known at which grid points field contributions are obtained that contribute
significantly to explain the binding affinity. These contributions can be contoured
for the different fields according to their importance. They indicate volume areas
around the molecules in which changes in the field contributions run parallel or
opposite to the affinity changes in the data set. These contour maps significantly
support the design of new active substances (Sect. 18.14). They indicate the
position at which the properties of a lead structure have to be varied so that an
increase in affinity can be achieved.
18.13 Scope, Limitations, and Possible Expansions of the CoMFA Analysis 387
Usually only steric and electrostatic field contributions are evaluated in CoMFA
analyses. A hydrophobic field can quantify the size of the hydrophobic surfaces and
therefore partially considers the entropic contribution to affinity. Because CoMFA
evaluations yield relevant models without the explicit use of hydrophobic fields,
these field contributions must be at least partially contained in Lennard-Jones and
Coulomb fields. The lipophilicity of a molecule increases upon enlarging an
uncharged, sterically demanding group, for instance, from methyl to butyl. Here
the changes in the steric field contributions can correctly reflect the lipophilic
surface. A correlation with electrostatic properties is also imaginable. Hydrophobic
molecular portions carry, as a general rule, only minor partial charges. Positively or
negatively charged groups represent hydrophilic regions. In this way the lipophilic
and hydrophilic surface regions can be quantified via differences in the charge.
The deviation that is not explained by a CoMFA model comprises, apart from
experimental errors, also all inadequately described binding contributions. These
include structural adaptations of the protein that are not identical for all compounds
in the data set. Entropic contributions that come from the conformational fixation of
the active substance in the binding pocket or the residual mobility of the ligand in
the binding pocket are also not considered in any of the fields.
In addition to these inadequacies, the fields themselves cause a few problems. Due
to their mathematical function behavior, very large and/or very small values are
achieved at the surface or in the interior of the molecule (Fig. 18.5). Because the
Lennard-Jones potential increases faster upon approaching the atoms than the Cou-
lomb potential does, both achieve arbitrarily set cut-off values (Sect. 18.10) at different
distances from the molecule. Within a distance of 2 Å, which is the commonly chosen
grid spacing, the extremely steep Lennard-Jones potential can change from practically
zero to the cut-off value. These discontinuities and the neglected areas near the surface
can cause significant problems for the interpretation. Furthermore, they often cause
fragmented contour maps in the individual fields that are difficult to interpret.
The deficits in these fields have stimulated the search for other solutions. In one
method the similarity of molecules is investigated by use of their steric and
physicochemical properties in space and correlated to the binding affinity
(CoMSIA methods; Comparative Molecular Similarity Indices Analysis). The
molecules are superimposed just as they are in the CoMFA methods. Then their
relative similarity is determined through their relationship to a probe, a carbon atom
for instance, in that the similarity of each molecule is sampled with a probe at the
intersections of a surrounding grid. The measure of similarity between the probe
and the molecule is defined in a distance-dependent way. A Gaussian function
(Fig. 18.5) is chosen for this purpose. In contrast to the hyperbolic form of the
above-described potentials, the bell-shaped Gaussian curve approaches finite values
instead of infinity at decreasing distances. Cut-off values need not be set. For many
different properties a similarity is determined at all grid points. The prerequisite is
that the properties must be described by atom-based values, for example, partial
charges or atomic volumes. The same distance dependency is used for all proper-
ties. Property-specific similarity fields are obtained. These are correlated with the
binding affinity. The interpretation of the field contributions is achieved
analogously to the CoMFA method. The advantage of this method lies, above all, in the
interpretability of the resulting contour maps. If a particular property in an area of
the superimposed molecules correlates significantly with binding affinity, this area
is highlighted. In contrast, the CoMFA method contours areas outside of the
molecules, where a property reveals changes in the field contributions that affect the
affinity positively or negatively. The setting of cut-off values, however, masks
entire areas of these field contributions near the surface (Fig. 18.5).
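The similarity sampling described for CoMSIA can be sketched as a sum of Gaussian-weighted atomic property values at each grid point. The attenuation factor of 0.3 Å⁻², the probe value of 1, and the function name are assumptions made for this illustration; published formulations may additionally apply a sign convention.

```python
import math


def similarity_index(grid_point, atoms, probe_value=1.0, attenuation=0.3):
    """Gaussian-weighted sum of an atom-based property at one grid point.

    atoms: list of ((x, y, z), property_value), e.g. partial charges or atomic volumes
    """
    total = 0.0
    for position, property_value in atoms:
        squared_distance = sum((g - p) ** 2 for g, p in zip(grid_point, position))
        total += probe_value * property_value * math.exp(-attenuation * squared_distance)
    return total


if __name__ == "__main__":
    atoms = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), -0.5)]  # hypothetical values
    for point in ((0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (6.0, 0.0, 0.0)):
        print(point, round(similarity_index(point, atoms), 4))
```

Because the Gaussian stays finite at zero distance, grid points inside the molecule can be evaluated in the same way as points outside, and no cut-off is needed.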
3D-QSAR analyses were first meant to establish structure–activity relationships in
cases when the target protein’s structure was unavailable as a reference. Nowadays,
more and more crystal structures of target proteins are becoming available, so the
technique is increasingly used for cases in which this reference is actually known. The
protein structure then serves to generate reasonable and relevant superpositions of the
substances to be compared in their biologically active conformations. It seems all the
more paradoxical to use the information about the surrounding protein environment
only to superimpose the molecules and then to relinquish this valuable data in the
comparative field analysis. Methods have been developed that consider this
information. The group of Rebecca Wade at EMBL in Heidelberg has developed the
COMBINE method. For this, a set of modeled protein–ligand complexes is used
to calculate a data table. It contains the interaction energies between individual ligand
atoms in the test molecules of the data set and the amino acid residues and water
molecules in the surrounding protein. The interpretation of this enormous data table is
achieved by using a technique that is similar to the CoMFA methods. The graphical
interpretation of the correlation model obtained by COMBINE indicates which
regions of the protein account for decisive contributions to explain the affinity
differences in the ligand data set. These are very valuable details, but they only
help a little for the design of better molecules that achieve higher affinity.
Holger Gohlke in Marburg developed the variation AFMoC (Adaptation of
Fields for Molecular Comparison), with which it is possible to transfer information
about the protein environment into the field-based model. The advantage of the
intuitive interpretation of the field contributions with regard to the structural
optimization of the ligands is not lost. For this, values are generated on
a CoMFA-like grid by placing atomic probes at each grid point and evaluating them
with the empirical scoring function DrugScore (▶ Sect. 17.10). The resulting values
reflect the protein environment; the grid is thereby “prepolarized.” By using a docking and
superposition technique, the ligands of the training set are then placed onto this
grid. Only when an atom of the ligand falls upon an area of the grid for which
the protein environment has predicted this atom type as advantageous is the field
contribution enhanced. In other cases the interaction contribution on the grid is
reduced. In this way a data table is generated for the entire training set analogously
to a CoMFA method. This table is accordingly evaluated and affords a QSAR
equation. The individual contributions can be shown on a grid. They indicate where
particular atom types increase or reduce affinity.
18.14 A Glimpse Behind the Scenes 389
A similar field analysis is also used for the correlation and prediction of
selectivity differences between ligands. Many enzymes occur as isoforms. They
therefore have similarities in their binding pockets. As a consequence ligands show
graduated affinities or “selectivity profiles” to these isoforms. If a ligand is to be
optimized to improve selectivity, the positions at which a change in a property
results in an improved profile must be known. A 3D-QSAR model is constructed for
each isoenzyme. Either the difference in the affinity values can be calculated and
used for the model as values to be predicted, or alternatively, two correlation
models can be constructed and at each grid point the field contributions are
subtracted from one another. The models that are obtained with both approaches
can be graphically interpreted. Contour diagrams show where and how the mole-
cules are to be changed to improve their selectivity with regard to the one or other
isoenzyme.
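The first of the two approaches, using the difference of the affinity values as the quantity to be modeled, amounts to a simple subtraction. The sketch below uses the pKi values quoted for compounds 18.4 and 18.5 in Fig. 18.9 and the definition ΔpKi = pKi[CAII] − pKi[CAI] from Fig. 18.6; the function names are hypothetical.

```python
# pKi values as quoted in Fig. 18.9
PKI = {
    "18.4": {"CAI": 8.15, "CAII": 8.10},
    "18.5": {"CAI": 6.70, "CAII": 9.40},
}


def delta_pki(pki_caii, pki_cai):
    """Selectivity measure: positive values favor CAII, negative values favor CAI."""
    return pki_caii - pki_cai


def fold_selectivity(delta):
    """Ratio of the two inhibition constants, since pKi = -log Ki."""
    return 10.0 ** delta


if __name__ == "__main__":
    for compound, values in PKI.items():
        delta = delta_pki(values["CAII"], values["CAI"])
        print(f"{compound}: delta pKi = {delta:+.2f}, Ki ratio = {fold_selectivity(delta):.1f}")
```

For 18.5 the difference of 2.70 log units corresponds to a roughly 500-fold preference for CAII, whereas 18.4 is essentially unselective.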
Today comparative field analyses belong to the standard repertoire in drug research.
As an example, the binding of inhibitors to carbonic anhydrase I and II shall be
examined. The biological function of this enzyme is described in detail in ▶ Sect.
25.7. The sequence identity of the isoforms is 60%. The ligands in the training data
set are derived from the parent structures shown in Fig. 18.6. First, a superposition
model is generated by docking the ligands into the protein (Fig. 18.7). The
enzyme’s funnel-shaped binding pocket is occupied by ligands in a large variety
of ways. A good correlation model is obtained with the three methods, CoMFA,
CoMSIA, and AFMoC. The models also achieve a convincing predictive power on
a test data set that was independent from the training set.
[Figure: parent scaffolds with the variable positions R1 and R2: thiadiazolesulfonamide,
thienothiopyransulfonamide, benzothiazolesulfonamide, phenylsulfonamide, hydroxamate, and
hydroxysulfonamide]
Fig. 18.6 The scaffolds of inhibitors that were used in different field analyses to establish affinity
(pKi[CAII]) and selectivity models (pKi[CAII] – pKi[CAI] = ΔpKi[CAII – CAI]) to describe the
inhibition of the carbonic anhydrases CAI and CAII. Different substituents were varied at the
positions that are marked as R1 and R2.
Fig. 18.7 The superposition of inhibitors from the data set in the funnel-shaped binding pocket of
CAII; the zinc ion is shown as the blue-gray sphere, carbon is light-yellow, oxygen is red, nitrogen
is blue, sulfur is orange, and hydrogen is white.
The contours for the acceptor properties with regard to the inhibition of carbonic
anhydrase II are shown in Fig. 18.8. Molecules in the data set that exhibit an
acceptor function in the areas marked in red have lower potency. On the other
hand, an acceptor function in the blue area improves potency. Compound 18.2,
which has both acceptor functions of an SO2 group oriented in the detrimental red
area, is a weak CAII inhibitor. Moreover its NH group is in the blue region, which
should be occupied by an acceptor. Compound 18.3, which is about four orders of
magnitude more potent, leaves the area that was occupied by an oxygen atom in
18.2 empty, and orients its thiadiazole ring in the direction of the desirable acceptor
function. It achieves considerably better inhibition of the target enzyme.
Just as for the acceptor properties, contour maps can be generated for steric,
electrostatic, hydrophobic, and hydrogen-bond-donor properties. Their evaluation
helps to make evident where particular properties improve or lower the binding
affinity. Such correlation analyses help the synthetic chemist to plan the
optimization of lead structures in a tailored way.

[Figure: contour map with the zinc-bound inhibitors 18.2 (CA II pKi = 4.7) and 18.3
(CA II pKi = 8.7)]
Fig. 18.8 Contour map for the description of the binding contributions of H-bond acceptor
properties. Inhibitors that occupy the red contour areas with H-bond acceptor groups do not inhibit
CAII well; the occupancy of the blue areas with acceptor groups, however, leads to higher
affinity. Both oxygen atoms of the sulfonamide group of 18.2 occupy the red-contoured area, which
is unfavorable for acceptor properties. On the other hand, 18.3 leaves these areas unoccupied and
places its basic nitrogen in the vicinity of the blue-contoured region, which is favorable for
occupancy by acceptor groups. This explains the markedly better inhibition of CAII by 18.3.
Contour maps for steric properties that cause a selectivity difference between
CAI and CAII are shown in Fig. 18.9. Occupancy of the green areas with an
inhibitor improves the selectivity for CAI. On the other hand, spatially filling the
yellow-colored regions improves the selectivity for CAII. Compound 18.4 binds
unselectively with the same affinity to both isoforms, but 18.5 can clearly discrim-
inate between the two. The shown model is purely derived from the correlation of
ligand binding data. The relative alignment of the molecules in the data set is
accomplished in the binding pocket of the protein. Therefore the protein environ-
ment around this binding pocket should be examined more closely, to see if the
derived contours are reasonable. If the amino acid replacements between the two
isoforms are compared, it is apparent that CAI has two large residues, Phe91 and
Leu131, that constrain the lower left portion of the binding pocket. The inhibitors
have less room in CAI than they do in CAII. In fact the comparative field analysis
392 18 Quantitative Structure–Activity Relationships
[Structures and binding-pocket residues shown in Fig. 18.9: 18.4 (CAI: pKi = 8.15; CAII: pKi = 8.10) and 18.5 (CAI: pKi = 6.70; CAII: pKi = 9.40); labeled residue pairs at positions 91 (Phe/Ile), 131 (Leu/Phe), 200 (His/Thr), and 204 (Tyr/Leu)]
Fig. 18.9 The selectivity can be improved with regard to CAII inhibition by sterically filling
the yellow-contoured area. Filling the green area with a sterically demanding group causes an
increase in selectivity with regard to CAI (top left). Compound 18.4 occupies virtually no area
that is particularly selectivity discriminating; the compound is not isoenzyme specific (top left and
top right). On the other hand, 18.5 occupies a yellow-contoured area neighboring position 204,
which causes a selectivity enhancement for CAII. Compound 18.5 inhibits CAII decidedly more
potently than CAI.
in this region generates a yellow contour (near position 91), the occupancy of which
should be favorable for the inhibition of CAII. CAII also makes a large amount of
space available for inhibitors next to position 204, which is occupied by the less-
crowding Leu204 instead of Tyr204. A yellow contour is seen that indicates
a favorable occupancy of this area. Inhibitor 18.5, which is considerably more
potent on CAII, orients its pentafluorophenyl group exactly in this region (Fig. 18.9,
right). In the vicinity of position 131 (Leu131/Phe131) a yellow and a green area
occur directly next to one another but spatially separated, the occupancy of which is
favorable for CAII or CAI inhibitors, respectively. Compound 18.4, which
can hardly distinguish between the two isoforms, occupies the upper edge of both
areas equally well. Moreover it leaves virtually all regions unoccupied that should
lead to a better inhibition of either CAI or CAII for steric reasons. Therefore it is
evident why this compound shows no particular selectivity.
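The selectivity data quoted for 18.4 and 18.5 in Fig. 18.9 can be put on a common scale with a short calculation. The following Python sketch assumes that selectivity is expressed as the difference ΔpKi = pKi(CAII) − pKi(CAI), so that 10^ΔpKi equals the ratio Ki(CAI)/Ki(CAII). The function name is chosen for illustration only.

```python
# pKi values as quoted in Fig. 18.9, stored as (CAI, CAII)
PKI = {"18.4": (8.15, 8.10), "18.5": (6.70, 9.40)}


def selectivity_for_caii(pki_cai, pki_caii):
    """Return (delta pKi, fold preference for CAII over CAI).

    Because pKi = -log10(Ki), the ratio Ki(CAI)/Ki(CAII) equals
    10 ** (pKi(CAII) - pKi(CAI)).
    """
    delta = pki_caii - pki_cai
    return delta, 10.0 ** delta


for name, (cai, caii) in PKI.items():
    delta, fold = selectivity_for_caii(cai, caii)
    print(f"{name}: delta pKi = {delta:+.2f}, Ki(CAI)/Ki(CAII) = {fold:.1f}")
```

On this scale 18.4 is essentially unselective (ratio close to 1), whereas 18.5 prefers CAII by roughly 500-fold.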
[Structure shown in Fig. 18.10: 18.6 (CAI: pKi = 4.30; CAII: pKi = 8.05); contour label: CAII selective]
Fig. 18.10 Compound 18.6 inhibits CAII significantly more potently than CAI. Its sulfone
oxygen atom lies near a red-contoured area, the filling of which causes an increase in the
selectivity for CAII binding. Interestingly, Gln92 is found in this region in both isoforms.
However, it is only in CAII that this group is available to donate an H-bond to the inhibitor
that will contribute to binding affinity. The comparable residue in CAI is involved in a network of
H-bonds to neighboring amino acids. Therefore it is not available as a binding partner, and
a decrease in the affinity for CAI is the consequence.
its oxygen atoms of the endocyclic SO2 group in the vicinity of the red CAII-selective
areas. Furthermore, a glutamine is found at position 92 in both CAI and CAII. This
amino acid can donate an H-bond to the inhibitor via the NH2 group of its carboxamide
function. However, only CAII provides the structural conditions for this. In CAI, Gln92
neighbors Asn69 and Glu58. The carboxamide group of Gln92 forms a continuous H-bond
network with these residues and with His94. Therefore the NH2 group is no longer
available for interactions with a bound inhibitor. This is expressed in the poorer binding
affinity of inhibitors that place an acceptor function at this position, as 18.6 does. The
situation is entirely different in CAII. The neighboring groups of Glu69 and Arg58 form
an internal salt bridge with each other. Therefore they are not available as H-bond
partners for Gln92. The carboxamide group of Gln92 involves His94 via its carboxamide
CO group in an H-bond, and its NH2 group is now available as a donor functionality to
interact with a bound ligand. This results in a considerably enhanced binding to CAII
and is expressed as a selectivity advantage.
Alexander Hillebrecht at the University of Marburg has performed yet another
evaluation of the data set of carbonic anhydrase inhibitors that underscores the
difference between 3D, 2D, and 1D QSAR analyses. First, 32 so-called one-
dimensional descriptors were calculated with the MOE program for all molecules
in the data set. These are surface-based descriptors that describe the lipophilicity
(log P), the molar refraction (and therefore the polarizability), and partial charges
distributed over the molecules. These 32 descriptors are correlated with the binding affinity
to CAII or the selectivity difference between CAI and CAII to establish a QSAR
model. In another model the connectivities in the chemical formulae (so-called
molecular graphs) were used as descriptors. For this a topological connectivity tree
of all bonds in a molecular formula was generated, and by “walking” along the bond
connections it was counted how often a particular connectivity, for instance, an N–S–
C–C–N or C–N–C–C–C sequence occurs (so-called MACCS keys). In all, the
frequency of 166 different connectivity fragments was evaluated.
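The idea of "walking" along the bonds and counting how often a given atom sequence occurs can be sketched in a few lines of Python. This is only a toy path counter on a hand-written molecular graph. It is not the MACCS key definition used in MOE or any other package, and the example skeleton (H2N–SO2–CH2–CH2–NH2, hydrogens omitted) is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def count_sequence(elements: List[str],
                   bonds: List[Tuple[int, int]],
                   pattern: List[str]) -> int:
    """Count distinct simple paths whose atoms spell `pattern`."""
    # Build an adjacency list from the bond table.
    neighbors: Dict[int, List[int]] = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    found: Set[Tuple[int, ...]] = set()

    def walk(path: List[int]) -> None:
        if len(path) == len(pattern):
            # Store each path once, regardless of the walking direction.
            found.add(min(tuple(path), tuple(reversed(path))))
            return
        wanted = pattern[len(path)]
        for nxt in neighbors[path[-1]]:
            if nxt not in path and elements[nxt] == wanted:
                walk(path + [nxt])

    for start, symbol in enumerate(elements):
        if symbol == pattern[0]:
            walk([start])
    return len(found)


# Hypothetical heavy-atom skeleton: N0-S1(=O2)(=O3)-C4-C5-N6
ELEMENTS = ["N", "S", "O", "O", "C", "C", "N"]
BONDS = [(0, 1), (1, 2), (1, 3), (1, 4), (4, 5), (5, 6)]

for pattern in (["N", "S", "C", "C", "N"], ["O", "S", "O"], ["O", "S", "C"]):
    print("-".join(pattern), count_sequence(ELEMENTS, BONDS, pattern))
```

Collecting such counts for a fixed list of patterns yields one descriptor vector per molecule, which is the kind of input a 2D QSAR model works with.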
Such descriptors code indirectly for the molecular composition of the individual
inhibitors in the data set, as was introduced above in the Free–Wilson analysis (Sect.
18.5). These topological 2D descriptors were then related to the binding affinity or
selectivity data as described above. Good correlation models can be derived using 1D
as well as 2D descriptors. However, the models based on the 1D descriptors proved not to be
predictive: if an attempt was made to predict a molecule that was not in the data set,
the model failed. The topological descriptors achieve better results. They possess
a certain degree of predictive power, but they perform less well than the above-
described 3D descriptors in the comparative field analysis. This comparison makes
evident that the increase in the complexity of the model and the structural validity of
the descriptors increases their predictive power with regard to the binding properties
of new molecules that were not part of the training data set. But it is especially this
predictive power and the straightforward translation of the obtained correlation
model into the design of new or the modification of existing chemical structures
during the optimization that make QSAR models valuable for drug design.
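The difference between a model that merely fits the training data and one that predicts molecules outside it is often quantified by comparing the fitted r² with a leave-one-out cross-validated q². The Python sketch below illustrates this for a single descriptor and ordinary least squares. It is not the procedure used in the study described above, and the descriptor and activity values are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Fitted r^2: every compound is used to build the model."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each compound is predicted by a model built without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Invented descriptor values (x) and pKi values (y)
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [0.1, 0.9, 2.2, 2.8, 4.1]
print(f"r^2 = {r_squared(X, Y):.3f}, q^2 = {q_squared_loo(X, Y):.3f}")
```

For a least-squares model q² can never exceed r²; a large gap between the two is the usual warning sign that a good correlation does not translate into predictive power.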
18.15 Synopsis
Biological systems are open systems that are kinetically controlled. They can be
temporarily found in a dynamic equilibrium. This condition can be compared to
a chromatographic process in which a substance is in a constant exchange between
the solid support and the mobile phase. Locally, equilibria occur that are disrupted
by the continuous progression of the mobile phase. In contrast to the relatively
simple conditions in chromatography, there are a plethora of different phases in
biological systems. A drug is distributed throughout all of these phases. Further-
more, metabolic processes are running in parallel that lead to different metabolites.
19.1 Rate Constants of Compound Transport 399
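[Fig. 19.1 (diagram): aqueous phase A and aqueous phase C]

$$P = \frac{k_1}{k_2} \qquad (19.1)$$

$$k_2 = -\beta k_1 + c \qquad (19.2)$$

[Fig. 19.2 (plot): log k versus log P, with fitted curves for log k1 (r = 0.997) and log k2 (r = 0.998); the curves correspond to the fitting of the data with Eqs. 19.3 and 19.4.]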
in Figure 19.2. Among the latter are neutral, acidic, basic, and even quaternary
charged compounds with very different molecular weights. The characteristic
course of the curve says that the rate constant k1 for the transfer from the aqueous
phase into the organic phase depends on the partition coefficient P for relatively
polar substances. It is thermodynamically controlled, that is, it increases with
increasing lipophilicity. A point is reached, however, at which k1 is limited by the
diffusion of the substance and levels off at its maximally achievable value. More lipophilic
substances simply cannot penetrate the organic phase any faster. Analogously, this is
valid for the opposite direction as well, in which the diffusion from the organic
phase into the aqueous phase is described by k2. The chemical structure plays
a role in both cases in that it determines the value of the partition coefficient P.
Because the rate constants are limited by diffusion, there must be an apparent
dependence on the molecular size in this area. According to Fick's law of
diffusion, the diffusion rate should depend on the radius of the particle, that is, as
a first approximation, on the cube root of the volume. Because of the
relatively low variability of the molecular size of organic drugs and their
conformational flexibility, this effect is probably lost in the noise of
experimental error. Moreover, it must not be forgotten that the discussed
octanol/water system is very simple and it only slightly approximates the
complex structural relationships of real membrane systems. Therefore today
more relevant models to collect experimental distribution data, such as the so-
called PAMPA or Caco-2 models, are increasingly being used (Sect. 19.6).
Here more complex correlations are indicated. Obviously how a compound is
distributed and structurally oriented in the vicinity of membrane structures is
important. These properties simultaneously influence how the penetration, and
therefore the distribution, is to be described.
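The plateau behavior of k1 and k2 described above follows directly from Eqs. 19.1 and 19.2. The Python sketch below assumes that Eq. 19.2 has the form k2 = −βk1 + c with positive constants β and c. Combining this with P = k1/k2 gives k2 = c/(βP + 1) and k1 = cP/(βP + 1). The numerical values of β and c are hypothetical and serve only to show the shape of the curves.

```python
import math


def rate_constants(log_p, beta=1.0, c=1e-3):
    """Return (k1, k2) for a given log P.

    k1: transfer from the aqueous into the organic phase
    k2: transfer from the organic into the aqueous phase
    beta and c are hypothetical fitting parameters.
    """
    p = 10.0 ** log_p
    k2 = c / (beta * p + 1.0)   # from k2 = -beta * k1 + c with k1 = P * k2
    k1 = p * k2                 # from P = k1 / k2
    return k1, k2


for log_p in range(-3, 5):
    k1, k2 = rate_constants(log_p)
    print(f"log P = {log_p:+d}   log k1 = {math.log10(k1):6.2f}   "
          f"log k2 = {math.log10(k2):6.2f}")
```

For polar substances log k1 rises linearly with log P while log k2 stays constant. For lipophilic substances log k1 levels off at log(c/β) and log k2 falls linearly, which reproduces the qualitative course of the two curves in Fig. 19.2.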
19.3 The Role of Hydrogen Bonds 401
The rate constant, k, for the penetration through the lipid membrane from the
aqueous phase is described by another equation, Eq. 19.5. Here, the rate constants
k1 and k2 also describe the entry into the organic phase and the transport in the
opposite direction, respectively.
[Fig. 19.3 (plot): log k versus log P; curves, from top to bottom: blood–placenta penetration, intestinal resorption, gastric resorption, and penetration through an organic membrane]
Fig. 19.3 The rate constant k for the transport of drugs depends nonlinearly on lipophilicity. This
is valid for simple in vitro models as well as for biological systems. The bottom curve describes the
log k values of the transport of barbiturates in an in vitro absorption model from an aqueous phase,
through an organic membrane into another aqueous phase. Both curves in the middle (gray points)
describe the dependence of the absorption rate constants k on the lipophilicity for the absorption of
homologous carbamates from the stomach (gastric absorption) or the gut (intestinal absorption) of
rats. The top curve was determined for the entry of different drugs into the placenta from the
circulation. In all cases an increase in log k dependent upon log P is seen, until a more-or-less-
pronounced maximum for substances with moderate lipophilicity. For very non-polar substances,
this curve falls, and in rare cases a plateau is reached. The curves for gastric and intestinal
absorption and for the penetration into the placenta run flatter than the curve for the in vitro
transport of barbiturates (below), because here no lipid barrier is present.
the organic phase contains considerable amounts of water so that the molar ratio of
octanol/water = 4:1. Substances with polar, solvated groups therefore do not need
to fully release their water solvation shell upon entry into the octanol phase.
Entering a biological membrane is obviously different. Aside from the dependence
on lipophilicity, membrane penetration becomes progressively worse for substances
that can form an increasing number of hydrogen bonds. Similarly,
a ligand must release its water shell before it can be accommodated in the binding
site of a protein.
The system water/cyclohexane is more suitable for the description of such
processes. Because of the non-polar character of this hydrocarbon, upon transition
from water into cyclohexane the molecule cannot take its water shell with it. Many
years ago P. Seiler derived an increment IH (Eq. 19.7) from the differences in the
partition coefficients in cyclohexane/water (loss of water shell) and octanol/water
[Fig. 19.4 (plot): log 1/C versus log P for MeOH, EtOH, AmOH, and DecOH]
Fig. 19.4 The neurotoxicity (C = molar dose that induces a particular toxic effect) of homolo-
gous primary alcohols in the rat is a measure of their ability to cross the blood–brain barrier. Polar
substances remain overwhelmingly in the blood circulation. In contrast, substances with moderate
lipophilicity reach the central nervous system easily. Accordingly, neither methanol (MeOH) nor
ethanol (EtOH) shows a pronounced neurotoxicity. The high general toxicity of methanol (blind-
ness) is not because of its own effect but rather the severely toxic metabolic products formalde-
hyde and formic acid (acidosis). Short-chained alcohols such as amyl alcohol (AmOH) are
considerably more neurotoxic. The highly lipophilic decanol (DecOH) shows low toxicity.
(no loss of water shell) for different functional groups. These IH values characterize
the tendency of groups to form hydrogen bonds.
$$\log P_{\text{cyclohexane}} + \sum I_H = 1.00\,\log P_{\text{octanol}} + 0.16 \qquad (19.7)$$
Seiler's concept remained largely ignored. In 1988 Robin Ganellin and co-workers
described the CNS bioavailability of different substances, that is, their
ability to cross the blood–brain barrier, as a linear function of a Δlog P value. This
Δlog P value is the difference between the log P values in the systems cyclohexane/
water and octanol/water. The bioavailability of peptides also runs, to a first
approximation, parallel to the Δlog P value, or to the number of groups that potentially
participate in hydrogen bonds. The methylation of all NH groups of a peptide
scaffold can, in fact, deliver substances with good bioavailability. The prerequisites
for good membrane penetration are similar to those for high affinity at the binding
site (▶ Chap. 4, “Protein–Ligand Interactions as the Basis for Drug Action”). Here
too, the requirement to release relatively strongly bound water molecules can
have a detrimental influence on binding affinity.
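Both the Seiler increment sum and the Δlog P value are simple differences of two measured partition coefficients. The Python sketch below rearranges Eq. 19.7 for the sum of the I_H increments and computes Δlog P. The sign convention Δlog P = log P(octanol) − log P(cyclohexane) is an assumption made here, and the input values are hypothetical.

```python
def sum_of_ih(log_p_octanol, log_p_cyclohexane):
    """Sum of the hydrogen-bonding increments I_H, rearranged from Eq. 19.7."""
    return 1.00 * log_p_octanol + 0.16 - log_p_cyclohexane


def delta_log_p(log_p_octanol, log_p_cyclohexane):
    """Delta log P, taken here as log P(octanol) - log P(cyclohexane)."""
    return log_p_octanol - log_p_cyclohexane


# Hypothetical compound with several hydrogen-bonding groups
LOG_P_OCTANOL = 1.5
LOG_P_CYCLOHEXANE = -0.8

print(f"sum of I_H  = {sum_of_ih(LOG_P_OCTANOL, LOG_P_CYCLOHEXANE):.2f}")
print(f"delta log P = {delta_log_p(LOG_P_OCTANOL, LOG_P_CYCLOHEXANE):.2f}")
```

With this convention the two quantities differ only by the constant 0.16, so a large value of either one indicates a strong tendency to form hydrogen bonds.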
Several other distribution systems, for instance, heptane/ethylene glycol, have
been proposed as alternatives to the octanol/water or cyclohexane/water systems
with regard to the simulation of penetration through a lipid membrane. But even
these systems cannot correctly reflect the architecture of membranes with an
interior lipophilic zone and a polar, negatively charged outer rim. Another option
404 19 From In Vitro to In Vivo: Optimization of ADME and Toxicology Properties
Many drugs are acids (HA) or bases (B). They exist in two forms through dissoci-
ation (Eq. 19.8) or protonation (Eq. 19.9); one is usually a non-polar neutral form
and the other is a polar ionic form. The values of the partition coefficients of the
ionic species are generally three to five orders of magnitude lower than those of the
corresponding neutral molecules.
$$\mathrm{HA + H_2O \rightleftharpoons A^- + H_3O^+} \qquad (19.8)$$

$$\mathrm{B + H_3O^+ \rightleftharpoons BH^+ + H_2O} \qquad (19.9)$$
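[Fig. 19.5 (scheme): HA and A− in the octanol phase, linked by the partition coefficients Pu and Pi to the equilibrium HA + H2O ⇌ A− + H3O+ (Ka) in the aqueous buffer]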
Fig. 19.5 Two-phase system with partition and dissociation equilibria for an acid HA (Eq. 19.8).
Ka is the dissociation constant, Pu and Pi are the partition coefficients of the non-dissociated and
ionic forms, that is, neutral and charged species, respectively. Because there is usually a difference
of several orders of magnitude between the Pu and Pi values, in many cases the Pi value can be
neglected. This leads to considerable simplification of the corresponding mathematical models.
19.4 Distribution Equilibria of Acids and Bases 405
[Fig. 19.6 (plot): log P versus pH (pH = 0, 7, 14), with curves for an acid AH and its anion A−, a base B and its protonated form, the acid/ion pair A− × N(R4)+, a dibasic acid, and an amino acid (+H3NCH(R)COOH, +H3NCH(R)COO−, H2NCH(R)COO−)]
Fig. 19.6 The pH dependence of the distribution equilibrium of acids and bases, the so-called pH
distribution profile, follows simple rules. Typically when an acid or a base is present, sigmoidal,
that is, S-shaped, curves are observed. For a dibasic acid, for example, oxalic acid, the decrease
in the partition coefficient continues with increasing pH values. In the presence of lipophilic
counterions, for example, the tetrabutylammonium salt of salicylic acid, the ion pair displays
a very high partition coefficient. Amino acids with neutral side chains carry one basic amino group
and an acidic carboxyl group. Accordingly, they go through a maximum in their partition
coefficient at the neutral point. Here the majority of the substance indeed exists as a zwitterion;
aside from that, however, a larger part is in the neutral form than is at lower or higher pH values.
has an only slightly lower partition coefficient than the neutral form of salicylic
acid. In contrast, the sodium salt of salicylic acid has absolutely no tendency to
cross over into the organic phase. Amino acids and other mixed acidic and basic
compounds afford pH–partition profiles with a maximum between the pKa values of
the two ionizable groups (Fig. 19.6), that is, when the zwitterionic form is present.
Knowledge about the log P value of the neutral form and the pKa value allows
the partition coefficient of a substance to be calculated at neutral pH. These
principles allow the estimation of absorption and distribution properties of new
substances. Of course, these considerations are only valid for drugs for which no
transporter exists that facilitates their membrane penetration (▶ Sects. 22.7 and
▶ 30.7).
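The calculation of a partition coefficient at a given pH from the log P of the neutral form and the pKa can be sketched as follows. The Python example assumes a monoprotic acid or base and neglects the partitioning of the ionic species, as suggested in the caption of Fig. 19.5. Under these assumptions log D = log P − log(1 + 10^(pH − pKa)) for an acid and log D = log P − log(1 + 10^(pKa − pH)) for a base. The function name and the example values are illustrative.

```python
import math


def log_d(log_p, pka, ph, kind):
    """Apparent log D of a monoprotic acid or base at a given pH.

    Assumes that only the neutral species enters the organic phase.
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Illustrative acid with log P = 2.0 and pKa = 3.0
for ph in (1.0, 3.0, 5.0, 7.0):
    print(f"pH {ph:3.1f}: log D = {log_d(2.0, 3.0, ph, 'acid'):6.2f}")
```

Well below the pKa the value equals log P; above the pKa it drops by about one log unit per pH unit, which gives the sigmoidal profile of Fig. 19.6.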
Because of their importance, pKa values are today routinely measured by
potentiometric titration in pharmaceutical research. However, it is often overlooked
that the definition of the pKa values of acids and bases is only valid for aqueous
solutions. The addition of an organic solvent, which changes the dielectric constant,
shifts this value (▶ Sect. 4.4). This applies even more to the binding site of
a protein or the interior of a membrane. In individual cases, experimental values
have been determined by NMR spectroscopy and isothermal titration calorimetry.
The absorption of an active substance, for example, out of the intestines into blood,
should be dependent on the pH of the surrounding medium and the pKa of the
substance, just as the distribution between an aqueous buffer system and an organic
phase is. The absorption should follow very similar profiles to the distribution. In
the 1950s, Brodie, Hogben, and Schanker formulated the pH–partition theory to
this effect. It says that the dependence of absorption on the pH value, the
pH–absorption profile, is identical to the pH–partition profile (Sect. 19.4). This
theory was confirmed by, among other things, the investigation of the rate constant
of absorption of a few acids and phenols from the colon of the rat at pH 6.8. The
neutral forms of the strong acids 5-nitrosalicylic acid (pKa = 2.3), salicylic acid
(pKa = 3.0), m-nitrobenzoic acid (pKa = 3.4), and benzoic acid (pKa = 4.2) display
comparable lipophilicity with log P values between 1.8 and 2.3. Under experimental
conditions near neutral pH, they are largely dissociated. Well below 1% of each is in
the neutral form. Therefore they are distinctly more slowly absorbed than the
comparably lipophilic, weakly acidic phenols p-hydroxypropiophenone
(pKa = 7.8) and m-nitrophenol (pKa = 8.2), which are more than 90% in their
neutral form at pH 6.8.
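The neutral fractions behind this comparison follow from the Henderson–Hasselbalch relationship. The Python sketch below uses the pKa values quoted in the text and treats every compound as a simple monoprotic acid in water at pH 6.8, which is a simplifying assumption.

```python
def neutral_fraction_acid(pka, ph):
    """Fraction of a monoprotic acid present as neutral HA at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values as quoted in the text
PKA = {
    "5-nitrosalicylic acid": 2.3,
    "salicylic acid": 3.0,
    "m-nitrobenzoic acid": 3.4,
    "benzoic acid": 4.2,
    "p-hydroxypropiophenone": 7.8,
    "m-nitrophenol": 8.2,
}

for name, pka in PKA.items():
    percent = 100.0 * neutral_fraction_acid(pka, 6.8)
    print(f"{name:24s} {percent:8.3f} % neutral at pH 6.8")
```

The four carboxylic acids come out far below 1% neutral form, whereas both phenols are more than 90% neutral, in line with the difference in absorption rate described above.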
Neutral forms can diffuse through membranes; charged forms are readily soluble in
water. An equilibrium is quickly established between the two forms in an aqueous
medium and also at the phase boundaries. In the case that the pKa values of the
substances are not more than 2–3 units from the neutral value of pH 7, the neutral
form is present in the aqueous phase at the entirely adequate concentration of about
0.1–1%. The latter penetrates into the membrane. In the aqueous phase it is
immediately regenerated by the dissociation equilibrium. In a biological system
the distribution of such substances is accomplished quickly and effectively
(Fig. 19.7), and indeed even better the closer the pKa value is to the neutral pH 7.
This also explains why so many drugs are organic acids or bases. Because of the
strongly deviating pH values in the stomach and intestines, at some place along the
gastrointestinal tract the conditions are right that a neutral substance, an acid, or
a base can be well absorbed. If the pKa values are too far from the physiological pH
values, for example, amidines or guanidines with extremely high pKa values, the
absorption can become problematic. This is also true for zwitterionic compounds,
for example, amino acids, and for compounds with multiple acidic or basic groups
in the molecule. Because of the large volume available for the distribution the
diffusion occurs overwhelmingly from the gastrointestinal tract into blood or tissue
and only to a negligible extent in the opposite direction (Fig. 19.7).
The absorption of strongly acidic compounds outside the range in which the
compound exists as a neutral molecule runs, to a first approximation, parallel to the
difference pH − pKa, and for bases the difference is pKa − pH. There are exceptions
to this approximation. Highly lipophilic compounds require a more detailed descrip-
tion of the pH–absorption profile. The neutral forms of these substances enter the
lipid phase as soon as they come near the membranes. The neutral molecule is being
constantly removed from the dissociation equilibrium, which is established in the
19.5 Absorption Profiles of Acids and Bases 407
[Fig. 19.7 (scheme), panels a–d: transfer of a neutral substance N, an acid HA/A−, and bases B/BH+ from the stomach (pH = 1) and the intestines (pH = 6–8) into the blood circulation (pH = 7.4)]
Fig. 19.7 (a) A moderately polar neutral substance N is absorbed very well from the stomach as
well as from the intestines. It is quickly distributed in the circulation so that the back-transport does
not play a notable role. (b) An organic acid HA (pKa = 4) is absorbed well from the stomach, as
long as it is not too polar, because it exists there overwhelmingly in the neutral form. The
absorption is facilitated by the fact that the free acid is in considerably lower concentration in
the blood than in the stomach. The formation of an anion shifts the concentration gradient in this
direction. The absorption is slower from the gut because there the equilibrium lies overwhelmingly
on the side of the ionized form. (c) A weak base (pKa = 5) is absorbed relatively poorly from the
stomach because it exists overwhelmingly in its polar, protonated form. It is well absorbed in the
intestines because it exists as its neutral form there. (d) A strong base with a pKa = 9 cannot be
absorbed from the stomach. The equilibrium indeed lies heavily on the side of the protonated form in
the intestines, but the non-polar form is supplied in adequate quantities. Therefore the substance
can be absorbed. When a pKa value of >11 is reached by a substance, the concentration of the
neutral, bioavailable form is too low for good absorption.
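The concentration gradient mentioned for the acid in panel (b) can be estimated in the idealized equilibrium limit. The Python sketch below assumes that only the neutral form crosses the barrier and that its concentration becomes equal on both sides, so that the ratio of total concentrations is determined by the degree of ionization in each compartment. Real absorption is a dynamic process, so the numbers indicate only the direction and rough size of the driving force. The function names are illustrative.

```python
def total_to_neutral(pka, ph, kind):
    """Ratio (neutral + ionized) / neutral for a monoprotic acid or base."""
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return 1.0 + 10.0 ** exponent


def equilibrium_ratio(pka, ph_from, ph_to, kind):
    """Total concentration in the target compartment relative to the source
    compartment when the neutral form has equilibrated across the barrier."""
    return total_to_neutral(pka, ph_to, kind) / total_to_neutral(pka, ph_from, kind)


# Acid with pKa = 4 and weak base with pKa = 5, stomach (pH 1) to blood (pH 7.4)
print(f"acid, pKa 4: {equilibrium_ratio(4.0, 1.0, 7.4, 'acid'):.0f}")
print(f"base, pKa 5: {equilibrium_ratio(5.0, 1.0, 7.4, 'base'):.1e}")
```

Under these assumptions the acid accumulates in the blood by a factor of several thousand, because it is trapped there as the anion, whereas the weak base is held back in the stomach in its protonated form.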
[Fig. 19.8 (plot): amount of an acid AH distributed or absorbed versus pH value; curves: pH–absorption diagram and pH–distribution diagram, separated by ΔpH = pH shift]
Fig. 19.8 The dependence of the absorption of lipophilic acids on the pH value, the absorption
profile (red curve), deviates decidedly from the pH distribution curve (black curve, see Fig. 19.6).
Whereas the pH-distribution profile is valid for an equilibrium system, a steady state
is established during absorption. Even at relatively high pH values, that is, when only small
concentrations of the neutral species are present, a fast absorption of these few molecules is achieved.
Because of the high anion concentrations and the continuous adjustment of the dissociation
equilibrium, a minimally necessary concentration of the neutral species is maintained. The shift
in the pH-absorption profile is referred to as a pH shift. Analogous shifts are observed in the
opposite direction for lipophilic bases.
coefficient P. For this, the ratio of the sums of the concentrations of the ionized and
non-ionized forms of an investigated compound in the two phases is considered. The pH
value is adjusted for the measurement with a buffer solution so that the addition of the
investigated compound does not shift the pH. Usually log D, the logarithm of the
distribution coefficient, is used instead.
results often can only be compared within a series of structurally related substances.
Assay systems with artificial membranes (PAMPA, from parallel artificial
membrane-permeability assay) can be constructed that allow high-throughput
screening. Moreover, the penetration behavior in liposomes can be evaluated by
surface plasmon resonance.
When experimentally determining the absorption of different substances, results
obtained from saturated solutions of the substances should not be compared with
results from solutions with concentrations well below the saturation limit. Due to
the lower solubility of the lipophilic compounds, their saturated solutions will have low
concentrations, which gives the false impression of poorer absorption. In the second case,
in which comparable concentrations are used for all test compounds, improved or good
absorption is also found for the lipophilic substances. A comparison of such different
experimental conditions will lead to incorrect conclusions. Further confusion occurs when the
terms absorption and bioavailability are incorrectly applied (▶ Sect. 9.1). The
absorption of a substance can be excellent, but the bioavailability is nonetheless
poor. Lipophilic compounds and substances with a molecular weight of more than
500–600 Da are often well absorbed, but suffer from very fast biliary (via the bile)
elimination. This usually happens during the first liver passage (first pass effect,
▶ Sect. 9.1) directly after absorption from the intestines. To achieve good bioavail-
ability, the lipophilicity must not be too high. The excretion path also depends on
the lipophilicity. In general, extremely lipophilic substances are more quickly
metabolized, but are also toxicologically worrisome. Hydrophilic substances and
polar metabolites, including those after conjugation with polar groups, are excreted
via the kidneys. The excretion of lipophilic substances is usually accomplished
hepatically, and subsequently via the intestines. Such substances often undergo
oxidative metabolism, with the concomitant possibility of toxic metabolites being
produced.
Substances that interact with membrane-bound receptors or ion channels can
often access their targets more easily if they are enriched in the surrounding
membrane. For this, the substances should be lipophilic, or should carry a large
lipophilic group with which they can be anchored in the membrane (▶ Sect. 4.2,
▶ Fig. 4.2).
Aside from the set-up of suitable test systems to systematically record parameters
that determine the pharmacokinetic properties, major effort has been spent to
establish rules and computer models to predict favorable ADME properties. In
the first place, the rule of five must be mentioned, which was developed by Chris
Lipinski at Pfizer. Accordingly, an active substance should not violate more than
two of the rules listed in Table 19.1. These simple rules were derived from
experience and are almost exclusively used to preselect compounds for screening.
Tudor Oprea refined these rules further and extended them to cover the occurrence
of particular structural building blocks such as, for instance, the maximum number
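A rule-based preselection of this kind is easy to automate. The Python sketch below counts violations using the thresholds commonly quoted for Lipinski's rule of five (molecular weight above 500 Da, log P above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). These thresholds are an assumption here, because Table 19.1 is not reproduced in this section, and the class, field names, and example compounds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed molecular properties of one compound (hypothetical container)."""
    mol_weight: float        # in Da
    log_p: float             # octanol/water partition coefficient
    h_bond_donors: int       # number of OH and NH groups
    h_bond_acceptors: int    # number of N and O atoms


def rule_of_five_violations(p: Properties) -> int:
    """Count how many of the four commonly quoted criteria are violated."""
    checks = [
        p.mol_weight > 500,
        p.log_p > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Hypothetical compounds
small_polar = Properties(mol_weight=350.0, log_p=2.1, h_bond_donors=2, h_bond_acceptors=5)
large_greasy = Properties(mol_weight=720.0, log_p=6.2, h_bond_donors=6, h_bond_acceptors=12)

for label, props in (("small_polar", small_polar), ("large_greasy", large_greasy)):
    print(label, rule_of_five_violations(props))
```

Returning the number of violations, rather than a pass or fail flag, leaves the choice of the tolerated number of violations to the user.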
19.7 Computer Models and Rules to Predict ADME Parameters 411
Active substances are initially investigated in simple in vitro test models, for
instance, with respect to enzyme inhibition, receptor binding, in cell cultures, and
later in organs and animal models. As a general rule, the simplest model is chosen
for which the results are predictive of the effect that can be expected in an animal or
in humans. For this it is necessary to derive quantitative relationships between the
different test models, so-called activity–activity relationships. These describe the
relationship between biological activities, for instance, between in vitro and in vivo
data. In the best case, they even allow the extrapolation of the values of binding
affinity in an inhibition assay to the therapeutic effect in humans.
The confirmation of a correlation between a simple test model and a therapeutic
effect is often more important than the derivation of a structure–activity relation-
ship. After finding the relevant, quantitative relationship, inexpensive and quickly
performed tests can be used instead of laborious animal experiments. The number
of animal tests is reduced in this way considerably. But that is not the only
advantage. The use of automated molecular testing systems allows the profile of
active substances to be reliably characterized.
Prior to the biological testing of an active substance, the following questions must
be clarified. What therapeutic goal should be achieved? How is this goal to be
realized? Therapeutic concepts are derived from the pathophysiology of the disease
mechanism. Regulatory intervention with drugs should restore the original physi-
ological condition as far as possible. Problems can occur in the process: to imitate
natural ligands of enzymes and receptors, the active substance must demonstrate
adequate specificity and must distinctly access the target site.
19.9 Natural Ligands Are Often Unspecific 413
How specifically should a drug act? There is no absolute answer to this question.
Because active substances are almost always administered orally or intravenously,
they act systemically, that is, on the entire organism. The lack of limitation to
a particular organ or a particular compartment must be compensated for with
a higher specificity. At any rate the drug must act as specifically as necessary to
achieve a successful therapy with tolerable side effects.
In the case of enzyme inhibitors, substances are preferred that act so specifically
that only one particular enzyme is inhibited. Unspecific inhibitors that simultaneously
inhibit multiple serine or metalloproteases would wreak havoc in an organism.
A thrombin inhibitor, which should reduce an increased thrombosis risk, must
not act simultaneously as an inhibitor of the closely related plasmin, which causes
fibrinolysis, that is, the dissolution of blood clots that have already formed. The
situation with kinase inhibitors (▶ Sect. 26.3) is a bit different. Because of the
similarity among kinases, one member of the family can take over the task of another,
related kinase that has been blocked. In doing so it reduces the therapeutic effect
to nothing. Here, a broad-spectrum kinase inhibitor that can simultaneously shut off
an entire protein family might be desirable. A broad-spectrum action that
inhibits multiple isoenzymes of a parasite equally well can also be beneficial for
antibacterial or antiparasitic compounds (e.g., plasmepsins, ▶ Sect. 24.7).
Receptor agonists and antagonists should also display a high selectivity.
β-Agonists that are used to treat asthma (▶ Sect. 29.3) must be β2-specific so that
they do not induce an undesirable increase in the heart rate or blood pressure. Often
the necessary effect cannot be achieved with only one drug. The simultaneous
use of multiple drugs is often indicated for the treatment of arterial
hypertension (▶ Sect. 22.10). More complex, multifactor-induced disease processes
must be treated by addressing multiple mechanisms. Because of the low
dosing of the different components, the unspecific side effects of the individual
components fade into the background.
The specificity is critical for the effect of CNS-acting drugs. Progress in gene
technology has provided us with an explosion of knowledge about receptors, but
also a dilemma. We know the exact receptor profile of established substances. We
know what specificity must be achieved to imitate a particular type of action. But in
many cases, we do not know which profile should be present to achieve a better
therapeutic effect. An example should clarify this point. Neuroleptics and many
antidepressants (▶ Sect. 1.6) act on neuroreceptors. The classic neuroleptics chlor-
promazine 19.1 and haloperidol 19.2 (Sect. 19.9), which are used in the treatment
of schizophrenia, are relatively unspecific dopamine receptor antagonists
(Table 19.2). The mixed-type neuroleptic/antidepressant sulpiride 19.3 acts on the
D2 and D3 receptors simultaneously. All of these substances cause side effects on
the muscular–skeletal system, as is observed in Parkinson’s disease (Sect. 19.4),
which is caused by a dopamine deficiency. Because of their mode of action, it was
assumed that the side effects of the neuroleptics were inevitable consequences of
antagonism of the dopamine receptors. Then an atypical neuroleptic, clozapine 19.4,
19.10 Specificity and Selectivity of Drug Interactions 415
Table 19.2 The natural neurotransmitter dopamine binds with higher affinity to dopamine
receptors of the D1-type. The classic neuroleptics chlorpromazine 19.1, haloperidol 19.2, and
(S)-sulpiride 19.3 differ from clozapine 19.4 (Fig. 19.9) in one point: they have no
comparable selectivity for the D4 receptor.

Binding to the dopamine receptors, Ki in nM

                        D1-type             D2-type
Substance               D1        D5        D2      D3      D4
Dopamine                0.9       <0.9      7       4       30
Chlorpromazine 19.1     30        130       3       4       35
Haloperidol 19.2        80        100       1.2     7       2.3
(S)-Sulpiride 19.3      45,000    77,000    25      13      1,000
Clozapine 19.4          170       30        230     170     21
Fig. 19.9 Chlorpromazine 19.1, haloperidol 19.2, and sulpiride 19.3 are neuroleptics with typical
side effects that are associated with dopamine antagonists. Clozapine 19.4 differs from these
substances in its binding profile on the dopamine receptors (Table 19.2) as well as in its side
effects.
came along (Fig. 19.9). It does not have the described side effects. Today we know
that clozapine, in contrast to the other neuroleptics, acts much more potently on the
D4 receptor than on the D2 and D3 receptors (Table 19.2). The concentration at
which clozapine acts on the D4 receptor, and which was also measured in the
cerebrospinal fluid of treated patients, is sufficient for clozapine to also
bind to particular serotonin and muscarinic receptors, partly with even higher
affinity. It could therefore also be that the antagonistic effects of clozapine
on these receptors are responsible for the atypical effects.
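The D4 preference of clozapine can be read off Table 19.2 as a ratio of Ki values. The following minimal Python sketch computes Ki(D2)/Ki(D4) and Ki(D3)/Ki(D4) for the four neuroleptics. The Ki values are copied from the table. The helper name d4_preference is hypothetical and was chosen only for this illustration. A ratio above 1 means that the substance binds more tightly to D4 than to the reference receptor.

```python
# Ki values in nM, copied from Table 19.2 (a lower Ki means tighter binding).
KI_NM = {
    "chlorpromazine": {"D2": 3.0, "D3": 4.0, "D4": 35.0},
    "haloperidol": {"D2": 1.2, "D3": 7.0, "D4": 2.3},
    "(S)-sulpiride": {"D2": 25.0, "D3": 13.0, "D4": 1000.0},
    "clozapine": {"D2": 230.0, "D3": 170.0, "D4": 21.0},
}


def d4_preference(ki, reference):
    """Hypothetical helper: Ki(reference) / Ki(D4); > 1 means D4 is preferred."""
    return ki[reference] / ki["D4"]


# Substances that prefer D4 over both D2 and D3.
d4_selective = [
    name
    for name, ki in KI_NM.items()
    if d4_preference(ki, "D2") > 1.0 and d4_preference(ki, "D3") > 1.0
]

for name, ki in KI_NM.items():
    print(
        f"{name:15s} Ki(D2)/Ki(D4) = {d4_preference(ki, 'D2'):7.3f}   "
        f"Ki(D3)/Ki(D4) = {d4_preference(ki, 'D3'):7.3f}"
    )
print("Prefers D4 over both D2 and D3:", d4_selective)
```

With the tabulated values, clozapine is the only substance for which both ratios exceed 1 (roughly 11 and 8). Such ratios describe binding only; as the text points out, they do not by themselves explain the clinical profile.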
416 19 From In Vitro to In Vivo: Optimization of ADME and Toxicology Properties
[Plot, presumably Fig. 19.10: haloperidol displacement (r = 0.87) against log (average clinical dose, in mmol/kg)]
Many drugs are classified as “dirty drugs” because of their multifaceted action
on many, totally different receptors. From the pharmacologists’ point of view, such
a characterization is appropriate. A general statement about the therapeutic value
cannot be derived from that. It may well be that many dirty drugs are optimal for
therapy because of their balanced action on multiple receptors. Recently, these
compounds have been termed “rich in pharmacology” and they define a “polyphar-
macology.” The suitability or unsuitability of a drug is only decided in the clinical
testing and later by the experience from broad application in patients.
The differences between enzymes and receptors in different species also offer
a chance to achieve the desired selectivity therapeutically. Species differences play
a role if an undesired organism is to be killed, that is, with antibiotics,
antimycotics, antivirals, and antiparasitic drugs. To avoid side effects in humans,
the metabolic pathways of the bacteria, fungi, viruses, or parasites are purposefully
attacked either with adequate selectivity or by selecting a point of action that is
not present in higher organisms (see ▶ Sects. 23.7, ▶ 24.3, ▶ 27.2, or ▶ 30.8).
Table 19.3 Correlation of the clinical efficacy (Fig. 19.10) of 25 different neuroleptics and their
potency in different animal models that are typically used for the evaluation of neuroleptic effects
with the displacement of dopamine or haloperidol 19.2. The clinical data and the results of the
animal models correlate conspicuously better with the displacement of the D2-type ligand
haloperidol than with the displacement of the D1-type ligand dopamine (r = correlation coefficient).

                                                                                  Correlation (r) with displacement of
Model                                                                             dopamine    haloperidol
Mean clinical dose in humans                                                      0.27        0.87
Inhibition of the stereotypical behavior after application of apomorphine (rat)   0.46        0.94
Inhibition of the stereotypical behavior after application of amphetamine (rat)   0.41        0.92
Protection from apomorphine-induced emesis (dog)                                  0.22        0.93
unravel correlations between the results of in vitro models, animal experiments, and
the potency of these substances in humans. Two radioactively labeled ligands,
dopamine and haloperidol 19.2 (Sect. 19.10, Fig. 19.9), one of which prefers the
D1-type and the other prefers the D2-type dopamine receptor, were used to charac-
terize binding. It was demonstrated that the average clinical dose significantly
correlated with the displacement of the D2-type ligand haloperidol 19.2. Signifi-
cantly higher concentrations were needed to displace the D1-type ligand dopamine.
A correlation with these data is virtually non-existent. Not only the clinical efficacy,
but also the data from animal models that are used to test for neuroleptic effects
correlate better with the displacement of haloperidol than with dopamine
(Table 19.3). In hindsight, the results suffer from a lack of ligand specificity for a
single receptor, and the preparations are affected by receptor heterogeneity because
the presence of the different receptor subtypes was not standardized in the calf brain
homogenates that were used. All substances were investigated with dirty ligands in
dirty test models. The profile of active substances can only now be unambiguously
assigned by using uniform receptor subtypes, which are produced by using gene
technology (see Table 19.2).
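One way to make the difference between the two columns of Table 19.3 more tangible is to square the correlation coefficients. In a simple linear regression, r squared is the fraction of the variance in one quantity that is accounted for by the other. The following Python sketch applies this to the tabulated values. It is only an illustration: the r values are taken from the table, while the shortened model labels and the function name explained_variance are assumptions made here.

```python
# Correlation coefficients r from Table 19.3:
# (with dopamine displacement, with haloperidol displacement).
R_VALUES = {
    "mean clinical dose (human)": (0.27, 0.87),
    "apomorphine stereotypy (rat)": (0.46, 0.94),
    "amphetamine stereotypy (rat)": (0.41, 0.92),
    "apomorphine emesis (dog)": (0.22, 0.93),
}


def explained_variance(r):
    """Coefficient of determination of a simple linear fit: r squared."""
    return r * r


for model, (r_dopamine, r_haloperidol) in R_VALUES.items():
    print(
        f"{model:30s} dopamine r^2 = {explained_variance(r_dopamine):.2f}   "
        f"haloperidol r^2 = {explained_variance(r_haloperidol):.2f}"
    )
```

For the mean clinical dose, haloperidol displacement accounts for about 76% of the variance, whereas dopamine displacement accounts for only about 7%.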
There are many cases in which the relationship between different test models
depends strongly on the species used. Investigations on isolated arteries and veins
from the lungs of rabbits, sheep, pigs, and humans show that the vascular
preparations from rabbits and humans react to noradrenaline in a comparable
way. Sheep and pig arteries are significantly less sensitive. Isolated pig veins
cannot be stimulated at all at comparable doses of noradrenaline. The experimental
results are even more inhomogeneous and difficult to interpret upon stimulation
with acetylcholine. It must not be forgotten that the metabolism in humans and in
animal species is also different and exerts an influence on the test results.
Tachykinins are short peptides that trigger a wealth of physiological and patho-
logical processes. Their central role in pain and asthma is certain. They act over the
NK1, NK2, and NK3 receptor subtypes, which also bind specifically to the three
Table 19.4 Binding of substance P and displacement by the antagonist CP 96 345 19.5 (tested as
a racemate) on cells of different origins.
[Structure: 19.5 CP 96 345]
Table 19.5 Inhibition of the renins of humans and other animal species by remikiren 19.6 and
aliskiren 19.7.
[Structures: 19.6 Remikiren, 19.7 Aliskiren]
effects, but it would have been judged to be much too weak. A comparison of the
X-ray structure analysis of the renins from the mouse and human also shows
a conserved binding mode in the main chain of the peptide inhibitors that is
common to other aspartic proteases. Subtle differences are found at the rim of the
binding pocket arising from sequence differences between the species.
The amino acid sequences of 5-HT1B and 5-HT1Dβ subtypes of the serotonin
receptors of humans and rats show more than 90% identity. If the relationships
between the individual amino acids are considered, a homology of 95% is obtained.
Despite these similarities, a series of active substances bind with very different
affinities to these two receptors. The difference is traceable to a single amino acid:
the exchange of threonine 355 for an asparagine (Fig. 19.11). The human receptor
is, from the point of view of the affinity, converted to the rat receptor by this
mutation! After the exchange of this amino acid, the β-blockers propranolol and
pindolol bind with approximately three orders of magnitude higher affinity.
The affinities of many other ligands, on the other hand, are significantly reduced.
[Fig. 19.11, plot: log 1/Ki (rat) against log 1/Ki (human); labeled ligands: 5-carboxamido-tryptamine (5-CT), 5-hydroxytryptamine, (−)-propranolol, pindolol, metergoline, sumatriptan, methysergide, 5-OMe-diMe-tryptamine, rauwolscine, N,N′-dipropyl-5-CT]
Fig. 19.11 Different serotonin receptor ligands and the β-blockers propranolol and pindolol show
very different binding affinities on very similar 5-HT receptors from rats and humans. The open
circles refer to the wild-type human receptor. They are irregularly distributed over the diagram
(correlation coefficient r = 0.27). If one amino acid in the human receptor is exchanged for the
corresponding amino acid in the rat receptor, the binding profile changes. With respect to the
affinity of the ligands, the human receptor becomes a rat receptor. The black-filled circles refer
to this Asn355 mutant (correlation coefficient r = 0.98).
This may indeed only be a weak indication, but it can be speculated that the two
β-blockers bind to the mutated 5-HT receptor as they do to the β-receptor.
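To give a feeling for what "three orders of magnitude" in affinity means energetically, the change can be converted into a difference in binding free energy with the standard relation ΔΔG = RT ln(fold change). The Python sketch below does this under two assumptions that are not stated in the text: a temperature of 298.15 K, and Ki treated as a thermodynamic dissociation constant. The function name is hypothetical.

```python
import math

R_GAS = 8.314462618  # gas constant in J/(mol*K)


def ddg_kj_per_mol(fold_change, temperature_k=298.15):
    """Free-energy difference for a given fold change in Ki.

    Uses ddG = R * T * ln(fold_change); 298.15 K is an assumed temperature.
    """
    return R_GAS * temperature_k * math.log(fold_change) / 1000.0


# A 1,000-fold gain in affinity, as reported for propranolol and pindolol.
ddg = ddg_kj_per_mol(1000.0)
print(f"ddG for a 1,000-fold affinity change: {ddg:.1f} kJ/mol")
```

Under these assumptions, the single Thr355 to Asn exchange corresponds to roughly 17 kJ/mol of additional binding free energy for the two β-blockers, which is of the order of magnitude expected for one or two well-placed polar interactions.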
One of the most difficult chapters in preclinical research is the estimation of the
toxicity of a substance, above all the human toxicity, from data that were obtained
from other species. Such considerations must be made to be able to estimate the
potential danger of the substance before it is introduced to the clinic. Are there any
drugs without toxicity and without side effects? Paracelsus recognized in the
sixteenth century:
Everything is poison and nothing is without poison, it is the dose alone that makes a thing
non-poisonous.
The determination of the acute toxicity in multiple animal species, and the
determination of the chronic toxicity in at least two animal species is routine before
entry into clinical trials, phase I, which is tolerability testing on healthy volunteers.
It is considered to be standard that the species for the chronic toxicity investigations
should be selected according to which animal species displays the most similarity to
humans in their pharmacokinetics and metabolism.
Cats and guinea pigs react extremely sensitively to cardiac glycosides. Therefore
they were previously used as models for the effect on humans. Rats react consid-
erably less sensitively. The hallucinogen lysergic acid diethylamide (LSD 2.21,
▶ Sect. 2.5) shows decidedly different toxicity in multiple animal species. An
experiment to test the hallucinogenic effects of LSD on an elephant led to
a disaster. A hallucinogenic, but non-toxic dose was desired. Despite carefully
estimating this dose, the elephant died within minutes after 0.3 g of LSD
(corresponding to 0.06 mg/kg) was administered. Relative to the mouse, which is
relatively insensitive (Table 19.6), the elephant reacted at least 1,000 times more
sensitively. This experiment was not repeated! The discoverer of LSD, Albert
Hofmann, took 0.25 mg of LSD in his first controlled self-experiment. With
about 0.0035 mg/kg he was significantly below the dose that cost the elephant its
life. Despite this, it can be assumed that LSD is less toxic for humans than it is for
elephants. Direct fatality through LSD is not known, only mortality that occurs as
a result of accidents or from suicide while in the psychotic state.
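The dose figures quoted above can be checked with simple arithmetic. The following Python sketch back-calculates the body masses implied by the absolute and weight-related doses given in the text and compares the two weight-related doses. All four numbers come from the text; the function and variable names are illustrative assumptions.

```python
def implied_body_mass_kg(dose_mg, dose_mg_per_kg):
    """Body mass implied by an absolute dose and a dose per kg body weight."""
    return dose_mg / dose_mg_per_kg


ELEPHANT_DOSE_MG = 0.3 * 1000.0   # 0.3 g of LSD
ELEPHANT_DOSE_MG_PER_KG = 0.06
HOFMANN_DOSE_MG = 0.25
HOFMANN_DOSE_MG_PER_KG = 0.0035

elephant_mass = implied_body_mass_kg(ELEPHANT_DOSE_MG, ELEPHANT_DOSE_MG_PER_KG)
hofmann_mass = implied_body_mass_kg(HOFMANN_DOSE_MG, HOFMANN_DOSE_MG_PER_KG)
dose_ratio = ELEPHANT_DOSE_MG_PER_KG / HOFMANN_DOSE_MG_PER_KG

print(f"Implied elephant body mass: {elephant_mass:.0f} kg")
print(f"Implied human body mass:    {hofmann_mass:.0f} kg")
print(f"Elephant dose per kg relative to Hofmann's: {dose_ratio:.0f}-fold")
```

The quoted values imply an elephant of about 5,000 kg and a human of about 71 kg. Per kilogram of body weight, the elephant received roughly 17 times the dose of Hofmann's self-experiment.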
The toxicities of poisons that end up in our environment are investigated very
thoroughly. Chlorinated dibenzodioxins and furans form during the uncontrolled
chemical decomposition of the correspondingly substituted chlorophenols. The
Seveso accident is attributed to such an incident. Toxic chlorinated dioxins and
furans also occur during many combustion processes. Tetrachlorodibenzodioxin 19.8
(TCDD, “Seveso dioxin”) is one of the best-investigated substances
with regard to its toxicity. Even here, different species react differently (Table 19.7).
A difference of three orders of magnitude is found between the two relatively closely
related species of hamster and guinea pig. Accordingly, it is difficult to draw
conclusions about the toxicity in humans. If an extrapolation is made from
primates to humans, TCDD would be classified as relatively non-toxic. In connection
with humans, the definition of an acute LD50 is absolutely inappropriate.
[Structure: 19.8 2,3,7,8-Tetrachlorodibenzodioxin]
To be able to exclude one fatality per one million people, an “LD0.00001” must be
determined or calculated. Because of its pronounced mutagenic effects, the
long-term damage stands in the foreground with TCDD. It is questionable in this
case whether an absolute no-effect level, that is, the lowest ineffective dose, can
be defined. The estimation of the potential danger of environmentally
relevant chemicals looks entirely different if considered relative to toxic natural
products, natural radioactivity, cosmic radiation, etc., or even when compared
to socially tolerated substances of abuse such as alcohol and nicotine. This
puts some things into perspective that are very contentiously discussed in public
forums.
A difficult problem must be mentioned when discussing structure–activity
relationships from in vitro investigations in order to estimate the mutagenic and
carcinogenic potential. Such tests indeed afford valuable information that must be
carefully checked. In individual cases, however, they constitute proof in neither the
positive nor the negative sense.
Developing theoretical models for estimating toxicity and carcinogenicity that
have adequate reliability and predictive power has proven to be extremely difficult.
The mechanisms that are responsible for the activity are too diverse and multifac-
eted, and the chemical structures and structure–activity relationships, which are
only valid for one substance class, are too different. Today, testing for toxic,
carcinogenic, and teratogenic adverse effects has reached a high standard. The
pharmaceutical catastrophes of earlier decades such as the following would be
almost impossible with today’s standards:
• Early childhood brain damage and death of many premature and full-term newborns
caused by the sulfonamides in the late 1930s,
19.13 Animal Protection and Alternative Test Models 423
• Over 100 fatalities in the USA because of the use of diethylene glycol as
a solvent for sulfanilamide (this incident led to the Federal Food, Drug, and
Cosmetic Act of 1938, which considerably extended the authority of the Food and
Drug Administration, FDA),
• The SMON (subacute myelo-optic neuropathy) illness of thousands of Japanese,
caused by the prolonged and too-frequent use of an antidiarrheal medicine,
• The severe birth defects of approximately 10,000 children worldwide that were
caused by thalidomide (Contergan®) in the late 1950s.
Nonetheless, criminal intrigue and the uncontrolled distribution of counterfeit drugs
by internet-based providers or the unscrupulous pursuit of economic advantages
can still cause such catastrophes today. The melamine-contaminated baby formula
(melamine makes the protein content of inferior or diluted milk seem higher) in
September 2008 in China, which sickened many thousands of toddlers and babies
and killed several, serves as an example.
Moreover, in addition to the markedly stricter testing guidelines for medicines
that exist today in most countries, there is a reporting system that registers and
investigates adverse drug effect incidents. The slightest suspicion of a causal
relationship results in anything from public announcement or warning all the way
to the withdrawal of the marketing license.
A complication for the estimation of the toxicity is the formation of toxic, and
particularly reactive metabolites, even in small amounts. As was already discussed
in ▶ Sects. 9.1 and 19.6, an ideal drug should contain predetermined cleavage and/
or conjugation sites in addition to finely tuned pharmacodynamics and pharmaco-
kinetics. The more these requirements are fulfilled, the lower the risk that the
substance will exert toxic effects.
Some toxicity studies suffer from the fact that the extrapolated results to humans
reflect a higher toxicity than is actually the case because of the unphysiologically
high doses that are used in the studies. On the other hand, even the most compre-
hensive investigation cannot eliminate the risk of serious adverse effects occurring
in extraordinarily rare cases once the drug is used broadly. An adverse effect ratio
of 1:10,000 or less can remain undiscovered in even the most careful preclinical and
clinical trial.
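Why an adverse effect with a frequency of 1:10,000 can escape even a careful trial follows from elementary probability. The Python sketch below estimates the chance of observing at least one case among n treated patients, and the number of patients needed to see at least one case with 95% probability. It assumes that cases occur independently with a constant incidence. The trial size of 3,000 patients is a hypothetical figure chosen only for illustration, and the function names are likewise assumptions.

```python
import math


def prob_at_least_one(incidence, n_patients):
    """Probability of seeing one or more cases among n independent patients."""
    return 1.0 - (1.0 - incidence) ** n_patients


def patients_needed(incidence, confidence=0.95):
    """Smallest n for which at least one case is seen with the given probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - incidence))


INCIDENCE = 1.0 / 10_000        # adverse effect ratio from the text
HYPOTHETICAL_TRIAL_SIZE = 3000  # assumed number of treated patients

p_seen = prob_at_least_one(INCIDENCE, HYPOTHETICAL_TRIAL_SIZE)
n_needed = patients_needed(INCIDENCE)

print(f"Chance of at least one case in {HYPOTHETICAL_TRIAL_SIZE} patients: {p_seen:.0%}")
print(f"Patients needed for a 95% chance of at least one case: {n_needed}")
```

Under these assumptions, a trial of 3,000 patients has only about a one-in-four chance of encountering a single case, and close to 30,000 patients would be needed to see at least one case with 95% probability. Recognizing a single case as drug related is a further hurdle that this sketch ignores.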
Toxic side effects in humans are particularly seen after chronic pharmaceutical
misuse. The life-long consumption of large amounts of pain medication adds up to
kilogram amounts. In the case of phenacetin (▶ Sect. 2.1), this meant that
an effective and, in principle, well-tolerated drug had to be withdrawn
from the market because of the kidney damage that resulted from inappropriate
(abusive) use.
Back in 1780 the philosopher Jeremy Bentham discussed the rights of animals. The
first mass protests against animal experiments were over 100 years ago. In 1875 the
dedicated animal protectionist Frances Power Cobbe founded the first Society
Against Vivisection in England, and the demand that anesthesia be administered
during animal experiments led to the first animal protection law a year later. In
Germany in 1879 the Internationale Gesellschaft zur Bekämpfung der
Wissenschaftlichen Thierfolter (International Society for the Abatement of Scien-
tific Animal Torture) was founded, and in 1883 the American Antivivisection
Society followed. A new militant form of protest against animal experiments,
complete with the violent freeing of experimental animals and attacks on scientists,
came in the 1970s. The book Animal Liberation by Peter Singer was published in
1975 and became the bible of all animal rights activists. The often-cited story of
animal trappers who hawk their prey to the pharmaceutical industry belonged to the
realm of fantasy, even in the early days of drug research. Any pharmacologist
knows that any results obtained from such diverse animals, without any knowledge
of their health history, would be entirely useless.
More in parallel to the development of the animal protection movement than
inspired by it, in the 1960s alternative methods for pharmaceutical research were
implemented that consisted mostly of binding studies on membrane homogenates
and cell culture investigations. The number of animal experiments has been signif-
icantly reduced in the last decades because of the economic motivation arising from
the enormously high costs of breeding and keeping experimental animals, but also
because of the rapid progress of gene technology. As already explained in
▶ Sect. 7.5, models with lower animals such as nematodes, fruit flies, or zebra fish
have been increasingly used in screening. Here the ethical threshold for
animal experiments is certainly lower.
Over 50% of the experimental animals are used for the testing of drugs, 12–15%
in basic research, for investigating medical methods, and for the recognition of
environmental danger. About half of all experimental animals are mice. The rest are
rats and other rodents, and a small part are fish and birds. Only about 1.5% of the
total number are cats, dogs, pigs, and other animals. A large part of the investiga-
tions on the latter species are chronic toxicity studies that are required by law.
The reduction in these numbers is notable because pharmaceutical companies
are investigating the biological activity of more substances than ever before. Each
year, tens or even hundreds of thousands of substances are meticulously character-
ized, usually in automated in vitro tests. Only a few of these substances reach
animal experiments. The number of experimental animals is also to be seen in the
context of the legal requirements of proof of efficacy and safety of new drugs,
which are increasing rather than diminishing. Such experiments must still be
overwhelmingly carried out on animals.
19.14 Synopsis
• Apart from potent and selective binding to a target protein, a successful drug
candidate must exhibit favorable pharmacokinetics. This comprises all processes
that affect the absorption, distribution, metabolism, and excretion along with
minor toxic side effects.
• Due to high costs and enormous experimental effort, full pharmacokinetics and
toxicity studies can only be carried out on a few drug development candidates.
A plethora of test methods have been developed to relate chemical structure with
ADME and toxicology properties in the lead-optimization phase to reduce the
chances of failure due to insufficient pharmacokinetics at a late stage.
• An active substance has to penetrate multiple lipid membrane barriers and aque-
ous compartments on its way from the site of application to the locus of the target
protein. To achieve sufficient distribution, adequate lipophilicity must be present.
This is described by the partition coefficient between the lipid and aqueous phases.
In the simplest model, the distribution between octanol and water is measured.
• Rather sophisticated models have been established to relate chemical structure
with penetration properties. Considerations about the release of the water solva-
tion shell around a drug molecule and its potential to form hydrogen bonds upon
crossing lipid membranes are particularly important.
• Many drugs are either weak acids or bases. Depending on the applied pH, they
exist through dissociation equilibria in either a more lipophilic neutral or more
polar ionized form. Membrane penetration of such species will therefore depend
very much on the local pH conditions.
• Because of the progressively changing pH conditions in the stomach and intes-
tines, at some place along the gastrointestinal tract appropriate pH conditions
exist that allow sufficient penetration of the neutral form of weakly acidic or
basic drug molecules.
• Because of established equilibria, small amounts of the neutral form of an acidic
or basic drug molecule are the intermediate over which the membrane penetra-
tion occurs. Constant removal of the neutral species from the water phase into
the membrane is quickly replenished by the dissociation equilibrium.
• The adjustment of the lipophilicity of a drug is crucial for pharmacokinetics.
Usually the more lipophilic a compound is, the better it will be absorbed;
however, limited solubility in the aqueous phase restricts lipophilicity. Relevant
test models have been developed by using thin layers of human colon cells.
These also allow the absorption by transporters to be studied.
• Active substances are initially tested in simple in vitro test models. Testing is
gradually moved into animal models via cellular assays. Relevant activity–
activity relationships must be established to correlate response in animal models.
At best, results from appropriate in vivo testing allow the therapeutic effects in
humans to be predicted. They standardize test data and reduce the number of
animal experiments that are required.
• Nature works with two orthogonal principles upon release of its native sub-
stances: the specificity of the biological effect and a pronounced spatial com-
partmentalization. Some compounds are highly specific and travel long ways
through the organism to exert their action. Others are locally synthesized and
stimulate their target protein in the immediate vicinity. Here, high specificity and
selectivity are not required.
• Drugs administered orally or intravenously act systemically on the entire organ-
ism; no organ or cell-specific compartmentalization can be achieved. This has to
20 Protein Modeling and Structure-Based Drug Design
that is needed for structure-based design. The “folding problem”, that is, the
prediction of 3D protein structure from the amino acid sequence alone, is still
unsolved. The situation occurs increasingly often, however, that the structure of the
protein of interest is unknown, but the structure of another related protein has been
solved. In such a situation, a model of the unknown protein can be constructed on
the basis of the spatial coordinates of the already characterized biopolymer.
Because of this, at least from the point of view of basic research, it could come to
pass that the extremely exciting question of finding Nature’s rules to the folding
problem will increasingly lose its significance.
If one considers that most of the structures of therapeutically relevant proteins were
determined in the last 15 years, it is even more impressive that the first work in
structure-based drug design was carried out in the 1970s. The pioneers in this area
were Chris Beddell and Peter Goodford, who began to develop methods for ligand
design in 1973 at the Wellcome Research Laboratories. Hemoglobin was chosen as
a protein because it was the only known protein structure at the time that had some
relevance for pathophysiology. The goal of this work was to find a ligand that would
exert an allosteric modulating effect, analogous to diphosphoglyceric acid 20.1
(DPG; Fig. 20.1). The hope was to find a therapeutic approach that could help
homozygotic patients with lethal sickle cell anemia (▶ Sect. 12.13). DPG is syn-
thesized in red blood cells. It binds to hemoglobin and reduces its affinity to oxygen.
In this way, oxygen that is absorbed in the lungs can be released in other tissues.
The part of hemoglobin that binds to DPG contains a large number of positively
charged amino acids (Fig. 20.1). An optimal ligand should therefore contain
a negatively charged group to form multiple salt bridges to hemoglobin just
as DPG does. However, such compounds cannot penetrate the membrane of
a red blood cell. Therefore structures that interact with hemoglobin in other ways
were considered in the Wellcome research group. Compounds were chosen
containing reactive groups that could react with the amino groups of the lysines
Fig. 20.1 Schematic binding mode of diphosphoglyceric acid 20.1 (DPG) to the allosteric
binding site of hemoglobin. The ligand is bound through multiple charge-assisted hydrogen
bonds (N-terminal amino groups, His2, Lys82, and His143) from the β1 and β2 subunits.
20.1 Pioneering Studies in Structure-Based Drug Design 431
[Structures: 20.2 R = H; 20.4 R = H; 20.5 R = OCH2COOH]
Fig. 20.2 Structures of the diphosphoglyceric acid competitive hemoglobin ligands 20.2–20.5
that were developed by Beddell and Goodford.
[Panel label: addition product of 20.5]
Fig. 20.3 Postulated binding mode of the hemoglobin ligands 20.2 and 20.5 after chemical
reaction to the Schiff base or the bisulfite addition product. It is assumed for both compounds
that they bind covalently to the β1 and β2 subunits of hemoglobin through their N-terminal amino
acids. Compound 20.5 should also be able to form hydrogen bonds with its charged groups to the
side chains of amino acids His2 and His143 of the β1 and β2 subunits, as well as Lys82 of the β1
subunit.
in the binding pocket, or with the terminal valine. The idea was to design
a compound containing two correctly spaced reactive groups that could form Schiff
bases with two of these amino groups. Dibenzyl-4,40 -dialdehyde 20.2 (Fig. 20.2)
was chosen as a parent structure. The assumed binding mode of this compound is
shown in Fig. 20.3. Compound 20.2 was synthesized but proved to be too insoluble
for testing. Adequate solubility was achieved by introducing an additional carboxyl
432 20 Protein Modeling and Structure-Based Drug Design
group in 20.3. Moreover, this compound with its carboxyl group should form an
additional favorable interaction with the lysine side chain of the protein. Compounds
20.4 and 20.5 are the bisulfite adducts of the corresponding aldehydes. These
compounds were tested and indeed showed the desired allosteric effect. However,
they bind to the oxy form of the protein and increase its oxygen affinity. They proved
to be potent inhibitors of the erythrocyte deformation that occurs in sickle cell disease
because they stabilize the oxyform. The deformation begins with the aggregation of
the desoxy form. The targeted design of these dibenzyldialdehydes is the first
example of a rational, structure-guided protein-ligand design.
The first step in the design of ligands for a protein with a known 3D structure is the
precise analysis of this structure. What does the protein’s binding pocket look like?
Where are the hot spots, that is, where can the ligand functional groups bind
particularly well? Today, computer programs are available for just such an analysis.
They search the surface of a protein for suitable binding sites for different func-
tional groups. These methods have been introduced in ▶ Sect. 17.10.
Experimental techniques can also help in the search for hot spots. X-ray structure
analysis and NMR spectroscopy are particularly suitable methods (▶ Sects. 7.8 and
▶ 7.9). Alexander Klibanov and Dagmar Ringe first described this strategy. They
initially grew crystals of the enzyme elastase in water and determined its X-ray crystal
structure. Then the crystals were soaked in the organic solvent acetonitrile, and the 3D
structure was determined again. It was shown that the protein on the whole maintains its
structure. On the other hand, significant changes were found in the solvent structure.
Water molecules from the previously determined structure were displaced by acetoni-
trile. Other water molecules remained in their original positions, just as before. This
experiment allowed the differentiation of displaceable versus non-displaceable water
molecules, which presumably relates to their stronger or weaker binding. Furthermore,
additional preferred binding sites of the organic solvent molecule were identified. They
helped to identify and experimentally map out the energetically favorable binding sites
in the protein pocket. At Abbott, NMR spectroscopy was used analogously to elucidate
binding sites by using small molecular probes.
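The comparison of solvent positions between the two crystal structures can be illustrated with
a short Python sketch. It is only a minimal illustration and not the procedure used in the
original study: the coordinates are invented, the function name classify_waters is hypothetical,
and the 1.5 Å tolerance for regarding a water site as still occupied is an assumption.

```python
import math

def classify_waters(waters_aq, waters_org, cutoff=1.5):
    """Split water sites of the aqueous structure into retained and displaced.

    A site counts as retained if the solvent-soaked structure still has a
    water oxygen within `cutoff` angstroms (an assumed tolerance).
    """
    retained, displaced = [], []
    for site in waters_aq:
        if any(math.dist(site, other) <= cutoff for other in waters_org):
            retained.append(site)
        else:
            displaced.append(site)
    return retained, displaced

# Hypothetical water oxygen coordinates (angstroms) after superposition
aqueous = [(1.0, 2.0, 3.0), (5.0, 5.0, 5.0), (9.0, 1.0, 4.0)]
soaked = [(1.2, 2.1, 2.9), (9.1, 0.8, 4.2)]
retained, displaced = classify_waters(aqueous, soaked)
```

Sites that end up in the displaced list would then be inspected for a bound solvent molecule
in the second structure.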
Often, binding affinities for some initial ligands are already available at the time
of the structure elucidation of the target protein. Based on the protein structure, first
attempts are made to derive a crude structure–activity relationship for these first
ligands. Conclusions can be drawn about the essential interactions between the
protein and ligands. If further compounds are discovered by high-throughput
screening (▶ Sect. 7.3), they are docked into the binding pocket to generate ideas
for their subsequent structural optimization. This way, additional areas in the
binding pocket can be identified that are not yet occupied by known ligands. The
indicated additional interactions can be exploited with appropriately modified
compounds to arrive at more potent and selective binders. Ideas for simplifying
a ligand can also be derived from the information provided by the 3D structure:
a substituent of the inhibitor that is not involved in a favorable contact with the
protein can often be removed. On the other hand, such a substituent can be
purposefully changed to improve the solubility, lipophilicity, or transport and
distribution (ADME) properties of the drug candidate (see ▶ Chap. 19, “From
In Vitro to In Vivo: Optimization of ADME and Toxicology Properties”).
In principle, there are two approaches for the design of new active substances.
Either an attempt can be made to find an entirely new structure (Sect. 20.10), or
a known lead structure that was discovered with the techniques discussed in
▶ Chap. 7, “Screening Technologies for Lead Structure Discovery” can be modi-
fied. The modification of a known structure has the advantage that potent and
selective protein ligands can be arrived at relatively quickly. Furthermore, proteins
with known 3D structure, in general, afford more meaningful structure–activity
relationships. The proposed structures, however, will remain very close to the initial
lead structure. Often the 3D structure of an enzyme in complex with a peptide
inhibitor is determined at the beginning. The initially modified lead structures are
usually also peptides. The way to an orally available drug is then rather protracted
under these conditions (▶ Chap. 10, “Peptidomimetics”). The second approach is
represented by de novo design. The corresponding techniques are introduced at the
end of this chapter. De novo design can lead to entirely new, non-peptide structures.
The problem is, however, that this method often leads to an enormous variety of
possible structures that are difficult to prioritize and rank reasonably.
The most important prerequisite for successful structure-based drug design is an
iterative approach. The 3D protein structure is the starting point for the initial
design of a ligand, which is then to be synthesized and tested. In the case of good
binding, an attempt is made to determine the 3D structure of the protein–ligand
complex with the new compound. This structure is the starting point of the next
design cycle. The approach is summarized schematically in Fig. 20.4. The great
advantage of this technique is that all steps of the design hypotheses can be
validated with each cycle. Surprising binding modes that do not correspond to the
original design and that might mask a proper structure–activity relationship also
become immediately apparent. At this point it is also worthwhile to determine the
3D structure of a protein–ligand complex with a ligand that binds poorly with the
protein. This 3D structure usually provides an explanation for the poor binding, and
information is gained that can be translated into proposals for new structures.
Fig. 20.4 The starting point of a design cycle is the determination of the 3D structure of the target
protein. This information is used for the design of new protein ligands that are then synthesized
and tested. If they show activity, their 3D structures in complex with the protein are determined.
On the basis of these structures, ligands with improved binding properties are designed in the next
design cycle.
(a) NEGDAAKGEKEF-NKCKACHMIQAPDGTDIKGGKTGPNLY
(b) -EGDAAAGEKVS-KKCLACHTFDQGGAN-----KVGPNLF
(c) --GDVAKGKKTFVQKCAQCHTVENGGKH-----KVGPNLW
(a) GVVGRKIASEEGFKYGEGILEVAEKNPDLTWTEANLIEYV
(b) GVFENTAAHKDNYAYSESYTEMKAK--GLTWTEANLAAYV
(c) GLFGRKTGQAEGYSYTDA-----NKSKGIVWNNDTIMEYI
(a) TDPKPLYKKMTDDKGAKTKMTFKMGKNQADVVAFLAQBBP
(b) KDPKAFVLEKSGDPKAKSKMTFKLTKDD--------EIEN
(c) ENPKKYI--------PGTKMIFAGIKKKGER-------QD
(a) BAGZGZAAGAGSBSZ
(b) VIAYLK------TLK
(c) LVAYLKSATS
Fig. 20.5 The primary sequences of three cytochrome C proteins arranged using the typical one-
letter code are shown from (a) the denitrifying bacterium Paracoccus denitrificans (134 amino
acids), (b) the proteobacterium Rhodospirillum rubrum (112 amino acids), and (c) the mitochon-
dria of a tuna fish (103 amino acids). The proteins vary in their length and composition. The
sequence comparison shows the alignment with the best agreement. Invariable or conserved
positions in the sequence are marked in bold. The abbreviations stand for A Ala, C Cys, D Asp,
E Glu, F Phe, G Gly, H His, I Ile, K Lys, L Leu, M Met, N Asn, P Pro, Q Gln, R Arg, S Ser, T Thr, V
Val, W Trp, Y Tyr. Dashes stand for areas in which other proteins carry additional amino acids
(insertions). The red bars underscoring the sequences show preferred helical areas.
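How the degree of agreement in such an alignment can be quantified is sketched below in Python.
The function percent_identity is a hypothetical helper, and the convention it uses (identical
residues divided by the number of positions at which both rows carry a residue) is only one of
several in common use, so the number it returns depends on that assumption. The two strings are
the first alignment block of rows (b) and (c) from Fig. 20.5.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues among positions aligned in both rows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which either row has a gap
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    identical = sum(1 for a, b in aligned if a == b)
    return 100.0 * identical / len(aligned)

# First alignment block of rows (b) and (c) from Fig. 20.5
row_b = "-EGDAAAGEKVS-KKCLACHTFDQGGAN-----KVGPNLF"
row_c = "--GDVAKGKKTFVQKCAQCHTVENGGKH-----KVGPNLW"
print(f"{percent_identity(row_b, row_c):.1f} % identity in this block")
```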
Fig. 20.6 Superposition of the folded structures of the three cytochrome C proteins from Fig. 20.5
based on a ribbon model: Paracoccus denitrificans in blue, Rhodospirillum rubrum in red, and
tuna fish in yellow (left). The cytochromes bind via a histidine and a methionine to an iron–heme
center. The structures were determined by X-ray crystallography. Structural deviations occur
particularly in the loop regions. On the right side the same superposition is shown, only here the
individual amino acids are color-coded. The same colors in all three ribbon models show identical
amino acids at different positions (color coding: Ala: light gray, Val: chartreuse, Gly: white, Ile:
bright green, Leu: olive green, Pro: pink, Phe: violet, Tyr: dark purple, Trp: light violet, Asp: dark
red, Glu: wine red, Asn: turquoise, Gln: cyan, Lys: blue, His: light blue, Arg: medium blue, Ser:
light orange, Thr: dark orange, Cys: light yellow, Met: dark yellow).
and sequence, these loops are classified into conformational families. They can be
retrieved with the computer and support the construction of the spatial arrangement
of a modified loop. The validation of the relevance of these protein models follows
empirical rules. They check whether their constructed geometry is in agreement
with experimental evidence. For example, it must be ensured that hydrophobic
groups are oriented in the interior, and hydrophilic groups are oriented largely on
the exterior. The contacts between amino acid groups are checked, and the chosen
torsion angles are compared with the typically observed values.
If the sequence identity between the known and modeled protein falls below 30%,
the determination of structural homology becomes difficult. All additional infor-
mation must be employed as a resource. An attempt is made to estimate in which
sections of the polymer chain of the modeled protein particular secondary structure
elements are to be expected (▶ Sect. 14.2). If the frequency with which individual
a search for similarities on the recognition determinants of proteins. In this way possible
interactions between proteins can be discovered, or commonalities in metabolic
pathways become transparent.
Protein modeling gives good results above all when the proteins show high
homology. High homology is usually found in the regions that determine the folding
scaffold. The binding pockets, however, often lie in the loop regions (▶ Sect. 14.4),
and it is precisely there that even homologous proteins differ considerably. The
models therefore do not achieve the desired accuracy in these regions. An improvement
can be achieved if a ligand is already placed in the assumed binding region during
model construction. Model and ligand placement must then be optimized in an iterative
process by using appropriate energy functions. In this way, models for G protein-coupled
receptors have been constructed (▶ Sect. 29.2) that are sufficiently accurate for
successful virtual screening.
The next step after the analysis of the binding pocket of the either experimentally
determined or modeled protein is the actual ligand design. Here different
approaches to computer-aided design are available to suggest new protein ligands.
A docking program can be used to place preselected ligands from a database into
the binding pocket one after another (Fig. 20.7). Typically the database is
assembled from molecular candidates that largely resemble customary
drug-like molecules (▶ Sect. 7.6). Another approach starts with a “seed” in the
binding pocket. By starting from this point, the ligand grows stepwise in the binding
pocket. This principle is followed by most de novo design programs. The placement
of the first “seed” is critical. Such approaches are especially successful when there
is a particular hot spot in the binding pocket from which the further optimization
starts. Salt bridges to charged amino acids or the coordination of metal ion centers
are especially well suited for this approach. This concept has been successfully used
on, for example, the serine proteases trypsin and thrombin (▶ Sect. 23.4) and the
zinc-containing carbonic anhydrases (▶ Sect. 25.7). Another approach starts with
multiple small fragments that are placed in the binding pocket. Next an attempt is
made to link the fitted molecular fragments to one another with appropriate spacers.
This strategy has been applied successfully several times by using the SAR-by-NMR
method (▶ Sect. 7.8).
Docking tries to fit potential protein ligands into a binding pocket using the
computer. For this, a docking program takes one candidate after the other
from a precompiled library of molecules. A 3D structure is generated for
each entry. If a flexible molecule is encountered, either multiple conformations are
saved or they are generated on the fly during docking. In the next step, each
molecule is fitted into the binding pocket. First, the structures that cannot bind to the
protein are discarded. In addition, other structures are eliminated that cause obvious
problems, for example, due to electrostatic repulsions with the protein in the
assumed docking mode. Typically a docking program generates multiple solutions.
These are scored on the basis of the generated binding geometries, and their affinity
is estimated.
Fig. 20.7 Possible strategies for ligand design. The complete 3D structures of possible ligands are
fitted into the binding pocket during docking (left picture, insert). The construction of new
molecules is sketched in the middle and right part of the picture. In principle there are two
possibilities. A fragment can be placed as a seed, and step-by-step other groups can be attached
(middle). Alternatively, multiple small molecular fragments can be placed in the binding pocket
independently of one another and later linked with one another (right).
Irwin Kuntz is a pioneer in the field of docking programs; the program DOCK
was developed in his group at UCSF in San Francisco. In the original version of
1982, only the steric complementarity of ligand and protein was evaluated. For
this, the shape of the binding pocket was approximated by a set of spheres of
different sizes so that the pocket was completely filled. Next, a mathematical
method was used to place the test ligands on this distribution of spheres. The
complementarity, a measure of direct protein–ligand contacts, served as a scoring
function. Since the first version, DOCK has been developed much further. The
program now uses a force field for scoring and calculates the contributions of
desolvation. The ligands are also placed flexibly by considering rotatable bonds.
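A purely steric score of this kind can be illustrated with a few lines of Python. The sketch
below is not the DOCK implementation: it simply counts ligand–protein atom pairs at contact
distance and penalizes overlapping atoms. The function name contact_score, the distance
thresholds, the penalty, and all coordinates are assumptions chosen for illustration.

```python
import math

def contact_score(ligand, protein, contact=4.5, clash=2.5, clash_penalty=10.0):
    """Count close ligand-protein atom pairs and penalize overlaps.

    All three thresholds are illustrative assumptions.
    """
    score = 0.0
    for lig_atom in ligand:
        for prot_atom in protein:
            d = math.dist(lig_atom, prot_atom)
            if d < clash:
                score -= clash_penalty   # atoms overlap: steric clash
            elif d <= contact:
                score += 1.0             # favorable surface contact
    return score

# Hypothetical coordinates in angstroms
protein_atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
good_pose = [(2.0, 2.0, 2.5)]   # touches all three protein atoms
bad_pose = [(0.5, 0.0, 0.0)]    # sits on top of the first protein atom
```

With these numbers the first pose scores higher than the second, because every contact of the
first pose lies in the favorable range, whereas the second pose collides with a protein atom.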
A different docking prototype was developed at GMD in Bonn by Matthias
Rarey. FlexX was the first program that could quickly handle
ligand flexibility during docking. It disassembles the test ligands into individual
fragments and subsequently uses an algorithm that works very similarly to the
positioning in the program LUDI (Sect. 20.10). After placement of the first building
block, the ligand is successively reconstructed in the binding pocket. Different
conformers along the rotatable bonds are considered for this purpose. The program
maintains stored tables of preferred torsion angles, similar to those described in
▶ Sect. 16.6. The energetic evaluation of the placement is carried out at this step.
The program AutoDock from the groups of Art Olson at Scripps in La Jolla, San
Diego, uses a lattice-based algorithm for the placement. By using a force-field
function similar to that in the program GRID (▶ Sect. 17.10), potential values are
placed on a lattice that is embedded into the binding pocket. By starting from
a randomly chosen starting orientation, the ligand is shifted across the lattice until
an optimum is found. In doing so it “feels” the interaction potential with the protein.
Because the potential was already precalculated on the lattice, this evaluation runs
particularly fast. At the same time, twisting around rotatable bonds is performed.
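Why the precalculated lattice makes the evaluation fast can be seen in the following Python
sketch. It is a simplified illustration and not the AutoDock code: a single probe type with
a 12-6 Lennard-Jones term is assumed, the parameters eps and rmin are invented, and each ligand
atom simply takes the value of the nearest lattice point, whereas a real program would
interpolate between neighboring points.

```python
import math

def lj(r, eps=0.2, rmin=3.8):
    """12-6 Lennard-Jones energy; eps and rmin are illustrative values."""
    r = max(r, 1.0)  # cap very short distances to avoid huge numbers
    q = (rmin / r) ** 6
    return eps * (q * q - 2.0 * q)

def build_grid(protein, origin, n, spacing):
    """Precompute the probe energy at every point of an n x n x n lattice."""
    grid = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                pt = (origin[0] + i * spacing,
                      origin[1] + j * spacing,
                      origin[2] + k * spacing)
                grid[(i, j, k)] = sum(lj(math.dist(pt, a)) for a in protein)
    return grid

def score_pose(ligand, grid, origin, n, spacing):
    """Sum precomputed energies at the lattice point nearest to each atom."""
    total = 0.0
    for atom in ligand:
        idx = tuple(round((c - o) / spacing) for c, o in zip(atom, origin))
        if any(i < 0 or i >= n for i in idx):
            return float("inf")  # atom left the lattice: reject the pose
        total += grid[idx]
    return total

# Hypothetical one-atom "protein" and one-atom "ligand" poses
ORIGIN, N, SPACING = (-5.0, -5.0, -5.0), 21, 0.5
grid = build_grid([(0.0, 0.0, 0.0)], ORIGIN, N, SPACING)
near_optimum = score_pose([(4.0, 0.0, 0.0)], grid, ORIGIN, N, SPACING)
too_close = score_pose([(1.5, 0.0, 0.0)], grid, ORIGIN, N, SPACING)
```

The expensive sum over all protein atoms is carried out once per lattice point in build_grid.
Scoring a pose afterwards needs only one table lookup per ligand atom, regardless of the size
of the protein.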
The program GOLD, which was developed by Gareth Jones in the group of Peter
Willett in Sheffield, England, also uses a lattice for placement. Interaction poten-
tials are, however, parameterized on crystal data. GOLD uses a genetic algorithm to
optimize the geometry. In the meantime a plethora of docking programs has been
developed. All follow a slightly different strategy but are based on the concepts
described here. Some follow the idea that it is better to generate a well-distributed
number of rigid-ligand conformers and then to dock these quickly as rigid bodies.
Today there are three main problems that impose limits on docking. One is the
energetic evaluation of the generated geometries. This will be specially addressed
in the next section. Another is that water plays a decisive role in ligand binding
(Sect. 20.3). Even today no really convincing solution to the handling of water
during docking has been found. The third problem is the flexible adaptation of the
protein (▶ Sect. 15.8). Usually there are only small adaptations on the side of the
protein that slightly change the shape of the binding pocket. Nevertheless, they are
large enough to send the docking programs after proverbial red herrings.
All contacts that rarely occur are classified as unfavorable. A function derived
in this way can be used to rank a set of ligands for the same reference protein
relative to one another. If it is additionally trained on a data set of known
geometries and binding affinities, analogously to the regression-based scoring
functions, affinities can also be predicted. The evaluation with regression-based
or knowledge-based functions is very fast. In the meantime, a vast
number of scoring functions have been developed. None is ideal. In each case it
must be checked which function affords the best performance for the protein
under investigation.
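The conversion of contact frequencies into a score can be sketched in Python as follows. The
sketch assumes an inverse-Boltzmann-type relation, in which the score of a distance bin is the
negative logarithm of its observed frequency relative to a reference frequency. The function
name, the pseudocount, and all counts are hypothetical and serve only to illustrate the
principle.

```python
import math

def statistical_potential(observed, reference, pseudocount=1.0):
    """Inverse-Boltzmann-style score per distance bin, in units of kT.

    Bins seen more often than in the reference get negative (favorable)
    values; rarely seen bins get positive (unfavorable) values.
    """
    obs_total = sum(observed) + pseudocount * len(observed)
    ref_total = sum(reference) + pseudocount * len(reference)
    potential = []
    for obs, ref in zip(observed, reference):
        p_obs = (obs + pseudocount) / obs_total
        p_ref = (ref + pseudocount) / ref_total
        potential.append(-math.log(p_obs / p_ref))
    return potential

# Hypothetical contact counts for one atom-type pair in four distance bins
observed = [2, 40, 120, 38]
reference = [50, 50, 50, 50]
w = statistical_potential(observed, reference)
```

The pseudocount keeps the logarithm finite for bins in which no contact was ever observed.
Summing such bin values over all atom pairs of a docked geometry gives a score that can be used
for relative ranking.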
The first program for stepwise de novo design was GROW from Jeffrey Howe and
Joseph Moon at the company Upjohn. It concentrates on peptides as lead structures.
An amide group is positioned in a favorable orientation in the binding pocket.
Next, amino acids are added stepwise to the starting amide group. At each step,
a large variety of different conformations of all 20 proteinogenic amino acids is
attached to the seed on the fly. In each case, only the “best” solutions are followed
further. In this way, GROW constructs a peptide ligand of increasing length in the
binding pocket.
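The stepwise growth with pruning to the best candidates can be illustrated by a small Python
sketch. It is not the GROW algorithm itself: conformations are ignored, the seed is an empty
string instead of an amide group, and toy_score with its POCKET_PREFERENCE string is an invented
stand-in for an energy evaluation in the binding pocket. Only the search strategy is shown.

```python
def grow(seed, building_blocks, score, steps, keep=3):
    """Extend a seed stepwise, following only the `keep` best candidates."""
    beam = [seed]
    for _ in range(steps):
        # Attach every building block to every partial solution kept so far
        candidates = [partial + block for partial in beam
                      for block in building_blocks]
        candidates.sort(key=score)   # lower score = better fit
        beam = candidates[:keep]
    return beam

# Hypothetical stand-in for a binding-energy estimate: each position of the
# pocket "prefers" one residue type; every mismatch is penalized.
POCKET_PREFERENCE = "GFLK"

def toy_score(peptide):
    return sum(0.0 if res == POCKET_PREFERENCE[i] else 1.0
               for i, res in enumerate(peptide))

amino_acids = "ACDEFGHIKLMNPQRSTVWY"   # one-letter codes of the 20 amino acids
best = grow(seed="", building_blocks=amino_acids, score=toy_score, steps=4)
```

Because only a few candidates survive each step, the number of evaluated structures grows
linearly with the peptide length instead of exponentially. The price is that a solution whose
early steps score poorly can be lost.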
In the beginning of the 1990s, Hans-Joachim Böhm developed the program
LUDI at BASF. The underlying idea was to read small molecules or molecular
fragments from a database with precalculated spatial geometries, and to position
them in the binding pocket so that hydrogen bonds are formed with the protein, and
hydrophobic pockets are filled with non-polar groups. The program needs the
coordinates of the protein as well as a library of 3D structures of fragments or
drug-like molecules as input.
The precalculation of so-called interaction sites is decisive. These are placed in
the binding pocket around the amino acid groups in the form of fitting points or
directional vectors (Fig. 20.8). The program uses rules that are derived from the
non-bonded interactions found in the crystal packing of small organic molecules
(▶ Sects. 14.7 and ▶ 17.10). Then LUDI extracts small molecules or molecular
fragments from a 3D library. For each entry an attempt is made to position them
into the binding pocket of the protein so that as many of these interaction sites are
satisfied as possible (Fig. 20.8 and ▶ Fig. 17.12). Next, all of the successfully
placed fragments are ranked. The scoring function that is used for this considers
the number and quality of the formed H-bonds and ionic interactions, hydropho-
bic contact surfaces shared between the protein and ligand, as well as unfavorable
contributions arising from the number of rotatable bonds in the ligands.
An example of the successful application of this program is described in
▶ Sect. 21.5.
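The structure of such an additive scoring function can be sketched in Python. The terms follow
the contributions named above, but the weights are illustrative assumptions and not the
published LUDI parameters, the geometric quality of the hydrogen bonds is left out, and the
fragment names and values are invented.

```python
from dataclasses import dataclass

@dataclass
class Placement:
    name: str
    n_hbonds: int            # neutral hydrogen bonds formed
    n_ionic: int             # charge-assisted (ionic) interactions
    lipophilic_area: float   # buried non-polar contact surface, in A^2
    n_rotatable: int         # rotatable bonds frozen on binding

# Illustrative weights in kJ/mol (assumed, not the published parameters)
W_CONST, W_HB, W_IONIC, W_LIPO, W_ROT = 5.0, -4.0, -8.0, -0.2, 1.5

def score(p):
    """Additive estimate; more negative means better predicted binding."""
    return (W_CONST + W_HB * p.n_hbonds + W_IONIC * p.n_ionic
            + W_LIPO * p.lipophilic_area + W_ROT * p.n_rotatable)

# Hypothetical fragment placements
fragments = [
    Placement("flexible chain", n_hbonds=1, n_ionic=0,
              lipophilic_area=40.0, n_rotatable=5),
    Placement("benzamidine-like", n_hbonds=2, n_ionic=1,
              lipophilic_area=60.0, n_rotatable=1),
]
ranked = sorted(fragments, key=score)   # best placement first
```

The positive weight on rotatable bonds expresses the unfavorable contribution mentioned in the
text: a flexible fragment must form more or better contacts than a rigid one to reach the same
score.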
As the first prototype, LUDI set the standard for many de novo design
programs that were developed later. In these approaches, improved scoring
functions were implemented, and the fragment libraries were enhanced. The
programs were also taught synthesis rules, so that the chemical accessibility of
the generated molecules was not completely neglected. The search space of the
programs was also enlarged so that multiple conformations and configurations
could be screened.
A de novo design program is an idea generator. Its value is, of course,
determined by the concepts that went into its development. On the other hand, its
value also depends strongly on the user and on how the suggestions of such a program
are interpreted and used for further design.
Certainly many examples proving the scope of de novo design, virtual screening,
and docking have been provided. The example described in ▶ Chap. 21, “A Case
Study: Structure-Based Inhibitor Design for tRNA-Guanine Transglycosylase” was
successful because of the massive deployment of such methods. It is decisive,
however, that the computer methods are tightly embedded in an iterative process
with synthesis and experimental structure determination. It must be kept in mind
that not all hits from computer screening are found on the basis of correct
assumptions or picked up for the right reasons.
The predictive power of the available methods is still limited. The synthetic
accessibility of a suggested molecule is not sufficiently considered, the flexibility of
the protein is neglected, and the methods to estimate the binding affinity are still too
inaccurate. This is because the processes and factors responsible for molecular
recognition and ligand binding are still too poorly understood. The correct
description of solvation effects and the incorporation of water molecules in the
binding process represent large problems. The contribution of a hydrogen bond to
binding affinity can, despite all efforts, still only be estimated. As for
lipophilic interactions, it can at least be assumed that the filling of an unoccupied
lipophilic pocket with additional non-polar substituents is in most cases accompa-
nied by an increase in binding affinity.
How the changes in binding entropy that contribute to the free energy of ligand
binding can be taken into account is largely unclear. At least, evidence is
accumulating that the oversimplified assumption of constant entropic contributions
within a set of congeneric ligands does not hold.
There are still further fundamental limitations of this approach. The most
important is certainly that the technique is limited to the optimization of direct
interactions with the protein. Successful binding to a target protein is of central
importance for any active substance. However, to be suitable as a drug, additional
prerequisites must be fulfilled. Among these are good selectivity, metabolic stabil-
ity, adequate duration of action, low addictive potential, and negligible toxicity.
Today, at the very least the selectivity of a compound to members of a structurally
related protein family can be estimated with some certainty.
Fully automated molecular design on a computer is indeed not possible, even in
the long term. The methods of structure-based design are of value as idea genera-
tors. The obtained proposals must be checked and, when necessary, modified. Time
will tell whether the methods will gradually approximate the “holy grail” of drug
design: the design of drug molecules from scratch.
20.12 Synopsis
This requires a 3D structure of the reference protein. The goal is to optimally fill
the binding pocket by satisfying non-bonded interactions with the functional
groups of the binding-site residues.
• Structure-based design starts with a detailed analysis of the binding pocket to
elucidate hot spots for putative interactions with the protein. Either experimental
methods or computational tools can be used to perform an active site mapping
with molecular probes or small solvent-like molecules.
• In an iterative process of structure determination, modeling of modified ligands,
docking and screening, synthesis, and biological testing, the properties of small-
molecule ligands are improved to optimize binding to the target protein.
• Databases have been developed to retrieve and compare structural information
about the exponentially growing body of structural data on protein–ligand
complexes. They allow comparison of binding poses, active-site interaction
geometries, protein–ligand binding motifs, and the original solvation structures
in the protein’s binding pocket.
• Proteins can be compared in terms of their exposed binding pockets. The shape
and the exposure of groups which have particular physicochemical properties in
binding pockets are compared and help to design small-molecule ligands with
the desired selectivity. Ideas for isosteric replacements on the ligand scaffold can
also be generated in this way.
• If an experimentally determined structure of the target protein is unavailable,
a homology model can be constructed by using a related protein of known archi-
tecture as a template. The accuracy and the success of such homology modeling
depend strongly on the sequence homology with the template structure.
• Tools for secondary-structure prediction and amino acid replacement propensi-
ties have been developed to improve the reliability of the sequence assignment
of proteins to the 3D structure of the reference template.
• An alternative approach starts with a small molecule seed fragment and grows
putative ligands from this starting point into the binding pocket. Two non-
overlapping fragments can also be linked and turned into a larger ligand with
improved binding properties.
• The geometry of a constructed protein–ligand complex must be evaluated in
terms of the expected binding affinity in all structure-based design strategies.
A large variety of scoring functions are used to predict the binding affinity based
on the geometry of the formed complex.
Bibliography
General Literature
Beddell CR (ed) (1992) The design of drugs to macromolecular targets. Wiley, Chichester
Böhm HJ, Schneider G (2006) Molecular recognition in protein–ligand interactions. In: Mannhold
R, Kubinyi H, Folkers G (eds) Methods and principles in medicinal chemistry, vol 19.
Wiley-VCH, Weinheim
Böhm HJ (1993) Ligand design. In: Kubinyi H (ed) 3D QSAR in drug design. Leiden, Escom,
pp 386–405
Borman S (1992) New 3-D search and de novo design techniques aid drug development. Chem
Eng News 10:18–26
Branden C, Tooze J (1999) Introduction to protein structure, 2nd edn. Garland, New York
Goodford P (1984) Drug design by the method of receptor fit. J Med Chem 27:557–564
Greer J, Erickson JW, Baldwin JJ, Varney MD (1994) Application of the three-dimensional
structures of protein target molecules in structure-based drug design. J Med Chem
37:1035–1054
Hubbard TJP, Lesk AM (1995) Modelling protein structures. In: Goodfellow JM (ed) Computer
modelling in molecular biology. Weinheim, VCH
Hutchins C, Greer J (1991) Comparative modeling of proteins in the design of novel renin
inhibitors. Crit Rev Biochem Molec Biol 26:77–127
Kuntz ID, Meng EC, Shoichet BK (1994) Structure-based molecular design. Acc Chem Res
27:117–123
Kuntz ID (1992) Structure-based strategies for drug design and discovery. Science 257:1078–1082
Martin YC (1992) 3D database searching in drug design. J Med Chem 35:2145–2154
Müller K (1995) De novo design. In: Anderson PS, Kenyon GL, Marshall GR (eds) Persp drug
discovery and design, vol 3. Escom, Leiden
Schneider G, Baringhaus KH (2008) Molecular design. Wiley-VCH, Weinheim
Special Literature
Böhm HJ (1992a) LUDI: rule-based automatic design of new substituents for enzyme inhibitor
leads. J Comp-Aided Molec Des 6:593–606
Böhm HJ (1992b) The computer program Ludi: a new method for the de novo design of enzyme
inhibitors. J Comp-Aided Molec Des 6:61–78
Henderson R, Baldwin JM, Ceska TA, Zemlin F, Beckmann E, Downing KH (1990) Model of the
structure of bacteriorhodopsin based on high-resolution electron cryo-microscopy. J Mol Biol
213:899–929
Hibert M, Trumpp-Kallmeyer S, Hoflack J, Bruinvels A (1993) This is not a G-protein-coupled
receptor. Trends Pharm Sci 14:7–12
Hoflack J, Trumpp-Kallmeyer S, Hibert M (1994) Re-evaluation of bacteriorhodopsin as a model
for G-protein-coupled receptors. Trends Pharm Sci 15:7–9
Overington J, Johnson MS, Sali A, Blundell TL (1990) Tertiary structural constraints on protein
evolutionary diversity: templates, key residues and structure prediction. Proc Roy Soc Lond B
241:132–145
Ring CS et al (1993) Structure-based inhibitor design by using protein models for the development
of antiparasitic agents. Proc Natl Acad Sci 90:3583–3587
Sali A, Blundell TL (1990) Definition of general topological equivalence in protein structures.
J Mol Biol 212:403–428
Schertler GFX, Villa C, Henderson R (1993) Projection structure of Rhodopsin. Nature
362:770–772
Travis J (1993) Proteins and organic solvents make an eye-opening mix. Science 262:1374
21 A Case Study: Structure-Based Inhibitor Design for tRNA-Guanine Transglycosylase
Shigella attack the epithelial cells in the intestines. To gain entrance to these cells, the
bacteria produce their own virulence factors, so-called invasins. These are proteins
that form a sophisticated apparatus with the proteins on the epithelial cells, which
allows the penetration and proliferation of the bacteria in the infected cells. The genes
for the virulence factors are located on a plasmid. Their expression in cases of infection is
regulated by different transcription factors. The factor VirF is particularly respon-
sible for the pathogenesis of the bacteria, and altered tRNA molecules are needed so
that it can be efficiently synthesized in the ribosome. tRNA is a ribonucleic acid that
is made of about 80 nucleotides (Fig. 32.15, ▶ Sect. 32.6). It is loaded with an amino
acid at the end that corresponds to the base-pair triplet in the middle loop, the so-
called anticodon loop. The genetic information encoded in the base-triplet of the
mRNA is transferred when the mRNA binds to the corresponding tRNA in
the ribosome during translation. This tRNA carries the right amino acid so that the
growing peptide chain of the nascent protein is correctly constructed. The changes in
the required tRNA affect the base in position 34 of the so-called wobble region.
A modified base must be incorporated at this site. If these changes do not occur,
the translation remains inefficient. Shigella could then barely produce enough of the
needed invasins to infect the epithelial cells. Their pathogenic potential is therefore
severely reduced.
The bacteria have enzymes that can carry out these changes in the tRNA.
In the first step, a guanine 21.1 is cut out of the tRNA molecule at position 34
and replaced with an altered base, preQ1 21.2 (Fig. 21.1). This step is catalyzed by
tRNA-guanine transglycosylase (TGT). The exchanged base in the tRNA is
further modified in the next step of the enzyme cascade so that the base queuine
is obtained as the final product. TGT inhibitors therefore represent a specific
therapeutic principle to selectively attack the pathogenicity of Shigella.
In contrast to a therapy with broad-spectrum antibiotics, the bacteria are not
killed but rather the disease-causing infection of the epithelial cells is prevented.
21.3 The Crystal Structure of tRNA-Guanine Transglycosylase as a Starting Point
[Fig. 21.1 graphic: (a) reaction scheme with guanine 21.1, preQ1 21.2, TGT, QueA, SAM, vitamin B12, tRNA-preQ1, tRNA-oQueuine, and tRNA-queuine; (b) anticodon loop with the exchange of G for Q at the wobble position.]
Fig. 21.1 The enzyme tRNA-guanine transglycosylase (TGT) catalyzes the exchange of guanine
21.1 for preQ1 21.2 in tRNA (a). The base incorporated in the tRNA is then further modified to
queuine by other enzymes. The exchange of the base takes place in
the wobble position of the anticodon loop of the tRNA (b).
First, the crystal structure of TGT in complex with preQ1 was
determined in a related species. It shows an exchange of a Phe for a Tyr in the
active site, which is immaterial for substrate or ligand binding. Later, the structure
in complex with a part of the tRNA was elucidated (Fig. 21.2). According to these
structures, the base exchange occurs along the following reaction pathway
(Fig. 21.3). Initially the tRNA binds with its covalently attached guanine. The base with
its ribose moiety is pulled out of the tRNA molecule and is specifically recognized by
Asp102, Asp156, Gln203, Gly230, and Leu231. The reaction starts with a nucleophilic
attack at carbon C1 of the ribose ring. The C1–N bond is cleaved, and guanine is
released. The base leaves the binding pocket with a water molecule, and preQ1 is taken
up into the same binding site. For this, the peptide bond between Leu231 and Ala232
must flip over. The basic nitrogen atom of preQ1 then releases a proton and carries out
a nucleophilic attack on the ribose, which is covalently bound to Asp280. Once the new
bond to the tRNA is formed, the altered tRNA leaves the enzyme. Asp102 is critically
involved in the recognition process of the bound base. Furthermore this amino acid
probably provides the proton that is required for the mechanism, or picks up a proton in
the other step again.
[Fig. 21.3 graphic: panels a to d of the reaction mechanism in the binding pocket, showing Asp102, Asp156, Gln203, Gly230, Leu231, Ala232, Glu235, Asp280, the water molecules W1 and W2, guanine, preQ1, and the tRNA ribose 34.]
Fig. 21.3 Mechanism of the base-exchange reaction catalyzed by TGT. The tRNA with guanine 34
binds, and a water molecule makes contact with the nitrogen atom in the 3-position. Asp280
nucleophilically attacks the C1 carbon atom of the ribose ring (a). The C1–N bond is broken and
guanine is released (b). It leaves the binding pocket together with a water molecule. PreQ1 is taken
into the same binding site where the peptide bond between Leu231 and Ala232 is flipped over (c).
After deprotonation, the basic nitrogen atom of preQ1 carries out a nucleophilic attack on the
ribose, which is covalently bound to Asp280, and a new bond to tRNA is formed. The altered
tRNA leaves the enzyme (d).
A potential inhibitor can also compete with this uptake into the binding site, but
must not be much larger than guanine or preQ1. In this way small inhibitors display
a different inhibition profile than structurally larger inhibitors.
Radioactively labeled guanine is used to measure the inhibition. If this guanine is
added to the tRNA, the TGT catalyzes its incorporation, and the tRNA molecule
becomes radioactively labeled. If the tRNA is separated at fixed intervals, and the
incorporated radioactivity is measured, the reaction kinetics of the incorporation
process and therefore the catalytic rate of the enzyme can be followed. If potential
inhibitors are added, fewer TGT molecules are available for the transformation, and
the incorporation rate is reduced. This can be seen in the observed enzyme kinetics.
The inhibition constants can be determined by detailed analysis of the kinetics.
[Fig. 21.4 graphic: scheme of TGT with tRNA, guanine, and preQ1, indicating a tRNA-competitive inhibitor and a base-competitive inhibitor.]
Fig. 21.4 The base-exchange reaction takes place in two steps. Inhibitors can compete with the
binding of the complete tRNA (left, dark gray) as well as the exchange of the small nucleobase
(middle, light gray).
Whether the inhibitors interfere competitively with the binding of the entire tRNA, or
whether they compete with the exchange of the small base can also be differentiated.
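The logic of this assay can be illustrated with a small numerical sketch. It assumes a simplified single-substrate competitive model, v = Vmax·[S] / (Km·(1 + [I]/Ki) + [S]), which ignores the two-step nature of the TGT reaction. The function names and all numbers are hypothetical; they only show how an initial rate is obtained from a time course and how Ki follows from the rates measured with and without inhibitor.

```python
def initial_rate(times, counts):
    """Least-squares slope of incorporated radioactivity versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    num = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def competitive_rate(vmax, km, s, i, ki):
    """Rate under simple competitive inhibition (assumed model, not the full TGT mechanism)."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def ki_from_rates(v0, vi, s, km, i):
    """Solve the competitive rate law for Ki from the rates without (v0) and with (vi) inhibitor."""
    return i / ((v0 / vi - 1.0) * (1.0 + s / km))


# Hypothetical example values in arbitrary but consistent units.
VMAX, KM, S, I, KI_TRUE = 1.0, 2.0, 4.0, 10.0, 5.0
times = [0.0, 1.0, 2.0, 3.0]

# Simulated linear time courses of label incorporation (initial phase only).
uninhibited = [competitive_rate(VMAX, KM, S, 0.0, KI_TRUE) * t for t in times]
inhibited = [competitive_rate(VMAX, KM, S, I, KI_TRUE) * t for t in times]

v0 = initial_rate(times, uninhibited)
vi = initial_rate(times, inhibited)
ki_estimate = ki_from_rates(v0, vi, S, KM, I)
print(f"v0 = {v0:.3f}, vi = {vi:.3f}, estimated Ki = {ki_estimate:.3f}")
```

In practice several inhibitor and substrate concentrations would be measured and fitted together; varying the tRNA or the base concentration separately is what would allow the two modes of competition to be distinguished.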
In the beginning of the project, only the structure of the binary complex of TGT
with preQ1 was known. The two-step inhibition mechanism explained in the last
section was also unknown at the time. During the course of the project Bernhard
Stengl managed to clarify the details of this process. Ulrich Grädler used the binary
TGT•preQ1 structure as a reference and initiated a search for potential inhibitors
with LUDI (▶ Sect. 20.10). He was able to find hits in a chemical catalog. The
compounds listed in Fig. 21.5 were proposed. Among them, 21.3 proved to be
a micromolar inhibitor. A crystal structure could be determined with this hit
(Fig. 21.6). There was great delight when 4-aminophthalic acid hydrazide 21.3
was shown to bind to the enzyme exactly as LUDI had predicted.
Next, LUDI was consulted to predict further groups for the inhibitor that would
fill in the as-yet unoccupied areas in the binding pocket. On the one hand, an
expansion of the ring system by an additional aromatic ring was proposed. On the
other hand, the placement of a nitrogen-containing heterocycle at the unoccupied
interaction site near Asp102 and Asp280 was considered. Hans-Dieter Gerber
synthesized derivatives 21.4–21.6 (Fig. 21.7). Compounds 21.4 and 21.5 achieved
10-times-better inhibition of the enzyme in the assay than 21.3. The results were
quite different with the heterocyclic derivative 21.6. It was significantly worse than
the initial lead structure. Ulrich Grädler was able to solve the crystal structures with
these inhibitors, which exhibited the expected binding mode. It was shown in the
structure with 21.6 that the heterocycle falls very near the terminal amide group of
Asn79. It was then obvious that 21.6 needed an additional amino group to build an
21.5 LUDI Discovers the First Leads
[Fig. 21.5 graphic: chemical structures of the compounds proposed by LUDI.]
[Fig. 21.6 graphic: binding pocket of TGT with bound 21.3; labeled residues Leu231, Asp156, and Asp102.]
Fig. 21.6 Crystal structure of TGT with 21.3, the first hit from LUDI. The agreement between the
predicted binding mode (above right) and the experimentally determined one is almost perfect. LUDI indicated additional
interaction centers in the lower part of the binding pocket that had not yet been used.
additional contact with the protein. This synthesis was accomplished, and the
crystal structure with 21.7 in fact did show the expected binding mode with the
additional H-bond. However, even this derivative was less potent than the original
lead structure 21.3. A more detailed analysis of the structural data showed that the
appended heterocycle in 21.6 and 21.7 is disordered, and a hydrogen bond between
the exocyclic amino group and the carbonyl group of Leu231 is very long. The
heterocycle was incorporated based on the idea that it would be beneficial to have
a charged group that can also form hydrogen bonds to the two neighboring aspartate
[Fig. 21.7 graphic: chemical structures of 21.4, 21.5, 21.6, and 21.7; the position of the heterocycle relative to Asp102 and Asp280 is indicated.]
Fig. 21.7 By starting from 21.3, inhibitors 21.4 and 21.5 were developed, which showed an
improved inhibition by a factor of 10 and could better fill the unoccupied area in the binding pocket
that was indicated in Fig. 21.6. Heterocycles 21.6 and 21.7 were introduced to the scaffold to
exploit the additional unoccupied interaction sites (right). Both derivatives showed diminished
binding affinities, probably because of repulsive interactions with the two neighboring Asp102 and
Asp280 residues. The heterocycles do not display the desired partial-positive charge.
groups. These two groups were assumed to adopt a deprotonated state. Then a positive
charge on the triazole group would be ideal for an interaction. But in what protonation
state are these groups? A pKa measurement was carried out on a related model
compound. In addition, a small-molecule crystal structure was determined on
a crystal grown under the same buffer conditions under which the protein complex was crystallized.
Both experiments showed that the heterocycle exists without a charge, that is,
both of the neighboring nitrogen atoms are deprotonated. Although it is not obligatory
that the same protonation state is found in the protein’s binding pocket, this model
appears to be plausible to explain the decreasing binding affinity of 21.6 and 21.7: An
uncharged triazole ring between the two negatively charged aspartate groups must
experience a repulsive interaction with at least one of the two acidic groups. This could
reconcile the decreasing binding affinity, the observed disorder, and the elongated
H-bond to the carbonyl group of Leu231.
[Fig. 21.8 graphic: panels a and b of the binding pocket with Gly230, Leu231, Gln203, and Asp156; chemical structures of 21.5, 21.8, and 21.9.]
Fig. 21.8 Analogue 21.8 should also emulate the interaction pattern of the original lead structure
(a). If this derivative is placed in the binding pocket (purple), the distance between the polar
nitrogen atoms in the central pyridazinone ring and the carbonyl group on Leu231 seems to be too
large for an H-bond. Nonetheless, 21.8 binds to the protein with micromolar affinity. (b) The
crystal structure that was determined with the very similar inhibitor 21.9 (orange) shows two
surprises: The peptide bond rotates its orientation and now directs its NH group toward the binding
pocket, and a water molecule (red sphere) mediates the interaction with the ligand!
21.7 Hot Spot Analysis and Virtual Screening Open the Floodgate
to New Ideas for Synthesis
How can a virtue be made of the necessity of multiple binding modes? Ruth
Brenk used the protein conformers in the structure with 21.3 as well as the geometry
in the complex with 21.9 to carry out a hot-spot analysis (▶ Sect. 17.10). The result
[Fig. 21.9 graphic: panels a, b, and c of the binding pocket near Gly230 and Leu231.]
Fig. 21.9 Hot-spot analysis shows preferred binding areas for a hydrogen-bond donor
(a), acceptor (b), and a hydrophobic group (c). In addition, it was shown that the polar groups
of 21.9 (cf. Fig. 21.8) fall into the preferred binding area. In the bottom left corner of the binding
pocket (near the binding site of the ribose moiety Fig. 21.2, blue) other binding areas are indicated
that were addressed in subsequent design steps.
of this analysis is shown in Fig. 21.9. A virtual screening was performed with the
generated pharmacophore and this produced a plethora of alternative molecular
scaffolds (Fig. 21.10) to occupy the guanine-binding site (Fig. 21.2). Many of the
hits that were discovered in this way proved to be micromolar inhibitors. They
afforded many new ideas for synthetic entry points to develop new inhibitors. Three
scaffolds were chosen for the following work. They are derived from a
pyridazinone (trione, 21.10), pteridine (21.11), and 6-aminoquinazolinone (21.12;
Fig. 21.11). It is conceivable that the last scaffold can be put together by combining
the right half of the natural substrate guanine 21.1 and the left half of the first hit
from LUDI, 21.3.
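The idea of filtering a compound collection against a pharmacophore derived from hot spots can be sketched as follows. This is a deliberately minimal illustration, not the procedure used in the project: it assumes that each candidate has already been placed in the binding pocket, and it only checks whether every pharmacophore point is hit by a feature of the same type within a tolerance radius. The pharmacophore coordinates, tolerances, candidate names, and feature positions are all hypothetical.

```python
import math

# Hypothetical pharmacophore: (feature type, center in Å, tolerance radius in Å).
PHARMACOPHORE = [
    ("donor", (0.0, 0.0, 0.0), 1.0),
    ("acceptor", (3.0, 0.0, 0.0), 1.0),
    ("hydrophobic", (0.0, 4.0, 0.0), 1.5),
]

# Hypothetical candidates: each pose is a list of (feature type, position in Å).
LIBRARY = {
    "candidate_A": [
        ("donor", (0.3, 0.2, 0.0)),
        ("acceptor", (2.8, -0.4, 0.1)),
        ("hydrophobic", (0.5, 4.6, 0.0)),
    ],
    "candidate_B": [
        ("donor", (0.1, 0.0, 0.0)),
        ("acceptor", (3.1, 0.2, 0.0)),
        # No hydrophobic feature near the third pharmacophore point.
        ("hydrophobic", (6.0, 6.0, 0.0)),
    ],
}


def matches(pose_features, pharmacophore=PHARMACOPHORE):
    """True if every pharmacophore point is hit by a pose feature of the same type."""
    for kind, center, radius in pharmacophore:
        hit = any(
            feature_kind == kind and math.dist(position, center) <= radius
            for feature_kind, position in pose_features
        )
        if not hit:
            return False
    return True


def screen(library):
    """Names of all candidates whose pose satisfies the complete pharmacophore."""
    return [name for name, features in library.items() if matches(features)]


hits = screen(LIBRARY)
print(hits)
```

Real virtual screening programs additionally handle conformational flexibility, alignment, and scoring; the sketch only conveys why very different scaffolds can pass the same filter as long as they present the required features at the required positions.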
Let us turn to the distribution of the hot spots in the binding pocket. The new lead
structures all interact at sites in the ‘upper part’ of the binding pocket. However, an
additional favorable binding area, suited to groups with donor properties as well
as to hydrophobic moieties, is indicated in the ‘lower left part’ of the binding site next to
the two aspartic acid residues Asp102 and Asp280. These binding sites had not been
exploited in the previous design. Considering the binding mode of the bound tRNA
(Fig. 21.2), the ribose sugar moiety in position 34 is accommodated in this region.
The hot-spot analysis suggests occupancy with a hydrophobic molecular fragment.
A favorable place for an H-bond donor lies a little bit further above. This region
corresponds to the binding site between the two aspartic acids, in which placement
of the two heterocyclic derivatives 21.6 and 21.7 was already attempted. Another
favorable area for an acceptor group is indicated at the rim of this pocket. The 2′- and
3′-hydroxyl groups of the tRNA ribose moiety are placed in this area.
A golden rule in drug design is that the occupancy of an empty hydrophobic pocket
with a lipophilic group leads to an increase in affinity (▶ Sects. 4.9 and ▶ 20.11).
21.8 The Filling of Hydrophobic Pockets and Interference with a Water Network
[Fig. 21.10 graphic: chemical structures of the alternative molecular scaffolds proposed by the virtual screening.]
Fig. 21.10 Proposals from a virtual screening of which a few examples were experimentally
tested and proved to be micromolar inhibitors.
[Fig. 21.11 graphic: chemical structures of the scaffolds 21.10, 21.11, and 21.12 with their variable R groups.]
Fig. 21.11 The pyridazinone (trione 21.10), pteridine (21.11), and 6-aminoquinazolinone (21.12)
scaffolds were chosen as possible lead structures for further synthesis and optimization. By adding
appropriate R groups, the synthesis of numerous derivatives was achieved.
[Fig. 21.12 graphic: 6-aminoquinazolinone scaffold 21.12 (Ki = 5.6 µM) and lin-benzoguanine scaffold 21.13 (Ki = 4.1 µM), each shown with the R substituents that were attached.]
Fig. 21.12 The displayed derivatives were synthesized by adding different R substituents to the
6-aminoquinazolinone scaffold 21.12, and subsequently tested. Surprisingly, even the best compounds
from this series remained in the single-digit micromolar region. lin-Benzoguanine 21.13
served as an alternative inhibitor scaffold and hydrophobic groups were attached at the 4-position.
Despite the good inhibition by the basic scaffold, the substituted derivatives could not achieve
a significant improvement in binding affinity.
Accordingly, inhibitors with such side chains were designed and led to the
derivatives displayed in Fig. 21.12. Disappointingly, these showed only a modest
improvement. In addition to the pteridines and aminoquinazolinones developed in
Marburg, the lin-benzoguanines 21.13 were advanced by the synthesis program in
Zurich. Thanks to the contributions from Emanuel Meyer and Simone Hörtner at the
ETH in Zurich, an entire series of inhibitors was readily prepared for detailed
crystallography studies, which led to the establishment of a structure–activity rela-
tionship. Interestingly, none of the derivatives shown in Fig. 21.12 led to a break-
through improvement in the affinity. As planned, they occupy the small hydrophobic
pocket between Val45, Leu68, and Asn70. A comparison of the individual crystal
structures of these inhibitors and also of the natural substrate tRNA showed that the
amino acid groups of the protein underwent massive induced-fit adaptations upon
binding (Fig. 21.13). These adaptations are very similar to those induced upon binding
to tRNA. Therefore it seemed unlikely that they were energetically very demanding.
The enzyme would otherwise have trouble binding its substrate adequately. The
straightforward explanation of a possibly too-high energetic cost for this adaptation
was thus excluded as a reason for the lack of improvement in affinity. Bernhard Stengl
and Tina Ritschel re-examined the individual derivatives precisely. It was surprising
[Fig. 21.13 graphic: binding pocket with inhibitor 21.14; labeled residues Asp156, Asp102, Val45, and Asn70.]
Fig. 21.13 A comparison of the crystal structures of the uncomplexed protein (gray) and the complex with 21.14
(brown) shows that the amino acid residues undergo a massive ligand-induced adaptation
upon inhibitor binding, similar to that induced by the natural tRNA substrate. This leads to the
opening of a small hydrophobic pocket that is enclosed by Val45, Leu68, and Asn70.
that the small parent scaffold already achieved single-digit micromolar binding.
The addition of small substituents that orient in the direction of the hydrophobic
pocket led only to a loss in the binding affinity. Only the occupancy of the small
hydrophobic pocket with an attached aromatic substituent could compensate for this
initial loss of affinity and recover the single-digit micromolar binding. A comparison
of the arrangement of the water molecules in different inhibitor structures was very
informative. In the unsubstituted parent structure, multiple water molecules form
a network between the two putatively charged aspartate residues, Asp102 and
Asp280. This network represents a decisive contribution to the solvation of the two
polar acid groups. All of the derivatives listed in Fig. 21.12 create a hydrophobic linker
to cross the area of the water network and to place their hydrophobic substituents in the
small hydrophobic pocket. In doing so, they necessarily destroy the water network.
This has its price!
[Fig. 21.14 graphic: chemical structures of the quinazolinone derivatives 21.15 and 21.16.]
Fig. 21.14 The quinazolinone derivative with the 7-dimethylamino group, 21.15, loses a factor of
10 in its binding affinity compared to the unsubstituted derivative. If one methyl group is replaced
with a benzyl group (i.e., 21.16) an increase in potency is obtained. The crystal structure with this
derivative shows that the benzyl group is not oriented in the direction of the small hydrophobic
pocket, but rather is in the uracil33 pocket.
An affinity comparison between the compounds 21.15 and 21.16 (Fig. 21.14)
was eye-catching. The derivative with a 7-dimethylamino group on the
quinazolinone scaffold 21.15 lost binding affinity compared to the unsubstituted
derivatives by a factor of more than 10. If one of the methyl groups is replaced with
a benzyl group (21.16) the lost affinity is partially recovered. The crystal structure
of this derivative shows that the benzyl group is not oriented in the direction of the
small hydrophobic pocket, but is rather placed in the direction of a pocket that is
occupied by uracil33 in the natural substrate (Fig. 21.2). With this result, a new
concept for further design was obvious. Under no circumstances should the water
network between Asp102 and Asp280 be traversed by a hydrophobic bridge. On the
other hand, a hydrophobic group that orients in the direction of the uracil33 pocket
should be added to the scaffold of the ligand.
carbonyl group of Leu231. The introduction of the amino group in the 2-position,
however, changes the imidazole system into a guanidinium-like group. Such a change
significantly increases the basicity of the scaffold. pKa measurements confirmed
this jump of more than one pKa unit. Calculations to simulate pKa shifts upon
complex formation (▶ Sect. 15.4) indicated an additional shift toward more basic values.
The compounds should therefore bind to the protein in their protonated form. As
a consequence, they carry a positive charge on the substituted imidazole moiety. As
a result, the hydrogen bond to Leu231, which is further polarized by the adjacent
Glu235, is converted into a salt bridge. It therefore contributes an important part to the
binding affinity.
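How strongly a shift of one pKa unit changes the charged fraction of a basic group can be estimated with the Henderson-Hasselbalch relation. The sketch below is only an illustration: the pH and the pKa values are hypothetical and are not the measured values of the compounds discussed here.

```python
def fraction_protonated(pka, ph):
    """Fraction of a basic group present in its protonated (charged) form.

    Follows from the Henderson-Hasselbalch relation:
    [BH+] / ([B] + [BH+]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PH = 7.0  # assumed pH, for illustration only

# Hypothetical pKa values: each step of one unit changes the ratio [BH+]/[B] tenfold.
for pka in (6.0, 7.0, 8.0):
    share = fraction_protonated(pka, PH)
    print(f"pKa = {pka:.1f}: {100.0 * share:.1f}% protonated at pH {PH:.1f}")
```

A group whose pKa lies one unit below the pH is protonated to only about 9%, whereas one unit above the pH it is protonated to about 91%. A further shift of the pKa upon complex formation, as suggested by the calculations, moves this balance even more toward the charged form.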
It has already been demonstrated that filling the uracil33-binding pocket is
associated with an improvement in the affinity. Therefore groups were introduced
onto the 2-amino group. However, a methylene group was used as a bridge to keep
the amino group unconjugated to the added aromatic substituents. Of the synthe-
sized derivatives, morpholine derivative 21.20 proved to be the strongest binder. It
also has the best water solubility. Interestingly the added side chains in this area are
not clearly visible in the electron density. They are probably in a disordered state in
the binding pocket (Fig. 21.17). This speaks against a good enthalpic interaction for
the groups in this area, but this should be compensated by a favorable entropic contribution, so
that overall a good contribution to the free energy is achieved, and the binding
affinity is improved. This situation is explained in an example in ▶ Sect. 4.10.
The question was already addressed as to whether an appropriate side chain on
the ligand can span the region of the water network between Asp102 and Asp280
once attached to the lin-benzoguanine molecular scaffold. We first synthesized the
derivatives with an ethylene hydroxyl 21.25 and an ethylene amino substituent
21.26 at the 4-position (Fig. 21.17). The crystal structures of both derivatives are
identical, and the terminal hydroxyl or amino groups actively participate in the
[Fig. 21.16 graphic: lin-benzoguanine scaffold with substituent R in the 2-position. Ki values: 21.13 (R = H) 4100 nM; 21.17 (R = CH3) 1500 nM; 21.18 (R = NH2) 77 nM; 21.19 (R = NHCH3) 58 nM; four further N-substituted 2-amino derivatives with Ki = 55, 35, 70, and 35 nM; morpholine derivative 21.20 6 nM. Lower row: 21.23 Ki = 3.8 µM; 21.24 Ki = 4.0 µM.]
Fig. 21.16 Substitution of the lin-benzoguanine scaffold 21.13 in the 2-position leads
to a significant improvement in the binding affinity. Above all, the introduction of a 2-amino
group (i.e., 21.18) alters the basicity of the derivatives so much that they bind in a positively
charged form. Because of this, a charge-assisted hydrogen bond is formed with the carbonyl
group of Leu231, which contributes strongly to the binding affinity. A comparison of
21.21 with 21.22, or 21.23 with 21.24 underscores the fact that this hydrogen bond only
increases the affinity if this part of the molecule is charged. If the charge is missing, the
H-bond-forming amino group can be left out without a concomitant loss in affinity. The
morpholine derivative 21.20 proved to be a nanomolar inhibitor and arranges its side chain in
the uracil33 pocket.
21.9 With a Salt Bridge: Finally Nanomolar!
[Fig. 21.17 graphic: (a) binding pocket with 21.20 (Ki = 6 nM); (b) binding pocket with 21.25 (R = OH, Ki = 96 nM), 21.26 (R = NH2, Ki = 55 nM), and 21.27 (Ki = 4 nM). Labeled residues: Gly230, Leu231, Asp156, Asp102, Asp280, and Val45.]
Fig. 21.17 (a) The crystal structure with the morpholine derivative 21.20 does not show a well-
defined difference electron density (green contour net) around the morpholine side chain in the
uracil33 pocket. This observation is an indication of severe disorder over multiple spatial orien-
tations. Computer simulations confirmed this hypothesis and suggest two possible placements for
the side chain. (b) Introducing a hydroxyl function 21.25 or a basic nitrogen atom 21.26 into the
side chain of the 4-position of the lin-benzoguanine scaffold leads to its participation in the
hydrogen bond network between Asp102 and Asp280. Introduction of a hydroxyethylene linker
leads to a loss of affinity by a factor of two compared to the unsubstituted parent structure; the
aminoethylene linker derivative shows the same binding affinity as the parent structure. The substituents
prevent a collapse in the binding affinity from the destruction of the water network. Compound
21.27 is clearly recognizable in the electron density, forms an H-bond to Asp280, and fills the
small hydrophobic pocket. It binds to the enzyme with Ki = 4 nM.
water network. Interestingly, the hydroxyl derivative is less potent than the
unsubstituted parent structure 21.19 by a factor of 2. In contrast, the amino derivative
21.26 achieves about the same potency as the unsubstituted reference. Obviously, in
the latter case, attachment of the ethylene amino substituent and the concomitant
perturbation of the water network is just cost-neutral. Most likely the terminal amino
group of the ethylene amino derivative is charged and present as an ammonium group.
Apparently this charge provides an advantage over the mere placement of a hydroxyl
group between the two neighboring aspartic acids. Derivative 21.27, which extends
the ethylene ammonium substituent by a hydrophobic group, achieves a binding
affinity of Ki = 4 nM. The crystal structure confirms the assumed binding mode. It
underscores the relevance of the originally proposed design hypothesis that the water
network should be crossed only by a polar linker and that the filling of the small
hydrophobic pocket achieves a strong increase in binding affinity. In the next step,
appropriate groups were added to the 2- as well as 4-positions. As a result, substances
were obtained that now inhibit the enzyme with subnanomolar potency.
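The gain achieved along this optimization path can be put on an energy scale with the relation ΔG° = RT ln Ki (Ki in mol/L). The following sketch converts the Ki values quoted in Figs. 21.16 and 21.17 into free energies; the temperature of 298.15 K is an assumption, as the assay temperature is not stated here, and the function names are chosen for illustration only.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)
T = 298.15             # assumed temperature in K

# Ki values in nM as quoted in Figs. 21.16 and 21.17.
KI_NM = {"21.13": 4100.0, "21.18": 77.0, "21.20": 6.0, "21.27": 4.0}


def delta_g(ki_molar):
    """Standard free energy of binding in kJ/mol from an inhibition constant in mol/L."""
    return R_KJ * T * math.log(ki_molar)


def delta_delta_g(ki_reference_molar, ki_new_molar):
    """Change in binding free energy (kJ/mol) on going from the reference to the new ligand."""
    return R_KJ * T * math.log(ki_new_molar / ki_reference_molar)


reference = KI_NM["21.13"] * 1e-9
for name, ki_nm in KI_NM.items():
    ki = ki_nm * 1e-9
    print(
        f"{name}: Ki = {ki_nm:7.1f} nM, "
        f"dG = {delta_g(ki):6.1f} kJ/mol, "
        f"ddG vs 21.13 = {delta_delta_g(reference, ki):6.1f} kJ/mol"
    )
```

Under these assumptions, the step from the parent scaffold 21.13 (4100 nM) to 21.27 (4 nM) corresponds to a gain of roughly 17 kJ/mol in binding free energy, that is, about three orders of magnitude in Ki.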
The development of nanomolar TGT inhibitors had to take a few detours.
The many crystal structures that were determined with the system were decisive
for the breakthrough. The basic knowledge for the optimization process can be
summarized in three points. The destruction of the water network can be very
detrimental for the affinity. The filling of a hydrophobic pocket certainly improves
the affinity, but it must be checked whether a linker to place this group can afford an
optimal interaction geometry with the environment. The exchange of a neutral for
a charge-assisted hydrogen bond was critical for the optimization process.
This can be achieved by introducing groups in a molecular building block that
cause a distinct change in the pKa properties of the ligand. Unfortunately, the
inhibitors are not yet suitable for in vivo use. The three potentially positively
charged groups make them very polar. Therefore an attempt must be made to avoid
these charges without losing a large portion of the achieved binding affinity.
21.10 Synopsis
Bibliography
Brenk R, Naerum L, Grädler U, Gerber H-D, Garcia GA, Reuter K, Stubbs MT, Klebe G (2003)
Virtual screening for submicromolar leads of TGT based on a new unexpected binding mode
detected by crystal structure analysis. J Med Chem 46:1133–1143
Grädler U, Gerber H-D, Goodenough-Lashua DAM, Garcia GA, Ficner R, Reuter K, Stubbs MT,
Klebe G (2001) A new target for shigellosis: rational design and crystallographic studies of
inhibitors of tRNA-guanine transglycosylase. J Mol Biol 306:455–467
Hörtner S, Ritschel T, Stengl B, Kramer Ch, Klebe G, Diederich F (2007) Design, synthesis, and
biological evaluation of inhibitors of tRNA-guanine transglycosylase, an enzyme linked to the
pathogenicity of the Shigella bacterium. Angew Chem Int Ed 46:8266–8269
Meyer EA, Brenk R, Castellano RK, Furler M, Klebe G, Diederich F (2002) De Novo design,
synthesis, and in vitro evaluation of inhibitors for prokaryotic tRNA-guanine transglycosylase
(TGT): a dramatic sulfur effect on binding affinity. Chembiochem 2:250–253
Ritschel T, Hoertner S, Heine A, Diederich F, Klebe G (2009a) Crystal structure analysis and
in-silico pKa calculations suggest strong pKa shifts of ligands as driving force for high affinity
binding to TGT. Chembiochem 10:716–727
Ritschel T, Kohler PC, Neudert G, Heine A, Diederich F, Klebe G (2009b) How to replace the
residual solvation shell of polar active-site residues to achieve nanomolar inhibition of tRNA-
guanine transglycosylase. ChemMedChem 4:2012–2023
Stengl B, Reuter K, Klebe G (2005) Mechanism and substrate specificity of tRNA-guanine
transglycosylases (TGTs): tRNA modifying enzymes from the three different kingdoms of
life seem to share a common mechanism. Chembiochem 6:1–15
Stengl B, Meyer EA, Heine A, Brenk R, Diederich F, Klebe G (2007) Crystal structures of tRNA-
guanine transglycosylase (TGT) in complex with novel and potent inhibitors unravel pro-
nounced induced-fit adaptations and suggest dimer formation upon substrate binding. J Mol
Biol 370:492–511
Part V
Drugs and Drug Action: Successes of
Structure-Based Design
How many modes of action are there for drug therapy? There are estimations that
the currently commercially available drugs exert their action on approximately
500 target structures. Optimistic prognoses claim that this number can be
increased by perhaps a factor of 10. But this number is still small compared to
the diversity of proteins that play a role in our organism. Our genome has been
sequenced. We know that the number of our genes (about 25,000) is much smaller
than was originally assumed (▶ Sect. 12.3). The number of relevant proteins for
which these genes code is, however, significantly larger, because, among other
reasons, versatile posttranslational modification and alternative splicing cause
the genetic information to be diversified over multiple protein variants. Accord-
ingly, our genome is mapped, but do we know what function is behind each
individual gene? How can predictions about proteins and their functions and
possible roles in pathophysiology be extracted from this flood of sequence infor-
mation? Many of the proteins that have been discovered in the genome can be
assigned to protein families based on sequence comparisons. Nonetheless,
a significant portion of our genetic information still awaits annotation. The first
step has been taken, but what do the spatial structures of these proteins, for
which only sequences are known, look like? Which ligands will be recognized by these proteins,
and what biochemical role do they assume in our organism? The biochemical
function, that is, the assignment of whether a protein represents, for example,
a protease, an ion channel, or a transporter, still affords no information at all
about what systemic roles the protein takes in the functional processes in a cell
or in a whole organism. The spatial structure of a protein is responsible for this
function. Therefore, the structures of the proteins in our genome are being inten-
sively investigated. The goal is to map the structural space of all proteins as well as
possible. Then it could be possible to find a spatially elucidated and adequately
homologous reference structure for each discovered sequence. Today, the structures
of all members of a few gene families have already been determined. Therefore, it is
only a question of time until we have the spatial structure of all relevant proteins at
our disposal. The way there may be long and hard, but it is clearly sketched out.
Will this revolutionize the market of potential pharmaceuticals and make entirely
new therapeutic approaches possible? The chemical space of all imaginable active
substances and the biological space of all possible pathology-relevant proteins are
discussed in ▶ Sects. 11.4 and ▶ 12.4. Drug design attempts to merge both of these
spaces with one another. Candidates for potential
active substances are to be found in the intersection of both spaces.
In 2002, Andrew Hopkins and Colin Groom published a summary that accurately
illuminated the drug market at that time (Fig. 22.1). Back then, approximately
20 drugs per year were being launched into the market, and at the moment we are
seeing only a very small change in these circumstances. Approximately half of
today’s drugs inhibit enzymes. Another 30% modulate the behavior of G protein-
coupled receptors (GPCRs). About 7% exert their therapeutic effect on ion channels.
Transporters, nuclear hormone receptors or other receptors for growth factors, inter-
leukins, or peptides such as insulin are influenced by about 4% of the available drugs
each. Then a small portion remains that influences cell-surface integrins or DNA.
These market segments in no way cover the frequency of these target structures in our
genome. For example, the GPCRs are only 2.3% of our genome if sensory GPCRs are
excluded. Approximately 15% of the “druggable” genome, that is, the portion for
which the function can be favorably influenced by a pharmaceutical therapy, is
assigned to the GPCRs. The kinases make up more than 22% of the genome, but
only nine small-molecule inhibitors are commercially available as drugs. However, it
is estimated that about 100 substances are in extensive testing. Therefore, it is to be
expected that the drug market will change in the next years.
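The market shares quoted above can be checked for internal consistency with a few lines of arithmetic. The following sketch is only an illustration. The share for enzymes is given in the text merely as "approximately half", so it is computed here as the remainder, under the assumptions that the listed classes are exhaustive and that "about 4% each" applies separately to transporters, nuclear hormone receptors, and other receptors. The variable names are hypothetical.

```python
# Approximate shares of marketed drugs per target class, in percent,
# as quoted in the text and in the labels of Fig. 22.1.
# Assumption: "about 4% each" refers to three separate classes.
non_enzyme_shares = {
    "GPCRs": 30,
    "ion channels": 7,
    "transporters": 4,
    "nuclear hormone receptors": 4,
    "other receptors": 4,
    "integrins": 1,
    "DNA": 1,
    "miscellaneous": 2,
}

# Assumption: the classes are exhaustive, so enzymes take the remainder.
enzyme_share = 100 - sum(non_enzyme_shares.values())

print(f"Implied enzyme share: {enzyme_share}%")
```

Under these assumptions the remainder comes out at 47%, which is consistent with the statement that approximately half of the drugs inhibit enzymes.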
In the next chapters, examples of individual target structures will be introduced
that represent potential targets for drug therapy. They are discussed on the basis of
their most important structural characteristics because the structure of the target
generally defines what is needed to qualify a molecule as an inhibitor, agonist,
antagonist, or allosteric modulator. These principles serve as a general concept for
the design of new active substances. In modern drug research, the target structure
for which a new active substance is sought is usually known. In many historical
examples of drug development, this was not initially the case. In the meantime,
however, many modes of action are known. Peter Imming and his research group
have compiled a summary of the modes of action for a broad collection of drugs that
are used today. Furthermore, the database WOMBAT from Tudor Oprea at the
University of New Mexico in Albuquerque offers fast access to functionally
annotated drugs together with their characteristic properties.
[Fig. 22.1, labels recoverable from the pie chart: GPCRs 30%; other receptors 4%; DNA 1%; integrins 1%; miscellaneous 2%]
To explain how an enzyme prepares its substrate for the transition state we should
consider an example. The crystal structure of creatinase with its natural
substrate creatine 22.1 and a very similar inhibitor, carbamoylsarcosine 22.2, was
determined in the research group of Robert Huber at the Max Planck Institute in
Martinsried, Germany. The enzyme catalyzes the cleavage of creatine to urea and
22.3 How Do Enzymes Push Substrates Toward the Transition State? 475
Fig. 22.2 The enzyme creatinase cleaves creatine 22.1 with water into urea and sarcosine. The
structurally very similar molecule, carbamoylsarcosine 22.2, is an inhibitor of this enzyme.
sarcosine (Fig. 22.2). For this, the central carbon in the C–N bond in the
guanidinium part of creatine is nucleophilically attacked by a water molecule.
All three C–N bonds in the guanidinium portion exhibit double-bond character and
a planar geometry because of electron delocalization. How does the enzyme
manage to distort creatine in the direction of the transition state of the reaction to
prepare it for the nucleophilic attack as well as for bond breaking? The zwitterionic
creatine is bound through its guanidinium function by two glutamate residues
forming two salt-bridge-like hydrogen bonds (Figs. 22.3 and 22.4). The opposite
acid function finds strongly polarizing bonding partners in two arginine residues.
Furthermore, a water molecule is found near the central imine-like carbon atom in
the crystal structure. A histidine is next to it in the binding pocket. This histidine
orients the water molecule in exactly the right position and also supports the
abstraction of a proton from this water molecule. This increases the nucleophilicity
of the water to generate an OH− group. The vice-like fixation of the guanidinium
group by the two glutamate residues causes a twisting of this building block, which
is planar in the unbound state. Because of this, the conjugation is disrupted and as a
consequence, the C–N bond that is to be cleaved is significantly weakened.
A nucleophilic attack occurs, and a tetrahedral transition state is formed. At the
same time, the now-protonated histidine is able to polarize the methyl-substituted
nitrogen atom and involve it in a hydrogen bond. This prepares our substrate for the
bond-breaking transition state. After transferring the proton from histidine to
the substrate, a positive charge is formed on the nitrogen atom of the bond that is
to be cleaved. Histidine accepts a proton from the oxygen atom of the tetrahedral
transition state as a C=O double bond is formed, and the central C–N bond is cleaved.
The products then leave the binding pocket. In this way, the enzyme creates
a stereoelectronically complementary environment for the cleavage reaction. Its
polar groups place the water molecule correctly for the nucleophilic attack, and
histidine induces a pyramidalization of the nitrogen atom in the bond to be broken.
At the same time, it serves as a proton donor as well as acceptor during the reaction.
476 22 How Drugs Act: Concepts for Therapy
Fig. 22.3 (a) In the first step, a water molecule is polarized by a neighboring histidine so
that a nucleophilic attack on the imine-like carbon is facilitated. (b) Then, the histidine transfers
a proton to the central nitrogen atom. (c) The substrate reacts further in that a C=O double bond
is formed and the C–N bond is cleaved. (d) The products urea and sarcosine leave the binding
pocket.
The crystal structure in Fig. 22.4 was determined together with carbamoyl-
sarcosine. This molecule is different from the substrate creatine because of an
exchange of an oxygen for a nitrogen atom. However, because of this, this part of
the molecule does not carry a positive charge as creatine does. The addition of the
nucleophilic OH− leads to decomposition and compensation of the charge in the
guanidinium part in creatine. A comparable attack upon carbamoylsarcosine
would lead to the formation of a negative charge next to the two negatively
charged glutamates. This is energetically unfavorable. As a consequence, the
cleavage reaction does not take place on this molecule; instead, it blocks the
transformation. The example shows how precisely substrate and enzyme must be
Fig. 22.4 (a) The vice-like fixation of the guanidinium group by both glutamate residues causes
a twisting in this portion, which is planar in the unbound state. Because of this, the conjugation is
disrupted and the C–N bond to be cleaved is weakened. The twisting is indicated by the red and
yellow planes that pass through the atoms of the guanidinium group. (b) The neighboring
protonated histidine further polarizes the methyl-substituted nitrogen atom, and involves it in
a hydrogen bond. In doing so the nitrogen atom takes on a pyramidal configuration, by which it
deviates out of the plane (yellow) of its next three neighbors. (c) In the structure with the substrate-
like inhibitor carbamoylsarcosine, a water molecule can be found in the position from which the
nucleophilic attack on the substrate creatine is initiated. This occurs from above and diagonally
behind the C=N bond.
in harmony with one another. Small changes can drastically change this system and
convert a substrate molecule into an inhibitor of the targeted transformation
reaction.
Enzymes can be organized into multienzyme complexes that carry out multiple
reactions on one substrate sequentially. They can also form cascades in which one
enzyme activates the inactive precursor of the next enzyme. This activation con-
tinues to the next enzyme, and the next, and so forth. The coagulation cascade
(▶ Sect. 23.3) is activated by two independent pathways, each along multiple steps,
which merge into a common pathway in the end. Because of this, a minor initiating
event is amplified by multiple orders of magnitude. This is good for normal
coagulation after an injury, but in the context of a coagulopathy (i.e., a tendency
to form clots too easily) it can have disastrous consequences!
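The amplification described above can be made concrete with a small calculation. This is a minimal sketch under a deliberately simplified assumption that is not taken from the text: every activated enzyme molecule activates the same number of molecules of the next enzyme, and nothing is lost or inhibited along the way. The function name and the numbers are hypothetical.

```python
def cascade_amplification(gain_per_step: float, steps: int) -> float:
    """Overall amplification of a linear enzyme cascade.

    Assumption: each activated enzyme molecule activates `gain_per_step`
    molecules of the next enzyme, without losses or inhibition.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return gain_per_step ** steps


# Hypothetical numbers: a 100-fold gain at each of 4 consecutive steps.
overall = cascade_amplification(100, 4)
print(f"Overall amplification: {overall:.0e}")
```

Even a modest gain per step multiplies quickly: four steps with a 100-fold gain each already give eight orders of magnitude in this idealized picture.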
Quite a number of inhibitors prevent the catalytic effect of an enzyme by occupying
the position at which the substrate binds. Such inhibitors are termed competitive
inhibitors. In addition, there are also allosteric inhibitors that bind at another position
on the enzyme and cause a change in its three-dimensional structure or dynamic
properties. This can prevent the enzyme from adopting the necessary conformation
for catalysis and can lead to a weakening of the catalytic activity. Detailed investiga-
tions of the enzyme kinetics allow for competitive inhibition to be distinguished from
noncompetitive inhibition. According to the type of interactions with the enzyme,
reversible and irreversible inhibitors can be differentiated. In the case of reversible
inhibitors, the binding to the enzyme must be strong so that the transformation of the
substrate can be reliably prevented. Some reversible inhibitors form a covalent bond to
the catalytic center that is chemically labile, and therefore fully reversible, for instance,
a hemiacetal bond. Irreversible inhibitors react with the enzyme by forming
a chemically stable bond. The inhibitors or the reacting groups cannot be detached,
and for the rest of the lifespan of the enzyme until protein degradation in the organism,
the enzyme remains inhibited. Moreover, there are naturally occurring protease inhib-
itors that indeed reversibly bind, but adhere so strongly that the complex is degraded
before the inhibitor is released.
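How kinetic measurements separate competitive from noncompetitive inhibition can be illustrated with the standard rate laws. The sketch below assumes simple Michaelis-Menten kinetics and a purely noncompetitive inhibitor, neither of which is spelled out in the text. All function names and parameter values are hypothetical and given in arbitrary but consistent units.

```python
def rate(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate without inhibitor."""
    return vmax * s / (km + s)


def rate_competitive(s, i, vmax, km, ki):
    """Competitive inhibition: apparent Km rises by (1 + [I]/Ki), Vmax is unchanged."""
    return vmax * s / (km * (1 + i / ki) + s)


def rate_noncompetitive(s, i, vmax, km, ki):
    """Pure noncompetitive inhibition: apparent Vmax falls by (1 + [I]/Ki), Km is unchanged."""
    return vmax / (1 + i / ki) * s / (km + s)


# Hypothetical parameters in arbitrary but consistent units.
VMAX, KM, KI, INHIBITOR = 100.0, 10.0, 5.0, 5.0

for s in (10.0, 1000.0):
    print(
        f"[S]={s:7.1f}  none={rate(s, VMAX, KM):6.2f}  "
        f"competitive={rate_competitive(s, INHIBITOR, VMAX, KM, KI):6.2f}  "
        f"noncompetitive={rate_noncompetitive(s, INHIBITOR, VMAX, KM, KI):6.2f}"
    )
```

With these numbers, a large excess of substrate restores almost the full rate under competitive inhibition, whereas the rate in the noncompetitive case stays near half of Vmax. This difference in the response to substrate concentration is what the kinetic analysis exploits.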
The rational design of an enzyme inhibitor usually starts with the structure of the
substrate. One approach that is particularly successful is to imitate the transition
state with a chemically analogous group that is not attacked by the enzyme. In
the ▶ Chaps. 23, “Inhibitors of Hydrolases with an Acyl-Enzyme Intermediate”;
▶ 24, “Aspartic Protease Inhibitors”; ▶ 25, “Inhibitors of Hydrolyzing
Metalloenzymes”; ▶ 26, “Transferase Inhibitors”; and ▶ 27, “Oxidoreductase
Inhibitors”, many examples for the design of such inhibitors are presented. Overall,
irreversible enzyme inhibitors play a smaller role than reversible inhibitors, but
important drugs such as acetylsalicylic acid (ASA, ▶ Sect. 3.1), omeprazole
(▶ Sect. 3.5), clopidogrel (a thrombocyte aggregation inhibitor), penicillins and
22.5 Receptors as Target Structures for Drugs 479
Fig. 22.5 Schematic representation of the structure and function of a G protein-coupled receptor
(GPCR). The seven cylinders symbolize the seven transmembrane helices. The extra- and
intracellular loops that connect the helices are not shown. After binding an agonist, the α-subunit
dissociates from the so-called G protein complex. If a Gs or Gq/11 protein is present, then an
enzyme is activated that generates an internal hormone, a “second messenger.” For example, the
membrane-bound enzyme adenylate cyclase generates cyclic adenosine monophosphate (cAMP)
from adenosine triphosphate (ATP). This second messenger can further affect target proteins via
protein kinase A, or open an ion channel. To avoid an overreaction, cAMP is constantly being
degraded by the enzyme phosphodiesterase. Gi/o proteins inhibit enzymes that form second
messengers.
• The different structures of the agonists and receptors and the resulting activation
of different G proteins and effector proteins;
• The different receptor occupancy and density of different cells;
• The location of the cells that produce and release the hormone or neurotrans-
mitter. This is accomplished in very specific cells; neighboring cells or organs
are not involved.
The picture of such receptors can be very complex. For example, in the case of
the acetylcholine receptors, two different groups are distinguished that preferably
bind either muscarine, a toxin of the toadstool Amanita muscaria, or nicotine, the
active ingredient of the tobacco plant, Nicotiana tabacum. In contrast to the
Fig. 22.6 (a) The nicotinic acetylcholine receptor (nAChR) is a ligand-gated ion channel
(▶ Sect. 30.4). Here the cylinders do not stand for segments but rather for five separate proteins,
each of which has four transmembrane domains. After binding acetylcholine, the channel is
quickly opened. (b) Soluble receptors dimerize after agonist docking to their ligand-binding
domains (LBD). Here homodimers composed of two identical receptors as well as heterodimers
of two different receptors can be formed. The so-called zinc fingers of the DNA-binding domains
(DBD) recognize very specific sequences of DNA. A particular DNA segment is addressed by
dimerizing two receptor units. (c) Membrane-bound receptors for growth factors and insulin also
dimerize. Two receptors form a complex in the membrane and in doing so activate the intracellular
domain of the receptor, in this case, a tyrosine kinase.
the cytosol, that is, the cell fluid. After binding the agonist, the complex migrates to
the nucleus. There, it binds as a dimer to the signal sequences of the DNA, the
operator and repressor genes, and induces or suppresses the new synthesis of
specific proteins (Fig. 22.6b).
All cytosolic hormone receptors or nuclear receptors are built from common
structural principles (▶ Sect. 28.2). They exhibit domains with a DNA-binding site
and a ligand-binding site. The DNA-binding site is highly conserved, that is, its
amino acid sequence varies very little between the different receptors. It contains
two “zinc fingers” comprising two Zn2+-binding sites that are highly conserved
motifs binding to very specific DNA segments, the so-called recognition sequences.
The ligand-binding site is much more variable. Dimers, either of two identical
receptors (homodimers) or from two different receptors (heterodimers) are formed
for the interaction with DNA. Four zinc fingers in the dimer recognize 12 base pairs
of DNA in total.
Dimerization is also found in other classes of membrane-bound receptors that
do not belong to the GPCR type. Among these are the receptors for growth
factors, for example, for human growth hormone (hGH), epidermal growth factor
(EGF), and insulin (▶ Sect. 29.8). Upon binding to the factor, these receptors
dimerize in the membrane with the extracellular domains. As a consequence, intra-
cellular kinases are activated that are part of the receptor protein (Fig. 22.6c). In
addition, there are receptors that must form complexes of more than two units to
provoke a receptor response. Among these are a series of immunologically important
receptors as well as receptors for the nerve growth factor (NGF) and tumor necrosis
factor (TNF).
Multiple examples of proteins are presented in this section that exert their function
as oligomers. Indeed, oligomer formation is also common in enzymes. There are
many reasons why oligomerization is advantageous. On the one hand, there are
functional requirements that demand, as described above, multiple neighboring
domains. On the other hand, there can be mechanistic advantages, especially with
enzymes. Individual domains of an oligomer are not necessarily independent of one
another. Their catalytic efficiency can depend upon what conditions the other
domains of the oligomer are currently in. This affords an additional possibility to
regulate the protein function. Oligomerization can also have another meaning. The
interior of a cell is crowded with proteins, ligands, substrates, and ions. It can be
compared with a ticker-tape parade given for a winning football team: hectic pushing
and shoving! One way to reduce the number of separate particles without limiting the
catalytic productivity by sacrificing catalytic centers is the formation of oligomers.
Ion channels, which are embedded in the cell membrane, allow ions to enter or leave
the cell along the corresponding electrochemical concentration gradient when they
22.7 Blocking Transporters and Water Channels 483
are open. The opening or closing of the channel can be either voltage-, ligand- or
receptor-gated. All of these processes occur extraordinarily fast (▶ Sect. 30.1).
The intracellular Ca2+ ion concentration in all cells is several orders of magnitude below
that of the surrounding medium. At the moment of cellular stimulation, all of the
voltage-gated calcium channels are momentarily opened by the arrival of an
electrical signal. An influx of Ca2+ ions into the cell occurs. The intracellular
concentration rises swiftly without ever reaching the extracellular concentration.
In smooth, skeletal, and heart muscle cells, this process induces a contraction. Then
the excess Ca2+ ions are pumped out of the cell, and a resting phase follows. This
process is repeated very quickly in heart cells in a rhythm of less than a second,
corresponding to the length of a heart beat.
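The driving force behind the Ca2+ influx can be estimated with the Nernst equation, which gives the membrane potential at which a concentration gradient for one ion species is exactly balanced. The following is a minimal sketch. The concentrations are illustrative assumptions and are not taken from the text, body temperature is assumed, and the function name is hypothetical.

```python
import math

R = 8.314462618      # molar gas constant, J/(mol*K)
F = 96485.33212      # Faraday constant, C/mol


def nernst_potential(c_out: float, c_in: float, z: int, temp_k: float = 310.15) -> float:
    """Equilibrium (Nernst) potential in volts, inside relative to outside."""
    return (R * temp_k) / (z * F) * math.log(c_out / c_in)


# Illustrative, assumed concentrations (not taken from the text):
# 2 mM Ca2+ outside and 100 nM inside, i.e. a ratio of 20,000.
e_ca = nernst_potential(c_out=2e-3, c_in=1e-7, z=2)
print(f"Ca2+ equilibrium potential: {e_ca * 1000:.0f} mV")
```

With these assumed values the equilibrium potential is on the order of +130 mV, far from the negative resting potential of a cell. This is why Ca2+ ions flow inward as soon as the channels open, and why the intracellular concentration does not need to reach the extracellular level.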
Verapamil and nifedipine (▶ Sect. 2.6) affect such voltage-gated calcium chan-
nels and inhibit the influx of calcium ions. They are called “calcium channel
blockers,” which describes the mode of action of these substances. By inhibiting
the influx of Ca2+ ions, the excitability of the cells, for instance, of heart cells, is
decreased, less energy is used, and the muscle work becomes more
economical. Furthermore, calcium channel blockers offer protection from the high
calcium concentrations that are caused by cell demise in poorly perfused areas, for
instance, during a heart attack. A particularly favorable therapeutic effect is their
blood pressure-lowering properties.
The nicotinic acetylcholine receptor (nAChR, Fig. 22.6a) and the family of
glutamate receptors belong to the class of ligand- or receptor-gated ion channels.
Here the opening and closing of the channel is not accomplished by an electrical
impulse but rather by the binding of a ligand.
Many drugs affect ion channels (▶ Chap. 30, “Ligands for Channels, Pores, and
Transporters”). Local anesthetics and antiarrhythmic drugs, which are derived from
the former, are sodium channel blockers; they reduce the excitability of nerve cells.
The venom of the fugu fish, tetrodotoxin (▶ Sect. 6.2), also blocks this channel.
Other antiarrhythmic agents block potassium channels. Substances that stabilize
the K+ channel in an open state, so-called K+ channel openers, act as vasodilators
and decrease the blood pressure. The antidiabetic sulfonylureas are K+ channel
blockers that act on the insulin-producing cells in the pancreas (▶ Sect. 30.2).
Tranquilizers of the benzodiazepine type (▶ Sect. 30.6) increase the binding of
the neurotransmitter γ-aminobutyric acid (GABA) to chloride channels. Prolonged
opening of this channel causes an increased influx of chloride ions and with it
a change in the response behavior of the nerve cells. Barbiturates and inhaled
anesthetics also act on the GABA receptors, but on different domains.
Transporters are proteins that mediate the active uptake of molecules or ions into
cells. They play a very decisive role in the digestive process. Because amino acids
and sugar cannot cross membranes on their own, they can only be absorbed with the
help of transporters in the digestive tract.
Fig. 22.7 Nerve signal transmission through neurotransmitters is based on a complex interplay of
enzymes, receptors, ion channels, and transporters. Dopamine is produced by enzymatic decar-
boxylation of the amino acid L-DOPA. As with other neurotransmitters, it is stored in special
vesicles. Upon electrical stimulation, Ca2+ ions flow into the cell. This causes the neurotransmitter
to be released into the synaptic gap. The nerve impulse is conducted further by the interaction with
the postsynaptic receptor. Finally, the uptake in the presynaptic cell is accomplished by
a transporter and the neurotransmitter is stored in a vesicle again, or degraded by the enzyme
monoamine oxidase (MAO).
the transporters are differentiated into many families. Most have an even more
complex structure with 12 transmembrane domains (▶ Sect. 30.8).
A few active substances directly target the transporters and displace the natural
ligands. The euphoric effects of cocaine are due to its binding to the dopamine
transporter, which is responsible for the active transport and uptake of dopamine in
the nerve cells. A fast flood of cocaine causes a delayed uptake of dopamine from
the synaptic gap, and this is responsible for the typical physical and psychiatric
effects. A few antidepressants are ligands for the noradrenaline and serotonin
transporters (▶ Sect. 1.4). They are bound, but not transported into the cell. In
contrast, some analogues of amino acids are brought into nerve cells by transporters
and act there as neurotoxins. An overview of the complex interplay of neurotrans-
mitters, enzymes, receptors, and transporters is presented in Fig. 22.7. Some
anti-gout drugs bind to the uric acid transporter. They displace uric acid, inhibit
its absorption from primary urine, and accelerate the excretion of uric acid with the
urine. There are even specific transporters for bile acids.
In addition to the previously described transporters, other representatives of this
protein class are also important for the uptake or excretion of foreign substances into or
out of cells. Tumor cells often react to therapeutic measures by developing multiple
resistance to many structurally diverse substances (▶ Sect. 30.8). Glycoprotein GP
170, also a transporter with 12 transmembrane domains, is responsible for this process.
In contrast to ion channels, ion transporters work against the concentration
gradients. This is an active process that occurs at the expense of energy. Drugs can
influence this too. An example is agents that increase urine production: diuretics.
They inhibit different ion transporters. Na+/K+ ATPase, a pump that exchanges
sodium for potassium ions, is inhibited by the cardiac glycosides, which are
prescribed to treat congestive heart failure. Substances of the omeprazole type
(▶ Sects. 3.6 and ▶ 9.5) inhibit the H+/K+ ATPase, the so-called proton pump.
Nature uses special water channels to regulate water homeostasis, and also to
quickly and selectively transport small, non-charged molecules such as glycerol
or urea across the cell membrane. In contrast to the transporters, and analogously
to the ion channels, these allow water to flow along the osmotic gradient
(▶ Sect. 30.9). Ten isoforms that display different permeabilities have been discovered
in mammals. They are tetramers in which each monomer is composed of six transmembrane
helices and forms its own channel. The channels are made available for water
homeostasis in part by the release of cytosolic vesicles; alternatively, activation
can be achieved by phosphorylation. Regulation of the water channels by drugs
represents a diuretic therapy concept, but the treatment of parasitic infection has
also been discussed as an additional indication.
The therapy of viral, bacterial, and parasitic diseases attempts to target a pathogen
very specifically. For this, various mechanisms are exploited, for example,
biosynthetic pathways that are either not present in humans in an identical form
or that do not play an important role in humans. In this way, the danger of adverse
effects can be minimized from the beginning.
Antimetabolites are substances that are incorporated as a false substrate instead
of the natural biological reagents, for example, as enzyme cofactors or in DNA. An
example is the sulfonamide sulfonamidochrysoidine. Its cleavage product sulfanil-
amide (▶ Sect. 2.3) is similar to p-aminobenzoic acid, which is the starting material
in the biosynthesis of an important bacterial cofactor, dihydrofolic acid. Only
bacteria are affected by this. Humans are not dependent on this biosynthetic
pathway. As with other mammals, humans must obtain dihydrofolic acid from
food. A few virostatics and tumor-inhibiting substances are nucleoside ana-
logues. Depending on their structure type, they use a modified base, a modified
sugar, or both. All influence the DNA or RNA synthesis. Aciclovir and a few other
analogues are taken into the cells as Trojan horses in the inactive form, and
“armed” once inside the cell. Their activation is carried out by viral enzymes,
and this process only occurs inside cells that have been infected by the virus
(▶ Sects. 9.5 and ▶ 32.5). Another mechanistic principle tries to interfere with
the translation process so that particular proteins are never manufactured in the first
place by protein biosynthesis. For this, the translation of the mRNA is blocked by
complexation to so-called antisense oligonucleotides (▶ Sect. 32.4). The formed
double-stranded mRNA cannot be read in the ribosome. Such a therapy can find
application for the treatment of exaggerated immune reactions, septic shock,
arterial hypertension, pulmonary emphysema, or pancreatitis.
Many antibiotics, for example, the penicillins and cephalosporins
(▶ Sect. 23.7), inhibit bacterial cell-wall biosynthesis. In this process,
they block the catalytic center of a transpeptidase that shows a similar mode of
action to a serine hydrolase (▶ Sect. 23.7). The antibiotic D-cycloserine, also an
inhibitor of the cell-wall construction, penetrates the interior of the bacteria by
using a D-alanine transporter. Other antibiotics are protein biosynthesis inhibitors
(▶ Sect. 32.6). Tetracycline (▶ Sect. 6.3), streptomycin (▶ Sect. 6.3), and chlor-
amphenicol (▶ Sect. 9.2) also inhibit the protein synthesis machinery. They undergo
an interaction with the 30S or 50S subunit of the ribosome and block ribosomal
peptide synthesis. The elucidation of the spatial structure of the ribosome established
fundamentals that allowed the mode of action of a large number of macrolide
antibiotics to be understood and afforded a perspective on how mechanisms of
resistance develop (▶ Sect. 32.6). Antibacterial quinolone carboxylic acids
inhibit gyrase. The latter enzyme causes a twisting, and as a consequence enables
a dense packing of the DNA in the bacterial cells. Without this twisting, there is
simply not enough space in the cell for the genetic material. The so-called polyene
antibiotics are used to treat fungal infections. They form channels in the fungal cell
membrane that cause a loss of intracellular ions and, consequently, cell death.
Azoles inhibit the biosynthesis of ergosterol, which is absolutely required for the
construction of the intact cell membrane.
Alkylating agents play an important role in tumor therapy. Reading and writing
errors occur because of the alkylation of DNA bases, and these errors have a much
22.8 Modes of Action: A Never-Ending Story 487
stronger effect on quickly dividing tumor cells than in normal cells, but they also
have considerable side effects. Intercalating tumor therapeutics are planar mol-
ecules that slip between two base pairs of DNA (▶ Sect. 14.9). The disruption that
occurs as a consequence also leads to errors in cell division. Other DNA ligands
bind in the minor or the major groove on the exterior of the double helix. Taxol
(▶ Sect. 6.1) and the epothilones are important active substances for cancer ther-
apy. They bind to tubulin, a protein that forms tube-like structures: so-called
microtubules. Because the formation of such structures is an important prerequisite
for cell division, Taxol or the epothilones inhibit this process in a very specific way.
The immunosuppressive ciclosporin (▶ Chap. 10, “Peptidomimetics,” Fig. 10.2)
blocks the activation of the so-called helper cells of the immune system. Two
enzymes are involved in this process. One of them, cyclophilin, is a prolyl cis–
trans isomerase. The other, calcineurin, is a Ca2+/calmodulin-dependent phosphatase.
Ciclosporin acts as “putty” between these two proteins. The complex formation
prevents the activation of helper cells and therefore stops the stimulation of an
immune response. Modern transplant surgery would not be possible without the
immunosuppressive ciclosporin and substances with an analogous mode of action.
The so-called RAS proteins play an important role in tumorigenesis. They are
a family of enzymes with a relatively low molecular weight. RAS proteins with
mutated active centers lose their ability to control cell division, and the cells divide
unstoppably. Therefore they are oncogenic, that is, they cause tumors. Around 50%
of all lung and colorectal tumors have mutated ras genes, and about 95% of the ras
genes in pancreatic tumors are mutated. There are other approaches for therapy.
RAS proteins must migrate from the cytosol, the cell fluid, into the cell membrane
to signal the cell division. For this, they are enzymatically equipped with a farnesyl
group, which anchors the protein in the cell membrane. The prevention of the
membrane embedding by inhibiting farnesyltransferase represents an attractive
approach for targeted cancer treatment (▶ Sect. 26.10). In the meantime, it has been
demonstrated that this principle of blocking the farnesylation of proteins can also be
used to treat parasitic infections. For this, the farnesyltransferases of these parasites
are the target structures for drug development.
Tumor-suppressor genes produce proteins such as the p53 protein that prevent
cell division in the case of DNA damage. Any genetic defect in a cell leading to
a reduced concentration of one or more of these proteins has the consequence that
cells with defective DNA can proliferate. Cell division runs out of control, and
a tumor with additional genetic defects and uncontrolled growth forms.
A vascular occlusion is caused by the aggregation of blood platelets. Proteins on the
cell surface play an important role, for example, the adhesion glycoprotein αIIbβ3. Two
of these molecules form a complex with fibrinogen that “glues” the cells together. The
targeted development of low-molecular-weight peptidomimetics (▶ Sect. 10.6)
starting from an RGD motif (RGD stands for Arg-Gly-Asp) represents a great success
in rational drug design (▶ Sect. 31.2). Another system that plays an important role in
the cell–cell recognition between leukocytes and endothelial cells is that of the selectins. In
cases of inflammation, the E- and P-selectins are upregulated and presented on the
endothelium, and these prevent leukocytes from rolling along the surfaces of the blood
vessels (▶ Sect. 31.3). After adhesion, the leukocytes penetrate the vessel and migrate
to the site of the inflammation to fight the infection. In some diseases, an excessive
leukocyte infiltration leads to tissue damage. To prevent this, an attempt is made to
interfere with the inflammatory cascade with compounds that block the surface
exposure of selectins. These receptors recognize sugar-like molecular groups on
the leukocyte surface; the development of appropriate carbohydrate-based antagonists
therefore represents a suitable therapeutic concept.
A surface contact must also be formed between the flu virus and the host cell for
infection to take place. The virus docks with its envelope protein, hemagglutinin, to
the host cell to initiate endocytosis. After gaining entry into the cell, it uses the
protein biosynthesis machinery of the infected cell to make copies of itself. After
maturation, the new virus must be expelled from the cell again. For this, the new
virus buds on the cell surface and the bud is finally cinched off. In the last step, the
viral neuraminidase cleaves sialic acid. It is through this acid that the viral hemag-
glutinin is bound to the host cell. This last step can be blocked by neuraminidase
inhibitors (▶ Sect. 31.4). The inhibitors zanamivir and oseltamivir have been very
successfully introduced to the market. The CCR5 receptor antagonist maraviroc has
been launched for the therapy of HIV; the CCR5 receptor acts as an entry gate for
the HI virus, and its inhibition blocks host cell invasion.
The endogenous immune system has developed very efficient defensive
mechanisms. Antibodies represent one such defensive weapon. These proteins are
able to bind to foreign substances very selectively and with high affinity, and to
expose them to phagocytotic cells (i.e., dendritic cells and macrophages) for
degradation. This sophisticated, highly specific recognition system for molecules,
which ranges from very small low-molecular-weight antigens to complex macro-
molecular systems, has been tapped for pharmaceutical therapy (▶ Sect. 32.3).
Today, numerous artificially manufactured antibodies directed against very
different target molecules are found in the therapy of many different diseases.
There is no end in sight because currently about 200 newly developed antibodies
are in clinical trials.
There are only very few really “unspecifically” acting drugs. Antacids, which
neutralize gastric acid purely chemically, belong to this class, as do purely
surface-active substances, for instance, amphiphilic bactericides, fungicides, and
hemolytics. Specific mechanisms of action have been recognized even for the
barbiturates, local anesthetics, inhalation anesthetics, and alcohol, which was
long considered to be an unspecific agent. Frequently the evidence of a specific
effect was provided by the different effects of the pure enantiomers of a racemate.
The β-antagonistic effect of an optically active β-blocker is associated with one
enantiomer (▶ Sect. 5.5). The unspecific adverse effects on membranes, however,
are attributed to both enantiomers equally.
Is there anything new still to be discovered? An absolute surprise was the finding
that nitrogen monoxide, NO, a minuscule molecule, is also a neurotransmitter. Substances
that release NO or that interfere with NO biosynthesis lower or raise the
blood pressure (▶ Sect. 25.8). New subtypes are constantly being discovered for
22.9 Resistance and Its Origin
Pathogenic viruses, bacteria, and parasites defend themselves against drug therapy.
In the past the inappropriate and too-broad use of antibiotics led to selection
pressure for resistant strains. Unfortunately, it is the hospitals above all that are
the main location for the emergence and spread of resistant strains. The spatial
proximity and concentration of the most diverse pathogens is virtually unavoidable.
In some cases there are only a few effective weapons left, for example, the
glycopeptide antibiotics. They should be used prudently and purposefully, even if
that goes against the commercial interests of the manufacturer.
Bacterial pathogens overwhelmingly defend themselves against penicillins and
cephalosporins by producing β-lactamases (▶ Sect. 23.7). These are enzymes
that open the four-membered lactam ring of these antibiotics into inactive
cleavage products. During the long time that this substance class was optimized,
metabolically stable analogues as well as specific β-lactamase inhibitors were
developed.
The causative agent of the immune deficiency disease AIDS (▶ Sects. 1.3 and
▶ 24.3), the HI virus, a retrovirus, transfers its genetic information from the RNA
back into DNA. This process is afflicted with an exceedingly high error rate of
about one base mutation per generation. The high mutation rate leads to the fast
emergence and selection of resistant strains. In the last 10 years many active
substances with entirely different modes of action against the HI virus have been
introduced to the market, but resistances to many inhibitors were very quickly
observed, for example, against the HIV protease (▶ Sect. 24.3) or reverse tran-
scriptase inhibitors (▶ Sect. 32.5), and even multiple resistances. The mutated
viruses are even resistant to multiple, structurally different inhibitors! Combining
different active substances against one and the same target offers little further
help here. Only a combination of active substances that hit the virus at
completely different stages of its life cycle offers a reprieve.
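The consequence of such an error rate can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the number of mutations per replication cycle follows a Poisson distribution whose mean is the "about one base mutation per generation" quoted above. Both the function name and the choice of distribution are assumptions and are not taken from the text.

```python
import math


def fraction_mutated(mean_mutations_per_generation: float) -> float:
    """Fraction of progeny genomes carrying at least one mutation.

    Assumes (hypothetically) Poisson-distributed mutation counts,
    so that P(no mutation) = exp(-mean).
    """
    return 1.0 - math.exp(-mean_mutations_per_generation)


# About one base mutation per generation, as stated in the text.
print(f"{fraction_mutated(1.0):.2f}")
```

Under this assumption, about 63% of all newly formed genomes would differ from their parent in at least one position, which illustrates why resistant variants are selected so quickly.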
Tuberculosis is also reemerging. Resistant pathogens require the development of
new therapeutics. After the convincing success of the mosquito extermination
campaign with DDT and therapy with synthetic antimalarials, malaria is again
progressing in developing countries.
The largest problem in the therapy of tumors is the development of multidrug
resistance (MDR) during treatment. The resistance is directed not only against the
agent that induced it but simultaneously against entirely different
tumor therapeutics. This multidrug resistance is due to the overexpression of
a transporter (▶ Sects. 22.7 and ▶ 30.8), glycoprotein 170 (GP170), which can largely
eliminate structurally diverse xenobiotics from the cell. Whereas GP170 prefers
cationic substances, another transporter, the multidrug resistance-associated protein
(MRP), eliminates amphiphilic anionic substances, that is, compounds with both polar and
nonpolar character. But amphiphilic substances are also able to break the resistance
of tumor cells. Quantitative structure–activity relationships show that tumor cell
resistance to particular drugs is mainly associated with similarities in their molecular
weights, that is, the size of the inducing agent, and its lipophilicity.
22.11 Synopsis
• There is a large variety of known modes of action for drugs. Some of the most
diverse modes of action are found in anti-infective drugs. Furthermore, tumor
therapeutics exploit diverse, toxic modes of action. The goal in addressing these
modes of action in terms of a therapy is to find a pathophysiological process that
is unique, or is as unique as possible to the disease to spare healthy tissue from
damage.
• Drug resistance is an increasingly serious problem and is both an inevitable
occurrence associated with using a pharmaceutical therapy, and a consequence
of the misuse of anti-infectives. There are several mechanisms of resistance
development in bacteria (e.g., enzyme production), viruses (e.g., fast genetic
mutations), and in cancer therapy (e.g., aberrant transporter expression). These
mechanisms are not mutually exclusive.
• The issue of combination drugs is a controversial topic. Some physicians are
against them, and others are in favor of them, and both sides of the argument
have good reasons. Nonetheless, some drug combinations are justifiable and
help with compliance, clinical efficacy and safety.
23 Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate
Peptidases and esterases are hydrolytic enzymes; 2–3% of all gene products are
assigned to this group alone. They are therefore an important group of target
proteins for the design of new medicines and have a special importance
for structure-based drug design. This is reflected in the fact that about 14% of
all known human peptidases are presently being investigated as possible target
structures for drug therapy.
The function of these enzymes is the cleavage of peptide or ester bonds for
which a nucleophile is needed for the attack on the carbonyl group of the amide or
ester bond to be cleaved. A large number of proteins use the OH or SH groups of
a serine, threonine, or cysteine for this purpose. In the following chapters, we will
see other cleaving enzymes that use a different mechanism. During the cleavage
reaction of the hydrolases discussed in this chapter, a temporary covalent bond
between substrate and enzyme is formed. This intermediate, the so-called
acyl–enzyme form, occurs with serine, threonine, and cysteine proteases, but
lipases, esterases, transpeptidases, and β-lactamases also use this reaction
mechanism. The design of inhibitors for these enzymes that act via an
acyl–enzyme intermediate shall be discussed. In the following two chapters,
peptidases that use a water molecule for the primary attack on the peptide bond
to be hydrolyzed shall be discussed: the aspartic and metallopeptidases.
Depending on whether they cleave the amino acid chain at the N or C terminus
or in the center, the peptidases are classified as amino-, carboxy-, or endopep-
tidases. Some of these proteases are relatively unspecific, whereas others are
highly specific and only cleave very particular substrates. These latter enzymes
offer the best chance of finding a selective therapeutic inhibitor that causes
only few side effects. Bacteria and viruses also produce their own peptidases,
the inhibition of which can be exploited for chemotherapeutic treatment.
Because these proteins are not endogenous in humans, and therefore also have no
function in us, their inhibition should lead to therapeutic success without risking
severe side effects.
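The positional classification described above can be written down as a simple rule. The following is a minimal sketch under simplifying assumptions: the peptide bonds are counted from the N terminus, only the first and the last bond count as terminal, and exopeptidases that remove more than one residue at a time are ignored. The function name and its parameters are hypothetical.

```python
def classify_peptidase(bond_index: int, chain_length: int) -> str:
    """Classify a cleavage by the position of the hydrolyzed peptide bond.

    bond_index counts peptide bonds from the N terminus, so bond 1 links
    residues 1 and 2, and bond chain_length - 1 is the last one.
    """
    if chain_length < 3:
        raise ValueError("need at least three residues to tell the classes apart")
    if not 1 <= bond_index <= chain_length - 1:
        raise ValueError("bond_index lies outside the chain")
    if bond_index == 1:
        return "aminopeptidase"      # cleaves at the N terminus
    if bond_index == chain_length - 1:
        return "carboxypeptidase"    # cleaves at the C terminus
    return "endopeptidase"           # cleaves within the chain


# Example: a hexapeptide has five peptide bonds.
for bond in (1, 3, 5):
    print(bond, classify_peptidase(bond, 6))
```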
Serine proteases are the most extensive and best-studied class of peptidases. They
are closely related to the esterases and lipases (hydrolases) that hydrolyze ester
bonds. This enzyme class serves the human body in diverse ways. Some
serine proteases, such as the digestive enzymes trypsin and chymotrypsin, cleave
a broad spectrum of peptides and proteins. Others such as the coagulation enzymes
thrombin and factor Xa are highly selective and only cleave very particular
substrates. Frequently, proteases are expressed in a non-active precursor form, the
so-called zymogens. To transform these into their active form, in many
cases sequence segments of the zymogen polypeptide chain are cleaved that
otherwise serve as endogenous inhibitors of the activated enzyme. The release of
the active form can either occur by autocatalysis (e.g., trypsin) or by other activat-
ing proteases (e.g., the coagulation cascade). An active site serine side chain plays
a decisive role in the catalytic mechanism of serine proteases, esterases, and lipases.
It is characterized by an extraordinarily high chemical reactivity. In chymotrypsin,
only this serine reacts with diisopropylfluorophosphate (DFP), whereas 27 other
serine residues in the enzyme remain unmodified. Upon chemical transformation
with DFP, the enzyme completely loses its catalytic activity.
The digestive enzyme chymotrypsin was the first serine protease for which the
3D structure was determined, by David Blow in Cambridge, England. The num-
bering of the amino acids in serine proteases of the chymotrypsin type is based on
the sequence of chymotrypsin. The spatial structures of a large variety of serine
proteases are now available, of which a few are listed in Table 23.1. The structures
show an extraordinarily pronounced similarity in the active site even for proteases
that have entirely different folding patterns (▶ Sect. 14.7, compare trypsin with
subtilisin). This so-called catalytic triad of Ser–His–Asp is characteristic of serine
proteases. In some of these enzymes, the aspartate can be replaced by a glutamate
whereas some transpeptidases and β-lactamases display a lysine in place of the
histidine in the active site.
As these three amino acids are very far apart from one another in the sequence,
the protein must fold appropriately to bring the three side chains into spatial
proximity to one another. The catalytic serine, found at position 195 in the
trypsin-like proteases, carries out the actual attack on the amide bond that is being
cleaved (Fig. 23.1). The oxygen atom of an unactivated hydroxyl group would not
be reactive enough for this step. Its nucleophilicity, which describes its tendency to
attack an electron-poor carbonyl carbon atom, is enhanced by the neighboring
histidine side chain. The imidazole side chain of this histidine can accept
a proton from the serine hydroxyl group, enabling a nucleophilic attack of the now
negatively charged oxygen atom on the partially positively charged carbon atom of
23.2 Structure and Function of Serine Proteases
Table 23.1 Serine proteases with physiological importance (X = arbitrary amino acid). The
3D structures of all listed enzymes are known.
Enzyme         Cleavage site            Function or therapeutic approach
Trypsin        Arg–X, Lys–X             Digestive enzyme
Chymotrypsin   Tyr–X, Phe–X, Trp–X      Digestive enzyme
Elastase       Val–X                    Tissue degradation
Thrombin       Arg–Gly                  Blood coagulation
Factor Xa      Arg–Ile, Arg–Gly         Blood coagulation
Factor VIIa    Arg–Ile                  Blood coagulation
Tryptase       Arg–X                    Asthma
Matriptase     Arg–X                    Oncology
Urokinase      Arg–X                    Oncology
DPP IV         Ala–X, Pro–X             Diabetes
Furin          Arg–X                    Viral infection
Fig. 23.1 Catalytic mechanism of serine proteases. (a) The peptide substrate binds to the enzyme
in specific pockets on either side of the cleavage site. (b) The oxygen atom of the serine side chain
carries out a nucleophilic attack. This is fostered by the neighboring histidine side chain, which,
supported by an aspartate residue, accepts a proton from the hydroxyl group. (c) The transition
state collapses with formation of an acyl–enzyme intermediate. (d) This is hydrolyzed by the
attack of a water molecule to release the N-terminal cleavage product.
the amide carbonyl group. The neighboring aspartate can accept a proton from the
histidine imidazole ring, and release it again. In this way, it compensates the
positive charge that is formed on the histidine residue. To stabilize the transition
state formed upon attack on the carbonyl group, serine proteases have another
characteristic structural motif, the so-called oxyanion hole. This is a small pocket
next to the side chain of Ser195 composed of two main-chain NH groups (Fig. 23.1).
In a few cases, the terminal amide groups of asparagine or glutamine can
accomplish this task. The function of the oxyanion hole is to stabilize the negative
charge formed on the tetrahedral transition state and to distort the geometry of
the attacked carbonyl carbon atom from a trigonal-planar to a tetrahedral config-
uration. The formed transition state collapses with release of the C-terminal cleavage
product which carries a free amino group at its end. The N-terminal cleavage
product remains covalently bound to the protease to give an acyl–enzyme
intermediate. In a subsequent step, a nucleophilic attack by a water molecule
again leads to a tetrahedral transition state. This finally collapses with release
of the N-terminal cleavage product. The catalytic enzyme is then ready for the next
transformation.
What happens if the amino acids serine, histidine, and aspartic acid of the
catalytic triad of a serine protease are individually or collectively exchanged for
amino acids without similar functional groups? In 1988, Paul Carter and James
Wells prepared various mutants of the bacterial serine protease subtilisin
(▶ Sect. 14.7) at Genentech. Exchange of the catalytic serine or histidine for alanine
leads to a reduction in the catalytic activity by more than six orders of magnitude.
Surprisingly, exchange of the aspartic acid, the only function of which is to
exchange a proton with histidine, reduced the catalytic activity by more than four
orders of magnitude. The combined exchange of multiple amino acids of the
catalytic triad led to no further reduction in the catalytic activity. The threefold-
alanine mutant, in which the catalytic triad is completely removed, still cleaves the
peptide substrate more than 1,000 times faster than the pure buffer solution! The
remaining substrate-binding sites and the oxyanion hole, the structure and properties
of which stabilize the tetrahedral transition state, are responsible for this
acceleration.
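The mutant data allow a rough bookkeeping of where the catalytic power comes from. The sketch below is a back-of-the-envelope estimate that uses only the lower bounds quoted above, and it assumes that the triple-alanine mutant is about as active as the single mutants, as the text states ("no further reduction"). The names are hypothetical.

```python
import math

# Lower bounds quoted in the text, expressed as orders of magnitude (log10).
TRIAD_CONTRIBUTION = 6.0                   # activity lost on removing Ser or His
RESIDUAL_ACCELERATION = math.log10(1000)   # triple-alanine mutant vs. buffer


def total_acceleration_orders(triad: float, residual: float) -> float:
    """Rate factors multiply, so their logarithms add."""
    return triad + residual


total = total_acceleration_orders(TRIAD_CONTRIBUTION, RESIDUAL_ACCELERATION)
print(f"wild type vs. buffer: at least 10^{total:.0f}-fold")
```

On this estimate, the intact enzyme accelerates the cleavage by at least nine orders of magnitude, of which about six are attributable to the catalytic triad and at least three to substrate binding and the oxyanion hole.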
Now it is certainly not difficult to destroy the binding site of an enzyme or its
catalytic activity. It is, however, more difficult to purposefully alter its specificity
or function. The subtilisin mutants in which the histidine was exchanged for an
alanine cleave substrates with the sequence -Phe–Ala–X–Phe- (X = Ala or Gln,
for example) six orders of magnitude more slowly than the unaltered subtilisin,
with one exception: A substrate with the sequence -Phe–Ala–His–Phe- is cleaved
only four orders of magnitude more slowly. The histidine of the substrate
takes over the role of the histidine in the catalytic site to a certain extent!
This process is called substrate-supported catalysis. The transformation is indeed
still rather slow, but the specificity of this mutant is distinctly enhanced:
The -Phe–Ala–His–Phe- sequence is cleaved 200 times faster than any of the
other -Phe–Ala–X–Phe- sequences.
23.3 The S1 Pocket of Serine Proteases Determines Specificity
Proteases recognize polypeptide chains as substrates. For this task they use a series
of more-or-less-pronounced binding pockets on their surface, as described in
▶ Chap. 14, “Three-Dimensional Structure of Biomolecules”. These are structur-
ally and electronically complementary to the side chains of the substrate. As
a consequence, the polypeptide chain of the substrate will be immobilized on the
surface in the vicinity of the catalytic site. The crevices on the surface look very
different depending on the protease. Surface portions of four different serine pro-
teases from the trypsin family are shown in Fig. 23.2. A comparison of the different
Fig. 23.2 The surfaces of the trypsin-like serine proteases trypsin, thrombin, factor VIIa, and
factor Xa display deep pockets in the area of the catalytic site. To emphasize this surface
structure better, the color of the surface changes from blue to green to red with increasing
depth. The exposed physicochemical properties in the crevices determine the substrate selectivity
of the protease. The preferred cleavage sequences are indicated in the structures, in which XXX
represents an arbitrary amino acid at this position.
Fig. 23.3 Comparison of the S1 pockets of chymotrypsin, trypsin, and elastase. The binding
pocket of chymotrypsin is tailored for large, lipophilic side chains. The S1 pocket of trypsin binds
amino acids with positively charged side chains through its negatively charged Asp189 residue.
Because of the space-filling side chains of Val216 and Thr226, elastase has a relatively
small S1 pocket and therefore binds small hydrophobic amino acids such as alanine and valine.
serine proteases with different substrate specificities (Fig. 23.3) shows that, above
all, the structures of the S1 pockets of these enzymes are different. The S1 pocket is
largely formed by the sequence segments 189–195 and 214–220. Significant
differences are found in the side chains of the amino acids at positions 189, 216,
and 226. In chymotrypsin, these are Ser189, Gly216, and Gly226. They tailor the
depth and form of this pocket to accommodate the aromatic side chains of the
amino acids phenylalanine, tyrosine, and tryptophan. Correspondingly, chymotryp-
sin preferentially cleaves peptide chains after one of these three amino acids.
Trypsin also has a deep, spacious S1 pocket that is flanked by Gly216 and
Gly226. The negatively charged carboxylate group of Asp189 on the floor of the
pocket is decisive for the recognition of long, positively charged side chains in the
amino acids lysine and arginine in the substrate. In elastase, the S1 pocket is shaped
by the amino acids Val216 and Thr226. Because of this, the pocket is significantly
smaller. It can only accommodate amino acids with short hydrophobic side chains
such as alanine and valine. Amino acids with large groups are no longer accom-
modated. The amino acid 189, a serine, is buried. The substrate specificity of the
described serine proteases is primarily achieved by the recognition of the amino
acid in the P1 position. The neighboring pockets, however, are also important for
substrate binding and selectivity. It is remarkable that the substrate-binding pockets
of serine proteases that recognize the N-terminal part of the substrate (unprimed side,
S1–S4 pockets; ▶ Sect. 14.5) are more prominently established. The pockets on
the primed side, which anchor the C-terminal part of the substrate, are much less
well developed. Because the N-terminal cleavage product remains temporarily
covalently bound to the protease as an acyl–enzyme complex, this part of the
substrate is bound particularly selectively.
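The P1 preferences described above can be turned into a small cleavage-site predictor. This is a deliberately simplified sketch: it looks only at the P1 residue, uses one-letter amino acid codes, and takes the preferred residues from the text (Arg and Lys for trypsin; Phe, Tyr, and Trp for chymotrypsin; Ala and Val for elastase). The influence of the neighboring pockets is ignored, and all names are hypothetical.

```python
# Preferred P1 residues (one-letter codes), taken from the text above.
S1_PREFERENCE = {
    "trypsin": set("RK"),
    "chymotrypsin": set("FYW"),
    "elastase": set("AV"),
}


def cleavage_sites(sequence: str, enzyme: str) -> list[int]:
    """Return 1-based positions of P1 residues after which the chain is cut.

    The last residue is skipped because no peptide bond follows it.
    """
    preferred = S1_PREFERENCE[enzyme]
    return [i + 1 for i, res in enumerate(sequence[:-1]) if res in preferred]


def fragments(sequence: str, enzyme: str) -> list[str]:
    """Split the sequence at every predicted cleavage site."""
    bounds = [0] + cleavage_sites(sequence, enzyme) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Gly-Val-Arg-Gly-Pro-Arg written in one-letter code.
peptide = "GVRGPR"
for enzyme in S1_PREFERENCE:
    print(enzyme, cleavage_sites(peptide, enzyme), fragments(peptide, enzyme))
```

With this rule set, the example peptide is cut after the arginine in position 3 by the trypsin-like specificity, after the valine in position 2 by the elastase-like specificity, and not at all by the chymotrypsin-like specificity.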
These structural characteristics establish how a conceivable competitive inhib-
itor of a serine protease should look: It is decisive that the S1 pocket is filled as well
as possible. The chemical constitution of the parts of the inhibitor that bind in this
region must be complementary to the S1 pocket. In some cases, the occupancy of
the S1 pocket alone is sufficient to generate a selective serine protease inhibitor with
[Fig. 23.4: chemical structures (not reproduced), including compounds 23.5 and 23.6; thrombin Ki = 6.5 μM]
Table 23.2 Reactive groups that can covalently react with the catalytically active serine.
Inhibitor type   Functional group
Irreversible     Chloromethylketone        –COCH2Cl
                 Sulfonylfluoride          –SO2F
                 Ester (a)                 –COOR
                 Boronic acid (a)          –B(OR)2
Reversible       Aldehyde                  –CHO
                 Ketone                    –COR (R = alkyl, aryl)
                 Trifluoromethylketone     –COCF3
                 α-Ketocarboxylic acid     –COCOOH
                 α-Ketoamide               –COCONHR
                 α-Ketoester               –COCOOR
(a) Reversible as well as irreversible examples are known.
Fig. 23.5 Crystal structure of the inhibitor cyclotheonamide with thrombin. The inhibitor forms
a covalent bond to the catalytic serine with its α-keto group to form a hemiketal structure. The now
negatively charged oxygen is stabilized by two hydrogen bonds in an oxyanion hole.
Fig. 23.6 General binding mode of a peptide chain that is to be cleaved (gray carbon atoms) in
the catalytic site of a serine protease. The amide bond to be cleaved is shown in yellow.
The substrate’s P1 (light-blue) and P2 groups (green) are shown with a surface; they bind in the
S1 and S2 pockets of the protein. Two antiparallel-oriented hydrogen bonds (green) are formed to
the main chain. The H-bonds to the oxyanion hole are in purple, and the direction of the
nucleophilic attack of the Ser195 oxygen on the carbonyl carbon is indicated in blue.
The serine protease thrombin plays a central role in the control of blood coagula-
tion. Thrombin is at the end of a complex, highly regulated cascade of serine
proteases. An injury to the arterial vascular system brings
membrane-bound tissue factor, which is found outside the vessel, into contact
with the precursor of the serine protease factor VII in the blood. The precursor is
activated to factor VIIa and induces the coagulation cascade. Different factors are
released along the cascade, each activated from its zymogen form by the protease of
the previous step. Finally, the cascade leads to the release of
“von Willebrand factor,” which binds to thrombocytes, and in doing so initiates
the formation of a blood clot. In addition to extrinsic activation, there is also an
intrinsic coagulation pathway. It is initiated by reduced blood flow or pathologically
altered vasculature. In this case, the coagulation cascade is started to form a platelet
aggregate, which is then stabilized by a fibrin network. Factor X is found in one of
Table 23.3 Relative binding affinity of tripeptide aldehydes to thrombin. Arg–H denotes the
aldehyde obtained by reducing the carboxylic acid of arginine. The larger the value of
the relative inhibition, the more strongly the inhibitor binds to thrombin.
Peptide               Relative inhibition
Gly–Val–Arg–H         1
Gly–Pro–Arg–H         9
Phe–Pro–Arg–H         57
D–Ala–Pro–Arg–H       469
D–Val–Pro–Arg–H       1273
D–Phe–Pro–Arg–H       7370
the last steps in which the two pathways merge. All of the different steps involve
proteases that represent conceivable target structures for a drug therapy. So far,
development efforts have focused in particular on the enzymes thrombin, factor Xa,
and factor VIIa. This has already led to development candidates and marketed
products for the first two.
Thrombin transforms the inactive fibrinogen into reactive fibrin. It forms
a polymer together with aggregated platelets in which different blood cells are
trapped. A thrombus is formed, which is further cross-linked and stabilized
by transglutaminase factor XIII (Sect. 23.8). This is an essential protective
mechanism of the body to ensure wound closure. In particular diseases or
situations, for example, after surgery, after a heart attack, or to prevent stroke
in patients with atrial fibrillation, it is necessary to reduce the coagulation
capacity of the blood. For this reason, there is great interest in the development
of selective, and above all, orally available coagulation cascade inhibitors.
Thrombin cleaves fibrinogen between the amino acids arginine and glycine.
This sequence served as a starting point for the development of the first
synthetic thrombin inhibitors that therefore possessed either an Arg or an
Arg-analogous building block.
In this section, three different approaches for the development of thrombin
inhibitors shall be presented: substrate analogues, benzamidine, and structurally
significantly modified analogues.
One approach for the design of thrombin inhibitors is provided by the P3...P3′
substrate sequence Gly–Val–Arg–Gly–Pro–Arg of fibrinogen. In the early 1970s,
the Japanese group of Hamao Umezawa established that peptide aldehydes with
C-terminal arginine residues that are isolated from bacteria are potent inhibitors of
some trypsin-like serine proteases. The tripeptide aldehydes that were investigated
by Sándor Bajusz were derived from the amino acids P3–P1 or P3′–P1′, that is, the
three amino acids “before” and “after” the cleavage site. The relative binding
affinities of a few peptide aldehydes are summarized in Table 23.3. Interestingly,
the direct comparison of Gly–Val–Arg-H and Gly–Pro–Arg-H shows that a proline
in the P2 position inhibits thrombin about ninefold more strongly. The introduction
of phenylalanine instead of glycine in the P3 position leads to an additional
significant increase in the binding. Then, D-amino acids were investigated in
position P3. Surprisingly, these led to a dramatic improvement in binding affinity.
23.4 Seeking Small-Molecule Thrombin Inhibitors
This result was not expected if one considers that the substrate sequence from P5
to P1, Gly–Gly–Gly–Val–Arg, contains only achiral glycine residues in positions P5 to P3,
which lack lipophilic side chains and can hardly form any interactions that would correspond
to those of the D-Phe side chain.
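The stepwise gains discussed above follow directly from the values in Table 23.3. The sketch below simply divides the tabulated relative inhibition values by one another; the dictionary keys are plain-text spellings of the peptide names, and the helper name is hypothetical.

```python
# Relative inhibition values copied from Table 23.3.
RELATIVE_INHIBITION = {
    "Gly-Val-Arg-H": 1,
    "Gly-Pro-Arg-H": 9,
    "Phe-Pro-Arg-H": 57,
    "D-Ala-Pro-Arg-H": 469,
    "D-Val-Pro-Arg-H": 1273,
    "D-Phe-Pro-Arg-H": 7370,
}


def fold_gain(before: str, after: str) -> float:
    """Factor by which the relative inhibition rises from `before` to `after`."""
    return RELATIVE_INHIBITION[after] / RELATIVE_INHIBITION[before]


steps = [
    ("Gly-Val-Arg-H", "Gly-Pro-Arg-H"),    # Val -> Pro at P2
    ("Gly-Pro-Arg-H", "Phe-Pro-Arg-H"),    # Gly -> Phe at P3
    ("Phe-Pro-Arg-H", "D-Phe-Pro-Arg-H"),  # L-Phe -> D-Phe at P3
]
for before, after in steps:
    print(f"{before} -> {after}: {fold_gain(before, after):.1f}-fold")
```

Seen this way, the inversion of configuration at P3 is by far the largest single step, contributing a factor of roughly 130, compared with factors of 9 and about 6 for the two preceding substitutions.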
When the above-described work was carried out, the spatial structure of
thrombin had not yet been determined. Wolfram Bode and Milton Stubbs
managed to elucidate the structure of a thrombin complex with a chemically
activated fibrinopeptide, Gly–Asp–Phe–Leu–Ala–Glu–Gly–Gly–Gly–Val–Arg-CH2Cl.
This peptide corresponds to the N-terminal portion from P11 to P1 that
thrombin cleaves from fibrinogen. The comparison of this structure with that of
D-Phe–Pro–Arg-chloromethylketone (Fig. 23.7) provided an explanation for the
structure–activity relationship found by Sándor Bajusz. The S3 pocket is filled by
both ligands; in the case of the fibrinopeptide, it is achieved by the side chains of
leucine and phenylalanine in the positions P8 and P9. The peptide forms a β-turn
that enables the amino acids in this sequence to be positioned in the S3 pocket. The
same pocket is accessed by the tripeptide through the side chain of the D-amino acid
at position P3.
The compound D-Phe–Pro–Arg-H, synthesized by Bajusz, is a high-affinity
thrombin inhibitor (Ki = 75 nM). However, the compound proved to be chemically
unstable. This problem could be addressed by N-methylation of the free NH2
group. N-Methyl-D-Phe–Pro–Arg-H 23.7 (Gyki 14766/Efegatran, Fig. 23.8) is
chemically stable.
Jörg Stürzebecher and Fritz Marquardt took a different route. They pursued the
goal of achieving inhibition without a covalent attachment. Their approach was
based on the finding that benzamidine 23.1 (Fig. 23.4, Sect. 23.3) inhibits not only
trypsin (Ki = 18 μM) but also thrombin (Ki = 220 μM). The combination
Fig. 23.8 The inhibitor 23.7 (Gyki 14766, efegatran) contains an aldehyde group that binds
reversibly to Ser195. Compounds 23.8 and 23.9 are simple derivatives of the benzamidine that
non-covalently inhibit the enzyme.
of the benzamidine group with a reactive group from Table 23.2 gave potent
thrombin inhibitors. The first low-molecular-weight thrombin inhibitor that
was clinically tested in the 1970s was p-amidinophenylpyruvic acid 23.5
(Fig. 23.4, Sect. 23.3). The compound proved to be efficacious, but its selectivity
was unsatisfactory. The simple benzamidine derivatives 23.8 and 23.9 (Fig. 23.8)
are further typical representatives with micromolar affinity for thrombin, but
without selectivity compared to trypsin.
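The lack of selectivity can be expressed as a ratio of inhibition constants. The following minimal Python sketch uses the two Ki values quoted for benzamidine earlier in this section; only their ratio matters, so both simply need to be in the same unit. The function name selectivity_ratio and the convention of dividing the off-target Ki by the target Ki are illustrative assumptions, not part of the original text.

```python
def selectivity_ratio(ki_target: float, ki_off_target: float) -> float:
    """Return Ki(off-target) / Ki(target); values > 1 favor the target.

    Both constants must be given in the same concentration unit.
    """
    if ki_target <= 0 or ki_off_target <= 0:
        raise ValueError("Ki values must be positive")
    return ki_off_target / ki_target


# Ki values for benzamidine as quoted in the text (same unit for both).
KI_BENZAMIDINE_TRYPSIN = 18.0
KI_BENZAMIDINE_THROMBIN = 220.0

# Selectivity for thrombin over trypsin: a value below 1 means trypsin is preferred.
thrombin_over_trypsin = selectivity_ratio(
    KI_BENZAMIDINE_THROMBIN, KI_BENZAMIDINE_TRYPSIN
)
trypsin_over_thrombin = 1.0 / thrombin_over_trypsin

print(f"thrombin over trypsin: {thrombin_over_trypsin:.3f}")
print(f"trypsin over thrombin: {trypsin_over_thrombin:.1f}-fold")
```

With these numbers benzamidine prefers trypsin by roughly an order of magnitude, which is the starting situation that the later design work had to reverse.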
The coupling of the benzamidine groups with a peptide structure brought
significant improvement. Nα-(β-Naphthylsulfonylglycyl)-D,L-p-amidinophenylalanylpiperidide,
23.10 (NAPAP, Fig. 23.9), was the result of a more than 10-year-long
systematic search for potent and selective thrombin inhibitors. NAPAP was
the most potent representative of the class of low-molecular-weight thrombin
inhibitors (Ki = 6 nM) for a long time, but it has only modest selectivity over trypsin.
In 1989, Wolfram Bode elucidated the crystal structure of thrombin with a bound
inhibitor at the Max Planck Institute for Biochemistry in Martinsried, Germany.
Initially the structure determination was accomplished with the irreversible inhibitor
D-Phe–Pro–Arg-CH2Cl and with NAPAP shortly thereafter. The 3D-structure of the
thrombin–NAPAP complex is shown in Fig. 23.10. The racemic form was used for
the co-crystallization. The finding that the p-amidinophenylalanine binds to thrombin
as the D-amino acid was rather surprising. The substrate is composed of L-amino acids
only; it was therefore expected that p-amidinophenylalanine would also bind in the L
configuration.
The groups of the ligand that form polar interactions with the protein can be
directly deduced from the crystal structure. For NAPAP these are the glycine unit in
the center of the molecule (double hydrogen bond to the peptide backbone) and the
Fig. 23.9, structure labels: 23.11 CRC220 (Behringwerke), Ki = 6 nM, thrombin:trypsin 1:200; 23.12 (racemic), IC50 = 15 nM, thrombin:trypsin 1:600.
Fig. 23.9 The thrombin inhibitors NAPAP 23.10, CRC220 23.11 (developed at the
former Behringwerke), and 23.12, which was derived from 23.10. The two latter compounds have
distinctly better affinity to thrombin and improved selectivity relative to trypsin. The IC50 values
for 23.10 and 23.12 are given for the racemates. Inhibitor 23.11 was measured as an enantiopure
compound.
amidinium group in the S1 pocket. Omitting the positively charged
amidine group will result in a loss of binding affinity because the salt bridge to
Asp189 can no longer be formed. More recent work has shown, however, that
chloro-substituted aromatic rings can also bind in the S1 pocket and form a hydrophobic
interaction to Tyr228. Today, an arsenal of building blocks is available that can be
used as arginine side chain mimics to fill the S1 pocket of thrombin (Fig. 23.11).
With its naphthyl and piperidyl side chains, NAPAP largely fills the lipophilic S3
pocket and the spatially rather limited S2 pocket (Fig. 23.10). However, it seems as
if even larger substituents could fit in the S3 pocket. A weakness of NAPAP was its
inadequate selectivity compared to the digestive enzyme trypsin. Luckily, the
structures of NAPAP in complex with thrombin and also with trypsin are known
(Fig. 23.12). A comparison of the 3D structures shows that there is a significant
difference in the binding mode between the two enzymes in the S3 pocket that leads
Fig. 23.10 Structure of the thrombin–NAPAP complex. The most important interactions are
outlined on the left side. The positively charged benzamidine group occupies the S1 pocket and
forms a salt bridge to the negatively charged side chain of Asp189. Two hydrogen bonds are
formed to the amino acid Gly216. The piperidyl and naphthyl groups together occupy the two large
lipophilic pockets S2 and S3.
Fig. 23.11 Numerous building blocks have been developed that bind as mimetics of the
arginine side chain in thrombin’s S1 pocket.
Fig. 23.12 Comparison of the 3D structures of trypsin (left) and thrombin (right), each in
complex with NAPAP. The active site in thrombin is further narrowed by an additional loop
from above. The depth of the pocket is, once again, color-coded (see Fig. 23.2).
to a 180°-flipped orientation of the naphthyl group about the bond to sulfur. In
thrombin, the S3 pocket is more pronounced and is surrounded by multiple lipo-
philic amino acid side chains. In trypsin the top end of this pocket is open, and is
spatially hardly restricted at all. Obviously its structuring is not necessary in the
largely unspecific digestive enzyme. Therefore, the selectivity can be increased by
occupying the S3 pocket of thrombin as optimally as possible. If the thrombin–
NAPAP complex is examined in more detail, it is apparent that an additional
methoxy substituent on the naphthyl ring should be suitable to enhance selectivity.
In fact, inhibitor 23.12 binds 600-fold more strongly to thrombin than to trypsin.
Compound CRC220 (23.11, Fig. 23.9), which fills the hydrophobic S3 pocket
much better than NAPAP, was developed at the former Behringwerke in Marburg,
Germany. Because of this improved filling, CRC220 inhibits thrombin almost
200-fold more effectively than trypsin.
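A selectivity factor can be translated into a difference in binding free energy with the standard relation ΔΔG = RT ln(factor). The sketch below applies this relation to the 600-fold and 200-fold values quoted above. The temperature of 298.15 K and the function name are assumptions, and treating a ratio of inhibition constants (or IC50 values) as a ratio of dissociation constants is a simplification.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def selectivity_to_ddg(fold: float, temperature_k: float = 298.15) -> float:
    """Free-energy difference (kcal/mol) corresponding to a fold selectivity."""
    if fold <= 0:
        raise ValueError("fold selectivity must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold)


ddg_23_12 = selectivity_to_ddg(600.0)   # inhibitor 23.12, thrombin vs. trypsin
ddg_crc220 = selectivity_to_ddg(200.0)  # CRC220, thrombin vs. trypsin

print(f"23.12:  {ddg_23_12:.2f} kcal/mol")
print(f"CRC220: {ddg_crc220:.2f} kcal/mol")
```

On this scale, even a several-hundred-fold selectivity corresponds to a free-energy difference of only a few kcal/mol, which is on the order of a small number of well-placed contacts in the S3 pocket.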
Another approach to searching for thrombin inhibitors was taken by the
researchers at Hoffmann-La Roche. Initially they concentrated on optimally filling
the S1 pocket. Benzamidine was known to be a weak thrombin inhibitor that
occupies the S1 pocket. It has, however, the disadvantage that it binds more strongly
to trypsin (Fig. 23.4). Accordingly, the researchers in Basel initially sought a small
molecule that binds more strongly to thrombin than trypsin. More than 200 small
molecules were tested in this narrowly focused search. Structures were chosen only
if their functional groups were able to interact with the negatively charged side
chain of Asp189. Guanidines, amidines, and amines were investigated.
N-Amidinopiperidine (23.13, Fig. 23.13) was identified as an interesting lead
structure. In contrast to benzamidine, amidinopiperidine binds more strongly to
thrombin (Ki = 150 µM) than to trypsin (Ki = 300 µM). A systematic derivatization
led to 23.14, a moderately active thrombin inhibitor (Ki = 0.48 µM). Based on the
structural model with the protease, it appeared obvious that the replacement of the
glycine unit with a D-amino acid, for example, D-Phe, should fill a lipophilic pocket
Fig. 23.13, compound labels: 23.13, Ki = 150 µM; 23.14 (R = H), Ki = 0.48 µM; 23.15 (R = CH2Ph), Ki = 0.047 µM; 23.16 (R = CH2-(m-NO2-Ph)), Ki = 0.024 µM; 23.17 napsagatran (Roche), Ki = 0.27 nM; 23.18 ximelagatran (Exanta®, AstraZeneca), prodrug of melagatran; 23.20.
Fig. 23.13 One approach to the structure-based design of thrombin inhibitors began with 23.13 in
the S1 pocket. Compound 23.14 was derived from this lead structure. Its docking into the active
site of thrombin generated the idea for the synthesis of 23.15. The systematic variation of the side
chain R gave compounds with better binding affinity such as 23.16 and 23.17. Compound 23.17 was
tested in depth in the clinic under the name napsagatran. The compound melagatran from
AstraZeneca was introduced as the double prodrug ximelagatran 23.18, the first orally available
thrombin inhibitor on the market. It is derived from the tripeptide sequence D-Phe–Pro–Arg.
Another orally available inhibitor, dabigatran 23.19, was launched to market by Boehringer
Ingelheim. The tricyclic inhibitor 23.20, which was developed at the ETH in Zurich, forgoes
peptide character entirely.
and lead to a distinct increase in the affinity. The compound was quickly prepared
and tested. In fact, 23.15 bound tenfold more strongly to thrombin. Other D-amino
acids were then investigated and additional affinity could be achieved. High
selectivity against trypsin was also encouraging; 23.16 binds 840-fold more
strongly to thrombin than to trypsin. The surprise was great when the 3D structure
of 23.14 in complex with thrombin was determined: The compound binds differ-
ently than predicted in the binding pocket! In contrast to the original assumption,
the naphthylsulfonyl group exchanged positions with the benzyl side chain.
The incorporation of a non-proteinogenic amino acid proved to be unfavorable
from a synthetic point of view. Therefore, other central building blocks were sought
that were synthetically more easily accessible. This work finally led to napsagatran
23.17, a highly potent and exceedingly selective substance. Because it is only
intravenously applicable, however, it never found its way to a marketed product,
particularly because argatroban, a marketed product for intravenous use, was
discovered much earlier and was already available.
The search for low-molecular-weight, orally available thrombin inhibitors
intensively occupied numerous large pharmaceutical companies for many years.
It took a long time until AstraZeneca introduced Ximelagatran (23.18, Fig. 23.13)
to the market as the first orally available thrombin inhibitor. The compound is
a double prodrug of the actual active substance melagatran. Its relation to the initial
parent structures (e.g., the tripeptide sequence D-Phe–Pro–Arg) is still quite appar-
ent. The head group of the arginine residue was replaced with a benzamidine, the
five-membered ring of the proline was narrowed to a four-membered ring, and the
terminal benzyl group was shortened to a cyclohexyl ring. The N terminus was
substituted with a methylenecarboxylic acid group. It proved to be extremely difficult
to make the thrombin inhibitors adequately bioavailable and to maintain the neces-
sary plasma level over an acceptable length of time. With regard to the bioavailabil-
ity, AstraZeneca, in collaboration with the group of Bernd Clement at the University
of Kiel, pursued a double prodrug strategy: The terminal acid function was masked as
an ester, and the benzamidine group was transformed into an N-hydroxyamidine. The
release of the active substance, melagatran, in the body is made possible by ubiquitously
present esterases and a set of three reductases. AstraZeneca withdrew
ximelagatran (Exanta®) after 2 years because liver toxicity was
observed in a small number of cases after weeks of use.
Many years of thrombin research finally also led to success at Boehringer
Ingelheim. The compound dabigatran (23.19, Fig. 23.13) was introduced to the
market in the spring of 2008 for the prevention of stroke in patients with atrial
fibrillation. It also has a benzamidine anchor, and it has a pyridine group for the
hydrophobic S3 pocket. A benzimidazole building block with an attached amide
bond was chosen as a linker between these groups. As with ximelagatran, it uses
a carboxylic acid on the N terminus. It shows distinctly less peptide character than
the lead structures. A double prodrug strategy was also used for this substance to
make it adequately bioavailable. In addition to the esterification of the acid group,
the amidine group was masked as a carbamoyl moiety. The prodrug carries the
name dabigatran etexilate (Pradaxa® in the USA and Europe and Pradax® in Canada).
Fig. 23.14 Elastase inhibitors 23.21 and 23.22 (ICI 200880) are substrate analogues. Compound
23.22 is a highly active compound, but it is not orally available.
Fig. 23.15 Comparison of the binding mode of the elastase inhibitor Ac-Ala–Pro–Val-CF3
with the postulated binding mode of the pyridone moiety (e.g., 23.23, Fig. 23.16). Both
compounds should be able to form a double H-bond to Val216.
Factor Xa and VIIa occur along the coagulation cascade prior to thrombin and are
investigated as targets for antithrombotics. They both have an aspartic acid on the
bottom of their deep S1 pockets, similarly to thrombin. Moreover, a narrow
and deep S3 pocket that is flanked by aromatic amino acids (Tyr99, Trp215, and
Phe174) is specific to factor Xa. Therefore, this pocket is ideally suited for
aromatic groups on inhibitors. As already mentioned, in the mid-1990s a dogma
prevailed that the S1 pocket of trypsin-like serine proteases could only accept
groups with basic character. The binding of chloro-substituted aromatic portions,
however, could be demonstrated for thrombin at Merck & Co. in the USA. These
groups made a breakthrough for factor Xa inhibitors. Highly potent inhibitors could
be developed with chlorophenyl, chloronaphthyl, or chlorothiophene groups
Fig. 23.16, compound labels: 23.23 (R = phenyl), Ki = 5.6 nM; 23.24 (R = phenyl), Ki = 6.6 nM; 23.25 (R = p-F-phenyl), Ki = 1.6 nM; 23.26 (R = p-F-phenyl), Ki = 100 nM; 23.27 (R = p-NH2-phenyl), Ki = 15 nM; 23.28 ONO-6818; 23.29 sivelestat (ONO-5046).
Fig. 23.16 Design of orally available elastase inhibitors at Zeneca. The original idea to replace
the Ala–Pro unit with a pyridone afforded 23.23. Later, pyrimidinones, in which an additional
nitrogen atom has been added to the heterocycle, were predominantly investigated. Very potent compounds (e.g.,
23.25) are found in this class. Compound 23.26 has the best in vivo properties. The p-fluorophenyl
group (in 23.26) or the p-aminophenyl group (in 23.27) increases the lipophilic contact with the
enzyme. The compound ONO-6818 23.28 was developed in Japan all the way to clinical trials, where
it was discontinued due to abnormally elevated liver values in the treated patients. Another compound,
23.29, was clinically tested under the name sivelestat (ONO-5046). These compounds specifically
transfer an acyl group to the catalytic serine and reversibly block the enzyme.
protruding into the S1 pocket. So much affinity was gained by the accommodation
of additional groups in the deep, aromatic S3 pocket that these compounds bind to
the protease with single-digit-nanomolar values without the benzamidine anchor.
Such a derivative was introduced by AstraZeneca (23.30, Fig. 23.19). In addition to
the development of compounds with chloro-substituted aromatic rings for the S1
pocket, other inhibitors with benzamidine groups were also synthesized as factor
Xa inhibitors. However, it is much more difficult to achieve adequate selectivity
compared to other trypsin-like serine proteases with these derivatives. Furthermore,
they showed similar problems with regard to bioavailability as the thrombin
inhibitors. Bayer HealthCare introduced a new factor Xa inhibitor to the market
in September 2008, rivaroxaban (Xarelto®; 23.31, Figs. 23.18 and 23.19), which places
a chlorothiophene group in the S1 pocket.
Other companies are working on inhibitors with comparable chloroaromatic
groups to fill the S1 pocket. The subnanomolar-binding inhibitor apixaban 23.34
from Bristol-Myers Squibb has recently been approved for market. It forgoes the
halogen group for the interaction in the S1 pocket completely. As a crystal structure
shows, the p-methoxy group displaces the water molecules that are almost always
in the S1 pocket. The previously mentioned compounds that lack a benzamidine
group show better bioavailability and good selectivity for factor Xa. Attempts were
also undertaken to develop dual inhibitors for thrombin and factor Xa. By inhibiting
both proteins together, not additive, but rather synergistic antithrombotic effects are
achieved that hopefully expand the therapeutic scope.
Factor VIIa is at the beginning of the extrinsic path of the coagulation cascade.
This enzyme also belongs to the family of trypsin-like serine proteases, and specific
inhibitors have been sought for this enzyme for years. In this case, the activation of
the protease is interesting. In cases of injury, blood comes into contact with tissue.
When this happens, factor VIIa and the membrane-bound tissue factor can form
a complex that causes a change in the conformation of the protease’s catalytic
domain. A peptide segment next to the catalytic center goes from being in an
unfolded conformation into a helical structure. This leads to a change in the
geometry of the catalytic site. Only in the complexed state does the protease have
Fig. 23.19 Three potent factor Xa inhibitors. The chloroaromatic rings of the two first examples
bind in the S1 pocket of the enzyme. Compound 23.30 was developed by AstraZeneca.
Rivaroxaban 23.31 was introduced to the market in 2008 by Bayer as the first orally available
factor Xa antithrombotic. Apixaban 23.32 from BMS binds at the subnanomolar level with its
methoxy-substituted aromatic ring in the S1 pocket of factor Xa.
Whether the viruses can be activated depends on the availability of the ubiquitously
occurring furin, and this is a prerequisite for the high pathogenic potential of the avian
influenza viruses. Other genetic combinations or prerequisites must be fulfilled to
convert these viruses to dangerous pathogens for animals and humans. Inhibitors of
furin could then contribute to the suppression of “arming” of these viruses. However,
the translation of such highly charged substrates into inhibitors that meet the common
rules for good bioavailability is a tremendous challenge.
In the early 1990s an interesting observation was made that the incretin
hormones GIP and GLP-1, which stimulate the pancreas to release insulin after
eating, are substrates for dipeptidylaminopeptidase IV (DPP IV). They are quickly
Compound labels from the figures on this page: 23.35 saxagliptin (BMS-47718); 23.36 (S)-rivastigmine; 23.37 (X = O) paraoxon; 23.38 (X = S) parathion, E605.
Fig. 23.21 (S)-Rivastigmine 23.36 transfers a carbamoyl group to the catalytic serine in the
binding pocket of acetylcholine esterase and blocks its function because the carbamoyl–esterase
complex decomposes very slowly. The acetylcholine esterase inhibitors paraoxon 23.37,
parathion 23.38, propoxur 23.39, and malathion 23.40 are phosphoric acid, thiophosphoric
acid, or carbamic acid esters and are used as insecticides. They also react with the catalytic serine
and form a stable covalent bond.
Fig. 23.23, compound labels: 23.42 (R = H), aminopenicillic acid; 23.43 (R = PhCH2(C=O)), penicillin G; 23.44 (R = PhOCH2(C=O)), penicillin V; 23.45.
Fig. 23.23 In the last step of the bacterial cell wall synthesis, a glycopeptide transpeptidase
cleaves the bond between two D-Ala–D-Ala groups and forms a new bond between D-Ala
and a glycine in a peptidoglycan strand. Lactam antibiotics of the penicillin (23.42–23.44) or
cephalosporin type (23.45) can block this step. The penicillin scaffold (green) is reminiscent of the
D-Ala–D-Ala group (orange) and is bound analogously by the enzyme. An irreversible inhibition of
the transpeptidase is achieved by a nucleophilic opening of the lactam ring with the help of the
catalytic serine.
and an irreversible covalent coupling to the enzyme results. The cross-linking of the
glycan strands is prevented, and the newly synthesized cell wall does not achieve
adequate stability. It cannot withstand the osmotic pressure of the cell contents, and
the bacterial cell is killed as a consequence.
Of the first penicillins that were discovered by Alexander Fleming (▶ Sect. 2.4),
only benzyl 23.43 and phenoxymethylpenicillin 23.44 still have clinical importance
(Fig. 23.23). The residues on the 6-amino group of penicillic acid were exchanged
to improve the pharmacokinetics, activity spectrum, and acid stability. Electronegative
atoms on the α-carbon atom of the acyl function increase the stability with
respect to acid-catalyzed decomposition and contribute to an improvement in the
oral bioavailability.
Bacteria quickly develop resistance to penicillins. They use lactamases, which
are enzymes that are structurally related to the transpeptidases. Four classes of
lactamases are known, of which three have a catalytic serine in their active sites.
A further class belongs to the zinc-dependent metalloenzymes (▶ Chap. 25, “Inhibitors
of Hydrolyzing Metalloenzymes”). The catalytic serine of the β-lactamases is also
acylated by penicillins and related cephalosporins (Fig. 23.23). Up
until this step, the mechanism in the transpeptidases and the β-lactamases is identical.
However, transpeptidases form very stable acyl–enzymes, whereas the covalent
intermediate of the β-lactamases is quickly hydrolyzed. The antibiotic that was meant to
deactivate the transpeptidase is thereby rendered inactive. β-Lactamases are probably
descendants of the transpeptidases. They are widespread in nature and have
evolved out of the competition between bacteria and molds. The resistance gene for
β-lactamases is easily transferred between bacteria because the information is stored
on an extrachromosomal plasmid. Such plasmids are transferred very quickly.
How are the β-lactamases different from the transpeptidases so that they are able
to quickly dispose of the covalently bound ring-opened penicillin? The release
requires a hydrolytic cleavage from the protein. For this, a well-placed water
molecule is needed in the active site that can initiate the nucleophilic attack on
the acyl–enzyme species. Although the structural architecture of transpeptidases
and β-lactamases is very similar, the sequence identity is small. Nonetheless,
a transpeptidase has been equipped with the hydrolytic properties of a lactamase
by selective mutagenesis! Only a few amino acid exchanges were needed for this.
Above all, hydrophobic amino acids such as phenylalanine and tryptophan are
the ones that protect the acyl–enzyme complex from hydrolysis in the
transpeptidase. In contrast, polar amino acids such as glutamic acid (Fig. 23.24,
Glu166) are found in the same positions in the lactamases. Unlike the
transpeptidase’s hydrophobic amino acids, these anchor and activate a water molecule
in the correct orientation for nucleophilic attack on the acyl–enzyme complex
in the lactamases. As a result, the covalent complex with the penicillin cleavage
product that was formed by ring opening is hydrolyzed in the lactamases, but it
remains stable in the transpeptidases.
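The kinetic consequence of this difference can be illustrated with a two-state cycle in which free enzyme is acylated with a pseudo-first-order rate constant and the acyl–enzyme is hydrolyzed with a deacylation rate constant. The following Python sketch is only an illustration under stated assumptions: the rate constants are hypothetical placeholders rather than measured values, and the non-covalent binding step is ignored.

```python
def acylated_fraction(k_acylation: float, k_deacylation: float) -> float:
    """Steady-state fraction of enzyme trapped as acyl-enzyme.

    Model (an assumption for illustration): E -> E-acyl with rate k_acylation,
    E-acyl -> E with rate k_deacylation, both in 1/s.
    """
    if k_acylation < 0 or k_deacylation < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_acylation + k_deacylation
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_acylation / total


# Hypothetical rate constants (1/s), chosen only to contrast the two cases.
K_ACYLATION = 1.0
K_DEACYLATION_TRANSPEPTIDASE = 1e-4  # stable acyl-enzyme, hydrolysis very slow
K_DEACYLATION_LACTAMASE = 1e3        # activated water, hydrolysis very fast

blocked_transpeptidase = acylated_fraction(K_ACYLATION, K_DEACYLATION_TRANSPEPTIDASE)
blocked_lactamase = acylated_fraction(K_ACYLATION, K_DEACYLATION_LACTAMASE)

print(f"transpeptidase acylated: {blocked_transpeptidase:.4f}")
print(f"lactamase acylated:      {blocked_lactamase:.4f}")
```

In this simplified picture, slow deacylation leaves nearly all of the transpeptidase blocked, whereas fast deacylation keeps almost all of the lactamase free to turn over further antibiotic molecules.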
How can this lactamase-caused resistance be broken, and the degradation
of penicillins stopped? Unsubstituted penicillic acid 23.46 is quickly cleaved by
TEM-1 β-lactamase (Fig. 23.24). Based on structural considerations it was proposed
that a hydroxymethyl group should be added to the 6-position. This group should be
located in exactly the position from where the water molecule would start its
nucleophilic attack on the acyl–enzyme form. In fact, derivative 23.47 inactivates
TEM-1 β-lactamase. A water molecule was detected in the vicinity of the CH2OH
group in the subsequently determined crystal structure, but it is too far away to
successfully hydrolyze the acyl–enzyme. The hydroxyl group therefore blocks the
attack of a water molecule on the ester carbonyl group of the acyl–enzyme.
can easily become chronic and can lead to severe liver damage as well as liver
cirrhosis and hepatocellular carcinoma. Orally bioavailable inhibitors of these
serine proteases such as telaprevir from the company Vertex have recently been
launched to market.
The carboxy serine peptidases represent another group that is folded analogously
to subtilisin (▶ Sect. 14.7). They have a triad of serine, glutamate, and aspartate.
A member of this family was recently found to be encoded by the human CLN2 gene. Mutations
in this gene lead to severe neurodegenerative diseases. An oxyanion hole is also
found in this enzyme, to which, interestingly, an aspartate contributes. It is only in the
protonated state, however, that it can fulfill its task as a hydrogen-bond donor and
negative-charge stabilizer in the transition state. Because the enzymes of this family
are active in a pH range of 3–5, the requirements for being protonated are fulfilled.
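How strongly protonation depends on pH can be estimated with the Henderson–Hasselbalch relation. The sketch below is an assumption-laden illustration: the text gives no pKa for the aspartate in the oxyanion hole, so the two pKa values used here are hypothetical, and a simple one-site titration is assumed.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a monoprotic acid (e.g., a carboxylic acid side chain) that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa values: one near that of a free carboxylic acid side chain,
# one shifted upward, as can happen in a protein environment.
for pka in (4.0, 6.0):
    for ph in (3.0, 5.0, 7.0):
        fraction = protonated_fraction(ph, pka)
        print(f"pKa {pka:.1f}, pH {ph:.1f}: {fraction:.3f} protonated")
```

The comparison shows why an acidic working range matters: the lower the pH relative to the effective pKa of the aspartate, the larger the fraction that is available as a hydrogen-bond donor.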
There may be many more cleavage enzymes that use a catalytic serine still to be
discovered. We can only wait and see which of the discovered peptidases will be
singled out for pharmaceutical development. Their catalytic machinery adopts the
same spatial architecture in all examples. Therefore, the general principles can be
transferred between the individual members of the family.
Aside from serine, another amino acid also carries an aliphatic OH group: threonine.
This amino acid can also be catalytically active in a protease. The proteasome
represents the central protein-shredding machine for the cell and cleaves proteins
that have been marked with ubiquitin into small oligopeptides containing between
3 and 20 amino acids. The ubiquitin label is itself a highly conserved protein with
76 amino acids. As a cellular shredding machine, the proteasome plays a central
role in the metabolism of proteins, cell growth, and cell demise. Therefore it is an
important target structure for the treatment of cancer. It is a multiprotease complex
composed of more than 30 proteins and is found in the cytoplasm as well as the
nucleus (Fig. 23.26). The proteasome is constructed like a large barrel with two lid
regions that have regulatory function; these regions control the entry of substrates
into the shredder. A threonine is found in the catalytic sites of the proteases, which
have chymotrypsin-like, trypsin-like, and peptidyl-glutamyl-peptide-like substrate
specificity. The OH group of this threonine adopts the role of the nucleophile.
A neighboring, positively charged lysine and a balancing aspartate reinforce its
nucleophilic strength. Because the threonine is the first amino acid at the
N terminus, it carries a free amino group. This serves as a proton acceptor in the
mechanism. Two serines and one aspartate group contribute to the stabilization of
the transition state and complement the nucleophilic center.
The Millennium Pharmaceuticals company, which was founded as an academic
research institute, introduced bortezomib 23.51 (Velcade®) to the market in 2006;
this was the first active substance that blocks the threonine protease function of the
proteasome. Chemically, bortezomib is a boronic acid derivative (Fig. 23.26). The
inhibitor reacts with the threonine of the catalytic triad to create a covalent
Fig. 23.26 The proteasome, a cellular shredding machine, proteolytically cleaves ubiquitinylated
proteins selectively into small oligopeptides that have between 3 and 20 amino acids. The crystal
structure of the 20S proteasome from yeast (subunits are shown in different colors) is shown at the
left. Six of these units are inhibited by bortezomib (yellow). The boronic acid derivative
bortezomib 23.51 (right, gray) reacts with the N-terminal Thr1 and forms a covalent boronic
acid ester complex.
myeloma, the plasma cells produce massive amounts of misfolded proteins that
must be digested by the proteasome. Therefore these cells need a proteasome
that functions optimally; otherwise apoptosis would be induced. Blocking the
proteasome function is therefore desirable for such cells. Moreover these cells
are significantly more sensitive to bortezomib therapy than normal cells. Some
tumor cells also activate a transcription factor, NF-κB, which controls the proliferation
and survival of the tumor cells. The proteasome is critical for the activation
of NF-κB because it degrades an inhibitor of this transcription factor that acts as
a kind of emergency brake on NF-κB. Therefore the inhibition of the proteasome
serves to keep NF-κB in its benign form, because its inhibiting binding partner is no
longer being degraded. It is possible that bortezomib induces apoptosis of tumor
cells in that it stabilizes cyclin-dependent kinase inhibitors (▶ Sect. 26.2) as well as
the tumor-suppressor protein p53.
Interestingly, a protease was discovered in bacteria that exists as a 14mer and the
spatial structure of which is reminiscent of the proteasome. The ClpP-protein is
a serine protease that is involved in the degradation of cellular proteins in
bacteria. Treatment with a macrolide antibiotic can cause its function to run out
of control and degrade proteins in an unregulated way. This leads to cell death in the
bacteria. This principle was recognized by the company Bayer and exploited for an
antibiotic therapy. The goal was not to block the protease function of the ClpP-
protein, but rather to promote its uncontrolled effects through synthetic antibiotics.
In addition to the OH group of serine and threonine, the thiol group of cysteine is
able to carry out a nucleophilic hydrolytic attack on amide bonds. These enzymes
possess a catalytic triad, analogously to the serine proteases, and are termed
cysteine proteases. The first protease of this family to have been structurally
investigated in detail is papain, which is isolated from the latex of the papaya, the
fruit of the papaya tree (Carica papaya). Its triad is composed of a nucleophilic
cysteine as well as a histidine and an asparagine. The asparagine adopts the role of
the aspartate in the serine protease. The catalytic mechanism is comparable to that
of serine proteases. Even the oxyanion hole (Cys25 and Gln119) is found in pro-
teases from the papain family. There are indications that the transition state is
structurally similar to the acyl–enzyme intermediate. An attempt has been made to
exchange serine for cysteine in trypsin. The binding properties for the substrate (Km
value) remained virtually the same, but the catalytic rate of the reaction decreased
by five orders of magnitude. Even though the structures are geometrically nearly
unchanged, the experiment shows that the difference between serine and cysteine
proteases is more complicated than a simple exchange of sulfur for oxygen. The
fine-tuning of the structural and electronic properties is pivotal. In contrast to the
trypsin-like serine proteases, the nucleophilic cysteine exists as a preformed ion
pair with its neighbor, histidine.
Table 23.4 Cysteine proteases with physiological importance (X = arbitrary amino acid). The
3D structures of all of the listed enzymes are known.

Enzyme                        Cleavage site     Function or therapeutic use
Papain                        –Val–X–X–         Model botanical enzyme from papaya
Cathepsins B, L, K, M         –Arg–X–           Inflammation
                              –Gly–X–           Tumor metastasis
                              –Ser–X–           Muscular dystrophy
                              –Tyr–X–           Myocardial infarct
Calpains                      –Lys–Ser–         Stroke
                              –Arg–Thr–         Neuroprotection
                              –Tyr–Ala–         Cataract
Falcipain                     –Arg–Lys–         Malaria
                              –Lys–X–
Cruzipain                     –Lys/Arg–         Sleeping sickness
                              –Phe/Ala–
Caspases                      –Asp–X–           Rheumatoid arthritis, apoptosis, sepsis
Picornavirus 3C-proteinase    –Gln–X–           Viral infection
SARS-main proteinase          –Gln–Ser/Ala      Viral infection
different pathological conditions that are associated with tumor disease, disruption
of the immune system, or neurodegenerative damage. Inhibitors of different
caspases have potential as neuroprotective agents, as active substances for tumor
therapy, or for the treatment of rheumatoid arthritis.
The third family includes the viral 3C proteases, which occur in picornaviruses
(human rhinovirus, poliovirus, or hepatitis viruses) or coronaviruses (SARS).
These viral proteases process the primary polypeptide chain and generate the
specific viral proteins during maturation. Inhibitors of these proteases represent
a concept for antiviral chemotherapy.
A special feature associated with the papain-type proteases is the stereochemis-
try of the nucleophilic attack. In contrast to other serine and cysteine proteases, the
attack occurs from the opposite side, the so-called Si face. The S1 pocket in papain
is not prominent, and the P1 group of the substrate is oriented away from the protein.
In contrast, all of the neighboring pockets are much more prominent. Interestingly,
some of the pockets on the C-terminal side (the primed side, S1′–S4′) of cysteine
proteases are strongly structured. This can be exploited for the design of potential
inhibitors. Papain prefers substrates with hydrophobic P2 and P3 groups. An aspar-
tate is recognized as a P1 group by caspases of the second folding family. For these
reasons, many inhibitors that have been developed for caspases carry a functional
group with a carboxylic acid group or a corresponding mimetic at this position. The
interaction with the thiol group of the catalytic cysteine is decisive for the binding
of cysteine protease inhibitors to their target enzyme. It is interesting that many of
the developed inhibitors try to involve the sulfur atom in a covalent bond. Revers-
ible and irreversible head groups have been developed for this purpose. The
complex of the inhibitor leupeptin 23.52, a natural product with an aldehyde
function, with calpain II is shown in Fig. 23.27. This group reacts with the thiol
group of cysteine and forms a hemithioacetal. Leupeptin binds with high affinity to
many members of the papain family. In addition to the aldehyde head group, many
other functionalities (so-called warheads) that can be used to inhibit cysteine
proteases are known (Fig. 23.28). Irreversible inhibitors have been developed for
viral proteases; they carry a Michael-acceptor group (e.g., 23.53). This reactive
group forms an irreversible bond with cysteine and switches the enzyme off
permanently. For cathepsins, calpains, and caspases, attempts have been made to
develop inhibitors that form a reversible bond to the thiol group. Most of these
structures are derived from aldehydes or ketones (23.54–23.57). From a chemical
point of view, the caspase inhibitor 23.56 from Vertex is
interesting. In a cyclic structure, it combines an aspartate-like side chain for the S1
pocket of the enzyme and a capped aldehyde function in the form of a cyclic acetal.
The aldehyde is released as active compound from this prodrug.
Another group of enzymes that actually belongs to the family of transferases, but
follows a cysteine-protease-like mechanism, are the transglutaminases. Nine
isoenzymes have been discovered in our genome. They are constructed from four
domains and contain a catalytic domain composed of a Cys–His–Asp triad. Their task
is the posttranslational modification of proteins (▶ Sect. 26.2), that is, they modify
proteins after they have been synthesized in the ribosome. As one aspect, they can
[Structure figure labels: Gly208, Glu349, oxyanion hole, Cys115, Glu261, Leu260]
[Structural formulas: generic warhead series (substituents R1, R2) and the inhibitors 23.53, 23.54 MDL-28170, 23.55 SJA-6017, 23.56 VX-740 (pralnacasan), 23.57 MX1013]
Fig. 23.28 In addition to the aldehyde head group, many other functionalities have been
developed that couple reversibly or irreversibly to the catalytic cysteine (reactive site
marked in red) and in doing so block cysteine proteases. Irreversible inhibitors such as
23.53 that have a Michael-acceptor group are available for viral proteases. The two
aldehydes 23.54 and 23.55 are development substances for the inhibition of calpains; 23.56
and 23.57 are caspase inhibitors. Compound 23.56 is a prodrug: ring opening releases an
aspartate-like P1 side chain and a new aldehyde function, which forms a hemithioacetal
with the protein.
23.10 Synopsis
• Serine proteases belong to the class of hydrolyzing enzymes that cleave amide or
ester bonds. Depending on where they cleave a peptide chain, they are classified
as amino-, carboxy-, or endopeptidases.
• Three amino acids, a serine, a histidine, and an aspartic acid that reside at quite
distant positions in the sequence, are folded in characteristic proximity to one
another. The hydroxyl oxygen atom of the serine performs a nucleophilic attack
onto the carbonyl carbon atom of the scissile peptide bond. Its nucleophilicity is
enhanced by an H-bond to an adjacent imidazole moiety of a histidine.
• The histidine accepts a proton from the nucleophilic serine OH group and is
thereby transposed into a positively charged state. The neighboring aspartate
residue compensates for the positive charge. The simultaneously created nega-
tive charge on the former carbonyl oxygen is stabilized by NH functions in the
H-bond-donating oxyanion hole. Simultaneously, the carbon atom of the cleav-
ing amide bond rearranges to a tetrahedral geometry.
• Upon release of the N-terminal part of the peptide substrate the C-terminal part
remains covalently bound as acyl–enzyme complex. This is finally degraded via
a similar mechanism by using a water molecule as a nucleophile.
• The residues involved can be different; in particular, the nucleophilic serine can
be replaced by a threonine or cysteine. The corresponding enzymes are named
threonine and cysteine proteases.
• The peptide chain to be cleaved is primarily recognized in small binding pockets
on the protease surface that accommodate the amino acid side chains on the
C-terminal end adjacent to the cleavage site. Their composition determines the
chemical building blocks required for inhibitor design to develop highly potent
ligands for the protease.
• A number of warhead groups are known to either reversibly or irreversibly block
the catalytic serine, threonine, or cysteine residue.
• The major contribution to binding affinity and ligand specificity is achieved
through binding to the S1 pocket next to the cleavage site.
• Blood coagulation is a highly regulated cascade of serine proteases. Potent
inhibitors for antithrombotic therapy have been developed for thrombin and
factor Xa, which are located in the last steps of the cascade.
• Whereas thrombin and factor Xa exhibit deep and well-structured S1
pockets, elastase exhibits a flat S1 pocket. Binding to this pocket contributes
much less to the overall affinity of an inhibitor for this protease and the
developed compounds all involve the catalytic serine in a reversible cova-
lent attachment.
532 23 Inhibitors of Hydrolases with an Acyl–Enzyme Intermediate
24 Aspartic Protease Inhibitors
The task of aspartic proteases is the cleavage of peptide bonds. Their name comes
from two aspartic acid residues that determine the catalytic mechanism. A water
molecule, which is appropriately polarized by these two residues, is used as
a nucleophile for the attack on the peptide bond. At the same time, these groups
stabilize the transition state, balance the charges, and transfer protons. The diges-
tive enzyme pepsin was intensively investigated as the first member of this enzyme
class. It is active at strongly acidic pH conditions between values of 1 and 5.
The first 3D structure of this aspartic protease was determined in the early 1970s
in the group of Alexander Fedorov. The aspartic protease family is relatively small
in the human genome; it contains 15 members. Table 24.1 lists a few important
aspartic proteases.
24.1 Structure and Function of Aspartic Proteases
Pepsin preferably cleaves peptides that contain hydrophobic residues to the right
and left of the cleavage site. Its spatial structure shows that two catalytically active
aspartic acid residues are found next to one another. One of these residues has an
unusually low pKa value of 1.5. The pKa of the other aspartic acid is higher: 4.7.
Therefore under the pH conditions in the stomach, apparently one of the side chains
in the catalytic site is protonated, but the other one is not. This difference is decisive
for the catalytic mechanism. In other aspartic proteases that exert their function at
higher pH values, a comparable difference is observed between the two groups. It is
the local environment that determines the pKa values (▶ Sect. 4.4). On the other
hand, both aspartic acid residues are spatially so close to one another that they can
no longer be considered as independent of one another. The two aspartates behave
like a coupled system, similar to a dicarboxylic acid; they are practically a diprotic
acid (Table 24.2).
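As a rough illustration of the pKa argument above, the following sketch applies the Henderson–Hasselbalch relation to the two pKa values quoted in the text. It treats the two aspartates as independent acids, which is a simplification because, as noted above, the residues behave as a coupled system. The function name and the chosen pH values are illustrative assumptions.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of an acid present in its protonated form (HA) at a given pH.

    Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa),
    hence f(HA) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values of the two catalytic aspartates of pepsin as quoted in the text.
PKA_LOW, PKA_HIGH = 1.5, 4.7

if __name__ == "__main__":
    for ph in (2.0, 3.0, 4.0):  # illustrative acidic pH values (assumption)
        low = protonated_fraction(PKA_LOW, ph)
        high = protonated_fraction(PKA_HIGH, ph)
        print(f"pH {ph:.1f}: Asp(pKa 1.5) {low:.0%} protonated, "
              f"Asp(pKa 4.7) {high:.0%} protonated")
```

In this simple model, at pH 2 the residue with the higher pKa is almost completely protonated while the other is mostly deprotonated, which matches the picture of one protonated and one charged aspartate in the catalytic site.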
Table 24.1 A few aspartic proteases and the preferred site for enzymatic cleavage.
Enzyme         Cleavage site                                                    Function
Pepsin         Phe–Phe, Leu–Phe, etc.                                           Digestion
Renin          Leu–Val, Leu–Leu                                                 Increasing blood pressure
Cathepsin D    Phe–Phe, Leu–Leu, etc.                                           Tissue degradation
β-Secretase    Met–Asp, Leu–Asp                                                 Proteolytic degradation of membrane proteins
Chymosin       Phe–Met                                                          Milk curdling
HIV protease   Phe–Pro, Tyr–Pro, Phe–Tyr, Leu–Phe, Phe–Leu, Met–Met, Leu–Ala    Virus replication
Plasmepsin     Phe–Leu                                                          Hemoglobin digestion
The mechanism of the peptide cleavage by aspartic proteases is shown in
Fig. 24.1. The cleavage of the amide bond is accomplished by the nucleophilic
attack of a water molecule on the carbonyl carbon atom. The deprotonated aspartate
polarizes this water molecule, and the protonated residue simultaneously forms an
H-bond to the carbonyl group of the amide bond being cleaved. In this way the C=O
bond is polarized and the nucleophilic attack on the carbon atom is facilitated. The
reaction proceeds via a tetrahedral transition state in which the oxygen atom of water
attacks nucleophilically and a proton is transferred to the deprotonated aspartate. One
approach to the development of aspartic-protease inhibitors is to imitate the
temporarily occurring, unstable geminal-diol transition state with a stable molecule.
Hydroxyl compounds (Fig. 24.2) but also α-ketoamides and phosphinates can be
used for this purpose.
A view into the binding pocket of five different aspartic proteases is shown in
Fig. 24.3. Substrate molecules are bound by these proteases in an elongated channel
that reaches from one side of the enzyme to the other and surrounds the substrate
like a tunnel. For this, the protease must allow the substrate access to the reaction
pathway by opening a flexible flap. The upper part of this tunnel is cut away in the
figure. The areas in which the protein orients its hydrogen-bond-forming groups are
shown in blue. The two catalytic aspartic acid residues are found in the center,
hidden underneath the blue area. Neighboring this are regions in which hydrogen
bonds are formed to the peptide backbone of the substrate. This binding motif is
common to all aspartic proteases. Binding pockets to the left and right of the
catalytic site are responsible for the selective recognition of the substrate. They
accommodate the side-chain groups of the substrate molecule. It is noteworthy that
[Structural formulas of the transition-state isosteres; legible labels: hydroxyethylamine, α-hydroxylamide, α-ketoamide, phosphinate]
Fig. 24.2 Possible transition-state isosteres for the design of aspartic protease inhibitors.
Hydroxyl groups are particularly well suited. Statin, a non-proteinogenic amino acid, is found in
many inhibitor structures.
Fig. 24.3 A view into the binding pockets of the five aspartic proteases: HIV protease (a),
endothiapepsin (b), cathepsin D (c), plasmepsin (d), and renin (e). The catalytic site passes like
a tunnel through the protease (above left, inset). In the figures, the proteins are displayed in a way
that the cut runs through the middle of the tunnel, and the front side of the tunnel is clipped off. The
backside of the tunnel is visible through a view from the side (arrow); the protein surfaces are cut
above and below the active site. The blue areas on the backside of the tunnel surface indicate donor
or acceptor groups in the proteins to which the substrate is bound along the peptide backbone.
in contrast to the serine proteases, the pockets on either side of the cleavage site are
well defined. This observation can be understood based on the reaction mechanism.
Also in contrast to the serine proteases, a covalent bond is never formed between
a reaction intermediate and the enzyme. Aspartic proteases often cleave between
hydrophobic amino acids. Because such residues cannot form strong interactions, it
is important to recognize and immobilize the substrate molecule by multiple
contacts on either side of the cleavage site. For the design of inhibitors, groups
must therefore be found that can mimic the interactions in the binding pockets S3, S2,
S1 and S1′, S2′, and S3′. A group from Fig. 24.2 that represents a transition-state
analogue is then placed at the cleavage site.
Hamao Umezawa isolated one of the first potent and specific aspartic-protease
inhibitors, pepstatin, from a culture of Streptomyces sp. This peptide, Iva–Val–Val–
Sta–Ala–Sta-OH, 24.1 (Fig. 24.4) is a good to highly potent inhibitor of many
members of the aspartic-protease family. It contains the non-proteinogenic amino
acid statin with a hydroxyethyl group. The 3D structure of the pepsin–pepstatin
complex shows that statin indeed binds as a transition-state mimic to the catalytic
aspartic acids.
24.2 Design of Renin Inhibitors
Renin is an aspartic protease that is composed of 340 amino acids. It plays a pivotal
role in endogenous blood pressure regulation and in electrolyte and water homeo-
stasis. The enzyme cleaves the peptide angiotensinogen to form the decapeptide
angiotensin I (Fig. 24.5). This is subsequently cleaved by angiotensin-converting
enzyme (ACE, ▶ Sect. 25.4), a metalloprotease, to give the octapeptide angioten-
sin II, which increases blood pressure. Inhibition of the enzyme renin leads to
a decrease in the concentration of angiotensin I and, as a consequence, of angio-
tensin II. Renin inhibition therefore has a hypotensive effect. As a consequence of
the great therapeutic success of the ACE inhibitors, many pharmaceutical compa-
nies began to search for selective renin inhibitors. Renin has an unusually high
specificity. Angiotensinogen is the only known natural substrate of this enzyme.
Therefore, it should be possible to find a highly specific renin inhibitor that blocks
no other enzymes and causes no side effects, which is not the case with many other
antihypertensive compounds.
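The two cleavage steps described above can be traced on the sequence level with a short sketch. The N-terminal sequence is the one given in the scheme of Fig. 24.5; the function names are illustrative, and the code models only where the chain is cut (renin at the Leu–Val bond, ACE by removing the C-terminal dipeptide), not any enzymology.

```python
from typing import List, Tuple

# N-terminal residues of angiotensinogen as given in the text's cleavage scheme.
ANGIOTENSINOGEN_N_TERM = [
    "Asp", "Arg", "Val", "Tyr", "Ile", "His", "Pro",
    "Phe", "His", "Leu", "Val", "Ile", "His",
]


def renin_cleave(peptide: List[str]) -> Tuple[List[str], List[str]]:
    """Split at the first Leu-Val bond, the renin cleavage site named in the text."""
    for i in range(len(peptide) - 1):
        if peptide[i] == "Leu" and peptide[i + 1] == "Val":
            return peptide[: i + 1], peptide[i + 1:]
    raise ValueError("no Leu-Val cleavage site found")


def ace_cleave(peptide: List[str]) -> List[str]:
    """Remove the C-terminal dipeptide (ACE acts as a dipeptidyl carboxypeptidase)."""
    if len(peptide) < 3:
        raise ValueError("peptide too short to lose a dipeptide")
    return peptide[:-2]


angiotensin_1, remainder = renin_cleave(ANGIOTENSINOGEN_N_TERM)
angiotensin_2 = ace_cleave(angiotensin_1)

if __name__ == "__main__":
    print("Angiotensin I :", "-".join(angiotensin_1), f"({len(angiotensin_1)} residues)")
    print("Angiotensin II:", "-".join(angiotensin_2), f"({len(angiotensin_2)} residues)")
```

The sketch reproduces the chain lengths named in the text: a decapeptide after the renin step and an octapeptide after the ACE step.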
[Fig. 24.5 (scheme), legible labels: P3 P2 P1 P1′ P2′; angiotensinogen = Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu-Val-Ile-His-protein; renin; inactive fragments]
Table 24.3 The replacement of the cleavable amide bond in Leu–Val by a stable isostere leads to
potent renin inhibitors. The Leu–Val group is replaced by a group in the inhibitors that the enzyme
cannot cleave.
Substrate/Inhibitor                          IC50 (nM)
His Pro Phe His Leu Val Ile His              300,000 (a)
His Pro Phe His Leu[COCH2]Val Ile His        500
His Pro Phe His Leu[CH2NH]Val Ile His        200
His Pro Phe His Statin Ile His               20
His Pro Phe His Leu[CHOHCH2]Val Ile His      3
(a) Substrate, KM value
The starting point for the work was the peptide sequence of the substrate
angiotensinogen. Renin cleaves angiotensinogen between Leu and Val. Initially
an appropriate surrogate for the Leu–Val unit was sought that would allow the
retention of the amino acids in the positions P5 to P3′ (Table 24.3). The octapeptide
His–Pro–Phe–His–Leu–Val–Ile–His is cleaved as a renin substrate. Replacing the
Leu–Val amide bond that is cleaved by the enzyme with the stable, isosteric
groups CH2NH or COCH2 led to modestly effective inhibitors. The isostere with
the hydroxyethylene group, CH(OH)CH2, was better suited as a transition-state
analogue and afforded a strong inhibitor (IC50 = 3 nM). The incorporation of the
Table 24.4 Optimization of the P1 side chain. The binding pocket is lipophilic and obviously has
just the right size for a cyclohexylmethylene group.
[Structural formula of 24.2, a Boc-Phe-His derivative with variable side chain R]
R                       IC50 (nM)
Isobutyl                81
Cyclohexylmethylene     4
Cyclohexyl              150
Adamantylmethylene      2,500
Benzyl                  15
Table 24.5 The introduction of a second hydroxyl group in the R2 position leads to a significant
increase in the binding affinity.
[Structural formula of 24.3, a Boc-Phe-His derivative with variable groups R1 and R2]
R1                  R2    IC50 (nM)
Isobutyl            H     1,500
Isobutyl            OH    11
Cyclohexylmethyl    H     10
Cyclohexylmethyl    OH    1.5
non-natural amino acid statin (see Fig. 24.4) produced a very well-binding inhibitor.
As a dipeptide isostere, statin replaces the P1–P1′ unit in the Leu–Val segment
of the substrate.
The next step was the optimization of the P1 moiety. Different groups were
investigated as a replacement for the leucine side chain. The results of such
a structural variation of 24.2 are listed in Table 24.4. The replacement of the
isobutyl group by a larger cyclohexylmethylene group increased the affinity by
a factor of 20. An adamantylmethylene group is obviously too large for the pocket
because the corresponding derivative only weakly inhibits the enzyme. Next, the P2
moiety was investigated. Here, however, the replacement of the histidine by another
group did not lead to a significant improvement in the binding affinity. Nonetheless,
the replacement of the basic histidine in the P2 position brought significant progress in
the context of renin research because it enabled the discovery that glycols are potent
renin inhibitors. A few compounds from the 24.3 class are listed in Table 24.5.
The introduction of a second hydroxyl group in the correct configuration increased
the affinity by a factor of 10–200 depending on the chosen P1 side chain.
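The dependence of the hydroxyl effect on the P1 side chain can be made explicit with a small calculation on the four IC50 values of Table 24.5. This is only a sketch: the dictionary layout and function names are assumptions made for illustration, and IC50 ratios are used as a simple proxy for affinity gains.

```python
import math

# IC50 values in nM from Table 24.5, keyed by (R1, R2).
IC50_NM = {
    ("isobutyl", "H"): 1500.0,
    ("isobutyl", "OH"): 11.0,
    ("cyclohexylmethyl", "H"): 10.0,
    ("cyclohexylmethyl", "OH"): 1.5,
}


def fold_gain(before_nm: float, after_nm: float) -> float:
    """Factor by which potency improves when IC50 drops from before to after."""
    return before_nm / after_nm


def hydroxyl_effect(r1: str) -> float:
    """Potency gain from replacing R2 = H by R2 = OH for a fixed P1 side chain R1."""
    return fold_gain(IC50_NM[(r1, "H")], IC50_NM[(r1, "OH")])


if __name__ == "__main__":
    for r1 in ("isobutyl", "cyclohexylmethyl"):
        gain = hydroxyl_effect(r1)
        print(f"R1 = {r1}: OH improves IC50 {gain:.1f}-fold "
              f"({math.log10(gain):.2f} log units)")
```

Comparing the two gains on a logarithmic scale shows that the contributions of the second hydroxyl group and of the P1 side chain are not simply additive: the hydroxyl group helps far more when the weaker isobutyl group occupies P1.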
[Structural formulas: 24.4 A-64662, enalkiren, IC50 = 14 nM; 24.5 Ro 42-5892, remikiren, IC50 = 0.7 nM]
Fig. 24.6 Enalkiren 24.4 and remikiren 24.5 were the first renin inhibitors upon which a clinical
trial was conducted.
Table 24.6 By modifying the P3 moiety, Phe, the stability to chymotrypsin is improved.
[Structural formula of 24.6 with variable group R; of the table body, only one entry is legible: 0.58, stable]
laboratory of Michael James, which was relatively late. By then, it had already
been recognized that renin had a certain, if modest, sequence homology of 20–30%
with aspartic proteases that came from fungi, for which the 3D structures were
known. This was the starting point for homology modeling in multiple laboratories.
The first model was published in 1984 by Tom Blundell’s group. They used the
crystal structure of endothiapepsin as a reference. Initially the renin sequence
was compared with that of other aspartic proteases to find structurally conserved
regions. Then the modeling in the interior of the protein was accomplished by
replacing residues in endothiapepsin with those of renin. Truncations and insertions
in the polypeptide chain had to be considered. The flap region had particular
importance. It opens so that the ligand can enter and form hydrogen bonds with
the protein. Its structural architecture is therefore important for ligand binding.
Unfortunately the renin sequence deviated from that of the fungal enzymes exactly
at the flap region. A comparison of the renin model with the later-determined crystal
Fig. 24.7 Structures 24.7–24.9 are a few moderately orally available renin inhibitors. All have
a diol unit and a cyclohexylmethylene side chain in common that binds in the S1 pocket.
structure showed good agreement, especially in the lower face of the binding pocket
near the two aspartic acids. Significant differences were found in the turn areas of the
loop region. In the context of the entire protein architecture, these were less impor-
tant. In the context of drug design, they were decisive! Errors in the structural model
necessarily had to lead to the wrong suggestions for inhibitor design.
The renin structure in complex with the inhibitor CGP-38560 24.10 that was
determined by Markus Grütter and John Priestle at Ciba in Basel, Switzerland, is
shown in Fig. 24.8. Based on this structure, the researchers at what was then Ciba, now
Novartis, managed a breakthrough. If the arrangement in the binding pocket of
renin is considered, it is apparent that the S1 and S3 pockets merge into one large
hydrophobic cavity. The residues in the P1 and P3 position, a cyclohexylmethylene
and a benzyl group, approach one another closely. The following design concept
was then obvious: Instead of stretching the molecule and its groups on a peptide
backbone, the scientists disrupted the chain at the amide bond. This created a new
polar group: a charged, free amino terminus. The tethering of the molecule was
instead diverted to the closely placed hydrophobic residues in the S3/S1 pocket. An
entirely new, dipeptide-like scaffold resulted (24.11, Figs. 24.8 and 24.9). It had an
[Fig. 24.8 residue labels: Ser219, Asp32, Asp215, Ser76, Thr77]
Fig. 24.8 Superposition of the crystal structures of renin complexes with the inhibitors
CGP-38560 24.10 (gray carbon atoms) and aliskiren 24.12 (light-green carbon atoms). The inhibitors
bind with their peptide-like architecture in an extended, pleated-sheet-like conformation.
Compound 24.10 orients its benzyl and cyclohexylmethyl groups in the broad S3/S1 pocket. Both of the
hydrophobic side chains from 24.10 were linked to each other for the design of aliskiren. In return,
the peptide chain could be cut open, which created a new polar N terminus.
IC50 of 6 nM. Finally, a few steps of side-chain optimization were undertaken on
the aromatic ring and on the amide bond. The methoxypropoxy side chain occupies
a somewhat different pocket than the corresponding groups in CGP-38560. It
causes a significant increase in the binding affinity. The optimized residue in P1′
exerts only a small influence on the in vitro affinity, but it is pivotal for the duration
of action. A geminal substitution with two methyl groups and a terminal
carboxamide group proved to be optimal for the P2′ position. This inhibitor was
introduced to the market in 2006 with the name aliskiren (24.12) as the first orally
available renin inhibitor. Despite being so well optimized, the compound does not
show ideal bioavailability. Therefore, it must be administered at rather high
doses. Nonetheless, aliskiren shows virtually no binding to other aspartic proteases
such as cathepsin D or pepsin.
Roche achieved another hit from their renin work that later proved to be
stimulating for the entire research field. The company had a potent inhibitor with
remikiren 24.5, which unfortunately lacked the desired oral bioavailability. The
company then initiated a broad screening program. Chlorophenylmethoxybenzyloxypiperidine
24.13 was discovered (Fig. 24.10) with an IC50 value of 50 μM.
This structure was surprising because it does not have a typical transition-state-
mimicking group. The crystal structure with a very similar derivative showed that
the protonated nitrogen on the piperidine ring binds between the two catalytic
aspartic acids. The lipophilic chlorophenyl portion orients into the broad S1/S3
[Structural formulas of 24.13 and 24.14]
[Fig. 24.11 residue labels: Trp39, Tyr75, Asp32, Asp215]
Fig. 24.11 The crystallographically determined binding mode of the piperidine lead structure
24.14 with renin. The basic nitrogen of the inhibitor binds between the two aspartic acids of the
catalytic dyad. The lipophilic side chain lies in a newly opened binding pocket. It was formed by
breaking up a hydrogen bond that was originally present between Tyr75 and Trp39 in the
uncomplexed protein. Both residues adopt a new position with larger distance between one another
after binding of 24.14.
the inhibitor’s 4-phenyl group occupies a region where the aromatic ring of Tyr75
would be located if the flap were closed. This structure afforded the researchers two
important pieces of information: (1) a nitrogen-containing heterocycle is an
interesting peptidomimetic that binds to the catalytic aspartic acids, (2) inhibitors
can bind to the aspartic protease family in more conformers than the closed-flap
conformation. The open conformer can also be stabilized by an inhibitor. These
exemplary studies on renin afforded important information for new work on the
aspartic proteases (Sect. 24.6).
[Fig. 24.12 residue labels: Ile50′, Ile50, Asn25, Asn25′]
Fig. 24.12 3D Structure of HIV protease in complex with the peptide substrate Arg–Pro–Gly–
Asn–Phe–Leu–Gln–Ser–Arg–Pro. The structure with the substrate could be obtained with
a catalytically inactive enzyme because both acidic aspartic acids of the catalytic dyad had been
mutated to asparagines. The protease exists as a C2-symmetric homodimer. The peptide chains are
shown in green and red, respectively.
[Structural formulas: 24.15 H 261, Ki = 5 nM; 24.16 JG 365, Ki = 0.66 nM]
Fig. 24.13 The peptidic HIV protease inhibitors H 261 24.15 and JG 365 24.16 are potent
inhibitors in the enzyme test. They are inactive in cell culture.
24.24 was withdrawn from the European market in 2007. It was noticed that tablets
containing this substance had an unusual smell. The subsequent analysis gave
the alarming result that the drug was contaminated with ethyl mesylate from the
synthesis. Because saquinavir has unsatisfactory bioavailability (3–5%), it is
administered in combination with ritonavir 24.20, a potent CYP 3A4 inhibitor
[Structural formulas: 24.18, IC50 = 2 nM; 24.19 Ro 31-8959, saquinavir, IC50 < 0.4 nM, Ki < 0.12 nM]
Fig. 24.14 The stepwise optimization of the substrate-analogue inhibitor 24.17 led to the highly
potent HIV protease inhibitor Ro 31-8959 24.19 via 24.18. This compound was the first protease
inhibitor to pass clinical trials and is marketed with the name saquinavir.
(Ki = 17 nM; ▶ Sect. 27.6). This significantly reduces the first-pass effect upon
co-administration with saquinavir. Amprenavir 24.23 was withdrawn in 2004
because it was replaced by the better-soluble prodrug fosamprenavir (Lexiva®).
24.4 Structure-Based Design of Non-Peptidic HIV Protease Inhibitors
The relationship to the parent substrate is clearly visible in the inhibitors that were
introduced in the last section. Fundamentally the compounds are still peptides. The
crystal structures of the peptidic HIV protease inhibitors in complex with the
enzyme all show that the inhibitors form essentially the same H-bond pattern in
the immediate vicinity of the catalytically active aspartic acid residues (Fig. 24.16).
A water molecule is particularly interesting here because it is found in all of the
crystal structures. This water molecule forms two hydrogen bonds to both the
inhibitor and the enzyme. It was hoped that inhibitors designed to displace this
water molecule would gain binding affinity through the entropically favorable release
of this water (▶ Sect. 4.6). Moreover, it was expected that such an approach would also
increase the selectivity because a water molecule with a similar function is not
known to exist in the other aspartic proteases.
[Structural formulas; legible labels: 24.23 R = H Amprenavir, Agenerase®, Prozei® (1999); R = PO3H Fosamprenavir, prodrug, Lexiva® (2003); 24.25 Atazanavir, Reyataz®, Zrivada® (2000); 24.27 Tipranavir, Aptivus® (2005)]
Fig. 24.15 Up to now, nine new compounds for AIDS therapy have been introduced to the
market. Compounds 24.19–24.26 are peptide-like inhibitors, and tipranavir 24.27 alone has
a completely non-peptidic structure.
[Fig. 24.16 labels: (a) Ile50, Ile50′, Asp25, Asp25′; (b) P1 to P1′ distance 8.5–12 Å, H-bond donor/acceptor; structures 24.28 and 24.29]
Fig. 24.16 The pattern of the hydrogen bonds between HIV protease and the peptide inhibitors in
the vicinity of the catalytic aspartic acids (a). A water molecule is found in the binding pocket that
forms two H-bonds to the inhibitor and to the protein. The hydroxyl group of the inhibitor
displaces the water molecule that is involved in the catalytic process (cf. Fig. 24.1). By starting
with this binding mode, the spatial pharmacophore of a potential inhibitor was defined (b). The
search in databases of crystal structures of low-molecular weight compounds was started with this
pattern. It produced the substituted phenol 24.28 as a hit. From there, six- and seven-membered
cyclic ketones and a cyclic urea 24.29 were developed. These derivatives could displace the
structurally conserved water molecules from the binding pocket of the protease with their carbonyl
groups.
[Structural formulas; legible labels: 24.30, Ki = 4500 nM; 24.31, Ki = 0.3 nM]
aspartates. Two lipophilic groups, separated by 8.5–12 Å, were sought that were
also 3.5–6.5 Å away from a hydrogen-bond acceptor or donor (Fig. 24.16b). In
addition, a functional group should be between the two lipophilic groups that can
displace the structurally conserved water molecule from the binding pocket.
A search in the Cambridge database (▶ Sect. 17.11) afforded a molecular scaffold
derived from a substituted phenol (24.28). From this, the idea emerged to use
4-hydroxycyclohexanone as a scaffold (Fig. 24.16). Modeling studies and intensive
discussions with the synthetic chemists finally led to a cyclic urea 24.29 as
a scaffold for the new inhibitors 24.30–24.33 (Fig. 24.17). The first result of this
development was DMP-323 24.32, a low-molecular-weight HIV-protease inhibitor.
The 3D structure of 24.31 in complex with the protease is shown in Fig. 24.18. It
confirms the hypothesis that the carbonyl group displaces the structural water
molecule, and the two hydroxyl groups bind to the catalytic aspartate residues. As
promising as the design of the cyclic urea as an HIV protease inhibitor seemed, to
date no compound of this class has passed all stages of clinical trials and gained approval.
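The distance constraints of the pharmacophore query described above can be written down as a small geometric filter. The following is a minimal sketch in Python, not the database search actually used in the original work: the feature labels, the function names, and the toy coordinates are hypothetical, and only the two distance windows (8.5–12 Å and 3.5–6.5 Å) are taken from the text.

```python
from itertools import combinations
from math import dist

# Distance windows in angstroms, as quoted in the text.
LIPOPHILIC_PAIR = (8.5, 12.0)   # between the two lipophilic groups
TO_HBOND_GROUP = (3.5, 6.5)     # from each lipophilic group to the H-bond donor/acceptor


def in_window(value, window):
    """True if value lies inside the closed interval window = (low, high)."""
    low, high = window
    return low <= value <= high


def pharmacophore_hits(features):
    """Return every (lipophilic, lipophilic, hbond) triplet that satisfies the query.

    features is a list of (kind, (x, y, z)) tuples; kind is "lipophilic" or "hbond".
    These labels are an assumption made for this sketch.
    """
    lipophilic = [xyz for kind, xyz in features if kind == "lipophilic"]
    hbond = [xyz for kind, xyz in features if kind == "hbond"]
    hits = []
    for a, b in combinations(lipophilic, 2):
        if not in_window(dist(a, b), LIPOPHILIC_PAIR):
            continue
        for h in hbond:
            if in_window(dist(a, h), TO_HBOND_GROUP) and in_window(dist(b, h), TO_HBOND_GROUP):
                hits.append((a, b, h))
    return hits


# Hypothetical toy molecules (coordinates in angstroms).
matching = [
    ("lipophilic", (-5.0, 0.0, 0.0)),
    ("lipophilic", (5.0, 0.0, 0.0)),
    ("hbond", (0.0, 3.0, 0.0)),
]
too_compact = [
    ("lipophilic", (-2.0, 0.0, 0.0)),
    ("lipophilic", (2.0, 0.0, 0.0)),
    ("hbond", (0.0, 3.0, 0.0)),
]

print(len(pharmacophore_hits(matching)))     # number of triplets that satisfy the query
print(len(pharmacophore_hits(too_compact)))
```

A real search would additionally require a water-displacing group between the two lipophilic groups and would run over all conformers in the database; both are left out here to keep the geometric idea visible.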
A new lead structure 24.34 (Ki = 1.1 µM, Fig. 24.19) was found by screening at
Parke–Davis. The spatial structure with the protease was determined with the
homologous inhibitor 24.38 (Fig. 24.18). It showed that this structure, analogously
to 24.31, displaces the water molecule in the active site and forms H-bonds to the
Fig. 24.18 The superimposition of the crystal structures of the complexes of HIV protease with
the urea-containing inhibitor 24.31 (gray) and the coumarin derivative 24.38 (light green).
catalytic aspartate residues as well as to the NH groups of Ile50 and Ile50′. The
X-ray structure was used to design derivatives with improved binding properties
such as 24.36. Modeling studies led to the idea to introduce an acidic group in the S3
pocket to form a salt bridge with Arg8. The corresponding compound with an
OCH2COOH group in the para position of the 6-phenyl ring was synthesized and
led to a marked increase in the binding affinity. The inhibitor 24.37 (Ki = 51 nM) is
achiral, has a low molecular weight, and can be prepared in three steps. The
hydroxypyrone building block proved to be successful for the inhibitor develop-
ment in the end. In 2005, Boehringer Ingelheim introduced tipranavir 24.27, the
first non-peptide HIV protease inhibitor, to the market. This compound binds to the
catalytic dyad with its hydroxyl function, but the side-chain optimization resulted in
a much more complex structure than that of the original lead 24.34.
The first inhibitor for the HIV protease was developed and introduced to the market
in less than 8 years. In the following 10 years, nine drugs have been introduced as
marketed products (Fig. 24.15). Compounds from completely different structural
classes have been successfully developed to block HIV protease. In the meantime,
an arsenal of non-peptide, low-molecular-weight, orally available compounds has
become available for therapy. Inhibitors were also developed and introduced to the market
for another important viral enzyme, reverse transcriptase (▶ Sect. 32.5). In addition
to the substrate-analogue inhibitors such as zidovudine (AZT 24.39) and didanosine
24.5 The Development of Resistance Against HIV Protease Inhibitors 553
24.34  Ki = 1100 nM  IC50 = 3000 nM
24.35  Ki = 700 nM   IC50 = 1670 nM
24.36  IC50 = 1260 nM
24.37  Ki = 51 nM    IC50 = 160 nM
Fig. 24.19 Optimization of the coumarin-like HIV protease inhibitor 24.34, which was discov-
ered by mass screening at Parke–Davis. The extension of the thioether side chain to 24.35 and
24.36 as well as the introduction of a carboxyl group led to 24.37. A hydrogenated hydroxypyrone
building block could be incorporated in tipranavir 24.27 at Boehringer Ingelheim. The compound
represents the first non-peptide HIV protease inhibitor in therapy.
24.41 Nevirapine
24.42 Raltegravir
Because about 10⁸–10⁹ replication cycles take place each day, 10⁵ point mutations
occur in the population of viral proteins in one infected patient. Therefore, it is not
surprising that the introduction of HIV protease inhibitors has induced a great deal
of resistance. Mutations within the binding pocket, but also further away from the
active site, lead to a severe reduction in the binding affinity of HIV protease
inhibitors. The positions at which mutations have been observed are shown in
Fig. 24.21. If all of the previously observed exchanges are taken together, half of
all positions in the protease are affected by now. However, similar amino acid
exchanges in the vicinity of the catalytic center are observed over and over again.
This is certainly because the peptide-like inhibitors 24.19–24.26 all adopt a rather
similar binding mode in the protease (see Fig. 24.25).
Therefore a combination therapy is used to treat AIDS. The simultaneous
administration of multiple inhibitors should lead to better suppression of viral
replication. Here the formation of resistance is markedly slowed and hindered.
The best results are achieved with the co-administration of antiviral drugs with
different modes of action, for example, the combination of a nucleosidic and a non-
nucleosidic reverse transcriptase inhibitor and a protease inhibitor. Such therapies
are part of the so-called HAART strategy (highly active antiretroviral therapy)
that has found application in the clinic.
24.6 A Basic Nitrogen as a Partner for the Aspartic Acids of the Catalytic Dyad 555
Fig. 24.21 Mutations in the amino acids in HIV protease lead to resistance to the inhibitors. The
course of the polymer chain is coded in green or red. Red represents residues that, with high
probability, have mutated, and green areas show little exchange. Many mutations are found in the
vicinity of the active site, but some are fairly far away from the substrate-binding pocket.
Fig. 24.22 Secondary amines are promising binding partners for the aspartic acids of the catalytic
dyad of aspartic proteases. In a rational design approach, the nitrogen atom of the five-membered
pyrrolidine ring 24.43 was placed between the two aspartic acids. At the same time, the scaffold
and its side chains symmetrically reach the specificity pockets on the primed and unprimed side of
the protease. The incorporated acceptor function should form H-bonds to the structurally con-
served water in the flap region of HIV protease.
structure with HIV protease. It contained a big surprise (Figs. 24.23 and 24.24a).
The nitrogen in the pyrrolidine ring of the R,R enantiomer is, as defined, in the
pivotal point between the two aspartic acids. It takes the same position as the
hydroxyl groups in the transition-state-analogue inhibitors. However, in contrast
to the original concept, the inhibitor displaces the structural water from the binding
pocket! The oxygen of the sulfone group forms a direct hydrogen bond to the NH
group of Ile50 in the flap. The carbonyl group of the amide bond that is found on the
opposite side is not involved in any interactions with the flap region. On the other
hand, the loop of the flap adopts a distorted geometry in that the NH function of
Ile50′ makes a hydrogen-bond contact to the turn of the other monomer unit. Such
a geometry had never been seen before. In a more detailed analysis, it seemed that
the inhibitor does not optimally fill the S2 to S2′ subpockets of the protease.
Compared to amprenavir 24.23 (Fig. 24.15), the S2 pocket remains virtually
unoccupied. Moreover, the molecule’s bulky dimethylphenoxy group seemed to
protrude out over the S1′ pocket and interfere with the loop of the flap. Despite
single-digit micromolar affinity (Ki = 1.5 µM), the inhibitor seemed to be uncomfortable
in the pocket. It seemed to break almost all of the “golden” rules of drug
design (▶ Sect. 4.11). Is the awkward dimethylphenoxy group responsible for
the binding mode? To check this, a three-armed inhibitor 24.46 (Fig. 24.23) was
synthesized. Surprisingly, the crystal structure of this inhibitor 24.46 (Ki = 52 µM)
shows the same binding mode: the structural water is displaced, and the loop region
takes on a distorted form, even though there is obviously no longer a large group to
Fig. 24.23 Schematic representation of the different binding modes of the four inhibitors that are
shown in Fig. 24.24. The three pyrrolidine derivatives 24.45, 24.46, and 24.47 differ in their
connecting geometry at the ring and the number of substituents. The central heterocycle was
opened in 24.59. Interestingly, the conserved structural water returns to the structure with this
ligand.
interfere with the region (Fig. 24.24b). The occupancy of the specificity pockets
seems to be anything but optimal with this inhibitor.
Next, an attempt was made to bring the substituents on both sides of the central
pyrrolidine ring closer together. Andreas Blum eliminated the two methylene
linkers to use 3,4-diaminopyrrolidine 24.44 as a central building block
(Fig. 24.22). This scaffold was symmetrically decorated with substituted sulfon-
amides on both sides. Initially benzenesulfonic acid derivatives were optimized
with regard to the substitution on the tertiary nitrogen atom (24.47–24.58;
Table 24.7). In addition to the inhibitory effects on the wild-type enzyme, a quickly
induced resistant mutant that carries a valine instead of an isoleucine in position 84
was also studied. The enlarged pocket in the resistant mutant shows reduced
binding affinity with many inhibitors because of the reduced hydrophobic contact
surface. It was also demonstrated that the wild-type enzyme does not tolerate
Fig. 24.24 Crystal structures of the inhibitors from Fig. 24.23 in HIV protease. (a) Compound
24.45 leaves the S2 pocket virtually unoccupied. Its voluminous o,o′-dimethylphenoxy substituent
only incompletely fills the S1′ pocket and seems to push against the loop in the flap region. The
structural water is displaced from the binding pocket. (b) Compound 24.46 only partially fills the
S2 and S1′ pockets. Water is also displaced from this structure and the loop takes on a distorted
geometry even though no unfavorable contacts are recognizable. (c) Compound 24.47 binds
almost C2 symmetrically and places its benzenesulfonyl groups in S2 and S2′. The N-benzyl
groups are found in S1 and S1′. Here too, the structural water is displaced from the complex.
(d) Compound 24.59 orients its p-aminobenzenesulfonyl group in S2 and S2′. The N-benzyl
substituents occupy S1 and S1′. Both SO2 groups form H-bonds to the structural water, which
has returned in this structure. The inhibitor seems to fill the binding pocket perfectly, but it does
not achieve better binding affinity than the other derivatives despite the additional NH2 functions
that form H-bonds to the protein.
Table 24.7 Modifying the R1 and R2 groups on 3,4-diaminopyrrolidine 24.44 improves the affinity and
the resistance profile (wt: wild type; I84V: mutant).
the binding pocket. The benzyl groups on the amino group are placed in the S1 and
S1′ pockets. The benzenesulfonyl groups occupy the S2 and S2′ pockets. The
pockets seem to be much better filled with this inhibitor than they were with 24.45 or
24.46. Nevertheless, it stood to reason that the substituents in S1 and S1′ should be
enlarged in the para position for optimization. An added bromine or iodine substituent
increases the affinity for the wild type about sixfold. The mutant’s
inhibition is improved by a factor of 2. There even seemed to be adequate room in
the S2 and S2′ pockets for larger groups. Indeed, a methyl group or a chlorine atom
in the ortho position increases the affinity by a factor of 2. The effect was not as
pronounced with the mutant. Furthermore, the acidic amino acids Asp29 and
Asp30, which can be involved in the interactions with the ligand, are found at the
end of this pocket. This interaction is possible by introducing an amino or
carboxamide group in the para position. The binding to the enzyme then increases
by a factor of about 10. Further optimization led to derivative 24.58 with a CF3
group on the P1 benzyl group and an amide group on the P2 substituent. It inhibits
the wild-type enzyme with Ki = 61 nM and the mutant with 14 nM. The binding
mode of these new inhibitors based on a 3,4-diaminopyrrolidine 24.44 deviates
from that of all other currently marketed inhibitors. Perhaps this will open new
perspectives to break resistance (Fig. 24.25).
In one of the last steps, the central heterocycle was “cut open” and replaced with
open-chain, secondary amines (Fig. 24.23). Two- and three-membered aliphatic
chains were introduced between the SO2 groups, which were attached to address the
flap region, and the central amine nitrogen atom. The measurement of the inhibition
24.7 Other Targets from the Family of Aspartic Proteases 561
In addition to the two examples of renin and HIV protease, many other members of
the aspartic protease family have been validated as targets for drug development.
First, cathepsin D, a protein involved in protein catabolism, appeared to be
interesting, and concepts for the treatment of breast cancer or muscular dystrophy
have been pursued. The previously mentioned pepsin from the stomach has been
discussed as a possible therapeutic target for peptic ulcer disease. Secretory aspartic
proteases (SAP) from Candida albicans have been considered as possible target
enzymes to treat fungal diseases.
Drug design in the area of β-secretases looks very promising. Their inhibition
could lead to an efficacious Alzheimer’s disease treatment. β-Amyloid protein,
which makes up the pathological and dangerous plaques in the brains of
Alzheimer’s patients, is cleaved from a larger precursor, amyloid precursor protein
(APP). In 1999 two membrane-bound proteases from the aspartic protease family,
β- and γ-secretases, were reported that catalyze the release of β-amyloid. Since then,
potent inhibitors of these proteases have been intensively sought. The β-secretases are termed
BACE-1 and BACE-2, abbreviations of beta-site APP-cleaving enzyme. Additional
target structures that are currently being investigated are the plasmepsins.
They serve the malaria parasite in the digestion of hemoglobin in the phagosome.
The parasite uses the components of hemoglobin as nutrition. Plasmepsin is used in
the initial cleavage of hemoglobin and cleaves its α-chain between Phe33 and
Leu34. Four isoforms of plasmepsin are involved in the further digestion to larger
peptide fragments. Moreover, the falcipains, which are cysteine proteases, and falcilysin, a zinc
protease, are involved in this process. The plasmepsins show a structural homology
to cathepsin D. The first lead structures are derived according to entirely analogous
principles as, for example, were used with renin. Newer results have underscored
the fact that multiple enzymes must be simultaneously shut off to efficiently fight
the parasite and achieve a malaria therapy based on the protease inhibitors of
hemoglobin metabolism. Therefore, it might be appropriate to develop inhibitors
for the simultaneous, selective deactivation of all four plasmepsins.
24.8 Synopsis
• Aspartic proteases possess two facing aspartate residues in their catalytic cleav-
age site. A water molecule, located at the apex between both aspartates, is
polarized and nucleophilically attacks the carbonyl carbon atom of the amide
bond to be cleaved.
• The cleavage reaction proceeds through a tetrahedral transition state with
a temporarily formed geminal-diol structure. Peptidomimetic inhibitors imitate
this intermediate structure by using chemically stable building blocks. Hydroxyl
groups embedded into hydroxyethylene or statin moieties have been especially
used as transition-state isosteres.
• Aspartic proteases often cleave between hydrophobic amino acids. These resi-
dues cannot form strong interactions to the specificity pockets of the protease on
both sides of the cleavage site. They bind through multiple contacts, and the
recognition pockets are well formed on both sides.
• Renin specifically cleaves angiotensinogen to angiotensin I. Subsequently, this
product is further cleaved to the octapeptide angiotensin II, which stimulates
blood pressure to increase once recognized at its receptor. Renin exhibits
a large, virtually merged S1/S3 pocket; this gave rise to the design concept to
bridge P1/P3 substituents in the inhibitor aliskiren and to disrupt its central
A metal ion in the catalytic site is needed for the function of another important
class of enzymes that cleave peptide and ester bonds. By coordinating the metal ion,
these enzymes activate a water molecule for nucleophilic attack on the bond that is
to be cleaved. The water molecule experiences a drastic change in its pKa value in
this state. By far, zinc is the most commonly used metal ion in these enzymes, but
iron, cadmium, cobalt, or manganese are also found. The presence of a metal ion is
essential for the activity of the protease or esterase. If the metal ion is removed
from the enzyme by the addition of a strong complexation reagent, for example,
β-mercaptoethanol or ethylenediaminetetraacetic acid (EDTA), the catalytic activity
is no longer observable.
Many therapeutically important enzymes are metalloproteases. The zinc pro-
teases must be mentioned first, above all, angiotensin-converting enzyme (ACE).
ACE inhibitors have been used for many years to treat high blood pressure.
Moreover, in recent years further metalloproteases have been identified as possible
targets for drug design. Among these are the endothelin-converting enzyme, neutral
endopeptidases, and the matrix metalloproteases (Table 25.1). Further groups of
important zinc enzymes are the carbonic anhydrases, the zinc-containing
β-lactamases, and phosphodiesterases.
In 1967 William Lipscomb determined the first 3D structure of a zinc protease, that of
the digestive enzyme carboxypeptidase A. The zinc ion that is necessary for the
enzyme activity is complexed to two His and one Glu side chains. A water molecule
occupies the fourth coordination site. Moreover, an additional glutamate is found in
the vicinity of the zinc ion. The same amino acids are also responsible for binding
zinc in many other metalloproteases. The presence of the amino acid sequence
His–Glu–X–X–His (X is an arbitrary amino acid) is characteristic for most of the
known zinc proteases. For example, it is found in collagenase, thermolysin,
neutral endopeptidase 24.11, and in endothelin-converting enzyme (Table 25.2).
Table 25.2 Characteristic amino acid sequences in the active site of different metalloproteases.
Enzyme Position Amino acids
Thermolysin 142–146 His Glu Leu Tyr His
NEP 24.11 583–587 His Glu Ile Thr His
ECE 590–594 His Glu Leu Thr His
Astacin 92–96 His Glu Leu Met His
Collagenase 201–205 His Glu Phe Gly His
Stromelysin 201–205 His Glu Ile Gly His
The discovery of this amino acid sequence in the primary sequence of a new protein
is strongly indicative of a zinc protease. In matrix metalloproteases and carbonic
anhydrases, zinc is complexed by three histidine residues. A water molecule
occupies the fourth site.
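Scanning a primary sequence for the His–Glu–X–X–His signature is a simple pattern match. The sketch below is one possible way to do it in Python with a regular expression; the function name and the toy sequence are hypothetical, and the motifs are the rows of Table 25.2 transcribed into one-letter code. As stated above, a hit is only indicative of a zinc protease, not proof.

```python
import re

# His-Glu-X-X-His in one-letter code. The lookahead makes the search report
# overlapping occurrences as well.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence):
    """Return (1-based start position, matched motif) for each His-Glu-X-X-His hit."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


# Motifs as listed in Table 25.2, written in one-letter code.
TABLE_25_2 = {
    "Thermolysin": "HELYH",
    "NEP 24.11": "HEITH",
    "ECE": "HELTH",
    "Astacin": "HELMH",
    "Collagenase": "HEFGH",
    "Stromelysin": "HEIGH",
}

# Hypothetical sequence fragment, used only to show the call.
toy_sequence = "MKAAHEITHGGS"
print(find_hexxh(toy_sequence))

for enzyme, motif in TABLE_25_2.items():
    print(enzyme, bool(find_hexxh(motif)))
```

The positions returned are counted from the start of the string that is passed in, so they match the residue numbers of Table 25.2 only if the full mature sequence with the same numbering is supplied.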
In the body, zinc exists as the doubly charged cation Zn2+. This
positive charge is used by the enzyme for the amide cleavage. Ivano Bertini’s
group at the University of Florence, Italy, managed to determine the
high-resolution structures of the uncomplexed and product-inhibited
metalloprotease MMP-12. These structures allow the following mechanism to be
derived: In the uncomplexed metalloprotease, the zinc ion is octahedrally coordinated
by three water molecules in addition to three amino acid residues (His or
Glu). One of the water molecules forms an additional hydrogen bond to
a neighboring glutamate. This residue (Glu219 in the MMPs, Glu270 in carboxypeptidase,
and Glu143 in thermolysin) additionally polarizes the water molecule.
Therefore it probably exists as an OH− ion (Fig. 25.1). The peptide substrate
diffuses into the binding pocket and displaces the two other water molecules from
the zinc ion. The water molecule that is polarized by glutamate attacks the carbonyl
group of the substrate’s amide bond, which is to be cleaved; the substrate is held in
25.1 Structure of Zinc Metalloproteases 567
place by hydrogen bonds to the peptide backbone on the C-terminal side. A geminal
diol structure is formed at the reaction site that is stabilized by the now
pentacoordinated zinc ion. The actual cleavage of the amide bond is achieved,
and the two product molecules initially remain in the vicinity of the zinc.
568 25 Inhibitors of Hydrolyzing Metalloenzymes
Fig. 25.2 A crystal structure of MMP-12 with both cleavage products has been determined
(Fig. 25.1c). The cleavage product of the former N terminus (left, light-red carbon atoms)
coordinates with its newly formed carboxylic acid function through an oxygen atom to the zinc
ion and forms no hydrogen bonds to the enzyme itself. The cleavage product originating from the
C terminus (right, light-green carbon atoms) forms four H-bonds to the main chain of the protein
and binds with its P1′ residue in the deeply formed S1′ pocket. The released amino group
coordinates to Glu219 and the water molecule that is bound to the zinc ion.
The glutamate residue presumably adopts the role of the proton transfer agent in
this step. The cleavage product of the former N terminus is coordinated through an
oxygen atom of the newly formed carboxylic acid function to the zinc ion
(Fig. 25.2). However, it does not form any other hydrogen bonds to the protein.
On the other hand the cleavage product of the C terminus forms four hydrogen
bonds to the main chain of the enzyme, and its P1′ residue binds in the S1′ pocket.
The newly formed free amino group initially remains in the vicinity of the zinc ion.
It probably exists next to the zinc ion in an uncharged state. Then the cleavage
product that originated from the N terminus leaves the catalytic site. Presumably it
25.2 Key Step in the Design of Metalloprotease Inhibitors: Binding to the Zinc Ion 569
is displaced by water, which resumes its place at the zinc ion. Finally the C-terminal
product leaves the binding pocket.
Meanwhile, the three-dimensional structures of many zinc proteases have been
solved, among them those of the angiotensin-converting enzyme and many of the
interesting matrix metalloproteases such as collagenases, gelatinases, and
stromelysin (Table 25.1). Four subfamilies are known from the family of carbonic
anhydrases. The therapeutically most significant ones are the α-carbonic
anhydrases that fulfill important tasks in many organs and for which numerous
drugs have been developed.
The zinc ion plays a pivotal role in the catalytic mechanism. The known spatial
structures of metalloprotease–inhibitor complexes show that almost all highly
potent inhibitors contain functional groups that bind directly to the zinc ion. If
these groups are left out, the binding affinity drops significantly. Therefore the first
step in the design of new inhibitors must be to seek functional groups that can bind
to the Zn2+ ion particularly well. Different groups have been described in the
literature for this purpose and are summarized in Fig. 25.3. Phosphonamides –
PO2NH–, phosphonates –PO2O–, and phosphinates –PO2CH2– can all be seen as
transition-state analogues of the enzyme reaction. In fact a few potent
metalloprotease inhibitors are known, for instance, the natural product
phosphoramidon 25.1, that contain such a group. The relative binding strength of
different groups was investigated for carboxypeptidase A (Table 25.3). Likewise,
different zinc-binding groups were tested for endothelin-converting enzyme. The
results of these investigations are presented in Table 25.4.
Remarkable variability has been observed in the binding potency of functional
groups that interact with the Zn2+ ion. Attenuated partial charges on the zinc ion and on
the anchor group are probably responsible for this effect. The zinc ion itself can be
found in very different local environments (i.e., 3 His; 2 His/1 Glu or 1 Glu/1
Cys). Thiol groups, –SH, and hydroxamic acids, –CONHOH, are evidently
particularly well suited to provide strong binding to the metalloprotease. The latter
group binds as a bidentate ligand to the zinc ion. Carboxylic acids and ketones bind
more weakly to the zinc ion than the above-mentioned groups. Nonetheless, acids are
of particularly great interest because, in the form of esters, they often provide orally available
prodrugs (▶ Sect. 9.2). In contrast to phosphinates and phosphonic acids,
phosphonamides are chemically not particularly stable and therefore are not the first
choice in the development of a new drug. On the other hand, sulfonamides are
excellent zinc anchors, above all for the carbonic anhydrases.
How might potential active substances for metalloproteases be designed?
A comparison of known crystal structures (e.g., MMP-12, Fig. 25.2) shows that
the binding pockets in these proteins are much better defined on the primed side of
the cleavage site. Therefore, inhibitor design must concentrate on the S1′ and its
25.1 Phosphoramidon Ki = 2.8 × 10⁻⁸ M
Table 25.3 Binding of phenylpropionic acids 25.2 to carboxypeptidase. The strongest binding
was found with the thiol derivative.
R                 Ki (nM)
H                 6,200
CH2COOH           450
CH2S(=NH)2CH3     250
OP(=O)(OH)2       140
CH2SH             11
neighboring pockets on the primed side. Nevertheless, it has been shown that
occupancy of the S1 and S2 pockets can be very important to afford inhibitors
with adequate selectivity.
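The spread of affinities in Table 25.3 is easier to judge when every zinc-binding group is expressed as a gain over the unsubstituted compound (R = H). The short Python sketch below does this arithmetic; the Ki values are those of Table 25.3, while the dictionary layout and the function name are assumptions made for illustration.

```python
# Ki values in nM for the phenylpropionic acids 25.2, taken from Table 25.3.
KI_NM = {
    "H": 6200,
    "CH2COOH": 450,
    "CH2S(=NH)2CH3": 250,
    "OP(=O)(OH)2": 140,
    "CH2SH": 11,
}


def fold_gain(ki_by_group, reference="H"):
    """Return how many times more tightly each group binds than the reference group."""
    ki_ref = ki_by_group[reference]
    return {group: ki_ref / ki for group, ki in ki_by_group.items()}


gains = fold_gain(KI_NM)
# List the groups from weakest to strongest binder.
for group, gain in sorted(gains.items(), key=lambda item: item[1]):
    print(f"{group:15s} Ki = {KI_NM[group]:>5d} nM   gain over R = H: {gain:6.1f}x")
```

On these numbers the thiol binds several hundred times more tightly than the parent compound, whereas the carboxylic acid gains only about one order of magnitude.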
The choice of appropriate groups for these pockets is dictated by their chemical
composition. Moreover, the inhibitors must be fitted with an appropriate head
group, as described previously, to coordinate to the zinc ion. It is easily understood
from the above-described mechanism for peptide cleavage why the binding pocket
on the unprimed side is less well-established. After peptide cleavage, a peptide is
formed on this side that has a terminal acid function. Such a function is itself a good
25.3 Thermolysin: Tailored Design of Enzyme Inhibitors 571
Table 25.4 Inhibition of the endothelin-converting enzyme by tryptophan derivatives 25.3. The
hydroxamic acid (R = CONHOH) as well as thiol compounds have much better affinity than the
carboxylic acid derivatives.
R            Ki (µM)
CONHOH       24
CH2SH        12
COOH         >100
CH2COOH      >100
coordination group for zinc. If the N-terminal end of the cleaved peptide were
bound in a strongly pronounced pocket on the unprimed side, a self-inhibition of the
protease would result. Usually there is no interest in this property. If the cleaved
N terminus has weak affinity to the protease, this type of inhibition can occur at high
product concentration. This can be a desirable regulatory mechanism (feedback
regulation) of Nature.
with thermolysin. The Bartlett group then sought a rigid structural element to form
a scaffold in which the conformation of both leucine side chains remained
unchanged. Chromane 25.5 (Fig. 25.4) was chosen. The additional methyl group
on the ring had to be added for synthetic reasons.
A comparison of the binding constants of compounds 25.5 and 25.7 proves that the
rigidization caused by the chromane group increased the binding affinity by a factor
of 50. This corresponds to an energetic gain of about 10 kJ/mol. The X-ray structure
analysis of the macrocyclic ligand 25.5 shows that it binds as expected. Both leucine
side chains and the main-chain atoms are found in the same position as with
Cbz–GlyP–Leu–Leu (25.4, Fig. 25.5). Certainly the gain in binding energy is not
only a result of the rigidization of the ligand. The direct interaction of the chromane
group with the enzyme additionally contributes to the affinity. The goal of the synthesis
of 25.6 was to differentiate between the two effects of the rigidization and the affinity
gain from the chromane unit. Compound 25.6 binds 20 times more weakly to
thermolysin than 25.5. Nonetheless, the 3D structure shows that the open-chained
inhibitor binds to the enzyme in another conformation. This is a further example that
structures assumed to be very similar do not necessarily bind in the same way!
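The link between an affinity factor and an energetic gain quoted above follows from ΔΔG = RT ln(factor). The Python sketch below reproduces the estimate of about 10 kJ/mol for a factor of 50; the temperature of 298.15 K is an assumption, because the text does not state one, and the function name is hypothetical.

```python
from math import log

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def ddg_from_fold(fold, temperature_k=298.15):
    """Free-energy difference in kJ/mol that corresponds to a fold change in the binding constant.

    temperature_k defaults to 298.15 K, which is an assumption of this sketch.
    """
    return R_KJ_PER_MOL_K * temperature_k * log(fold)


# Factor of 50 between the macrocycle and its comparison compound.
print(f"factor 50: {ddg_from_fold(50):.1f} kJ/mol")
# Factor of 20 between 25.5 and the open-chain compound 25.6.
print(f"factor 20: {ddg_from_fold(20):.1f} kJ/mol")
```

Because the relationship is logarithmic, each further factor of 10 in affinity adds the same increment of roughly 5.7 kJ/mol at this temperature.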
Fig. 25.5 The 3D structure of a complex of thermolysin and Cbz–GlyP–Leu–Leu 25.4 (gray
carbon atoms). The leucine side chain (right) neighboring a phosphate group occupies the deep S1′
pocket that is pointed toward the interior of the protein; the second leucine residue lies in the
shallow S2′ pocket that is open to the surface. The Cbz group is oriented in the S1 pocket (left). The
macrocyclic inhibitor 25.5 (green carbon atoms) with the chromane scaffold locks the conformation
of 25.4 and places the leucine side chains in S1′ and S2′ in an analogous way. Although this
inhibitor leaves the S1 pocket completely unoccupied, it binds to thermolysin more strongly than
the open-chain compound 25.4.
His–Leu (Fig. 24.5, ▶ Sect. 24.2). The release of this octapeptide leads to an
increase in blood pressure. Furthermore, ACE catalyzes the degradation of the
blood-pressure-lowering nonapeptide bradykinin to inactive peptides and in so
doing, also acts indirectly to increase blood pressure. This means that the inhibition
of ACE can simultaneously prevent blood pressure increases by blocking multiple
mechanisms. In 1965 Sergio Henrique Ferreira and John Robert Vane isolated
a peptide mixture from the venom of a snake, Bothrops jararaca (the South
American pit viper), which prolonged the blood-pressure-decreasing effects of
bradykinin in that it inhibited a protease that degraded bradykinin in the body. It
was shown that this peptide (initially called bradykinin potentiating peptide, BPP)
also inhibits the transformation of angiotensin I into angiotensin II. Multiple
structurally related peptides were identified. The most active was teprotide, Pyr–
Trp–Pro–Arg–Pro–Gln–Ile–Pro–Pro (Pyr = pyroglutamic acid). This nonapeptide
was synthesized by Miguel Ondetti at the Squibb company. Teprotide is a potent
ACE inhibitor with a binding constant of Ki = 100 nM. In clinical trials it was shown
that the compound has not only a blood-pressure-decreasing effect in animal
574 25 Inhibitors of Hydrolyzing Metalloenzymes
[Figure labels: Zn2+, Arg145, Glu270]
Fig. 25.7 Comparison of the binding mode of the inhibitor benzylsuccinic acid and the peptidic
substrate to carboxypeptidase A. The inhibitor forms the same interaction to the enzyme as the
substrate. The amide group to be cleaved was replaced by a carboxylate group.
[Structure drawings: substrate (left) and inhibitor (right)]
Fig. 25.8 The development of ACE inhibitors: A comparison of the substrates with the inhibitors
that were investigated by Ondetti and Cushman. In the initially investigated structure, the amide
bond to be cleaved is replaced by a carboxylate group.
(IC50 = 300 μM). The replacement of the proline unit by another amino acid did
not improve the binding: Proline was already the optimal
amino acid. Next, the length of the acid side chain was optimized. Glutaryl-L-
proline (25.9) proved to be the best representative with a moderate improvement
in binding (IC50 = 70 μM). The introduction of a methyl group in the side chain
(25.10 and 25.11) gave a strong increase in the binding affinity by a factor of 15.
Finally, the replacement of the carboxylate with a thiol group (25.12 and 25.13)
afforded the breakthrough with an increase in the binding by more than two orders of
magnitude. The compound SQ 14225 (25.13), D-2-methyl-3-mercaptopropanoyl-
L-proline, binds to ACE with Ki = 1.7 nM and is orally available. SQ 14225 has
been marketed under the name captopril for many years and has proven itself as
an agent to treat high blood pressure. Because decreasing the blood pressure
significantly reduces stress on the heart, captopril has also been successfully
used for the treatment of congestive heart failure.
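The size of each optimization step can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the two IC50 values printed as 300 and 70 in the running text are micromolar (which agrees with the quoted factor of 15 and the 4.9 μM given for 25.11), that the assay temperature is about 298 K (not stated in the text), and that ratios of IC50 values approximate ratios of dissociation constants. The function names and the label "starting compound" are illustrative only.

```python
import math

# IC50 values as quoted in the text and figure, converted to mol/L.
# The first entry is the compound reported with IC50 = 300 uM; its name is
# not given in this excerpt, so the label below is a placeholder.
IC50_SERIES = [
    ("starting compound", 300e-6),
    ("25.9 glutaryl-L-proline", 70e-6),
    ("25.11 methyl derivative", 4.9e-6),
    ("25.13 captopril", 23e-9),
]

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K)
TEMPERATURE = 298.15     # K; assumed, the text states no assay temperature


def fold_improvement(ic50_old, ic50_new):
    """How many times lower the new IC50 is compared with the old one."""
    return ic50_old / ic50_new


def delta_delta_g(ic50_old, ic50_new, temperature=TEMPERATURE):
    """Rough free-energy gain in kcal/mol (negative = tighter binding).

    Assumes the IC50 ratio approximates the ratio of dissociation constants.
    """
    return -GAS_CONSTANT * temperature * math.log(ic50_old / ic50_new)


def summarize(series):
    """Return (from, to, fold, ddG) for each consecutive pair in the series."""
    rows = []
    for (name_a, ic_a), (name_b, ic_b) in zip(series, series[1:]):
        rows.append((name_a, name_b,
                     fold_improvement(ic_a, ic_b),
                     delta_delta_g(ic_a, ic_b)))
    return rows


if __name__ == "__main__":
    for name_a, name_b, fold, ddg in summarize(IC50_SERIES):
        print(f"{name_a} -> {name_b}: {fold:.1f}-fold, ddG = {ddg:.2f} kcal/mol")
```

Under these assumptions, each tenfold gain in affinity corresponds to roughly 1.4 kcal/mol at room temperature, which puts the individual steps on a common energy scale.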
The compounds that are shown in Fig. 25.10 prove that both a free SH group
and a free carboxylate group are necessary for the strong binding of captopril to
ACE. Esterification of the carboxyl group to 25.14 or S-methylation to 25.15
leads to a dramatic loss in affinity, just as the exchange of the amide group
[Structure drawings: 25.11, IC50 = 4.9 μM; 25.13 captopril, IC50 = 23 nM, Ki = 1.7 nM]
In his seminal publication in 1977 on the design of captopril, David Cushman once
again highlighted the importance of structural models for the work at Squibb:
The studies described above exemplify the great heuristic value of an active-site model in
the design of inhibitors, even when such a model is a hypothetical one. Only when suitable
information on substrate specificity and mechanism of action of an enzyme is available can
one make a reasonable working hypothesis with regard to complementary functionality
needed in an inhibitor.
Could he have dreamed that it would take another 25 years before this structure
became available? In 2003 the group of Edward Sturrock in Cape Town, South Africa
accomplished the structure determination. Could it confirm the previously proposed
model? Not all of the assumed binding modes for the inhibitors were correct, but the
structure delivered critical knowledge that brought life into the ACE-inhibitor research
area again. The human enzyme is extensively glycosylated. It is composed of 1,227
amino acids of an extracellular domain and is anchored in the cell membrane with 28
additional residues. Interestingly, it possesses two catalytic domains, a phenomenon that
is only very rarely seen in enzymes and has its origin in gene duplication. The
N-terminal domain contains 612 residues, and the C-terminal has 650. The two domains
are 60% identical. Both domains are catalytically active, and their catalytic sites differ
by only a few amino acids. Thus, a difference in selectivity for potential ligands is to be
expected. Moreover, the C-domain depends strongly on the local chloride concentration,
whereas the N domain does much less so. Aside from this so-called somatic form
(s-ACE), there is also a testis form (t-ACE), which is 701 amino acids long and is
composed of one domain. Except for the first 36 residues, it is almost identical to the
C domain of the somatic form. The structure determination with lisinopril 25.19
(Fig. 25.12) was accomplished with this form. The inhibitor binds with its central acid
group to the zinc ion. Its phenethyl group lies in the S1 pocket. The lysine-like group is in
the S1′ pocket and undergoes an interaction with Glu162. The proline part binds with its
acid group in S2′ to Lys511 and Tyr520. It was then interesting to model the differences
in the N and C domains of the s-ACE based on the t-ACE structure and to prepare the
[Fig. 25.11 structure drawings: 25.18 enalapril; 25.19 lisinopril; 25.20 spirapril; 25.21 perindopril; 25.22 ramipril; 25.23 trandolapril; 25.24 quinapril (R = H); 25.25 moexipril (R = OCH3); 25.26 cilazapril; 25.27 benazepril; 25.28 fosinopril]
[Fig. 25.12 labels: Glu162, Ala354, Zn2+, His353, His513, Lys511, Tyr520]
Fig. 25.12 Crystal structure of lisinopril 25.19 (Fig. 25.11) with t-ACE. The central carboxylate
group of the inhibitor coordinates to the zinc ion. The NH group on the lysine residue of lisinopril
forms an H-bond to the C═O group of Ala354 in the S1′ pocket, and the carbonyl group also forms
hydrogen bonds with His353 and His513. The terminal ammonium group of the lysine residue
forms an H bond to Glu162. The acid group of the proline residue forms an H-bond contact with
Lys511 and Tyr520. The phenethyl side chain is placed in the S1 pocket.
proteins by mutagenesis. Both domains bind lisinopril with very similar affinity
(Table 25.5). The S1 pocket and the S2 pocket, which is not occupied by lisinopril,
exhibit Tyr396, Asn494, and Thr496 in the N domain. A phenylalanine, a serine, and
a valine are in these positions in the C domain. Moreover, an asparagine is found in this
pocket in the N domain that limits the entrance to the S2 pocket through a glycosylation.
Therefore, it is not surprising that keto-ACE 25.29, with its bulky benzamido group,
interacts much better with the C domain. Two other compounds are known, RXP 407
25.30 and RXP A380 25.31, which bind to the two domains with a selectivity differ-
ence of 1,000-fold. They are derived from phosphinic acids. Because the zinc-binding
group lies in the center of the molecule, these inhibitors can occupy all of the pockets
from S2 to S2′ well. RXP A380 has a much larger group for the S2 pocket. Furthermore,
this molecule has an indole moiety in the P2′ position that can undergo stronger
hydrophobic interactions in S2′. At this site the C domain has an advantage over the
N domain: Instead of a serine, a hydrophobic valine is found at position 379 which
translates into stronger binding of the inhibitor to the C domain.
What advantage does the domain-specific inhibition of ACEs offer? The enzyme
not only transforms angiotensin I into II, it metabolically degrades the blood-
pressure-decreasing bradykinin as well. ACEs are assumed to be also involved in
the cleavage of other signal peptides. ACE inhibitors are usually well tolerated by
patients. Some adverse effects, however, have been described. For example, many
patients develop an unpleasant dry cough, and occasionally a life-threatening
angioedema (an acute swelling of the mucous membranes) can occur. It is
suspected that this is associated with the blocked degradation of the described
peptides, especially bradykinin. The catalytic activity of the C domain seems to
be responsible for the blood pressure regulation under in vivo conditions, and
angiotensin I is efficiently cleaved there. Bradykinin, on the other hand, is degraded
equally well by both domains. By using compounds that are selective for the
C domain, it might be possible to decrease blood pressure while leaving
a residual degradation of bradykinin intact. The excessive concentration of this
peptide could then be avoided. The structure determination of ACE therefore opens
a new perspective for the development of selective inhibitors that allow for an
efficient regulation of blood pressure according to an established principle. Hope-
fully, they will display fewer adverse effects.
25.6 Inhibitors of Matrix Metalloproteases
The family of matrix metalloproteases (MMPs) belongs to the neutral zinc endopeptidases.
They assume important roles in the construction and degradation of connective
tissue, for example after an injury or during angiogenesis (the proliferation of
blood vessels). In a healthy state, these proteases are kept in balance by tightly
controlled mechanisms. In this way, active proteases are released from inactive
precursors only when they are needed, or our bodies have adequate endogenous
inhibitors that can mediate the balance between matrix synthesis and matrix
degradation. In a disease state, this complex equilibrium is thrown off balance,
and different MMPs are produced in excess. Pathological situations ensue that are
associated with the construction and degradation of the extracellular tissue.
The etiology of rheumatoid arthritis is based on such chronic destructive
processes that lead to loss of bone and cartilage. Cartilage tissue is composed of
a glycoprotein matrix that is cross-linked and reinforced by collagen. The MMPs
cleave such scaffold proteins. In rheumatoid arthritis, the balance between matrix
synthesis and degradation is obviously lost. Excessive activity of the matrix
metalloproteases leads to an overwhelming degradation of the cartilage. Inhibiting
these proteases could therefore be a promising approach to treat rheumatoid
arthritis. The degradation of the extracellular matrix is also critical for the growth
of malignant tumors, the invasiveness of tumor cells, metastasis, and angiogenesis.
Therefore, the inhibition of MMPs could also lead to cancer therapy.
In the meantime, almost 30 MMPs are known, which include the collagenases
(MMP-1, -8, -13), gelatinases (MMP-2, -9), stromelysins (MMP-3, -10, -11),
matrilysin (MMP-7), macrophage metalloelastases (MMP-12, -19), and enamelysin
(MMP-20). The collagenases, gelatinases, and stromelysins recognize collagen as
a substrate. Collagen is composed of three intertwined, left-handed, α-helical
chains. Each individual chain is more than 1,000 amino acids long and contains
the repeating sequence –(Gly–X–Y)n–, in which the position X is usually occupied
by a proline or an alanine, and the position Y is usually occupied by
a hydroxyproline or an alanine. The collagenases cleave collagen in its native,
threefold helical structure, gelatinases cleave collagen in a denatured form, and it is
assumed that stromelysins cleave the proteoglycans.
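The repeating –(Gly–X–Y)n– pattern described above is easy to test for in a sequence. The sketch below is only an illustration: it assumes one-letter amino acid codes, uses the common convention "O" for hydroxyproline, and the example sequences in the usage note are made up rather than taken from a real collagen chain. All function names are hypothetical.

```python
def triplets(sequence):
    """Split a one-letter sequence into consecutive, complete triplets."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def gly_fraction(sequence):
    """Fraction of triplets that start with glycine (G)."""
    parts = triplets(sequence.upper())
    if not parts:
        return 0.0
    return sum(t[0] == "G" for t in parts) / len(parts)


def is_gly_x_y_repeat(sequence):
    """True if the whole sequence is a perfect (Gly-X-Y)n repeat."""
    seq = sequence.upper()
    return len(seq) > 0 and len(seq) % 3 == 0 and gly_fraction(seq) == 1.0


def position_counts(sequence):
    """Count the residues found at the X and Y positions of each triplet."""
    x_counts, y_counts = {}, {}
    for t in triplets(sequence.upper()):
        x_counts[t[1]] = x_counts.get(t[1], 0) + 1
        y_counts[t[2]] = y_counts.get(t[2], 0) + 1
    return x_counts, y_counts
```

For a made-up fragment such as "GPOGPAGAO", is_gly_x_y_repeat returns True, and position_counts shows proline and alanine at X and hydroxyproline and alanine at Y, in line with the preferences stated in the text.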
A series of different collagens are cleaved by collagenases between the glycine
and leucine or isoleucine residues. A substrate comparison between the species
human, cattle, mouse, and chicken showed that three amino acids to the right and
left of the cleavage site are conserved. Therefore, the N- or C-terminal-protected
hexapeptide Ac–Pro–Leu/Gln–Gly–Leu/Ile–Leu/Ala–Gly–OEt, for example, 25.32
(Fig. 25.13) is recognized as a minimal substrate. This established the starting point
for the design of collagenase inhibitors. The peptide bond to be cleaved in the
minimal substrate 25.32 is replaced with a non-cleavable isostere. The replacement
of the amide bond between Gly and Leu with a ketomethylene group –COCH2–,
a hydroxymethylene group –CH(OH)CH2– or a hydroxylamine derivative led to
inactive compounds in all cases. These groups are apparently unable to form
[Fig. 25.13 structure drawings: 25.32 minimal substrate (cleavage site marked); 25.33, IC50 = 70 nM; 25.34, IC50 = 10 mM; 25.35 Ro 31-4724, IC50 = 9 nM; 25.36 Ro 31-9790, IC50 = 5 nM; 25.37 marimastat; 25.38 batimastat]
Fig. 25.13 Collagenase inhibitors made from substrate analogues. Compound 25.32 covers the
substrate sequence from P3 to P3′. Replacement of the amide bond by a –PO2– group (25.33) leads to
a potent inhibitor. Compound 25.34 contains only the three amino acids prior to the cleavage site
as well as the C-terminal hydroxamic acid as a zinc-binding group. Compounds 25.35 and 25.36
contain the three or two amino acid side chains following the cleavage site in their structures; this
time they are augmented with an N-terminal hydroxamic acid group. The two inhibitors
marimastat 25.37 and batimastat 25.38 were in clinical trials for several years as compounds for
tumor therapy.
[Fig. 25.14 labels: Asn80, Glu119, Tyr140, His122, His128, Pro138, Zn2+, S1′, S2′, S3′]
Fig. 25.14 Crystal structure of Ro 31-4724 (25.35, IC50 = 9 nM) and collagenase; the binding
mode is shown. The hydroxamic acid binds in a bidentate-like manner to the zinc ion. Both amide
groups form hydrogen bonds to the enzyme. The leucine side chain of the inhibitor in the P1′
position fills the S1′ pocket, which is oriented toward the protein’s interior. The alanine methyl
group binds in the S3′ pocket, whereas the leucine side chain in position P2′ protrudes into the
solvent because the S2′ pocket is practically non-existent.
a favorable interaction with the zinc ion. The use of a phosphinate group finally gave
a potent collagenase inhibitor 25.33. However, if only the N terminal proline is
eliminated from this hexapeptide, the inhibitory activity is largely lost. The search
for collagenase inhibitors based on the N terminal tripeptide fragment led to modestly
active compounds such as 25.34. The synthesis of potential inhibitors that contain the
C terminal tripeptide sequence Leu–Leu–Gly–O-alkyl was much more successful.
The coupling of these structural elements with the potent head group hydroxamic
acid to bind the zinc ion gave collagenase inhibitors with nanomolar affinity such as
Ro 31-4724, 25.35 and Ro 31-9790, 25.36. The X-ray structure of 25.35 in complex
with human fibroblast collagenase was solved. As expected, the compound binds to
the zinc ion as a bidentate ligand. The leucine side chain in the P1′ position fills the S1′
pocket, and the alanine methyl group binds in the S3′ pocket. The leucine side chain
in position P2′, which should formally occupy the S2′ pocket, orients away from the
enzyme. The binding mode is shown in Fig. 25.14.
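The P/P′ labels used throughout this section follow the usual convention of numbering substrate residues outward from the scissile bond: P1, P2, P3 toward the N terminus and P1′, P2′, P3′ toward the C terminus, with the matching enzyme pockets called S1, S1′, and so on. A minimal sketch of this bookkeeping is given below. The function name is hypothetical, and the example in the usage note picks the first alternative at each variable position of the minimal substrate.

```python
def label_subsites(residues, n_before_cleavage):
    """Assign P.../P...' labels to residues around a cleavage site.

    residues: residue names in N-to-C order.
    n_before_cleavage: number of residues on the N-terminal side of the
    scissile bond.
    """
    if not 0 <= n_before_cleavage <= len(residues):
        raise ValueError("cleavage position lies outside the peptide")
    labels = {}
    for index, residue in enumerate(residues):
        if index < n_before_cleavage:
            # Non-primed side: numbering decreases toward the scissile bond.
            labels[f"P{n_before_cleavage - index}"] = residue
        else:
            # Primed side: numbering increases away from the scissile bond.
            labels[f"P{index - n_before_cleavage + 1}'"] = residue
    return labels


if __name__ == "__main__":
    # Minimal substrate Ac-Pro-Leu-Gly-Leu-Leu-Gly-OEt, cleaved between Gly and Leu.
    peptide = ["Pro", "Leu", "Gly", "Leu", "Leu", "Gly"]
    for label, residue in label_subsites(peptide, 3).items():
        print(label, residue)
```

With this numbering, the glycine before the scissile bond is P1 and the leucine after it is P1′, the residue that fills the deep S1′ pocket in the complex described above.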
Interestingly, exchanging the isobutyl side chain at position P2′ for a tert-butyl
group in 25.36 led to an increase in affinity, even though the group is not in direct
contact with the enzyme. This result is attributed to a conformational stabilization.
The voluminous tert-butyl group limits the mobility of the inhibitor so that the
[Fig. 25.15 structure drawings: 25.39 (R = Ph, 2-pyridyl, N-morpholinoethyl); 25.40–25.46; 25.47 tanomastat (BAY 12-9566); 25.48 CGS 27023A]
Fig. 25.15 Development candidates 25.39–25.48 from different companies as potent MMP
isoenzyme inhibitors. Hydroxamates, inverse hydroxamates, and carboxylates were used as
anchor groups for the zinc ion. Tanomastat 25.47 from Bayer and CGS 27023A 25.48 from
Novartis were clinically developed for several years.
Fig. 25.16 Crystal structure of the collagenase MMP-1 with two different inhibitors 25.49 and
25.50. Because of a conformational rearrangement of Arg214, the S1′ pocket can accommodate
their voluminous groups. This adaptive ability of the specificity pockets in MMPs makes
the development of selective inhibitors extremely difficult.
bicarbonate, or the back-reaction for the release of CO2. In total, four different
families of these enzymes are known, which are called the α-, β-, γ-, and δ-CAs.
Sixteen of these isoforms occur in mammals. Some are found in the cytosol and some
are membrane anchored. They are involved in many physiologically important
processes such as respiration, CO2/HCO3− transport between metabolizing tissue
and the lungs, pH homeostasis, electrolyte secretion, biochemical reactions that need
C1 building blocks, bone resorption and calcification, and tumor growth.
The zinc ion is found at the end of a funnel-shaped catalytic site in the α-carbonic
anhydrases. It is held in position by three histidine residues. A water molecule is found
at the fourth coordination site. This water is strongly polarized by coordination to the
Zn2+ and is most probably present as an OH− ion. Furthermore, the OH group of Thr199
provides a hydrogen-bond acceptor (Fig. 25.17). The proton of the OH
group of Thr199 forms an H-bond to the carboxylate group of Glu106. The water (or
OH− ion), which has strongly enhanced nucleophilicity, attacks a CO2 molecule, which
is positioned in a hydrophobic niche in the vicinity of Val121, Val143, and Leu198 at
the bottom of the binding pocket. One of the oxygen atoms of the CO2 finds a hydrogen-
bonding partner in the NH function of Thr199. The newly formed bicarbonate is
25.7 Carbonic Anhydrases: Catalysts of a Simple but Essential Reaction 587
Fig. 25.17 The catalytic site in α-carbonic anhydrases is found at the end of a funnel-shaped
binding pocket. There an OH− ion, which is coordinated to the Zn2+ ion, nucleophilically attacks
a CO2 molecule. Bicarbonate forms, which is held in place by Thr199 (left). A sulfonamide,
deprotonated at nitrogen, fits at the site of the bicarbonate in the very narrow binding pocket (right).
Because of the tetravalency of the sulfur, this site can be fitted with another substituent, as is shown
in the case above with a p-fluorophenyl group.
displaced from the temporarily pentacoordinated zinc ion, and a new water molecule
adopts its position at the zinc ion. A new catalytic cycle can begin. CAII is one of the
fastest enzymes known. The acquisition or removal of a proton is the rate-limiting step
in the reaction cycle. For this, carbonic anhydrases have a series of multiple histidine
residues that deliver the protons from the edge of the funnel-shaped binding pocket. At
the same time, this arrangement causes the funnel to appear amphiphilic. One side is
hydrophobic, and the other is hydrophilic. The very narrow area around the catalytic
zinc ion affords only enough space for CO2 and HCO3−. Putative inhibitors must be
able to form an equivalent interaction to the bicarbonate on the one hand, and on the
other hand they must occupy the funnel opening. In addition to ions such as cyanide,
thiocyanate, or isocyanate, it is above all the sulfonamides, sulfamates, and
sulfamides that have the appropriate head group for coordination in the catalytic site.
The amino group on these sulfur derivatives is acidic enough to easily release a proton
and coordinate to the zinc ion in a charged state, analogously to the OH− ion. The
remaining proton undergoes an interaction with the threonine OH group. An oxygen
atom of the SO2 function satisfies the NH function of the latter amino acid. The second
S═O group expands the coordination number at zinc to five. An aromatic carbon that is
part of a heterocyclic ring system is usually found at the fourth bond of the central sulfur
atom of most known inhibitors. In further examples there is another oxygen or nitrogen
atom as a linker to this heterocycle.
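The catalytic cycle described above can be summarized schematically in three steps. This is only a sketch of the sequence given in the text; the symbol E for the enzyme and the way the zinc-bound species are written are notational assumptions, and the last step stands for the proton transfer relayed by the histidine residues.

```latex
\begin{align*}
\mathrm{E{\cdot}Zn^{2+}{-}OH^{-}} + \mathrm{CO_2}
  &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}HCO_3^{-}} \\
\mathrm{E{\cdot}Zn^{2+}{-}HCO_3^{-}} + \mathrm{H_2O}
  &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}OH_2} + \mathrm{HCO_3^{-}} \\
\mathrm{E{\cdot}Zn^{2+}{-}OH_2}
  &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}OH^{-}} + \mathrm{H^{+}}
\end{align*}
```

A deprotonated sulfonamide takes the place of the zinc-bound hydroxide or bicarbonate in the first two species, which is why it blocks the cycle.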
In the case of carbonic anhydrases, the coordination of the ligands to the zinc ion
in the catalytic site is essential for good binding. In this way, small ligands such as
phenyl sulfonamide 25.51 or its isostere thiophene-2-sulfonamide 25.52 achieve
submicromolar inhibition of carbonic anhydrase II (Fig. 25.18). More than 50 years
ago, the replacement of these aromatic rings with other heterocycles led to the first
marketed products, which were introduced to therapy as sulfonamides under the
[Fig. 25.18 structure drawings: 25.51, Ki = 300 nM; 25.52; 25.53 acetazolamide; 25.54 methazolamide; 25.55; 25.56 MK 927, Ki = 0.7 nM; 25.59 azosemide; 25.60 furosemide; 25.61 brinzolamide]
Fig. 25.18 The small aromatic sulfonamides 25.51 and 25.52 bind to carbonic anhydrase II with
submicromolar affinity. By exchanging a heterocycle, acetazolamide 25.53 and methazolamide
25.54 are obtained. Both drugs were used for a long time as systemic carbonic anhydrase inhibitors
for diuresis and for the treatment of glaucoma. Compound 25.55 was the first topically active, that
is, useable as eye drops, CA inhibitor. The structure-based design of new inhibitors led to the
marketed product dorzolamide 25.57 by way of 25.56. Compounds 25.58–25.61 are further drugs
that inhibit carbonic anhydrases and are used for the treatment of glaucoma or as a diuretic. Even
celecoxib 25.62, topiramate 25.63, and the artificial sweetener saccharin 25.64 inhibit carbonic
anhydrases and this explains some of their observed side effects.
Fig. 25.19 cAMP 25.65 and cGMP 25.67 are hydrolyzed into their open-chain analogues AMP
25.66 and GMP 25.68, respectively, by phosphodiesterases.
[Figure labels: Gln817, Phe820, Zn2+, His653, Phe786, Mg2+]
Fig. 25.20 Crystal structure of sildenafil 25.69 (Fig. 25.21) in PDE 5. The pyrazolopyrimidinone
moiety of the inhibitor is recognized by Gln817 through two parallel hydrogen bonds and binds to
the catalytic zinc ion (blue-gray) through a water molecule. It is found in the vicinity of
a magnesium ion (light green), which is coordinated by five water molecules and Asp654.
A bridging water molecule is shared by Mg2+ and Zn2+.
natural substrate cGMP. The relationship with cGMP is even more apparent when it
is considered that the 2-phenyl-substituted purines such as 25.72 served as lead
structures (Fig. 25.21). The pyrazolopyrimidine 25.73 or imidazotriazinone 25.74
that are contained in sildenafil and vardenafil, respectively, were developed from it.
The chemically closely related vardenafil adopts a binding mode very similar to that of
sildenafil. On the other hand, the structurally deviating tadalafil adopts a distinctly
different orientation.
The discovery of the effects of sildenafil was once again accomplished by
serendipity. The compound was in clinical trials at Pfizer for the treatment of
angina pectoris. It proved, however, to be no better than the classic nitro
compounds (e.g., nitroglycerin or isosorbide dinitrate). These nitro derivatives
release NO under reductive conditions, which stimulates guanylate cyclase.
cGMP is then formed, which in turn exerts an influence on vascular constriction.
A phosphodiesterase inhibitor also increases the cGMP level because it blocks the
degradation of this second messenger. In the clinical trials, however, a side effect
proved to be remarkable in the male subjects: It stimulated penile erections. NO is
released into the cavernous body of the penis, and increased cGMP is produced by
activation of guanylate cyclase. This causes increased blood flow to the cavernous
body and stimulates penile erection. Sildenafil amplifies the effect by inhibiting the
degradation of cGMP. In 1998 sildenafil was approved for the treatment of erectile
dysfunction. The market accepted Viagra euphorically. By 2005, more than
25.8 Zinc and Magnesium in the Catalytic Centers of Phosphodiesterases 593
Fig. 25.21 Sildenafil 25.69, vardenafil 25.70, and tadalafil 25.71 represent potent PDE 5 inhib-
itors. The first two compounds were developed from phenyl-substituted purines such as 25.72, and
modified to pyrazolopyrimidines such as 25.73 or imidazotriazinones such as 25.74.
177 million prescriptions in 120 countries around the world had been registered.
In addition to PDE 5, PDE 6 is also inhibited by sildenafil, vardenafil, and tadalafil.
This isoform is involved in visual processes, which explains why the
use of these drugs is accompanied by visual disturbance. Tadalafil has better
selectivity over PDE 6, but inhibits PDE 11 in addition to PDE 5. Another
clinical application has been approved for sildenafil and tadalafil. They are used
in intensive care units to prevent and treat pulmonary hypertension in mechanically
ventilated patients.
Do PDE 5 inhibitors have another career? What helps men apparently also gives
cut flowers more stamina. According to experiments by Heribert Warzecha at the TU
Darmstadt, cut daisies stay fresh longer when Viagra® is added to the water in the
vase! Aspirin®, however, is less expensive and it also supposedly keeps cut flowers
fresh longer. In another study it was found that hamsters could reset their circadian
rhythm faster when they had Viagra® in their blood. A higher cGMP level apparently
helps the internal clock to more easily adjust to changes in external conditions.
Whether Viagra also helps to overcome jetlag after long-distance travel remains to be
demonstrated. The examples show that no drug is without side effects. Often these are
only discovered after some time in clinical trials or after practical use.
What makes zinc so special that it preferentially occurs in the catalytic site of so
many enzymes? Zinc is an ion that often occurs in biological systems. This is also
valid for an element like iron. Zinc exists as a doubly positively charged ion. This,
however, is also achieved by other ions such as Fe2+, Co2+, Ni2+, or Cu2+. In
contrast to the latter-named elements, the zinc ion is not redox sensitive because
of its filled d-orbitals. If the reaction mechanism of an ester or amide cleavage is
considered, aside from the coordination properties, only the charge of the metal ion
is critical. It serves to polarize a water molecule that initiates the nucleophilic attack
on the carbonyl carbon atom of the ester or amide to be cleaved. This task can also
be assumed by other metal ions. In fact, under reductive conditions, hydrolyzing
enzymes can be found that have an iron instead of a zinc ion in the catalytic site.
New polypeptide chains that are synthesized in prokaryotes, mitochondria, or
plastids initially carry a methionine at the first position of the N terminus that is
substituted with a formyl group. In other compartments of more complex organisms
the same proteins are formed without these formyl groups. The methionine is
cleaved by a methionine aminopeptidase in about a third of all mature proteins.
The formyl group must be removed so that the formylated chains can also undergo
this process. This is achieved by peptide deformylases (PDFs). They carry an Fe2+
ion in their catalytic site and are therefore exceedingly sensitive to oxidation. An
exchange for Ni2+ or Co2+ is achieved only with a drastic concomitant loss in
catalytic activity. On the other hand, the exchange of iron for a Zn2+ ion leads to
a complete loss in enzymatic function in almost all PDFs. Peptide deformylases
occur in bacteria as well as plastids from plants and some parasites. Initially it was
thought that these enzymes do not occur in humans, so that they would seem to be
an ideal target structure for an antibacterial or antiparasitic therapy. In the
meantime, PDFs have also been discovered in the mitochondria of animals and in
humans. This must be considered when developing antibiotics based on PDF
inhibitors. The potent inhibitor actinonin 25.75 (Fig. 25.22) has not only
antibacterial effects but also inhibits proliferation in human cells. This can lead to
cytotoxic side effects, but can also be exploited for antineoplastic effects. More-
over, these inhibitors have importance as herbicides.
The iron ion is tetrahedrally coordinated by two histidines and one cysteine
in PDFs (Fig. 25.22). A water molecule occupies the fourth position. The pKa
value of this water molecule is drastically shifted by the direct coordination to the
metal ion, and the water gains nucleophilicity because it is easily deprotonated.
It presumably attacks the formyl peptide group to be cleaved as a hydroxide ion.
[Fig. 25.22 labels: Arg97, His132, Fe2+, Ile44]
The mechanism is very similar to that of the proteases. The carbonyl carbon
atom of the formyl group being cleaved adopts a tetrahedral transition state.
For this, the charge that forms on the oxygen is stabilized by an NH of the main
chain, a terminal carboxamide group of a glutamine, and coordination to the
iron. The amino group of the bond to be cleaved is bound to a glutamate by an
H-bond. The polypeptide chain is cleaved with concomitant release of the
N terminus. The remaining formate group leaves the coordination site at the
metal ion and dissociates from the enzyme. Two water molecules take its place at
the catalytic site. Inhibitors of this enzyme have hydroxamate groups to anchor
them to the iron ion. Because the natural peptide substrate has a methionine in the
P1′ position, n-alkyl chains with four or five carbon atoms on inhibitors are ideal
in the same position. The S1′ pocket is well-formed in the PDFs, but the surrounding
pockets are not well-characterized. This is because of the function of the
proteins. A broad palette of formylated substrates can be processed, that is, after
the formyl methionine the amino acid sequence is arbitrarily composed. Interestingly,
thiorphan also inhibits PDFs. This underscores that the thiol group can
coordinate to the iron atom. The benzyl group of the inhibitor fills the S1′ pocket
of the enzyme.
25.9 Synopsis
Bibliography
General Literature
Beckett RP, Davidson AH, Drummond AH, Huxley P, Whittaker M (1996) Recent advances in
matrix metalloproteinase inhibitor research. Drug Discov Today 1:16–26
Fersht A (1985) Enzyme structure and mechanism. W. H. Freeman, New York, p 416
Rich DH (1990) Peptidase inhibitors. In: Hansch C, Sammes PG, Taylor JB (eds) Comprehensive
medicinal chemistry, vol 2, Enzymes & other molecular targets. Pergamon Press, Oxford,
pp 391–441
Türk B (2006) Targeting proteases: successes, failures and future prospects. Nat Rev Drug Discov
5:785–799
Special Literature
Acharya KR, Sturrock ED, Riordan JF, Ehlers MRW (2003) ACE revisited: a new target for
structure-based drug design. Nat Rev Drug Discov 2:891–902
Baldwin JJ, Ponticello GS, Anderson PS et al (1989) Thienothiopyran-2-sulfonamides: novel topically
active carbonic anhydrase inhibitors for the treatment of glaucoma. J Med Chem 32:2510–2513
Bertenshaw SR et al (1993) Thiol and hydroxamic acid containing inhibitors of endothelin
converting enzyme. Bioorg Med Chem Lett 3:1953–1958
Bertini I, Calderone V, Fragai M, Luchinat C, Maletta M, Yeo KJ (2006) Snapshots of the reaction
mechanism of matrix metalloproteinases. Angew Chem Int Ed 45:7952–7955
Borkakoti N, Winkler FK, Williams DH, D’Arcy A, Broadhurst MJ, Brown PA, Johnson WH,
Murray EJ (1994) Structure of the catalytic domain of human fibroblast collagenase complexed
with an inhibitor. Nat Struct Biol 1:106–110
598 25 Inhibitors of Hydrolyzing Metalloenzymes
Cushman DW, Cheung HS, Sabo EF, Ondetti MA (1977) Design of potent competitive inhibitors
of angiotensin-converting enzyme. Carboxyalkanoyl and mercaptoalkanoyl amino acids.
Biochemistry 16:5484–5491
Hu J, van den Steen PE, Sang Q-XA, Opdenakker G (2007) Matrix metalloproteinase inhibitors as
therapy for inflammatory and vascular diseases. Nat Rev Drug Discov 6:480–498
Jain R, Chen D, White RJ, Patel DV, Yuan Z (2005) Bacterial peptide deformylase inhibitors:
a new class of antibacterial agents. Curr Med Chem 12:1607–1621
Matter H, Schudok M (2004) Recent advances in the design of matrix metalloprotease inhibitors.
Curr Opin Drug Discov Devel 7:513–535
Matthews BW (1988) Structural basis of the action of thermolysin and related zinc peptidases. Acc
Chem Res 21:333–340
Morgan BP, Holland DR, Matthews BW, Bartlett PA (1994) Structure-based design of an inhibitor
of the zinc peptidase thermolysin. J Am Chem Soc 116:3251–3260
Porter JR, Beeley NR, Boyce BA et al (1994) Potent and selective inhibitors of gelatinase-A,
1. Hydroxamic acid derivatives. Bioorg Med Chem Lett 4:2741–2746
Rotella DP (2002) Phosphodiesterase 5 inhibitors: current status and potential applications.
Nat Rev Drug Discov 1:674–682
Supuran CT, Scozzafava A (2000) Carbonic anhydrase inhibitors and their therapeutic potential.
Expert Opin Ther Pat 10:575–600
Supuran CT, Mastrolorenzo A, Barbaro G, Scozzafava A (2006) Phosphodiesterase 5 inhibitors –
drug design and differentiation based on selectivity, pharmacokinetic and efficacy profiles.
Curr Pharm Des 12:3459–3465
26 Transferase Inhibitors
At the end of the 1970s, evidence accumulated that proteins are not only
synthesized at the ribosomes, but can also undergo subsequent
changes after their synthesis. In addition to glycosylation, the attachment of phosphate
groups to alcohol functions on serine, threonine, and tyrosine residues occurs.
Later it was recognized that even histidine can be phosphorylated. Moreover it was
shown that the degree of phosphorylation of a protein can dramatically change with
time in the cell. Cellular reproduction proved to be strongly dependent on these
changes. It was therefore natural to correlate phosphorylation with intracellular
signaling processes. ATP was established as the source of the transferred phosphate
groups. However, the terminal phosphate group of ATP cannot be so
easily transferred to an amino acid. This reaction is kinetically too slow in
aqueous solution. Therefore, Nature developed efficient catalysts for this task: the
protein kinases. On the other hand, the cleavage of a phosphate group from
a phosphorylated amino acid is also a very slow process. This process therefore
requires efficient enzymes, for which the phosphatases are available. In this way,
protein phosphorylation is a reversible process that can be “switched” in both
directions by the above-named enzyme classes (Fig. 26.1). Although these enzymes
catalyze very general reactions, their substrate recognition is highly specific. It is
only in this way that the signal transduction processes are precisely controlled and
the protein function is switched on and off.
The palette of posttranslational modifications is still not exhausted with these
examples. Each newly synthesized protein carries an N-formylmethionine at its
N terminus. Initially this formyl group is cleaved by a deformylase (▶ Sect. 25.9)
before a methionine aminopeptidase removes the methionine residue from the
peptide chain of many proteins. The attachment of sugar residues (glycosylation)
not only improves the solubility and proteolytic stability of the protein, it also
especially serves to label proteins with crucial recognition characteristics for
signaling and intracellular transport processes. Above all, sugar residues are of
crucial importance for cell–cell recognition and interactions with the extracellular
matrix (▶ Sect. 31.3). The transglutaminases, which posttranslationally crosslink
proteins by forming isopeptide bonds through glutamate and lysine side chains,
[Fig. 26.1, scheme: on signal input, a kinase transfers a phosphate group P from ATP to the protein substrate (switch on, signal transmission); a phosphatase removes it again (switch off).]
Fig. 26.1 The posttranslational phosphorylation of proteins is critical for regulating intracellular
signal processes; for example, cellular reproduction is strongly dependent on these processes.
A phosphate group P is transferred from ATP (green) to the alcohol function of a serine, threonine,
or tyrosine. This task of turning on protein function is performed by the kinases. Conversely,
phosphate groups can be cleaved from a phosphorylated amino acid again. The protein function is
turned off by this step.
were discussed in ▶ Sect. 23.9. Transferases can also transfer alkyl groups. For one
family of transferases, it is methyl groups that are used to modify residues. For others, it
is a prenyl group that is transferred, the terpene anchor of which can be used to
immobilize proteins at the membrane (Sect. 26.10). Finally, ubiquitin and SUMO
should be mentioned. Ubiquitin is a polypeptide chain that labels proteins for
proteolytic degradation in the proteasome (▶ Sect. 23.8). SUMO is also a small
protein that can be attached to proteins and exerts an influence on, for example,
processes in the cell nucleus.
In the case of a disease, it initially sounds very attractive to use drug therapies to regulate
enzymes that act as switches in signaling cascades. In ▶ Sect. 12.4 kinases were
identified as enzymes that are often involved in disease processes. In eukaryotes,
about 30% of all proteins are reversibly phosphorylated. The electrostatic proper-
ties of the protein are changed by attaching a phosphate group, conformational
rearrangements are induced, and new binding sites can be formed. The design of
kinase inhibitors was initially focused nearly exclusively on a competitive displace-
ment of ATP from its binding site. But not only kinases use ATP as a substrate. This
molecule is the most important energy-transfer system in cellular metabolism.
Many cofactors use ATP as a building block to fulfill their cellular tasks. There
are about 2,000 proteins in the human genome that use ATP as a substrate in
a variety of ways. At 0.01 M, the intracellular concentration is very high. Overall,
the physiological turnover of ATP is 75 kg per day in an adult! In light of this
situation, the question as to how specifically and selectively a binding site of
a particular kinase can be blocked by an inhibitor is justified: The same substrate
ATP is transformed by each of these enzymes, and its cellular concentration is very
high. The problem is further complicated in that Nature has established redundancy
in many of these processes as a failsafe. If one signal transduction pathway is
blocked, a similar pathway can serve as a replacement in that it produces more of
its own phosphorylated proteins. In this way, it contributes to the correction of the
deficit caused by the blocked function. This is especially true for signaling cascades
in which many different, structurally similar kinases and phosphatases are used to
transmit information. Up until the early 1990s, all of these problems were considered
to be so complex and unsolvable that anyone who tried to develop selective
kinase inhibitors as drugs was considered crazy. In the meantime, the tables have
been completely turned. Today, a pharmaceutical company that does not work on
multiple kinase projects is considered to be backward and not innovative! Until
now, no other protein family has been investigated with so much fervor. What
brought about this change of heart that led to a pharmaceutical “kinase gold rush”?
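The competition with cellular ATP described above can be put into numbers with the Cheng-Prusoff relation for a purely competitive inhibitor, IC50 = Ki (1 + [ATP]/Km). This relation is not given in the text; it is used here only as a standard approximation. The ATP concentration of 0.01 M is taken from the text, whereas the Km and Ki values below are hypothetical and chosen purely for illustration. A minimal Python sketch under these assumptions:

```python
def apparent_ic50(ki_molar: float, atp_molar: float, km_atp_molar: float) -> float:
    """Cheng-Prusoff estimate of the IC50 of a purely ATP-competitive inhibitor."""
    return ki_molar * (1.0 + atp_molar / km_atp_molar)


ATP_CELL = 0.01          # mol/L, intracellular ATP concentration quoted in the text
KM_ATP_ASSUMED = 50e-6   # mol/L, hypothetical Km of a kinase for ATP (assumption)
KI_ASSUMED = 1e-9        # mol/L, hypothetical Ki of an inhibitor (assumption)

# Factor by which the inhibitor appears weaker in the presence of cellular ATP
shift = apparent_ic50(KI_ASSUMED, ATP_CELL, KM_ATP_ASSUMED) / KI_ASSUMED
print(f"IC50 shift at cellular ATP: {shift:.0f}-fold")
```

With these assumed values the inhibitor would appear about 200-fold weaker in the cell than in an assay without competing ATP, which illustrates why high intrinsic affinity is needed for ATP-competitive inhibitors.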
Protein kinases represent one of the largest target families in the human genome.
More than 530 kinases switch the most diverse signaling pathways in our bodies
on and off and transform proteins from inactive to active states. They are related to
one another in varying degrees by their sequence and structure, and are divided into
subfamilies based on a family tree (Sect. 26.3). Kinases can also be regulated by
further binding partners. Allosteric binding sites and second messengers that
intervene in the regulation of kinase function are known. Inhibitory or activating
proteins (e.g., cyclins) control kinase activation via complexation to the kinase
domains. The autophosphorylation of kinases exerts an important influence on
their conformation and the correct positioning of the catalytic residues for the
transfer of the γ-phosphate group of ATP to the amino acids serine, threonine,
tyrosine, or histidine (Fig. 26.2). The conserved architecture of protein kinases is
shown in Fig. 26.3. The N-terminal domain is constructed from five β-pleated
sheets. The C-terminal domain is overwhelmingly α-helical and contains the
Fig. 26.2 Kinases transfer phosphate groups (red) to the alcohol function of serine, threonine, or
tyrosine (black, peptide strand is blue).
substrate-binding site. The two domains are connected by the so-called hinge region.
This contains the recognition motif for the adenosine moiety of ATP. The ribose
building block and the triphosphate group are bound in a crevice between the two
domains and are coordinated by a magnesium ion, which is essential for the transfer
mechanism. The activation loop with the DFG (Asp–Phe–Gly) and APE (Ala–Pro–Glu)
motifs, which lies next to the catalytic site, is also important for the mechanism.
26.3 Isosteric with ATP, and Selective Nonetheless?
A detailed analysis of the binding sites for ATP in a large number of kinases
afforded a surprising and promising picture: There are indeed unoccupied regions
in the vicinity of the ATP-recognition site that are different for individual kinases!
Two hydrophobic regions open up, one deep toward the interior of the kinase, and
a second on the opposite side toward the surface (Fig. 26.5). The aminopyrimidine
ring of adenine forms two adjacent hydrogen bonds to the peptide main chain in the
hinge region of the kinase. A third interaction site on the polymer chain remains
unused by ATP 26.1, but can in principle be involved in an interaction with
a ligand’s donor function. The design of ATP-competitive kinase inhibitors has
uncovered many interaction motifs that address the hinge region. They have been
incorporated into many clinically tested kinase inhibitors (26.2–26.21, Fig. 26.6).
[Fig. 26.4, scheme and structure: ATP (adenine, Pα, Pβ, Pγ) with two Mg2+ ions between the hinge region and the DFG loop, next to the substrate chain; labeled residues: Arg18, Ser21, Phe54, Asp166, Lys168, Asp184, Phe187.]
Fig. 26.4 Based on the crystal structure of a cAMP-dependent kinase with a bound ADP and
aluminum trifluoride as a transition-state mimetic, the reaction steps of the phosphate group
transfer from ATP (red) to the serine residue of the substrate (blue) can be modeled. Asp184
from the DFG loop is coordinated to the β- and γ-phosphate groups of ATP via a magnesium ion.
An additional Mg2+ helps to position the three phosphate groups correctly. Serine 21, which is to
be phosphorylated, nucleophilically attacks the terminal γ-phosphate group, and the phosphoryl
group is transferred with formation of a trigonal bipyramidal intermediate. The neighboring
Asp166 takes the proton from Ser21 OH during this reaction step. At the same time, the positively
charged residues Lys168 of the kinase and Arg18 of the substrate stabilize this intermediate.
The ubiquitous H-bonding pattern of the hinge region, which occurs in all kinases,
makes it difficult to confer selectivity on ligands. Nonetheless, certain MAP
kinases (mitogen-activated protein kinases, signal transduction pathways in cell
differentiation, cell growth, and cell death) provide the chance to design inhibitors
with interesting selectivity, which is associated with a conformational change in the
Fig. 26.5 Schematic overview of the recognition site of ATP 26.1 in kinases (so-called Traxler
model). The adenine moiety is recognized in the hinge region by two parallel hydrogen bonds from
the peptide strand. A third carbonyl group is available for interactions, but is not involved in the
ATP binding. Kinases with a glycine residue at this position can switch an exposed acceptor
function for a donor function at this third position by folding over an amide bond (left). Two
differently composed pockets open in the kinases next to the ATP-binding site, the so-called front
and back pocket. The latter-named pocket is bordered by the gatekeeper residue. The residues in
this pocket are not involved in ATP binding. The phosphate-binding site is found adjacent to it.
hinge region. The orientation of the amide bond in this region is spatially
exchanged so that a donor function instead of an acceptor function is oriented
toward the bound ligands (Fig. 26.5). The flip of the amide bond is possible in these
kinases because a glycine is present in the neighboring position. Because glycine
lacks a side chain on its Cα atom, this residue can access a much larger conformational
space. Inhibitors with a dihydroquinazolinone scaffold as a mimic for the adenine
motif of ATP can induce this conformational flip. In the altered protein conformation,
they can bind selectively to kinases that carry a glycine in this position of their
sequences. If an amino acid with a side chain is found at this position, as it is in other
kinases, the rearrangement cannot be induced. Inhibitors that require this conforma-
tional flip to produce the specific H-bond pattern with the hinge region will therefore
only bind with reduced affinity to the latter kinases. The required conformational
rearrangement of the main chain is not possible in those cases for steric reasons.
The occupancy of the hydrophobic pockets on both sides of adenine’s binding
site (Fig. 26.5) is a generally applicable concept to render kinase inhibitors
selective. The pocket that is found deep in the protein (the so-called back pocket)
has amino acids in its front part that can have very different properties in different
kinases. These are called gatekeeper residues. For example, a threonine is found in
the gatekeeper position in the p38α and p38β kinases. A much larger methionine
residue is found in the same position in the structurally similar p38γ and p38δ
kinases (Fig. 26.7). The compound SB203580 26.3 has a p-fluorophenyl group in
the 5-position of its imidazole ring. The steric demand of this group is just small enough that
there is sufficient space for its uptake into the binding pocket next to the threonine.
26.2 SB202190, 26.3 SB203580, 26.4 SP600125, 26.5 Imatinib, 26.6 VX-745,
26.7 BIRB-796, 26.8 BAY43-9006 (Sorafenib), 26.9 GW-2016 (Lapatinib), 26.10 Gefitinib, 26.11 Erlotinib,
26.12 CI-1033, 26.13 EKB569, 26.14 ZD-6474 (Vandetanib), 26.15 Vatalanib, 26.16 SU11248 (Sunitinib)
Fig. 26.6 Marketed products and development candidates of ATP-competitive kinase inhibitors 26.2–26.20; staurosporine 26.21 is a natural product.
All substances bind through hydrogen bonds to the peptide bonds in the hinge region of the kinases.
[Fig. 26.7, structure: 26.3 SB203580 in the binding pocket; labeled residues: Tyr35, Thr106, Leu108, Met109, Asp168. Gatekeeper residues: p38α,β Thr; p38γ,δ Met; Erk1,2 Gln; JNK1,2,3 Met.]
Fig. 26.7 The kinases p38α and p38β have threonine (Thr106, violet) as gatekeeper residues;
a sterically more demanding methionine is in this position in the structurally related p38γ and p38δ
kinases. SB203580 26.3 binds with its p-fluorophenyl group at the central imidazole ring in a small
niche next to the threonine (green surface, interior is blue). The activity is significantly reduced on
other kinases with more voluminous amino acids in this position (Met, Gln) because of steric
conflicts.
On the other hand, a methionine in this position requires so much space that there
is insufficient room for the p-fluorophenyl group, and the affinity of 26.3 markedly
decreases.
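The gatekeeper argument above can be written down as a simple lookup. The following Python sketch uses only the gatekeeper assignments given in Fig. 26.7; the function name and the rule "only threonine leaves room for the p-fluorophenyl group" are illustrative simplifications, not a validated selectivity model.

```python
# Gatekeeper residues as listed in Fig. 26.7
GATEKEEPER = {
    "p38alpha": "Thr", "p38beta": "Thr",
    "p38gamma": "Met", "p38delta": "Met",
    "Erk1": "Gln", "Erk2": "Gln",
    "JNK1": "Met", "JNK2": "Met", "JNK3": "Met",
}


def back_pocket_accessible(kinase: str, small_residues=frozenset({"Thr"})) -> bool:
    """Hypothetical rule: the niche is free only if the gatekeeper is small."""
    return GATEKEEPER[kinase] in small_residues


# Kinases expected to accommodate the p-fluorophenyl group of 26.3
accessible = sorted(k for k in GATEKEEPER if back_pocket_accessible(k))
print(accessible)
```

Such a table-driven check is only a first filter; real affinities depend on the full shape of the pocket, not on one residue alone.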
Analogously, 26.22 benefits from a binding advantage on the p90 ribosomal S6
kinase (RSK) because its p-tolyl group has enough space in a large pocket that is
gated by threonine as well as a neighboring cysteine (Fig. 26.8). The combination of
a Thr and Cys residue in these two positions has only been discovered in three kinases
in our genome. If a reactive fluoromethylene group is introduced as in 26.22, this group
can react with the neighboring cysteine to form a stable, covalent bond to the protein.
Another concept for the development of selective inhibitors exploits the confor-
mational adaptation of kinases. During the course of their activation, kinases go
through multiple steps on the way from the inactive to the active conformation.
Interestingly, kinases have a high degree of structural homology in the active state
in which ATP is bound. Inhibitors that exhibit high affinity to the active conforma-
tion are less selective as inhibitors than those that stabilize the inactive conforma-
tion. This is because the differences in the inactive conformations are significantly
greater. Therefore, the particular goal is to develop inhibitors that bind to an
inactive state of the kinase (Sect. 26.4).
[Fig. 26.8, scheme: 26.22 in the ATP-binding site; labeled regions: hinge region, back pocket, front pocket, phosphate pocket, and ribose pocket.]
Fig. 26.8 With its p-tolyl group, 26.22 achieves selective binding to the p90-ribosomal S6 kinase
because it finds a sufficiently large niche next to the threonine gatekeeper residue. Because of this,
the neighboring fluoromethylene group is placed in the vicinity of a cysteine residue with which
the inhibitor can subsequently react. In this way, a strong covalent bond is formed with the kinase.
The necessary arrangement of the Thr and Cys residues has been discovered in three kinases in our
genome so that 26.22 achieves high selectivity for kinases with this amino acid composition in the
back pocket.
Well into the 1980s, drug development for cancer therapy was almost exclusively
concentrated on processes that intervene in DNA synthesis or cell division. This led
to the development of antimetabolites, alkylating compounds, microtubule
disruptors, and inhibitors for DNA synthesis. These strategies attempt to attack
target cells with a very high rate of division such as cancer cells. The disadvantage of
such a chemotherapy is the massive adverse effects that severely limit the treated
patients’ quality of life. In 1960, Peter Nowell and David Hungerford were the first
to recognize that chronic myeloid leukemia comes from a specific genetic modifi-
cation. This defect causes about 15% of all leukemia cases. Chronic myeloid
leukemia represents the second most common form of chronic leukemia and is
caused by a severe proliferation of white blood cells, in particular the granulocytes.
A reciprocal translocation between chromosomes 9 and 22 causes chromosome 22
to be shortened. This is termed the Philadelphia Chromosome. The exchange has
the result that the so-called BCR-ABL fusion gene is generated, which codes for
a protein with constitutively activated tyrosine kinase activity. This protein
belongs to the group of receptor tyrosine kinases (▶ Sect. 29.8) and plays an
important role in the regulation of cell growth. Uncontrolled proliferation is the
result of unregulated activation, and the cell becomes a tumor cell. It has been
Fig. 26.10 Inhibition profile of the inhibitors 26.2–26.21 that were shown in Fig. 26.6 for 113
different kinases. The size of the red circle quantifies the strength of the inhibition. The data are shown
on the kinase family tree. In this diagram, the branching and the length of the individual branches
denotes the degree of relatedness between the members of the kinase families. The longer the distance
in the dendrogram is, the smaller the degree of relatedness. The natural product staurosporine 26.21 is
a largely unselective inhibitor, whereas 26.9 and 26.15 inhibit a few kinases very selectively.
Abbreviations: TK non-receptor tyrosine kinase, RTK receptor tyrosine kinase, TKL tyrosine
kinase-like kinase, CK casein kinase family, PKA protein-kinase-like family, CAMK calcium/cal-
modulin-like kinase, CDK cyclin-dependent kinase, MAPK mitogen-activated kinase, CLK CDK-like
kinase (from M.A. Fabian et al. 2005, with kind permission from the author and publisher).
shown based on further leukemia models that this gene is responsible for causing
this type of cancer. Therefore, it seemed that the increased kinase activity as a result
of the misregulated gene is responsible for the disease. It should be possible to
intervene in this overmodulation with a pharmaceutical therapy. As a result,
a program for the development of selective inhibitors of ABL-tyrosine kinases
was undertaken at Ciba-Geigy.
26.4 Gleevec ®: Success Stories Breed Copycats! 611
Fig. 26.11 By starting with the PKC kinase inhibition screening hit 26.23, multiple development
steps afforded imatinib 26.5.
In the 1980s, several companies had already initiated the search for protein kinase
C inhibitors. Phenylaminopyrimidine (26.23, Fig. 26.11) was identified by a screening
campaign as a well-suited lead structure. The compound was derivatized (e.g., 26.24)
and initially optimized as a PKC inhibitor. It was noticed that the introduction of a
methyl group in position 6 (i.e., 26.26) completely reversed the kinase inhibition. This
“magic” methyl group influences the conformation between the central aromatic ring
systems, which are coupled through an amino group. In the binding mode observed
with ABL-tyrosine kinase, the inhibitor adopts an extended conformation, and the
methyl group contributes to a twisted arrangement between the two ring systems.
Compound 26.26 proved to be ideal for the inhibition of members of this family
of tyrosine kinases. Initially, this derivative had inadequate oral bioavailability and
water solubility. Therefore an attempt was made to improve these properties by
introducing polar groups such as an N-methylpiperazine group. Compound 26.5
proved to be optimal; it passed all phases of clinical trials and was introduced into
therapy in 2001 as imatinib (Gleevec®). The compound selectively blocks the
BCR-ABL receptor tyrosine kinase and prevents phosphorylation of the substrate
proteins of this kinase. Later it was determined that other kinases, for example,
the related c-Kit and PDGF receptor kinases, are also inhibited.
Why did imatinib develop into such a success story? First of all the development
of this inhibitor represented an entirely new approach to cancer therapy. Ultimately
a cancer variant was being treated by a selective therapy. The drug showed very few
side effects. Therapy with this compound, however, is not cheap. In short order, it
evolved into a blockbuster for Novartis, and achieved more than a billion Euros in
sales per year. In light of the therapy as well as the sales, such a success story had
a highly stimulating effect on the area of kinase research. Success stories breed
copycats! The original pessimism about the selectivity problems and redundancy in
the kinases seemed to have blown over. But experience has shown how difficult it is
to write a similar success story. In the meantime over ten kinase inhibitors have
been introduced to the market for different indications (mostly cancer therapy).
[Fig. 26.12, structure, two views: 26.5 with labeled residues Thr315, Phe317, Asp381, Phe382, and the DFG loop.]
Fig. 26.12 Two views of the superimposed crystal structure of imatinib 26.5 and tetrahydro-
staurosporine 26.27 (Fig. 26.13) with the active (green) and inactive (red) states of the BCR-ABL
receptor tyrosine kinase. Whereas 26.5 binds to the inactive form of the kinase, the unselective
inhibitor 26.27 blocks the active conformation. With the so-called magic methyl group, imatinib
orients in the direction of the gatekeeper residue Thr315. The amino group that is found between
both rings forms a hydrogen bond to its OH group.
There are also follow-up compounds for imatinib, but no other compound has been
able to achieve a similar economic or therapeutic success.
The binding of imatinib to the kinase stabilizes an inactive enzyme conforma-
tion. The DFG loop, which is critical for the catalytic mechanism, remains in
a conformation that is oriented outward (Figs. 26.9, 26.12). The inhibitor’s
N-methylpiperazine group, which was initially introduced to improve the solubility,
adopts a position that would be occupied by this loop in the active state. Conse-
quently, this group is decisive for the binding mode adopted by 26.5. A structural
comparison of the kinase in complex with imatinib 26.5 and tetrahydrostaur-
osporine 26.27 (Fig. 26.13) is shown in Fig. 26.12. The latter inhibitor stabilizes
the enzyme in its active conformation. The DFG loop takes on a completely
different course; in consequence the DFG sequence motif is oriented toward the
interior. The magic methyl group in the 6-position of the central phenyl ring of 26.5
forces a perpendicular arrangement of this ring relative to the neighboring pyrimidine
ring. This geometry enables favorable hydrophobic contacts to the gatekeeper
residue Thr315, and a hydrogen bond is formed between the NH group that
Fig. 26.13 Nilotinib 26.28, which has a resistance-breaking profile, was developed as a follow-up
compound for imatinib 26.5. This compound binds with almost the same binding mode, but with
stronger affinity to the BCR-ABL kinase. Dasatinib 26.29, which was developed at Bristol-Myers
Squibb, also binds to this kinase, but adopts an entirely different binding mode.
connects the two rings and the hydroxyl group of this threonine. The combination of
an optimal interaction with Thr315 and a potent binding to the inactive conforma-
tion of the protein provides imatinib’s selectivity advantage. c-Kit is the only other
kinase to which imatinib has a pronounced affinity. This is explained by the high
sequence homology of this kinase with the BCR-ABL kinase in the DFG loop and
in the ATP-binding region. In both cases, the gatekeeper residue is threonine.
Meanwhile, cases of resistance to imatinib have developed. The observed
mutations desensitize the kinase to imatinib inhibition. To date about 30 mutations
have been described. They are a consequence of single base pair exchanges in the
genetic code and have developed from multiple cell populations in which the
exchanges happened purely by chance or have been influenced by oxidative
damage to the DNA. These variants have established themselves under the selection
pressure of imatinib blockage. The most commonly observed resistance mutation is
caused by an exchange of the gatekeeper residue Thr315 for isoleucine. Because of
the larger size of the exchanged amino acid, the inhibitory effects of imatinib fail.
Moreover, the hydrogen bond can no longer form. The affinity drops from
Ki = 85 nM to 10 μM. In the hinge region, Phe317 forms aromatic contacts with
the pyridine ring of the inhibitor. Mutation of this residue to a leucine causes a loss
in the aromatic interactions and reduces the binding affinity by a factor of 3. Most of
the other observed mutations are rationalized in that the conformation of the kinase
is shifted more in the direction of the active conformation. Consequently, the
selectivity advantage of imatinib, which is caused by its potent binding to the
inactive conformation, becomes a disadvantage in terms of susceptibility to resistance
mutations. Novartis has introduced a follow-up drug for imatinib, the structurally
similar nilotinib (Tasigna®) 26.28 (Fig. 26.13), which shows an improved
resistance profile. With the exception of the mutant Thr315 → Ile, it shows good
affinity to all of the described resistance-imparting exchanges and stabilizes the
inactive conformation of the kinase. With its altered side chain exhibiting a
trifluoromethyl-substituted aromatic ring and an imidazole motif, nilotinib fits
into the preformed binding pocket better and achieves a higher binding affinity.
The affinity advantage is presumably the reason for its diminished susceptibility to
resistance because small shifts from the inactive to the active conformation are
better tolerated. Another compound, dasatinib (Sprycel®) from Bristol-Myers
Squibb, can also circumvent the observed resistance to imatinib. It adopts an
entirely different binding mode with the BCR-ABL kinase. Therefore it is also
reasonable to assume that it has a different selectivity profile than imatinib and
nilotinib; for example, it also binds to kinases of the Src family.
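The affinity losses quoted above for the resistance mutations can be converted into free-energy penalties with ΔΔG = RT ln(Ki,mutant / Ki,wild type). The following Python sketch is only an illustration: the temperature of 298.15 K is an assumption, and the Ki of the Thr315 → Ile mutant is taken as 10 μM.

```python
import math

R = 8.314462618   # J/(mol K), gas constant
T = 298.15        # K, assumed temperature (not stated in the text)


def ddg_kj_per_mol(affinity_ratio: float) -> float:
    """Free-energy penalty that corresponds to a fold loss in binding affinity."""
    return R * T * math.log(affinity_ratio) / 1000.0


KI_WILD_TYPE = 85e-9   # mol/L, Ki of imatinib for the wild type (from the text)
KI_T315I = 10e-6       # mol/L, Ki for the Thr315 -> Ile mutant, taken as 10 micromolar

fold_t315i = KI_T315I / KI_WILD_TYPE
ddg_t315i = ddg_kj_per_mol(fold_t315i)
ddg_f317l = ddg_kj_per_mol(3.0)   # Phe317 -> Leu: factor of 3 (from the text)

print(f"Thr315 -> Ile: {fold_t315i:.0f}-fold, {ddg_t315i:.1f} kJ/mol")
print(f"Phe317 -> Leu: 3-fold, {ddg_f317l:.1f} kJ/mol")
```

Under these assumptions the gatekeeper exchange costs roughly 12 kJ/mol, whereas the loss of the aromatic contact to Phe317 costs less than 3 kJ/mol.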
[Fig. 26.14, structures: ATP and the enlarged adenosine triphosphate 26.30.]
Fig. 26.14 In the context of the bump and hole method, the back pocket of a kinase is enlarged by
exchanging the gatekeeper residue (yellow) for smaller amino acids (e.g., Thr → Gly). The altered
kinase can then recognize a chemically modified ATP 26.30 with an enlarged side chain, which
can subsequently be used as a phosphorylating reagent for the protein substrate.
[Fig. 26.15, scheme: panels a–e show the wild-type and the mutant kinase with ATP, the protein substrate, and an inhibitor; structures of 26.31 and 26.32.]
Compound   IC50 (wild type)   IC50 (mutant)
26.31      28000 nM           4.2 nM
26.32      1000 nM            1.5 nM
Fig. 26.15 (a) The wild type of a kinase activates a protein substrate by transferring a phosphate
group. (b) If a potent inhibitor is added, the phosphorylation is inhibited. (c) Exchanging
a gatekeeper residue for a smaller amino acid such as glycine does not change the catalytic
activity of this kinase. (d) If an inhibitor that has an enlarged substituent to fill out the pocket
next to the gatekeeper residue is added to a wild-type kinase it can barely bind to the wild type
because of steric conflicts. (e) It could, however, block the kinase with the enlarged pocket. The
two inhibitors 26.31 and 26.32 hardly block the wild type at all, but are able to efficiently inhibit
the kinase whose binding pocket has been enlarged by the modified gatekeeper residue.
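The preference of 26.31 and 26.32 for the mutated kinase can be expressed as the ratio of the IC50 values given in Fig. 26.15. The short Python sketch below computes these ratios; the dictionary layout and the function name are chosen only for this illustration.

```python
# IC50 values in nM as given in Fig. 26.15
IC50_NM = {
    "26.31": {"wild_type": 28000.0, "mutant": 4.2},
    "26.32": {"wild_type": 1000.0, "mutant": 1.5},
}


def selectivity_ratio(entry: dict) -> float:
    """Fold preference for the mutant: IC50(wild type) / IC50(mutant)."""
    return entry["wild_type"] / entry["mutant"]


ratios = {cpd: selectivity_ratio(values) for cpd, values in IC50_NM.items()}
for cpd, ratio in ratios.items():
    print(f"{cpd}: about {ratio:.0f}-fold selective for the mutant")
```

Both compounds prefer the kinase with the enlarged pocket by roughly three orders of magnitude or more, with 26.31 being the more selective and 26.32 the more potent of the two.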
26.6 Metals Teach Kinase Inhibitors Selectivity 617
a model organism. The yeast genome codes for 120 kinases, of which many are
related to kinase families in mammals. One such case is the Cdc28 protein kinase
(cyclin-dependent kinase) in yeast. It plays an important role in yeast reproduction
and drives special phases of the cell cycle. It exhibits 62% sequence identity with
a comparable enzyme, CDK2, in humans. To demonstrate the high specificity of the
inhibitors 26.31 and 26.32 for the mutated kinase, the altered protein had to be
incorporated in the genome of the yeast. This was accomplished with established
retroviral methods in molecular genetics (▶ Sect. 12.14). Finally, it had to be shown
that the cells of the genetically modified yeast exhibited normal growth. Only
a 20% longer doubling time was observed. Next, the inhibitor 26.32 was
added to the cells of the wild-type yeast and the genetically modified yeast. The
cell growth of the wild-type yeast remained unaffected, except at an inhibitor
concentration above 50 μM, at which a longer doubling time was observed. On
the other hand, the yeast with the modified cdc28 gene showed a strong dependence
on 26.32 under in vivo conditions. Growth was reduced by 50% at concentrations
as low as 50–100 nM; at 500 nM growth was completely arrested. Evidently,
the inhibitor blocks the cells at the step before mitosis (cell nucleus division
during cell replication) because the phenotype of these inhibited cells appeared
very similar to that of cells in which the mitotic cyclins (proteins with a key function in the
control of the cell cycle) were turned off. Individual processes in the cell cycle can
be investigated by using this method; above all, the phase in which a specific
inhibitor intervenes can be determined. This information is of critical importance
for the development of a therapeutically valid drug. Usually at the beginning of
a project though, adequately selective inhibitors are not yet available that would
allow this type of specific study. This problem is particularly pronounced when
many proteins with high homology are found in the cell. The bump-and-hole
method, a combined chemical–genetic technique, allows a specific therapeutic
validation of the biological relevance of the target protein as well as the optimization
of the inhibitor class that is intended for development, in a model organism and in
an early phase of the project.
Metals and metal ions play an important role in biological systems, especially as
catalytic centers. Zinc and calcium ions can contribute to the crosslinking and
stabilization of proteins by acting as multidentate ligands (cf. zinc finger proteins,
▶ Sect. 28.2). Magnesium ions often serve as a kind of charge buffer to counteract
the electrostatic contribution of the strongly negatively charged phosphate groups.
As described in Sect. 26.2, they are involved in the phosphate-transfer mechanism
from ATP to the hydroxyl groups of Ser, Thr, or Tyr. In rare cases, metals serve as
a component of a ligand that binds to the biomolecule. One example is the
magnesium ion that is so tightly coordinated to the β-hydroxyketo group of
tetracycline 26.33 that it remains bound during complex formation on the ribosome
or on the tet-repressor. Another example is cisplatin 26.34, which induces
[Figure 26.16: structures of tetracycline 26.33 with its chelated Mg2+ ion, cisplatin 26.34, and ruthenium complexes derived from staurosporine. 26.36: IC50 = 3 nM; 26.37: IC50 = 50 μM; 26.38: IC50 > 300 μM.]
Fig. 26.16 Examples of protein ligands that bind to the metal center of the protein. Tetracycline
26.33 chelates magnesium ions so strongly that protein binding of this ligand is achieved together
with the Mg2+ ion. Cisplatin 26.34 binds through substitution of the chlorine atoms by the basic
nitrogen atoms of the nucleotide bases of DNA. Replacement of the sugar moiety in staurosporine
26.21 led to the chelating ruthenium complex 26.35. Such complexes proved to be potent kinase inhibitors
(e.g., 26.36).
[Figure 26.17: labeled elements: hinge region, Leu120, Val126, Asp186, and the Ru center with its C≡O ligand.]
Fig. 26.17 Superposition of the crystal structures of the complex of PIM-1 kinase with the
unselective inhibitor staurosporine 26.21 (light blue) and the selective ruthenium carbonyl complex
26.36 (olive green). The binding geometry is almost identical in both cases. In 26.36, the
carbonyl group is opposite to the β strand that runs above the binding pocket.
frog and zebra fish embryos. Time will tell whether such metal complexes do, in fact,
open a new perspective for drug development or whether they serve as interesting
probe molecules for basic research on signaling pathways. They certainly hold an
answer to the specific question of how selective kinase inhibitors can be developed,
but that answer has yet to be delivered.
Fig. 26.18 Two catalytic mechanisms have been described for the cleavage of phosphate groups
from serine, threonine, and tyrosine in peptide substrates. The first group (a) uses two metal ions
(presumably Zn2+ and Mn2+ or Mg2+), which are coordinated by a histidine or aspartic acid. A
water molecule (presumably in the form of an OH group) nucleophilically attacks the phosphate
group of the substrate and initiates the cleavage. The second class of phosphatases begins the
cleavage reaction with a nucleophilic attack by the thiolate group of a cysteine (b, above). The pKa
value of this cysteine is severely shifted by the dipole moment of a helix that is pointing toward the
site that accommodates the thiol group and the reaction starts from a deprotonated cysteine.
Finally, a water molecule initiates the cleavage of the phosphate group from cysteine (b, lower).
metal ions with two of its oxygen atoms. The intermediate collapses with transient
formation of a pentacoordinated phosphorus atom. The bond between the hydroxyl
oxygen atom of the Ser or Thr residue and the phosphate group is cleaved.
A neighboring histidine assists the cleavage by providing the required proton.
The reaction is reminiscent of the reaction mechanism of phosphodiesterases
(▶ Sect. 25.8). The second group of phosphatases does not use a metal ion for the
cleavage reaction, but rather a covalent intermediate is formed during the course of
the reaction (Fig. 26.18b). These phosphatases cleave phosphate groups from
tyrosine residues. The formation of a very deep, 9-Å-long binding pocket is
characteristic for the latter phosphatases. It is completely established only after
the substrate is bound. A loop that contains a tryptophan, proline, and aspartic acid
(WPD loop) lies over the catalytic site and closes it to the outside. It contributes the
catalytically important aspartic acid and is critical for substrate recognition
(Fig. 26.18). In a closed, substrate-bound state the aspartic acid forms an H-bond
with the phenolic oxygen atom of the phosphotyrosine residue. The phosphate group
is polarized by this interaction and is prepared for nucleophilic attack. This is
Table 26.1 Examples for phosphatases that have been recognized as target structures for drug
therapy
Family Description Disease, therapeutic approach
pSer, pThr PP1, PP2A Tumor suppression
PP2B, PP2C Cystic fibrosis
(Calcineurin) Immune suppression
Asthma
Cardiovascular disease
pTyr PTP1B Diabetes, obesity
CD45 Alzheimer’s disease
SHP Neuroprotection
Dual-specific VHR Regulation of MAP phosphatases; stimulation of the cell
phosphatases Cdc25 kinases cycle; cancer therapy
Adult-onset type-II diabetes and obesity are diseases that have increased alarmingly
in our society in recent years. They must be considered typical diseases of
civilization. Adult-onset diabetes is based on increasing insulin resistance,
which is observed as a reduced ability of the cells in the target organ to respond to
insulin. As a consequence, high blood insulin levels occur even at normal blood
sugar concentration. Because of the resistance, the cells no longer respond as
required to the signal that insulin would elicit in a healthy person. Insulin causes
glucose uptake from food into the liver cells, where glucose is stored in the form of
glycogen. If resistance increases, pathophysiological changes
occur based on inadequate insulin control. The uptake of blood sugar into tissues
and the release of sugar from the liver go awry. As a result, the blood sugar level
increases even more, and this can manifest itself in the form of complications such
as coronary heart disease, retinopathy, cataracts, and vascular disease.
The other disease of civilization is much more visible: there are more and
more obese people. The sign is a disproportionate excess of body mass. The fact
that obesity is in no way limited to a particular age is even more alarming. Even among the young,
the number of cases of obesity is increasing dramatically. There are estimates that
by the year 2015, 75% of all adults in industrialized countries such as the USA will
be overweight, and 40% defined as obese. The percentage is also distinctly increasing
in developing countries. Of course, this has something to do with our altered
lifestyles. An overabundance of food, often without any dietary fiber, coupled with
a lifestyle that demands ever-decreasing amounts of physical labor has caused this
development. Furthermore, a genetic predisposition contributes to the development
of obesity. Interestingly, type-II diabetes and obesity very commonly develop
together, and in so doing increase the health risks for the patient. The
resulting combination of symptoms is called metabolic syndrome. For this diagnosis,
the following criteria apply: an abdominal girth of more than 80 cm in
a woman or 90 cm in a man, and two of the following additional factors: an
elevated triglyceride level (>150 mg/dL), an elevated fasting blood sugar level
(>100 mg/dL), arterial hypertension (>130/85 mmHg), and a reduced HDL
cholesterol level (<40–50 mg/dL; ▶ Sect. 27.3). The costs to society that come as
a consequence of this increased health risk can barely be estimated today. They are
most likely dramatic. Therefore, great effort has been made in the search for drug
therapies that can counteract metabolic syndrome and its consequences.
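The criteria for metabolic syndrome listed above form a simple decision rule: one obligatory criterion (abdominal girth) plus two further factors. The sketch below encodes that rule as stated in the text. Three points are assumptions made only for illustration: the HDL range "<40–50 mg/dL" is read as 40 mg/dL for men and 50 mg/dL for women, the blood pressure criterion is counted as met when either the systolic or the diastolic value exceeds its limit, and "two of the following" is read as "at least two". The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    sex: str                      # "f" or "m"
    waist_cm: float               # abdominal girth
    triglycerides_mg_dl: float
    fasting_glucose_mg_dl: float
    systolic_mmhg: float
    diastolic_mmhg: float
    hdl_mg_dl: float


def has_metabolic_syndrome(p: Patient) -> bool:
    """Apply the criteria quoted in the text (see the assumptions in the lead-in)."""
    if p.sex not in ("f", "m"):
        raise ValueError("sex must be 'f' or 'm'")

    # Obligatory criterion: girth above 80 cm (women) or 90 cm (men)
    waist_limit = 80.0 if p.sex == "f" else 90.0
    if p.waist_cm <= waist_limit:
        return False

    # Assumption: sex-specific reading of the "<40-50 mg/dL" HDL range
    hdl_limit = 50.0 if p.sex == "f" else 40.0

    additional_factors = [
        p.triglycerides_mg_dl > 150.0,
        p.fasting_glucose_mg_dl > 100.0,
        # Assumption: either pressure value above its limit counts
        p.systolic_mmhg > 130.0 or p.diastolic_mmhg > 85.0,
        p.hdl_mg_dl < hdl_limit,
    ]
    # Assumption: "two of the following" is read as "at least two"
    return sum(additional_factors) >= 2
```

Keeping the obligatory girth criterion separate from the list of additional factors mirrors the structure of the definition and makes it easy to adjust a single threshold.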
The correlation between insulin resistance and obesity is not fully understood at
the molecular level. Insulin is indeed a hormone that is related to fat metabolism
and that exerts an influence on the fat reserves. For example, it influences the
storage of fat, and an insulin deficiency leads to weight loss. Insulin binds to the
insulin receptor, which undergoes autophosphorylation by its tyrosine kinase
domain as a response to this signal (▶ Sect. 29.8). This initiates a cascade of
multiple kinases that ends in the synthesis of the sugar-storing glycogen. The
synthesis of fatty acids and proteins is also induced. Dephosphorylation of the
insulin receptor attenuates its function. The cleavage of phosphate groups from two
[Figure residue: labeled residues Arg24, Tyr46, Arg47, Asp48, Asp181, Phe182, Cys215, Arg221, and Arg254; catalytic site.]
entrance, so that the top of the catalytic site is opened (Fig. 26.20b). Abbott
additionally applied their SAR-by-NMR technique (▶ Sect. 7.8) to discover potential
binders for the second binding site. Small aromatic acids such as 26.46–26.48
were discovered. Coupling such moieties (e.g., naphthylcarboxylic acids) with
the already-known mimetic 26.45, which binds to the catalytic center, produced the
nanomolar inhibitor 26.49 (Ki = 22 nM, Figs. 26.20c, 26.21). This second binding
site was decisive for the lead structure optimization. At Novo Nordisk the initial
oxalic acid derivatives on the thiophene ring were expanded by using Asp48 as an
additional anchor point to arrive at more potent and selective inhibitors based on
scaffold 26.50 (Fig. 26.21).
The development of highly potent, PTP-1B-selective, and orally available
inhibitors was overshadowed by another observation. It had been suspected on
the basis of sequence comparisons that there is another phosphatase, the T-cell
protein tyrosine phosphatase TCPTP, which exhibits high similarity to PTP-1B.
[Figure 26.20: four panels (a–d) of the PTP-1B binding site; labeled residues: Arg24, Lys41, Tyr46, Asp48, Phe182, Arg254, and Gln262.]
Fig. 26.20 (a) Binding mode of the substrate-analogous phosphotyrosine (26.39, Fig. 26.21) in
human PTP-1B. The phosphate group binds deeply in the catalytic site (green). The two hydrophobic
amino acids Phe182 and Tyr46 form a narrow entry portal to the catalytic center. A second
phosphotyrosine (pink) is found in the crystal structure that binds to Arg24 and Arg254. (b) Crystal
structure of an aromatic oxalic acid derivative (26.54) that was developed at Abbott to occupy the
catalytic site (green). The compound induces a rearrangement of the Phe182 side chain and opens
the catalytic site to the top. (c) By chemically coupling an aromatic carboxylic acid that was
discovered with the SAR-by-NMR method as a binder for the second binding site (pink) and
a mimetic to occupy the catalytic site, a nanomolar inhibitor 26.49 (Fig. 26.21) was obtained.
(d) To achieve selective binding to PTP-1B compared to the structurally very similar TCPTP,
structural differences at position 41 were exploited (light blue outlining). There PTP-1B has
a lysine, and the related family member TCPTP has an Arg in this position. The nanomolar
inhibitor 26.51 achieves a significant selectivity advantage.
[Figure 26.21: structures 26.39–26.51, with R = H2N-CH-COOH. 26.45: PTP-1B Ki = 39 μM, TCPTP Ki = 44 μM. 26.49: PTP-1B Ki = 22 nM, TCPTP Ki = 44 nM. 26.51: PTP-1B Ki = 4.9 nM, TCPTP Ki = 20 nM.]
Fig. 26.21 Starting from a substrate with a terminal phosphotyrosine 26.39, a hydrolytically
stable compound 26.40 was developed. A fragment screening drew attention to the two mimetics
26.41 and 26.42. Thiophene derivatives such as 26.43 were designed from the latter compound.
Aromatic carboxylic acids such as 26.46–26.48 were discovered by screening with the SAR-by-NMR
method as ligands for the second binding site. By chemically linking such aromatic
carboxylic acids as binders for the second binding site and a mimetic for the phosphotyrosine in
the catalytic site, 26.49 was obtained as a nanomolar inhibitor. Lead structures were also fitted with
side chains for the second binding site (26.50) at Novo Nordisk. Compound 26.51 is
a PTP-1B inhibitor with fourfold selectivity over TCPTP.
The need was great. Where do differences between the structures of the two
phosphatases occur that could be exploited to develop sufficiently selective compounds?
All of the inhibitors developed up to that time showed almost equipotent
affinity for both proteins. Bidentate inhibitors such as 26.51 (Fig. 26.21), which
were reported in 2003, proved to be very interesting because they occupy the
catalytic site and neglect the second binding site (Fig. 26.20d). Even the sequence
of this region proved to be identical to that of TCPTP. With their somewhat altered
orientation, the new inhibitors address a lysine residue (Lys41) that is an arginine
in TCPTP. At the least, the nanomolar inhibitor 26.51 has a distinct selectivity
advantage for PTP-1B compared to TCPTP. In 2004, the Sunesis company reported
the discovery of an allosteric binding site 20 Å away on the back side of the
catalytic site in PTP-1B. An inhibitor was developed for this site that binds with
micromolar affinity to the enzyme. It blocks its function by preventing the closure
of the WPD loop. In this way, the loop cannot fold over the substrate-binding site,
and essential residues such as the catalytically active aspartic acid are not brought
into the vicinity of the substrate. The most potent ligand from this series, 26.52
(IC50 = 8 μM), wraps itself around a phenylalanine that is found there, as proven by
the crystal structure (Fig. 26.22). In the structurally analogous TCPTP, a cysteine is
found at this position and forms entirely different interactions with the aromatic
groups of this ligand. Due to the deviating interaction pattern, this compound
achieves TCPTP inhibition only at 280 μM. Perhaps blocking this allosteric binding
site will open a new perspective for the selective inhibition of PTP-1B. The future
must show whether the severe selectivity problem can be resolved in an appropriate
way. Therefore, all hopes of blocking this, at first glance, ideal target are currently
focused on the antisense nucleotide ISIS 113715 from Isis Pharmaceuticals that is
undergoing clinical trials (▶ Sect. 32.4).
The methyl transferases, which shift methyl groups to other biomolecules, form
a large family of transferring enzymes. The DNA methyl transferases represent an
important group in this family. Their task is to chemically change the nucleobases
of DNA at particular positions by transferring methyl groups (▶ Sect. 12.13).
This methylation does not lead to changes in the genetic code, that is, the same
amino acids are translated as before. It acts, however, as a kind of DNA-strand
labeling that can, for instance, allow the recognition of foreign DNA or the
differentiation of the original from newly synthesized strands. Another group of
methyl transferases transfers methyl groups onto oxygen, nitrogen, or sulfur atoms in
small biomolecules. Methyl transferases use S-adenosyl-L-methionine (SAM 26.53)
as a cofactor (Fig. 26.23). The highly reactive methyl group on the sulfonium group is
transferred during the transmethylation reaction.
Inhibitors of catechol-O-methyltransferase (COMT) have gained importance in
pharmaceutical therapy. This enzyme deactivates the endogenous function of
catecholamines such as dopamine, adrenaline, or noradrenaline in that it transfers
26.9 Inhibitors of Catechol-O-Methyltransferase 629
[Figure residue: structures 26.52, 26.54, and 26.53 S-adenosyl-L-methionine (SAM). Labels from Fig. 26.23: Asp141, Lys144, Asp169, Asn170, Glu199, Mg2+, and SAM with its S–CH3+ group.]
Fig. 26.23 The crystal structure of COMT with the cofactor S-adenosyl-L-methionine 26.53 and
the catecholamine-analogous nitro-substituted inhibitor 26.54. The methyl group that is to be
transferred to the phenolic oxygen atom (red) is within a short distance (2.63 Å). The phenolic
oxygen, which is the nucleophile in the transfer reaction, is presumably deprotonated because of
the electron-withdrawing effect of the nitro groups and the close proximity to the magnesium
ion, the sulfonium group, and the ammonium group of Lys144. The accumulated positive charges
shift the pKa value of this hydroxyl group additionally into the acidic range. The second phenolic
OH group is probably uncharged and forms an H-bond to Glu199.
COMT is, as such, inactivated by the transfer of a methyl group onto the phenolic
hydroxyl group of its substrate. Inhibiting COMT therefore allows the bioavailability
of L-DOPA to be further increased, and a higher concentration of dopamine in the
brain can be achieved. The crystal structure of the enzyme was elucidated in 1994 in
the group of Anders Liljas (Fig. 26.23). A deeply buried magnesium ion that takes on
[Figure 26.24: structures of the COMT inhibitors described in the caption. Legible labels: "26.60 Nitecapone IC50 = 1 nM", "26.61 Nebicapone", "26.62".]
Fig. 26.24 Pyrogallol 26.55, gallic acid 26.56, or tropolone 26.57 bind to COMT with micromolar
affinity. Tolcapone 26.58, entacapone 26.59, nitecapone 26.60, or nebicapone 26.61 have
strongly electron-withdrawing groups directly on or conjugated to the aromatic ring. They are
nanomolar inhibitors that compete with the catecholamine substrate. Linking two moieties, one analogous
to catecholamine and one to adenosine, with a rigid five-membered tether (amide bond and double bond,
red) affords the nanomolar bisubstrate-analogue inhibitor 26.62.
[Figure 26.25: labeled elements: Met91, Trp143, Glu199, and SAM with its S–CH3+ group.]
Fig. 26.25 Superposition of the crystal structures of COMT with SAM 26.53 and the catechol-
amine-like inhibitor 26.54 (gray carbon atoms) with the bisubstrate inhibitor 26.62 (green carbon
atoms).
Kinases are not the only proteins that undertake posttranslational modification in
the course of signal transduction. The spatial location of proteins is often essential
for their correct function in the cell. Some proteins must be anchored in the
membrane. In addition to examples in which a section of the polymer chain is
embedded in the membrane, proteins are known that are anchored there by an added
farnesyl 26.63 or geranylgeranyl anchor 26.64. These hydrophobic anchors are
made up of isoprenoid units (Fig. 26.26). The attachment to the proteins is accomplished
via cysteine residues that are found in the vicinity of the C terminus. Three
classes of so-called prenylating enzymes are known: the farnesyl transferases
(FTases) and the geranylgeranyl transferases I and II (GGTase I and II). The
substrates of these catalysts are, among others, the GTPases of the Ras, Rab, and Rho
families, lamins, and the γ subunit of G protein heterotrimers. To become fitted with
a prenyl anchor by FTases and GGTases, substrate proteins must carry a CAAX
sequence 26.65 on their C terminus (Fig. 26.26). Here, C stands for the cysteine
upon which the prenyl group will be transferred, and A is usually an aliphatic amino
acid. If X is a serine, methionine, glutamine, or alanine, the protein is prenylated by
an FTase. A leucine in this position prefers a GGTase as catalyst.
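The CAAX rule described above can be written as a small lookup on the C-terminal residue. The following sketch only encodes what the text states: a cysteine four residues from the C terminus, and the identity of X deciding between FTase and GGTase. It does not check that the two A positions are aliphatic, since the text says this holds only "usually". The function name and the example peptides in the usage note are hypothetical.

```python
# One-letter codes for the X residues named in the text
FTASE_X = {"S", "M", "Q", "A"}   # Ser, Met, Gln, Ala -> farnesylation
GGTASE_X = {"L"}                 # Leu -> geranylgeranylation


def predict_prenyl_transferase(sequence: str) -> str:
    """Classify a protein sequence by its C-terminal CAAX box.

    Returns "FTase", "GGTase", "undetermined" (CAAX present, but X is not
    covered by the rule in the text), or "no CAAX motif".
    """
    seq = sequence.strip().upper()
    # The cysteine must sit exactly four residues from the C terminus
    if len(seq) < 4 or seq[-4] != "C":
        return "no CAAX motif"
    x_residue = seq[-1]
    if x_residue in FTASE_X:
        return "FTase"
    if x_residue in GGTASE_X:
        return "GGTase"
    return "undetermined"
```

For example, a hypothetical peptide ending in `CVLS` would be assigned to FTase and one ending in `CAIL` to GGTase, whereas a sequence without a cysteine at the fourth-to-last position is reported as having no CAAX motif.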
By now over 250 proteins have been discovered that require the posttranslational
attachment of a prenyl tail for their function. The interest in these prenylating
[Figure 26.26: structures of farnesyldiphosphate 26.63, the elongated geranylgeranyl chain 26.64 (red), the CAAX peptide substrate 26.65, and the farnesylated product 26.66, each shown with the catalytic Zn2+ ion; peptide positions labeled A1, A2, and X.]
Fig. 26.26 Farnesyldiphosphate 26.63 binds to FTase and occupies a part of the large catalytic
site. The crystal structure of the enzyme with this substrate was determined (above, farnesyldiphosphate
is green). Geranylgeranyl groups 26.64 that have an elongated isoprenyl chain
(isoprenyl chain indicated in red instead of a black chain in 26.63) are transferred by GGTase.
Trp102β and Tyr365β border the binding pocket in FTase and provide for substrate selectivity.
After binding the farnesyl substrate, the peptide substrate 26.65 (gray) with its CAAX terminus
diffuses into the binding pocket. The farnesyl group transfer onto the thiol group of the cysteine is
accomplished with the help of a neighboring catalytic zinc ion, which coordinates the cysteine
residue of the substrate. The diphosphate group is displaced by nucleophilic attack. A crystal
structure could also be determined with the product 26.66 (gray-green). It is shown above
superimposed with the binary complex. The farnesyl group moves into the pocket (arrow) and
the newly formed product coordinates to the zinc ion. The enzyme recognizes the tetrapeptide unit
by its two aliphatic residues A1 and A2 of the CAAX motif. The terminal methionine (X) forms
a hydrogen bond to Gln167α with its carboxylate group.
26.10 Blocking the Transfer of Farnesyl and Geranyl Anchors 635
enzymes, and especially the FTases, began in the early 1990s. It was observed that
RAS proteins, which mediate a permanent growth signal in a mutated form in
cancer, must be farnesylated. It is only then that they are active. If the farnesylation
is omitted, the RAS activity is suppressed. After transferring the prenyl group in the
cytoplasm to the cysteine three amino acids away from the C terminus, the protein
migrates to the endoplasmic reticulum. There the AAX tripeptide tail is proteolyt-
ically cleaved and a methyl group is transferred to the C terminus by
a carboxymethylation step. Finally, the prenylated protein is anchored in the
membrane. The FTases and the GGTases contain a zinc ion in their catalytic site
that is coordinated by cysteine, aspartate, and histidine. First the farnesyl or
geranylgeranyl diphosphate diffuses into the large, funnel-shaped binding
pocket of the enzyme. FTases and GGTases form heterodimers with a barrel-like
architecture, to which almost exclusively helical structural elements contribute.
FTase recognizes the shorter substrate farnesyldiphosphate 26.63 specifically
because the floor of its binding pocket is defined by Trp102β and Tyr365β. After
successful binding of the prenyl substrate, the peptide chain with the tetrapeptidic
C-terminal CAAX diffuses into the catalytic site. The prenyl substrate provides
a large interaction surface for the incoming peptide substrate.
The farnesyl chain must move to the peptide substrate for the actual reaction.
The CAAX substrate occupies the fourth coordination site on the zinc ion with the
thiol group of its cysteine. It binds with its hydrophobic aliphatic side chain A2 into
the preformed binding pocket of the enzyme. The A1 side chain protrudes into the
surrounding solvent. In the structure shown in Fig. 26.26, a methionine occupies the
X position and the C-terminal carboxylate group forms a hydrogen bond to
Gln167α. The prenyl group is transferred to the peptide chain by a nucleophilic
attack of the cysteine in the substrate onto the carbon atom next to the diphosphate
group. The prenylated substrate 26.66 diffuses out of the catalytic center. Interest-
ingly, this is the rate-determining step. There are indications that a new substrate
molecule is necessary to displace the product from the enzyme. For this, the product
molecule takes on a new position and binds in an area of the binding pocket,
through which it leaves the reaction site.
According to the reaction mechanism, different concepts for the development
of inhibitors for this enzyme have been pursued. The first attempts aimed at competing
with the isoprenoid diphosphate binding. For example, the isoprenoid analogue
α-hydroxyfarnesylphosphonic acid 26.67 occupies the binding pocket comparably to
farnesyldiphosphate and forms extensive interactions with the enzyme as well as with
the CAAX peptide substrate. The second and most frequently used strategy is the displacement
of the peptide substrate from the binding site. This goal can be achieved by the
development of peptidomimetics. An example is L-739750 26.68, an ester prodrug that
caused the regression of tumors in rats without systemic toxicity (Fig. 26.27).
It was also possible to completely depart from peptide lead structures. Examples
are R115777 (tipifarnib) 26.69 from Janssen Pharma or BMS-214662 26.70 from
Bristol-Myers Squibb. Both use their imidazole groups to coordinate to the zinc ion.
A superposition of BMS-214662 26.70 with the peptide substrate 26.66 is shown
in Fig. 26.28. Compound 26.70 replaces the isopropyl group of the peptide in the
[Figure 26.27: structures of FTase inhibitors. Legible labels: 26.69 R115777 (tipifarnib), 26.70 BMS-214662, 26.71 ABT-839, 26.72 lonafarnib, 26.73 L-778123.]
Fig. 26.27 Development of compounds for the inhibition of FTase. Compound 26.67 is
an inhibitor that competes with farnesyldiphosphate 26.63. Compounds 26.68–26.73 are inhibitors that
compete with the tetrapeptide substrate, CAAX. Only some of them (26.68–26.70, 26.73)
use their functional groups (e.g., imidazole rings) to block the zinc ion in the catalytic site.
Compounds 26.71 and 26.72 inhibit FTase without direct coordination to the Zn2+. Compound
26.73 blocks FTase and GGTase equipotently.
[Figure 26.28: structure of 26.70 (BMS-214662).]
Fig. 26.28 Crystal structure of 26.70 (violet); 26.70, which has a completely non-peptidic
structure, mimics the binding mode of the peptide substrate 26.66. It coordinates to the catalytic
zinc ion with its imidazole group. The hydrophobic benzyl group and the thiophene group replace
the A1 and A2 side chains in the natural substrate. The binding area of the terminal amino acid
X (here methionine) remains unoccupied by 26.70.
A1 position with its thiophene ring. The inhibitor uses its benzyl group for the A2
position to emulate the side chain of the isoleucine. With ABT-839, Abbott has
found a compound that undergoes no coordination to the zinc ion at all. It carries
a methionine group at the end that is very similar to the peptide tail in position X of
the natural substrate. Lonafarnib 26.72, a tricyclic derivative, was developed at
Schering-Plough; its urea group orients into the binding area over which the
processed substrate leaves the binding pocket. This inhibitor also blocks the
enzyme without coordinating to the zinc ion. The compounds 26.68–26.72 all
show a selectivity advantage for FTase. Merck has developed the non-peptide
structure 26.73 that strongly inhibits both FTase and GGTase I. Of course
a strategy can be followed here, as with COMT (Sect. 26.9), that pursues the
simultaneous displacement of both substrates from the binding pocket. The
bisubstrate-analogue inhibitors have to contend with the problem that they must
be very large to successfully compete with the two large substrates.
Clinical studies on the non-peptidic farnesyltransferase inhibitors 26.68–26.73
are not advanced enough to be judged. Monotherapy with these inhibitors delivered
a rather disappointing picture, although very promising results have been seen
for tipifarnib 26.69 for the treatment of breast cancer. We must wait and see
whether FTase inhibitors find application in tumor therapy as a monotherapy or
whether they are more efficiently used with other cytostatic and hormone drugs.
Most recently, however, a new field has been opened for FTase inhibitors in drug
development. It seems that they are potential lead structures for the treatment of
infectious diseases that are caused by pathogenic microorganisms such as Plasmodium
(malaria), Trypanosoma (African sleeping sickness and Chagas disease),
and Leishmania (leishmaniasis, kala-azar). Causative agents of fungal diseases,
such as Candida albicans, can also be fought in this way. Evidently, the posttranslational
prenylation of their proteins is an essential step in the lifecycles of
these organisms. We can hope that the sequence differences between their transferases
and the human enzymes are large enough to allow the development of selective
compounds.
26.11 Synopsis
(e.g., a kinase at its gatekeeper residue) and implemented into a model organism.
Selective inhibition of this protein under in vivo conditions is achieved via
inhibitors that are adapted to the modified binding site of the engineered protein.
• Phosphatases remove phosphate groups from Ser, Thr, Tyr, and His residues, thus
switching off the function of the substrate protein. Two catalytically different enzyme
classes are known, either operating through nucleophilic attack of a water molecule,
which is highly polarized by two adjacent metal ions, or through the nucleophilic
attack of a cysteine residue via a pathway similar to that in cysteine proteases. In both
cases the tetrahedral phosphorus atom in the phosphate group is attacked.
• PTP-1B initially appeared to be an ideal target to treat the metabolic syndrome
because it is involved in dephosphorylation of the insulin receptor kinase. Potent
inhibitors of this target with challenging druggability could be developed;
however, sufficient selectivity with respect to another phosphatase, TCPTP,
could not be achieved. Knock-out mice were unable to survive if the genes of both phosphatases
were simultaneously turned off. A similar life-threatening situation can be anticipated
with insufficiently selective inhibitors.
• Catechol-O-methyl transferase is representative for the family of methyl trans-
ferases using S-adenosyl-L-methionine as cofactor for methyl transfer via its
sulfonium group. It transfers methyl groups to catecholamines such as dopa-
mine, adrenaline, or noradrenaline.
• Inhibition of the methyl transferase reaction is achieved by introduction of
strong electron-withdrawing groups, such as nitro groups, at the aromatic ring
of the natural substrates, producing substrate-like inhibitors.
• Farnesyl and geranylgeranyl transferases transfer prenyl anchor groups onto
protein substrates exhibiting a CAAX sequence on their C terminus. The phos-
phorylated prenyl anchor is attacked by the nucleophilic cysteine thiol group,
which is further polarized through the coordination to a neighboring zinc ion in
the catalytic center.
• Inhibitors of farnesyl and geranylgeranyl transferases bind competitively either
to the CAAX peptide substrate binding site or to the prenyl diphosphate substrate
binding site. Some of them show strong peptidomimetic character and involve
coordination of the zinc ion. However, completely non-peptidic inhibitors have also
been developed, some of which bind without zinc coordination.
27 Oxidoreductase Inhibitors
Chemical reactions that occur via the exchange of electrons are termed redox
reactions. Normally the carbon atom changes its oxidation state in biochemical
redox processes. As a general rule, oxidations transform derivatives with
a significant number of directly bound hydrogen atoms into derivatives with a larger
number of bonds to nitrogen, oxygen, and sulfur. Because these bonds to the
above-mentioned electronegative elements are usually associated with the intro-
duction of polar functional groups, redox reactions exert a decisive influence on the
physicochemical properties of the oxidized substances. For example, the water
solubility is increased. This is of great importance for the elimination of xenobi-
otics. Cytochrome P450 enzymes, a large group of oxidizing enzymes, are involved
in the corresponding metabolic transformations. On the other hand, reductions
are of crucial importance for the organism too. In these reaction steps, reactive
aldehydes or ketones are transformed into alcohols, which subsequently are more
easily conjugated and eliminated (▶ Sect. 8.1). Transition metals, which can adopt
a variety of oxidation states, are predestined to serve as electron donors and
acceptors in redox reactions. In biological systems, one transition metal, iron, is
often used for this task. Once incorporated in a protoporphyrin ring scaffold, it
exists in a five- or six-coordinate state and can take on oxidation states
between +2 and +4. Moreover, it participates in complexes with sulfur. There it
forms interesting multinuclear structures: the so-called iron–sulfur clusters. In addi-
tion to iron, copper also plays a role as a mediator of biochemical redox processes.
Nature uses so-called cofactors for enzyme-catalyzed redox reactions. They are
embedded in the specific environment of a protein, and, shielded from the sur-
rounding solvent, they accomplish the electron or hydride ion transfer from the
group being oxidized to the group being reduced. Cofactors can be tightly coupled
to the protein. In these cases, they are referred to as prosthetic groups and do not
leave the enzyme during the reaction. Other loosely bound cofactors can be taken
up by the protein, just as the substrate is, chemically altered, and finally released
again. These cofactors must be regenerated for the next redox reaction cycle in
another independent reaction.
This chapter addresses the oxidoreductase enzyme class. These enzymes are
involved in numerous electron-transfer reactions and need electrons or hydrogen in
the form of hydride ions. These particles are transferred from cofactors such as
NAD(P)+ (nicotinamide adenine dinucleotide (phosphate)), the flavin nucleotides
FMN (flavin mononucleotide) and FAD (flavin adenine dinucleotide), or the
already mentioned iron atom in the heme group. Because many of these enzymes
are involved in processes that are related to the development of pathophysiology,
multiple drugs act by inhibiting these enzyme systems.
[Figure 27.1: structures of 27.1 NAD+ (X = H) / NADP+ (X = PO3 2−) and 27.2 NADH (X = H) / NADPH (X = PO3 2−)]
Fig. 27.1 Many enzymatic redox reactions use NAD+/NADP+ 27.1 (nicotinamide adenine
dinucleotide (phosphate)) and NADH/NADPH 27.2 as cofactors for the transfer of electrons
and/or hydride ions. The cofactor is made up of three components: the nicotinamide, which
bears an attached ribose sugar, the central diphosphate unit, and the adenosine moiety.
NAD(H) and NADP(H) differ in the phosphate group (blue) at the 2'-OH group of the ribose
ring. When the cofactor oxidizes a substrate, the positively charged nicotinamide moiety
takes on a hydride ion (red) at the 4-position; when it reduces a substrate, the hydride
ion is released from this position.
[Figure 27.2 (top reaction only recovered): malate + NAD+ → oxaloacetate + NADH + H+, catalyzed by malate dehydrogenase]
Fig. 27.2 Examples of an oxidation reaction with malate dehydrogenase (top) and of
a reduction with homoserine dehydrogenase (bottom). The transformation of a hydroxyl group
into a ketone function or vice versa (red) is carried out in both reactions.
middle of the pleated sheet (Fig. 27.4). The binding of the charged diphosphate
group to the conserved nucleotide-binding moiety occurs in an extension of this
position. The folding motif is termed “Rossmann fold” to honor its discoverer,
Michael Rossmann. We will return to this nucleotide-binding domain with the
enzymes dihydrofolate reductase, HMG-CoA reductase, and 11β-hydroxysteroid
dehydrogenase in the next sections. Other folding motifs can also make a binding
site for the NADPH cofactor available. A TIM barrel (▶ Sect. 14.3) is used for the
binding of this cofactor in aldose reductase (Sect. 27.4). Two protein superfamilies
are known that can reduce or oxidize carbonyl compounds in biological systems.
The first group encompasses the aldo–keto reductases, to which aldose reductase
belongs as a representative. The second superfamily contains the short-chain
dehydrogenases/reductases, to which 11β-hydroxysteroid dehydrogenase (Sect. 27.5)
belongs.
[Figure 27.3: hydride transfer from the nicotinamide ring of NADPH (giving NADP+) to the bound DHF substrate]
Fig. 27.3 The stereochemically unambiguous transfer of a hydride ion from the NADPH cofactor
to the double bond of the substrate being reduced is accomplished deep in the protein's binding
pocket. Crystal structure determination of the enzyme dihydrofolate reductase with bound
dihydrofolic acid (DHF) and cofactor (NADPH) provided detailed information about the course
of the reduction step. The two reaction sites come spatially very close to one another in the
structure. A hydride ion is transferred from the 4-position of the reduced nicotinamide ring onto
the neighboring double bond of the DHF substrate (violet line).
Flavoproteins use FMN 27.3 and FAD 27.4 as cofactors (Fig. 27.5). They are
derived from vitamin B2, riboflavin. FAD is composed of an adenosine that is
27.1 Redox Reactions in Biological Systems Use Cofactors
[Figure 27.5: structures of 27.3 FMN and 27.4 FAD (extension shown in blue); interconversion of the enzyme-bound oxidized form FAD and the reduced form FADH2 with 2 H+, 2 e−]
Fig. 27.5 Flavoproteins use FMN 27.3 and FAD 27.4 (extended by the blue part) as cofactors.
FAD is composed of an adenosine moiety, a diphosphate bridge with the carbohydrate alcohol
ribitol, and the tricyclic isoalloxazine ring. This tricyclic heterocycle represents the redox-active
part of the molecule and can accept one or two electrons from the substrate or donate them to it.
The cofactor is anchored to the enzyme very tightly but reversibly, in some cases via a covalent bond.
[Figure 27.6: structures of the heme group 27.5, fluconazole 27.6, ketoconazole 27.7, metyrapone 27.8, and 27.9 naringenin]
Fig. 27.6 The heme group 27.5 occurs as a cofactor in proteins that use oxygen as an oxidant.
An iron atom is embedded in a protoporphyrin system in a square-pyramidal or octahedral
geometry. The four pyrrole rings form a plane. The fifth, apical position is occupied by
a histidine or a cysteine, and the sixth position is coordinated by a reactive oxygen species.
This binding site can be blocked by a nitrogen-containing heterocycle such as the triazole ring
in fluconazole 27.6, the imidazole ring in ketoconazole 27.7, or the pyridine rings in metyrapone
27.8. Even natural products such as the flavonoid naringenin 27.9 represent examples of
cytochrome inhibitors.
the cofactor leaves the enzyme as dihydrofolate 27.11 and must be reduced to
tetrahydrofolate 27.12. Dihydrofolate reductase (DHFR) accomplishes this task.
[Figure: reaction cycle linking thymidylate synthase (dUMP → dTMP), dihydrofolate reductase (NADPH + H+), and serine transhydroxymethylase (serine → glycine), with the cofactor structures 27.10 and 27.11]
[Figure 27.9: structures of the inhibitors 27.13–27.17]
Fig. 27.9 Inhibitors 27.13–27.17 of human DHFR that are used as chemotherapeutics in
cancer therapy.
As the carrier of genetic information, DNA is produced in increased quantities
when a high level of cell division is necessary. Cancer represents one example of
increased cell proliferation. Bacterial cells also reproduce at an increased rate
during infections. Therefore the inhibition of this enzyme in the synthesis cycle
represents a point of attack for the chemotherapy of tumor disease. If the target is
an enzyme of a bacterial organism, a compound with a bacteriostatic effect is
obtained. Dihydrofolate reductases from different species are rather small enzymes.
Depending on their origin, they are composed of between 150 and 260 amino acids.
The substrate dihydrofolate 27.11 is composed of a pteridine ring, a central para-
aminobenzoic acid, and a terminal L-glutamate moiety. The hydrogenation of the
5,6-double bond in the pteridine ring occurs stereospecifically by the attack of
a hydride ion with subsequent addition of a proton onto the neighboring nitrogen
atom. The mechanism is shown in Fig. 27.3 in detail.
Early on, before the first crystal structure of this enzyme was determined in the
group of Joseph Kraut in San Diego in 1982, methotrexate 27.13 was known to be
a potent dihydrofolate reductase inhibitor (Fig. 27.9). Aminopterin 27.14 and
edatrexate 27.15 were described as analogues. Chemically, they appear to be very
similar to the natural substrate dihydrofolate 27.11. Nonetheless, a decisive
exchange of a hydrogen bond acceptor group for a donor group on the heterocycle
is apparent. As explained in detail in ▶ Sect. 17.6, this causes a 90° twist in the
orientation of this moiety in the binding pocket of the reductase. Therefore, intimate
contact between the reduced nicotinamide group of the NADPH cofactor and the
double bond of the bound ligand cannot take place. A transformation is impossible;
the enzyme is blocked.
Methotrexate is a potent chemotherapeutic that is used in cancer therapy to treat
breast tumors, sarcomas, acute lymphatic leukemia, and non-Hodgkin lymphomas.
Both the natural substrate and methotrexate are very polar compounds and must be
transferred into the cell through the reduced folate carrier (RFC). Then the ligands
are augmented with additional glutamic acid residues. A prerequisite for good and
efficient inhibition of DHFR in cancer therapy is therefore not only a strong binding
to the reductase but also a highly specific uptake through the transporter. For
example, the derivatives 27.16–27.17, which were obtained by replacing the central
phenyl ring with the attached amide bond of methotrexate with a benzolactam
group (Fig. 27.9), indeed have somewhat poorer binding constants to DHFR.
[Figure 27.10: structures of non-classical antifolates, including 27.21 trimetrexate, 27.22 epiroprim, and 27.23 cycloguanil]
This is, however, compensated for by an improved affinity to the RFC transporter so
that the tumor growth can be suppressed by these compounds equipotently. The
RFC transporter and the highly potent binding of folic acid analogues to this
receptor offer yet another perspective for tumor therapy. This transporter is
expressed on malignant cells to ensure that their increased need for folic acid is
met. Because of the tight binding of folic acid derivatives to this receptor and the
subsequent internalization of its substrates, there is a possibility that folic acid
derivatives could carry an additional molecular freight, which would be
piggybacked into the cell. Upon arrival, this freight could be chemically unloaded
and could, if it were a potent cancer therapeutic, unleash its destructive effects in
the interior of the tumor cell.
In addition to chemotherapeutics for tumor therapy, bacteriostatic inhibitors
such as trimethoprim 27.19 are directed against the corresponding bacterial
enzymes. A few of these non-classical antifolate inhibitors (27.18–27.23) are listed
in Fig. 27.10. Structurally, a relationship to the natural substrate is apparent. The
first heterocycle is the same as in methotrexate so that an identical binding mode for
this moiety is observed. In the DHFRs of all species, an aspartate or
a glutamate is conserved that interacts with the positively charged
nitrogen atom in the ring and the exocyclic 3-amino group. The amino group in
the 1-position finds interaction partners in two carbonyl groups in the protein
backbone (▶ Figs. 17.7 and ▶ 17.12). In contrast to methotrexate, trimethoprim-like
antibiotics display a more strongly hydrophobic group as the second ring
moiety. This group is decisive for the selective inhibition of DHFRs in bacteria.
At therapeutic doses, trimethoprim inhibits bacterial but not human dihydrofolate
reductase. For bacteria, the inhibitory concentrations are lower than for human
DHFR by a factor ranging from 60 (Neisseria gonorrhoeae, the causative agent of
gonorrhea) to 50,000 (the intestinal bacterium Escherichia coli).
27.2 Chemotherapeutics for Cancer and Bacteria: Dihydrofolate Reductase Inhibitors
Table 27.1 Binding constants of a few dihydrofolate reductase (DHFR) inhibitors for the
human enzyme and the RFC transporter, and cell-growth inhibition in tumor tissue
Compound   DHFR Ki (pM)   RFC Ki (μM)    Cell growth IC50 (nM, 72 h)
27.13      4.8 ± 0.45     4.7 ± 1.3      14 ± 2.6
27.14      3.7 ± 0.35     5.4 ± 0.09     4.4 ± 0.10
27.16      34 ± 3.0       0.28 ± 0.10    5.1 ± 0.25
27.17      2100 ± 200     1.1 ± 0.11     140 ± 5.0
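The trade-off between DHFR binding and RFC uptake discussed for the benzolactam derivatives can be made explicit by expressing each value in Table 27.1 as a fold change relative to methotrexate 27.13. The following Python sketch is only an illustration: the dictionary keys and the function name `fold_change` are hypothetical, the mean values are copied from the table, and the error margins are ignored.

```python
# Mean values copied from Table 27.1 (error margins omitted).
# Key names are illustrative. Each column keeps the table's own unit,
# which cancels out in the ratios computed below.
TABLE_27_1 = {
    "27.13": {"dhfr_ki": 4.8, "rfc_ki": 4.7, "cell_ic50": 14.0},
    "27.14": {"dhfr_ki": 3.7, "rfc_ki": 5.4, "cell_ic50": 4.4},
    "27.16": {"dhfr_ki": 34.0, "rfc_ki": 0.28, "cell_ic50": 5.1},
    "27.17": {"dhfr_ki": 2100.0, "rfc_ki": 1.1, "cell_ic50": 140.0},
}


def fold_change(table, reference="27.13"):
    """Return every value divided by the corresponding reference value.

    A ratio above 1 means weaker binding or weaker growth inhibition than
    the reference compound; a ratio below 1 means stronger.
    """
    ref = table[reference]
    return {
        compound: {key: value / ref[key] for key, value in row.items()}
        for compound, row in table.items()
    }


if __name__ == "__main__":
    for compound, ratios in fold_change(TABLE_27_1).items():
        print(
            f"{compound}: DHFR Ki x{ratios['dhfr_ki']:.2f}, "
            f"RFC Ki x{ratios['rfc_ki']:.2f}, "
            f"IC50 x{ratios['cell_ic50']:.2f}"
        )
```

For 27.16, the ratios correspond to a roughly sevenfold weaker DHFR binding and a roughly 17-fold stronger RFC binding than methotrexate, together with a lower cellular IC50.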
Table 27.2 Dissociation constants Kd of trimethoprim 27.19 for dihydrofolate reductase from
different species
Species                                          Kd (nM)
Escherichia coli                                 0.02
Escherichia coli, Gln118 mutant                  0.09
Escherichia coli, Arg28/Gln118 double mutant     3.8
Lactobacillus casei                              0.4
Neisseria gonorrhoeae                            15
Chicken                                          3,500
Mouse                                            3,500
Cattle                                           330
Human                                            1,000
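The selectivity factors quoted in the text follow directly from Table 27.2 as the ratio of the human Kd to the Kd for the enzyme of each species. The short Python sketch below reproduces this arithmetic; the names `KD_TRIMETHOPRIM_NM` and `selectivity_factors` are hypothetical and chosen only for this illustration.

```python
# Kd values in nM, copied from Table 27.2.
KD_TRIMETHOPRIM_NM = {
    "Escherichia coli": 0.02,
    "Escherichia coli, Gln118 mutant": 0.09,
    "Escherichia coli, Arg28/Gln118 double mutant": 3.8,
    "Lactobacillus casei": 0.4,
    "Neisseria gonorrhoeae": 15.0,
    "Chicken": 3500.0,
    "Mouse": 3500.0,
    "Cattle": 330.0,
    "Human": 1000.0,
}


def selectivity_factors(kd_nm, reference="Human"):
    """Return Kd(reference) / Kd(species) for every other species.

    A factor above 1 means that trimethoprim binds the enzyme of that
    species more tightly than the reference enzyme.
    """
    kd_ref = kd_nm[reference]
    return {
        species: kd_ref / kd
        for species, kd in kd_nm.items()
        if species != reference
    }


if __name__ == "__main__":
    factors = selectivity_factors(KD_TRIMETHOPRIM_NM)
    ranked = sorted(factors.items(), key=lambda item: item[1], reverse=True)
    for species, factor in ranked:
        print(f"{species:<45} {factor:>10.1f}")
```

With these values the E. coli enzyme gives a factor of about 50,000 and the N. gonorrhoeae enzyme a factor of about 67, whereas the chicken and mouse enzymes bind trimethoprim more weakly than the human enzyme (factor below 1).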
[Figure 27.11: structure of 27.24 and its binding mode in Pc-DHFR and murine DHFR, with the residues Asn64, Phe69, and Arg75 labeled]
Fig. 27.11 The exchange of the 3-methoxy group in trimethoprim 27.19 for an unsaturated
aliphatic side chain in 27.24 gives an affinity that is improved by a factor of 5,000 and selectivity
for the bacterial enzyme from Pneumocystis jirovecii compared to the mouse enzyme. An Asn64
residue is found in the crystal structure of the vertebrate enzyme in the place where Phe69 is found
in the bacterial enzyme. This exchange to a more hydrophobic and less-charged environment in the
bacterial enzyme allows the selectivity advantage for 27.24.
charged molecule with respect to seven charge units. Correspondingly, it is not only
the direct contacts that are responsible for the strength of the protein–ligand
interaction, but also the electrostatic interactions in the remote environment.
The 3-methoxy group in trimethoprim was exchanged for an unsaturated, acidic
side chain in another model study (Fig. 27.11, compound 27.24). The modified
derivative shows a significantly improved affinity (factor of 5,000) and therefore
selectivity for the bacterial enzyme from Pneumocystis jirovecii compared to the
enzyme from rodents. Crystal structures of the inhibitor 27.24 were obtained with
the enzymes from bacteria and vertebrates. Asn64 in the vertebrate enzyme is
exchanged for Phe69 in the bacterial enzyme. This change leads to a more strongly
hydrophobic and less-charged environment in the bacterial reductase, and thus
causes an increased selectivity for the modified trimethoprim derivative.
In the bacterial enzyme, a tighter and spatially more favorable contact exists
between the triple bond of the ligand and the aromatic ring of the
phenylalanine. Asn64 in the rodent enzyme cannot form a comparable
contact.
In the pioneer era of structure-based drug design at the beginning of the 1980s,
DHFR was the model protein par excellence. Therefore much of the expertise that
shapes our current understanding of selectivity phenomena was collected on this
enzyme.
27.3 HMG-CoA Reductase Inhibitors: The Changing Fate of Drug Development
Coronary heart disease (CHD), atherosclerosis, and the concomitant heart attacks
and strokes are among the most common causes of death in the majority of European
countries and in the USA. CHD has multifactorial genetic causes and is also
a typical disease in the developed world. Risk factors are obesity, smoking, high
blood pressure, and elevated fibrinogen and cholesterol levels. High levels of
cholesterol are found in the plaques that constrict and occlude the blood vessels.
The overwhelming academic opinion considers a reduction in the cholesterol
level to be a reasonable treatment strategy so that medications that act in this way
are often prescribed. Cholesterol fulfills different functions in the construction of
the cell membrane (▶ Sect. 4.2) and serves as a starting material for the synthesis of
steroid hormones and bile acids (▶ Sect. 28.3). The brain, the adrenal glands,
skeletal muscle, skin, blood, and the liver have an increased need for cholesterol.
Between 0.9 and 2 g of this substance is required daily. About a third is obtained
from the diet, and the rest is synthesized in the liver.
A group of drugs that inhibit cholesterol biosynthesis are the statins. There is
hardly another class of compounds that illustrates the success and failure of drug
development in pharmaceutical research equally well. There is a thin line between
astronomical financial success from gigantic sales figures and catastrophic crashes
that can bring a company to the edge of financial ruin. The development of the
statins began as far back as the 1950s. The American company Merck & Co. began
intense work on the biochemistry of lipid metabolism. In 1956 Karl Folkers and
Carl Hoffman, both from Merck, discovered mevalonic acid 27.25, an intermediate
in the biosynthesis of cholesterol 27.26 (Fig. 27.12). Nonetheless, the importance of
the substance and the enzyme 3-hydroxy-3-methylglutaryl-coenzyme A reductase
(HMG-CoA reductase), which transforms HMG-coenzyme A into mevalonic
acid, was not recognized at that time. The enzyme reduces the substrate, which is
assembled from three acetate units, by using two equivalents of NADPH in the
rate-determining step in the biosynthetic pathway.
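The overall stoichiometry of this reduction can be written as a balanced equation. The following is the standard textbook formulation of the reaction and is given here only as a worked illustration of the two NADPH equivalents mentioned above; it is not quoted from the figure.

```latex
\[
\text{HMG-CoA} + 2\,\text{NADPH} + 2\,\text{H}^{+}
\;\longrightarrow\;
\text{mevalonate} + 2\,\text{NADP}^{+} + \text{CoA-SH}
\]
```

Each NADPH delivers one hydride ion, so the thioester carbon is reduced by a total of four electrons to the primary alcohol of mevalonic acid 27.25.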
As a therapeutic approach to decreasing the cholesterol level, Merck initially
pursued a basic ion-exchange resin (cholestyramine), which has a high affinity to
bile acids. Because bile acids are synthesized from cholesterol, the removal of bile
acids from the intestines causes more cholesterol from the diet to be used for the
replacement of these substrates. Altogether the cholesterol level in blood decreases.
The success of clofibrate 27.27 (Fig. 27.12, ▶ Sect. 28.6) began in the 1960s. This
substance decreases elevated triglyceride levels and, to a lesser extent, the choles-
terol level too. Long-term observations showed, however, that the number of
fatalities in the patient group that was treated with clofibrate was higher than in
the control group. Moreover, cases of liver cancer were observed in animal
experiments.
In 1973 Merck & Co. and other companies began to investigate the influence of
hydroxylated steroids on the biosynthesis of cholesterol. Although these substances
were active in vitro, they were inactive in animal experiments. In the same year the
[Figure 27.12: two-step reduction of HMG-coenzyme A by NADPH to 27.25 mevalonic acid; structures of 27.26 cholesterol and 27.27 clofibrate]
Beginning in 1974 Merck developed in vitro cell tests for the evaluation of
cholesterol biosynthesis inhibitors, especially of HMG-CoA reductase. At the same
time, Akira Endo and colleagues at Sankyo Japan began investigating extracts from
8,000 microorganisms. The most active compound, which was also isolated at
Beecham in England, was compactin 27.28 (mevastatin, Fig. 27.13). Early in
1979 Endo registered a Japanese patent for another microbial HMG-CoA reductase
inhibitor, monacolin K, without knowing its structure. In the fall of 1978 microbial
extracts were being investigated at Merck too. In the second week of the experi-
ment, they found what they were looking for. In February 1979 the compound was
isolated and a patent for lovastatin 27.29 (Fig. 27.13), complete with structural
details, was registered in June of 1979. The substance was identical to monacolin K.
The Merck patent was awarded at the end of 1980 in the USA and later in other
countries as well. In a few countries the patent was awarded to Sankyo instead. The
reason for this varying credit was the different interpretation of the time priorities.
Sankyo registered the patent (first to file) 4 months earlier. Merck was awarded the
patent in the USA and in many other countries because they could demonstrate
a 3-month-earlier date of invention (first to invent).
Merck began clinical studies with lovastatin in April 1980, but these were
discontinued again in September 1980. The reason was rumors that compactin
was causing tumors in dogs. Toxicity studies with lovastatin showed no indication
of this, and the rumors could not be confirmed. Nonetheless, the project was
initially halted. In July 1982, Merck negotiated an agreement with the American
FDA that lovastatin could be clinically used by selected investigators. The use
would be limited to therapy-resistant cases with severely elevated cholesterol levels
because the risk of heart attack and stroke was particularly high in these patients.
The therapeutic effects on the LDL cholesterol level as well as the total cholesterol
level in blood were convincing, and the side effects were minimal. The chronic
toxicology and clinical studies were reinitiated. In November 1986 a licensing
application was made. Altogether, 160 volumes of preclinical and clinical data
were submitted to the FDA. Just 9 months later the drug was approved, and the
compound developed into a blockbuster with billions in sales.
Years later, the crystal structure of the target enzyme, HMG-CoA reductase, was
determined. The enzyme is a tetramer in its active form. Each monomer is composed
of three domains. The N-terminal domain has an anchor that fixes the enzyme
to the membrane of the endoplasmic reticulum. The smaller S domain, which
contains the binding site for the NADPH cofactor, is nested in the larger
L domain. The S domain adopts the geometry of a Rossmann fold. The extended
HMG-CoA molecule binds to the L domain. It protrudes deeply into the interior of
the protein with its pantothenic acid moiety, whereas the ADP part is found in
a pocket with positively charged residues at the protein’s surface. The actual
binding site for hydroxymethylglutaric acid (HMG) is found between the L and S
domains. The product of the first reduction step, mevaloyl-CoA, has a negatively
charged oxygen atom that is stabilized by a neighboring Lys691 in the enzyme
(Fig. 27.14). The thiolate that is temporarily released from the CoA group is
stabilized by His752, which is presumably protonated. The activity of HMG-CoA
[Figure 27.13: structures of the statins 27.28–27.36, including 27.35 rosuvastatin and 27.36 pitavastatin, and of 27.37 gemfibrozil]
Fig. 27.13 The natural products mevastatin (compactin) 27.28 and lovastatin 27.29
inhibit cholesterol biosynthesis at the HMG-CoA reductase step. Simvastatin 27.30 and
pravastatin 27.31 are semisynthetic analogues that were developed later. The ring-opened form
27.31 is significantly less lipophilic than lovastatin and therefore has fewer CNS side effects.
The opened lactone ring is the actual active form of lovastatin and its analogues (▶ Sect. 9.2).
Fluvastatin 27.32, cerivastatin 27.33, atorvastatin 27.34, rosuvastatin 27.35, and pitavastatin 27.36
were introduced to the market later as synthetically prepared inhibitors. When cerivastatin
27.33 was coadministered with gemfibrozil 27.37, its plasma concentration rose fivefold
because gemfibrozil blocks its metabolism by cytochrome CYP 3A4.
[Figure 27.14: binding site of HMG-CoA reductase with bound HMG-CoA and NADPH; labeled residues His752, Lys735, Glu559, Lys691, Ser684, and Arg590]
Fig. 27.14 The crystal structure determination of HMG-CoA reductase was accomplished with
the bound NADPH cofactor (green) and HMG-coenzyme-A (pink). The nicotinamide ring of the
cofactor lies underneath the thioester bond of HMG-coenzyme A. The hydride ion is transferred
from there in the first reduction step (cf. Fig. 27.12).
[Figure 27.15: superposition of atorvastatin and simvastatin in the binding site; labeled residues Lys735, Ser684, Glu559, and Arg590]
Fig. 27.15 Superposition of the structures of HMG-CoA reductase in complex with simvastatin
27.30 (gray) and atorvastatin 27.34 (green). Both inhibitors bind Lys735, Ser684, and Arg590 with
their mevalonic acid analogue moiety, just as the natural substrate does. The remaining molecular
portion, which differs greatly between the natural-product-like simvastatin and the fully synthetic
atorvastatin, binds in the region that is occupied by the CoA residue in the substrate complex. The
NADPH pocket remains unoccupied in the structures.
The high degree of structural variation of this group in the newer fully synthetic
statins underscores the fact that this molecular portion indeed contributes to the
affinity of the inhibitors but no specific interactions are formed in the binding
pocket that is open to the surface.
The history of the statins would be incomplete without a discussion about
cerivastatin 27.33 and atorvastatin 27.34 (Fig. 27.13). Both are statins from the
more recent research and were prepared fully synthetically. Atorvastatin was
developed by Warner–Lambert in the USA. In 1997 it was introduced to the market
and changed hands from Warner–Lambert to Pfizer in a corporate acquisition.
There, it was developed to a success story par excellence. The compound became
the best-selling medication ever (Sortis® and Lipitor®). In 2004 it made up half of
the market share of the statins. Pfizer was able to earn US$14 billion in 2006 as
well as in 2007 with this compound. In Germany, the sales figures were low
because of a healthcare reform that required a fixed copayment for statins.
Cerivastatin 27.33 (Lipobay®) represented a similar cash cow for the Bayer
Corporation. This compound was also introduced in 1997 in Germany, in other
European countries, and in the USA. At the end of 1998 the Bundesinstitut für
Arzneimittel und Medizinprodukte (BfArM, the German authority that monitors
adverse drug events) reported fatalities under cerivastatin therapy. After more
fatalities were reported in the USA and in Germany, Bayer withdrew the drug
from the market in mid-2001. What happened? The fatalities were a result of
rhabdomyolysis, an acute disintegration of skeletal muscle, and concomitant
kidney failure from the toxic muscle metabolites. This adverse event occurred
especially upon overdosing, and in particular with the combination of cerivastatin
and gemfibrozil 27.37, a compound that belongs to the fibrate group. Gemfibrozil
increases the plasma level of cerivastatin by a factor of five and can cause
myopathy by itself. The cause of death was assumed to be an overdose of
cerivastatin caused by the simultaneous mutual inhibition of the degradation of
both compounds by the metabolizing cytochrome CYP 3A4 (Sect. 27.6).
Patients were informed of the risk by the enclosed leaflet describing the correct
use of medication, and pharmacists distributing the drug in the USA were also
informed. Cerivastatin was considered to be a growth product for Bayer's
pharmaceutical business. Within a short time after its approval, it achieved sales of
2.5 billion Euros. Worldwide, about six million people took the medication. Its
withdrawal had broad consequences for the Bayer Corporation, with which the
company had to struggle for several years. The recall itself caused resentment
because the press and stockholders were informed before physicians and
pharmacists. This approach was certainly suboptimal because it diminished public
trust in the pharmaceutical industry and suggested purely commercial intentions.
The two medications Sortis® and Lipobay® show, however, how thin the line is
between success and failure in the pharmaceutical business, and how great the risks
of introducing a new medication to the market are, despite a known, established
principle.
27.4 Hitting a Moving Target: Aldose Reductase Inhibitors
The alarming increase in cases of type-II diabetes mellitus was already mentioned
in ▶ Sect. 26.8. One hundred and fifty million people already suffer from the
consequences of glucose metabolism disorders. In the next 15 years this number is
expected to double. The treatment of diabetes and its consequences devours billions
and represents a massive economic and healthcare-system burden. Acquired
diabetes, which manifests itself as an increasing resistance of cells to insulin,
leads to grave secondary complications if left untreated, for example, increased
atherosclerosis (Sect. 27.3) with an increasing risk of infarction and stroke. The
long-term consequences of a poorly controlled blood glucose level preferentially
affect cells in tissues that do not control their glucose uptake by insulin. This
occurs especially in cells in the vascular system, the nerves, the eyes, and the
kidneys. Exogenous insulin administration does not directly help these cells
because they are not able to downregulate
660 27 Oxidoreductase Inhibitors
Fig. 27.16 D-Glucose 27.38 is transformed by aldose reductase to D-sorbitol 27.39 and further by
sorbitol dehydrogenase to D-fructose 27.40 along the polyol pathway.
their glucose uptake. Early blindness, kidney damage, and peripheral vascular
disease can all be a consequence, the treatment of which could require the ampu-
tation of limbs.
One way to intervene in blood glucose regulation is the exogenous administration
of insulin (▶ Sect. 32.4) in addition to significant changes in diet and lifestyle.
However, even a rigorously followed insulin-replacement therapy cannot match the
efficiency of endogenous insulin. Repeated episodes of injurious hyperglycemia
can occur that especially affect the insulin-independent cells. Despite therapy,
diabetics must expect long-term complications. These consequences particularly
affect the quality of life of older patients. Therapeutic approaches that can
reduce long-term complications are therefore all the more sought after.
One approach is intervention in the so-called polyol pathway. Glucose 27.38 is
reduced to sorbitol 27.39 and then oxidized to fructose 27.40 (Fig. 27.16) along this
pathway. The first step is catalyzed by aldose reductase, and the second by sorbitol
dehydrogenase. The transformation by aldose reductase is the rate-determining
step. It proceeds with the consumption of NADPH, which is oxidized to NADP+.
NAD+ is needed as a cofactor in the subsequent dehydrogenase step and is reduced to
NADH. For a long time, it was discussed whether overloading the polyol pathway by severe
glucose enrichment would lead to an increased concentration of polar reaction
products in the cell. As a result, the cell would experience elevated osmotic
pressure, which would be alleviated by increased water uptake. This would, however,
lead to cell swelling and increased osmotic stress in the membrane. The
oxidative stress that the cell experiences due to the overloaded polyol pathway
seems to be more serious. The elevated glucose flow along this degradation
pathway requires increasing amounts of NADPH and NAD+ and therefore severely
stresses the homeostasis of these redox-active substances. The body must protect
itself against reactive-oxygen species that are formed as a side product of the ca.
400–800 L of oxygen that we take in daily. These species possess cell-damaging
potential. If the production of these aggressive oxygen derivatives exceeds the
detoxification capacity of the endogenous antioxidant systems of the cell, then
this is referred to as oxidative stress. The main defensive system is glutathione,
which is oxidized to glutathione disulfide species under oxidative conditions.
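The cofactor demand of the polyol pathway described above can be tallied in a few lines of Python. This is only an illustrative stoichiometric sketch: it assumes that every glucose molecule entering the pathway passes through both steps, with one NADPH consumed by aldose reductase and one NAD+ consumed by sorbitol dehydrogenase. The function name and the example flux are hypothetical.

```python
def polyol_cofactor_demand(glucose_molecules: int) -> dict:
    """Tally cofactor turnover for glucose passing fully through the polyol pathway.

    Assumed 1:1 stoichiometry per step:
      glucose  + NADPH -> sorbitol + NADP+   (aldose reductase)
      sorbitol + NAD+  -> fructose + NADH    (sorbitol dehydrogenase)
    """
    if glucose_molecules < 0:
        raise ValueError("glucose_molecules must be non-negative")
    return {
        "NADPH_consumed": glucose_molecules,
        "NADP+_formed": glucose_molecules,
        "NAD+_consumed": glucose_molecules,
        "NADH_formed": glucose_molecules,
        "fructose_formed": glucose_molecules,
    }


# Hypothetical flux: 1000 glucose molecules shunted into the pathway.
demand = polyol_cofactor_demand(1000)
print(demand)
```

The tally makes the point of the text explicit: the more glucose is shunted into this pathway, the more NADPH and NAD+ are withdrawn from the redox pools of the cell.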
27.4 Hitting a Moving Target: Aldose Reductase Inhibitors 661
Fig. 27.17 The binding pocket of aldose reductase is divided into a catalytic (blue) and
a specificity (orange) pocket. The cofactor NADPH/NADP+ binds below the catalytic pocket.
Phe122, Trp219, and Leu300 are most responsible for the structural adaptability of the specificity
pocket. Trp20 can also undergo conformational changes in the catalytic pocket. The segment
Val297–Leu300 (red) belongs to a loop that exhibits particularly high adaptability.
Another interesting property of aldose reductase raises the question of how
an enzyme can adapt itself to so many different substrates. Protein–ligand complexes
of four different inhibitors (27.50, 27.44, 27.48, and 27.45; Fig. 27.18) are shown in
Fig. 27.19; each of these binds to a different protein conformer. In the case
of aldose reductase, it is assumed that many protein conformers coexist in
a dynamic equilibrium with one another, which causes the opening and
closing of different sub-pockets of the specificity pocket. A substrate molecule
or an inhibitor binds to one of these conformers in the equilibrium and
stabilizes it upon complex formation. Only in this way can it be understood that
different conformers of the enzyme can be accessed and strongly blocked at virtually
no energy cost and without a concomitant loss in binding affinity.
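The idea of coexisting conformers in equilibrium can be made concrete with Boltzmann statistics. The following is a minimal sketch, not taken from the original study: the relative free energies of the four conformers are hypothetical values chosen for illustration, and the function name is an assumption.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def boltzmann_populations(rel_free_energies_kj, temperature_k=298.15):
    """Return equilibrium fractions of conformers from relative free energies (kJ/mol)."""
    rt = R_KJ_PER_MOL_K * temperature_k
    weights = [math.exp(-g / rt) for g in rel_free_energies_kj]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative free energies of four pocket conformers (kJ/mol).
conformer_energies = [0.0, 1.0, 2.0, 3.0]
populations = boltzmann_populations(conformer_energies)
for g, p in zip(conformer_energies, populations):
    print(f"dG = {g:.1f} kJ/mol -> population {p:.2f}")
```

With free-energy differences of this size, which are close to the thermal energy RT at room temperature, every conformer remains appreciably populated. A ligand that selects one of them pays only a small free-energy penalty, which is the sense in which binding to different conformers is nearly cost-neutral.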
MD simulations can be carried out to gain access to the conformational diversity
of the possible geometries of the enzyme. How such a study can be carried out has
been described in ▶ Sect. 15.8 by using aldose reductase as an example. In this
study it was shown that it is largely the side chains of only a few amino acids that
Fig. 27.18 Synthetic inhibitors 27.41–27.54 of aldose reductase, including alrestatin 27.41,
zopolrestat 27.42, ponalrestat 27.43, tolrestat 27.44, epalrestat 27.46, zenarestat 27.47,
IDD594 27.48, risarestat 27.49, sorbinil 27.50, and fidarestat 27.51. Epalrestat 27.46 was the only
one to reach the market.
Fig. 27.19 Crystal structures of aldose reductase with (a) sorbinil 27.50, (b) tolrestat 27.44,
(c) IDD594 27.48, and (d) 27.45 (Fig. 27.18). All inhibitors bind to a different conformer of the
protein. Above all, the residues Trp20, Phe122, and Leu300 undergo significant spatial
rearrangements and open up structurally altered sub-pockets in the enzyme.
allow this conformational adaptability and that induce the opening and closing of
entire areas of the binding pocket. The binding geometry of sorbinil 27.50
(Fig. 27.18) is shown in Fig. 27.20. The inhibitor blocks the catalytic site with its
hydantoin group and sits above the nicotinamide ring of the cofactor. Interestingly,
it leaves the specificity pocket closed. Phe122 and Leu300 orient toward one
another like wings of a swinging door and close the parts of the specificity pocket
that lie behind it.
Potent aldose reductase inhibitors have been under development for
many years. Numerous candidates have successfully made their way into clinical
trials (Fig. 27.18). Unfortunately, the development of most of these was terminated
at this phase, often because of adverse effects or inadequate efficacy.
In 1992 ONO Pharmaceutical Co. in Japan managed to introduce
epalrestat (Kinedak®) 27.46 to the market for the treatment of diabetic neuropathy.
Many other derivatives such as fidarestat 27.51, ranirestat 27.53, ponalrestat 27.43,
27.5 11β-Hydroxysteroid Dehydrogenase 665
Fig. 27.20 Crystallographically determined binding geometry of sorbinil 27.50 to aldose reductase.
The inhibitor’s hydantoin group binds above the nicotinamide ring of the cofactor to Tyr48,
Trp79, and His110. The specificity pocket remains closed during this binding. This pocket can be
opened by rotating the side chains of Phe122 and Leu300 out of the way.
Fig. 27.21 The two isoforms of 11β-hydroxysteroid dehydrogenase, 11β-HSD1 and 11β-HSD2,
interconvert cortisone and cortisol: 11β-HSD1 reduces inactive cortisone 27.55 (R = OH) to
active cortisol 27.56 (R = OH) with NADPH, and 11β-HSD2 catalyzes the reverse oxidation with
NAD+. In rodents the same enzyme pair interconverts 11-dehydrocorticosterone 27.57 (R = H)
and corticosterone 27.58 (R = H).
into the biologically inactive 11-keto form, cortisone 27.55 (Fig. 27.21). Two
isoenzymes, 11β-HSD1 and 11β-HSD2, were found that belong to the superfamily
of short-chain dehydrogenases/reductases. The sequence identity between the
two is only 15%. Chemically they are opponents. 11β-HSD1 is broadly distributed,
but with increased expression in the liver and in adipose tissue. The enzyme acts as
a reductase with consumption of NADPH and forms active cortisol from inactive
cortisone; cortisol binds to the glucocorticoid receptor (▶ Sect. 28.5) and activates
it. On the other hand, as a dehydrogenase 11β-HSD2 oxidizes cortisol to inactive
cortisone with consumption of NAD+. In doing so, it protects the mineralocorticoid
receptor from overexposure to this active hormone. This is especially important in
the colon and in the kidney. An overactivation of this receptor by cortisol, in
addition to aldosterone, leads to an increased renal resorption of sodium and
chloride ions. Water retention and an increase in blood pressure are the consequence.
A congenital gene defect that causes mutations in 11β-HSD2 can lead to
a hereditary form of hypertension. The mutated enzyme works less efficiently, and an
excess of cortisol is the result. The receptor becomes overloaded and causes an
elevated blood pressure. Interestingly, glycyrrhizin 27.59 (Fig. 27.22), one of the
ingredients in licorice, is a potent 11β-HSD2 inhibitor. Excessive consumption of
this confectionery, which is made from the root of Glycyrrhiza glabra, can, in the
worst cases, lead to temporary symptoms that are comparable to those of the
congenital gene defect.
The short-chain dehydrogenases/reductases adopt a Rossmann fold.
A Tyr-Lys-Ser triad, which occurs in almost all members
of this family, is critical for the catalytic mechanism. The sequence of the reduction
reaction is outlined in Fig. 27.23. A hydride ion is transferred from the nicotinamide
ring of NADPH to the carbonyl group being reduced. The carbonyl function is
involved in a network of hydrogen bonds, which polarizes it
for the nucleophilic attack of the hydride ion. The hydroxyl group of a tyrosine residue
serves as a proton donor. Moreover, the ammonium group of a neighboring lysine
Fig. 27.22 Glycyrrhizin 27.59, a constituent of licorice, is a potent inhibitor of both
11β-HSD isoforms. Its derivative, carbenoxolone 27.60, is also able to block both isoforms of
11β-HSD. The arylsulfonamidothiazole BVT-2733 27.61 was developed as an inhibitor of 11β-HSD1.
The crystal structure shown in Fig. 27.25 was obtained with an analogous compound, 27.62.
The adamantylsulfone 27.63 with a central amide bond was developed at
Abbott.
Fig. 27.24 Crystal structure of human 11β-HSD1 with carbenoxolone 27.60 (gray). The inhibitor
binds competitively with cortisol, the natural ligand. The binding geometry of corticosterone 27.58
(green) was extracted from the crystal structure of the murine enzyme and superimposed on
human 11β-HSD1. This shows the binding geometry of the natural substrate.
immobilizes the OH groups of the sugar moiety and facilitates the proton transfer
by lowering the pKa value of the tyrosine residue (Figs. 27.23 and 27.24). The
different isoforms of 11β-HSD catalyze both oxidation steps (dehydrogenases)
and reduction steps (reductases) according to very
similar mechanisms.
Endocrinologists have long noticed a phenotypical similarity between the
relatively rare Cushing syndrome and the metabolic syndrome, which commonly
occurs in industrialized countries. Cushing syndrome occurs as
a consequence of excessive cortisol production and leads to a “full-moon face”
and adrenocortical obesity (central fat distribution). The alarming increase in
obesity in the industrialized world and the simultaneous increase in type-II diabetes
were already discussed in the two previous sections and in ▶ Sect. 26.8. Interestingly, an
elevated cortisol level has been found in the adipose tissue of obese people
compared to the tissues of lean people.
Evidently there is a tendency toward increased 11β-HSD1 activity in the
adipose tissue of people who tend toward obesity. Genetically altered mice with
no 11β-HSD1 activity proved resistant to diet-induced obesity.
The mice showed better lipid and lipoprotein levels and an increase in
insulin sensitivity in the liver. On the other hand, transgenic mice with induced
overexpression of 11β-HSD1 in adipose tissue showed an increasing insulin resistance.
These results suggest that a reduction in 11β-HSD1 activity might represent
a promising therapeutic principle for the treatment of metabolic syndrome.
Fig. 27.25 Crystallographically determined binding geometries of the inhibitors 27.62 (beige)
and 27.63 (gray) together with the binding mode of corticosterone 27.58 (green) taken from the
crystal structure of the murine enzyme. Despite entirely different molecular scaffolds, the
inhibitors largely occupy the steroid’s position. They bind with their amide (27.63) or amide-like
(27.62) groups to Ser170 and Tyr183 of the catalytic triad. Lys187 holds the ribose moiety in
position and polarizes the oxygen functionality of the neighboring Tyr183.
group. The example of this enzyme also underscores the point that structurally very
different molecular scaffolds can mimic the geometry and properties of a steroid to
successfully block the binding pocket and therefore the catalytic mechanism.
The family of cytochrome P450 enzymes plays a central role in drug metabolism.
The fundamentals of distribution, transport, and degradation of drugs were already
discussed in ▶ Sect. 9.2. Here the architecture and mode of action of these
monooxygenases shall be introduced, above all their interaction with low-
molecular-weight active substances. The cytochrome P450s (CYPs) are a super-
family of heme proteins that carry out biochemical transformations as
monooxygenases, usually by the introduction of oxygen onto the substrate being
oxidized. They have an iron-containing protoporphyrin system as a prosthetic group
in their center. The fifth, apical position of iron is coordinated by a cysteine residue.
An oxygen is intermediately bound at the sixth coordination position and is intro-
duced to the substrate from there. The name comes from a typical absorption band at
450 nm that is observed when the complex is blocked with carbon monoxide.
The proteins are constructed from about 500 amino acids. To date, more than
6,000 CYP genes have been described in Nature. In humans, 17 families have
been characterized, which are subcategorized into 57 isoenzymes. A combination of
numbers and letters is used to name the proteins: the first number indicates
the family, the letter the subfamily, and the second number the isoform. In
the body, these are found predominantly in the liver, lung, and the gastrointestinal
tract. This provides clues about their function: above all, to intervene in the metab-
olism of xenobiotics. Some CYPs carry out important transformations on endogenous
substrates, such as CYP 2R1 in vitamin D metabolism, CYP 19A1 (aromatase) in
steroid metabolism, or CYP 2J2 and CYP 5A1 (thromboxane synthase) in eicosanoid
metabolism. Xenobiotic compounds are transformed in so-called phase-I reactions to
better water-soluble and hence more easily excretable substances. Usually these
transformations serve to detoxify compounds, but in a few cases a toxification of
the substrate can also occur (▶ Sect. 9.1). A few typical reactions that are catalyzed
by CYPs are listed in Fig. 27.26.
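The naming scheme described above (family number, subfamily letter, isoform number) lends itself to a small parser. The following is a minimal sketch; the class name, the function name, and the regular expression are illustrative assumptions and do not reflect any official nomenclature tool.

```python
import re
from typing import NamedTuple


class CypName(NamedTuple):
    family: int      # first number, e.g. 3 in CYP 3A4
    subfamily: str   # letter, e.g. "A" in CYP 3A4
    isoform: int     # second number, e.g. 4 in CYP 3A4


# Accepts names written with or without a space after "CYP".
_CYP_PATTERN = re.compile(r"^CYP\s*(\d+)([A-Z]+)(\d+)$")


def parse_cyp_name(name: str) -> CypName:
    """Split a name such as 'CYP 3A4' into family, subfamily, and isoform."""
    match = _CYP_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a recognized CYP name: {name!r}")
    family, subfamily, isoform = match.groups()
    return CypName(int(family), subfamily, int(isoform))


for example in ("CYP 3A4", "CYP 2D6", "CYP 19A1"):
    print(example, "->", parse_cyp_name(example))
```

Applied to the names used in this section, CYP 3A4 resolves to family 3, subfamily A, isoform 4, and CYP 19A1 (aromatase) to family 19, subfamily A, isoform 1.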
The catalytic cycle of P450 enzymes is NADPH dependent. Initially the iron ion
in the heme center is in the +3 oxidation state. The substrate diffuses into
a reaction cavity that is practically fully shielded from the outside (Fig. 27.27).
A helical sequence segment allows entrance into the catalytic site and also acts as
a lid over the site. An NADPH reductase delivers the first electron to the cyto-
chrome and reduces the iron atom there. Then molecular oxygen coordinates to the
iron. In the next step, NADPH reductase delivers a second electron. Then a proton is
taken up, and a Fe2+OOH species forms, which homolytically cleaves the sub-
strate’s C—H bond being oxidized with concomitant water release. An OH group is
stereoselectively transferred from the iron to the carbon being oxidized. The iron
returns to its original +3 state, and the oxidized product can leave the binding
27.6 The Cytochrome P450 Enzyme Family 671
Fig. 27.26 Examples of typical oxidation reactions catalyzed by cytochrome P450
enzymes; X stands for heteroatoms such as nitrogen or sulfur.
pocket. Even today, the reaction is not yet fully understood in detail. It has been
shown, however, that P450 enzymes are able to adapt to their substrates, sometimes
to an extreme extent. Even the uptake of two different molecules, as opposed to just
one substrate molecule, into the binding pocket is possible. As we shall see, this has
broad consequences for drug metabolism. The majority of CYPs are in the liver. In
mammals, they are embedded in the membrane of the endoplasmic reticulum by an
anchor. The distribution of CYPs into different families is shown in Fig. 27.28. If
their role in drug metabolism is considered, CYP 3A4, CYP 2D6, and CYP 2C9
take on the lion’s share of this task (Table 27.3). CYP 3A4 in particular has
a pronouncedly adaptive structure. Its binding pocket broadens
from 900 Å³ in the uncomplexed state to 2,000 Å³ in the complexed state upon
binding erythromycin (Fig. 27.27). Moreover, erythromycin must completely
rearrange in the binding pocket, because in the experimentally determined crystal
structure, the group being oxidized is still 17 Å away from the heme center.
P450 enzymes can be blocked by diverse compounds. Compounds containing
heteroaromatic rings such as imidazole or triazole tend to inhibit them. Fluconazole
27.6 and ketoconazole 27.7 represent potent CYP 3A4 inhibitors (Fig. 27.6). Other
examples are the flavonoids such as naringenin 27.9, which is contained in grape-
fruit juice. They are metabolized by CYPs to active inhibitors and finally bind
irreversibly to diverse CYPs, above all to CYP 3A4. Additional examples are listed
in Table 27.3. These inhibitory properties must be considered when a CYP inhibitor
Fig. 27.27 Crystal structures of human CYP 3A4 in an uncomplexed state (a), with bound
metyrapone 27.8 (b), with erythromycin 32.29 (c), and with ketoconazole 27.6 (d). The protein
is shown with a white surface that is red colored on the interior. The ligands are shown with their
own surfaces (outside green, inside blue). In the case of ketoconazole, two ligands bind to the
protein (the second molecule is shown with a violet surface and a cyan interior). CYP 3A4’s
binding pocket, which is nearly fully closed to the exterior (cf. (a) and (b)), has proven itself to be
extremely adaptive. It is only because of this that the enzyme can take on ligands with entirely
different sizes and shapes.
Fig. 27.28 Percentage of CYP P450 enzymes involved in drug metabolism and their relative
distribution. (a) A study from 2002 compiled the data for the relative portion of the different CYP
enzymes that take part in the metabolism of the 200 best-selling drugs. (b) Proportion of the
different CYP enzymes in the small intestines. (c) Relative distribution of CYP enzymes over the
different P450 families in humans.
Table 27.3 Examples of drugs that act as substrates, inhibitors, or inducers of CYP 3A4, CYP
1A2, and CYP 2D6

Enzyme    Substrate          Inhibitor          Inducer
CYP 3A4   Amitriptyline      Ketoconazole       Barbiturate
          Clarithromycin     Cimetidine         Carbamazepine
          Ciclosporin        Ciprofloxacin      Glucocorticoids
          Dexamethasone      Erythromycin       Phenobarbital
          Carbamazepine      Fluconazole        Rifampicin
          Terfenadine        Ritonavir          St. John’s wort
          Ethinylestradiol   Grapefruit juice
CYP 1A2   Caffeine           Cimetidine         Insulin
          Amitriptyline      Ciprofloxacin      Omeprazole
          Paracetamol        Grapefruit juice   Aromatic hydrocarbons
          Theophylline                          Smoking
          Verapamil
CYP 2D6   Amitriptyline      Cimetidine         Dexamethasone
          Captopril          Haloperidol
          Chlorpromazine     Clotrimazole
          Codeine            Quinidine
          Imipramine         Ritonavir
          Metoprolol
          Propafenone
          Debrisoquine
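A table such as Table 27.3 can be used to flag combinations in which one agent is a substrate of an enzyme and the other inhibits or induces that same enzyme. The sketch below transcribes only a few entries per enzyme; the data structure and the function name are illustrative assumptions, and the result is a lookup aid, not a clinical decision tool.

```python
# Excerpt of Table 27.3; only a few entries per enzyme are transcribed here.
CYP_TABLE = {
    "CYP 3A4": {
        "substrate": {"ciclosporin", "terfenadine", "ethinylestradiol"},
        "inhibitor": {"ketoconazole", "erythromycin", "grapefruit juice"},
        "inducer": {"rifampicin", "st. john's wort"},
    },
    "CYP 1A2": {
        "substrate": {"caffeine", "paracetamol"},
        "inhibitor": {"cimetidine", "ciprofloxacin"},
        "inducer": {"omeprazole", "smoking"},
    },
    "CYP 2D6": {
        "substrate": {"codeine", "metoprolol", "debrisoquine"},
        "inhibitor": {"quinidine", "haloperidol"},
        "inducer": {"dexamethasone"},
    },
}


def flag_interactions(drug_a, drug_b):
    """List enzymes where one agent is a substrate and the other an inhibitor or inducer."""
    a, b = drug_a.lower(), drug_b.lower()
    findings = []
    for enzyme, roles in CYP_TABLE.items():
        # Check both directions: a affected by b, and b affected by a.
        for victim, perpetrator in ((a, b), (b, a)):
            if victim in roles["substrate"]:
                for role in ("inhibitor", "inducer"):
                    if perpetrator in roles[role]:
                        findings.append((enzyme, victim, role, perpetrator))
    return findings


print(flag_interactions("Terfenadine", "Ketoconazole"))
print(flag_interactions("Ciclosporin", "Rifampicin"))
```

An inhibitor hit suggests that the plasma level of the substrate may rise, whereas an inducer hit suggests that it may fall below the therapeutic level.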
Fig. 27.29 In addition to alcohol dehydrogenase, alcohol is metabolized by CYP 2E1. This
enzyme is overexpressed because of induction in chronic alcoholics. The analgesic paracetamol
27.64 is partly metabolized by CYP 2E1; the remainder is conjugated by sulfation and
glucuronidation. Compound 27.65 is formed in the process as a toxic
intermediate. At low concentrations, it can be detoxified by glutathione. If, however, elevated
levels of paracetamol end up in this pathway because of CYP 2E1 upregulation, the supply of
glutathione will be insufficient, and a poisoning can occur.
drinkers. If, however, these drinkers wish to treat their hangovers with paracetamol
(acetaminophen) the next morning, problems can occur. Paracetamol 27.64 is
partially metabolized by CYP 2E1, and it is through this enzyme that the toxic
intermediate is formed (Fig. 27.29). If the intermediate is present in low concen-
trations, it can be transformed with the available glutathione and detoxified. If
paracetamol is metabolized extensively via this pathway, the available amount of
glutathione is inadequate, and toxicity symptoms can occur. This danger is the most
severe with heavy drinkers, in whom the CYP 2E1 concentration is permanently
elevated by continuous induction, and in whom paracetamol is predominantly
metabolized through this pathway.
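The threshold behavior described here, detoxification as long as glutathione suffices and toxicity once it is exhausted, can be illustrated with a toy mass balance. Every number below is an arbitrary placeholder chosen for illustration, not a pharmacological value, and the function name and parameters are hypothetical.

```python
def unconjugated_intermediate(dose, fraction_via_cyp2e1, glutathione_capacity):
    """Amount of reactive intermediate left over after glutathione conjugation.

    All quantities are in the same arbitrary units. The model assumes that the
    intermediate is detoxified one-to-one until the glutathione pool is used up.
    """
    if not 0.0 <= fraction_via_cyp2e1 <= 1.0:
        raise ValueError("fraction_via_cyp2e1 must lie between 0 and 1")
    formed = dose * fraction_via_cyp2e1
    return max(0.0, formed - glutathione_capacity)


# Placeholder scenario: same dose and glutathione pool, different CYP 2E1 share.
normal = unconjugated_intermediate(1000, 0.05, 100)    # low CYP 2E1 activity
induced = unconjugated_intermediate(1000, 0.30, 100)   # induced CYP 2E1
print(normal, induced)
```

In this sketch the intermediate is fully captured when only a small share of the dose passes through CYP 2E1, whereas an induced enzyme pushes the amount formed beyond the assumed glutathione capacity.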
Saturation of cytochromes, their upregulation by induction, or their inhibition
through drug–drug interactions represents a serious potential danger in drug
metabolism. Therefore efforts are made in drug design to estimate the metabolic
profile of a development candidate. It would be valuable to know at which position
a compound is metabolized and whether cytochrome inhibition is to be expected,
especially of the most important enzymes. The crystal structure determination of the
essential human CYPs was pursued aggressively. The information obtained was
rather disillusioning. The proteins have such extremely adaptive properties that it
seems practically impossible to predict plausible binding modes to estimate
inhibition data. Even predicting which parts of a molecular scaffold are preferentially
metabolized and which metabolites are to be expected has not become any
easier, despite the many crystal structures. Currently, routine structural determination
of each development candidate with these proteins still seems rather utopian.
Moreover it has been shown that not only binary but also ternary complexes with
27.7 What Makes Slow and Fast Metabolizers Different? 675
one or two different ligands can be formed. Only time will tell how the methodology
in this area develops. The current state of the art, however, allows the estimation
of metabolic properties with empirical QSAR models and 3D comparisons
(▶ Chaps. 17, “Pharmacophore Hypotheses and Molecular Comparisons” and
▶ 18, “Quantitative Structure–Activity Relationships”). The program MetaSite
from Gabriele Cruciani at the University of Perugia in Italy attempts to find the
best fit by considering possible complementary interaction patterns on
the surface of the ligand and in the binding pocket. Multiple ligand conformations
are considered for this. Next, concepts about possible binding modes in the binding
pocket of the metabolizing cytochromes are developed that estimate the spatial
accessibility of the different sites in the ligand for oxidative attack by the iron atom.
Furthermore, the technique draws on a system of rules for judging the reactivity of organic
molecules that are similar to those that were introduced to construct the Hammett
equation (▶ Sect. 18.2). Both concepts rank the individual centers in a molecule with
regard to the probability of a metabolic transformation. The combination allows the
metabolic properties of drugs to be estimated surprisingly well.
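The idea of combining an accessibility estimate with a reactivity estimate to rank candidate sites can be sketched as follows. This is not MetaSite’s actual scoring function: multiplying the two terms is an assumption made for illustration, and the site labels and scores are hypothetical.

```python
def rank_metabolic_sites(sites):
    """Rank candidate sites by the product of accessibility and reactivity.

    sites maps a site label to a pair (accessibility, reactivity), each in [0, 1].
    Returns a list of (label, combined_score), highest score first.
    """
    scored = {label: acc * react for label, (acc, react) in sites.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for three positions of an imaginary ligand.
example_sites = {
    "benzylic C-H": (0.8, 0.9),   # well exposed to the heme iron and reactive
    "N-methyl": (0.6, 0.7),
    "aromatic C-H": (0.9, 0.2),   # well exposed but intrinsically unreactive
}
ranking = rank_metabolic_sites(example_sites)
for label, score in ranking:
    print(f"{label}: {score:.2f}")
```

A product penalizes sites that score poorly on either criterion, so a position must be both reachable by the iron center and intrinsically reactive to rank highly. Other combination rules, such as a weighted sum, would be equally conceivable.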
Fig. 27.30 Correlation between genetic variability and the metabolism of the antihypertensive
debrisoquine 27.66 to 4-hydroxydebrisoquine 27.67 by CYP 2D6 (histogram of the number of
probands against the metabolic ratio debrisoquine/4-hydroxydebrisoquine, plotted on
a logarithmic scale from 0.01 to 100). The Caucasian population metabolizes this
drug with CYP 2D6 and is divided into slow, extensive, and ultrafast metabolizers. If a standard
dose of the drug is prescribed, the extensive metabolizers would respond well. On the other hand,
the same dose will lead to a plasma level that is too high for the slow metabolizers, which can lead
to side effects. The ultrafast metabolizers will barely reach a plasma level that is adequate for
therapy, and the desired effect of the drug will not be achieved.
CYP, the patient might benefit from a switch to a different drug. There are already
chips on the market with which a patient’s genetic CYP profile can be recorded. It must
be said, however, that the genomic information coding for a particular
enzyme is not adequate to assign an individual to a metabolic group. What matters for
metabolic efficiency is not the genotype, that is, the individual genetic complement
coding for the proteins, but rather the quantity of a protein that is actually expressed.
This determines the appearance, that is, the phenotype. Additionally, the phenotype can vary
according to the lifestyle and state of health of a person. One only has to think
about the induction of CYP 2E1 in heavy drinkers. This patient profile becomes
important if the therapeutic window for the use of a drug is very narrow, that is, if
the difference between the dose giving the desired effect and a toxic dose is small
(▶ Sect. 19.7).
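Assigning a proband to a metabolizer group from the measured metabolic ratio of Fig. 27.30 amounts to a simple threshold classification. The sketch below is only illustrative: the two cutoff values are assumptions passed in as parameters, not values taken from this text, and the function name is hypothetical.

```python
def classify_metabolizer(metabolic_ratio, slow_cutoff=12.6, ultrafast_cutoff=0.1):
    """Classify a proband by the ratio debrisoquine / 4-hydroxydebrisoquine.

    A high ratio means that little drug was hydroxylated (slow metabolizer);
    a very low ratio means that almost all of it was (ultrafast metabolizer).
    Both cutoffs are assumed placeholder values.
    """
    if metabolic_ratio <= 0:
        raise ValueError("metabolic_ratio must be positive")
    if metabolic_ratio > slow_cutoff:
        return "slow"
    if metabolic_ratio < ultrafast_cutoff:
        return "ultrafast"
    return "extensive"


for ratio in (50.0, 1.0, 0.05):
    print(ratio, "->", classify_metabolizer(ratio))
```

Because the ratio is measured on the expressed enzyme activity, it reflects the phenotype rather than the genotype, in line with the distinction drawn above.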
It should also be mentioned that genetic differences in the cytochrome comple-
ment are not the only factors that lead to variable metabolic behavior. Transferases
(▶ Chap. 26, “Transferase Inhibitors”) that transfer, for instance, acetyl groups,
sugar moieties, or methyl groups, play an important role too. These enzymes are
also differently expressed in the general population, which is divided into, for
example, fast and slow acetylators. More attention must be paid to the metabolism
and the genetic and phenotypical variability. How the proband groups are distrib-
uted in terms of their metabolic characteristics must also be better investigated in
clinical trials. It is only then that reliable data about the therapeutic breadth of
a drug can be obtained before the drug gains widespread use in therapy.
27.8 Blocking the Degradation of Neurotransmitters: Monoamine Oxidase Inhibitors 677
Fig. 27.31 Serotonin 27.68, dopamine 27.69, adrenaline 27.70, and tyramine 27.71 are
metabolized by MAOs. At first, hydrazide derivatives such as isoniazid 27.72 and iproniazid 27.73
and the hydrazine phenelzine 27.74 were discovered as inhibitors. Follow-up drugs such as
tranylcypromine 27.75 react with the FAD system of the flavoenzyme by a ring-opening reaction,
whereas others such as pargyline 27.76, L-deprenyl (selegiline) 27.77, and clorgyline 27.78 react
through their propargyl groups.
Fig. 27.32 Possible mechanism for the deamination and inhibition of MAO enzymes. (a) Biogenic
amines are transformed into iminium compounds in a redox reaction by a hydrogen abstraction
next to the amino group. Formally, a hydride ion is transferred to the oxidized form of the FAD
system. After hydrolysis of the iminium ion, ammonia and aldehyde are obtained. The prosthetic
group is reoxidized with molecular oxygen, whereby H2O2 is formed. (b) Tranylcypromine
27.75 reacts with the oxidized form of the FAD system upon ring opening and forms a covalent
bond to C4a on the ring. (c) Derivatives such as L-deprenyl 27.77 transfer one of their
propargylic hydrogens onto the oxidized form of the FAD scaffold. A covalent bond is formed
with the N5 nitrogen atom. A delocalized electron system between the FAD molecule and the
inhibitor is formed.
Fig. 27.33 (a) Crystal structure of MAOB in complex with covalently bound tranylcypromine
27.75. The inhibitor is attached to the FAD system through the C4a carbon atom. (b) Crystal
structure of MAOB in complex with covalently bound L-deprenyl 27.77, which is coupled to the
cofactor through the N5 nitrogen atom. A delocalized electron system between the FAD molecule
and the inhibitor is formed.
scaffold. This moiety reacts with the nitrogen of the central FAD ring by building
a covalent bond. A delocalized electron system incorporating multiple bonds is
formed (Figs. 27.32 and 27.33).
L-Deprenyl 27.77 selectively blocks MAOB whereas clorgyline 27.78 selectively
inhibits the MAOA isoform. Both isoforms are very similar in the vicinity of the
FAD-binding site. Deviations occur only in the region of the pocket where the
biogenic amine binds as a substrate. Of the 20 amino acids that make up this area,
seven are structurally different. Above all, the two residues Ile199 and Tyr326 in
MAOB are exchanged for Phe208 and Ile335 in MAOA. They give the binding
pocket a different shape. The pocket is shorter, but broader and shallower in
MAOA. It is bordered by Phe208 from below, and can easily accommodate the
2,4-dichlorophenoxy group of clorgyline. The phenoxy group must adopt
a conformation that results in a parallel orientation extending the aliphatic chain
of the inhibitor. This is achieved due to the conformational properties of the
phenoxymethyl group, which, despite the ortho substituent, prefers to adopt
a planar arrangement with the attached chain (Fig. 27.34a). In MAOB, the pocket
takes on a deeper crevice shape in which a phenyl ring fits alongside its edge. The
volume of the pocket is restricted at the rim by the larger Tyr326 residue. Instead of
a phenyl ring on the bottom (cf. Phe208 in MAOA), the wall of the pocket is made
up by Ile299. This residue is considered to have the properties of a flexible entrance
gate. The inhibitor’s phenethyl group, upon which both ortho positions must remain
unsubstituted, enables the ligand, deprenyl, to take on the needed conformation
with a perpendicular orientation of its terminal aromatic ring relative to the
aliphatic chain (Fig. 27.34b).
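The two ring orientations described above can be expressed as a torsion angle about the bond that links the aromatic ring to the chain. The following Python sketch computes such an angle from four atom positions. The function names, the tolerance, and the idealized coordinates are illustrative assumptions; they are not taken from the crystal structures discussed here.

```python
import math


def torsion_angle(p1, p2, p3, p4):
    """Return the torsion angle p1-p2-p3-p4 in degrees (range -180 to 180)."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def describe_orientation(angle_deg, tolerance=30.0):
    """Classify a torsion angle; the 30 degree tolerance is an arbitrary choice."""
    folded = abs(angle_deg) % 180.0  # 180 is folded onto 0
    if min(folded, 180.0 - folded) <= tolerance:
        return "in-plane"
    if abs(folded - 90.0) <= tolerance:
        return "perpendicular"
    return "intermediate"


# Idealized geometries (hypothetical coordinates, arbitrary length units).
planar = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
edge_on = torsion_angle((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
print(describe_orientation(planar), describe_orientation(edge_on))
```

In this picture, the phenoxymethyl group of clorgyline corresponds to the in-plane case and the phenethyl group of deprenyl to the perpendicular case.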
In addition to the irreversible, covalently binding inhibitors, reversibly binding
inhibitors such as moclobemide 27.79, befloxatone 27.80, or toloxatone 27.81
(Fig. 27.35) are also known. They also occupy the part of the binding pocket that
takes up the biogenic amine substrate. They do not, however, form a covalent bond
with the FAD scaffold.
MAO inhibitors are especially used as antidepressants and for the treatment of
Parkinson’s disease. The antidepressant effect is primarily achieved by a specific
inhibition of MAOA in the central nervous system. In the brain, the levels of
dopamine, noradrenaline, and serotonin rise. The Parkinson’s disease therapy,
which is usually coupled with an L-DOPA strategy (▶ Sect. 26.9), is focused on
the inhibition of MAOB because this isoform is overexpressed in the brains of
Parkinson’s patients. Because both isoforms metabolize dopamine equally well, an
attempt is made to intervene in the Parkinson’s etiology with selective MAOB
inhibitors.
In addition to the above-mentioned liver toxicity that was observed with the first-
generation antidepressive hydrazide-type MAO inhibitors, hypertensive crises as
a result of an acute dysregulation of the blood pressure were also observed. This led
to these substances’ withdrawal from the market. The liver toxicity could be largely
avoided with compounds such as tranylcypromine 27.75 or pargyline 27.78
(Fig. 27.31), but the hypertensive crises continued to occur. These could be provoked
by increased concentration of tyramine in the body, above all when certain foods
containing high levels of tyramine (e.g., cheese, causing the so-called cheese
effect, or wine) were ingested, and the metabolic degrading enzymes were irre-
versibly blocked with a MAO inhibitor. An elevated concentration of noradrenaline
is the consequence, which activates the vascular system and can lead to arrhythmias
or heart attacks. Reversible MAOA inhibitors can avoid this problem to a certain
extent. They adequately block the enzyme in the central nervous system to achieve
the desired antidepressant effect. In the periphery, tyramine displaces the reversible
inhibitor and this allows tyramine degradation.
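The displacement of a reversible inhibitor by tyramine can be illustrated with the standard expression for competitive binding at a single site. The sketch below is a simplification under stated assumptions: it assumes rapid equilibrium, treats the Km of the substrate as if it were a dissociation constant, and uses invented concentration ratios rather than measured values. The function name is hypothetical.

```python
def inhibitor_occupancy(inhibitor_conc, ki, substrate_conc, km):
    """Fraction of enzyme occupied by a reversible, competitive inhibitor.

    All four arguments must use the same concentration unit.
    """
    i = inhibitor_conc / ki    # inhibitor concentration in units of Ki
    s = substrate_conc / km    # substrate concentration in units of Km
    return i / (1.0 + i + s)


# Hypothetical scenario: the inhibitor is present at its Ki while the
# tyramine level rises after a tyramine-rich meal.
for tyramine_over_km in (0.0, 1.0, 8.0):
    occupied = inhibitor_occupancy(1.0, 1.0, tyramine_over_km, 1.0)
    print(f"tyramine = {tyramine_over_km:>4} x Km -> inhibitor-bound fraction {occupied:.2f}")
```

A covalently bound inhibitor has no such dependence on the substrate concentration, because it does not dissociate; this is the difference the paragraph above describes.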
682 27 Oxidoreductase Inhibitors
[Structure diagrams: (a) FAD-bound clorgyline in MAOA with the labeled residues Ile335 and Phe208; (b) FAD-bound deprenyl in MAOB with the labeled residues Tyr326 and Ile199.]
Fig. 27.34 MAOA (a) and MAOB (b) differ in the binding region of the biogenic amine. The
shape of the binding pocket is mainly determined by the exchange of Phe208 → Ile199 and Ile335
→ Tyr326. The 2,4-dichlorophenoxymethyl moiety of the selective inhibitor clorgyline 27.78
(violet) binds in the broad and shallow binding pocket, which is bordered by Phe208, in the
complex with MAOA (residues are violet). The inhibitor’s aliphatic chain lies in the same plane as
the aromatic ring. This chain conformation relative to the ring is the preferred geometry for this
group. A statistical evaluation of the geometry of the ortho-chlorophenoxymethyl group in
small-molecule crystal structures indicates torsion angles (red) of preferentially 180°. The binding
pocket in the complex of MAOB (orange residues) with the selective inhibitor L-deprenyl 27.77 is
severely limited by Tyr326 and opens only a narrow crevice. The phenyl group of the inhibitor
(gray) submerges into this crevice. For this, the aromatic ring must adopt an orientation that is
perpendicular (90°) to the attached chain. A geometry similar to the one of the
dichlorophenoxy group in clorgyline is not possible for steric reasons (cf. superimposed geometry
of 27.78). On the other hand, deprenyl’s phenyl ring cannot bind to MAOA with the same
“submerged” edge-on geometry because a steric conflict with Phe208 would occur. Here, too,
a statistical analysis of the torsion angles (blue) shows a clear preference for values of 90°, which
corresponds exactly to the desired perpendicular orientation of the plane of the phenyl ring to
the chain.
[Structure diagrams of compounds 27.79–27.85; legible labels: 27.83 citalopram, 27.84 sertraline, 27.85 almotriptan.]
Fig. 27.35 Examples of reversible MAO enzyme inhibitors 27.79–27.81. The antibiotic linezolid
27.82 bears structural similarity to the oxazolidinones 27.80 and 27.81. It also blocks MAOA.
MAO enzymes also play a role in drug metabolism. Citalopram 27.83, sertraline 27.84, and
triptans such as almotriptan 27.85 are metabolized by these enzymes.
The organism synthesizes a great many important signal molecules from compo-
nents of the lipid membrane. Phospholipids are the starting materials from which
arachidonic acid 27.86 is formed (Fig. 27.36). This chain-type molecule of
[Reaction scheme: arachidonic acid is converted by the cyclooxygenase activity into PGG2 27.87 and by the peroxidase activity into PGH2 27.88; PGH2 leads to prostacyclin PGI2 27.89, PGE2 27.90, PGF2 27.91, PGD2 27.92, and thromboxane TXA2 27.93.]
Fig. 27.36 Arachidonic acid 27.86 is transformed into the prostaglandin PGH2 27.88 by the
bifunctional enzyme cyclooxygenase by using a cyclooxidation and a peroxidase step. PGH2 is the
starting material for the synthesis of a variety of prostaglandins 27.89–27.93, which are formed by
specific synthases.
20 carbon atoms has a carboxyl group as its only polar function. It is characterized
by four isolated cis double bonds. To be able to generate paracrine hormones such
as the prostaglandins 27.87–27.93 with sufficient water solubility, arachidonic
acid must be oxidized. Oxygen-containing functional groups must be transferred.
This role is taken on by the cyclooxygenases (COX). These are bifunctional
enzymes that catalyze the transformation to prostaglandins in two steps.
Initially a cyclooxidation takes place, then a peroxidase reaction (Fig. 27.36).
Because of the poor water solubility, arachidonic acid diffuses directly from the
membrane into the reaction site of the cyclooxygenase. The enzyme submerges into
the membrane. Three helices are used to allow it to practically swim in the
membrane. These helices attach the protein into the membrane, but they do not
traverse it as is often observed in membrane-anchored proteins. There are two
isoforms, COX-1 and COX-2, the amino acid sequences of which are 65% identical
(Fig. 27.37). Their catalytic sites are almost identically constructed. They are active
as dimers. Access to the catalytic site is obtained through a long channel that opens
27.9 Cyclooxygenase: A Key Enzyme in Pain Sensation 685
Fig. 27.37 Two isoenzymes COX-1 (green) and COX-2 (blue), which have 65% sequence
homology, are known. They are catalytically active as dimers and submerge into the membrane
with a ring of the hydrophobic helices (coming out of the page, toward the reader). This ring
represents an opening to the channel through which arachidonic acid 27.86 (dark blue) can diffuse
from the membrane into the catalytic site. The superposition of the crystal structures of both
isoforms is shown from the direction of the membrane.
directly to the membrane environment. The natural substrate arachidonic acid 27.86
is taken up in this way. The channel is somewhat narrower in COX-1 than in COX-2
because an isoleucine in a central position is exchanged for a valine. The
arachidonic acid that has diffused into the channel is transformed into endoperoxide
PGG2 by addition of oxygen at C11 and C15 (Fig. 27.36).
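A figure such as the 65% identity quoted for COX-1 and COX-2 is obtained by counting matching positions in a sequence alignment. The following Python sketch shows one common way to do this. The sequences are short invented strings, not the COX sequences, and the convention of ignoring gapped columns is an assumption; other conventions give slightly different percentages.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned, gap-free columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Invented toy alignment in one-letter amino acid code.
print(percent_identity("MKTAYIAK", "MKTAHIAK"))
```

Note that identity counts only exact matches; conservative exchanges such as isoleucine for valine are counted as differences even though they may change the binding site only slightly.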
The heme cofactor in the vicinity of the reaction channel is essential for the
transformation. Its fifth coordination site is occupied by histidine. The oxidative
oxygen species is bound at the sixth position. The dioxygen species is transferred as
a hydroperoxide in a two-electron reaction. Tyr385 acts as an intermediate tyrosyl
radical for the electron transfer and abstracts a hydrogen atom from C13
(Fig. 27.38). The temporarily present, unsaturated radical adds the peroxide group
to the allyl position at C11. Subsequently, a cyclic peroxide is closed with C9; C8
reacts with C12, which is spatially nearby, and forms a 5-membered carbocyclic
ring. Another hydrogen atom abstraction on C13 initiates a peroxide transfer onto
C15, which is also in an allylic position. This is transformed into a hydroxyl
function in the subsequent reduction steps catalyzed by the peroxidase activity. It
is presumed that the peroxidase reaction site, which is located on the opposite
side of the enzyme, is accessed from the outside of the protein in the vicinity of the
endoplasmic reticulum. For this, the oxidized substrate must diffuse out of
the arachidonic acid channel to the position of the peroxidase reaction. Tyr385,
which is found deep in the protein in the vicinity of the heme center, is critical for
[Reaction scheme: arachidonic acid 27.86 next to Tyr385 and the Fe-heme center, with carbon atoms 5, 8, 9, 11, 12, and 15 numbered, is converted via radical intermediates into PGG2 27.87 and PGH2 27.88.]
Fig. 27.38 The chemical transformation of arachidonic acid 27.86 to PGG2 27.87 and PGH2
27.88 begins with an attack of the Tyr385 tyrosyl radical on the C13 carbon atom, from which
a hydrogen atom is abstracted. The intermediately formed, unsaturated radical adds a peroxide
group to C11. A cyclic peroxide is formed with C9 by a ring-closing reaction. The tyrosyl radical
abstracts another hydrogen atom from C13, and C8 closes a carbocycle with C12 to form PGG2
27.87. The product leaves the binding pocket and is further chemically transformed to PGH2 27.88
in a peroxidase reaction.
the overall reaction. It catalyzes the oxidation and reduction steps according to the
changing oxidation states of the iron. The oxygen species that are to be transferred
are simultaneously supplied from this center. Two enzymatic processes that are
tightly interwoven take place in the COX enzymes. A dioxygen species is needed as
a reagent for cyclooxygenase activity. Tyrosine has a special task because it is
coupled as an intermediate radical to both activities. The radical state of this residue
is formed during the peroxidase reaction and it initiates the cyclooxygenase reac-
tion via homolytic hydrogen atom abstraction. The crystal structures of COX-1 with
[Figure labels: Fe (heme), Tyr385, Arg120, Ser530, Ile523; substrate carbon atoms 8, 11, 13, and 15.]
Fig. 27.39 Superposition of arachidonic acid 27.86 (violet) and PGH2 27.88 (gray) in the reaction
channel of COX. The heme center to which oxygen is bound is at the top of the right side. Tyr385
(yellow) is responsible for the hydrogen abstraction from C13 of arachidonic acid. The atoms of
the protein are largely removed, and the reaction channel is indicated with a transparent surface.
The displayed geometry is based on the crystallographically determined complexes of COX with
arachidonic acid and PGH2.
the superimposed arachidonic acid substrate 27.86 (violet) and the product PGH2
27.88 (gray) are shown in Fig. 27.39.
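The link between hydrogen abstraction at C13 and oxygen addition at C11 and C15 follows from the positions of the double bonds. The sketch below assumes the standard numbering of arachidonic acid, with the four cis double bonds starting at C5, C8, C11, and C14 counted from the carboxyl carbon. It lists the methylene carbons that are flanked by two double bonds and the two ends of the delocalized radical formed when a hydrogen atom is removed from such a carbon. The function names are illustrative.

```python
# Each entry n stands for a C(n)=C(n+1) double bond (standard numbering,
# C1 = carboxyl carbon); this numbering is an assumption of the sketch.
ARACHIDONIC_DOUBLE_BONDS = (5, 8, 11, 14)
CHAIN_LENGTH = 20


def sp2_carbons(double_bonds):
    """Return the set of carbon atoms that are part of a C=C double bond."""
    return {c for start in double_bonds for c in (start, start + 1)}


def bis_allylic_carbons(double_bonds, chain_length=CHAIN_LENGTH):
    """Carbons that sit directly between two double bonds."""
    sp2 = sp2_carbons(double_bonds)
    return [c for c in range(1, chain_length + 1)
            if c not in sp2 and (c - 1) in sp2 and (c + 1) in sp2]


def radical_termini(carbon, double_bonds=ARACHIDONIC_DOUBLE_BONDS):
    """Outer ends of the five-carbon radical formed by H abstraction at `carbon`."""
    if carbon not in bis_allylic_carbons(double_bonds):
        raise ValueError("carbon is not flanked by two double bonds")
    return (carbon - 2, carbon + 2)


print(bis_allylic_carbons(ARACHIDONIC_DOUBLE_BONDS))
print(radical_termini(13))
```

C13 is one of three such positions in the chain. Abstraction there spreads the unpaired electron over C11 to C15, and its two ends are the carbon atoms at which oxygen is added in the mechanism described above.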
PGH2 27.88 is the central starting material for the synthesis of a series of
arachidonic acid derived products (Fig. 27.36). A variety of synthases are involved
in the transformations that afford the different prostaglandins. COX catalyzes the
rate-determining step, and this explains its central role in the regulation of inflam-
matory processes. Prostaglandins are referred to as inflammatory mediators.
Prostacyclin PGI2 27.89 and PGE2 27.90 increase the vascular permeability. This
leads to tissue swelling, and rubor (redness) occurs as a result of the increased
perfusion. Nociceptive nerve endings are sensitized, and pain perception is
increased. In the stomach, PGI2 and PGE2 are involved in the regulation of the
mucous membranes and the stomach acid production. PGE2 is also associated with
the occurrence of fever in inflammatory processes. The prostaglandin PGF2 27.91 is
associated with reproductive processes. At the beginning of labor, COX-2 is
expressed in the placenta at elevated levels. The PGE2 that is produced is involved
in stimulating the uterus to contract. PGD2 27.92 takes on the task of regulating
contractions in the bronchial airway. PGH2 is a starting material for the synthesis of
[Structure diagrams of COX inhibitors 27.94–27.105; legible labels: 27.97 flurbiprofen, 27.98 indometacin, 27.99 sulindac, 27.103 rofecoxib, 27.104 etoricoxib, 27.105 lumiracoxib.]
Fig. 27.40 Inhibitors of COX isoenzymes. Acetylsalicylic acid 27.94 and the arylacetic acids or
propionic acids 27.95–27.100 are unspecific inhibitors of both isoforms. After the discovery of the
induced COX-2, the coxibs 27.101–27.104 were developed as selective inhibitors of this isoform.
Rofecoxib was withdrawn from the market due to an increased risk of cardiovascular diseases.
Lumiracoxib 27.105, which is structurally identical to diclofenac with the exception of a Cl/F
exchange and an additional methyl group, was introduced to the market as a COX-2-selective
inhibitor.
Fig. 27.41 The most probable binding mode of acetylsalicylic acid (ASA) 27.94 with COX-1.
ASA binds in the middle of the reaction channel (gray surface) that is normally occupied by the
natural substrate, arachidonic acid 27.86. The channel spans through the protein with a bent shape
from the lower left. ASA forms a salt bridge with Arg120 and reacts with the OH group of Ser530 by
transferring its acetyl group. This blocks the channel irreversibly. The additional volume that the
acetyl group blocks is indicated with a violet surface (interior is yellow). The displayed geometry is
based on a crystal structure that was determined with a bromine derivative of ASA.
before any surgery whether they have taken Aspirin® in the last week. Salicylic
acid, which lacks the acetyl group, is a weak but reversible inhibitor of COX that is
competitive to arachidonic acid. If Ser530 is mutated to Ala, the enzyme is
catalytically fully active. The mutant, however, is only weakly inhibited by ASA.
In addition to ASA, the arylacetic and propionic acids are another group of
slightly selective and reversible COX inhibitors that deserve mention. Among
others, ibuprofen 27.95, ketoprofen 27.96, flurbiprofen 27.97, indometacin 27.98,
sulindac 27.99, or diclofenac 27.100 (Fig. 27.40) belong to this class. Ibuprofen
also binds in the arachidonic acid channel and forms a salt bridge with its terminal
carboxylic acid function to Arg120. Moreover, oxicams, anthranilic acids, and
pyrazole derivatives are important COX inhibitors. They are termed NSAIDs
(non-steroidal anti-inflammatory drugs). The mode of action of paracetamol (acet-
aminophen) 27.64, a very old and widely used analgesic, was associated with COX
enzymes for a long time. Now, however, it seems that this drug might act by
being conjugated with arachidonic acid through amidation with its metabolite,
p-aminophenol, and in this way, it intervenes in the pain cascade. The newly formed
N-arachidonoyl-p-aminophenol acts at nanomolar concentrations on the vanilloid
receptor, an ion channel, and on the cannabinoid CB1 receptor, a GPCR; it also inhibits
the cellular uptake of the analgesically active anandamide (arachidonoylethanolamide).
Because COX-1 is constitutively expressed in all tissues, unselective COX
inhibitors also act in places where the prostaglandins are needed for other tasks
that have nothing to do with pain. An example is the production of prostacyclin
27.89, which is responsible for the regulation of the production of mucus in
the stomach. COX inhibitors block its synthesis, and the protective effect
on the stomach epithelial cells against the severely acidic milieu is lost as an
undesirable side effect. Gastric irritation is the result and can lead to severe
complications.
When it was discovered early in the 1990s that the expression of COX-2 is
upregulated at the site of pain, hopes were high that a side-effect-free pain therapy
could be achieved by selectively inhibiting this enzyme. A careful analysis of both
enzymes showed that there are small but significant differences: in position 523,
COX-1 has an Ile residue, whereas COX-2 has a Val residue. Further, though of less
importance, is an exchange of a Phe residue in COX-1 for a Leu residue in COX-2
at position 503. What can be expected in terms of selectivity from such a small
difference as a methyl group exchange? At the very least, the binding pocket of
COX-2 is 17% larger, and there is a new sub-pocket in the arachidonic acid channel
(Fig. 27.42). It stood to reason that structurally larger inhibitors could be developed
that take advantage of the additional sub-pocket. Such inhibitors can no longer
inhibit COX-1 because of the steric conflict that is caused by the isoleucine residue
at position 523. The first generation of successfully developed COX-2 inhibitors
27.101–27.104 all have a similar structure (Fig. 27.40). In the center is either a five-
or six-membered ring that is usually functionalized with aromatic substituents. This
causes a branched structure that mirrors the larger binding pocket of COX-2 better
than COX-1. In practice, it was demonstrated that the selective COX-2 inhibitors
left COX-1 uninhibited, and the side effects, such as bleeding of the gastric
mucous membranes or a decrease in kidney function, were almost fully eliminated.
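Isoform selectivity of the kind described above is commonly summarized as a ratio of the inhibitory concentrations measured on the two enzymes. The sketch below shows this arithmetic. The compound names and IC50 values are invented for illustration and do not describe any drug mentioned in the text; the function names are likewise hypothetical.

```python
import math


def selectivity_index(ic50_cox1, ic50_cox2):
    """Ratio IC50(COX-1) / IC50(COX-2); values above 1 favor COX-2 inhibition."""
    if ic50_cox1 <= 0 or ic50_cox2 <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_cox1 / ic50_cox2


def rank_by_cox2_selectivity(compounds):
    """Sort compounds by log10 of the selectivity index, most COX-2-selective first.

    `compounds` maps a name to a pair (IC50 on COX-1, IC50 on COX-2),
    both given in the same concentration unit.
    """
    scored = [(name, math.log10(selectivity_index(*ic50s)))
              for name, ic50s in compounds.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented values in micromolar, for illustration only.
hypothetical = {"compound A": (50.0, 0.5), "compound B": (2.0, 4.0)}
for name, log_si in rank_by_cox2_selectivity(hypothetical):
    print(f"{name}: log10 selectivity = {log_si:+.2f}")
```

The logarithmic scale is used because selectivity ratios span several orders of magnitude. The ratio also depends on the assay, so values from different test systems should not be compared directly.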
The first compounds to come to the market were celecoxib 27.101, valdecoxib
27.102, and rofecoxib 27.103 (Fig. 27.40). Their indications for use ranged from
rheumatism, to osteoarthritis, to chronic polyarthritis, and ankylosing spondylitis
(Bechterew’s disease). All of these diseases are associated with severe pain.
Rofecoxib 27.103 (Vioxx®) quickly achieved sales in the billions. In 2004, however,
the drug was withdrawn from the market because significant side effects were
observed in patients undergoing long-term therapy. Specifically, an increased risk
of cardiovascular disease was observed, and especially the risk of heart attack,
unstable angina pectoris, and stroke increased. As a result, Merck & Co. experi-
enced a drop in profits in 2004 of 29%. As of March 2006, 10,000 claims for
damages had already accumulated. However, shortly after the withdrawal of
rofecoxib 27.103, Merck introduced a new COX-2 inhibitor, etoricoxib 27.104, to
the market.
[Figure labels: Ile359, Ser530, Ile523 (COX-1), Val523 (COX-2).]
Fig. 27.42 Structure of celecoxib 27.101 with COX-2. The inhibitor is shown with a green
surface (interior is blue). Position 523 is a valine in COX-2, but it is an isoleucine in COX-1. If the
Ile residue from COX-1 is superimposed on the valine from the COX-2 structure, the increased
spatial demand of the additional methyl group of the Ile is apparent (surface indicated by the light-
blue net). Ile demands a larger volume in the binding pocket and prevents the binding of the
branched-substituted, five-membered-ring inhibitors. The displayed structure is based on a crystal
structure that was determined with a bromine derivative of celecoxib.
Altogether this raises the question of whether the side effects that were seen with
rofecoxib 27.103 are typical of all COX-2 inhibitors. The cardiovascular risk must
be weighed against the risk of the gastric bleeding that can occur with
acetylsalicylic acid, diclofenac, ibuprofen, or indometacin. Rofecoxib belongs to
the first generation of COX-2 inhibitors, which all have a five-membered ring in the
center. In 2006 lumiracoxib 27.105 (Prexige®), a COX-2 inhibitor, was introduced
to the market. Structurally it is similar to diclofenac, which is less selective.
It remains to be seen if it shows a different side-effect profile. The example of the
coxibs impressively demonstrates how careful design can exploit even the smallest
difference of a methyl group, that is, an Ile → Val exchange between COX-1 and
COX-2, to lead to a new class of compounds and successful drugs.
27.10 Synopsis
as the nicotinamides NAD(P)+ or the flavine derivatives FMN and FAD and the
iron-containing protoporphyrin ring system in heme enzymes.
• The nicotinamide moiety in NAD(P)+ is an N-substituted pyridine derivative
that either accepts or releases a hydride ion in the 4-position. The cofactor binds
in many oxidoreductases to a conserved fold motif, the nucleotide-binding
Rossmann fold.
• Dihydrofolate reductase is involved in the biosynthesis of thymine. Inhibitors
competitive with the binding site of the natural substrate dihydrofolate have been
developed as potent chemotherapeutics in cancer therapy, or as bacteriostatics to
fight bacterial infections.
• Reduction of the cholesterol blood level is a strategy to fight coronary heart
disease and atherosclerosis as high excess of cholesterol is found in plaques
constricting and thus occluding blood vessels.
• HMG-CoA reductase is involved in the biosynthesis of precursors of cholesterol.
The substrate, composed of two acetate units, is reduced by using two equivalents
of NADPH. Inhibitors, the statins, which occupy the cofactor-binding site, were
first derived from natural compounds discovered by screening microorganisms.
Later fully synthetic derivatives were developed that evolved into the best-selling
drugs ever.
• Aldose reductase, an NADPH-dependent reductase lacking a Rossmann-folded
nucleotide-binding domain, is involved in the polyol pathway, along which
glucose is metabolized to sorbitol and subsequently to fructose. Overloading
this pathway results in increased production of polar compounds, which creates
osmotic stress and oxidative stress as a result of high reductase activity.
• Long-term consequences of poorly controlled blood glucose level in the case of
type-II diabetes preferentially affect cells that do not control their glucose uptake
by insulin. Inhibition of aldose reductase is a viable principle to reduce long-
term complications.
• Aldose reductase is able to reduce a broad scope of different aldehyde substrates.
This is achieved by a highly adaptive binding pocket, which also allows the
development of inhibitors showing largely deviating scaffolds and binding modes.
• Cortisol is transformed to cortisone and vice versa via two isoforms of 11β-hydroxysteroid
dehydrogenase, which is an NADPH-dependent reductase that
takes on a Rossmann fold. 11β-HSD1 inhibition has been suggested as
a promising therapy concept to treat metabolic syndrome.
• The cytochrome P450 enzymes are a superfamily of heme proteins that carry out
biochemical transformations as monooxygenases by introducing oxygen onto
a substrate being oxidized. They are particularly involved in the metabolism of
xenobiotics and a major part of the administered drug molecules are metabolized
in CYP 3A4, CYP 2D6, and CYP 2C9.
• The CYP enzymes are highly adaptive and accommodate substrates of signifi-
cantly different sizes. They can be inhibited by drug molecules, particularly
those containing heteroaromatic rings that coordinate the catalytic iron ion in the
heme center. Their expression can be induced and thus upregulated by xenobi-
otics activating, for instance, the PXR transcription factor.
• Because the equipment with cytochrome P450 enzymes varies with geno- and
phenotype, this polymorphism causes varying metabolic behavior between dif-
ferent individuals. Differentiation into slow, extensive, and fast metabolizers has
consequences for the prescription and required dose level of a given drug
metabolized by the involved CYPs.
• Because the activity of a metabolizing CYP enzyme can be further modulated
either through inhibition or induction by coadministered drugs or by xenobiotics
taken up in the diet, severe consequences with regard to the dose level present in
the body can result, and this can cause undesired and dangerous side effects or
unexpected failure of drug action.
• Monoamine oxidases MAOA and MAOB are FAD-dependent oxidases and
metabolize important neurotransmitters such as dopamine, adrenaline, or sero-
tonin. Inhibition of these enzymes can help in the therapy of depression,
Alzheimer’s, or Parkinson’s disease.
• Most of the current MAO inhibitors are activated by an initial redox step and
a covalent attachment is formed to the FAD cofactor via a highly reactive
intermediate; this leads to an irreversible chemical modification of its redox
properties.
• The membrane-associated cyclooxygenases COX-1 and COX-2 synthesize the
endoperoxide PGG2, which is a precursor to a large variety of prostaglandins,
from arachidonic acid. Prostaglandins are an important class of paracrine hor-
mones and also referred to as inflammatory mediators.
• COX contains a heme center, and PGG2 is synthesized through a cyclooxidation
step involving radical intermediates. In a subsequent peroxidation step involving
release and diffusion of the substrate to another reaction site, PGG2 is further
modified to PGH2.
• COX is inhibited by non-steroidal anti-inflammatory drugs such as
acetylsalicylic acid, ibuprofen, indometacin, or diclofenac. They bind to the
reaction channel and block access of the natural substrate arachidonic acid.
• Acetylsalicylic acid transfers its acetyl group irreversibly to a channel-exposed
hydroxyl group of Ser530 in a reaction similar to that in serine hydrolases. As
a consequence, in cells lacking a nucleus such as thrombocytes, prostaglandin
synthesis and its products such as thromboxane are permanently blocked for the
lifetime of the cell.
• Two isoforms of COX exist. COX-1 is ubiquitously expressed in all tissues
and constitutively present. Due to its multiple involvement in many physiolog-
ical processes overdosing of COX-1 inhibitors can exert severe side
effects. COX-2 is induced and found in endothelial cells of proliferating
blood vessels, inflamed tissue, sites of atherosclerotic damage, and in some
tumor cells. This makes selective COX-2 inhibition a prospective therapeutic
principle.
• COX-1 and COX-2 differ in the reaction channel by the crucial exchange of an
isoleucine for a valine residue. The additional volume created in COX-2 by the
absent methyl group gives rise to the development of size-extended furcated
28 Agonists and Antagonists of Nuclear Receptors
For all the joy of the structure-based design of enzyme inhibitors, it must not be
forgotten that less than half of the prescribed drugs act on enzymes. Many other
drugs have receptors, transporters, pores, or ion channels as target structures. Most
receptors mediate the information transfer from the exterior into the interior of the
cell. Either activating or blocking them changes the cell’s state. In this way they can
take on modulating tasks. Transporters, pores, and ion channels serve to transport
selected substances across the membrane, especially substances that are unable to
cross by passive diffusion because of their polar character. Like receptors, the
latter proteins are embedded in the cell membrane. Before we turn to this class of
membrane-bound targets, another class of receptors that is found in the cell’s
interior should be considered. Nuclear receptors are controlled by specific ligands.
An endogenous hormone must first penetrate the cell to achieve activation. This is
usually accomplished by passive diffusion through the membrane. The ligands must
therefore possess adequate lipophilic or amphiphilic properties or must be sub-
strates of transporters.
Nuclear receptors are soluble receptors that are found in the cytosol. As tran-
scription factors, they regulate the expression of specific genes in the cell nucleus
and are therefore responsible for the production of proteins. They bind directly to
DNA and take on an important role in gene regulation in embryonic development,
in cell growth, and in cell differentiation and specialization. Malfunctioning of
these receptors leads to diseases with uncontrolled cell growth (e.g., cancer),
metabolic disorders (diabetes or obesity), or reproductive disruption (infertility).
They are activated by hormones. These natural ligands, which include the steroid
hormones and also lipophilic ligands such as retinoic acid, diverse fatty acids,
triiodothyronine, vitamin D, prostaglandins, bile acids, and phospholipids, must
passively cross the cell membrane barrier (Fig. 28.1). Once they arrive at the site of
action, they bind to the ligand-binding domains of the nuclear receptors. From the
[Figure 28.1: chemical structures of 28.1 Estradiol; 28.2 Progesterone (R = COCH3); 28.3 Testosterone (R = OH); 3,5,3'-Triiodothyronine; 1,25-Dihydroxyvitamin D3; Prostaglandin D2]
Fig. 28.1 The natural ligands of nuclear receptors are steroids such as estradiol 28.1,
progesterone 28.2, and testosterone 28.3 as well as molecules such as retinoic acid, fatty acids,
triiodothyronine, vitamin D, or prostaglandins.
point of view of drug design, these receptors are interesting target structures
because the natural ligands correspond to the typical size of a drug molecule. In
2003, 34 of the 200 most often prescribed drugs acted on nuclear receptors. At first
glance, these target structures seem ideally suitable, but the biological control of
gene expression is, in contrast, very complex. The receptors not only have ligand-
dependent domains, but also ligand-independent domains to activate transcription.
As soon as the receptors migrate into the cell nucleus, the coactivators, corepressors,
and transcription factors contribute to the regulation of gene expression. An
upregulation as well as a downregulation can be achieved. They also seem to interact
with other signal transduction pathways that are controlled by, for instance, NF-κB
or activator protein AP-1. On the other hand, in view of molecular diversity, the
protein family of nuclear receptors seems to be straightforward. In our genome, there
are 48 genes that code for the different receptors.
28.2 The Structure of Nuclear Receptors
Nuclear receptors are all constructed according to the same blueprint. They contain
three domains. The N-terminal A/B region is the most variable in the family.
It contains the transactivation domain and is involved in the ligand-independent
recognition of cofactors and further transcription factors. Next comes the
DNA-binding domain, which contains about 70 amino acids and two so-called
zinc-finger motifs. This domain is the most conserved in the entire gene family.
The C-terminal domain ends with the ligand-binding region, which contains about
250 amino acids. It hosts the binding site of low-molecular-weight ligands and
contributes an additional regulatory element to the recognition of coactivators
and other transcription factors.
Nuclear receptors are divided into two groups. The first one comprises steroid
receptors, which form a homodimer to be activated. The second large group
contains receptors that form a heterodimer with the promiscuous retinoid-X
receptor (RXR) to function. There are further receptors that can bind to DNA as
monomers. The dimerization is achieved as a response to the binding of an agonist,
or the dimer formation is stabilized by the bound agonist. Some nuclear receptors
reside in the cytosol as inactive complexes with heat shock proteins. Ligand binding
stimulates the dissociation of these initially inactive complexes and triggers the
signal to migrate into the cell nucleus. There, the dimerized receptor binds with its
DNA-binding domain to a so-called DNA-response element, which resides on the
target gene in the promoter or repressor region. The newly formed complex serves
as a further docking site for coactivators. Their additional binding is translated into
an initiation signal for the start of transcription and subsequent gene expression.
Each DNA-binding domain recognizes a specific pattern of six bases in the
major groove of DNA by using a two-helix motif (▶ Sect. 14.9). This pattern is
located mirror-symmetrically on both complementary strand segments (Fig. 28.2)
in opposite directions. Two zinc fingers stabilize the two-helix motif. For this, the
zinc ion coordinates tetrahedrally to four neighboring cysteine residues, which
enables crosslinking within the protein strand.
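The mirror-symmetric arrangement of the two six-base patterns described above can be sketched in a few lines of Python. This is only an illustration: the example element (two hexamers separated by a three-base spacer) is an assumed, commonly cited consensus-type sequence and is not taken from this chapter, and the function names are hypothetical.

```python
# Minimal sketch: check that two half-sites of a response element are
# inverted repeats, i.e. the second half-site equals the reverse complement
# of the first. The example sequence below is an assumption for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(half_site_1: str, half_site_2: str) -> bool:
    """True if half_site_2 reads like half_site_1 on the opposite strand."""
    return reverse_complement(half_site_1) == half_site_2.upper()


# Assumed example element: hexamer, three-base spacer, hexamer.
element = "AGGTCAcagTGACCT"
left, right = element[:6], element[-6:]
print(reverse_complement(left))         # TGACCT
print(is_inverted_repeat(left, right))  # True
```

Each monomer of a receptor dimer can then read the same six-base pattern, one on each strand, which fits the picture of two DNA-binding domains sitting in opposite directions on the response element.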
The ligand-binding domains of the nuclear receptors also follow a common
construction principle. They are made up of 12 helices. The sequence at the end of
the 12th helix has a particular task. It opens and closes access to the ligand-binding
pocket like a door. In doing so, it undergoes a spatial rearrangement that gives
a signal for the activation of the receptor (Sect. 28.4).
The ligand-binding pockets in the nuclear receptors encompass about
400–600 Å³. They have polar amino acids on both ends and a belt of hydrophobic
residues in the center. The ligand-binding pockets are even larger in the receptors
that form heterodimers with the retinoid-X receptor (RXR). In the peroxisome
proliferator-activated receptors (PPARs), they can encompass up to 1,300 Å³.
Despite having a common architecture and rather broad variation in the volume
available for ligand accommodation, many ligand-binding domains are able to achieve
astonishing selectivity with respect to the recognition of their ligands. This
selectivity shall be illuminated in more detail in the following section.
700 28 Agonists and Antagonists of Nuclear Receptors
The male and female sexual hormones and the corticosteroids are substances with
stunningly similar structures. All are derived from an identical basic scaffold. On
a grand scale, Nature manages to invoke a broad spectrum of the most diverse
biological effects with minimal structural variation. In doing so, a mistake could
have fatal consequences. The difference between estradiol 28.1, progesterone 28.2,
and testosterone 28.3 shall be examined in greater detail. Estradiol carries a hydroxyl
group on its aromatic A-ring, the first ring of the steroid scaffold. In progesterone and
testosterone this ring is partially hydrogenated, and the hydroxyl group is replaced by
a carbonyl group. The aromatic A-ring of the female hormone estradiol adopts a planar
structure, but the ring in the male hormone, testosterone, forms a half-chair
(Fig. 28.3). Furthermore, a methyl group occurs at carbon atom 10 in the male
hormones and in progesterone. This 19-methyl group is missing in estradiol
because of the aromatic character of its first ring. The 19-methyl group
shields this first ring from above and makes a fairly large spatial demand in
progesterone and testosterone.
How is this small difference recognized by the receptor? As the crystal
structure of the estrogen receptor with bound estradiol shows, the hydroxyl
group on the aromatic A-ring is involved in a hydrogen-bonding network with
Glu353 and, via a water molecule, with Arg394 (Fig. 28.4). Glu353 is most
probably deprotonated and recognizes the hormone by the donor functionality of
its hydroxyl group. A glutamine is found in the same position in the structure
of the progesterone receptor (Fig. 28.5). It forms a hydrogen bond with the
carbonyl group in the A-ring of progesterone through the amino group of its
28.3 Steroid Hormones: How Small Differences Translate to the Receptor
[Figure 28.3: chemical structures of 28.1 Estradiol and 28.3 Testosterone]
Fig. 28.3 The difference between the female hormone estradiol 28.1 and the male hormone
testosterone 28.3 consists of a change from a hydroxyl group on the aromatic ring of estradiol to
a carbonyl group in a partially hydrogenated ring of testosterone. The aromatic A-ring of estradiol
takes on a planar structure, whereas the A-ring in testosterone forms a half-chair. A methyl group
occurs on carbon C10 in the male hormone that gives the molecule additional volume.
[Figure 28.4: labeled residues Leu384, Leu387, Glu353, His524, Arg394, and a water molecule (H2O)]
Fig. 28.4 Portion of the crystal structure of the estrogen receptor with bound estradiol (surface is
green, interior is blue). The hydroxyl group of the aromatic A-ring forms an H-bond to Glu353 and
a water-mediated bond to Arg394. The volume above the planar, aromatic A-ring is limited by
Leu384 and Leu387.
[Figure 28.5: labeled residues Met756, Met759, Gln725, Thr894, Arg766, and a water molecule (H2O)]
Fig. 28.5 Portion of the crystal structure of the progesterone receptor with bound progesterone
(surface is green, interior is blue). The carbonyl group on the partially hydrogenated A-ring
accepts an H-bond from Gln725 and binds to Arg766 through a water molecule. Because of the
19-methyl group above the A-ring, the steroid occupies a larger volume that is limited by Met756
and Met759, which have more flexible and therefore better adaptive side chains.
[Figure 28.6: panels a–d; labeled residue Asp351]
Fig. 28.6 The ligand-binding domain of the nuclear receptors is constructed from 12 helices.
Upon binding an agonist such as estradiol 28.1, the 12th and last helix (blue) closes like a gate over
the entrance to the ligand-binding pocket (a, c). Asp351 orients on the tip of the helix and stabilizes
it in the active position. At the same time, the recognition site is opened for the coactivator with the
helical LxxLL motif (violet) to bind to the receptor. Upon binding an antagonist such as raloxifene
28.5, helix 12 cannot close up the entrance channel (b, d). The terminal basic group of the
antagonists forms a hydrogen bond to Asp351.
28.4 Helix Open, Helix Closed: How Agonists and Antagonists Are Differentiated
[Figure residue, presumably Fig. 28.7 (caption not preserved): chemical structures of 28.5 Raloxifene and 28.6 4-Hydroxytamoxifen]
[Figure 28.8: labeled elements Helix 12, Glu448, and the LxxLL peptide]
Fig. 28.8 The recognition site of the LxxLL motif on the surface of the coactivator in this crystal
structure is reflected in the 11-membered peptide with the estradiol-bound receptor. The peptide
takes on a helical geometry and orients its three leucine residues in the hydrophobic groove on the
surface. Three amino acids of helix 12 (blue) form a part of this surface. Glu488 on helix 12 binds
to the LxxLL motif at the tip of the N-terminal end of the helix.
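The LxxLL recognition motif mentioned in the caption (leucine, two arbitrary residues, two leucines) is simple enough to search for in a peptide sequence. The following is a minimal sketch; the example peptide is hypothetical and serves only as an illustration, and `find_lxxll` is an assumed helper name, not an established tool.

```python
import re

# Lookahead pattern so that overlapping LxxLL occurrences are all reported.
LXXLL = re.compile(r"(?=(L..LL))")


def find_lxxll(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each LxxLL motif."""
    return [(m.start() + 1, m.group(1)) for m in LXXLL.finditer(sequence.upper())]


# Hypothetical peptide in one-letter code, used for illustration only.
peptide = "KHKILHRLLQDSS"
print(find_lxxll(peptide))  # [(5, 'LHRLL')]
```

A sequence match of this kind says nothing about whether the segment is actually helical or accessible; as the crystal structure shows, the three leucines must point into the hydrophobic groove on the receptor surface for binding to occur.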
Steroid hormones are produced in endocrine adenocytes, for example, in the adrenal
glands, the testes, or in the ovaries, and are released into the bloodstream. There they
circulate, often bound to a transport protein. Far from their site of
production, they reach the target cells for which the signal is meant. Because of their
lipophilic character, they can passively permeate through membranes. Once in the
cytosol, they bind to the corresponding steroid receptor. Five classes of steroid
receptors are differentiated: glucocorticoid, mineralocorticoid, androgen, estrogen,
and progesterone receptors. Two subtypes of estrogen receptor (α-ER and β-ER)
have been discovered that differ in the exchange of a leucine for a methionine, and
a methionine for an isoleucine in the vicinity of the binding site of the C- and D-ring
of the steroid scaffold. The binding affinity of the hormones for their receptors is extremely high,
typically 0.05–50 nM. As a result of the binding, the gene expression that was
described in the previous sections is initiated. The cellular response to these
processes occurs within hours to days. In addition to this control process, which
has direct gene expression as a goal, steroid hormones can also initiate fast regula-
tory processes in cells. For this, binding occurs to receptors on the cell exterior.
These receptors, which belong to the class of G protein-coupled receptors or to the
dimerizing receptors with a tyrosine kinase domain, are discussed in ▶ Chap. 29,
“Agonists and Antagonists of Membrane-Bound Receptors.”
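To get a feeling for what an affinity of 0.05–50 nM means energetically, the values can be converted into a standard free energy of binding with ΔG° = RT ln Kd. The sketch below assumes that the quoted numbers are dissociation constants, that the temperature is 298.15 K, and that the standard state is 1 M; none of these assumptions is stated explicitly in the text, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard free energy of binding in kJ/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


# Assumed interpretation: the 0.05-50 nM range quoted above as Kd values.
for kd_nm in (0.05, 50.0):
    dg = binding_free_energy_kj(kd_nm * 1e-9)
    print(f"Kd = {kd_nm:g} nM -> dG = {dg:.1f} kJ/mol")
```

Under these assumptions the range corresponds to roughly -59 to -42 kJ/mol, so the three orders of magnitude in affinity span about 17 kJ/mol.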
As an example, the function of the estrogen receptor shall be examined in more
detail. Estrogen controls the menstrual cycle of women in childbearing years. In
addition to this function, estrogen reduces the risk of coronary heart disease and
supports the maintenance of bone density. After menopause, at an age of about
50 years, the ovaries stop producing estrogen so that women at this age are at an
increased risk of coronary heart disease and osteoporosis. Altogether the hormone
homeostasis of the organism must find a new equilibrium. Often this is accompa-
nied by unpleasant physical and psychological symptoms in menopause. Hormone
replacement therapy was proposed in the 1960s as a solution. The body is supplied
with estradiol 28.1 or an analogous receptor agonist. For example, diethylstilbestrol
28.4, which is related except that it lacks a steroid scaffold, was once used, but is no
longer prescribed because of an elevated risk of cancer.
The long-term use of hormone replacement therapy increases the risk of breast
cancer significantly. This devastating result was proven in a study in the USA in
which a million nurses took part. The relationship between ovarian function and the
development of breast cancer had already been described over a hundred years ago.
In 1936 Antoine Lacassagne speculated that estrogen antagonists could
prevent breast cancer. The first antagonists were
once again discovered purely by accident. Compound 28.7 was synthesized at Merrell in the
USA in the late 1950s as part of a cardiovascular research program (Fig. 28.9).
Because of its chemical similarity to 28.8, a then-known synthetic estrogen surro-
gate, it was also examined in an estrogen-activity test. This effect was not seen, but
rather the opposite: antiestrogen activity. Clomiphene 28.9 was obtained by minor
structural modification. This compound was introduced to the market in the 1960s
28.5 Agonists and Antagonists of Steroid Hormone Receptors
[Figure 28.9: chemical structures, including 28.9 Clomiphene, 28.10 Nafoxidine, and 28.11 Fulvestrant]
Fig. 28.9 Tamoxifen 28.6 was developed from compound 28.7, which originated in cardiovas-
cular research. The marketed product has a hydrogen atom in the 4-position, but the actual active
substance is the oxidation product, the 4-hydroxy derivative 28.6. Fulvestrant 28.11 does not show
the same resistance that has been observed with tamoxifen.
as an ovulation inducer to treat infertility in women. With this, the goal of introduc-
ing a drug to prevent breast cancer initially missed the mark. The development of
nafoxidine 28.10 was also discontinued because of pronounced side effects.
In England, ICI had been pursuing a program for the development of non-steroidal
estrogen replacements for breast cancer therapy since 1940. Because the interest in
contraceptives was in the foreground in the 1970s, it must be seen as a stroke of luck
that tamoxifen 28.6 emerged from this program in 1973 and obtained approval for
the treatment of breast cancer. The compound quickly proved to be a breakthrough
in the treatment of breast cancer. Today it is estimated that the use of tamoxifen in
the industrialized countries has saved one million years of women’s lives each year.
It was only discovered in retrospect that tamoxifen is a prodrug. The actual
active substance is obtained by hydroxylation at the 4-position. Tamoxifen became the
starting point for further development, from which raloxifene 28.5, among others, emerged
(Fig. 28.7). All derivatives with antagonistic effects carry a side chain with
a basic group. As explained in Sect. 28.4, this side chain blocks the refolding of
helix 12 into the active position. The example of raloxifene also shows how complex
effects on the total organism can be. Originally raloxifene was developed for breast
cancer therapy. However, this goal was abandoned in the late 1980s because the
compound displayed no advantages over tamoxifen. It proved, however, to be
[Figure 28.10: chemical structures of 28.12 Ethinylestradiol, 28.13 Mifepristone (RU486), and 28.14 Cyproterone Acetate]
Fig. 28.10 The introduction of a 17α-ethinyl group leads to orally active steroids, for example,
ethinylestradiol 28.12. The progesterone receptor antagonist mifepristone 28.13 acts as an
antigestagen: a “morning after pill.” The antiandrogen cyproterone acetate 28.14 has gained
importance in the specific therapy of prostate cancer.
a potent drug for the treatment and prevention of osteoporosis. Moreover, it lowered
the risk of breast cancer. Raloxifene is considered to be a selective estrogen receptor
modulator (SERM). Compounds with such a profile are believed to have great
potential as a hormone-replacement therapy without increasing the risk of osteo-
porosis, coronary heart disease, or breast cancer.
Often the entire profile of a compound is only apparent after long-term use.
Tamoxifen afforded the unsettling result that 50% of breast tumors began to grow
again under long-term therapy. The development of resistance is explained by the
fact that the estrogen receptor is phosphorylated by protein kinase A. This does not
prevent tamoxifen from binding, but the antagonistic effect is reversed. Fulvestrant
28.11 seems to provide a solution to this problem because resistance has not yet
been observed (Fig. 28.9).
The progesterone receptor is closely related to the estrogen receptor. Whereas
estrogen 28.1 (follicle hormone) promotes and steers oocyte maturation in the
proliferation phase and indirectly initiates ovulation, progesterone 28.2 (corpus
luteum hormone) is formed in the secretory phase of the menstrual cycle. It controls
the cyclic changes in the uterus and uterine mucous membranes, decreases the
fertility, and maintains an already-intact pregnancy. Gestagens, progesterone recep-
tor agonists, as well as estrogen derivatives, were introduced as contraceptive
hormones (Fig. 28.10). In the 1950s, Carl Djerassi and Gregory Pincus had already
laid the foundations for oral contraceptives. The method is based on the timed administration
of a combination of an estrogen with a gestagen; this suppresses ovulation, that
is, the release of a mature egg cell, at mid-cycle. A progesterone antagonist,
mifepristone 28.13 (RU486), which, in analogy to the estrogen antagonists, carries
a nitrogen function on its side chain, was discovered at Roussel Uclaf during
a search for glucocorticoid receptor antagonists. Its use as a “morning after pill”
due to its antigestagen effects is highly controversial in many countries. For the
termination of an intact pregnancy, the single administration of 600 mg of mifep-
ristone, and then 36–48 h later a prostaglandin to induce uterine contractions, is
used. This combination leads to a termination of pregnancy in 96% of cases up until
the 7th week of pregnancy. Persistent bleeding can occur as a side effect, and in rare
cases, heart function disorders. The opponents to this substance can be consoled
that, for these reasons alone, it is not appropriate for widespread use.
The male hormone testosterone 28.3 acts as an agonist on the androgen receptor.
It is responsible for the development of secondary male characteristics, intervenes
in the process of spermatogenesis, and regulates protein synthesis. The
enlargement of skeletal muscle cells by androgens has led to their use as
anabolic hormones to improve performance in competitive sports, bodybuilding,
or in livestock breeding. Antiandrogens such as cyproterone acetate 28.14 are
suitable for the treatment of prostate cancer.
In addition to the sexual hormones, there are even more active substances from
the class of steroids. In addition to the cardiac glycosides, which occur in plants, the
adrenal corticosteroids or corticoids are of great importance. If the adrenal glands
fail, the absence of these substances can lead to death, or in the case of an adrenal
under- or overfunction, severe illness can occur. They are distinguished by their
binding to the respective nuclear receptors into glucocorticoids and mineralocorti-
coids. The basic scaffold is very closely related to progesterone 28.2, though they
carry more functional groups (28.15–28.17; Fig. 28.11). The natural agonists of
both receptors are cortisol 28.16 and aldosterone 28.17. The therapeutic importance
of glucocorticoids was underestimated in the beginning. It was only after specific
drugs without mineralocorticoid side effects, such as dexamethasone 28.18
and betamethasone 28.19, were available that broad therapeutic application was
possible. Glucocorticoids influence metabolism, intervene in water and electrolyte
homeostasis, and influence the cardiovascular and nervous system. They are
anti-inflammatory, immunosuppressive, and antiallergic. Highly active variants are used in
emergency cases of anaphylactic shock or sepsis. They also have severe side effects.
Their use requires that strict attention is paid to the indication and dosage.
The mineralocorticoids influence the water and electrolyte homeostasis. They
increase the resorption of sodium ions in the kidney, and increase the excretion of
potassium. Ligands for the mineralocorticoid receptor can be used as diuretics.
A potassium-sparing diuresis can be achieved with the structurally related
spironolactone 28.20, which competitively displaces aldosterone from its receptor.
The selective antagonist eplerenone 28.21 is used for the
treatment of hypertension and congestive heart failure.
[Figure 28.11: chemical structures of 28.15 Corticosterone (R = H); 28.16 Cortisol (R = OH); 28.17 Aldosterone; 28.18 Dexamethasone (R = α-CH3); 28.19 Betamethasone (R = β-CH3); further structures are named in the caption]
Fig. 28.11 Corticosterone 28.15 and cortisol (hydrocortisone) 28.16 are glucocorticosteroids.
They regulate the release of glucose, both by stimulating gluconeogenesis, and by inhibiting its
metabolic degradation. A stress-induced release of cortisol leads to rapid release of glucose as an
energy source. The mineralocorticoid aldosterone 28.17 is responsible for the regulation of the
water and electrolyte homeostasis. The naturally occurring glucocorticoids act in an anti-
inflammatory manner, but they have mineralocorticoid side effects. Dexamethasone 28.18 and
betamethasone 28.19 are “pure” glucocorticoids. They have 30 times stronger anti-inflammatory
activity and the mineralocorticoid side effects of cortisol are absent. The diuretic spironolactone
28.20 achieves its effect by a competitive displacement of aldosterone from its receptor.
Eplerenone 28.21 is a mineralocorticoid receptor antagonist and is used for the therapy of
hypertension and congestive heart failure.
Its activation increases the fatty acid degradation in this organ. Artificial ligands
of this receptor type are lipid-lowering compounds from the group of fibrates
28.22–28.26 (Fig. 28.12). A crystal structure with the bound agonist 28.27 and
antagonist 28.28 was determined with the PPARα receptor (Fig. 28.13). As in the
case of the estrogen receptor, it is again helix 12 that orients over the entrance gate
of the ligand upon agonist binding. The terminal acid group of the agonist forms
a hydrogen bond to Tyr464 and stabilizes helix 12 in the active position. Antagonist
28.28 is elongated by a propionamide group. It blocks the refolding of helix 12 into
the active position. In the unfolded geometry it is accommodated in another region of
the receptor surface.
28.6 Ligands of PPAR Receptors
[Figure residue, presumably Fig. 28.12 (caption not preserved): chemical structures of 28.24 Etofyllin clofibrate, 28.25 Fenofibrate, and 28.26 Bezafibrate]
[Figure residue, presumably Fig. 28.13 (caption not preserved): labeled elements LxxLL, Glu462, Tyr464]
[Figure residue (caption not preserved): chemical structures of 28.31 Ciglitazone, 28.32 Pioglitazone, and 28.33 Rosiglitazone]
The crystal structure with the bound ligand showed, however, that the S enantiomer
is bound by the receptor. It was also possible to cocrystallize a peptide with the
LxxLL recognition motif with this structure. Once again, helix 12 makes the
binding pocket available in the active position and stabilizes the helical segment
of the LxxLL recognition peptide by positioning Glu471.
Recently, the insulin sensitizers of the glitazone type have come under discussion due
to unexpected side effects. Rosiglitazone was withdrawn from the European
market in 2010 due to risks of heart failure. For pioglitazone this risk is unknown;
however, a risk of bladder cancer has been indicated, and therefore this
compound was also withdrawn from the market in some countries in 2011.
Nevertheless, PPARs also represent a possible target structure for cancer therapy.
Prostacyclin (▶ Sect. 27.9) is the natural ligand for a receptor that was initially termed
PPARδ, but later proved to be closely related to PPARβ. Its expression is regulated
by a variety of oncogenic signaling pathways. The receptor is often overexpressed
in tumor cells. Therefore antagonists of this receptor could represent a new concept
for the development of antitumor drugs.
[Figure 28.15: chemical structures of the PXR ligands named in the caption, including 28.39 Troglitazone]
Fig. 28.15 Upon binding an activator, the pregnane-X receptor induces the expression of
cytochrome P450s from the CYP 3A family, which metabolize numerous drugs. Small ligands
such as phenobarbital 28.34 and the cholesterol-lowering SR12813 28.35 as well as large natural
products such as paclitaxel 28.36, hyperforin 28.38, or the macrolide rifampicin 28.37 activate
PXR. The insulin sensitizer troglitazone was withdrawn from the market because of its activity on
the PX receptor. Small changes such as the exchange of a phenyl group for a tert-butyl group on
paclitaxel can be enough to suppress this activating property.
Fig. 28.16 Schematic representation of the polypeptide chain in the estrogen receptor (a) and the
pregnane-X receptor (b–d). An insertion of 45 amino acids occurs in PXR that renders the lower
right structural portion extremely adaptive. Because of this, the receptor can bind ligands of very
different size: (b) crystal structure with bound SR12813 28.35, (c) crystal structure with bound
hyperforin 28.38, (d) crystal structure with bound rifampicin 28.37. For comparison, the estrogen
receptor bound to estradiol is shown (a).
28.8 Synopsis
• The nuclear receptors are a family of 48 members that are present as soluble
proteins in the cytosol. They are transcription factors and play an important role
in gene regulation. They form either homo- or heterodimers and are activated by
small molecules such as steroid hormones, retinoic acid, fatty acids, triiodothy-
ronine, vitamin D, prostaglandins, bile acids, or phospholipids.
• Nuclear receptors exhibit a ligand-binding and a DNA-binding domain; however,
ligand-independent domains are also involved in the activation of transcription. The
activated receptor, stimulated through agonist binding, migrates to the cell
nucleus and recruits co-activators, co-repressors, and transcription factors to
regulate gene expression.
• The ligand-binding domains can exhibit impressive selectivity in the recognition
of their ligands. Steroid receptors can distinguish the tiny structural differences
between male and female hormones in terms of H-bond donor/acceptor func-
tional group exchanges and presence or absence of the 19-methyl group.
• Agonist and antagonist binding induce a different orientation of helix 12, which
closes the entrance to the ligand-binding site. Antagonist binding hampers
reorientation of this helix across the entrance gate and simultaneously blocks
the recognition site for the binding of a helical LxxLL motif found on the surface
of the coactivator.
• Agonists and antagonists of the steroid receptors are important drugs to interfere
with the menstrual cycle as contraceptives, act in anticancer therapy, show anti-
inflammatory, immunosuppressive, or antiallergic activity on the glucocorticoid
receptor, or act as diuretics or antihypertensive agents on the mineralocorticoid
receptor.
• PPAR receptors occur as several subtypes and form heterodimers with the retinoid-X
receptor (RXR) upon activation. Agonists of the PPARγ receptor can induce an increase
in glucose metabolism. They are used as insulin sensitizers in diabetes therapy.
• The transcription and expression of cytochrome P450 enzymes involved in
the metabolism of xenobiotics can be regulated by the nuclear receptors
Table 29.1 Particularly many subtypes are known for the serotonin receptors with therapeutic
possibilities to treat hypertension, migraine, schizophrenia, depression, anxiety, emesis, and
gastrointestinal motility disorders

Receptor  Gene                                Type; therapeutic indication                                                          Modulated enzyme
5-HT1A    5-ht1A                              GPCR, Gi; CNS diseases such as anxiety and depression                                 Adenylate cyclase
5-HT1B    5-ht1B                              GPCR, Gi; neuronal inflammatory processes, migraine
5-HT1D    5-ht1Dα (h), 5-ht1Dβ ≙ 5-ht1B (R)   GPCR, Gi; neuronal inflammatory processes, migraine
5-HT1E    5-ht1E                              GPCR, Gi; neuronal inflammatory processes, migraine
5-HT1F    5-ht1F                              GPCR, Gi; neuronal inflammatory processes, migraine
5-HT2A    5-ht2A                              GPCR, Gs; CNS disease, atypical antipsychotic, wound healing, arterial hypertension   Phospholipase C
5-HT2B    5-ht2B                              GPCR, Gs; CNS disease, atypical antipsychotic, wound healing, arterial hypertension
5-HT2C    5-ht2C                              GPCR, Gs; CNS disease, atypical antipsychotic, wound healing, arterial hypertension
5-HT3     5-ht3                               Ion channel, suppression of cytostatic-induced emesis                                 –
5-HT4     5-ht4                               GPCR, Gs; gastrointestinal tract, irritable bowel syndrome                            Adenylate cyclase
5-HT5     5-ht5A, 5-ht5B                      GPCR, ?; circadian rhythm                                                             ?
5-HT6     5-ht6                               GPCR, Gs; involved in memory and learning                                             Adenylate cyclase
5-HT7     5-ht7                               GPCR, Gs; regulation of the day/night rhythm                                          Adenylate cyclase

HT, ht: 5-hydroxytryptamine (serotonin); R: rat; h: human; GPCR: G protein–coupled receptor; Gs, Gi:
stimulatory or inhibitory G protein
Fig. 29.1 Schematic representation of the spatial orientation of the transmembrane helices in
bacterial rhodopsin (a), bovine rhodopsin (b), and in the human β2-adrenergic receptor (c).
29.3 Structure of the Human β2-Adrenergic Receptor
The question of what the structure of a human GPCR really looks like became all the more
exciting. In 2007 an answer became possible. Two structures of the human β2-adrenergic
GPCR were described under the auspices of Brian Kobilka at Stanford University
with the collaboration of groups at Scripps Research Institute in La Jolla, California
and MRC in Cambridge, England. For this tremendous achievement Brian Kobilka
[Figure 29.2: labeled residues Phe193, Val114, Asp113, Ser203, Asn312, Phe290, Phe289]
Fig. 29.2 Segment of the crystal structure of the human β2-adrenergic receptor with bound
carazolol 29.1, a partial inverse agonist.
was awarded the Nobel Prize in 2012 together with Robert Lefkowitz, who first
characterized and isolated the β-adrenergic receptors in the 1970s. The receptor’s
high flexibility and proteolytic instability represented the biggest problems in its
crystallization. The third intracellular loop proved to be particularly troublesome.
The researchers had to use a trick to overcome this. It was only after a specific
antibody was found that binds to this loop and stabilizes the receptor in its native
functional structure that the crystallization was successful. In a second strategy, the
critical loop was excised from the receptor and replaced by T4 lysozyme, a well-
known and easily crystallized protein. The newly formed fusion protein displayed
largely unchanged pharmacological properties. Both structures were crystallized
together with a potent partial inverse agonist, carazolol 29.1 (Figs. 29.2, 29.3). They
differ only slightly from each other in the transmembrane region. However, it is just
in this area that the difference to the previously determined structure of bovine
rhodopsin is almost three times as large. This underscores the structural differences
to the rhodopsin receptor (Fig. 29.1). Because carazolol only blocks about 50% of
the receptor’s basal activity, it is referred to as a partial inverse agonist.
Carazolol binds to Asp113^3.32 and Asn312^7.39 (the superscript numbers indicate
the helix and the position at which the amino acid is found, respectively) with its
alkylamino and alcohol functions. From mutational studies it was known that the
exchange of Asp113 for an asparagine leads to the loss of antagonist binding.
It hampers the G-protein activation by agonists by four orders of magnitude.
Fig. 29.3 Ligands of the human β2-adrenergic receptor. Carazolol 29.1, pindolol 29.2,
propranolol 29.3, and betaxolol 29.4 are β-blockers, whereas isoprenaline 29.5 and
adrenaline 29.6 are receptor agonists.
The mutation of Asn312 to a non-polar amino acid such as alanine or phenylalanine
causes the receptor function to collapse, whereas the function is partially
preserved by replacement with an amino acid with a polar side chain (threonine or
glutamine). Carazolol’s heteroaromatic tricyclic moiety forms a hydrogen bond to
Ser203^5.42 with its NH group. This residue had also been recognized as critical in
a mutagenesis study using catecholamine agonists. For β-blockers of the
aryloxyaminopropanol class with nitrogen-containing heterocycles, such as pindolol 29.2,
it was known that exchanging this serine for another residue causes the
affinity of these compounds for the β-adrenergic receptor to drop substantially.
Carazolol is surrounded by numerous contacts to hydrophobic amino acids
(Val114^3.33, Phe290^6.52, Phe193^5.32). This explains why all β-blockers display an
aromatic moiety in this region (Fig. 29.3).
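The residue labels used above combine a sequence position with a generic, helix-based index (helix number first, then the position within that helix). As a minimal sketch, and not something taken from the structure papers, such labels can be split into their parts with a small Python helper. The name `parse_label` and the accepted spellings (a caret before the index, or the index run directly onto the residue number, as it appears in plain-text copies) are assumptions made for illustration.

```python
import re
from typing import NamedTuple


class ResidueLabel(NamedTuple):
    amino_acid: str   # three-letter code, e.g. "Asp"
    number: int       # position in the receptor sequence, e.g. 113
    helix: int        # transmembrane helix on which the residue sits
    position: int     # generic position within that helix


# Three-letter code, sequence number, optional caret, then "helix.position".
# Assumption: the helix is a single digit (1-8) and the position has two digits.
_LABEL = re.compile(r"^([A-Z][a-z]{2})(\d+)\^?([1-8])\.(\d{2})$")


def parse_label(label: str) -> ResidueLabel:
    """Split a label such as 'Asp113^3.32' (or the fused form 'Asp1133.32')."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a recognised residue label: {label!r}")
    aa, number, helix, position = match.groups()
    return ResidueLabel(aa, int(number), int(helix), int(position))


if __name__ == "__main__":
    for text in ("Asp113^3.32", "Asn3127.39", "Ser203^5.42"):
        print(parse_label(text))
```

Because the generic index always ends in "digit, dot, two digits", the fused spelling can be split unambiguously; the same index on two different receptors then points to structurally equivalent positions, which is what makes comparisons across GPCRs convenient.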
Many β-blockers display poor selectivity for the subtypes of the β-adrenergic
receptors (β-AR). Nonetheless, such selectivity is exceedingly desirable because,
for example, the β1 receptor is found in the cardiac vasculature and the β2 subtype is
found in the bronchi. Efficient β1 receptor inhibition reduces the contractility and
frequency of the heart. At the same time though, bronchoconstriction caused by blocking
the β2 receptors is undesirable. Interestingly, all amino acids that surround
carazolol in the binding site of the β2 receptor are conserved in the β1 receptor. The
observed 94 exchanges between the β1 and β2 receptors are all in the loop regions.
Therefore it is assumed that the pharmacological differences that are exploited in
selective ligands such as betaxolol 29.4 are found in the entrance region to the
binding site and cause small changes in the helix packing.
The concept of cutting out unstable loops and exchanging them for T4 lysozyme
proved to be extremely successful. In the meantime, the research group of Ray
Stevens at the Scripps Institute in La Jolla, California has managed to elucidate
the structures of the adenosine A2A receptor, the dopamine D3 receptor, and the
CXCR4 chemokine receptor. All have the same overall architecture, but the shape
and position of the binding pockets differ significantly. The CXCR4 receptor is not
regulated by a small ligand, but rather by a protein. A structure determination was
carried out with two antagonists, initially with the low-molecular-weight ligand IT1t,
then with the cyclic 15-residue peptide CVX15. In spite of the binding pocket of
this receptor being much larger, it was shown that the receptor can be controlled by
peptidic macromolecules as well as small ligands. The CXCR4 receptor represents
a possible target structure in cancer therapy as well as for HIV infection. In the
latter case, it serves as a co-receptor that gives the virus initial access to T cells.
It binds to the viral glycoprotein gp120 (▶ Sect. 31.4).
In the first structures of the β-adrenergic receptor using the T4-lysozyme-fused
receptor, the protein was in an inactive state. If the partial inverse agonist carazolol
is compared with structurally smaller agonists such as isoprenaline 29.5, it is
obvious that the two hydroxyl groups of the catechol moiety form hydrogen
bonds with Ser204^5.43 and Ser207^5.46. Moreover Asn293^6.55 and Tyr308^7.35 have
been described as being critical for agonist binding. These residues are too far apart
in the carazolol structure to efficiently interact with agonists such as isoprenaline.
This confirms the assumption that the receptor must undergo a conformational
change to successfully accommodate an agonist.
The structure of the β1 receptor with an antagonist, cyanopindolol 29.6
(Fig. 29.4), was resolved in the research group of Gebhard Schertler, initially in
Cambridge, England, and later at the Paul Scherrer Institute in Zurich, Switzerland.
The scientists used a thermostable receptor variant from turkey with six exchanged
amino acids. Its overall structure is not different from the β2 receptor, but the
stability of the inactive form is increased. Overall, the β1 receptor has less intrinsic
(or basal) activity than the β2 receptor. This increased intrinsic activity of the β2-AR
is physiologically important. The T264I mutant of β2-AR, which occurs as a human
polymorphism and displays reduced intrinsic activity compared to β1-AR, is associated
with heart disease.
In addition to the cyanopindolol antagonist, a crystal structure was also deter-
mined for a complex with the full agonist isoprenaline 29.5. As expected, binding of
this structurally small agonist leads to a contraction of the binding pocket; helices
H5 and H7 move toward one another (Fig. 29.4). Agonists as well as antagonists use
their aminopropanol group to form two hydrogen bonds to Asn327 on H7. The
larger antagonist, cyanopindolol 29.6, uses its indole NH function for an interaction
with the side chain of Ser211 on H5. The catecholamine isoprenaline 29.5, on the
other hand, employs its two aromatic OH groups for hydrogen bonds to Ser211 and
Ser215 to pull H5 to the agonist-bound conformation. The relative shift of these two
helices against one another seems to change their mutual interaction areas and
contribute to the transition from the inactive to the active state. The mentioned
polymorphism, which leads to a reduction in the intrinsic activity of β2-AR, also
exerts an influence on the contact between H5 and H7 in this receptor.
As biophysical investigations have demonstrated, the activation of the
β-adrenergic receptor occurs analogously to that of rhodopsin. In the case of the
light-dependent receptor, the activation is triggered by a cis–trans isomerization of
covalently bound retinal. The retinal-binding site is spatially located in the same region as
the ligand-binding site in other GPCRs. A comparison of the structures of a stabilized
Glu113Gln mutant of rhodopsin in the active and inactive states has allowed detailed
glimpses into the activation mechanism. After photoactivation, the receptor binds one
retinal molecule in the all-trans configuration (Fig. 29.5). At the same time, the
activated receptor was characterized in complex with an 11-residue peptide, which
corresponds to the interacting epitope from the α-subunit of the G protein. The
β-ionone ring of retinal is shifted by 4.3 Å in the direction of a gap between helices
H5 and H6 in activated rhodopsin. In doing so, Trp265^6.48 is also moved from its initial
position in the ground state. This transition requires a global restructuring of the
orientation of the helices, and a water-mediated interaction network between H6
and H7 is disrupted and rearranged. The cytosolic end of helix 6 moves by twisting
away from the center of the helix bundle H1–H4 and H7. The side chains of
Tyr223^5.58 and Tyr306^7.53 orient toward the interior of the receptor, form new contacts to
the water network, and interact with the highly conserved E(DRY) motif
at the cytosolic end of helix 3. The salt bridges between the side chains of Glu134^3.49,
Arg135^3.50, and Glu247^6.30, which form the so-called ionic lock, open and allow
access to the binding site for the peptide epitope of the α-subunit of the G protein.
Fig. 29.5 Comparison of the crystal structures of inactive (green) and active rhodopsin. The
photoactivation is triggered by retinal, when its cyclohexene ring shifts because of an
isomerization of the 11-double bond from a cis to a trans configuration. This movement is
translated to Trp265 and from there through a cascade of water-mediated H-bonds all the way
to the “ionic lock,” which is made up of Glu134, Arg135, and Glu247. The salt bridges are
broken, and the binding site for the binding epitope of the α-subunit of the G protein is
established. Labels in the figure: Trp265^6.48, Tyr301^7.48, Arg135^3.50, Glu134^3.49,
Glu247^6.30, the ionic lock, and the Gα peptide.
Upon ligand binding on the extracellular side, pharmacologically relevant
GPCRs presumably undergo very similar spatial shifting of the helical ends at the
contact surface in the cell interior. The activation process is presumably a multistep
cascade, during which multiple conformational states are passed through.
After all the new structures became available, it is interesting to compare
the initially constructed homology models of the β2-adrenergic receptor with the
29.4 Tracing Selective Dopamine D1 Agonists
More effort has been made in the search and optimization of potent ligands for
G protein–coupled receptors than in any other area in drug research. Thousands of
examples could be introduced here, but only two cases shall be discussed as
representative examples. The first case deals with the development of selective
agonists for a receptor that recognizes a small neurotransmitter as its natural ligand:
the dopamine receptor. An example of a receptor that is controlled by peptide
ligands will be discussed in Sects. 29.5 and 29.6.
Dopamine 29.7 (Fig. 29.6) is an important neurotransmitter that carries out
multiple functions in the body. A reduction in the dopamine concentration is
observed in particular brain regions in patients that suffer from Parkinson’s disease;
this is caused by the destruction of dopamine-producing cells. The disease can be
treated by the administration of L-DOPA. This compound is actively transported
across the blood–brain barrier as an amino acid and is transformed in the brain to
biologically active dopamine (▶ Sect. 9.4).
Compound   R        Ki D1 (nM)   Ki D2 (nM)
29.8       Phenyl   63           6300
29.9       H        10000        2500
Fig. 29.6 Dopamine 29.7 and the dopamine receptor ligands 29.8 and 29.9. Compound 29.8
binds selectively to the D1 receptor. A comparison of the binding affinities of 29.8 and 29.9 shows
that the introduction of a phenyl substituent is responsible for D1 selectivity.
Fig. 29.7 Comparison of two conformations of 29.8 with the phenyl substituent in the plane of
(left) or above the plane of (right) the seven-membered ring. The phenyl rings occupy different
regions of space. It can be assumed that only one of these two conformations is suitable for
binding to the receptor.
In this section, work carried out at Abbott on the search for new dopamine
agonists that selectively bind to the D1 receptor shall be discussed. The goal of the
work was to find a compound that could be used to treat Parkinson’s disease that
lacked the known side effects of L-DOPA. The investigations, which were carried
out between 1988 and 1991, proved that the use of computer-aided methods, even
without knowledge of the 3D structure of the protein, can deliver decisive contri-
butions to the discovery of a new lead structure.
Initially an attempt was made to obtain data about the receptor-bound confor-
mation of D1 agonists to use the information later for the targeted selection of new
structures. The starting point for this work was another company’s compound:
SKF 38393 29.8 (Fig. 29.6). The simpler derivative 29.9, which lacks the phenyl
substituent found in 29.8, was first synthesized at Abbott. Compound 29.9 binds
more than a hundred times less potently to the D1 receptor than 29.8. Interestingly,
the affinity to the D2 receptor remained almost unchanged. This aroused the
suspicion that the phenyl group binds in an additional pocket in the D1 receptor
that is absent in the D2 receptor. Because it was known that the hydroxyl groups and
the amino groups are important for receptor binding, the question as to how the
phenyl ring is positioned relative to these functional groups was raised.
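The comparison above comes down to simple ratios of the inhibition constants listed in Fig. 29.6. The following Python sketch works through that arithmetic. The helper names are illustrative, and the ratio Ki(D2)/Ki(D1) is used here as one common, simplified way to express selectivity; it is not necessarily the measure used by the original investigators.

```python
# Ki values in nM as listed in Fig. 29.6 (a smaller Ki means tighter binding).
KI_NM = {
    "29.8": {"D1": 63, "D2": 6300},     # carries the phenyl substituent
    "29.9": {"D1": 10000, "D2": 2500},  # lacks the phenyl substituent
}


def d1_selectivity(compound: str) -> float:
    """Ratio Ki(D2)/Ki(D1); values above 1 mean preferential D1 binding."""
    ki = KI_NM[compound]
    return ki["D2"] / ki["D1"]


def fold_change(receptor: str, reference: str, other: str) -> float:
    """Ratio Ki(other)/Ki(reference) at one receptor; above 1 means weaker binding."""
    return KI_NM[other][receptor] / KI_NM[reference][receptor]


if __name__ == "__main__":
    for name in KI_NM:
        print(f"{name}: D1 selectivity = {d1_selectivity(name):g}")
    print(f"D1, 29.9 vs 29.8: {fold_change('D1', '29.8', '29.9'):.0f}-fold")
    print(f"D2, 29.9 vs 29.8: {fold_change('D2', '29.8', '29.9'):.2f}-fold")
```

With the tabulated values, removing the phenyl group costs about two orders of magnitude in D1 affinity, whereas the D2 affinity changes by less than a factor of three.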
A conformational analysis showed that 29.8 can basically adopt two different,
energetically favorable conformations. In one, the phenyl substituent lies approx-
imately in the plane of the bicyclic ring; in the other, it is significantly above the
seven-membered ring (Fig. 29.7). To decide which of the two conformations is
adopted in the receptor, pairs of compounds were synthesized, each with a phenyl
substituent either above or in the plane of the ring. The corresponding unsubstituted
derivatives were also prepared. In doing so, rigid compounds were chosen that
correspond to only one of the two conformations. It was shown that the compound
with the phenyl substituents in the plane of the neighboring seven-membered ring
displayed potent dopamine D1 receptor binding. Obviously this is the biologically
active conformation.
In parallel to this work, an unselective new dopamine agonist 29.10 (Fig. 29.8)
was identified that binds equally potently to the D1 and D2 receptors. The previously
compiled criteria for potent D1 binding were then used to determine
the position where a phenyl substituent should be attached, analogous to 29.8. The
molecular comparison produced the suggestion for 29.11 (Fig. 29.8). This compound
was extremely successful! The binding affinity corresponds roughly to that
of 29.8, but 29.11 is D1 selective. The synthesis of 29.11, however, was not entirely
simple. Therefore additional D1 agonists were sought.
Compound   R        Ki D1 (nM)   Ki D2 (nM)
29.10      H        1600         5000
29.11      Phenyl   63           >100000
29.12      H        16000        >100000
Fig. 29.8 The pharmacophore hypothesis developed at Abbott for D1-selective agonists led to the
synthesis of the high-affinity, selective compound 29.14 via compounds 29.10–29.13.
The problem was addressed on the computer by a 3D database search. ALAD-
DIN, a program developed at Abbott for this purpose, was used. The 3D database
of all Abbott substances was searched for structures that could have dopaminergic
activity by using the known pharmacophore pattern of dopaminergic compounds.
The computer search produced, among others, compound 29.12. This compound
indeed binds to the dopamine receptor. By adding an additional phenyl group in the
correct position, compound 29.13 resulted, for which a strong increase in the
binding affinity was observed. This lead structure, found in a 3D database search,
was systematically modified. The result of the work was finally 29.14. Of all the
analogues that were known at the time, this compound represented the most
potently binding selective D1 agonist.
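The idea behind such a 3D database search can be illustrated with a deliberately simplified Python sketch. It is not ALADDIN's algorithm or query language: the feature names, distance windows, and coordinates below are invented for illustration, and each database entry is assumed to be a single rigid conformation.

```python
from math import dist

# Hypothetical query: allowed distance window (in Å) for each pair of features.
# Feature names and numbers are invented for illustration only.
QUERY = {
    ("hydroxyl", "amine"): (5.0, 7.5),
    ("hydroxyl", "phenyl"): (6.0, 9.0),
    ("amine", "phenyl"): (3.5, 6.0),
}

# Hypothetical database: one rigid conformation per entry, feature -> (x, y, z) in Å.
DATABASE = {
    "mol_A": {"hydroxyl": (0.0, 0.0, 0.0), "amine": (6.0, 0.0, 0.0), "phenyl": (6.0, 4.0, 0.0)},
    "mol_B": {"hydroxyl": (0.0, 0.0, 0.0), "amine": (3.0, 0.0, 0.0), "phenyl": (3.0, 4.0, 0.0)},
    "mol_C": {"hydroxyl": (0.0, 0.0, 0.0), "amine": (6.0, 0.0, 0.0)},
}


def matches(features, query=QUERY):
    """True if every queried feature pair is present and within its distance window."""
    for (first, second), (low, high) in query.items():
        if first not in features or second not in features:
            return False
        if not low <= dist(features[first], features[second]) <= high:
            return False
    return True


def search(database, query=QUERY):
    """Names of all database entries that satisfy the pharmacophore query."""
    return [name for name, features in database.items() if matches(features, query)]


if __name__ == "__main__":
    print(search(DATABASE))
```

In this toy database only `mol_A` satisfies all three distance windows; `mol_B` fails on the hydroxyl-to-amine distance and `mol_C` lacks a phenyl feature. A hit without the phenyl feature, comparable to the step from 29.12 to 29.13, would be found by a query that omits the phenyl constraints.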
Yvonne Martin, who was intimately involved in the above-described work at
Abbott, offered an explanation for the success of the project. She identified two
factors as decisive: on the one hand the rational, very systematic approach in which
appropriate synthetic model compounds were chosen to establish the
pharmacophore hypothesis, and on the other hand the very close cooperation
between computer-based considerations and synthetic chemistry.
Compound        Substituent   IC50 (µM)
29.15 (S-8307)  R = Cl        40
29.18           X = COOMe     0.14
29.19           X = OH        0.30
29.20           X = OH        0.019
Fig. 29.9 The most important intermediates in the development of the angiotensin-II receptor
antagonist losartan. The basic structure of the angiotensin-II antagonists 29.15 and 29.16, which
were published in a patent from Takeda, was retained. Variations at the substituents R were
guided by a superposition of the Takeda structure with a model of the receptor-bound
conformation of angiotensin II (Fig. 29.10). Compounds 29.19 and 29.20 are orally available
angiotensin-II receptor antagonists. Losartan 29.20 successfully completed clinical trials and
has been available for clinical use since 1994.
successfully, and has been marketed as Lorzaar® since 1994. Losartan was therefore
the first angiotensin-II receptor antagonist to gain approval for the treatment of
hypertension. Only one year later, Novartis followed their colleagues at DuPont
with valsartan 29.21. In the meantime, an entire class of drugs, called sartans, has
received approval for therapy (Fig. 29.10). After multiple years of clinical use,
however, not all sartans have proven to be equally effective, for instance in the
treatment of congestive heart failure. A comparative study with more than 5,000
patients in Sweden demonstrated that candesartan gives better therapeutic results
than losartan. Ninety percent of patients treated with candesartan survived for
1 year, and 61% survived for 5 years. In contrast, 83% of patients treated
with losartan survived for 1 year and 44% survived for 5 years. Candesartan’s higher
affinity for the AT1 receptor compared to losartan is conspicuous, as is its prolonged
persistence at the site of action, which is 10–30 times longer. Perhaps these
parameters explain the greater efficacy of candesartan.
Fig. 29.10 Losartan 29.20 was the first angiotensin-II receptor antagonist; valsartan 29.21
followed shortly thereafter. Eprosartan 29.22, irbesartan 29.23, telmisartan 29.24, and candesartan
29.25 are further representatives of the sartan class. Candesartan is shown as its prodrug; the
red-colored portion is cleaved to release the actual active substance.
The wealth of nuances that our sense of smell can perceive is impressive. Almost
poetically we try to describe gradations in scent with words. Our sense of smell is
probably the biological system that most readily illustrates how biological activity
depends on the three-dimensional chemical structure of molecules. With each
breath, volatile molecules are pulled into our noses and brush against the olfactory
receptors. There they leave a nuance-rich signal that is translated to a multifaceted
sense of smell in the brain. That the shape of molecules is coupled to a particular
odor has been known for a long time. Elliptical molecules, for example, have
a camphor-like scent. Long stretched-out molecules are described as having an
ether smell, and floral character requires a construction that is reminiscent of the
shape of a violin case. However, even small structural changes can exert impressive
effects on our sense of smell (▶ Sect. 5.7).
Fig. 29.11 The C-terminal part (red) of the octapeptide angiotensin II (yellow, below) is
compared with the structural architecture of losartan 29.20 (green) to generate a working
hypothesis, which later turned out to be wrong. The butyl side chain mimics an isoleucine
residue and the imidazole ring with the CH2OH group lies on the histidine. The proline and the
phenyl ring of a phenylalanine are mimicked by a biphenyl group. The tetrazole represents an
isostere for the terminal acid function. The angiotensin II residues labeled in the figure are
Ile5, His6, Pro7, and Phe8 at the C-terminus.
The elucidation of our sense of smell is based on the work of Linda Buck and
Richard Axel, who were awarded the Nobel Prize in Physiology or Medicine in 2004 for their
accomplishments. Scent molecules are perceived by the olfactory cells in the
olfactory mucosa of the nasal cavity. Different olfactory cells are depolarized and
activated by diverse scents; this means that the receptor proteins on the cells can
distinguish between structurally divergent aromas by their affinity to them. The
GTP-dependent activity of an adenylate cyclase increases as a signal in the cell. This is
interpreted as a distinct indication of the involvement of intracellular G proteins in
the olfactory process.
Linda Buck and Richard Axel therefore searched for a family of G protein–coupled
receptors that are expressed in the olfactory mucosa of the rat. They
were quickly successful. In the meantime it is known that there are about
29.7 Lessons Taught by the Nose: We Smell with GPCRs
from chimpanzees. The recognition of possible rivals in the wild for a male chimp
or a sexual partner for a female chimp might be more important than for us humans.
Perhaps Nature equipped chimpanzees with a more sensitive olfactory receptor for
androstenone for this reason.
Two aspects from the study of the sense of smell can be translated to the effects
of drugs on GPCRs. Synthetic agonists and antagonists compete for the same
binding site in cases of GPCRs that are regulated by small biogenic amines in
particular. Usually the synthetic ligands are larger and interact with a larger number
of amino acid residues than the endogenous competitor. Polymorphisms based on
individual base exchanges are described for these receptors, too. Therefore an
attenuated sensitivity of the mutated receptors to the effects of these drugs must
be expected. This becomes noticeable when, within one group of patients,
variations in the therapeutic window are found. The other aspect that was illuminated
by the research on olfactory receptors is the combinatorial composition of
a binding profile made up of the individual interaction signals from the different
receptors. Multiple subtypes of pharmacologically relevant GPCRs are known, for
example the serotonin receptor, which are expressed on the cells. Efforts have
indeed been made to develop highly selective ligands for these subtypes, but this is
not an easy task if the receptor subtypes are particularly closely related. There is
always an attenuated binding to all related receptors. Therefore the signal that
reaches the cell is a composite of information from all the individual binding
profiles. These profiles differ for different ligands and afford a divergent pharma-
cological activity spectrum. This makes it extraordinarily difficult to estimate
the therapeutic value of a development candidate in this area before clinical trials.
It could be that these ligands have just the right balance on multiple subtypes to
achieve their value in therapy. Analogously, a scent develops its lofty potential by
the optimally graduated stimulation of a whole array of olfactory receptors,
whereas another does not rise above the modest level of a cheap perfume!
Not only GPCRs relay extracellular signals into the cell’s interior. A further large
group of membrane-bound receptors that can also achieve this task is the class
of dimerizing or oligomerizing receptors that bind growth factors. These receptors
carry a tyrosine kinase domain on the cytosolic side. Therefore these receptor
tyrosine kinases can also be considered to be allosterically regulated enzymes, the
controlling domains of which are found on the cell’s exterior. The ligands for these
receptors, the growth factors, are themselves proteins with ca. 50–400 amino acids.
By binding, presumably initially to a monomeric building block of the receptor, the
dimerization is accomplished. Conformational changes on the cell’s interior are
induced in both of the tyrosine kinase domains that had come together, and this
results in the autophosphorylation of the receptor at multiple sites. This is the
trigger for the recruitment of further adapter proteins that also activate
29.8 Receptor Tyrosine Kinases and Cytokine Receptors
Fig. 29.12 The fungal metabolite L783281 29.26 was discovered as an insulin mimetic. It
stimulates the receptor tyrosine kinase of the insulin receptor and has antidiabetic properties.
Gefitinib 29.27 inhibits the tyrosine kinase domain of the epidermal growth factor receptor.
with the natural ligand and foils receptor binding (▶ Sect. 32.3). An alternative
concept is the inhibition of tyrosine kinases from the interior of the cell. In this case
successes have been registered. Gefitinib 29.27 is a tyrosine kinase inhibitor
(▶ Sect. 26.3) for the epidermal growth factor receptor that has found clinical
application (Fig. 29.12). Antisense nucleotides (▶ Fig. 32.4) represent alternative
therapeutic concepts, as does gene silencing with siRNA (▶ Sect. 12.7) to reduce
the expression rate of the target receptor. In addition to the signal cascades that the
tyrosine kinase receptors use, living organisms also use cytokines for signal
transduction. Here too, the signaling molecules are proteins, which often adopt a folding
pattern made up of a bundle of four helices (Fig. 29.13). Cytokines are
released from many cells. Their principal function is to disseminate signals in the
immune system. They are therefore involved in immune, inflammatory, and infec-
tious diseases. They also give signals to leukocytes and macrophages to migrate to
sites of inflammation. They play an important role in cell differentiation and cell
proliferation so that they also have importance in cancer therapy.
Cytokines are recognized by cell-surface receptors that are also coupled to
a protein kinase on the cell’s interior. They are able to initiate cellular processes
through these kinases. This can lead to the up- or downregulation of gene expression.
Cytokines include interferons, interleukins, and chemokines. Interferons
stimulate cells involved in immune defense during viral infections in particular.
Interleukins were initially considered to be involved in the communication between
leukocytes, but because they are also involved in the modulation of cell growth and
cell death, they are also used for the treatment of tumors. Chemokines are signaling
molecules that attract immune cells to sites of inflammation.
From the point of view of therapy, cytokines themselves or functional surrogates
are interesting. Either stimulation or inhibition of their receptors can represent
a therapeutic concept. Because, once again, receptors that are regulated by proteins
29.9 Synopsis
The resulting sartans are potent antihypertensives. Later it was demonstrated that
the peptide agonist and the small-molecule antagonists do not share a common
binding site; nevertheless, an incorrect working hypothesis resulted in successful
drug design.
• The wealth of nuances of our sense of smell is achieved by a simultaneous
recognition of odorants at multiple GPCRs with composite and attenuated receptor
profiles.
• Genetic polymorphism of the odorant receptors results in an attenuated sensi-
tivity of individuals for different scents.
• Composite receptor profiles and attenuated sensitivity due to genetic polymor-
phism can be expected for GPCRs targeted by marketed drugs too.
• Dimerizing or oligomerizing receptors bind growth factors and cytokines and
carry a tyrosine kinase domain on the cytosolic side. Upon activation, the kinase
domain starts autophosphorylation, which initiates kinase-dependent signaling
cascades.
• Activation or suppression of oligomerizing receptors needs ligands that interfere
with the binding of macromolecular endogenous ligands. Antibodies have suc-
cessfully been raised to compete with the natural ligands. Furthermore, small-
molecule kinase inhibitors have been developed to block function of the attached
cytosolic tyrosine kinase domain.
Bibliography
General Literature
Buck LB (2005) Unraveling the sense of smell (Nobel lecture). Angew Chem Int Ed Engl
44:6128–6140
Martin YC et al (1991) Molecular modeling-based design of novel, selective, potent D1 dopamine
agonists, in QSAR: rational approaches to the design of bioactive compounds. Elsevier,
Amsterdam, pp 469–482
Wexler RR et al (1996) Nonpeptide angiotensin II receptor antagonists: the next generation in
antihypertensive therapy. J Med Chem 39:625–656
Timmermans PBMWM, Wong PC, Chiu AT, Herblin WF (1991) Nonpeptide angiotensin II
receptor antagonists. Trends Pharmacol Sci 12:55–61
Special Literature
Bianco R et al (2007) Rational bases for the development of EGFR inhibitors for cancer treatment.
Int J Biochem Cell Biol 39:1416–1431
Cherezov V et al (2007) High-resolution crystal structure of an engineered human β2-adrenergic
G protein-coupled receptor. Science 318:1258–1265
Copeland RA, Pompliano DL, Meek TD (2006) Drug–target residence time and its implications
for lead optimization. Nat Rev Drug Discov 5:730–739
De Meyts P, Whittaker J (2002) Structural biology of insulin and IGF1 receptors: implications for
drug design. Nat Rev Drug Discov 1:769–783
The cell is the smallest structural and functional unit of all living things. Single-cell
organisms contain only one such unit. In complex organisms such as humans,
10^13–10^14 cells come together. Due to their constitution, cells are capable of
metabolism. They possess a complex architecture that is directly related to their
function. Because of the high degree of differentiation in higher-developed
organisms, it is not possible to refer to a typical, representative cell. Each cell is
surrounded by a cell membrane. It ensures that the cell represents an individual
closed unit. Signals must be transmitted through this membrane. Systems that achieve
this task were discussed in ▶ Chaps. 28, “Agonists and Antagonists of Nuclear
Receptors” and ▶ 29, “Agonists and Antagonists of Membrane-Bound Receptors”.
Material exchange must also be possible, however, so that the cell can be supplied
with the necessary substances for its function. The selective permeability of the
membrane is of special importance. Amphiphilic compounds can passively diffuse
through the membrane of their own accord. For example, the steroid hormones
discussed in ▶ Chap. 28, “Agonists and Antagonists of Nuclear Receptors” have
this property. Polar compounds such as amino acids, peptides, or sugars do not
permeate the membrane passively but they are essential for the maintenance of the
cell. Therefore the cell is equipped with special transporters that are sometimes
highly selective, but sometimes also surprisingly promiscuous. Because the substance
transport of polar compounds generally occurs against a concentration gradient, this
is accomplished only with the consumption of energy. Nature couples the task of such
a transporter with an energetically favorable reaction. In biological systems, the
hydrolysis of the triphosphate unit of ATP primarily serves this purpose.
Another group of charged particles, the ions, have fundamental importance in
the regulatory function of cells. Without special protein systems, however, they
could not permeate the membrane. If different concentrations of certain ions are in
the cell interior and exterior, a difference in the electrochemical potential will
result. Changes in the membrane permeability for ions play a decisive role in cell
stimulation and signal transduction. Nerve and muscle cells in particular react to
such stimuli with specific changes in their states. For example, the contractions of
muscle cells determine the heart beat. Nerve cells transmit stimuli over short or
long distances and serve to distribute information in the central nervous system.
The establishment and maintenance of such concentration gradients across the
membrane requires the transport of the relevant ions across the membrane barrier. It
is primarily the ion pumps that build a concentration gradient across the membrane
barrier. They work relatively slowly and consume energy. Therefore their function is
coupled to an energetically favorable reaction. Ion pumps achieve a transport rate of
10²–10⁴ particles per second. Indeed, at 10³–10⁵ molecules per µm² their local density
in the membrane is relatively high, but for the fast switching of cellular processes,
the ion pumps are much too slow. Therefore there are specific ion channels that are
responsible for selective and passive ion passage along a concentration gradient. They
achieve a flow rate of 10⁶–10⁸ ions/s, which is only slightly below the rate of diffusion.
Their occupancy density in the membrane is much lower at 1–10 molecules/µm². Ion
channels are either voltage- or ligand-gated and allow a change in the membrane
potential within the millisecond range. If the pumps have established an electrochem-
ical gradient across the membrane, the opening of a specific ion channel leads to ion
flow across the membrane purely for entropic reasons.
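The difference in speed between pumps and channels can be made tangible with a short order-of-magnitude calculation. The following is a minimal sketch that uses only the rate ranges quoted above; the number of ions to be moved is a purely hypothetical value chosen for illustration, and the function names are not taken from any library.

```python
# Order-of-magnitude comparison of ion pumps and ion channels.
# The rate ranges are the ones quoted in the text; the ion count
# used in the example is a purely hypothetical number.

PUMP_RATE_RANGE = (1e2, 1e4)      # particles per second and pump
CHANNEL_RATE_RANGE = (1e6, 1e8)   # ions per second and open channel


def transfer_time_s(n_ions, rate_per_s, n_proteins=1):
    """Time in seconds for n_proteins working in parallel to move n_ions."""
    if rate_per_s <= 0 or n_proteins <= 0:
        raise ValueError("rate and protein count must be positive")
    return n_ions / (rate_per_s * n_proteins)


def rate_ratio_range(fast_range, slow_range):
    """Smallest and largest possible ratio between two rate ranges."""
    return fast_range[0] / slow_range[1], fast_range[1] / slow_range[0]


if __name__ == "__main__":
    n_ions = 1e4  # hypothetical number of ions to be moved
    for label, (low, high) in (("pump", PUMP_RATE_RANGE),
                               ("channel", CHANNEL_RATE_RANGE)):
        slowest = transfer_time_s(n_ions, low)
        fastest = transfer_time_s(n_ions, high)
        print(f"one {label}: between {fastest:.1e} s and {slowest:.1e} s")
    print("channel/pump rate ratio:",
          rate_ratio_range(CHANNEL_RATE_RANGE, PUMP_RATE_RANGE))
```

With the quoted ranges, a single open channel is between a hundred and a million times faster than a single pump, which is why only channels can change the membrane potential within milliseconds.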
The cell must also regulate its water homeostasis. Individual water molecules
can directly diffuse across the membrane. To transport larger quantities of water
across the membrane, specific pores called aquaporins are necessary that regulate
the in- or outflow of water according to the gradient in the osmotic pressure.
These systems of specific particle transport across a membrane shall be consid-
ered in this chapter in detail. They are all integral membrane proteins. Examples
of these membrane proteins that have been characterized by structural biology will
be presented. Ligands shall be discussed that represent an approach to the therapeutically
relevant regulation of these proteins. Furthermore, bacterial transport systems that change
the permeability of membranes and can thereby act as antibiotics against other
microorganisms shall be discussed.
[Figure 30.1: schematic of the cell membrane with a Cl− channel, the Na+/K+-ATPase, a Ca2+ channel, a Na+ channel, and a K+ channel; the cell interior is marked.]
Fig. 30.1 Different pumps and ion channels ensure the calibration of ion gradients across the cell
membrane, and in doing so establish a potential difference across the membrane. They can be
either ligand- or voltage-gated. Potassium channels (blue) carry potassium ions out of the cell
highly selectively and are largely responsible for the calibration of the resting potential. The
opening of fast sodium (red) and also calcium channels (green) causes an action potential, which
leads to depolarization. A pump (violet) that exchanges three Na+ for two K+ ions ensures the
re-establishment of the Na+/K+ ion concentration in the resting state. Chloride channels (yellow)
allow an influx of Cl- ions, which hyperpolarizes the cell and hinders depolarization. The
concentrations in mol/L (M) that are given on both sides of the membrane correspond to the
approximate values in the resting state.
Therefore an excess of positive ions accumulates on one side of the membrane, and
a deficit is on the other side. A potential difference is formed that, as in the first
case, can be calculated from the concentration difference at the boundary surface by
using the Nernst equation. After a little while the net migration of potassium ions
comes to a stop because the tendency to reduce the concentration gradient is
counterbalanced by the difference in electrostatic potential. Only a few potassium
ions, in fact, migrate before this dynamic equilibrium is established.
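The Nernst equation mentioned above can be evaluated in a few lines. The following sketch assumes body temperature (37 °C) and the common sign convention that the potential of the cell interior is given relative to the exterior; both choices, as well as the function name, are assumptions made for this illustration.

```python
import math

R = 8.314462618    # gas constant in J/(mol K)
F = 96485.33212    # Faraday constant in C/mol


def nernst_potential_mV(c_out, c_in, z=1, temp_K=310.15):
    """Equilibrium potential of the interior relative to the exterior in mV.

    c_out, c_in: ion concentration outside and inside (same units)
    z: charge number of the ion (+1 for K+, -1 for Cl-)
    temp_K: absolute temperature; 310.15 K (37 degrees C) is an assumption
    """
    if c_out <= 0 or c_in <= 0:
        raise ValueError("concentrations must be positive")
    return 1000.0 * R * temp_K / (z * F) * math.log(c_out / c_in)


if __name__ == "__main__":
    # Tenfold and thirtyfold excess of a monovalent cation on the inside
    for fold in (10.0, 30.0):
        e_mV = nernst_potential_mV(c_out=1.0, c_in=fold)
        print(f"{fold:.0f}-fold gradient: {e_mV:.1f} mV")
```

Under these assumptions a tenfold gradient of a monovalent cation corresponds to roughly 60 mV, and a thirtyfold gradient to roughly 90 mV, with the interior negative when the cation is more concentrated inside.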
In living nature, it is above all sodium, potassium, calcium, and chloride ions
that form such potential differences between the interior and exterior of the cell.
Initially ion pumps are responsible for the membrane’s concentration gradient. If
the cell membrane were only permeable for potassium ions, a 30-fold concentration
gradient of K+ ions between the interior and the environment would result in
a voltage of −90 mV (Figs. 30.1 and 30.2). This is the situation in the resting
state of a cell. As described in the thought experiment, the outflow of K+ ions
through a highly specific potassium channel establishes this voltage difference. The
above-mentioned −90 mV is not measured; instead, a resting membrane
potential of about −70 mV (Fig. 30.2) is observed. Because other ions also have
a certain permeability, the membrane potential at any arbitrary time reflects
a complex mixture of different contributions of individual ions and their conductivities
(Fig. 30.1). To stabilize the cell in a particular phase (for instance, in the
resting state), the cellular ion distribution is maintained by Na+/K+-ATPases. They
pump ions against the electrochemical concentration gradient by consuming ATP.
For each transport process, three sodium ions are pumped out of the cell while two
potassium ions are pumped in. There is another such pump to establish the calcium
ion concentration.
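How several ions with different permeabilities combine into one membrane potential is commonly estimated with the Goldman-Hodgkin-Katz voltage equation. This equation is not used in the text itself; it is brought in here only as an illustration. The concentrations and relative permeabilities below are rough, assumed values for a generic resting cell, not the values of Fig. 30.1.

```python
import math

R = 8.314462618    # gas constant in J/(mol K)
F = 96485.33212    # Faraday constant in C/mol


def ghk_potential_mV(perm, conc_in, conc_out, temp_K=310.15):
    """Goldman-Hodgkin-Katz potential (interior vs. exterior) in mV.

    perm, conc_in, conc_out: dicts with the keys "K", "Na", and "Cl".
    For the anion Cl-, inside and outside concentrations swap places.
    """
    numerator = (perm["K"] * conc_out["K"]
                 + perm["Na"] * conc_out["Na"]
                 + perm["Cl"] * conc_in["Cl"])
    denominator = (perm["K"] * conc_in["K"]
                   + perm["Na"] * conc_in["Na"]
                   + perm["Cl"] * conc_out["Cl"])
    return 1000.0 * R * temp_K / F * math.log(numerator / denominator)


# Assumed, illustrative values (mmol/L and relative permeabilities)
CONC_IN = {"K": 150.0, "Na": 10.0, "Cl": 10.0}
CONC_OUT = {"K": 5.0, "Na": 145.0, "Cl": 110.0}
PERM_REST = {"K": 1.0, "Na": 0.04, "Cl": 0.45}
PERM_K_ONLY = {"K": 1.0, "Na": 0.0, "Cl": 0.0}

if __name__ == "__main__":
    print("K+ only:", round(ghk_potential_mV(PERM_K_ONLY, CONC_IN, CONC_OUT), 1), "mV")
    print("K+, Na+, Cl-:", round(ghk_potential_mV(PERM_REST, CONC_IN, CONC_OUT), 1), "mV")
```

With potassium as the only permeant ion, the expression reduces to the Nernst potential of the thirtyfold K+ gradient. A small additional sodium permeability shifts the value toward less negative potentials, which is the qualitative behavior described above.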
748 30 Ligands for Channels, Pores, and Transporters
[Figure 30.2: membrane potential (axis from 0 mV to −100 mV) and ion flows (iK+ outward; iNa+ and iCa2+ inward) plotted against time.]
Fig. 30.2 The membrane potential in the resting state is about −70 mV and is stabilized by the
efflux of potassium ions (iK+←, blue). Upon excitation, a fast sodium channel opens. The influx of
sodium ions (iNa+→, red) shifts the membrane potential by about 100 mV in the positive direction.
When this value is reached, the sodium channel closes. The efflux of potassium ions repolarizes the
cell and shifts the potential below the resting membrane potential (hyperpolarization).
In cells that have calcium channels, their opening can also contribute to depolarization and
therefore to an action potential (iCa2+→, green).
the depolarization can be achieved if the opening of these sodium or calcium channels
is blocked. Many local anesthetics work by the principle of inhibiting sodium
channels on nerve cells. Calcium channel blockers minimize the Ca2+ influx. This
slows, for example, the diastolic depolarization in heart cells and the heart muscle
works more efficiently. Therefore compounds such as nifedipine, diltiazem, and
verapamil are used to treat hypertension and cardiac arrhythmias (▶ Sect. 2.6).
The electrophysiological processes described in this section reflect a highly
simplified picture. According to the function and tissue-specific location of the
considered cell, multiple ion-specific channels are at work to achieve a finely tuned
setting of the required membrane potential.
The finely attenuated setting of ion gradients across the membrane establishing the
overall membrane potential shows that channels must have a high selectivity for
individual ions. The difference between the ions to pass is very small. Sodium and
potassium ions have the same charge and differ in size by only a little more than
0.35 Å. Their hydration enthalpies are slightly different, but the geometry of their
hydration shell differs significantly. The larger potassium ion is surrounded by eight
water molecules, whereas the sodium ion preferably accommodates six nearest
neighbors. How can a protein exploit this small difference efficiently so that
a selective ion filter results? The achieved discrimination is impressive: only one
sodium ion is smuggled in for every 10,000 potassium ions!
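The selectivity ratio quoted above can be translated into an energy scale. The following sketch assumes that the 10,000:1 discrimination may be treated like a Boltzmann ratio at 25 °C; this is a simplification for illustration, since ion permeation is a kinetic process, and the function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ/(mol K)
KJ_PER_KCAL = 4.184


def selectivity_free_energy_kJ(ratio, temp_K=298.15):
    """Free-energy difference in kJ/mol that corresponds to a selectivity ratio.

    Assumes the ratio behaves like a Boltzmann factor exp(ddG / RT).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return R_KJ * temp_K * math.log(ratio)


if __name__ == "__main__":
    ddg = selectivity_free_energy_kJ(10_000)
    print(f"{ddg:.1f} kJ/mol = {ddg / KJ_PER_KCAL:.1f} kcal/mol")
```

Under this assumption, the filter has to distinguish the two ions by an amount of roughly 23 kJ/mol, which is in the range of a few hydrogen bonds.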
Ion channels are gigantic molecular constructions. They are embedded in the
membrane. It is extremely tricky to remove them from the membrane and embed
them in a crystal lattice with auxiliary material without destroying them. Once this
is achieved, a crystal structure can be determined. Roderick MacKinnon managed
this masterpiece in 1998 at Rockefeller University in New York. Only 5 years later
his achievement was honored with the Nobel Prize.
Initially the structure of the KcsA potassium channel from the bacterium Streptomyces
lividans was determined. The channel is a homotetramer
and traverses the membrane with two long helices per monomer (Fig. 30.3). The
C-terminal end of another shorter helix is oriented in a cavern in the middle of the
channel. Such a helix forms a dipole moment due to the periodic orientation of well
aligned amide bonds along the protein backbone (▶ Sect. 14.2). The preferred
binding site for a positive charge is formed at the end of such a helix (Fig. 30.4).
Four of these helices are oriented toward the cavern in the interior of the channel.
The potassium ions, which are surrounded by a shell of eight water molecules, are
essentially pulled out of the cytosol. This allows the potassium ions to enter the
hydrophobic membrane environment. With that, however, no discrimination
against sodium ions has been achieved. After this accelerated entry into the
channel, the ions reach the selectivity filter. To pass it, the potassium ions must shed their
hydration shells. During structure determination it was possible to capture some
Fig. 30.4 Crystal structure of the bacterial potassium channel KcsA in the open state. The
channel forms a tetramer (a); each monomer is constructed from three helices. Two of these
helices (red) traverse the entire membrane whereas the third, shorter helix (blue-violet) is oriented
toward a cavern in the interior of the channel. There the potassium ions (violet spheres), which are
surrounded by eight water molecules, shed their water shell and enter the selectivity filter (b).
A potassium ion with its quadratic-antiprismatic coordination is shown before entering into the
filter. The carbonyl groups from the main chain adopt the octavalent coordination sphere wrapping
around the potassium ion with similar geometry and transfer the ion across the membrane.
subsequent amino acids are turned toward the interior in the potassium-selective
channels and contribute to the filter (Fig. 30.5). In unselective channels, these
carbonyl groups are turned away from this area and open a chamber that can indeed
accommodate an ion, but cannot achieve selectivity filtering.
The mechanism of the opening and closing of the potassium channel is under-
stood in greater detail thanks to further structural determinations, also on channels
from other organisms. A change in the membrane potential of about 50 mV causes
the channel to open. Because this voltage difference occurs within a distance of
about 50 Å, it corresponds to a tremendous field strength of about 100,000 V/cm. Apparently, the parts
of the channel that sense the difference in voltage are strongly positively
charged and swim like paddles on the exterior of the membrane. A change in the
voltage across the membrane causes a movement of these paddles and initiates the
opening or closing of the channel. A kink in one of the extended transmembrane
[Figure 30.5, panels a and b: selectivity-filter residues labeled Thr, Val, Gly, Tyr (a) or Asp (b), and Gly.]
Fig. 30.5 Comparison of the tetrameric ion filter of the highly selective potassium channel KcsA
from Streptomyces lividans (a) and the sodium and potassium-permeable channel from Bacillus
cereus (b). The selective channel forms a tetramer from a TVGYG motif; a TVGDG is found at the
same place in the Na+/K+ channel. Both channels have the same geometry in the lower part formed
by threonine and valine residues. The backbone carbonyl groups of the following amino acids
Gly–Tyr are rotated toward the interior in the potassium-selective channel and contribute to the filter,
whereas the C=O groups from the four Gly–Asp motifs are rotated away in the unselective channel.
It opens to a chamber that can accommodate an ion, but does not achieve selectivity filtering.
helices enables this process. A combination of a kink and a turn movement by about
30° of the helical end in each subunit of the tetramer causes the closure or the opening
of the channel. A highly conserved glycine residue is found at the bend position. Its
lack of a side chain affords this amino acid a larger conformational flexibility.
Therefore glycine is predominantly involved in conformational switches.
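The field strength quoted above for the voltage sensor follows from a simple unit conversion. The sketch below reproduces it; the function name is hypothetical, and the field is treated as uniform across the quoted distance, which is a simplifying assumption.

```python
def field_strength_V_per_cm(delta_mV, distance_angstrom):
    """Uniform electric field in V/cm for a voltage drop over a distance.

    delta_mV: potential difference in millivolts
    distance_angstrom: distance in angstroms (1 angstrom = 1e-8 cm)
    """
    if distance_angstrom <= 0:
        raise ValueError("distance must be positive")
    volts = delta_mV * 1e-3
    centimeters = distance_angstrom * 1e-8
    return volts / centimeters


if __name__ == "__main__":
    # 50 mV across roughly 50 angstroms, as stated in the text
    print(f"{field_strength_V_per_cm(50.0, 50.0):.3g} V/cm")
```

A change of 50 mV over 50 Å thus amounts to about 10⁵ V/cm, the value given in the text.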
One group of potassium channels is ATP-dependent. Their structural architec-
ture is much more complex than the described bacterial channel. Two genes Kir6.1
and Kir6.2 are known that code for the pore-forming part of the ATP-dependent
channel. The channels are hetero-octamers, each constructed from four Kir channel
proteins and four regulatory units. The latter are called sulfonylurea receptors
because they can be blocked by sulfonylureas. ATP binds to the Kir subunit and
the channel closes. ATP is hydrolyzed to ADP via a multistep process, and the
ATP-induced closure is reversed. Through dissociation and renewed binding of
Mg–ADP, the channel is arrested in the open state. Its state therefore depends on the
ATP/ADP ratio in the cell. Active substances such as pinacidil 30.1, diazoxide 30.2,
or levcromakalim 30.3 are known to stabilize the channel in the open state
(Fig. 30.6). Pinacidil is used to treat high blood pressure, and diazoxide is used in
the therapy of Langerhans islet cell tumors. In contrast, the large group of sulfo-
nylureas (Fig. 30.6) blocks the regulatory subunit and leads to closure of the
attached potassium channel in the insulin-producing cells of the pancreas. Elevated
glucose concentrations stimulate insulin secretion from the pancreatic β-cells.
This release occurs as a response to a series of intracellular metabolic and
30.2 Molecular Function of a Potassium Channel at the Atomic Level 753
[Figure 30.6: chemical structures of 30.1 Pinacidil, 30.2 Diazoxide, 30.3 Levcromakalim, 30.4 Sulfonylurea (general scaffold with R1 and R2), 30.5 Tolbutamide, 30.6 Gliclazide, 30.7 Glibornuride, 30.8 Glibenclamide, 30.9 Gliquidone, 30.10 Glimepiride, and 30.11 Glisoxepide.]
Fig. 30.6 Pinacidil 30.1, diazoxide 30.2, and levcromakalim 30.3 are potassium channel openers.
In contrast, sulfonylureas such as 30.4 block the regulatory subunit of the ATP-dependent
potassium channel in the insulin-producing cells of the pancreas. The basic scaffold of the
sulfonylureas can be broadly varied on both termini (30.5–30.11) with aliphatic (R1), aromatic,
or other cyclic groups (R2).
In September 2007, after more than 45 years of use in therapy, the drug clobutinol
30.12 was withdrawn from the market (Fig. 30.7). This drug was used to treat dry
coughs. Over the years, it is estimated that it was used by approximately 200 million
30.3 Binding Unwanted: The hERG Potassium Channel as an Antitarget 755
[Figure 30.7: chemical structures, including 30.12 Clobutinol, 30.13 Terfenadine, 30.14 Astemizole, 30.15 Sertindole, and 30.19 MK499.]
Fig. 30.7 Clobutinol 30.12 was withdrawn from the market after 45 years of clinical use because
of the risk of provoking an arrhythmia. Terfenadine 30.13, astemizole 30.14, sertindole 30.15,
thioridazine 30.16, grepafloxacin 30.17, and cisapride 30.18 met with the same fate and were
either withdrawn, or their indications for use were severely limited. MK499 30.19 is a potent
class-III antiarrhythmic agent, and it binds to the hERG potassium channel.
The channel was found as a result of detailed genetic investigations of patients with
an inherited long-QT syndrome. An undesirable drug side effect can cause the same
condition when the hERG channel is inhibited by the administered drug. Even
though this side effect is rare, in acute cases it is extremely dangerous. It is
estimated that about 3,000 fatalities each year in the USA are attributable to such
adverse events. To avoid this side effect, attempts are now made to eliminate
binding to the hERG channel at an early stage of drug development. A structure
of this potassium channel is currently still unavailable. It is, however, related to the
bacterial KcsA channel that was discussed in the previous section.
An alanine scan was carried out to determine which amino acids are decisive for
the inhibition. The altered binding of the potent class-III antiarrhythmic drug MK499
30.19 was tested. Two aromatic residues in this channel, Tyr652 and Phe656, proved
to be decisive. They are found on the four subunits in the interior of the broad cavern
before the entry into the selectivity filter (cf. Fig. 30.3, approximately at the height of
the potassium ion’s position). Moreover, the binding decreased even more when four
additional residues were replaced by alanine. With this information, homology
models of the hERG channel were constructed based on the crystal structure of the
KcsA channel. The residues that were determined to be critical are all oriented into
this cavern. Drugs that are held responsible for a prolonged QT interval fit in the
model so that an interaction with the aromatic residues is suspected. These model
considerations allow the construction of a superimposition model of known inhibi-
tors. It indicates that inhibitors bind with extended geometry and exhibit a charged
basic nitrogen atom in the center. This atom is in the middle of a pyramidal arrange-
ment formed by three to four hydrophobic aromatic moieties. This spatial pattern has
been further refined by using structure–activity relationships. It serves as a kind of
reference to check whether newly designed active substances could possibly bind to
the hERG channel. The goal of this design is not to optimize but to prevent binding.
The hERG channel is therefore considered an antitarget. In addition to these design
considerations, today the actual hERG-channel inhibition of synthesized compounds
is measured. In this way, an attempt is made in an early phase of drug discovery to avoid
the bitter and very expensive surprise of finding severe side effects later.
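The spatial pattern described above can be turned into a crude first-pass flag. The following sketch checks only the two ingredients named in the text, a charged basic nitrogen and three or more hydrophobic aromatic moieties; it ignores the pyramidal geometry entirely. The class, the function, and the example candidates are hypothetical and do not describe real compounds or an established hERG model.

```python
from dataclasses import dataclass


@dataclass
class LigandFeatures:
    """Precomputed, hypothetical descriptors of a candidate molecule."""
    name: str
    has_charged_basic_nitrogen: bool
    n_hydrophobic_aromatic_groups: int


def matches_herg_pattern(ligand, min_aromatic_groups=3):
    """Return True if the ligand shows both features of the hERG pattern.

    The threshold of three aromatic groups follows the description in the
    text; the spatial arrangement of the groups is not evaluated here.
    """
    return (ligand.has_charged_basic_nitrogen
            and ligand.n_hydrophobic_aromatic_groups >= min_aromatic_groups)


if __name__ == "__main__":
    candidates = [
        LigandFeatures("candidate_A", True, 3),   # hypothetical
        LigandFeatures("candidate_B", True, 1),   # hypothetical
        LigandFeatures("candidate_C", False, 4),  # hypothetical
    ]
    for ligand in candidates:
        flag = "check hERG binding" if matches_herg_pattern(ligand) else "no flag"
        print(ligand.name, "->", flag)
```

A flag of this kind can only prioritize compounds for the experimental hERG assay; it cannot replace the measurement.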
There are multiple classes of ligand-gated ion channels. The nicotinic acetylcho-
line receptor, the 5-HT3 receptor, and the inhibitory glycine and GABAA
receptors belong to the first class: the Cys-loop superfamily. The first two are
excitatory receptors that respond to acetylcholine and serotonin. They are essential
for the fast nerve impulse transmission at the synapses. The inhibitory glycine and
GABAA receptors are controlled by glycine and γ-aminobutyric acid, respectively.
These ion channels have a common architecture. They form a pore in the mem-
brane, and open and allow the passive flow of ions in response to the binding of an
agonist. They have a pentameric construction. The composition of this
heteropentamer varies. A multitude of different receptors are composed from
Fig. 30.9 In the crystal structure, the nicotinic acetylcholine receptor has a diameter of 80 Å, and
is 125 Å long (a). It represents a pentamer made from five subunits. It traverses the membrane with
the central region that is composed of four helices per monomer. An extracellular domain binds to
the ligands and additional helices attach on the cytosolic side. The narrowest position in the
interior of the channel (b) reduces to 6 Å in the closed state (middle, indicated by the white
surface). There a belt of hydrophobic residues constricts and prevents the passage of sodium ions.
Upon opening, the helices rearrange by a concerted rotation and expand the channel passage by
3 Å, which is enough to allow the sodium ions to pass with their hydration shells. The interior of
the channel is polar and has many acidic amino acids (b, yellow indication).
[Chemical structures: 30.23 α-Conotoxin (peptide with Cys–Cys disulfide bridges), 30.24 Methyllycaconitine, and 30.25 Pyrantel.]
remains spread apart for steric reasons. The peptide α-conotoxin serves
a carnivorous sea snail, which lives in tropical oceans, as a venom. Because the
snail cannot bite, it shoots its venom, which is packaged in small chitin-coated
arrows that even have barbed hooks, through a sort of blowpipe. The diterpene
alkaloid methyllycaconitine 30.24 from the seeds of the medicinal plant larkspur
achieves the same effect. A movement of more than 10 Å is registered (Fig. 30.11)
in the receptor protein upon the binding of this ligand. This conformational change is transmitted
to the narrowest passage region of the channel via a cascade, and in doing so
regulates the sodium ion permeability.
As an additional aspect, these structures offer an insight into how chemically
completely different structures can invoke the same effect on a receptor.
30.5 Ligands Gate as Agonists and Antagonists: The Function of an Ion Channel 761
Fig. 30.11 By binding an agonist or antagonist in the ligand-binding domain of the nicotinic
acetylcholine receptor, a loop (red) lies either directly on the binding site, or it remains spread apart
by about 10 Å (right). The conformational signal is transmitted to the channel isthmus, which is
30 Å away, and leads to the channel remaining closed or opening.
Fig. 30.12 Crystal structure of the ligand-binding domain of the nicotinic acetylcholine receptor
with the bound agonists epibatidine 30.21 (a) and α-lobeline 30.22 (b), the peptidic antagonist
α-conotoxin 30.23 (c) and the diterpene alkaloid methyllycaconitine 30.24 (d). Despite their very
different sizes, they all occupy the same binding site. In the case of the agonists a loop (Fig. 30.11)
lies across the binding site that spreads apart in the case of the antagonists. All of these ligands
have a positive charge with which they undergo a cation–π interaction with the aromatic ring of
a tryptophan in the vicinity.
The glycine and GABAA receptors are inhibitory neuroreceptors because they
regulate the influx of chloride ions. This leads to hyperpolarization and lowers the
voltage-dependent excitability; the depolarization of the cell is hindered. Both of
these receptors are regulated by the low-molecular-weight ligands glycine 30.26
and γ-aminobutyric acid (GABA) 30.27 (Fig. 30.13). Anesthetics as well as alcohol
30.6 Power Brake Boosters for GABA-Gated Chloride Channels 763
[Chemical structures: 30.29 Barbital and 30.30 Diazepam.]
modulate the activity of these receptors and lead to a stabilization of the channel in
the open state. Even cholesterol and other steroids can achieve this effect. The
synthetic pregnane steroid alfaxalone 30.28 opens the GABAA receptor for a longer
duration.
The GABAA receptor, like other members of the nicotinic receptor family, is
a heteropentamer in which the simultaneous incorporation of an α-, a β-, and a γ-subunit
is required. The inhibitory effect of drugs such as barbiturates 30.29 or benzodiaz-
epines 30.30 on the channel is based on this regulation (Fig. 30.13). Presumably,
they exert an influence on the dynamic properties of the receptor and stabilize it in
the open state. The benzodiazepines amplify the effect of the endogenous ligand
GABA in that the excitability of the cell is hindered by opening the chloride
channel. They are allosteric regulators and are therefore also termed “power
brake boosters.” The barbiturate-binding site is on the β-subunit, whereas benzodiazepines
bind on the α-subunit. They act as sedatives, hypnotics, anxiolytics,
anticonvulsives, muscle relaxants, and anterograde amnestics (they cause memory
loss for the time the drug is in the system). Barbiturates have lost their importance
as sleeping pills and sedatives, above all because of their high addictive potential
and the risk of being used for suicide. They have been replaced by benzodiazepines,
which are better tolerated and cannot be used for suicide as a monosubstance.
Indeed, hardly any other substance class has illustrated the concept of
bioisosteric replacement as thoroughly as the benzodiazepines. As a result,
a plethora of derivatives are available that, depending on the individual profile
and pharmacokinetics, have opened therapeutic approaches for the treatment of
[Chemical structures: general benzodiazepine scaffolds (X = C, N; substituent R4) and 30.31 Flumazenil.]
with increased lipophilicity (alkylation at N1, chlorine substituents in the 7- and 2’-
positions) quickly reach their effective concentration in the central nervous system.
This causes the sedative and hypnotic components to be amplified. Increased
hydrophilicity (unsubstituted N1 atom, 3-hydroxylation, no 2’-halogenation) is
desired for the profile as a tranquilizer.
Almost all benzodiazepines have agonistic effects and amplify the effect of
GABA. The modification to flumazenil 30.31 led to a compound with antagonistic
activity. It prevents the agonistic effects of benzodiazepines and reverses the
sedative effects. Interestingly, it is missing the phenyl substituent in the 5-position.
The above-described activity profile of benzodiazepines is broad and multifac-
eted. Therefore selective representatives of this class have been worked on in
pharmaceutical research that have only one quality of action, for example, that
have only an anxiolytic or only a sedative component.
The structure of the nicotinic acetylcholine receptor was introduced in Sect. 30.4.
As mentioned, the ligand-gated chloride channels belong to this family and
have a pentameric architecture. The structural details of this channel are still
unknown because, until now, a high-resolution structure determination of such
a channel has not yet been achieved. On the other hand, it is possible to gain more
detailed insights into the architecture of another class, the voltage-gated chloride
channels.
Nine isoforms of these ClC channels are present in our genome. They take on
numerous physiological functions, for example, the control of the resting potential
in skeletal muscle and non-excitable cells. Moreover, they exert an influence on the
absorption of sodium chloride from the kidney into the blood stream or they are
involved in processes that are necessary for the establishment of an acidic milieu.
Malfunction and genetically caused mutations in these channels are associated with
diseases such as myotonia, a pathological muscle tension or particular forms of
epilepsy, neuropathy, and osteopetrosis (a bone disease).
In 2003 the research group of Roderick MacKinnon managed to elucidate the
crystal structure of a bacterial ClC channel. It is constructed from two identical
subunits that are coupled through twofold symmetry. Interestingly, this membrane
protein does not have long helices that are oriented perpendicular to the membrane.
Rather, the 18 helices of this channel are packed tightly together and tilted by up to 45°
to the membrane axis. The channel pore is reminiscent of the form of an hourglass.
The pore broadens to an atrium on the intracellular and extracellular sides, where
positively charged arginine residues are found in the vicinity (Fig. 30.15). The
channel narrows in the center over a distance of about 15 Å. A selectivity filter
together with a conserved glutamate residue is found at the apex. This residue takes
on the function of a gatekeeper. Additionally, two antiparallel-oriented
helices end exactly there. Their ends form a preferred binding site for a negative charge.
For this, the helices must have the opposite orientation compared to the potassium
channel. Here they have their N-terminal ends at the narrowest place in the
channel. As with the potassium channel, the dipole moment in the helices generates
a special binding site for negatively charged ions. In the crystal structure, the
carboxylate group of Glu148 is found exactly at this position. If this residue is
exchanged for a neutral glutamine, the position is freed, and the glutamine adopts
another position. Instead a bound chloride ion is then found in this position. Upon
mutation to glutamine, the channel is left in a permanently open state. It is assumed
that the two structures describe the open and closed state of the ClC channel. The
fact that the Gln148 mutant exhibits a chloride ion in this position underscores how
important the special position between the two oppositely oriented helix ends is for
the stabilization of a negative charge.
In addition to this chloride ion, two other chloride ions were found in the open as
well as the closed channel. One sits deep in the pore and has completely shed its
solvation shell. It is stabilized by two NH groups from the main chain and the OH
groups from Ser107 and Tyr445 (Fig. 30.15b). The other chloride ion is found at the
entry and is still partially solvated by water molecules.
30.8 Transporters: The Gatekeepers to the Cell 767
The regulation via the glutamate as a placeholder allows the channel to open and
close based on external signals. The structurally related human ClC-0 channel is
voltage-gated: it opens when the potential on the interior of the cell shifts to the positive
range. An applied negative potential closes the channel. Upon increase of the
extracellular chloride ion concentration, the channel opens. The same can be
observed when the pH value of the environment drops. It is possible that the
glutamate residue changes its protonation state when it swings out of the cusp of
the pore to make way for the chloride ion. This would explain its regulatory
function under changing pH conditions and the stoichiometric exchange of Cl− for H+.
The ClC channels are specific for monovalent anions. In addition to chloride, albeit
with reduced permeability, Br−, I−, NO3−, and SCN− are also able to pass through.
Because the latter-mentioned ions play a subordinate role in biological systems,
a pronounced selectivity is not necessary. Nonetheless divalent ions such as sulfate
and hydrogenphosphate are denied passage. Time will tell how well the structure of
the bacterial channel reflects the properties of channels in higher organisms. It is in
question whether it is possible to modulate the functions of the channels with
ligands that can be developed into drugs.
The latter group of so-called ABC transporters represents a large family of pro-
teins that imports a broad palette of substances such as amino acids, ions, sugars,
lipids, or other drugs into the cell, but also removes them again. To date, 46 of these
ABC transporters have been identified in humans. They are composed of at least
two nucleotide-binding (NBD) and two transmembrane (TMD) domains. Several
structures of NBDs have been elucidated, and they all have a largely similar
construction. They bind ATP, which is essential for their operation. The TMDs
are decisive for the actual membrane passage. They ensure a buffer from the
hydrophobic membrane environment for hydrophilic substances.
The best-investigated transporter is the human MDR-ABC transporter
P-glycoprotein GP170 (MDR1/ABCB1). Like a hydrophobic vacuum cleaner, it
removes lipids as well as a broad palette of drug molecules from the cell. Electron
microscopy on 2D crystals afforded the first indications about its 3D structure and
mode of action (▶ Sect. 13.6). It traverses the membrane with 12 helices. The
NBDs are found on the cytosolic side. In its initial state, the transporter has low
affinity to ATP, and the two NBDs are found in spatially separate configurations.
The two transmembrane domains spread apart and open a cavity in the center that
can accept molecules with high affinity from the outer leaflet of the membrane’s
interior. The cavity seems to be highly adaptive which explains the transporter’s
pronounced substrate promiscuity and its ability to adapt to the requirements of
very different molecules. The substrate is passed from the membrane’s interior to the
exterior of the cell. Once initiated by substrate binding, the transporter undergoes
a dramatic conformational change that brings the two transmembrane domains
together again. The binding affinity for ATP increases. Simultaneously, the NDB
completes its rotational movement. Spatially, they come together. Presumably the
energetically favorable ATP hydrolysis is also coupled with this step. The activa-
tion barrier for the conformational transition is decreased. The substrate being
transported is released from the transmembrane domain into the exterior surface
layer of the membrane.
The development of resistance due to transporters represents a serious problem for
drug therapy. Therefore, it is all the more important to investigate the molecular
criteria that make molecules good substrates for these transporters. With this knowledge,
it can be understood how to modify molecules so that they are no longer good
substrates. This task is nontrivial because the binding pockets in these transporters
are evidently highly adaptive, and therefore the typically small changes to
a drug molecule that are tolerable to its mode of action have no effect on its binding
behavior to the transporters. On the other hand, potent inhibitors of these transporters
can be sought. Some compounds such as R-verapamil (▶ Sect. 2.6) have been
discovered to serve this purpose. Their clinical use for breaking resistance, however,
has proven problematic because the inhibition of the transporters also prevents their
natural function from being carried out. On the other hand, it must not be forgotten
that the inducible and heterologous expression of these transporters represents
a decisive defensive mechanism of the cells against xenobiotics. It is not without
cause that Nature has developed such a highly efficient and flexible protective
mechanism. Therefore, it is possible that these transporters do not represent an
ideal drug target in humans. This picture, however, may be very different for the fight
against bacteria and parasites. They also invoke such transporters to fight drug
molecules (cf. ▶ Sect. 3.2). The currently used weapons against bacteria and parasites
will eventually become ineffective. To break resistance, attempts have recently been
made to inhibit parasite and bacterial transporters. If these goals are achieved, it
would be a double success. On the one hand, resistance against older and well-proven
therapeutic drugs would be broken. On the other hand, the undesirable pathogens
would be additionally damaged, because the transporter would no longer be available
as a defense mechanism against undesirable foreign substances that are potentially
injurious for them. We must wait and see whether this concept, which is currently
being pursued in research, will bring the desired success.
Ion pumps also belong to the transporters that can carry ions across the mem-
brane against a concentration gradient with consumption of ATP. Recently crystal
structures have been elucidated for the first representatives of this protein class, the
so-called P-type pumps. Anchored in the membrane by multiple long transmembrane helices, these
pumps undergo complex conformational rearrangements to accomplish their task.
These systems are also points of attack of very well-known and successful drugs.
Digoxin exerts its effect on the sodium/potassium pump (▶ Sect. 6.1). The proton
pump inhibitors omeprazole and pantoprazole block the H+/K+ pump in the stom-
ach (▶ Sect. 9.5).
30.9 Membrane Passage in Bacteria: Pores, Carriers, and Channel Formers
opposite sides of the pore. This orientation of charged groups also contributes to the
selection of molecules that can pass through the pore.
Bacteria also synthesize small peptide-like systems that penetrate the mem-
branes of other organisms and in doing so also offer a possibility for the passage
of, for example, ions. These systems are termed transport antibiotics. They render
the membrane permeable in different ways. The antibiotic gramicidin A is an
oligopeptide made of 15 amino acids that have alternating L and D configurations.
The peptide forms a tube-shaped helical structure and traverses the membrane as
a dimer (Fig. 30.17). This creates a channel with a diameter of 4 Å in the interior. It
is highly permeable for monovalent cations such as Na+ and K+. On the other hand,
multivalent cations and anions are prevented from entering. Up to 10⁷ cations per
second can pass through this channel, a transport rate that is only a factor of
10 below the diffusion rate in water. The cations must shed their hydration shells.
Then they apparently slide through the opening along the amide bonds that are
oriented parallel along the channel’s axis. The side chains of the hydrophobic
amino acids orient in the surrounding lipid membrane. The depsipeptide
valinomycin follows an entirely different mode of action. It is made up of valine,
lactate, and hydroxyisovalerate residues. It encapsulates the potassium ion with its
polar groups, which are oriented toward the interior. It presents its hydrophobic
groups to the outside. When wrapped into such a chelate–ligand complex, charged
ions can pass through the membrane barrier inside the covered, hydrophobic
particle. In addition to valinomycin, other such carriers are known, for instance,
nonactin (Fig. 30.18). These transport antibiotics alter the ion permeability of the
bacterial cell membranes and intracellular compartments. As a consequence, they
can cause bacterial cells to die. Valinomycin accumulates in, for example, the
mitochondrial membranes, increases the potassium influx, and in doing so disrupts
the mitochondrial energy homeostasis and ATP synthesis. The transport antibiotics
have importance as combination pharmaceuticals for external use, for instance, to
treat oropharyngeal infections.
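To put the transport rate of gramicidin A into perspective, the ion flux can be converted into an electrical current. The following is only a back-of-the-envelope sketch: it reads the quoted rate as 10⁷ monovalent cations per second, the function name is illustrative, and the elementary charge is the only constant not taken from the text.

```python
# Elementary charge in coulombs (exact SI value; not stated in the text).
ELEMENTARY_CHARGE_C = 1.602176634e-19


def single_channel_current_pA(ions_per_second: float, valence: int = 1) -> float:
    """Convert an ion flux through one channel into a current in picoamperes."""
    current_ampere = ions_per_second * valence * ELEMENTARY_CHARGE_C
    return current_ampere * 1e12  # 1 A = 1e12 pA


# Assumption: 1e7 monovalent cations (Na+ or K+) per second through one dimer.
gramicidin_current = single_channel_current_pA(1e7)
print(f"{gramicidin_current:.2f} pA")
```

Under these assumptions a single gramicidin channel carries a current on the order of one to two picoamperes.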
Recently the lipopeptide daptomycin has been introduced into therapy to fight
Gram-positive bacteria. The cyclic peptide penetrates the bacterial cell membrane
with its hydrophobic side chain. It forms channels for ions by oligomerization. This
causes the cell membrane to be permeable for potassium ions. Their efflux leads to
depolarization and finally to bacterial cell death. Peptides with 20–25 amino acids
such as magainin (Locilex®) use an analogous mechanism to form amphipathic
helices in the membrane.
Fig. 30.18 Nonactin is a chelating ligand that coordinates potassium ions. It wraps
optimally around the ion and can penetrate the membrane as a chelate complex. This transport
antibiotic presents hydrophobic side chains on its exterior.
30.10 Aquaporins Regulate the Cellular Water Inventory
The cellular lipid double layer represents a barrier for water molecules. Despite an
osmotic gradient across the membrane, simple diffusion does not occur. Therefore,
larger amounts of water molecules cannot cross the membrane actively or passively, or in
association with other particles. In 1992, the group of Peter Agre in
Baltimore, MD, discovered a 28-kDa protein in the erythrocyte membrane that turned
out to be a water pore. It serves only for water transfer; neither ions nor other small
molecules such as glycerol or urea can pass through it. The direction of the water flow
is determined by the osmotic pressure alone. This first aquaporin, discovered in
erythrocytes, was termed AQP1. In the meantime, over 100 aquaporins have been
discovered in all possible organisms. Humans alone have over ten isoforms, seven of
which are used in the kidney at different sites. Some porins are specialized exclusively
for water; others, despite their similar architecture, also allow the transfer of
small molecules such as glycerol and urea. The discovery of aquaporins has
revolutionized our understanding of the regulation of water homeostasis. Peter
Agre was awarded the Nobel Prize in 2003 for this achievement.
Sequence analyses of the aquaporins indicate an architecture constructed from
two almost identical segments. Each half contains a highly conserved Asn–Pro–Ala
(NPA) motif. The functional aquaporin unit is a tetramer in which each
monomeric unit encloses a pore. As the crystal structure determination shows,
each pore is made up of six transmembrane helices. The channel extends like
a hose through the protein and widens on the extracellular and cytosolic sides to a
15 Å funnel-shaped vestibule (Fig. 30.19b). At its mid-point it narrows to a diameter
of 2.8 Å. The vestibules have many polar, but mostly uncharged amino acids.
A chain of accessible carbonyl oxygen atoms stretches along the wall of the pore;
these atoms are presumably involved in passing the transitory water molecules along
(Fig. 30.19a). The opposite wall is made up of hydrophobic residues. Both impart
amphipathic character to the hose-shaped selectivity filter.
Fig. 30.19 An aquaporin widens like a funnel on the extracellular and cytosolic sides (b). At the
narrowest position, the pore reduces to about 2.8 Å. At this site, positively charged His and Arg
residues are opposite one another and this prevents ion passage. As if on a string, the water
molecules migrate through the channel as they are passed along by hydrogen bonds to the carbonyl
oxygen atoms (a). The carbonyl groups are on one side of the channel; the opposite wall is made up
of hydrophobic amino acids. A cysteine residue that can complex mercury ions and clog the channel
is found in the vicinity of the isthmus. This explains the diuretic effects of mercury salts.
(Labeled residues in the figure: Arg197, His182, Cys191.)
The geometry of the carbonyl groups that are arranged toward the interior is reminiscent of the
selectivity filter in the potassium channel. Because they are only found on one side of the
channel, they cannot completely replace the hydration shell around a cation.
A cation that wanders into the pore is therefore too large to pass through the
pore. A histidine and an arginine are found at the smallest isthmus. A phenylalanine
is found on the opposite side. These three amino acids are highly conserved among
the porins that specialize in water permeability. Because of the charge on His and
Arg, they cause a further sieving for positively charged ions, even H3O+. Nega-
tively charged ions are so strongly repelled by the many negatively polarized
carbonyl groups that their passage is energetically much too unfavorable. The
channels that allow glycerol to pass in addition to water are about 1 Å wider
at their narrowest position. Simultaneously, the histidine, which is
conserved in exclusive water channels, is replaced with a glycine. Altogether, the
glycerol-permeable channel has a somewhat more hydrophobic character.
Aquaporins occur virtually ubiquitously in our bodies, though in larger numbers
and diversity in the kidney. To achieve quick control over their function, they are
partly stored in vesicles. When needed, the vesicles fuse with the cell membrane.
In this way, the number of active aquaporins is increased. The water channels
represent an outstanding target structure for therapeutic intervention. In addition to
the development of diuretics, their use for the treatment of glaucoma and obesity and for
fighting angiogenesis in tumors has been discussed. They have also moved into the
focus of research as a target for the development of drugs to treat parasitic
infections. Interestingly, mercury salts were used as diuretics a long time ago.
The thiol group of an accessible cysteine residue is found in the upper pore region
of AQP1 (Fig. 30.19b). Presumably, the mercury ion blocks the pore by coordina-
tion to this cysteine. Because of their toxicity, mercury salts are certainly not drugs
of choice. Time will tell whether research can find potent and selective alternatives
that can intervene in the targeted regulation of aquaporins to treat diseases that are
associated with their misregulation.
30.11 Synopsis
principle for the treatment of type-II diabetes mellitus because blocking the
regulatory unit results in enhanced insulin secretion from the β-cells.
• Depolarization and repolarization of the heart muscle cells are important for
correct control of the heart beat frequency. Drug molecules with a particular
pattern of aromatic moieties and a central basic nitrogen can block the hERG
channel, a potassium channel involved in the regulation of heart beat. A fatal
arrhythmia can occur. Therefore, potential binding to the hERG channel as an
anti-target is avoided in the early phase of drug discovery.
• Ligand-gated ion channels are huge transmembrane constructions of pentameric
architecture with 20 transmembrane helices. The extracellular ligand–binding
domains, also of pentameric geometry, accommodate binding sites of agonists
and antagonists and transmit the signal of ligand binding through a cascade of
conformational changes to the isthmus of the channel pore. The pore widens by
3 Å by concerted rotations of the five innermost helices; this allows passage of
sodium ions with their hydration shell.
• The ligand-binding domains of the pentameric ion channels can be addressed in
the case of the AChR with agonists such as nicotine or antagonists such as
α-conotoxin. Allosteric regulators are known for the GABA-gated chloride
channel of similar construction. They can amplify the effect of the endogenous
ligand GABA which blocks the excitability of the cells by opening the chloride
channel.
• Benzodiazepines bind to the α-subunit of the GABA-gated chloride channel and,
depending on their substitution patterns, act as sedatives, hypnotics, anxiolytics,
anticonvulsives, muscle relaxants, or anterograde amnestics.
• The voltage-gated ClC chloride channels orient two extended helices with their
N-terminal ends toward the center of the channel. Together with a conserved
glutamate residue at the apex, they achieve the required selectivity, possibly via
an intermediate change of protonation state of a glutamate residue taking the role
of a gatekeeper.
• Transporters shuttle endo- and exogenous compounds across the cell membrane.
Transport is usually coupled with the energetically favorable hydrolysis of ATP
to allow membrane passage against a concentration gradient. Particularly the
human MDR-ABC transporter P-glycoprotein GP170 is upregulated in drug
resistance and removes a broad palette of drug molecules from the cell.
• Bacteria have developed special transporter systems to either allow access to
cells, or to penetrate the membrane of other organisms. One class of pores is
formed by large β barrels of parallel-oriented strands that open a passage to the
interior. Other systems either wrap around cations to form hydrophobic carriers
on their exteriors, or penetrate into membranes with helix-forming elements to
build up channels that make them permeable for ions.
• Despite an osmotic gradient, water molecules cannot diffuse passively across the
membrane. The regulation of water homeostasis is performed by aquaporins,
which are channels that extend like a hose through the membrane-bound protein.
They have a 15-Å-wide funnel-shaped vestibule on both sides and narrow to
a diameter of 2.8 Å in the center. A chain of accessible carbonyl oxygen
atoms stretches along one side of the pore and passes the transitory water
molecules along. The opposite wall is made up of hydrophobic residues. At
the isthmus, charged His and Arg residues prevent permeation of cations. To
achieve quick control over their function, aquaporins are partly stored in vesicles
and fused with the cell membrane as needed.
31 Ligands for Surface Receptors
which is easily accessible to possible active substances, they interact with the
extracellular matrix and mediate cell adhesion. This property could already be
used to reconstruct the contact between bones or bone implants and the surrounding
tissue. An improved regression of the tissue around bones can be achieved by the
adhesion of the extracellular domains of integrin receptors or the fixation of ligands
that stimulate these receptors.
The integrins are found in almost all types of cells in mammals. The family of
integrins is divided into numerous subtypes, and multiple subtypes can be simulta-
neously expressed on the same cell. They react quickly to external signals, that is, in
less than a second. They have the complex structural constitution of a heterodimeric
membrane protein; α and β subunits are distinguished, each of which is composed of
multiple domains. Some subtypes display an additional insertion domain. Several
divalent calcium and magnesium ions that form the so-called metal-ion-dependent
adhesion site (MIDAS) are essential for the function of integrin receptors. To date,
18 α- and 8 β-subunits have been characterized in humans. They can be combined to
form heterodimers with different compositions. Until now, 24 different combinations
of these subunits have been identified in integrin receptors. The nomenclature for the
receptors follows this convention: they are termed αxβy receptors,
whereby x is expressed as a Roman numeral, and y is an Arabic number.
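The subunit counts given above imply that only a small part of the theoretically possible heterodimers is actually realized. The following minimal sketch makes this explicit; the three numbers are taken from the text, whereas the subunit labels are mere placeholders and do not correspond to the real subunit names.

```python
from itertools import product

N_ALPHA = 18     # α-subunits characterized in humans (from the text)
N_BETA = 8       # β-subunits characterized in humans (from the text)
N_OBSERVED = 24  # heterodimer combinations found so far (from the text)

# Placeholder labels only; real integrin subunits carry other names.
alpha_labels = [f"alpha_{i}" for i in range(1, N_ALPHA + 1)]
beta_labels = [f"beta_{j}" for j in range(1, N_BETA + 1)]

# Every α-subunit could in principle pair with every β-subunit.
all_pairs = list(product(alpha_labels, beta_labels))
n_possible = len(all_pairs)
fraction_observed = N_OBSERVED / n_possible

print(f"possible pairs: {n_possible}, observed fraction: {fraction_observed:.1%}")
```

Only about one sixth of the conceivable pairings has been found, which indicates that the subunits do not combine freely.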
Signal processing occurs via a complex scheme of multiple sequential confor-
mational transformations. The completed transformation is reminiscent of the
opening of a pocketknife (Fig. 31.1). The folded receptor geometry initially goes
into a twisted geometry as the knife blade and the handle spread apart, and then
goes into an open horseshoe-like form. This geometry is presented when the receptor
is in an active state. The extracellular domain of the activated receptor is available for
interactions with other proteins. The binding takes place via a so-called β-propeller-like
domain and an insertion domain (I-like domain, Fig. 31.1), which is transferred
to the active state by the conformational changes outlined. At the same time, the
active conformation makes the MIDAS binding site available. The described struc-
tural considerations are based on crystal structure determinations made on the
individual domains of the receptor. Assembling these individual building blocks
allows an overview of the composition of the total construction. Nonetheless, this
approach cannot provide a more accurate picture of the individual intermediate
conformations that the receptor goes through during its activation.
The construction and function of the αIIbβ3 receptor shall be considered in
detail as an example. Fibrinogen receptor antagonists could be successfully developed
for this receptor and introduced into therapy. The αIIbβ3 receptor plays an
important role in the coagulation cascade. It occurs on the surface of platelets
(thrombocytes). In the resting state, about 50,000–70,000 inactive copies of this
receptor are available. If an injury occurs that stimulates blood coagulation, an
additional 50,000 receptors are transferred from the interior to the surface and are
conformationally activated. The receptor can now bind ligands that contain
a specific motif: an Arg–Gly–Asp (RGD) sequence. Fibrinogen, a dimeric
soluble plasma protein, contains such a motif and reacts with the αIIbβ3 integrin
receptor on the surface of the activated platelet.
Fig. 31.1 Integrin receptors have a complex structural construction consisting of a membrane-bound
heterodimer made from α- and β-subunits. Each subunit is constructed from multiple
domains, and some subtypes show an additional insertion domain (I-domain). The signal processing
occurs by a complex scheme of sequentially occurring conformational modifications that progress
from an inactive folded structure to an active horseshoe form. Multiple divalent calcium and
magnesium ions that form a metal-ion-dependent adhesion site (MIDAS) are essential for the
function. The receptor–ligand-binding region is on the β-propeller domain and an I-like domain.
(Labels in the figure: receptor binding site, MIDAS binding site, β-propeller, I-domain, I-like,
thigh, calf-1, calf-2, hybrid, PSI, EGF1+2, EGF3, EGF4, β-tail; α and β subunits; inactive and
active states.)
This crosslinking leads to
aggregation of the platelets and initiates the formation of a thrombus for the wound
closure, a so-called primary or cellular hemostasis. Via a second docking site on the
platelet, the forming blood clot binds to the Von Willebrand factor, which is
produced by endothelial cells. A permanent connection is then created between
the aggregating blood platelet and the vascular wall by this contact.
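The recognition motif Arg–Gly–Asp corresponds to the string RGD in one-letter amino acid code, so its position in a peptide sequence can be located with a simple text search. The following is a minimal sketch; the function name is illustrative, and the longer example sequence is purely hypothetical and does not represent fibrinogen or any real protein.

```python
def find_motif(sequence: str, motif: str = "RGD") -> list[int]:
    """Return the 1-based start positions of all occurrences of the motif."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = sequence.find(motif, start + 1)
    return positions


# The tetrapeptide Arg-Gly-Asp-Phe in one-letter code.
tetrapeptide_hits = find_motif("RGDF")

# Hypothetical toy sequence, invented only to show multiple matches.
toy_hits = find_motif("ACKRGDMPLKRGDS")

print(tetrapeptide_hits, toy_hits)
```

Such a search only identifies candidate sequences. Whether the motif is actually accessible and presented in a conformation suitable for binding depends on the three-dimensional structure.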
A blockade of the surface receptors on the blood platelet leads to an arrest in the
coagulation process. Because this process is broadly needed over the entire organism,
internal bleeding could be the consequence. A snake, the common saw-scaled viper
or carpet viper (Echis carinatus), uses this active principle in its venom to subdue its
prey. Because this snake is often found in the vicinity of human settlements in Africa and
Asia, its bite has already been fatal for some members of our species as well. It
uses a 49-residue peptide as venom that has an RGD sequence in its center. A drug
following this inhibitory principle is desirable to achieve a local anticoagulatory
effect. This is of interest in the context of angina pectoris, myocardial infarction,
stroke, atherosclerosis, or in emergency medicine to prevent ischemic complications.
31.2 Successful Design of Peptidomimetic Fibrinogen Receptor Antagonists
As described in the last section, antagonists of the αIIbβ3 integrin receptor, found on
the surface of thrombocytes, represent a rewarding point of attack for the
[Chemical structures: 31.1; 31.2; 31.3; 31.4 eptifibatide; 31.5 (Ki = 2.3 nM); 31.6 lotrafiban (Ki = 2.3 nM)]
Fig. 31.2 The bound conformation of the RGD motif of the natural ligand fibrinogen to the
αIIb/β3 integrin receptor subunits could be determined by using structurally rigid cyclopeptides.
They served as the first lead structures for the development of non-peptidic receptor antagonists
such as the benzodiazepines 31.5 and 31.6. The cyclopeptide eptifibatide 31.4 was introduced to
therapy as a drug.
[Binding-mode diagram of 31.2; labels: β-propeller domain, MIDAS domain, Asp224, Asp232, Tyr122, Ca2+]
Fig. 31.3 Crystal structure of the αIIb/β3 integrin receptor with the cyclopeptide 31.2. The
structure confirms the assumption that the peptide is in a β-turn conformation at the receptor.
The peptide’s RGD motif binds in an extended geometry with its arginine residue between two
aspartic acids in the propeller domain, and with the aspartic acid residue to the metal ions in the
MIDAS binding site.
[Chemical structures: 31.7; 31.8; 31.9 SC-52012; 31.10; 31.11 xemilofiban; 31.12 sibrafiban; 31.13 tirofiban]
Fig. 31.4 By starting with the linear peptide Arg–Gly–Asp–Phe 31.7, xemilofiban 31.11 was
obtained by stepwise modification. The ethynyl group instead of the pyridine ring does not change
the binding affinity, but it significantly increases the bioavailability. A similar development
candidate, sibrafiban 31.12, was tested but was not pursued to a marketed product because
of bleeding problems. Tirofiban 31.13 was introduced to the market for the emergency prevention
of ischemic complications due to a thrombus in the course of a stroke or heart attack.
for which a clinical trial as an i.v. application was undertaken. The goal of the work
was no longer a further increase in the binding affinity, but rather an improvement
in the bioavailability. For this, derivatives with reduced molecular weight were
preferentially investigated. It was shown that the C-terminal amino acid,
phenylalanine, could be replaced with a simple pyridine ring without a massive loss
in affinity. By additionally esterifying the carboxylate group, the Searle research
group arrived at a compound with weak oral activity. Compound 31.10 is a prodrug
that is quickly transformed in the body by esterases to the free carboxylate, which is
the actual active substance (IC50 = 0.15 μM for the free carboxylate). Finally,
aminobenzamidino succinates were investigated. Here the idea was to increase the
affinity by forming an additional H-bond to the receptor by reintroducing an amide
group. Indeed, 31.11 is a highly potent fibrinogen-receptor antagonist (IC50 =
0.067 μM for the free acid). The compound is well absorbed after oral administration.
Searle introduced xemilofiban 31.11, as the compound was later named, into
clinical studies, which were, however, discontinued in phase III.
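The potency gain from 31.10 to 31.11 is easier to judge on a logarithmic scale. The following minimal sketch converts the two IC50 values into pIC50 values and a fold change. It assumes that both values are to be read in micromolar units; the function and variable names are illustrative.

```python
import math


def pic50(ic50_molar: float) -> float:
    """Negative decadic logarithm of an IC50 value given in mol/L."""
    return -math.log10(ic50_molar)


MICROMOLAR = 1e-6  # conversion factor from μM to mol/L

# Values from the text, read as micromolar (assumption).
ic50_31_10 = 0.15 * MICROMOLAR   # free carboxylate of prodrug 31.10
ic50_31_11 = 0.067 * MICROMOLAR  # free acid of xemilofiban 31.11

fold_improvement = ic50_31_10 / ic50_31_11
delta_pic50 = pic50(ic50_31_11) - pic50(ic50_31_10)

print(f"pIC50 31.10: {pic50(ic50_31_10):.2f}")
print(f"pIC50 31.11: {pic50(ic50_31_11):.2f}")
print(f"fold improvement: {fold_improvement:.1f}")
```

Under this reading, reintroducing the amide group roughly doubles the potency, which corresponds to a gain of about a third of a log unit.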
The work at Roche had led to the comparable development candidate
sibrafiban 31.12. A double prodrug came into the clinical trials. The company
undertook a broad study on 9,000 high-risk patients with this compound. At low
doses, an effect comparable to ASA (Aspirin ®) was found. At higher doses,
bleeding problems increased significantly. The development of this compound
was therefore abandoned. Despite many clinical studies with a large number of
development candidates, only Merck introduced a non-peptidic receptor antag-
onist, tirofiban 31.13 (Aggrastat ®), for emergency medicine to prevent ischemic
complications associated with a thrombus as a result of a stroke or heart attack.
According to the established RGD pharmacophore pattern of a basic group,
a bridge, and an acidic group, 31.13 was formed as an inhibitor with an IC50 =
375 nM (Fig. 31.5) by replacing the benzamidino group with a piperidine ring,
and by abandoning the amide group in the bridge between the basic group and the
acid function. Because it has inadequate oral availability, it is administered
intravenously. Time will tell whether fibrinogen-receptor antagonists will
achieve importance in the therapy of thrombotic diseases over and above their
use in emergency medicine.
31.3 Selectins: Surface Receptors Recognizing Carbohydrates
Leukocytes, white blood cells, are transported through the body with the blood
flow. Their principal task is to defend against pathogens during inflammatory
processes. To achieve this task, they initially must be stopped in the normal
blood flow in the vessel that runs alongside the site of inflammation (Fig. 31.6).
This deceleration is made apparent in a type of leukocyte rolling adhesion. Surface
receptors that are also found on the rolling leukocyte are involved in stopping the
leukocytes. On the other hand, in cases of inflammation and in the vicinity of
the actual site, selectins are increasingly expressed on the cell surface of the
endothelium. Temporary contacts that are weak but very selective sugar–protein
interactions are responsible for deceleration. Finally the leukocyte is stopped
completely. Integrins on the leukocytes, which interact with
intercellular adhesion molecules (ICAMs) on the endothelium, are responsible for this. In the last step,
the leukocytes leave the blood vessel (extravasation). After their migration to
the site of inflammation, they fight the infection by releasing cytokines and
degrading substances. The latter attack the inflammation site both oxidatively and
proteolytically.
Fig. 31.5 Superposition of the crystal structures of eptifibatide 31.4 with tirofiban 31.13 and the
αIIb/β3 integrin receptor. Peptidic as well as non-peptidic marketed products bind on one side to the
aspartic acids of the propeller domain and on the opposite side to the metal ions of the MIDAS
binding site. The example demonstrates how amino-acid residues can be replaced by other,
non-peptidic groups. (Labels in the figure: Asp224, Tyr122, Ca2+, Mg2+, 31.13 tirofiban,
31.4 eptifibatide.)
Some inflammatory processes lead to vascular damage by excessive leukocyte
infiltration, for instance in conjunction with a heart attack (reperfusion), by chronic
irritation as in rheumatoid arthritis, in atherosclerosis, diabetic angiopathy, or during
a carcinoma metastasis. In such situations, a therapeutic concept that intervenes in the
inflammatory cascade lends itself well to reducing excessive leukocyte infiltration.
This can be achieved by binding an antagonist to the selectins.
The selectins belong to the large group of lectins, a family of complex glyco-
proteins. They form interactions to carbohydrate structures and are able to achieve
the anchoring between cells and/or cell membranes through these contacts. The
selectins are a subgroup of these glycoproteins. They are classified as E-, L-, and
P-selectins. Structurally, they are related to each other and differ in the number of
particular repeat sequences (short consensus repeats). In addition to a C-terminal
cytoplasmic part, they have a transmembrane domain. The binding site for carbohydrate
molecules is found on a lectin domain at the N terminus. The structure of
such selectin domains is shown in Fig. 31.7.
[Fig. 31.6, panels a–d; legend: 1 rolling, 2 inflammation, 3 selectin receptors, 4 integrin receptors, 5 fixation, 6 penetration, 7 degradation; labels: leukocyte, ligands, blood vessel, P- and E-selectins, endothelial cell, ICAMs]
Fig. 31.6 (a) Leukocytes are transported through the body in blood vessels with the blood flow
(1). If the vessel passes a site of inflammation (2), the leukocytes are slowed down relative to the
normal blood flow. (b) They change their rolling behavior by interactions with selectin receptors (3),
which are increasingly expressed on the endothelium in the vicinity of the actual site. Integrin
receptors on the surface of the leukocytes are activated (4). (c) The leukocytes are fully fixed because
of integrin receptor binding to intercellular adhesion molecules (ICAMs; 5). (d) Leukocytes leave
the vasculature (6) and migrate into the neighboring tissue to the site of inflammation, which
they fight by releasing cytokines and degrading substances such as oxidants and proteases (7).
[Fig. 31.7 graphic, panels a and b. Structure of 31.14 sialyl-LewisX with its units sialic acid, galactose, N-acetylglucosamine (attached to the oligosaccharide), and fucose; bound Ca2+ ion; labeled protein residues: Tyr48, Glu80, Asn82, Asn83, Glu92, Tyr94, Ser99, Asn105, Asp106, Glu107.]
Fig. 31.7 The crystallographically determined binding mode of sialyl–LewisX 31.14, the exposed
binding epitope of the PSGL-1 protein, on the selectin surface domain (a). The four carbohydrate
moieties, N-acetylglucosamine (violet), fucose (green), galactose (blue), and sialic acid (red), form
numerous hydrogen bonds through their oxygen atoms to the protein in a shallow, bowl-shaped
binding pocket (b). A calcium ion (violet sphere) is involved in the binding and interacts with
multiple protein residues as well as with the ligand’s fucose moiety.
31.15 (Fig. 31.8) with an IC50 = 500 µM resulted. The affinity was further improved
by a factor of 5 by adding a second, structurally similar group to give 31.16.
Another path was forged at Revotar Biopharmaceuticals. Compound 31.15 was
used as a reference substance. Smaller, multiply hydroxylated aromatic rings were
[Fig. 31.8 graphic: chemical structures. Labeled units of the parent tetrasaccharide: sialic acid, galactose, N-acetylglucosamine, fucose.]
Fig. 31.8 By starting from sialyl–LewisX 31.14, a micromolar lead structure (31.15) was
developed by exchanging fucose for mannose and adding a hydrophobic linker with a terminal
oxygen group. The addition of a second, structurally analogous building block to give
bimosiamose 31.16 improved its affinity. By starting from a pyrogallol scaffold, sugar-dissimilar
structures with submicromolar affinity were obtained (31.17, 31.18).
Because viruses do not have their own metabolic and reproductive machinery, they
are forced to hijack a host cell for these tasks. They do, however, contain the
program and information for their reproduction archived in the form of their own
DNA or RNA. To gain entrance into a host cell, they must dock onto this cell, and
their envelope must merge with the host cell membrane. Let us discuss an example.
The HI virus fuses with T lymphocytes and in so doing, initiates an AIDS infection
(Fig. 31.9). The virus has a diameter of about 120 nm (1,200 Å). More than
70 glycoproteins are embedded in its membrane envelope. Each of these surface
proteins consists of so-called gp120 and gp41 subunits that arrange themselves as
trimers. The gp41 unit sticks up like a sewing pin in the membrane envelope, whereas
the gp120 unit is a nearly spherical external head to this pin. Both subunits have been
structurally and biologically characterized. The gp120 protein, which is constructed
from pleated sheets and helices, acts as a mooring anchor for the virus. It binds to the
CD4 receptor on the surface of the T lymphocytes. A conformational change in
the gp120 protein then occurs. This initiates the subsequent interaction with the
CCR5 or CXCR4 co-receptor, which is found in the vicinity. The binding to these
chemokine receptors causes another conformational change in the sewing-pin-like
“warhead” on the envelope of the virus. The monomers that make up the trimeric helix
bundle and form the gp41 subunit are each composed of three segments, the HR1,
HR2, and FP domains. The virus penetrates the membrane of the host cell with the FP
domains. The bundle of the three HR1 domains makes three grooves on its surface
available that are optimally suited to accept the HR2 domains (Figs. 31.9 and 31.10).
For this, they must adopt a helical geometry. The three initially extended and parallel-
oriented HR1 and HR2 peptide chains “zip” together and form a compact bundle of six
helices. This zipping together causes the membranes of the virus and host cell to be
pulled together. The fusion of the envelopes is thereby initiated.
Can the fusion process be blocked so that the beginning of the infection process
can be stopped? The tightly packed bundle of HR1 helices makes a groove available
on the surface to accommodate the HR2 peptide with its helical construction.
Therefore, peptides that mimic the sequence of the HR2 domain were synthesized at
Duke University in Durham, North Carolina, USA. One of these peptides, DP178, was
discovered in 1996 at the subsequently founded company Trimeris. Like the HR2
peptide, this 36-residue peptide is able to dock in the available groove
on the HR1 peptide and block gp41 from zipping. The lead structure was further
developed in cooperation with Roche to the drug enfuvirtide, a peptide made up of
31.4 Fusion Inhibitors Impede Viral Invasion
[Fig. 31.9 graphic, panels a–d. Labels: virus, host cell, gp120, gp41, HR1, HR2, CD4.]
Fig. 31.9 An AIDS infection is initiated by an attack of the HI virus (orange) on T lymphocytes (gray; a). For this purpose, it uses a trimer of its surface proteins, each containing
a gp120 (violet) and a gp41 subunit (red/green). The gp120 protein binds to the endogenous CD4 receptor (blue). A conformational change in
the gp120 protein takes place (b). This allows an interaction with the CCR5 or CXCR4 co-receptors (yellow), which are in the vicinity of the CD4 receptor, to
form (c). Both co-receptors belong to the GPCR class. Upon binding to these chemokine receptors, the sewing-pin-like “warhead” gp41, which consists of three
segments in a helix bundle (red/green), undergoes a conformational change. The virus penetrates the membrane of the host cell with this helix bundle, and the
fusion process is initiated (d). Finally, the initially extended peptide chains assemble and compress themselves into a tight bundle of six helices. This brings the
virus and the host cell even closer together (d, inset).
790 31 Ligands for Surface Receptors
Fig. 31.10 The bundle of three HR1 domains (green) makes three grooves on its surface
available that are optimally suited to accept the HR2 domains (red) once these have transformed
to a helical geometry. Three initially extended and parallel-oriented peptide chains fold together
and form a tight bundle of six helices. This tying together pulls the membranes of the virus and the
host cell together.
Fig. 31.11 The HR2 peptide strand (red) interacts through the three hydrophobic amino acids
Trp628, Trp631, and Ile625 with the bundle structure of the HR1 domain (green). In screening,
the multiply charged structures 31.19 and 31.20 were discovered as mimics that can block the
bundle-type packing of the helices. Maraviroc 31.21 antagonizes a chemokine receptor that is
involved in initiating the fusion process between the HI virus and T lymphocytes (Fig. 31.9).
[Fig. 31.12 graphic. Labels: M2 protein, neuraminidase, hemagglutinin, matrix proteins; structures of two ammonium chloride salts (H3C, NH3+, Cl−).]
Fig. 31.12 In addition to the docking protein hemagglutinin (blue), of which 15 subtypes are
known, the envelope of the influenza virus contains a neuraminidase (red), which has nine
variations (N1–N9), and the M2 proton channel protein (green). This pore can be blocked by
amantadine 31.22 and rimantadine 31.23. Upon maturation and budding of a newly formed virus,
the glycosidase activity of neuraminidase is needed to detach the virus from the host cell (gray) in the last step
by cleaving a sugar chain (green).
infected host organism develops its own antibodies, or the immune system is stimu-
lated to produce such antibodies by a flu vaccine, these can still be adequately
recognized, even with small modifications in the capsid proteins, and rendered
harmless. Antigenic shift is much more dangerous; it is caused by the exchange of
genetic information between virus species or subtypes. It can occur especially between
species, that is, upon transmission from animal to human. Because this route produces
new combinations of surface proteins, it is difficult for the immune system to
build an adequately high antibody titer fast enough to render such a modified virus
harmless. Such antigenic shifts can lead to pandemics. As mentioned, they usually
originate in regions where numerous species such as ducks, chickens, pigs, cats, dogs,
and humans live together in close quarters. Because of the lifestyle, high population
density, and traditional animal husbandry practices in which animals and humans
live under the same roof, regions such as Southern China in East Asia, or Mexico, have
repeatedly proven to be incubators for such genetically varied virus forms. There
have been many pandemics in the past. The most serious one worldwide was
certainly the so-called Spanish flu of 1918, which claimed at least 25 million lives.
The influenza viruses of this pandemic belonged to a particularly virulent subtype, H1N1. In
1957 a pandemic occurred with the H2N2 subtype, and in 1968 it was the H3N2
combination that was particularly dangerous. The last pandemic warning from the
WHO was issued in the fall of 2009 after a renewed H1N1 variant (cf. 1918) in the
form of the so-called swine flu emerged in Mexico. One year later, we
knew better. This variant proved to be not nearly as dangerous in its observed form
as was initially expected. The current preventive therapy for the flu is a vaccination.
The vaccine contains parts of the surface proteins hemagglutinin and neuraminidase,
or also matrix proteins as antigens, and it stimulates the immune system to produce
antibodies. The production of a new vaccine takes some time and represents a great
financial effort. Therefore an attempt is made to estimate which viral subtypes might
be involved before a flu wave strikes. Viral envelope proteins are isolated from these
subtypes, and a vaccine is developed for the next vaccination campaign based on
these proteins. It was just this step that was initiated in the summer of 2009 to
prepare a vaccine against the swine-flu-type H1N1 for the more heavily populated
northern hemisphere in time for winter. The population was also simultaneously
asked not to neglect the vaccines from virus strains from previous years and to obtain
adequate protection from such a vaccination.
Three surface proteins can be targeted for a defensive therapy with
low-molecular-weight compounds. The drugs amantadine 31.22 and rimantadine
31.23 (Fig. 31.12), which block the M2 proton channel protein, are already rather
old. The target protein is a pore that is permeable to protons. It is opened and regulated in
a pH-dependent manner by a ring of four histidine residues. If the four histidine
residues are deprotonated, the pore is closed due to a network of H-bonds between
the histidines. If the histidines are protonated and exist in a charged state, their
spatial orientation changes and the H-bond network is disrupted. As a consequence,
the channel of the M2 protein is opened. The two ligands 31.22 and 31.23 are not
very specific and do not allow an efficient therapy. Furthermore, multiple resistance
mutations have been observed that abolish the action of these drug molecules.
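
The pH-dependent gating described above can be made tangible with the Henderson–Hasselbalch relation. The following is only a minimal sketch, not a model of the real channel: it assumes four independent histidine side chains that share one generic pKa of 6.0. This value is an illustrative assumption; the actual pKa values of the histidines in the M2 pore differ from one another and are not given in the text. The function names are hypothetical.

```python
from math import comb

# Assumption: one generic side-chain pKa shared by all four histidines.
# The real pKa values inside the M2 pore are not given in the text.
ASSUMED_PKA = 6.0


def fraction_protonated(ph: float, pka: float = ASSUMED_PKA) -> float:
    """Henderson-Hasselbalch: fraction of a basic site carrying a proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def protonation_distribution(ph: float, n_sites: int = 4,
                             pka: float = ASSUMED_PKA) -> list[float]:
    """Probability that exactly k of n independent sites are protonated."""
    p = fraction_protonated(ph, pka)
    return [comb(n_sites, k) * p**k * (1.0 - p) ** (n_sites - k)
            for k in range(n_sites + 1)]


# Compare a neutral, a mildly acidic, and a more acidic environment.
for ph in (7.4, 6.0, 5.0):
    dist = [round(x, 3) for x in protonation_distribution(ph)]
    print(f"pH {ph}: per-site fraction {fraction_protonated(ph):.3f}, "
          f"P(0..4 protonated) = {dist}")
```

Under these assumptions, lowering the pH shifts the population from the fully deprotonated ring (closed pore in the description above) toward multiply protonated, charged states (open pore). Treating the four sites as independent is a deliberate simplification, since neighboring charges influence one another in a real protein.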
[Fig. 31.13 graphic: reaction scheme of sialic acid cleavage in the neuraminidase active site, including the sialosyl cation 31.25. Labeled residues: Asp151, Glu277, Arg371, Tyr406.]
Fig. 31.13 Reaction mechanism of the cleavage of the glycosidic bond of a sialic acid residue. The residue is at
the end of a carbohydrate chain that couples the virus to the host cell. The sialic acid binds to viral
neuraminidase. The glycosidic bond to the remaining sugar chain is cleaved with assistance from
the two neighboring acidic amino acids, Glu277 and Asp151. A sialosyl cation is formed that is
temporarily stabilized by Tyr406. The sugar is released after transfer of an OH group to the trigonal
center. The stable stereoisomer is formed by ring opening and reformation of the cyclic sugar.
Fig. 31.14 The development of the neuraminidase inhibitors zanamivir 31.28 and oseltamivir
31.32. Compound 31.26 was developed as a stable structural analogue of the sialosyl cation 31.25.
By exchanging the OH group for an NH2 group, 31.27 is formed with Ki = 40 nM. The introduction
of a guanidinium group to form 31.28 brought a further improvement in the activity. Carbocyclic
analogues 31.29 and 31.30 were synthesized at Gilead Sciences and further optimized to 31.31 by
exchanging an OH for an NH2 group. To improve the bioavailability, an ester prodrug, 31.32, was
introduced into therapy. A depot form was developed with 31.33. Peramivir 31.34 is an
intravenously applicable neuraminidase inhibitor.
spatially neighboring Tyr406 residue. The crystal structure of the influenza neur-
aminidase with a sialic acid analogue 31.26 was determined in 1983 (Fig. 31.14). It
imitates the transition state of the enzymatic reaction. Compound 31.26 blocks the
protein with Ki = 4 µM. To discover further key positions for additional functional
groups in the binding pocket, the GRID program from Peter Goodford
(▶ Sect. 17.10) was used. This method suggested that a favorable position
for a large, positively charged group exists in the vicinity of the 4-OH group and the
neighboring Glu119 and Glu227 residues. Exchanging this OH group for an
aliphatic amino group led to 31.27, which forms a stronger hydrogen bond to
the protein. The binding affinity improved into the nanomolar range. If the amino
group is modified to a guanidino group as in 31.28, the two neighboring glutamate
residues can both interact with the ligand. Compound 31.28 binds to
the protein with Ki = 0.2 nM. The substance was clinically developed by
GlaxoSmithKline (GSK), and zanamivir 31.28 was marketed under the name
Relenza® in 1999. Because of its high polarity, the drug has poor oral bioavailability.
It can only be applied by inhalation. A special inhalation device had to be developed
for the administration of the drug. Nevertheless, the launch of an orally available drug
to the market was still desired. A different approach was taken at Gilead Sciences
with the goal of reducing the high polarity of zanamivir (Fig. 31.14). Initially the
[Fig. 31.15 graphic, panels a and b. Labeled residues: Glu119, Asp148, Arg149, Arg222, His274, Glu276, Glu277, Arg373.]
Fig. 31.15 The crystallographically determined binding geometries of zanamivir 31.28 (a) and
oseltamivir 31.32 (b) in neuraminidase. The acidic function of the inhibitors is anchored
by Arg115, Arg291, and Arg273. The opposite N-acetyl group interacts with Arg149.
Asp148 forms a layered geometry with the guanidinium group from zanamivir, whereas the
amino group found in the same position in oseltamivir forms a hydrogen bond to Asp148.
Compound 31.28 forms a hydrogen bond to Glu276 by using its glycerol groups, whereas
the more hydrophobic iso-pentylether group in oseltamivir induces a rearrangement of Glu276
to form a salt bridge to Arg222. Interestingly, upon exchange of His274 for a Tyr, resistance
to this drug occurs because the rearrangement that is needed for oseltamivir binding can no
longer occur.
central pyran ring of 31.26 was exchanged for a carbocycle (31.29) and the double
bond was relocated by one position. With this, 31.30 can better imitate the transition
state of the reaction. Next, the glycerol function was exchanged for a more hydro-
phobic iso-pentylether group in 31.31. To improve the bioavailability of 31.31
a prodrug strategy was chosen. By esterifying the free acid function, a new orally
available inhibitor, oseltamivir 31.32, resulted. Roche licensed this compound in
1999 and introduced it to the market as Tamiflu® (Fig. 31.15).
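
To put the affinity gains along the zanamivir series into thermodynamic terms, the inhibition constants can be converted into standard binding free energies with ΔG° = RT ln Ki. The sketch below is an illustration under stated assumptions: the Ki values are read as 4 µM (31.26), 40 nM (31.27), and 0.2 nM (31.28); the temperature of 298.15 K and the 1 M standard state are assumptions not stated in the text; and Ki is treated as a dissociation constant. The function names are hypothetical.

```python
import math

R = 8.314462618      # molar gas constant in J/(mol K)
ASSUMED_T = 298.15   # assumption: 25 degrees Celsius, not stated in the text


def binding_free_energy_kj(ki_molar: float,
                           temperature: float = ASSUMED_T) -> float:
    """Standard binding free energy in kJ/mol from Ki (1 M standard state)."""
    return R * temperature * math.log(ki_molar) / 1000.0


def fold_improvement(ki_before: float, ki_after: float) -> float:
    """How many times tighter the later compound binds."""
    return ki_before / ki_after


# Ki values in mol/L as read from the text (see the assumptions above).
series = [("31.26", 4e-6), ("31.27", 40e-9), ("31.28 zanamivir", 0.2e-9)]

previous = None
for name, ki in series:
    line = f"{name}: Ki = {ki:.1e} M, dG = {binding_free_energy_kj(ki):.1f} kJ/mol"
    if previous is not None:
        line += f", {fold_improvement(previous, ki):.0f}-fold vs. predecessor"
    print(line)
    previous = ki
```

Because the free energy depends on the logarithm of Ki, each factor of 10 in affinity corresponds to roughly 5.7 kJ/mol at this temperature, so the two comparatively small structural changes (OH to NH2, then NH2 to guanidino) each contribute more than 10 kJ/mol under these assumptions.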
After almost 10 years of clinical use, the first cases of resistance to oseltamivir
and zanamivir were described. Even cross-resistance to both compounds has
occurred. The effects of such mutations can be illustrated for a His→Tyr exchange
at position 274, which creates oseltamivir resistance. The reorientation of Glu276,
which is necessary for oseltamivir binding, is blocked in the viral mutant.
A substantial reduction in the binding affinity is the result. Significantly less
resistance has been described for zanamivir. Perhaps this is because it is structurally
more similar to the sialic acid substrate and it is therefore more difficult for the
virus to develop a mutation without limiting its ability to bind its own substrate.
Such a concept is certainly the silver bullet to prevent fast resistance to a new
31.6 Stopping the Common Cold: Inhibitors for the Capsid Protein of Rhinovirus 797
promising drug. Perhaps, however, zanamivir has been largely protected from
resistance development because its inhalative application route is less convenient,
and it has therefore simply been used less often.
Follow-up drugs are already being sought. A divalent zanamivir 31.33 has been
described. It requires far fewer applications, analogous to a depot form. For
special circumstances, peramivir 31.34, an intravenously administered drug, was
found. It was developed from a furanose derivative and, like zanamivir, has
inadequate oral availability. During the last H1N1 pandemic (“swine flu”), peramivir was
in the last phase of its clinical trials and received an emergency use authorization for
the parenteral treatment of severe cases. Its binding also requires a rearrangement of
the Glu276 side chain, as does that of oseltamivir. Therefore, cross-resistance
between the two drugs has already been described.
[Fig. 31.16 graphic, panels a–c.]
Fig. 31.16 The capsid of the picornaviruses has an icosahedral construction (a). Each of the
20 triangular surfaces of the icosahedron is made up of three viral surface proteins, VP1 (yellow),
VP2 (green), and VP3 (red). A fourth chain, VP4 attaches to VP2 and forms a deep ridge
(“canyon,” blue) that is important for the recognition of the adhesion proteins (ICAM-1, blue
chain) of the infected host cell (orange; b). The binding of an antiviral compound such as
pleconaril 31.40 (Fig. 31.18) below the canyon causes its conformation to change so drastically
that binding to the host cell’s surface protein can no longer occur.
The surface-exposed canyons are particularly important in the area that is formed
by VP1. On the one hand, the canyon forms binding sites for adhesion molecules
(ICAM-1, cf. Sect. 31.3) that are found on the surface of the infected host cells.
Because the virus may not change its surface composition there too much, antibodies
from vaccination serums (▶ Sect. 32.1) are also targeted against this part of the
canyon. Such a strategy can be very successful with viruses that have less-broad
distributions of serotypes than the rhinoviruses. On the other hand, the canyon has an
opening in the vicinity of VP1 to the interior of the viral capsid that is important for
the release of the viral genome. Michael Rossmann’s research group at Purdue
University studied the proteins of the viral capsid in detail. By using cryo-electron
microscopy (▶ Sect. 13.6) they managed to determine a structure of the capsid
protein with an adhesion molecule. This complex was only determined at
a reduced resolution, but it proves that the binding of the adhesion molecule
ICAM-1 occurs in the deep crevice of the canyon (Fig. 31.17).
Because of the common architecture of the picornaviruses, an antiviral
therapy can be developed against these viruses by using the same concepts. One
strategy that was initially developed and pursued at Sterling–Winthrop was oriented
toward the stretched-out pocket that is found underneath the VP1 canyon
(Fig. 31.16b). Accommodation of antiviral compounds in this pocket causes
a conformational change at the bottom of the canyon. Because of this change in
geometry, the interactions with the adhesion molecule on the surface of the infected
[Fig. 31.18 graphic: chemical structures of 31.35 (β-diketone juvenile hormone mimetic), 31.36 (arildone), 31.37 (disoxaril), 31.38 (WIN 54954), 31.39 (WIN 61893), and 31.40 (WIN 63843, pleconaril).]
Fig. 31.18 β-Diketones (31.35, 31.36) that showed antiviral activity against picornaviruses were
prepared in the course of a synthetic program toward developing juvenile hormone mimetics. By
introducing a terminal heterocycle, varying the chain length, and blocking positions on the
heterocycles to improve the metabolic stability (31.37–31.39), pleconaril 31.40 could be
developed as an inhibitor of the viral attack on host cells.
host cell are altered. A stable contact with the host cell is no longer possible, and
the virus cannot transfer its viral RNA to the infected cell. The viral infection
is stopped.
The first lead structures at Sterling–Winthrop were β-diketones 31.35, which
were synthesized as intermediates in a research project for the development of
juvenile hormone mimetics (Fig. 31.18). Arildone 31.36 resulted from this lead substance,
and it successfully blocked the replication of the poliovirus. Because the β-diketone
building block exhibited unsatisfactory chemical and metabolic stability, it was
replaced with an oxazole ring. Further optimization led to disoxaril 31.37, which
was able to block viral infection by multiple picornaviruses in animal models.
By the end of the 1980s Michael Rossmann had already managed to elucidate
the binding geometry of the Winthrop compounds in complex with the viral capsid
proteins. The structures showed the occupancy of an extended pocket below
the canyon. Disoxaril 31.37 has no activity against the rhinovirus, and its
bioavailability of less than 15% also seemed to be unsatisfactory. Further development led
to derivatives with a di-ortho-substitution on the central phenyl ring. WIN54954
31.38 is significantly more potent and has better bioavailability. However, it does
not have the desired broad efficacy against all viral strains, and its metabolic
stability still leaves something to be desired. The methyl derivative 31.39 with
[Fig. 31.19 graphic. Labels: VP-1, VP-2, VP-3, VP-4, pleconaril.]
Fig. 31.19 Crystal structure of the viral capsid proteins of the rhinovirus HRV-14 with the
inhibitor pleconaril. The antiviral drug binds to VP1 (yellow) below the canyon (cut away in the
figure) in a narrow, stretched-out pocket formed by numerous hydrophobic amino acids. They sit
above an opening into the interior of the viral capsid. The drug induces a conformational change in
VP1 upon binding, and consequently disrupts the recognition of the host cell’s adhesion proteins.
the terminal oxadiazole ring also lacks the desired stability. It was only the
replacement with a CF3 group that led to success. The compound, pleconaril
31.40, entered clinical trials. Its binding mode to the capsid protein is shown in
Fig. 31.19. Its application in 2,100 patients with a rhinovirus infection showed that
the disease was shorter and milder than in a placebo group.
The regulatory authority, the FDA, however, declined
market approval for pleconaril in 2002. Concerns regarding the safety of the drug
were expressed. There were indications that complications occurred in women using oral
contraceptives. For a disease such as the common cold, from which our
bodies can recover without drugs, it is certainly appropriate to examine the effects
and risks of a drug very stringently. Undoubtedly a compound such as pleconaril
can reduce the inappropriate use of antibiotics or prevent a severe secondary
bacterial infection. Such a compound can also help asthma and COPD (chronic
obstructive pulmonary disease) patients with an infection. Schering–Plough
conducted further clinical trials with a compound that is used as a nasal spray.
Knowledge about the mode of action of this compound, which is transferable to
other picornaviral diseases, could be very valuable. It could help to develop drugs
for other infectious diseases with higher health risks. For these, however, there will
hardly be a comparably lucrative market.
Our immune system defends us against harmful invasions by antigens and elimi-
nates cells that either have been infected or have transformed into a potentially
pathological state. A distinction is made between unspecific and specific defensive
mechanisms that are served by cellular and humoral (in body fluids) components.
The unspecific defensive mechanisms try to deactivate pathogens and foreign
substances upon first contact. We have various glycoproteins and interferons in
circulating blood and in tissues that carry out the first attack as the so-called
humoral complement system. Their defensive efforts are not targeted and serve
to degrade the foreign substances (cf. Sect. 31.3). For example, they adhere to
bacteria and create an opening in their membrane that allows fluid and salts to flow
in. This causes the bacterial cell to swell and finally burst. Lysozyme represents
a further factor that enzymatically hydrolyzes the cell walls of particular bacteria.
Additionally, interferons are released that have an immune-stimulating effect on the
neighboring cells. Proteins are produced in these cells that initiate entirely different
mechanisms of combating foreign substances. Moreover, the organism has an
additional, very effective and specific protective barrier, which, however, must
first be developed and “trained.” This immunological defense mechanism becomes
active only when a damaging material is recognized as such. The immune response
that is then initiated is largely made up of three types of cells: macrophages and B
and T lymphocytes. These defensive mechanisms are highly specific and usually
lead to immunity. With this, the body becomes insensitive to foreign materials that
it has had contact with once before. In the context of the humoral defense,
antibodies (▶ Sect. 32.3) assume this task. They are formed 5–7 days after an
immune-competent B lymphocyte makes contact with an antigen. After the first
contact with the invader, effector cells are formed for the production of antibodies
as well as memory cells that continue to circulate in the blood. Upon renewed exposure,
the defenses can attack immediately, even if the antigen was encountered years earlier.
Vertebrates have developed an adaptive system for cellular defense that
distinguishes between healthy and infected cells. T lymphocytes, also known as T cells,
play a decisive role in the cellular immune response. They belong to the white blood
cell group. Produced from stem cells in the bone marrow, they mature in the
thymus to the actual T cells. They carry T-cell receptors on their surface that are
responsible for the recognition of antigens. When scanning cells for the attribute
“diseased or healthy,” they find these antigens in the form of peptide sequences that
are presented on the surface of the cells by MHC molecules (major
histocompatibility complex). Two different types of MHC molecules are distinguished, which are
termed class I and II. MHC-I presents 8–10-residue peptides that are preferably
found in the cytosol of cells that have a nucleus. MHC-II presents longer peptides
that are formed during endosomal protein degradation. MHC-II molecules occur on professional
antigen-presenting cells such as macrophages or B cells. T helper cells “look at” the
antigens presented by these cells and regulate the immune response to these
antigens. The term for these molecules is “histocompatibility complex” because it
31.7 MHC Molecules: Where the Immune System Presents Peptide Fragments 803
[Fig. 31.20 graphic. Labels: virus, viral protein, proteasome, peptide fragments, TAP, cytosol, vesicle, MHC, CD8, T cell.]
Fig. 31.20 When a cell is infected with a virus, viral proteins are found in the cytosol (gray).
They are degraded in the proteasome along with endogenous cellular proteins and cut into peptide
fragments. These fragments are relocated to the endoplasmic reticulum (ER) by the TAP
transporter (green). There, the membrane-bound MHC class-I molecules (blue-violet) are loaded
with 8–10-residue peptides. Enclosed in vesicles, the peptide-presenting MHC molecules are
relocated to the cell surface and anchored in the membrane. T cells (green) scan through the
presenting MHC molecules by forming a complex with their T-cell receptors and recognize
whether the presented fragments are from endogenous or foreign proteins. If the protein is of
foreign origin, or an endogenous protein that has been overexpressed (e.g., in tumor cells), an
immune response is initiated.
somatic cells. On the one hand foreign proteins are recognized, on the other hand
the cytotoxic killer cells are also able to filter out cells that overexpress endogenous
proteins based on the high density of presented peptides.
These properties can be exploited to design peptide vaccines. By offering a large
amount of endogenous peptides, the immune tolerance to native proteins should be
overcome. Killer cells stimulate a specific and amplified immune defense against
the degenerate tumor cells. The goal of the development of such specific vaccine
serums is the replacement of endogenous peptides with analogues that can
Fig. 31.21 Crystal structure of the complex of an MHC-I molecule with a bound nonapeptide
Leu–Leu–Phe–Gly–Tyr–Pro–Val–Tyr–Val (gray) and the T-cell receptor: (a) total structure,
(b) peptide-binding site. The MHC molecule is composed of a heavy chain with the domains α1,
α2, and α3 (violet) and a light β2m chain (blue). The pleated-sheet structure formed by the α1/α2
domains forms a bowl that is open above and is bordered by two long, parallel-oriented α helices
(yellow). It accommodates the antigen peptide fragment and presents its upper face to the T-cell
receptor. This heterodimeric receptor, made up of an α-chain (light-blue) and a β-chain (gray), also
has a pleated-sheet-like geometry. It recognizes the amino acid tyrosine in position 5 of the
antigen peptide with its hypervariable loops CDR3α and CDR3β.
provoke the same or an exaggerated immune stimulation, but that also have
much better stability and bioavailability because of the incorporation of non-
proteinogenic amino acids or peptidomimetic groups.
First the architecture of the complex of an MHC-I molecule with a presented
peptide and the T-cell receptor shall be considered in greater detail (Fig. 31.21).
The MHC-I molecule is composed of a heavy (ca. 360 amino acids) and a light
chain (90 amino acids). The heavy chain is anchored to the membrane and is
constructed from three domains α1, α2, and α3. The α1 and α2 domains form
a sort of bowl, the base of which is made up of a six-stranded antiparallel
pleated sheet. The bowl is edged by two long helices oriented parallel to one
another. A crevice that accepts the antigen peptide fragment opens between the
helices. Peptides with a length of 8–10 Å fit in this area. Peptide binding occurs
largely because of hydrogen bonds to the N and C termini. MHC molecules are
highly polymorphic in their amino acid composition, even with the same architec-
ture so that interactions with the backbone are primarily responsible for their
binding in the crevice, and these interactions can be formed by peptides in general.
In the middle sequence segment, the antigen peptide protrudes slightly out of the
binding pockets of the a1/a2 domains. Residues at the beginning and end of the
oligopeptides orient in the small pockets of the MHC molecule. They determine
the binding affinity of each peptide to the protein. The residues in the center that
bulge out do not contribute much to the binding to the MHC molecule, but they
are decisive for the recognition and interaction with the T-cell receptor. The
sequence of the β-chain is virtually invariant, and most genetic modifications
occur in the α-chain. Furthermore, polymorphisms (▶ Sect. 12.10) have been
discovered there that vary from individual to individual. This is what the tissue
compatibility between donor and recipient in the case of organ transplantation
depends on. The susceptibility to infection and autoimmune disease can also find
an explanation in these variations.
MHC molecules bind the antigen peptide based on its sequence. They force it into
an extended conformation and expose the peptide’s central amino acid residues to the
exterior for molecular recognition by a T-cell receptor. The T-cell receptor is
a heterodimeric transmembrane glycoprotein that occurs exclusively on T cells. It
is constructed from an α- and a β-chain. The folding pattern of the two chains
is reminiscent of the structural construction of the light chains in antibodies
(▶ Sect. 32.3). The antigen-binding site is found in the loop area between the
individual pleated sheets of the domains. These loops are hypervariable and determine
the recognition properties of each receptor. The receptor lies diagonally over the
peptide-binding site in complex with the MHC molecule. With its variable loops, above
all CDR3α and CDR3β, it covers the amino acid residues of the antigen peptide
that point away from the MHC molecule. At the same time, the T-cell receptor is in
contact with the surface portions of the flanking helices of the MHC molecule.
The design of peptidomimetics as candidates for a vaccine therapy to stimulate
the immune defenses shall be illustrated on the case of the melan-A/MART-1
antigens. These antigens are presented on the surface of melanoma tumor cells by
an MHC-I complex. The nonapeptide Ala–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val
and the decapeptide Glu–Ala–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val 31.41 were
isolated from melan-A in patients with this disease (Fig. 31.22). Both oligopeptides
bind with low affinity to the MHC molecule. An exchange of alanine for leucine
in position-2 significantly increases the binding affinity. The leucine-carrying pep-
tide 31.42 exhibits significantly larger immunogenic character than 31.41. It was
therefore chosen for a clinical vaccine study on melanoma patients. As a peptide
though, it has low stability in the organism and is quickly degraded. Therefore the
group of Francine Jotereau and Stéphane Quideau in Bordeaux, France adopted
the goal of developing a peptidomimetic. It should show the same binding affinity to
31.7 MHC Molecules: Where the Immune System Presents Peptide Fragments 807
[Structural formulas, Fig. 31.22]
31.41 Glu–Ala–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val
31.42 Glu–Leu–Ala–Gly–Ile–Gly–Ile–Leu–Thr–Val
31.43 peptidomimetic with β-Ala, Leu, Ala at the N-terminus and Leu, Thr, Val at the C-terminus
Fig. 31.22 Decapeptide 31.41, which was isolated from patients, was optimized to 31.42
by exchanging an alanine for a leucine in position-2 to develop peptidomimetics as candidates
for an immune-stimulatory vaccine for melanoma. It served in clinical vaccine studies. By
the stepwise replacement of building blocks in 31.42, a peptidic lead structure could be modified
into a stabilized peptidomimetic that binds identically to the MHC molecule but provokes
an amplified immune defense by binding to the T-cell receptor. Four modifications were under-
taken in the stepwise development of 31.43. The N-terminal glutamic acid was exchanged
for a β-alanine (red) to improve stability. The replacement of the Gly–Ile unit by a
2-aminoethylene (blue) together with a change to a CO–CH2-indolyl group increased the immune
response on the T-cell receptor. The exchange of the second Gly–Ile unit for a peptidomimetic
moiety, 3-aminomethylbenzoic acid (green, AMBA) allowed the peptide backbone to take
the same course.
the MHC molecule, have the same or better affinity to the T-cell receptor, and be
significantly more stable. Because only a crystal structure of the binary complex of
31.42 with the MHC molecule, without the T-cell receptor, was available, a model
was developed with the help of a ternary complex with a structurally similar peptide.
The glutamate residue in the first position was to be replaced by a peptidase-stable
moiety. The choice fell on β-alanine, which is barely proteolytically
Fig. 31.23 Modeled binding geometry of the reference peptide 31.42 (brown) superimposed with
the peptidomimetic 31.43 (green) in the binding pocket of the ternary complex with the MHC
molecule (yellow) and the T-cell receptor (α-chain is light-blue, β-chain is gray). The receptor
recognizes the side chain of the first isoleucine or the CO–CH2-indolyl group with the hypervariable
loops CDR3α and CDR3β. The introduced AMBA building block allows the course of the
peptide chain to remain unchanged and fills the volume of the replaced Ile side chain with its
phenyl ring.
cleavable. The leucine in position-2 and the valine in position-10 should be retained
because they are decisive for anchoring to the MHC molecule. A spatially conserved
orientation of the backbone scaffold was anticipated. The group proceeded stepwise.
The amino acid in position-5, which is an isoleucine in the reference peptide 31.42,
seemed to be critical for the interaction with the CDR3 loop of the T-cell receptor.
The sec-butyl group of isoleucine was replaced by an aromatic moiety, whereby an
indole group proved to be optimal. Next an attempt was made to change the central
peptide bonds of the Gly–Ile–Gly–Ile motif by reduction. Finally, an N-(2-
aminoethyl) bridge was chosen for this segment. The second Gly–Ile unit could be
replaced by the known peptidomimetic group 3-aminomethylbenzoic acid (AMBA).
The peptidomimetic 31.43 was obtained as a result of this optimization, which
displays virtually the same binding affinity as the reference peptide 31.42 to the
MHC molecule. Its modeled binding mode is shown in Fig. 31.23. This compound
provoked the most intense release of γ-interferon in an assay to test the immune-response
stimulation. This is probably a result of the more intense interaction with
the T-cell receptor. Further development must demonstrate whether 31.43 repre-
sents a promising lead structure for the development of peptidomimetic vaccines for
an immune therapy for melanoma tumors.
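The sequence bookkeeping behind the first optimization step can be sketched in a few lines of Python. This is only an illustration of the position-2 exchange described above, not part of the published design work; the function names `substitute` and `differences` are hypothetical helpers introduced here.

```python
# Decapeptide 31.41 as given in the text (three-letter codes, N- to C-terminus).
PEPTIDE_31_41 = ["Glu", "Ala", "Ala", "Gly", "Ile", "Gly", "Ile", "Leu", "Thr", "Val"]


def substitute(peptide, position, residue):
    """Return a copy of the peptide with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(peptide):
        raise ValueError(f"position {position} is outside the peptide")
    changed = list(peptide)
    changed[position - 1] = residue
    return changed


def differences(reference, variant):
    """List (position, old, new) for each 1-based position where two peptides differ."""
    if len(reference) != len(variant):
        raise ValueError("peptides must have the same length")
    return [
        (i, old, new)
        for i, (old, new) in enumerate(zip(reference, variant), start=1)
        if old != new
    ]


# Exchange alanine for leucine in position-2 to obtain the reference peptide 31.42.
PEPTIDE_31_42 = substitute(PEPTIDE_31_41, 2, "Leu")

print("-".join(PEPTIDE_31_42))
print(differences(PEPTIDE_31_41, PEPTIDE_31_42))
```

The 1-based numbering mirrors the position labels used in the text, so the anchor residues Leu in position-2 and Val in position-10 can be addressed directly.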
31.8 Synopsis
Bibliography
General Literature
Andronati SA, Karaseva TL, Krysko AA (2004) Peptidomimetics – antagonists of the fibrinogen
receptors: molecular design, structures, properties and therapeutic applications. Curr Med
Chem 11:1183–1211
Chhabra SR, Abdul Rahim AS, Kellam B (2003) Recent progress in the design of selectin
inhibitors. Mini Rev Med Chem 3:679–687
De Palma AM, Vliegen I, De Clercq E, Neyts J (2008) Selective inhibitors of picornavirus
replication. Med Res Rev 28:823–884
Doranz BJ, Baik SW, Doms RW (1999) Use of a gp120 binding assay to dissect the requirements
and kinetics of human immunodeficiency virus fusion events. J Virol 73:10346–10358
Kolata G (2001) The story of the great influenza pandemic of 1918 and the search for the virus that
caused it. Touchstone, New York
Lazoura E, Apostolopoulos V (2005) Rational peptide-based vaccine design for cancer immuno-
therapeutic applications. Curr Med Chem 12:629–639
Matthews T, Salgo M et al (2004) Enfuvirtide: the first therapy to inhibit the entry of HIV-1 into
host CD4 lymphocytes. Nat Rev Drug Discov 3:215–225
Shimaoka M, Springer TA (2003) Therapeutic antagonists and conformational regulation of
integrin function. Nat Rev Drug Discov 2:703–716
Somers WS, Tang J, Shaw GD, Camphausen RT (2000) Insights into the molecular basis of
leukocyte tethering and rolling revealed by structures of P- and E-selectin bound to SLeX and
PSGL-1. Cell 103:467–479
von Itzstein M (2007) The war against influenza: discovery and development of sialidase inhib-
itors. Nat Rev Drug Discov 6:967–974
Special Literature
Douat-Casassus C, Marchand-Geneste N, Diez E, Gervois N, Jotereau F, Quideau S (2007)
Synthetic anticancer vaccine candidates: rational design of antigenic peptide mimetics that
activate tumor-specific T-cells. J Med Chem 50:1598–1609
Garboczi DN et al (1996) Structure of the complex between human T-cell receptor, viral peptide
and HLA-A2. Nature 384:134–141
Jiang S, Zhao Q, Debnath AK (2002) Peptide and non-peptide HIV fusion inhibitors. Curr Pharm
Des 8:563–580
Kim CU, Lew W et al (1997) Influenza neuraminidase inhibitors possessing a novel hydrophobic
interaction in the enzyme active site: design, synthesis and structural analysis of carbocyclic
sialic acid analogues with potent anti-influenza activity. J Am Chem Soc 119:681–690
Kim CU, Lew W et al (1998) Structure–activity relationship studies of novel carbocyclic influenza
neuraminidase inhibitors. J Med Chem 41:2451–2460
Kolatkar PR et al (1999) Structural studies of two rhinovirus serotypes complexed with fragments
of their cellular receptor. EMBO J 18:6249–6259
Kranich R, Busemann AS et al (2007) Rational design of novel, potent small molecule pan-selectin
antagonists. J Med Chem 50:1101–1115
Ku TW, Ali FE, Barton LS et al (1993) Direct design of a potent non-peptide fibrinogen receptor
antagonist based on the structure and conformation of a highly constrained cyclic RGD
peptide. J Am Chem Soc 115:8861–8862
Smith PW, Sollis SL et al (1996) Novel inhibitors of influenza sialidases related to GG167. Bioorg
Med Chem Lett 6:2931–2936
Williams MA, Lew W et al (1997) Structure–activity relationships of carbocyclic influenza
neuraminidase inhibitors. Bioorg Med Chem Lett 7:1837–1842
Zablocki JA, Rico JG, Garland RB et al (1995) Potent in vitro
and in vivo inhibitors of platelet aggregation based upon the Arg-Gly-Asp sequence of
fibrinogen. (Aminobenzamidino)succinyl (ABAS) series of orally active fibrinogen receptor
antagonists. J Med Chem 38:2378–2394
Zhang Y et al (2004) Structural and virological studies of the stages of virus replication that are
affected by antirhinovirus compounds. J Virol 78:11061–11069
32 Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs
The importance of peptides, proteins, sugars, and nucleotides for functional pro-
cesses in our bodies has been discussed in many chapters in this book. An attempt
can be made to regulate or intervene in the processes that these endogenous
substances are involved in with exogenous, low-molecular-weight drugs. On the
other hand, the question can be raised as to whether the administration of endog-
enous biomolecules themselves might be a promising therapeutic concept in case of
some diseases. This is especially true for diseases in which a particular endogenous
substance is insufficiently produced by the organism, or is produced but is not
functional, for instance, because of an amino acid mutation. Only gene technology
methods (▶ Chap. 12, “Gene Technology in Drug Research”) opened up the prospect
of selectively producing polypeptides and proteins with specific characteristics in
adequate quantities.
As part of a strategy to use endogenous proteins and peptides as drugs, it can be
reasonable to slightly modify the native substances to endow them with additional
properties such as a longer half-life, better stability, or higher bioavailability. Often,
the serious problem occurs that peptides and proteins have much too poor stability and
bioavailability for oral application. Nonetheless, there are many promising application
areas such as the treatment of digestive disorders with the administration of lipases.
The issue of bioavailability is also different for skin diseases than it is for oral
application and systemic drug use. Even the skin, however, has a protective enzymatic
barrier that sensitive biomolecules cannot easily overcome. In hospital drug use, the
treating physician can easily choose an intravenous application for which this issue is
less critical. This problem shall be discussed in more detail by using the example of
insulin, the daily exogenous administration of which is essential for diabetics.
Another pharmaceutical concept with regard to the application of exogenously
administered biomolecules exploits the principle of the body’s own immune
defense. The body uses macromolecular structures for the recognition and targeted
deactivation of pathogenic substances. A drug therapy can copy this principle to
fight pathogens or malignant cells according to the same concept. These antibody
proteins from the humoral defense system are not orally bioavailable because of
their size and require intravenous application.
Endogenous proteins have long since been used in substitution therapy. Earlier,
material from animal pancreases was used as an insulin source for the therapy of
diabetes mellitus; this insulin was different from human insulin by one amino acid
(porcine insulin) or three amino acids (bovine insulin). Although these insulins are
suitable for therapy, and there are techniques to exchange the structurally deviant
amino acid of porcine insulin for that in human insulin, all of the slaughterhouses
in the world would not be enough to supply all diabetics with the necessary
insulin. Factor VIII deficiency in hemophiliacs used to be compensated for by
blood transfusions. Today, recombinantly manufactured proteins are exclusively
used because the possibility of contamination with viruses is too great with products
taken, e.g., from human blood. Often it was recognized much too late that the factor
VIII batches were infected with hepatitis viruses and HIV, the causative agent of
AIDS. Therefore efforts were made very early on to produce human proteins by
using gene technology. The first protein to be produced in this way was human
insulin from the bacterium Escherichia coli, which was introduced into therapy by
Eli Lilly in 1982. Although Hoechst also had worked out a promising method for
industrial manufacturing, their production could not begin until 1994. It took that
long in Germany until all of the objections to the manufacturing license were
32.1 Gene-Technological Production of Proteins 815
The structure and function of our immune system, which defends against foreign
substances, so-called antigens, was introduced in ▶ Sect. 31.7. It is divided between
unspecific and specific defense. In specific immune response, a distinction is made
between the humoral and cell-specific systems. The role of MHC molecules in
complex with the T-cell receptor as a control system to tell diseased from
healthy cells and cull the former was discussed in detail. Antibodies adopt a corresponding role in
the humoral system in that they detect foreign substances, which then are delivered
to phagocytic cells, such as macrophages, for degradation. Analogous to the
32.3 Monoclonal Antibodies 817
Fig. 32.1 Crystal structure of a complete IgG antibody. The two Fab regions form the left and
right branches (red, green) of the Y-shaped molecule. They are made up of a light (light color) and
a heavy chain (dark color). The antigen-binding site (light-blue arrow) is found at the end of both
branches. It is formed by eight loop regions. The Fc domain is connected through a hinge region
with multiple disulfide bridges. It forms the trunk of the Y-shaped molecule. Two chain strands
with a pleated-sheet architecture are positioned against one another here too. The schematic
construction of the antibody with the same color codes is shown below right.
Fig. 32.2 Comparison of the crystal structures of two Fab domains that were released by
proteolytic cleavage with papain. The eight variable loop regions that form the antigen-binding
site are represented with different colors. The structure (a) binds a small molecule as an antigen;
structure (b) on the other hand recognizes the surface of a protein as a foreign substance.
Loop areas are found at both ends of the bifurcated Y, three loops of which have
proven to be extremely variable among different antibodies in terms of their length
and sequence. Antibodies are able to offer binding sites for very different antigens
with these hypervariability loops, or complementarity-determining regions
(CDR). Like the fingers of a hand, these variable loops grasp or surround the
antigen. Eight CDR loops are shown in different colors in Fig. 32.2. Two antibody
structures are shown that, despite their almost identical folding, bind to two entirely
different antigens. One structure grasps phosphocholine 32.1, a small antigen, whereas
the other recognizes and binds the protein lysozyme (129 amino acids) via a large
surface patch (Fig. 32.3). Phosphocholine orients its charged quaternary ammonium
group to interact with two glutamic acid residues and one asparagine. The terminal
phosphate group forms H-bonds to a tyrosine and an arginine residue. The interface
between the antibody and lysozyme stretches over an area of about 20 × 30 Å. The
highly structured contact area takes on a shallow form. Seventeen residues of the
antibody are in direct contact with 16 lysozyme residues. Only a few antigen residues
burrow deeper into the antibody surface and form hydrogen bonds on their ends.
[Fig. 32.3, panels: (a) phosphocholine 32.1 bound to CDR1-8; (b) lysozyme bound to CDR1-8]
Fig. 32.3 The contact surfaces with the bound antigens are shown for both of the Fab domains
shown in Fig. 32.2. (a) The small molecule phosphocholine (green surface) is bound in a deep
pocket in the antibody. It penetrates the violet-colored surface of the antibody deeply. In the case
of the antigen lysozyme (b) a 20 × 30-Å contact surface is formed and 16 or 17 residues of both
binding partners, respectively, are involved in the interaction. An additional antigen contact
surface (green) with the antibody (violet) is in the direct vicinity.
Their ability to bind highly efficiently to chemical structures that have entirely
different sizes and compositions seems to make antibodies ideal for the detection
and culling of disease-causing foreign substances and malignant or degenerated cells.
To use them in diagnostics or therapy, they must be purposefully developed against
specific antigen surface structures and produced in adequate quantities.
The development of suitable antibodies can be accomplished in a donor organ-
ism. Antibody-producing cells can be isolated from the serum of an immunized
mammal and purified. To obtain larger quantities of antibody-producing cells, an
attempt can be made to culture the cells. Under these conditions though, the cells
grow for only a few generations, then they die. In 1975 Georges Köhler went to the
laboratory of César Milstein in Cambridge, England, to improve the production of
antibodies in cell cultures. There the idea emerged to hybridize normal antibody-
producing cells with easily reproducing tumor cells to make hybridoma cells, and
to combine the properties of both cell types in this way. Once again, serendipity
helped. Köhler decided on murine cells. Later it was discovered that these cells fuse
100-fold better with tumor cells than other cells do. The hybridoma cells produce
the desired antibodies and continue to divide for unlimited generations. They
became immortal antibody-producing cells. In the meantime, this method for the
manufacture of monoclonal antibodies has developed into a billion-dollar busi-
ness. Georges Köhler and César Milstein received the Nobel Prize. That was,
however, their entire pay. They neither patented their method, nor tried to establish
a company to profit from their invention.
Antibodies that are produced in this way are useful for medical diagnostics. On
the other hand, they can be used, for instance, to treat tumors or septic shock. In
general, they are used to fight diseases in which a protein in the body should be
neutralized. A problem occurs when antibodies are isolated from an animal organ-
ism. They can act as antigens themselves and therefore provoke an immune
response. Here, the formation of chimeric proteins, that is, combinations of
mouse antibody with parts of human antibodies, can help. The so-called humaniza-
tion, in which only the variable antigen-binding site of the mouse antibody is coupled
with a human antibody, is even more elegant. In vitro production of completely human
antibodies with certain viruses is another method. Just as with important proteins,
human antibodies can be produced in the milk of sheep. Companies such as Genzyme
Transgenics have bred transgenic sheep that produce human monoclonal antibodies in
their milk.
A bigger application field for antibodies is the prevention and treatment of
diseases with vaccines. The development of gene technology has contributed
very decisive progress for the production of vaccines. For example, vaccines for
hepatitis B used to be isolated from the blood of chronically infected patients, a very
laborious and dangerous technique. For a vaccine, however, the entire virus is not
needed. For the recognition by an antibody, it is sufficient to reproduce only a typical
surface segment from the envelope. The genetic information for this area is taken from
the virus and incorporated into plasmids (▶ Sect. 12.1). The envelope protein segment
is then produced just like any other protein in bacteria or in other appropriate cells.
Gene-technologically produced vaccines against AIDS and other viral and bacterial
diseases, even against parasitic diseases such as malaria, are being intensively
investigated.
Antibodies have achieved increasing importance as drugs in recent years. Well
over 200 examples are in clinical trials. Increasingly more recombinantly pro-
duced antibodies, usually with tongue-twisting names that end with “-mab” (for
monoclonal antibody) are arriving on the market. As mentioned, antibodies have
the huge advantage that they can be specifically raised against virtually any surface
structure. They then fish the corresponding antigen out of the organism highly
selectively and, once bound, deliver it to the usual degradation pathway of the
immune system via the phagocytic cells. Along the way, not only undesirable
intruders are neutralized, even cancer cells or undesirable signaling and regulatory
proteins can be removed from the organism. On the other hand, antibodies can stimulate
or block a cell-specific receptor as a proteinogenic signal molecule just as a drug
would. They can be exploited as a sort of tracking hound and, combined with a
sophisticated molecular ferry, they can transport an active molecule to the site of
action. Once arrived, the transported molecule is released in a very high local
concentration to exert its action.
One disadvantage of antibodies should not go unmentioned. For them the cell
membrane represents an insurmountable barrier in almost all disease processes.
They are limited to the recognition of structures on the cell’s surface or they can
[Fig. 32.4, panels (a)–(c): ligand and growth factor receptor in the membrane with its intracellular tyrosine kinase domain; homodimerized receptor with activated tyrosine kinase; receptor with deactivated tyrosine kinase]
Fig. 32.4 (a) The epidermal growth factor receptor is stimulated by binding a macromolecular
ligand. (b) Autophosphorylation initiates the intracellular tyrosine-kinase cascade and the signal is
transmitted into the cell. (c) A specific antibody that was raised against the surface structure of the
receptor can bind so tightly to the receptor that it blocks the uptake of the natural ligand. The signal
cascade is antagonized, and the signal transmission does not occur.
Fig. 32.5 An antibody raised against the surface protein (red) from tumor cells (orange) finds
such cells in the organism. If a metal ion chelator carrying a radioactive isotope is covalently
coupled to the antibody, a radiation source can be specifically brought into the direct vicinity of the
malignant cancer cell. Ionizing radiation from the nuclear decay is released, where it exerts its
tissue-destroying effects locally. The tumor tissue is treated by radiation therapy directly at the site
to fight the tumor.
that have been raised against CD20. They carry ¹³¹I or ⁹⁰Y as radioactive sources.
This therapeutic approach couples radiation therapy with the body’s own immune
defense.
[Fig. 32.6, structural formulas. RNA backbone: X = O, oligonucleotide (R = H, OH); X = S, oligonucleotide analogues (R = H, O-alkyl). PNA: peptide–nucleic acid]
Fig. 32.6 Modifications are performed on the backbone of the oligonucleotide strand to reduce its
polarity and increase its metabolic stability. A complete exchange of the ribose phosphate chain is
accomplished by using an oligoglycine peptide strand. Such a PNA shows a high degree of
geometric analogy to the RNA strand. As the crystal structure (right) of an RNA (gray arrow
stands for the phosphate sugar strand) and PNA (orange-colored strand with the green-colored
amide bonds) double strand shows, both scaffolds can successfully hybridize with one another.
the many further modifications of this sort, for example, substitution with carbonates,
carbamates, acetals, imines, or oximes, the sugar moiety has also been chemically
modified. Methylation or methoxyethylation of the 2′-OH group of the ribose ring
leads to reduced toxicity and improved stability to RNase H. This enzyme has the task
of degrading RNA that was needed for the gene-expression process but is of no use
thereafter. By cleaving the bond in the sugar–phosphate backbone, the nuclease
reduces the mRNA into its monomeric building blocks again. The desired higher
stability can also be achieved by the formation of a cyclic ether between the 2′-OH
group and C4′ of the ribose ring to create the so-called locked nucleic acid (LNA).
A rather extensive exchange is the replacement of the sugar–phosphate group with an
oligoglycine strand. The thus-formed peptide–nucleic acid (PNA) can form a com-
plex with the mRNA very well. The crystal structure of a double-stranded hybrid of
a DNA and a PNA strand is shown in Fig. 32.6. The PNA strand shows little toxicity
because of its high biological stability, but there are problems with its cell penetration
32.5 Nucleosides and Nucleotides as False Substrates 825
due to its poor solubility. Chimeric structures of LNA/PNA with DNA oligomers have
been considered as alternatives.
The important criteria that an antisense drug must fulfill are as follows:
• Simple chemical synthesis
• Adequate in vivo stability
• Good membrane permeability and distribution in the organism
• Adequate intracellular half-life
• Strong and sequence-specific binding to the target mRNA
• Good nuclease stability
• No unspecific binding to other biological macromolecules.
Antisense therapy can be applied locally as well as systemically. The local
application allows a high concentration of the antisense nucleotide at the site of action.
In 1998, fomivirsen (Vitravene®) was introduced by Novartis Ophthalmics as the first
antisense nucleotide for the treatment of cytomegaloviral retinitis. This disease occurs
as an opportunistic infection in immunodeficient AIDS patients. The compound must
be applied directly in the vitreous humor and prevents the production of viral proteins
by binding to viral mRNA. In 2002 the company discontinued its marketing for
financial reasons. Other local therapies have skin diseases such as psoriasis as
a goal. The systemic application is usually oriented toward the treatment of different
cancer diseases. Antisense nucleotides against the mRNA of the BCL-2 protein, which
is expressed in many malignant diseases, have been developed. Other approaches are
oriented against TGF-β2 (transforming growth factor β2) because this protein is not
only held responsible for the growth and metastasis of tumors but also because it
protects tumor cells from attack by the body’s own immune cells (▶ Sect. 31.7).
Moreover, antisense nucleotides are used to fight inflammatory diseases (Crohn’s
disease, ulcerative colitis, and asthma) and the metabolic syndrome. It was discussed
in ▶ Sect. 27.8 how high hopes are placed on an antisense strategy for the blockade of
the expression of phosphatase PTP-1B.
It is noteworthy that antisense–DNA technology is already well established in
plants and is an important auxiliary for elucidating specific metabolic pathways.
Here, an mRNA nucleotide is not applied, but rather an antisense DNA that is
loaded onto small gold particles and “shot” into the cell. Transcription of the
antisense DNA affords antisense mRNA, which then forms a complex with the
“right” mRNA and prevents the biosynthesis of the corresponding protein in this
way. The first gene-technologically altered food products to be generated in
this way were the long-keeping Flavr-Savr tomatoes.
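The pairing principle behind the antisense concept can be sketched briefly in Python. The example assumes plain Watson-Crick pairing and uses a made-up nine-base mRNA fragment; the sequence and the function names are hypothetical and do not correspond to any real target or drug.

```python
# Watson-Crick pairing partners for the four RNA bases.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the antisense strand (written 5'->3') for an mRNA written 5'->3'."""
    mrna = mrna.upper()
    unknown = set(mrna) - set(RNA_PAIR)
    if unknown:
        raise ValueError(f"not an RNA sequence, unexpected symbols: {sorted(unknown)}")
    # The two strands pair antiparallel, so the target is read backwards
    # while each base is swapped for its pairing partner.
    return "".join(RNA_PAIR[base] for base in reversed(mrna))


def is_complementary(sense, antisense):
    """True if the two 5'->3' strands can pair over their full length."""
    return antisense_rna(sense) == antisense.upper()


TARGET_MRNA = "AUGGCUUCA"  # hypothetical fragment, for illustration only

print(antisense_rna(TARGET_MRNA))
print(is_complementary(TARGET_MRNA, antisense_rna(TARGET_MRNA)))
```

Real antisense design must also weigh the criteria listed earlier in this section, such as nuclease stability and unspecific binding, which a pure sequence complement does not capture.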
As monomeric DNA and RNA building blocks, nucleosides have an analogous role
for the construction of oligonucleotides and genes as amino acids have for protein
construction. As carriers of the hereditary information and coding instructions for
protein biosynthesis, DNA and RNA are essential biomolecules for a multitude of
processes in our bodies. Interventions in the synthesis of these biomolecules, above
all in processes that are necessary for the production of larger quantities of
these molecules, can afford important principles for drug therapy. The inhibition
of these processes is especially interesting. This is primarily possible with mole-
cules that are very similar to nucleosides but that are modified at decisive positions.
As false substrates, they are indeed recognized by the enzymes as starting material
for the DNA and RNA biosynthesis, but in subsequent steps they lead to
a termination of the synthesis. An increased synthetic capacity is especially needed
in reproducing cancer cells and proliferating viruses. A restriction in the synthesis
rate of these molecules can lead to an effective strategy for the fight against cancer
diseases and viral infections.
Nucleosides are constructed from a purine (adenine and guanine) or
a pyrimidine base (cytosine and thymine in DNA or cytosine and uracil in RNA)
and a pentose. If the OH group is missing from the 2′-position of the cyclic five-
membered-ring sugar, the nucleoside is used as a building block for DNA. The
hydroxylated form serves RNA as a monomeric building block. By transforming
the exocyclic hydroxymethylene group into a phosphate ester, a nucleoside
becomes a nucleotide.
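The classification rules in this paragraph can be written down as a small data model. The following Python sketch is only an illustration; the class `Nucleoside` and its field names are hypothetical and were chosen here for readability.

```python
from dataclasses import dataclass, replace

PURINES = {"adenine", "guanine"}
PYRIMIDINES = {"cytosine", "thymine", "uracil"}


@dataclass(frozen=True)
class Nucleoside:
    base: str                     # name of the nucleobase
    has_2prime_oh: bool           # True for ribose, False for 2'-deoxyribose
    phosphorylated: bool = False  # True once the exocyclic hydroxyl carries a phosphate ester

    def base_class(self) -> str:
        """Classify the base as purine or pyrimidine."""
        if self.base in PURINES:
            return "purine"
        if self.base in PYRIMIDINES:
            return "pyrimidine"
        raise ValueError(f"unknown base: {self.base}")

    def polymer(self) -> str:
        """The 2'-hydroxylated form is a building block for RNA, the 2'-deoxy form for DNA."""
        return "RNA" if self.has_2prime_oh else "DNA"

    def kind(self) -> str:
        """A nucleoside becomes a nucleotide once it carries the phosphate ester."""
        return "nucleotide" if self.phosphorylated else "nucleoside"


deoxyadenosine = Nucleoside("adenine", has_2prime_oh=False)
uridine = Nucleoside("uracil", has_2prime_oh=True)
uridine_phosphate = replace(uridine, phosphorylated=True)

for compound in (deoxyadenosine, uridine, uridine_phosphate):
    print(compound.base, compound.base_class(), compound.polymer(), compound.kind())
```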
The biosynthesis of thymine was discussed in ▶ Sect. 27.3. The enzyme
thymidylate synthase transfers a methyl group onto the pyrimidine base uracil
to convert it into thymine (▶ Fig. 27.8). If a slightly modified substrate is offered
to thymidylate synthase, this molecule is indeed recognized by the enzyme and
bound, but the subsequent biosynthesis is terminated. Therefore such pyrimidine
analogues are used as chemotherapeutics in tumor therapy. Exchanging
a hydrogen atom for fluorine in the 5-position of the uracil scaffold 32.2 to give
5-fluorouracil 32.3 is initially not recognized because of the very similar sizes of
H and F (Fig. 32.7). The modified base is then metabolized via the mono- and
diphosphate to 5-fluoro-2′-deoxyuridine diphosphate. After cleavage of the phosphate
group, it is accepted by thymidylate synthase as a false substrate. There it
reacts with Cys146 by forming a covalent bond, and in doing so, it irreversibly
blocks the enzyme. Tegafur 32.4 represents a prodrug of 5-fluorouracil that is
activated in the liver by CYP 3A4. As an advantage over 5-fluorouracil, tegafur can
be orally administered as a chemotherapeutic and used on an outpatient basis as palliative
chemotherapy. Capecitabine 32.5 represents another prodrug for the treatment of
colorectal cancer. It must be activated in multiple steps in tumor tissue. After
cleavage of the carbamate group and exchange of the NH2 function for
a carbonyl group by cytidine deaminase, fluorouracil is released, which can be
further biotransformed.
Several purine base analogues have also been described such as
6-mercaptopurine 32.6 or 6-thioguanine 32.7. After biotransformation and phos-
phorylation, they competitively inhibit purine biosynthesis. Accordingly, nucleo-
sides such as fludarabine 32.8, cladribine 32.9, and pentostatin 32.10 inhibit
adenosine deaminase and are used as chemotherapeutics for leukemia.
Antivirals follow a completely different mode of action. Because viruses
store a program for their reproduction and proliferation, but lack their own
metabolism, they must exploit the infected host cell for their own purposes.
32.5 Nucleosides and Nucleotides as False Substrates 827
[Chemical structures shown in Fig. 32.7: 32.2 Uracil-desoxymonophosphate, 32.3 5-Fluorouracil-desoxymonophosphate, 32.4 Tegafur, 32.5 Capecitabin, 32.6 6-Mercaptopurine, 32.7 6-Thioguanine, 32.8 Fludarabin, 32.9 Cladribin, 32.10 Pentostatin, 32.11 Aciclovir, 32.12 Thymidine, 32.13 AZT (Zidovudine), 32.14 Zalcitabine, 32.15 Stavudine, 32.16 Tenofovir]
Fig. 32.7 Nucleoside analogue inhibitors of thymidylate synthase, diverse deaminases, and
reverse transcriptase.
For this, they reprogram the host cell so that it takes on the production of the
necessary viral components. As a prerequisite, the viral hereditary information must
be introduced into the genome of the infected cell. Depending on the type of virus,
a reverse transcriptase (RT) or a DNA polymerase carries out this task. These
enzymes need RNA/DNA nucleosides as starting material for the synthesis or
translation. If a false substrate is offered as a nucleotide building block, this can
lead to the termination of the reproduction of the viral genome by the synthetic
machinery of the host cell. An effective principle for the treatment of viral
infections is therefore achieved.
828 32 Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs
The group of herpes viruses stores their genes on double-stranded DNA that is
synthesized by a viral DNA polymerase. If this viral polymerase is offered a false
substrate that is very similar to the natural nucleosides, but unsuitable to continue
the nascent chain construction, termination of the DNA synthesis will result. It is
important that this drug has adequate selectivity for the viral polymerase so that the
endogenous DNA polymerases of the host cell are not excessively inhibited in
parallel. The OH groups in the 5′- and 3′-positions are critical with respect to the
construction of the backbone of a DNA strand. Drugs that are meant to lead to
a chain termination during DNA replication are usually altered at the 3′-position
of the pentose ring. Aciclovir 32.11 was introduced in ▶ Sect. 9.5 as a prodrug for
the treatment of viral infections (Fig. 32.7). Formally, the five-membered ring of the
nucleoside is opened, and the OH group at the 3′-position is missing. Nonetheless,
the guanosine analogue is initially phosphorylated by the viral thymidine kinase
only, and therefore transformed into the 5′-monophosphate exclusively in virally
infected cells. The further transformation to the triphosphate is carried out by
endogenous kinases. Once activated in this manner, it is incorporated into the
nascent DNA strand by the viral DNA polymerase with hydrolysis of the triphos-
phate. In the subsequent step however, any further attachment of a nucleoside
building block is impossible, and chain termination occurs because the necessary
3′-OH group is absent.
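The chain-termination principle described above can be illustrated with a deliberately simple toy model in Python. This is only a sketch under stated assumptions: the class `Nucleotide`, the flag `has_3_prime_oh`, and the function `elongate` are hypothetical names introduced for illustration, and the model ignores kinase activation, enzyme selectivity, and kinetics. It captures one rule only: a new building block can be attached only if the last unit of the nascent strand carries a 3′-OH group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleotide:
    """Hypothetical building block; only the presence of a 3'-OH is modeled."""
    name: str
    has_3_prime_oh: bool = True


def elongate(building_blocks):
    """Append building blocks one by one until the chain end lacks a 3'-OH."""
    chain = []
    for unit in building_blocks:
        # The next unit can only be attached to a free 3'-OH at the chain end.
        if chain and not chain[-1].has_3_prime_oh:
            break
        chain.append(unit)
    return [unit.name for unit in chain]


natural = [Nucleotide("dA"), Nucleotide("dG"), Nucleotide("dT"), Nucleotide("dC")]
with_terminator = [
    Nucleotide("dA"),
    Nucleotide("dG"),
    Nucleotide("aciclovir", has_3_prime_oh=False),  # false substrate without 3'-OH
    Nucleotide("dT"),
    Nucleotide("dC"),
]

full_chain = elongate(natural)
terminated_chain = elongate(with_terminator)
```

In this sketch the false substrate is still incorporated, as in the text, but every building block offered after it is rejected.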
In so-called retroviruses, a large group of enveloped viruses, the genetic
information is stored in the form of a single RNA strand. These viruses are the
cause of a few widespread infectious diseases. They infect animals and humans, but
most are specialized for a particular host. In humans, it is above all the human
immunodeficiency virus (HIV) that represents a deadly threat.
To reproduce, the retroviruses must transcribe their RNA into DNA and incor-
porate the latter into the genome of the host cell. For this purpose, they have the
following enzymes: a reverse transcriptase (RT) and an integrase. The principle
of a reverse transcriptase was first described in 1970 by Howard Temin and David
Baltimore independently of one another; they were awarded the Nobel Prize in
1975. The discovery toppled the previously accepted dogma that information in
biology must always flow in the direction from DNA to RNA to protein. The RT
initially synthesizes an RNA–DNA hybrid strand. For this, the enzyme uses its
DNA polymerase function. It reads the synthetic protocol, however, from its own
single-stranded RNA. Then the hybrid must be converted into a pure double-
stranded DNA. For this, the RT uses a second domain that has an RNAse
H function. Proteins with this activity are used to degrade RNA after it has already
been read in protein biosynthesis and is no longer needed. The remaining
single-stranded DNA is finally completed to a double-stranded DNA by the DNA-
polymerase activity of the RT. The newly formed DNA with the viral construction
plan is then incorporated into the host cell’s chromosome by the integrase.
Since its discovery and structural characterization, HIV reverse transcriptase
has been a preferred target enzyme for drug design; it is considered in greater
detail below. The enzyme is a heterodimer constructed from
a p66 and a p51 subunit (Fig. 32.8). Both subunits are coded by the gag-pol gene
[Fig. 32.8 labels: Thumb, Palm, Finger; p66, p51; DNA strand, RNA strand; base legend: guanine, adenine, thymine, uracil, cytosine]
Fig. 32.8 Crystal structure of the HIV reverse transcriptase. The protein is made up of a p66
(purple) and a p51 (yellow) subunit. A hybrid double strand of DNA (pink) and RNA (bright-
green) is positioned in the protein structure. The palm area, where the polymerase activity of the
transcriptase is carried out, lies between the finger and thumb area.
and are cut out of the primary gene product by HIV protease. The p66 subunit
carries the residues for the polymerase and RNAse activity. The p51 domain is
important for the protein’s structural architecture, and it completes the binding site
for the double-stranded DNA and the DNA–RNA hybrid strand. The architecture of
the p66 subunit can be compared with the shape of a hand. It can be divided into
finger, thumb, and palm regions. To accomplish its function, the RT must undergo
significant conformational changes. The thumb and finger regions in particular must
rearrange to grasp the DNA strand and to accommodate the next nucleoside
triphosphate that is to be incorporated into the DNA sequence. The crystal structure
of HIV-RT together with the RNA–DNA hybrid strand is shown in Fig. 32.8. By
artificially anchoring the DNA strand covalently to the enzyme, it was possible to
determine the crystal structure of a ternary complex of protein, DNA, and the newly
accepted nucleoside triphosphate (Fig. 32.9a). The nucleotide to be incorporated is
coordinated by two magnesium ions through its phosphate group and brought into
position at the end of the nascent DNA strand. The two magnesium ions that
mediate the binding are fixed in place by two aspartic acid residues, Asp110 and Asp185.
Fig. 32.9 (a) Crystal structure of reverse transcriptase with a covalently attached DNA strand.
A thymidine-5′-triphosphate (TTP) together with two magnesium ions occupies the polymerase site.
As the reaction proceeds, this TTP substrate is added to the backbone of the phosphate sugar chain
of the newly synthesized DNA. (b) Binding mode of AZT–monophosphate in the binary complex
with reverse transcriptase and the DNA strand. The AZT substrate is added to the nascent DNA
strand. In the subsequent step, the chain elongation stops because the azide group is unsuitable for
the addition of the next phosphate group.
A second HIV-RT inhibition mechanism was elucidated that was initially discov-
ered by serendipity in screening. It causes an allosteric enzyme blockade that is
not competitive with the natural nucleosides. A hydrophobic pocket in the palm
region of the protein can open and accommodate small organic molecules. Like
a wedge, the bound molecule fixes the enzyme in a broadly open conformation that prevents the protein
from accepting the RNA–DNA hybrid strand (Fig. 32.10). In doing so, these allosteric
inhibitors do not prevent the uptake of the nucleoside triphosphate substrate, but rather
obstruct the subsequent reaction steps that cause the incorporation of the nucleotide
into the nascent DNA strand. The small, allosteric binding site is formed by aromatic
and hydrophobic residues that almost exclusively come from the p66 subunit. Inter-
estingly, the binding pocket accepts ligands that are chemically very different
(Fig. 32.11). The first-discovered inhibitors nevirapine 32.17, TIBO 32.18, and
loviride 32.19 adopt a butterfly-like geometry in the binding pocket.
[Fig. 32.10 labels: Palm, Thumb, Finger; structure of 32.17 Nevirapine; panels: uncomplexed, nevirapine bound]
Fig. 32.10 Nevirapine 32.17 was discovered as an allosteric inhibitor of reverse transcriptase in
screening. The rigid molecule binds to the protein in a small, hydrophobic pocket adopting
a butterfly-like conformation (below left). Like a wedge, the occupancy of this pocket leads to
the fixation of the open conformation of the enzyme (green). The thumb and finger regions remain
far from one another. Upon binding the RNA–DNA hybrid double strand, both of these regions
must move toward one another (green → red) to grasp the double helix. The allosteric inhibitor
prevents this movement and does not allow the protein to rearrange into its active conformation.
[Chemical structures: 32.23, 32.24, 32.25 Dapivirine, 32.26 Etravirine]
Resistance mutations were observed very quickly in this allosteric binding site
too. They change the form and the aromatic character of the binding pocket and
rapidly lead to a drop in binding affinity of the allosteric inhibitors. At Janssen
Pharmaceuticals in Beerse, Belgium, under the direction of Paul Janssen and in
close cooperation with the research group of Edward Arnold at Rutgers University
in New Jersey, a triazine or pyrimidine moiety was incorporated as a structural
element into 32.23–32.26, starting from loviride 32.19 and the imidoyl thioureas
(ITU) 32.22 (Fig. 32.11). The new derivatives were systematically analyzed by
crystallography. To the great surprise of the scientists, different binding modes
were evidenced for the structurally very similar derivatives 32.23 and 32.24
(Fig. 32.12). In the context of evading resistant mutants, this result is ideal. It is
distinctly more difficult for the viruses to effectively develop resistant mutants
against compounds that experience adaptive, chameleon-like binding modes. Con-
sequently, the researchers exploited this behavior. Compounds were developed that
had the ability to reorient into alternative binding modes (so-called jiggling). In
addition, they had a sufficient number of conformational degrees of freedom so
that they could adapt to small changes in the enzyme (so-called wiggling), for
example, when a small amino acid is exchanged for a larger one upon mutation.
32.6 Molecular Wedges Destroy Protein–Nucleotide Recognition 833
[Fig. 32.12 labels: Phe227, Val106, Trp229 (shown for each of the two binding modes); structures 32.23 and 32.24]
Fig. 32.12 The two triazines 32.23 and 32.24 block the allosteric binding site of HIV reverse
transcriptase. Surprisingly, the ligands, which have very similar chemical structures, adopt entirely
different binding modes. Clinical candidates were developed from this compound series that have
a remarkable resistance-breaking profile. This is attributed to the multiple binding modes of the
adaptive ligands able to adjust to a binding pocket that has been altered by mutagenesis.
As a result, dapivirine 32.25 and etravirine 32.26 were developed, which display an
impressively invariable resistance profile compared to the precursor compounds.
This example shows that adaptive inhibitors in particular have a clear advantage.
This is especially true if substances are to be developed that should have a high
tolerance profile against a broad range of mutated variants of a viral protein.
Another class of molecular wedges that destroy protein–nucleotide recognition comprises
the quinolone carboxylic acids, or quinolones for short. They represent an important
class of antibiotics to fight infections that are caused by Gram-negative bacteria
in particular. They attack gyrase, an enzyme that belongs to the group of
topoisomerases and catalyzes the supercoiling (over-spiralization) of bacterial DNA. This
supercoiling is caused by the addition of extra turns and is necessary to pack
the molecule in the bacterial cell as efficiently as possible. Gyrase must twist the
cyclic bacterial chromosome around itself so that the DNA is placed in the form of
a noose around the enzyme. To introduce an additional turn, the enzyme must make
a temporary break in the DNA double strand. Then the topologically lower end of
the cut strand must be moved to the upper end and reconnected. The cleavage of the
[Fig. 32.13 labels: Tyr118 (two sites), Ser78, Asp508; 3′ and 5′ strand ends; two bound molecules of 32.30]
Fig. 32.13 Crystal structure of the topoisomerase (topo IV from Streptococcus pneumoniae, gray
ribbon model) with two oligomeric DNA sequences (blue and violet) and two bound moxifloxacin
molecules (green). The protein must wrap the ring-shaped bacterial DNA around itself like
a noose for over-spiralization. The two DNA segments in the crystal structure emulate this
orientation. To achieve an extra turn in the DNA the double strand must be broken. This cut
occurs with an offset of four base pairs. In doing so, the 5′-end of the free phosphate group is
temporarily covalently attached to Tyr118. The 3′-end remains non-covalently bound in the
vicinity of the magnesium ion (near Asp508). The 1-cyclopropyl group stays in the direction of
Ser78. Exchanging this residue for a Phe or Tyr leads to resistance to this antibiotic. The large
basic group in the 7-position orients itself outside the complex in the surrounding solvent.
DNA double strand is accomplished so that an offset of four base pairs occurs. The
5′-end of the freed phosphate group is temporarily coupled via a covalent
phosphoester bond with a tyrosine residue (Tyr118, Fig. 32.13). The 3′-end with
its OH group remains non-covalently bound in the spatial vicinity of one of the
acidic residues of the magnesium-binding site that is formed by Glu433, Asp508,
and Asp510.
The first representative of the quinolones was nalidixic acid 32.27, which was
introduced into therapy in 1962 for the treatment of urinary tract infections
(Fig. 32.14). The 1-alkyl-4-pyridone-3-carboxylic acid scaffold 32.28 was varied
in further drug development, especially around the 1-alkyl group and in
the 7-position with basic piperazine-like groups. The addition of a fluorine at the
6-position led to a significant improvement in the activity. Important representa-
tives of the drug class are ciprofloxacin 32.29 and moxifloxacin 32.30.
A structure determination of the protein–DNA complex with moxifloxacin was
accomplished in 2009. Two antibiotic molecules intercalate between the two
cleaved ends of the DNA (Fig. 32.13). Like a wedge, they prevent the reassembly
of the cleaved ends of the double strand. Their planar heteroaromatic scaffold is
sandwiched on either side by a guanine from the one strand and an adenine from
the other strand. The cyclopropyl group resides in a pocket that is formed by Ser78 and
Asp83. The development of resistance has been observed as a result of mutations in
these residues. Above all, the replacement of Ser78 by larger residues such as Phe or
Tyr led to a reduction in activity due to steric reasons. The basic ring substituent in the
7-position is oriented between the base pairs four positions further in the sequence
from the cleavage site and resides in a solvent-accessible region. This explains
why this group could be broadly varied in the context of quinolone development. The
6-fluorine group is oriented away from the protein and DNA; presumably its electron-
withdrawing properties are needed to optimally adjust the electron density of the
central aromatic moiety for stacking with the neighboring bases. Interestingly the
3-carboxyl and the 4-keto groups, which all quinolones have in common, are oriented
away from the above-mentioned magnesium-binding site so that an involvement of
these groups in the chelation of the metal ion seems unlikely.
Another example of such a molecular wedge that disrupts protein–DNA recog-
nition was observed in the resistance development to tetracyclines (6.13,
▶ Fig. 6.3). Tetracyclines inhibit ribosomal function, which will be introduced in
the next section. Interestingly, tetracyclines bind to a transcription factor, the Tet
repressor, which regulates the supply of the transport protein TetA. TetA is responsible
for expelling foreign substances from bacterial cells, including tetracyclines. As
long as the Tet repressor is bound to the gene segment that codes for the transport
protein, its expression is suppressed. If, on the other hand, tetracycline binds to the
repressor, it loses its affinity for the regulatory DNA segment. Similar to a switch, it
falls off the DNA, and the gene expression is initiated. The transport protein is
produced, and the antibiotic is expelled from the cell. Resistance occurs because the
tetracycline concentration in the bacterial cells that is needed to block the ribosome
can no longer be achieved.
Interestingly, tetracycline, together with a bound magnesium ion, positions itself
like a wedge between the helices of the repressor and causes a conformational
change (Fig. 32.15). The repressor works very similarly to the zinc finger that was
discussed in ▶ Sect. 28.2. As a dimer, the protein reads the two palindromic
DNA sequences with two helix–turn–helix motifs that are arranged virtually
symmetrically to one another at a separation of 36 Å. The wedging by tetracycline
widens the separation of the helix–turn–helix motifs to 40 Å.
[Fig. 32.15 labels: motif separation 36 Å (left) and 40 Å (right); structure of 6.13 Tetracycline]
Fig. 32.15 Crystal structure of the Tet repressor with a bound sequence segment of DNA (left)
and an intercalating tetracycline 6.13 (right). The protein is a dimer constructed exclusively from
helices (red and green cylinders). It grasps the repressor palindrome DNA sequence segments with
its C2-symmetrical helix–turn–helix motif. A tetracycline molecule pushes between each of the
two monomers of the repressor like a wedge and causes a conformational change in the helical
protein. This increases the relative distance between the two reading motifs from 36 to 40 Å, which
is too far to still be read. The repressor–DNA binding does not occur.
The DNA base sequence can no longer be correctly read. The repressor loses its
affinity for the gene segment, and the production of the transport protein is initiated.
Tetracyclines practically act as a switch and can specifically regulate the gene
expression. This property is used in molecular biology to purposefully turn on gene
expression.
Biomolecules do not only control and regulate the function of organisms, but can also
be used as chemical weapons in the fight against competitors for survival. Microor-
ganisms such as bacteria and fungi in particular produce a multitude of unusual
substances that they use against their opponents in competition. These enemies,
which are often other bacteria and fungi, should be destroyed to win the continual
battle for limited resources. Microorganisms also endanger the health of humans.
32.7 Macrolides: Microbial Warheads 837
In the time before modern drug research, infectious diseases were the main cause of
death (▶ Sect. 1.3). This makes it all the more obvious that the structure and modes of
action of these microbial weapons should be examined in detail to sound out their
potential for a drug therapy against, for example, bacterial pathogens.
Microorganisms have a unique multienzyme complex, which does not exist in
humans, for the synthesis of these complex, often macrocyclic substances with
peptidic character; this synthesis is independent of the ribosomal peptide
and protein synthesis described below. The compounds produced have a size from
a few hundred up to a thousand daltons. In addition to the 20 proteinogenic amino acids,
the multienzyme complex (the so-called nonribosomal peptide synthesis machinery) uses
many additional amino acids and low-molecular-weight synthetic building blocks, often
with unusual stereochemistry, as starting materials. Moreover,
peptide construction and ring closures are not only accomplished by the formation of
amide bonds; ester bonds can also be closed. The multienzyme complex for these
syntheses is modular and assembled from multiple function-specific domains.
Depending on the product formed, these domains are compiled in the complex with
the necessary multiplicity. An individual module is composed of domains for the
recognition, activation, and incorporation of particular substrate components into the
desired product. They represent the basic function for the extension of the nascent
peptide. Additionally, new synthetase domains that allow deviation from a simple
linear synthesis sequence are continually being discovered. Synthetic products that result
from the use of such multienzyme complexes often display variations in the peptide
backbone that allow branching and finally macrocyclization. Another synthetic route
that also produces similarly complex and pharmacologically interesting natural prod-
ucts is the polyketide synthetic pathway. It does not use amino acids, but rather
it represents a modification of the fatty acid biosynthesis. The C2 units of
decarboxylated malonyl-CoA are used as starting materials.
Many of the compounds that are synthesized in this manner are macrocycles of
variable ring size. Relatively small rings with nine members all the way to 30- or
40-atom rings have been discovered. Macrolides with 14–16-membered rings are
especially used as antibiotics for the treatment of bacterial infections.
However, macrocycles can also intervene in entirely different mechanisms; for
example, they can influence the cell cycle, affect the integrity of cell membranes, or
stimulate the immune system. The macrocyclic undecapeptide ciclosporin
(▶ Sect. 10.1) made organ transplantation possible. Its administration prevents
the rejection of the donor organ as foreign tissue in the recipient. Ciclosporin acts
as an immunosuppressant in that it inhibits both the humoral and cellular immune
response and suppresses the release of interleukin-2 (IL-2) from T cells. The
absence of IL-2 release prevents the maturation of the T cells to cytotoxic killer
cells (▶ Sect. 31.7). After penetration, ciclosporin binds to the cytosolic protein
cyclophilin. The ensuing binary complex inhibits the calcium-dependent phospha-
tase activity of the calcineurin–calmodulin complex responsible for the dephos-
phorylation of an activating nuclear factor. As a consequence the migration of this
transcription factor into the cell nucleus does not occur, and the IL-2 synthesis is
blocked. Macrolides such as nystatin, natamycin, or amphotericin B associate with
ergosterol in the cell membrane of fungi. Via this antimycotic principle they
influence the membrane integrity and make the cell membrane permeable for
potassium ions. This can lead to cell demise in the corresponding fungus.
Rhizopodin, sphinxolide B, kabiramide C, and jaspisamide A interfere with actin
polymerization. In doing so they disrupt the development of the cytoskeleton and
demonstrate cytostatic effects. Zearalenone was discovered in the group of mold
toxins and shows an effect comparable to that of an estrogen.
The largest group of macrolide compounds exerts its effect against ribosomal
function. In this synthetic machinery, the genetic hereditary information is converted
into the production of new proteins. Because of its central importance for the maintenance
of all life, the ribosome has been the focus of intensive research for many years.
This large and multilayered natural complex was discovered in the 1950s, and more
than 25 years ago, work toward its crystallization and structure determination began
in the group of Ada Yonath at the Weizmann Institute in Israel. In small steps,
increasing information could be deciphered from the diffraction data about the spatial
construction of this ribonucleoprotein complex. However, the real breakthrough
came in 2000, when the crystal structure of the large 50S subunit was elucidated at
a resolution of 2.4 Å in the group of Tom Steitz at Yale University in New Haven,
CT. The group of Venkatraman Ramakrishnan at MRC in Cambridge, England, was
successful with the smaller 30S subunit and could contribute to the structure eluci-
dation of the total ribosome. The three researchers were awarded the Nobel Prize in
chemistry in 2009 for this grandiose tour de force. The first high-resolution structure
analyses were accomplished with ribosomes from the very robust microorganisms Thermus
thermophilus and Haloarcula marismortui. More recently, the ribosome from the eubacterium
Deinococcus radiodurans has proved to be an obliging and easily crystallized
workhorse. Many structure determinations of complexes with antibiotic macrolides
have been accomplished using this system (see below). It shows a high sequence
homology to the ribosomes of important pathogenic organisms.
The surprise was great after the first high-resolution structure determination.
Indeed, the ribosome is a molecular complex of proteins and nucleic acids, but
with respect to its catalytic function it should be termed not an “enzyme” but rather
a “ribozyme.” Proteins do not catalyze the decisive synthesis steps. It is the RNA
molecules that take on this function. This fact provides evidence that the ribosome
is evolutionarily one of the oldest catalysts in living nature. Despite its stately size of
over two million Daltons, it is highly conserved and occurs in archaebacteria,
prokaryotes, and highly developed eukaryotes with great similarity. The organisms
from the three domains of life have a common origin that reaches back over
3.5 billion years! Because of its central importance for the production of proteins,
it is not surprising that the ribosome in particular has become a prominent target
structure for the chemical weapons of microorganisms. They bind to a few vulner-
able points on the ribosome, and in doing so, turn its function off. These binding
sites are in the vicinity of the mechanistic active sites.
To understand the importance of these sites in detail, the working procedure of
the ribosome must next be considered. The blueprints for our proteins are stored as
the genome on DNA (▶ Sect. 12.3). To translate this information into proteins, in
Fig. 32.16 In principle, 64 triplets can be formed with four bases: guanine (G), uracil (U),
adenine (A), and cytosine (C). In the diagram, these are oriented from inside to outside. To decode
an amino acid, begin with the central quadrant, for example, U; the second base is then taken from the
first ring, for example, a U again. The third base is chosen from the second, dark-gray ring. If it is
also a U, then the codon is UUU for phenylalanine. Three triplets are interpreted as stop codons
(UAG, UAA, UGA). Because 20 proteinogenic amino acids are available, up to six codons can
encrypt a single amino acid (e.g., Arg or Leu). Tryptophan (UGG) and methionine (AUG) are
encoded by a single triplet only. In a few enzymes such as glutathione peroxidase, a selenocysteine
is found in the active site. This 21st proteinogenic amino acid is encoded by the UGA codon in
certain contexts; UGA usually serves as a stop codon.
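The decoding scheme of Fig. 32.16 can be reproduced in a few lines of Python. The following is a minimal sketch that is not part of the original text: it builds the standard genetic code from the usual compact one-letter listing (base order U, C, A, G; `*` marks a stop codon) and counts how many codons encode each amino acid. The names `CODON_TABLE`, `STOP_CODONS`, and `degeneracy` are illustrative choices, and the context-dependent reading of UGA as selenocysteine is deliberately left out.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"

# One-letter amino acid codes for all 64 codons in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

# Map every triplet to its amino acid (or '*' for a stop codon).
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Triplets that terminate protein synthesis.
STOP_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}

# Number of codons per amino acid (degeneracy of the code).
degeneracy = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))                  # 4**3 = 64 triplets
print(CODON_TABLE["UUU"])                # F (phenylalanine)
print(sorted(STOP_CODONS))               # ['UAA', 'UAG', 'UGA']
print(degeneracy["R"], degeneracy["W"])  # 6 codons for Arg, 1 for Trp
```

The count confirms the statement in the caption: four bases give 4³ = 64 triplets, three of which are stop codons, so that the remaining 61 codons are shared among 20 amino acids.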
[Fig. 32.17 labels: nascent peptide chain, AS1, AS2-tRNA, tRNA, A2451; large subunit, small subunit; E-, P-, and A-sites; mRNA]
Fig. 32.17 mRNA carries the genetic translation procedures for the synthesis of new proteins in
the ribosome on a single strand. The tRNAs are loaded with one of the 20 proteinogenic amino
acids according to the codon in the anticodon loop. The ribosome has three tRNA-binding sites, the
A-, P-, and E-sites. The A-site picks up the aminoacylated tRNA, the P-site binds the peptidyl–tRNA,
and the discharged tRNA leaves the ribosome via the E-site. The energy required for the formation of
the polypeptide chain is supplied by coupled GTPase activity. To be correctly recognized, the
tRNA in the A- or P-site must display a complementary base triplet in its anticodon loop. A new
amide bond is formed in the ribosome’s peptidyl transferase center between the amino acid in the
P- and A-sites. The amino group of the amino acid AA2 on the aminoacylated tRNA performs the
nucleophilic attack on the carbonyl group of amino acid AA1 of the peptidyl–tRNA.
A trigonal geometry is formed at the carbonyl carbon atom via an intermediate tetrahedral
transition state. The surrounding nucleosides, for example, A2451, are responsible for the polar-
ization and stabilization of the temporarily charged transition state.
the starting point on the mRNA is the base sequence AUG. As a result, the so-called
P-site of the ribosome has a tRNA with the pattern UAC in the anticodon loop.
This tRNA carries the amino acid methionine. The next triplet code on the mRNA
is, for example, CGC. This leads to a tRNA with the sequence GCG being taken in
at the A-site, next to the P-site. Such a tRNA is loaded with the amino acid arginine.
The two amino acids at the end of the loaded tRNAs orient in the catalytic peptidyl
transferase center (Fig. 32.19). There, peptide bond formation is catalyzed
between the two amino acids, and the first connection in the backbone of the new
protein is formed. The individual steps of the reaction mechanism are reminiscent
of the reaction sequence in proteases. However, it occurs in the opposite direction,
and the substrate recognition in the catalytic center occurs exclusively by the
nucleic acids (Fig. 32.17). After the methionine transfer, the discharged tRNA
leaves the P-site via the neighboring E-site. The tRNA from the A-site migrates
into the neighboring P-site. This corresponds to a progression of the sequential
information on the mRNA. The emptied A-site is now occupied by a new tRNA, the
base triplet in the anticodon loop of which is complementary to the next triplet
sequence of the mRNA. The new protein grows according to this synthesis
sequence and leaves the ribosome via the so-called ribosomal tunnel
(Fig. 32.19). If the ribosome comes to a triplet sequence that corresponds to
a stop codon, the protein synthesis is terminated.
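The reading procedure described above (start at AUG, pair each codon with a complementary anticodon, stop at a stop codon) can be sketched in Python. This is an illustrative simplification, not the author's method: the function names `anticodon` and `translate` and the example mRNA are assumptions, the anticodon is written position by position against the codon as in the text (AUG pairs with UAC), and wobble pairing, initiation factors, and the E-, P-, and A-site movements are not modeled.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code: triplet -> one-letter amino acid, '*' = stop codon.
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def anticodon(codon):
    """Return the base-by-base complement of a codon (e.g., AUG -> UAC)."""
    return codon.translate(COMPLEMENT)


def translate(mrna):
    """Read triplets from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start < 0:
        return "", []  # no start codon, nothing is synthesized
    peptide, anticodons = [], []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break  # stop codon: synthesis is terminated
        peptide.append(amino_acid)
        anticodons.append(anticodon(codon))
    return "".join(peptide), anticodons


# Hypothetical mRNA: AUG (Met), CGC (Arg), UUU (Phe), UAA (stop).
peptide, used_anticodons = translate("GGAUGCGCUUUUAAGG")
print(peptide)          # MRF
print(used_anticodons)  # ['UAC', 'GCG', 'AAA']
```

The first two steps reproduce the example in the text: the start codon AUG recruits a tRNA with the anticodon UAC carrying methionine, and the following codon CGC recruits a tRNA with GCG carrying arginine.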
The biosynthesis is carried out with breathtaking speed. No more than 50 ms are
needed for one synthesis cycle. As mentioned, the ribosome is a mixed complex
made from two thirds RNA and one third protein and is organized in two subunits.
The small subunit (30S in prokaryotes) is responsible for the interpretation of the
genetic code. The large subunit (50S in prokaryotes) adds the individual amino
acids to the nascent peptide chain according to the blueprints on the mRNA.
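The cycle time quoted above translates into a rough upper bound for the time needed to assemble a protein. The following sketch assumes, for illustration only, one 50 ms cycle per amino acid and ignores initiation and termination; the function names and the 300-residue chain length are hypothetical.

```python
CYCLE_TIME_S = 0.050  # upper bound for one synthesis cycle, taken from the text


def elongation_time_s(n_residues, cycle_time_s=CYCLE_TIME_S):
    """Rough upper bound: one cycle per amino acid added to the chain."""
    return n_residues * cycle_time_s


def residues_per_second(cycle_time_s=CYCLE_TIME_S):
    """Minimum elongation rate implied by the cycle time."""
    return 1.0 / cycle_time_s


if __name__ == "__main__":
    n = 300  # hypothetical chain length
    print(f"{n} residues: at most about {elongation_time_s(n):.0f} s")
    print(f"rate: at least about {residues_per_second():.0f} residues per second")
```

Under these assumptions, a 300-residue protein is completed in about 15 s or less, which corresponds to at least 20 peptide bonds per second.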
As already mentioned above, the huge ribosome is blocked by antibiotics at
a few vulnerable points. Although antibiotics show distinct structural differences
among themselves, they bind in overlapping regions that are composed of ribo-
somal RNA molecules. In addition to the large group of macrolides, other ligands
with a completely different chemical structure have been found to block this region
of the 50S subunit. Among these are chloramphenicol 32.31 and clindamycin 32.32
(Fig. 32.20). Both bind in the vicinity of the peptidyl transferase center and compete
with the tRNA for the A- and P-sites. Tetracycline 6.13 and the aminoglycoside
842 32 Biologicals: Peptides, Proteins, Nucleotides, and Macrolides as Drugs
Fig. 32.19 View of the 50S ribosome subunit; the RNA portion is white, the protein portion in
light-blue. The three tRNAs in the A- (violet), P- (orange), and E-sites (red) were fit into the model
based on the crystal structure data. The black frame surrounds the peptidyl transferase center, into
which the tRNAs protrude with their anticodon loops. Moreover, the amide bond of the nascent
polypeptide chain is formed there. This chain (brown) leaves the catalytic site via the ribosomal
tunnel. Macrolides bind in the front part of the peptide tunnel and stop the chain synthesis after
only a few steps. The binding of structurally diverse antibiotics (space-filled, in red, green, violet,
and blue) to the involved nucleotides is indicated (Figure from Hansen et al., Molecular Cell 10,
117–128 (2002), reprinted with the kind permission of the publisher).
6.14 (▶ Sect. 6.4, ▶ Fig. 6.3) also attack the ribosome, but they inhibit the function
of the 30S subunit. Macrocyclic substances 32.33–32.38 bind at the entry to the
ribosomal tunnel, which is not far from the peptidyl transferase center. Their
inhibitory effect is exerted by blocking the growth of the nascent polypeptide.
Depending on their size, they allow the synthesis of protein fragments of up to 3–7
amino acids before synthesis comes to a halt.
The most important representative of this group of compounds is erythromycin
32.33, a macrolactone with a 14-membered ring. The Filipino scientist
Abelardo Aguilar sent soil samples from the province of Iloilo to Lilly in 1949.
There, a metabolic product was isolated that showed antibiotic effects. The natural
[Structural formulas of Fig. 32.20: 32.31 Chloramphenicol, 32.32 Clindamycin, 32.33 Erythromycin, 32.34 Clarithromycin, 32.35 Roxithromycin, 32.36 Azithromycin, 32.37 Dalfopristin, 32.38 Quinupristin]
Fig. 32.20 Chemical structures of a few antibiotics that bind to the 50S subunit of the ribosome.
The substances 32.33–32.38 represent macrolides.
product was marketed in 1952 under the name Iloson®. Its total synthesis posed
a challenge for synthetic chemists and was first accomplished from simple
starting materials in 1981 in the research group of Robert
Woodward. The compound is well tolerated but has inadequate acid stability. The
free OH group in the 7-position reacts with the 10-carbonyl group by intramolecular
ketalization. This step initiates the substance’s degradation to products that are
inactive as antibiotics. Therefore, erythromycin must be administered in the form of
gastric acid-resistant tablets. Clarithromycin 32.34 is derived from erythromycin by
ether formation at the 7-OH group. This suppresses the instability under acidic
conditions. Analogously, roxithromycin 32.35 achieves comparable stability by an
exchange of the 10-carbonyl group for an oxime. In azithromycin 32.36, the lactone
ring is expanded to 15 members, and the carbonyl group is replaced by
a methylamino group, which is not susceptible to attack by the OH group.
The sensitivity of Gram-positive pathogens to these macrolides differs
somewhat, in part because of differences in bioavailability.
Erythromycin works well topically and is therefore often used for skin
diseases. Clarithromycin, roxithromycin, and azithromycin are acid-stable and
have better tissue penetration. They are often used to treat respiratory infections,
and infections of the ears, nose, and throat. Erythromycin and clarithromycin are
potent cytochrome P450 CYP 3A4 inhibitors (▶ Sect. 27.6). Therefore the metab-
olism of numerous other drugs that are metabolized by this enzyme can be blocked.
If this fact is overlooked in the dosing, a dangerous increase in the concentration of
simultaneously applied drugs can result (Fig. 32.20).
The binding modes of erythromycin 32.33 and roxithromycin 32.35 are shown in
Fig. 32.21. As mentioned, they obstruct the exit tunnel of the nascent peptide chain in
the vicinity of the peptidyl transferase center in the ribosome. The region is formed
exclusively by RNA building blocks and the binding takes place largely through
pronounced van der Waals contacts with the tunnel wall. A decisive interaction
is an H-bond between the nucleoside adenosine 2058 and the
2′-OH group of the amino sugar moiety. The development of resistance plays an
important role in the use of these antibiotics too. Exchanging the adenine base for
a guanine causes the inhibitory potential of erythromycin to be reduced by five orders of
magnitude. For steric reasons, a guanine at position 2058 leads to a repulsive interaction
with the macrolide (Fig. 32.21). It is just this exchange that is observed in resistant
mutants of clinical pathogens. Interestingly, eukaryotes also display a guanine at this
position. This explains why the 14-membered macrolide selectively inhibits
bacterial ribosomes, which display an adenine there.
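To put the loss of five orders of magnitude into energetic terms, the standard thermodynamic relation ΔΔG = RT ln(K_mutant/K_wild-type) can be evaluated. The sketch below assumes that the reduced inhibitory potential corresponds to a 10^5-fold weaker binding constant and uses 298.15 K as an assumed temperature; the function name is chosen for illustration only.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def ddg_kj_per_mol(fold_change, temperature_k=298.15):
    """Free energy difference for a given fold change in the binding constant.

    Uses ddG = R * T * ln(fold_change); the temperature is an assumption.
    """
    return R * temperature_k * math.log(fold_change) / 1000.0


if __name__ == "__main__":
    ddg = ddg_kj_per_mol(1e5)  # five orders of magnitude, as stated in the text
    print(f"ddG = {ddg:.1f} kJ/mol = {ddg / 4.184:.1f} kcal/mol")
```

Under these assumptions, the A2058G exchange costs roughly 28.5 kJ/mol (about 6.8 kcal/mol) of binding free energy.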
Many examples in this book have shown how small molecules find
exactly the intended site of action among many macromolecular target molecules
based on their appropriate steric construction and also their correct placement of
interacting functional groups. Perhaps the question has occurred to some readers
as to whether the situation might occur in which two ligands exert their influence
on a target structure by synergistic binding. In fact, these cases exist. Many of
these have probably not been recognized yet, above all those in which the affinity
of both components differs strongly. The mode of action of such potentiating
Fig. 32.21 Crystallographically determined binding geometry of erythromycin 32.33 (gray) and
roxithromycin 32.35 (brown) at the beginning of the peptide tunnel near the peptidyl transferase
center; the neighboring nucleotides A2057, A2058, A2059, and U2609 are labeled. An essential
hydrogen bond is formed between the 2′-OH group of the amino sugar moiety
and adenosine 2058 (green, 2.99 Å). A mutation of A2058 to guanosine (orange), causing
resistance, brings an amino group into the direct vicinity of the macrolide. A repulsive distance
of 2.30 Å (violet) indicates an unfavorable interaction. At 3.02 Å, the distance between the amino
group and the ether oxygen atom is not favorable either. As a result, the binding affinity of the macrolide
to the A→G resistant mutant is reduced by five orders of magnitude.
effects has only been characterized in very few cases. One such example shall be
discussed as a final case. The macrocyclic streptogramins A and B, dalfopristin
32.37 and quinupristin 32.38, bind to the ribosome in close proximity to one
another (Fig. 32.22). Quinupristin 32.38 binds, like erythromycin, in
the front part of the ribosomal tunnel. In this way, very short peptides can still be
synthesized by the ribosome. Dalfopristin 32.37 prevents even these
synthesis steps by binding in the peptidyl transferase center, so that the tRNA
molecules can no longer be accommodated. If the binding position of
dalfopristin is compared with that of chloramphenicol 32.31 (Fig. 32.23), a very
similar volume segment is occupied. The mutually enhanced binding of the two
macrolides is explained by a pronounced hydrophobic contact surface that
reduces the solvent-accessible surface. Furthermore, an altered conformation is
observed for the highly conserved, catalytically important U2585 residue. This
causes a stable distortion in the peptidyl transferase center. This additional effect
contributes to the synergistic inhibition of the ribosome when both macrolides are
bound simultaneously. Both compounds came to market as a 70:30 mixture of
dalfopristin/quinupristin in 2000 under the brand name Synercid ®. The drug
represents a potent antibiotic against highly resistant bacterial strains.
Fig. 32.23 Dalfopristin 32.37 (red) and quinupristin 32.38 (green) are shown in their crystallo-
graphically determined binding geometries with transparent surfaces. Analogously oriented bind-
ing modes are shown for erythromycin 32.33 (gray), clindamycin 32.32 (yellow), and
chloramphenicol 32.31 (light-blue), superimposed on the structural data. Quinupristin binds in
the ribosomal tunnel analogously to erythromycin. Dalfopristin orients in the peptidyl transferase
center and blocks the uptake of tRNAs in the A- and P-sites, in an analogous manner as
chloramphenicol.
32.8 Synopsis
Bibliography
General Literature
Aboul-Fadl T (2005) Antisense oligonucleotides: the state of the art. Curr Med Chem
12:2193–2214
Banting A (2004) The bittersweet science. In: Pinker S (ed), Folger T (Series ed) The New York
Times Magazine, 16 Mar 2003 or in The best American science and nature writing 2004,
Houghton Mifflin Company
Brekke OH, Sandlie I (2003) Therapeutic antibodies for human diseases at the dawn of the twenty-
first century. Nat Rev Drug Discov 2:52–62
Das K, Lewi PJ, Hughes SH, Arnold E (2005) Crystallography and the design of anti-AIDS drugs:
conformational flexibility and positional adaptability are important in the design of non-
nucleoside HIV-1 reverse transcriptase inhibitors. Prog Biophys Mol Biol 88:209–231
Dürfahrt T, Marahiel MA (2005) Peptidantibiotika vom molekularen Fließband. Nachr Chem
53:507–513
Kurreck J (2003) Antisense technologies: improvement through novel chemical modifications.
Eur J Biochem 270:1628–1644
Milenic DE, Brady ED, Brechbiel MW (2004) Antibody-targeted radiation therapy. Nat Rev Drug
Discov 3:488–498
Poehlsgaard J, Douthwaite S (2005) The bacterial ribosome as a target for antibiotics. Nat Rev
Microbiol 3:870–881
Vivet-Boudou V, Didierjean J, Isel C, Marquet R (2006) Nucleoside and nucleotide inhibitors of
HIV-1 replication. Cell Mol Life Sci 63:163–186
Yonath A, Bashan A (2004) Ribosomal crystallography: initiation, peptide bond formation, and
amino acid polymerization are hampered by antibiotics. Annu Rev Microbiol 58:233–251
Special Literature
Graham J, Muhsin M, Kirkpatrick P (2004) Cetuximab. Nat Rev Drug Discov 3:549–550
Hansen JL, Ippolito JA et al (2002) The structures of four macrolide antibiotics bound to the large
ribosomal subunit. Mol Cell 10:117–128
Harms JM, Schlünzen F, Fucini P, Bartels H, Yonath A (2004) Alterations at the peptidyl
transferase centre of the ribosome induced by the synergistic action of the streptogramins
dalfopristin and quinupristin. BMC Biol 2:4
Laponogov I, Sohi MK et al (2009) Structural insights into the quinolone–DNA cleavage complex
of type II topoisomerase. Nat Struct Mol Biol 16:667–669
Saenger W et al (2000) The tetracycline repressor—a paradigm for a biological switch, Angew.
Chem Int Ed 39:2042–2052
Schlünzen F, Zarivach R et al (2001) Structural basis for the interaction of antibiotics with the
peptidyl transferase centre in eubacteria. Nature 413:814–821
Appendix
[Structural formulas of the amino acids: Glycine (Gly) G; Cα-substituted amino acids (general formula with side chain R); Proline (Pro) P; Alanine (Ala) A; Valine (Val) V; Leucine (Leu) L; Isoleucine (Ile) I; Methionine (Met) M; Phenylalanine (Phe) F; Tyrosine (Tyr) Y; Tryptophan (Trp) W; Histidine (His) H]
[Figure, panel labels a, b, c: protein fold with loop, helix, and folded-sheet segments; catalytic site with the ligand, the Zn2+ ion, and the residues His96, Glu106, and Thr200; surface representations]
This figure explains how protein structures with bound ligands are represented
in many images of this book. (a) The protein is schematically represented by the
folding of its main chain. Parts of the polymer chain with β-sheet structure (arrows)
are shown in cyan, helical segments (cylinders) in red, and loop regions in green.
(b) Amino acids in the active site are displayed as a stick model. Unless stated
otherwise, carbon atoms of the protein are shown in orange, those of the ligands in
gray. Oxygen atoms are displayed in red, nitrogen in blue, sulfur in yellow, phosphorus
in orange, fluorine in turquoise, chlorine in green, bromine in brown, iodine in violet,
and metal ions in gray-blue. Hydrogen atoms are shown in white; for clarity, however,
they are mostly omitted. (c) Amino acids are labeled by a three-letter code (as
indicated at the beginning of this book) along with their position in the sequence
(e.g., His94). Hydrogen bonds formed between a ligand (here: p-fluorophenylsulfonamide)
and amino acid residues of the protein are indicated as thin green lines.
(d) Next to the binding site, the solvent-accessible surface is displayed (cf. section
15.6) as a white opal surface. (e) Analogous representation with an opal surface, this time
together with the adjacent amino acids of the binding pocket. (f) Overall view of the
protein (here carbonic anhydrase II, section 25.7) with the sketched binding pocket
around the catalytic center, which is blocked by an inhibitor. The latter molecule binds
to the zinc ion of the protein and forms three hydrogen bonds. The trace of the
polymer chain is shown as a contiguous ribbon; the color coding of the different
segments is the same as in (a). (The images were generated with the program DS Visualizer
V2.0.1.7347 of Accelrys Inc., Copyright 2005-2007)
Most of the images used in this textbook will be made available as computer animations via the
homepage of the author (www.agklebe.de). Interested readers are advised to consult this
homepage to obtain access to the images as rotatable 3D objects.
Illustration Source References
Fig. 1.4 From Noe CR, Bader A (1993) Chem Britain 29:126–128
Fig. 4.1 Segment of the crystal structure of the complex of the retinol-binding
protein with retinol (PDB code: 1RBP)
Fig. 4.7 From Andrews PR et al (1984) J Med Chem 27:1648–1657
Fig. 5.8 Crystal structure of Candida antarctica lipase with two enantiomers of
the transition-state-analogue inhibitor from Bocola et al (2003) Protein Eng
16:319–322
Fig. 5.12 From Caner H et al (2004) Drug Discov Today 9:105–110
Fig. 5.15 Segment from the crystal structure of a complex of trypsin with DX9065a
(PDB codes: 1MTS and 1MTU)
Fig. 5.16 Segment from the crystal structure of an inhibitor complex of
carbonic anhydrase II (PDB code: 1CIL and Greer J et al (1994) J Med Chem
37:1035–1054)
Fig. 5.17 Segment from the crystal structure of the complex of the retinoic acid
receptor hRARγ with BMS270394/5 (PDB codes: 1EXX and 1EXA)
Figure before Chap. 6: Announcement poster from the research group of the
author on the occasion of a conference in 2003 in Rauischholzhausen, Marburg,
Germany.
Fig. 7.7 Segments from the NMR structure of stromelysin and two fragments 7.1
and 7.2, and the common product 7.3 (Hajduk PJ et al (1997) J Am Chem Soc
119:5818–5827, the coordinates were kindly provided by P. Hajduk at Abbott)
Fig. 7.8 Segment from the crystal structures of thermolysin with different bound
molecular probes (PDB codes: 1FJQ (acetone), 1FJU (acetonitrile), 8TLI
(isopropanol), 1FJW (phenol)), and with bound benzyl succinic acid (PDB code:
1HYT)
Fig. 7.12 Superposition of the crystal structures of thymidylate synthase with
N-tosyl-D-proline derivatives (PDB codes: 1F4C, 1F4D, 1F4E)
Fig. 10.10 Segment of the NMR structure of the BCL-XL complex with a 16-residue
peptide from the BAK protein (PDB code: 1BXL)
Fig. 10.14 From Bartlett PA (1992) Caveat user manual, San Francisco
Figure before Chap. 11: © Dr. Dirk Bossemeyer, German Cancer Research
Center, Heidelberg, Germany.
Fig. 11.3 From Christen HR, Vögtle F (1992) Organische Chemie, 2nd edn, vol II,
Fig. 24.5, p 131. Otto Salle & Sauerländer
Fig. 11.4 From Gallop MA et al (1994) Fig. 2, Applications of combinatorial
libraries. J Med Chem 37:1233–1251
Fig. 11.9 From Ramström O, Lehn J-M (2002) Fig. 1, Nat Rev Drug Discov
1:27–36
Fig. 11.10 Superposition of the crystal structures of acetylcholinesterase with a syn
and anti click chemistry reaction product (PDB codes: 1Q83, 1Q84)
Fig. 12.4 Fig. 6 from Lottspeich F (1999) Angew Chem 111:2630–2647; reprinted
with the kind permission of the author and publisher
Fig. 12.5 From an illustration of Fonds der Chemischen Industrie im Verband der
Chemischen Industrie e. V., Mainzer Landstraße 55, 60329 Frankfurt am Main,
Biotechnologie – kleinste Helfer – große Chancen
Fig. 13.2 Crystal packing of the structure with the reference code FUXBIJ
(Cambridge crystallographic database)
Fig. 13.3 Taken from Hargittai I, Hargittai M (1995) Symmetry through the eyes of
a chemist, 2nd edn, Figs. 8–23. Springer, New York, p 363; reprinted with the kind
permission of the author and publisher
Fig. 13.4 Taken from Pohl RW (1983) Einführung in die Physik, 18th edn, vol 1,
Mechanik, Akustik und Wärmelehre, Fig. 380, p 198; reprinted with the kind
permission of the author and publisher
Fig. 13.5 Taken from Glusker JP, Trueblood KN (1972) Crystal structure analysis,
a primer, Fig. 5. Oxford University Press, New York, p 19
Figs. 13.6 and 13.7 From Keller E (1982) Chem unserer Zeit 16:71–88, Figs. 7 and
25; reprinted with the kind permission of the author and publisher
Fig. 13.9 Electron density of the crystal structure of aldose reductase (PDB code:
1US0)
Fig. 13.10 Reprinted with the kind permission of the Siemens company (b), the
author and publisher, taken from Boese R (1989) Chem unserer Zeit 23:77–85,
Fig. 11, (c), (d), (e), Crystal packing of the structure with the reference code
OXACDH06 (Cambridge Crystallographic database; f)
Fig. 13.11 Reprinted with the kind permission of Bruker AXS GmbH (b), Crystal
structure of TNF (c–f), (PDB code: 1TNF)
Fig. 13.14 NMR structure of a domain of the guanine nucleotide exchange factor
(PDB code: 1B64)
Fig. 13.15 From Montgomery JA, Niwas S (1993) Chem Tech 30–37: 34, Fig. 4
Fig. 14.1 Stevens ED (1978) Acta Crystallogr B34:544–551, Fig. 1; reprinted with
the kind permission of the publisher
Fig. 14.2 From Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.7, p 66 and Fig. 2.10,
p 68. MacMillan, New York
Fig. 14.4 From Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.12, p 70 and
Fig. 2.15, p 73. MacMillan, New York
Fig. 14.5 Taken from Lesk A (1991) Protein architecture, Fig. 4.1, part b and c,
Oxford University Press, Oxford; reprinted with the kind permission of the
publisher
Fig. 14.7 Kindly provided by Prof. R. Zimmer, LMU Munich (prepared with the
Molscript program; protein structures with the PDB codes: 1TIM, 4FXN, 1I1B,
3MBA, 2RHE, 2STV, 1UBQ, 1APS, 256B)
Fig. 14.8 From Branden C, Tooze J (1991) Introduction to protein structure,
Fig. 5.2, p 60, Figs. 5.14, 5.15, p 69, Fig. 5.17, p 71, Fig. 5.19, p 72. Garland,
New York, and Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.26, p 82. MacMillan,
New York
Fig. 14.9 Crystal structures of triosephosphate isomerase (PDB code: 1TIM) and
flavodoxin (PDB code: 3FXN)
Fig. 14.10 From Zubay G (1988) Biochemistry, 2nd edn, Fig. 2.12. MacMillan,
New York, p 70, and Illustration of a crystal structure of a Fab fragment with
phosphocholine (PDB code: 2MCP)
Fig. 14.13 Taken from Vyas K, Monahar H, Venkatesan K (1990) J Phys Chem
94:6069–6073, Fig. 1; reprinted with the kind permission of the publisher
Fig. 14.14 From a template, the source is unknown
Fig. 14.15 Taken from Bürgi HB, Dunitz JD (1994) Structure correlation, vol 2,
Fig. 13.24. Wiley, p 585; reprinted with the kind permission of the publisher
Fig. 14.16 Distribution of H-bond donor and acceptor groups around an imidazole
moiety; entry from the IsoStar database. Cambridge Crystallographic Data Centre.
http://www.ccdc.cam.ac.uk/products/csd_system/isostar/
Fig. 14.17 Crystal structures of trypsin (PDB Code: 3PTB) and subtilisin (PDB
code: 1SBC)
Fig. 14.20 Crystal structures of DNA oligonucleotide strands with cisplatin and
daunorubicin (PDB code: 1A2E and 1AL9)
Fig. 15.1 (1994) Discover manual, Part 1, Fig. 3.5, San Diego
Fig. 16.1 Taken from Christen HR, Vögtle F (1992) Organische Chemie, vol I, 2nd
edn, Fig. 2.3, p 71. Otto Salle & Sauerländer
Figure before Chap. 17: Announcement poster from the research group of the
author on the occasion of a conference in 2005 in Rauischholzhausen, Marburg,
Germany
Fig. 17.1 Taken from Mackay MF, Sadek M (1983) Aust J Chem 36:2111–2117,
Fig. 1; reprinted with the kind permission of the publisher
Fig. 17.7 Superimposition of the crystal structure of dihydrofolate reductase with
dihydrofolate and methotrexate (PDB codes: 1DHF, 3DFR)
Fig. 17.9 Taken from Seidel W, Meyer H, Kazda S, Dompert W (1984) Fig. 6. In:
Seydel J (ed) QSAR and strategies in the design of bioactive compounds. Wiley,
pp 366–369; reprinted with the kind permission of the publisher
Fig. 17.10 Segment of the crystal structures of thermolysin with different, bound
molecular probes (cf. Fig. 7.8) superimposed with “hot spots” from a calculation
from DrugScore
Figure before Chap. 22: Announcement poster from the research group of the
author on the occasion of a conference in 2007 in Rauischholzhausen, Marburg,
Germany
Fig. 22.1 From Hopkins AL, Groom CR (2002) Nat Rev Drug Discov 1:727–730
Fig. 22.4 Segment from the crystal structure of a complex of creatinase with
carbamoyl sarcosine (PDB code: 1CHM)
Fig. 23.2 Binding pocket from the crystal structures of trypsin (PDB code: 1PPC),
thrombin (PDB code: 1DWD), factor VIIa (PDB code: 1W7X) and factor Xa (PDB
code: 2P93)
Fig. 23.5 Segment of the crystal structure of the complex of thrombin with
cyclotheonamide A, an inhibitor from the marine sponge Theonella sp. (PDB
code: 1TMB)
Fig. 23.6 Modeled geometry from a crystal structure of thrombin with fibrinopep-
tide (PDB code: 1FPH)
Fig. 23.7 Superposition of the crystal structures of thrombin with fibrinopeptide
(PDB code: 1FPH) and PPACK (PDB code: 1PPB)
Fig. 23.10 Segment from the crystal structure of the complex of thrombin with
NAPAP (PDB code: 1DWD)
Fig. 23.12 Comparison of the crystal structures of NAPAP with trypsin and
thrombin (PDB code: 1PPC and 1DWD)
Fig. 23.17 Segment from the crystal structure of the complex of elastase with
a pyridone-like inhibitor (PDB code: 1EAT)
Fig. 23.18 Segment from the crystal structure of the complex of factor Xa with
rivaroxaban (PDB code: 2W26)
Fig. 23.24 Segment from the crystal structure of the complex of β-lactamase (PDB
code: 1TEM)
Fig. 23.26 Crystal structure of the yeast proteasome with bortezomib (PDB code: 2F16)
Fig. 23.27 Segment from the crystal structure of calpain II with leupeptin (PDB
code: 1TL9)
Fig. 24.3 Crystal structures of the aspartic protease cathepsin D (PDB code: 1LYB),
endothiapepsin (PDB code: 4ER1), HIV protease (PDB code: 5HPV), plasmepsin
(PDB code: 1SME), and renin (PDB code: 4APR)
Fig. 24.8 Superposition of the crystal structure of renin with CGP-38560 (PDB
code: 1RNE) and aliskiren (PDB code: 2V0Z)
Fig. 24.11 Segment from the crystal structure of renin with a piperidine-like
inhibitor (PDB code: 1UTH)
Fig. 24.12 Crystal structure of HIV protease with a peptide substrate (PDB code:
1MT9)
Fig. 24.18 Superposition of the crystal structures of HIV protease with a urea-like
(PDB code: 1HVR) and coumarin-like inhibitor (PDB code: 1UPJ)
Fig. 24.24 Crystal structures of HIV protease with inhibitors with a secondary
amine nitrogen atom (PDB codes: 1XL2, 3BHE, 2PQZ, 3BGB)
Fig. 24.25 Superposition of the ligands in the crystal structure with HIV
protease ritonavir (PDB code: 1HXW), atazanavir (PDB code: 2AQU), darunavir
(PDB code: 1T3R), amprenavir (PDB code: 1HPV), indinavir (PDB code: 1HSG),
nelfinavir (PDB code: 1OHR), saquinavir (PDB code: 1HXB), lopinavir (PDB
code: 2O4S), tipranavir (PDB code: 2O4P), and 24.58 (PDB code: 2QQN)
Fig. 25.2 Segment from the crystal structure of the complex of matrix metalloproteinase
MMP-12 with the cleavage product of the protease reaction (PDB code: 2OXZ)
Fig. 25.5 Segment from the superposed crystal structures of the complexes of
thermolysin with the inhibitor Cbz-GlyP-Leu-Leu (PDB code: 5TMN) and
a cyclized inhibitor (PDB code: 1PE5) derived from it
Fig. 25.6 Segment from the crystal structure of the complex of carboxypeptidase
with benzylsuccinate (PDB code: 1CBX)
Fig. 25.12 Segment from the crystal structure of the complex of lisinopril with
t-ACE (PDB code: 1O86)
Fig. 25.14 Segment from the crystal structure of the complex of fibroblast collage-
nase with Ro 31–4724 (PDB code: 2TCL)
Fig. 25.16 Segment from the crystal structures of the complexes of fibroblast
collagenase with a peptidic (25.49) and a non-peptidic inhibitor (25.50, PDB
codes: 1HFC and 966C)
Fig. 25.17 Segment from the crystal structure of the complex of carbonic anhydrase II
with p-fluorophenylsulfonamide and modeled geometries of a carbonylation in
CA II (PDB code: 1IF4)
Fig. 25.20 Segment from the crystal structure of the complex of phosphodiesterases
5 and sildenafil (PDB code: 1UDT)
Fig. 25.22 Segment from the crystal structure of the complex of peptide
deformylase from Escherichia coli with actinonin (PDB code: 1G2A)
Fig. 26.3 Crystal structure of the cAMP-dependent protein kinase (PDB code:
1L3R)
Fig. 26.4 Modeled geometries of the transition state on the coordinates of the
crystal structure of the cAMP-dependent protein kinase (PDB code: 1L3R)
Fig. 26.7 Crystal structure of the complex of MAP kinase p38 with SB203580 (PDB
code: 1A9U)
Fig. 26.9 Superposition of the inactive and active form of the tyrosine kinase
domains of the human insulin receptor (PDB codes: 1IRK and 1IR3)
Fig. 26.10 From a figure out of Fabian MA et al (2005) Nat Biotechnol 23:329–336;
reprinted with the kind permission of the author and publisher
Fig. 26.12 Superposition of the crystal structures of BCR-ABL protein kinase with
bound imatinib (Gleevec®) and tetrahydrostaurosporine (PDB codes: 2HYY and
2HZ4)
Fig. 26.14 Segments from the crystal structures of Src kinase with
ANP and the mutated Src-kinase with N6-benzyl-ADP (PDB codes: 1KSW
and 2SRC)
Fig. 26.17 Superposition of the crystal structures of Ser/Thr-Kinase PIM-1 with
staurosporine and a ruthenium complex (PDB codes: 1YHS and 2BZH)
Fig. 26.19 Segment from the crystal structure of human tyrosine phosphatase
PTP-1B (PDB code: 1PTY)
Fig. 26.20 Segments from the crystal structures of human tyrosine phosphatase
PTP-1B with different inhibitors (PDB codes: 1PTY, 1NO6, 1NNY, 1N6W)
Fig. 26.22 Segment from the crystal structure of human tyrosine phosphatase
PTP-1B with an allosteric inhibitor (PDB code: 1T4J)
Fig. 26.23 Crystal structure of COMT with a substrate-analogue inhibitor and
S-adenosyl-L-methionine (PDB code: 1VID)
Fig. 26.25 Superposition of the crystal structures of COMT (PDB code: 1VID and
1JR4)
Fig. 26.26 Superposition of the crystal structures of FTase with farnesyl diphos-
phate and the farnesylated tetrapeptide CAAX (PDB codes: 1FT2 and 1D8D)
Fig. 26.28 Superposition of the crystal structures of FTase with BMS-214662 and the
farnesylated tetrapeptide CAAX (PDB codes: 1SA5 and 1D8D)
Fig. 27.3 Segment from the crystal structure of dihydrofolate reductase from
Lactobacillus casei with methotrexate (PDB code: 3DFR)
Fig. 27.4 Segment from the crystal structure of horse liver alcohol dehydrogenase
with bound NADPH (PDB code: 1HET)
Fig. 27.7 Segment from the crystal structure of cytochrome P450 14α-sterol demethylase
(CYP51) from Mycobacterium tuberculosis in complex with
fluconazole 27.6 (PDB code: 1EA1)
Fig. 27.11 Superposition of the binding pocket of Pneumocystis jiroveci and murine
DHFR (PDB codes: 2FZI and 2FZJ)
Fig. 27.14 Segment from the crystal structure of HMG-CoA reductase with bound
HMG-CoA and mevalonic acid (PDB codes: 1DQA, 1DQ9)
Fig. 27.15 Segment from the crystal structure of HMG-CoA reductase with bound
inhibitors simvastatin and atorvastatin (PDB codes: 1HW9, 1HWK)
Fig. 27.19 Segment from four crystal structures from aldose reductase with sorbinil,
tolrestat, IDD594, and 27.46 (PDB codes: 1AH0, 2FZD, 1US0, 2NVD)
Fig. 27.20 Binding pocket from the crystal structure of aldose reductase with
sorbinil (PDB code: 1AH0)
Fig. 27.24 Crystal structure of human 11β-HSD1 with carbenoxolone
superimposed with the complex of murine 11β-HSD1 with bound corticosterone
(PDB codes: 2BEL, 1Y5R)
Fig. 27.25 Segment from the crystal structures of human 11β-HSD1 in complex
with two inhibitors and the complex of murine 11β-HSD1 with bound corticosterone
(PDB codes: 2ILT, 2RBE, 1Y5R)
Fig. 27.27 Crystal structures of human CYP 3A4 uncomplexed and in complex with
metyrapone, erythromycin, and ketoconazole (PDB codes: 1W0E, 1W0G, 2J0D,
2V0M)
Fig. 27.30 From Fig. 3 in Weinshilboum and Wang (2004) Nat Rev Drug Discov
3:739–748
Fig. 27.33 Crystal structures of human MAOB in complex with tranylcypromine
and L-deprenyl (PDB codes: 1OJB and 2BYB)
Fig. 27.34 Crystal structures of human MAOA in complex with clorgyline (PDB
code: 2BXR) and MAOB with L-deprenyl (PDB code: 2BYB)
Fig. 27.37 Superposition of the crystal structures of cyclooxygenase-1 and 2 in
complex with arachidonic acid (PDB codes: 1PRH and 1CVU)
Fig. 27.39 Segment from the crystal structures of cyclooxygenase with arachidonic
acid (PDB code: 1DIY) and prostaglandin PGH2 (PDB code: 1DDX)
Fig. 27.41 Segment from the crystal structures of cyclooxygenase-1 with a bromine
analogue of acetylsalicylic acid (PDB code: 1PTH)
Fig. 27.42 Segment from the crystal structures of cyclooxygenase-2 with a bromine
analogue of celecoxib (PDB code: 6COX)
Fig. 28.2 Crystal structure of the DNA-binding domain of the estrogen receptor
with a bound oligonucleotide strand (PDB code: 1BY4)
Fig. 28.4 Segment from the crystal structures of the ligand-binding domain of the
estrogen receptor with bound estradiol (PDB code: 1ERE)
Fig. 28.5 Segment from the crystal structure of the ligand-binding domain of the
progesterone receptor with bound progesterone (PDB code: 1A28)
Fig. 28.6 Comparison of the crystal structures of the estrogen receptor with bound
estradiol and raloxifen (PDB codes: 2J7X and 1ERR)
Fig. 28.8 Segment from the crystal structure of the ligand-binding domain of the
estrogen receptor with bound estradiol and the LxxLL binding motif (PDB code: 2J7X)
Fig. 28.13 Superposition of the crystal structures of the ligand-binding domain of
the PPARγ receptors with a bound agonist (PDB code: 1K7L) and antagonist (PDB
code: 1KKQ)
Fig. 28.16 Schematic course of the secondary structural elements in the crystal
structures of the estrogen receptors (PDB code: 2J7X) and three examples for the
PXR receptor (PDB codes: 1NRL, 1M13, 1SKX)
Fig. 29.1 Folding patterns as found in the crystal structures of bacteriorhodopsin
(PDB code: 1BRD), bovine rhodopsin (PDB code: 1U19), and the human
β2-adrenergic receptor (PDB code: 2RH1)
Fig. 29.2 Segment of the crystal structure of the human β2-adrenergic receptor
(PDB code: 2RH1)
Fig. 29.4 Superposition of the crystal structures of the β1-adrenergic
receptor with bound cyanopindolol (PDB code: 2VT4) and isoprenaline (PDB
code: 2Y03)
Fig. 29.5 Crystal structures of a mutant of the inactive (PDB code: 1GZM) and
active (PDB code: 2X72) rhodopsin
Fig. 29.13 Crystal structure of the erythropoietin receptor with bound erythropoi-
etin (EPO; PDB code: 1CN4)
Fig. 30.3 Schematic representation of the crystal structure of the bacterial potas-
sium channel KcsA (PDB code: 1K4C)
Fig. 30.4 Segment of the crystal structure of the bacterial potassium channel KcsA
(PDB code: 1K4C)
Fig. 30.5 Segment of the crystal structures of a selective and unselective bacterial
potassium channel (PDB codes: 1K4C, 2AHY)
Fig. 30.9 Crystal structure (electron diffraction) of the nicotinic acetylcholine
receptor in the closed state from the electric organ of an electric ray (PDB code:
2BG9)
Fig. 30.11 Crystal structures of the ligand-binding domain of the nicotinic
acetylcholine receptors from the California sea slug (Aplysia californica) with bound
α-conotoxin (PDB code: 2BYP) and epibatidine (PDB code: 2BYQ)
Fig. 30.12 Segment from the crystal structures of complexes of the ligand-binding
domains of the nicotinic acetylcholine receptor from the California sea slug
(Aplysia californica) with bound α-conotoxin (PDB code: 2BYP),
methyllycaconitine (PDB code: 2BYR), α-lobeline (PDB code: 2BYS), and epibatidine
(PDB code: 2BYQ)
Fig. 30.15 Crystal structure of the bacterial ClC channel from Escherichia coli
(PDB code: 1OTS)
Fig. 30.16 Crystal structure of the porin from the bacterium Rhodobacter capsulatus
(PDB code: 2POR)
Fig. 30.17 Crystal structure of gramicidin A (PDB code: 1GRM)
Fig. 30.18 Model of nonactins based on a crystal structure (CSD Refcode:
NONKSC)
Fig. 30.19 Crystal structure of bovine aquaporin 1 (PDB code: 1J4N)
Fig. 31.3 Segment from the crystal structure of the αIIbβ3 integrin receptor with
a cyclopeptide (PDB code: 1L5G)
Fig. 31.5 Superposition of the segment from the crystal structure of the αIIbβ3
integrin receptor with eptifibatide (PDB code: 1TY6/2VDN) and tirofiban (PDB
code: 1TY5/2VDM)
Fig. 31.6 From a drawing of the research group of Prof. B. Ernst, University of
Basel (http://www.pharma.unibas.ch/molpharm/index.html)
Fig. 31.7 Segment from the crystal structure of sialyl-LewisX and a selectin (PDB
code: 1G1R)
Fig. 31.9 Schematic drawing analogous to Doranz et al (1999) J Virol
12:10346–10358
Figs. 31.10 and 31.11 Segment from the crystal structure of the gp41 protein (PDB
code: 1AIK)
Fig. 31.15 Segments from the crystal structures of neuraminidase with zanamivir
(PDB code: 1A4G) and oseltamivir (PDB code: 2HT8)
Fig. 31.17 Superposition of the crystal structures of the capsid proteins of HRV-14
in complex with pleconaril (PDB code: 1NA1) and the cryoelectron-
microscopically determined complex with domains of the adhesion protein (PDB
code: 1D3I)
Fig. 31.19 Crystal structure of the HRV-14 capsid protein in complex with pleconaril
(PDB code: 1NA1)
Fig. 31.21 Crystal structure of the ternary complex of a MHC I molecule with
a nonapeptide and the T-cell receptor (PDB code: 1BD2)
Fig. 31.23 Modeled complex of 31.42 (brown) and 31.43 (green) based on the
crystal structure of a ternary complex (PDB code: 1AO7); from Douat-Casassus
C et al (2007) J Med Chem 50:1598–1609; coordinates were kindly provided by the
author
Fig. 32.1 Crystal structure of a complete IgG antibody (PDB code: 1IGT)
Figs. 32.2 and 32.3 Comparison of the crystal structures of the Fab domain of an
antibody with phosphocholine (PDB code: 2MCP) and lysozyme (PDB code: 1FBI)
Fig. 32.5 From Fig. 3 in Milenic DE et al (2004) Nat Rev Drug Discov
3:488–498
Fig. 32.6 NMR structure of an oligomeric double strand of RNA and PNA (PDB
code: 176D)
Fig. 32.8 Crystal structure of the HIV reverse transcriptase with a bound
RNA–DNA hybrid double strand (PDB code: 1HYS)
Fig. 32.9 Structure comparison of the crystal structures of HIV reverse transcriptase
with a bound DNA double strand and bound thymidine-5′-triphosphate (PDB code:
1RTD) and AZT (PDB code: 1N5Y)
Fig. 32.10 Superposition of the crystal structures of HIV reverse transcriptase in an
uncomplexed and a nevirapine-complexed state (PDB codes: 1DLO and 1VRT)
Fig. 32.12 Crystal structures of HIV reverse transcriptase with two allosterically
acting triazines (PDB codes: 1S9E and 1S9G)
Fig. 32.13 Crystal structure of topoisomerase IV from Streptococcus pneumoniae
with bound moxifloxacin (PDB code: 3FOF)
Fig. 32.15 Crystal structure of the Tet-repressor with bound DNA–oligonucleotide
(PDB code: 1QPI) and tetracycline (PDB code: 1BJY)
Fig. 32.19 Taken from Hansen JL et al (2002) Mol Cell 10:117–128, Fig. 4;
reprinted with the kind permission of the author and the publisher
Fig. 32.21 Crystallographically determined binding mode of erythromycin and
roxithromycin in the ribosome (PDB codes: 1JZY and 1JZZ)
Fig. 32.22 Figure taken from Harms JM et al (2004) BMC Biol 2:4, Fig. 3; reprinted
with the kind permission of the author and the publisher
Fig. 32.23 Superposition of the crystallographically determined binding modes of
dalfopristin, quinupristin (PDB code: 1SM1), erythromycin (PDB code: 1JZY),
chloramphenicol (PDB code: 1K01), and clindamycin (PDB code: 1JZX)
A
Abraham, Donald, 382
Agre, Peter, 772
Aguilar, Abelardo, 843
Ahlquist, Raymond, 721
Aires, Buenos, 42
Alarich, 42
Aldrich, Thomas Bell, 9
Alex, Richard, 736
Alexander, 42
Amgen, 815, 822
Anderson, E.S., 234
Andromachus, 4
Ariëns, Everhardus J., 101
Arnold, Edward, 832
B
Babbage, Charles, 233
Bajusz, Sándor, 502, 503
Baltimore, David, 828
Banting, Frederick, 816
Bartlett, Paul, 79, 204, 571
Bayer, Adolf v., 9
Beddell, Chris, 430
Bentham, Jeremy, 423
Berger, Arieh, 302
Bernard, Claude, 372
Bernays, Martha, 53
Berney, W., 31
Bertini, Ivano, 566
Besler, Basilius, 2
Best, Charles, 816
Biot, Jean Baptist, 89
Black, James W., 7, 54
Bloch, Felix, 265
Blow, David, 494
Blundell, Tom, 438, 541
Bocola, Marco, 96
Bode, Wolfram, 503, 504
Böhm, Hans-Joachim, 443
Bossenmeyer, Dirk, 210
Böttcher, Jark, 555
Boyer, Herbert, 234, 235
Bragg, William, 316
Bragg, William Henry, 265
Bragg, William Lawrence, 265
Brenner, Sydney, 134
Brodie, 406
Buck, Linda, 735, 736
Bürgi, Hans-Beat, 305
C
Cahn, Arnold, 23
Cahn, R.S., 93
Capecchi, Mario, 245
Capote, Truman, 38
Carson, Rachel, 44
Carter, Paul, 496
Caruso, Enrico, 38
Chain, Ernst Boris, 28
Christie, Agatha, 38
Cinchon, 42
Clement, Bernd, 509
Clinton, Bill, 239
Cobbe, Frances Power, 423
Cohen, Stanley, 234
Corey, 323
Craig, Paul, 157
Cramer, Friedrich, 63
Cramer, Richard, 381
Criciani, Gabriele, 675
Crick, Francis, 316
Crum-Brown, Alexander, 372
Cushman, David, 574, 575, 577
G
Galvani, Luigi, 5
Ganellin, Robin, 403
Gasteiger, Johann, 318
Gates, Marshall, 49
K
Kafka, Franz, 38
Karplus, Martin, 366
Kearsley, Simon, 359
Name Index
L
La Roche, Hoffman, 31
Lacassagne, Antonie, 706
Le Bel, Joseph Achille, 91, 315
Lehn, Jean-Marie, 227, 229
Lemmen, Christian, 360
Li Shizhen, 5
Liebreich, Oskar, 25
Lipinski, Chris, 215, 410
Lippold, Bernard, 399
Lipscomb, William, 565
Loewi, Otto, 9
Long, Crawford W., 25
Loschmidt, Joseph, 14
M
MacKinnon, Roderick, 765
Mally, Josef, 677
Mann, Thomas, 8, 38
Mares-Guia, Marcos, 499
Mariani, Angelo, 52
Marquardt, Fritz, 503
Marshall, Garland, 356
Martin, Yvonne, 731
Meggers, Eric, 618
Mello, Craig, 247
Merrifield, Robert Bruce, 216
Meyer, Emanuel, 460
Meyer, Hans, 677
Meyer, Hans Horst, 373
P
Paracelsus, 5, 420
Pasteur, Louis, 89
Pauling, Linus, 64, 316, 323
Pearlman, Robert, 318
Pemberton, John S., 52
Perkins, William Henry, 25
Perutz, Max, 316
Petzko, Greg, 143
Pincus, Gregory, 708
Popper, Karl, 153
Pravaz, Charles G., 49
Prelog, V., 93
Priestle, John, 542
Purcell, Edward, 265
Q
Quideau, Stéphane, 806
R
Ramakrishnan, Venkatraman, 838
Ramström, O., 229
Rarey, Matthias, 368, 441
Reymond, Jean-Louis, 215
Richet, Charles, 372
Ringe, Dagmar, 143
Ritschel, Tina, 460
Röntgen, Wilhelm, 265
Roosevelt, Theodore D. Jr., 28
T
Takamine, Jokichi, 9
Temin, Howard, 828
Topliss, John, 154, 157
Tschudi, Gilg, 50
Tucholsky, Kurt, 38
Y
Yonath, Ada, 838
Z
Zentgraf, Matthias, 328
Subject Index
Humira®, 822
Humoral, 9
Humoral complement system, 802
Humoral immune response, 837
Humoral system, 816
Humulin®, 235
Humulin insulin, 235
Huperzin A, 115
Hybridoma cells, 819
Hydantoinases, 94
Hydration enthalpies, 749
Hydride ion, 649
Hydrochlorothiazide, 160
Hydrocortisone, 710
Hydrogen-bond acceptor, 69
Hydrogen-bond donor, 68
Hydrogen bond network, 72
Hydrogen bonds (H-bonds), 68, 168
Hydrolases, 474
α/β-Hydrolases, 243
Hydrophobic contact, 73
Hydrophobic interactions, 71
Hydrophobic protein–ligand interaction, 74
Hydrophobic test compounds, 132
Hydroxamic acids, 569
p-Hydroxyacetanilide, 24
4-Hydroxycyclohexanone, 551
Hydroxydebrisoquine, 676
4-Hydroxydebrisoquine, 675
Hydroxylases, 474
Hydroxymethylglutaric acid (HMG), 655
Hydroxymethylglutaryl-coenzyme A (HMG-CoA), 176, 178, 653, 655
  inhibitors, 166
  reductase, 176, 643, 653, 655, 693
  reductase inhibitor, 655
3-Hydroxy-3-methylglutaryl-coenzyme-A reductase (HMG-CoA reductase), 653
p-Hydroxypropiophenone, 406
11β-Hydroxysteroid dehydrogenase (11β-HSD), 643, 665, 666
4-Hydroxy-tamoxifen, 703
Hygiene, 3
Hyperforin, 673, 714–717
Hyperpolarization, 748, 762
Hypertension, 722, 749, 823
Hypertensive crises, 681
Hypervariability loops, 818
Hypnotics, 763
Hypoglycemic coma, 12
Hypoglycemics, 160
Hypotensive effect, 537
Hypothermia, 116
I
Ibritumomab, 822
Ibuprofen, 104, 690, 692, 694
ICAM-1, 799
ICAMS. See Intercellular adhesion molecules (ICAMS)
ICI, 7, 510, 707
ICI 200880, 510, 511
IC50 value, 67
IDD594, 664, 665
Idea generators, 445
I.G. Farbenindustrie, 24
IgG antibody, 817
IH values, 403
IL-2, 837
Iloilo, 843
Iloson®, 844
Image-mirror-image pair, 91
Imatinib, 18, 611–613
Imide bond, 306
Imipenem, 523
Imipramine, 13, 32, 162, 163, 673
Immune reactions, 486, 823
Immune response, 777
Immune response modulation, 688
Immune stimulation, 777, 805, 837
Immune system, 816
Immunoassays, 132
Immunoglobulins, 301, 302
Immunosuppressants, 260, 837
Immunosuppressive, 709
Imperial University in Dorpat, 7
Incretin hormones, 516
Indinavir, 166
Indolylthioureas (ITU), 832
Indometacin, 690, 692
Induced fit, 64, 350
Induced-fit adaptations, 460
Industrialized countries, 3
INF-α, 741
Infantile paralysis, 798
INF-β, 741
Infections, 756
Infectious disease, 450
Infertility, 697, 707
Inflammation, 113, 527
Inflammatory cascade, 488, 784
Inflammatory diseases, 825
Inflammatory mediators, 687
Inflammatory modulation, 590
Inflammatory processes, 783
Infliximab, 18
1918 Influenza, 8