Professional Documents
Culture Documents
In preparing the second edition of this book we have again relied heavily on
and benefited greatly from the advice and constructive criticism of numerous
colleagues. We are particularly grateful to Ken Holmes (Max-Planck Institute,
Heidelberg), Lawrence Stern (MIT), Michelle Arkin (Sunesis Pharmaceuticals),
and Watson Fuller (Keele University, UK) for their contributions to, respec-
tively, Chapters 14 and 18, Chapter 15, Chapter 17 and Chapter 18. Stephen
Harrison (Harvard University) and Paul Sigler (Yale University) provided
extensive help and advice on Chapters 8–10, 13 and 16, and Chapters 8–10
and 13, respectively, for which we are especially grateful.
The following, in alphabetical order, have reviewed one or more chap-
ters, correcting our errors of fact or interpretation and helping to ensure they
have the appropriate balance and emphasis: Tom Alber (University of Cali-
fornia, Berkeley), Tom Blundell (Cambridge University, UK), Stephen Burley
(Rockefeller University), Charles Craik (University of California, San Francis-
co), Ken Dill (University of California, San Francisco), Chris Dobson (Oxford
University, UK), Anthony Fink (Unversity of California, Santa Cruz), Robert
Fletterick (University of California, San Francisco), Richard Henderson (LMB,
Cambridge, UK), Werner Kühlbrandt (MPI, Frankfurt), David Parry (Massey
University, New Zealand), Greg Petsko (Brandeis University), and David
Trentham (NIMR, London, UK).
The book depends for its accessibility upon its illustrations and we are
hugely indebted to Nigel Orme, who, as with the first edition, has converted
sketches into lucid figures. Keith Roberts has again advised us on how best
graphically to represent chemical and structural phenomena. Jane Richard-
son (Duke University) has generously produced the Kinemage supplement to
this edition and the book relies upon Richardson-type diagrams throughout
to render the structures discussed comprehensible. We thank our publishers
Garland Publishing, now part of the Taylor and Francis Group, for their sup-
port, and in particular Matthew Day for his enthusiastic editing of the com-
plete manuscript. Miranda Robertson, in her inimitable style, has again
managed the entire project.
vii
Contents
Part I Basic Structural Principles 1 Domains are built from structural motifs 30
1. The Building Blocks 3 Simple motifs combine to form complex motifs 30
Proteins are polypeptide chains 4 Protein structures can be divided into
The genetic code specifies 20 different amino three main classes 31
acid side chains 4 Conclusion 32
Cysteines can form disulfide bridges 5 Selected readings 33
Peptide units are building blocks of protein
structures 8 3. Alpha-Domain Structures 35
Glycine residues can adopt many different Coiled-coil a helices contain a repetitive
conformations 9 heptad amino acid sequence pattern 35
Certain side-chain conformations are The four-helix bundle is a common domain
energetically favorable 10 structure in a proteins 37
Many proteins contain intrinsic metal atoms 11 Alpha-helical domains are sometimes large
Conclusion 12 and complex 39
Selected readings The globin fold is present in myoglobin and
12
hemoglobin 40
2. Motifs of Protein Structure 13 Geometric considerations determine
The interior of proteins is hydrophobic 14 a-helix packing 40
The alpha (a) helix is an important Ridges of one a helix fit into grooves of an
element of secondary structure 14 adjacent helix 40
The a helix has a dipole moment 16 The globin fold has been preserved during
Some amino acids are preferred in evolution 41
a helices 16 The hydrophobic interior is preserved 42
Beta (b) sheets usually have their b strands Helix movements accommodate interior
either parallel or antiparallel 19 side-chain mutations 43
Loop regions are at the surface of Sickle-cell hemoglobin confers resistance
protein molecules 21 to malaria 43
Schematic pictures of proteins highlight Conclusion 45
secondary structure 22 Selected readings 45
Topology diagrams are useful for classification
of protein structures 23 4. Alpha/Beta Structures 47
Secondary structure elements are connected Parallel b strands are arranged in barrels
to form simple motifs 24 or sheets 47
The hairpin b motif occurs frequently in Alpha/beta barrels occur in many different
protein structures 26 enzymes 48
The Greek key motif is found in antiparallel Branched hydrophobic side chains dominate
b sheets 27 the core of a/b barrels 49
The b-a-b motif contains two parallel Pyruvate kinase contains several domains,
b strands 27 one of which is an a/b barrel 51
Protein molecules are organized in a structural Double barrels have occurred by gene
hierarchy 28 fusion 52
Large polypeptide chains fold into several The active site is formed by loops at one
domains 29 end of the a/b barrel 53
ix
Alpha/beta barrels provide examples of Both single and multiple folding pathways
evolution of new enzyme activities 54 have been observed 93
Leucine-rich motifs form an a/b-horseshoe 55 Enzymes assist formation of proper disulfide
fold bonds during folding 96
Alpha/beta twisted open-sheet structures Isomerization of proline residues can be
contain a helices on both sides of the a rate-limiting step in protein folding 98
b sheet 56 Proteins can fold or unfold inside chaperonins 99
Open b-sheet structures have a variety GroEL is a cylindrical structure with a
of topologies 57 central channel in which newly synthesized
The positions of active sites can be predicted polypeptides bind 100
in a/b structures 57 GroES closes off one end of the GroEL cylinder 102
Tyrosyl-tRNA synthetase has two different The GroEL–GroES complex binds and
domains (a/b + a) 59 releases newly synthesized polypeptides
Carboxypeptidase is an a/b protein with a in an ATP-dependent cycle 102
mixed b sheet 60 The folded state has a flexible structure 104
Arabinose-binding protein has two similar Conformational changes in a protein kinase
a/b domains 62 are important for cell cycle regulation 105
Conclusion 63 Peptide binding to calmodulin induces a
Selected readings 64 large interdomain movement 109
Serpins inhibit serine proteinases with
5. Beta Structures 67 a spring-loaded safety catch mechanism 110
Up-and-down barrels have a simple topology 68 Effector molecules switch allosteric proteins
The retinol-binding protein binds retinol between R and T states 113
inside an up-and-down b barrel 68 X-ray structures explain the allosteric
Amino acid sequence reflects b structure 69 properties of phosphofructokinase 114
The retinol-binding protein belongs to a Conclusion 117
superfamily of protein structures 70 Selected readings 119
Neuraminidase folds into up-and-down b sheets 70
Folding motifs form a propeller-like structure 7. DNA Structures 121
in neuraminidase 71 The DNA double helix is different in A- and
The active site is in the middle of one side of B-DNA 121
the propeller 72 The DNA helix has major and minor grooves 122
Greek key motifs occur frequently in Z-DNA forms a zigzag pattern 123
antiparallel b structures 72 B-DNA is the preferred conformation in vivo 124
The g-crystallin molecule has two domains 74 Specific base sequences can be recognized
The domain structure has a simple topology 74 in B-DNA 124
Two Greek key motifs form the domain 74 Conclusion 125
The two domains have identical topology 75 Selected readings 126
The two domains have similar structures 76
The Greek key motifs in g crystallin are Part 2 Structure, Function, and Engineering 127
evolutionarily related 76 8. DNA Recognition in Procaryotes by
The Greek key motifs can form jelly roll barrels 77 Helix-Turn-Helix Motifs 129
The jelly roll motif is wrapped around a barrel 77 A molecular mechanism for gene control 129
The jelly roll barrel is usually divided into Repressor and Cro proteins operate a procaryotic
two sheets 78 genetic switch region 130
The functional hemagglutinin subunit has two The x-ray structure of the complete lambda
polypeptide chains 79 Cro protein is known 131
The subunit structure is divided into a stem The x-ray structure of the DNA-binding
and a tip 79 domain of the lambda repressor is known 132
The receptor binding site is formed by the Both lambda Cro and repressor proteins
jelly roll domain 80 have a specific DNA-binding motif 133
Hemagglutinin acts as a membrane fusogen 80 Model building predicts Cro–DNA interactions 134
The structure of hemagglutinin is affected Genetic studies agree with the structural model 135
by pH changes 81 The x-ray structure of DNA complexes with
Parallel b-helix domains have a novel fold 84 434 Cro and repressor revealed novel
Conclusion 85 features of protein–DNA interactions 136
Selected readings 87 The structures of 434 Cro and the 434
repressor DNA-binding domain are very
6. Folding and Flexibility 89 similar 137
Globular proteins are only marginally stable 90 The proteins impose precise distortions on
Kinetic factors are important for folding 91 the B-DNA in the complexes 138
Molten globules are intermediates in folding 92 Sequence-specific protein–DNA interactions
Burying hydrophobic side chains is a key event 93 recognize operator regions 138
x
Protein–DNA backbone interactions determine The finger region of the classic zinc finger
DNA conformation 139 motif interacts with DNA 178
Conformational changes of DNA are Two zinc-containing motifs in the
important for differential binding of repressor glucocorticoid receptor form one
and Cro to different operator sites 140 DNA-binding domain 181
The essence of phage repressor and Cro 141 A dimer of the glucocorticoid receptor binds
DNA binding is regulated by allosteric control 142 to DNA 183
The trp repressor forms a helix-turn-helix motif 142 An a helix in the first zinc motif provides
A conformational change operates a the specific protein–DNA interactions 184
functional switch 142 Three residues in the recognition helix provide
Lac repressor binds to both the major and minor the sequence-specific interactions with DNA 184
grooves inducing a sharp bend in the DNA 143 The retinoid X receptor forms heterodimers
CAP-induced DNA bending could activate that recognize tandem repeats with
transcription 146 variable spacings 185
Conclusion 147 Yeast transcription factor GAL4 contains
Selected readings 148 a binuclear zinc cluster in its DNA-binding
domain 187
9. DNA Recognition by Eucaryotic The zinc cluster regions of GAL4 bind at the
Transcription Factors 151 two ends of the enhancer element 188
Transcription is activated by protein–protein The linker region also contributes to DNA
interactions 152 binding 189
The TATA box-binding protein is ubiquitous 153 DNA-binding site specificity among the C6-zinc
The three-dimensional structures of cluster family of transcription factors is
TBP–TATA box complexes are known 154 achieved by the linker regions 190
A b sheet in TBP forms the DNA-binding site 154 Families of zinc-containing transcription
TBP binds in the minor groove and induces factors bind to DNA in several different ways 191
large structural changes in DNA 155 Leucine zippers provide dimerization
The interaction area between TBP and the interactions for some eucaryotic
TATA box is mainly hydrophobic 157 transcription factors 191
Functional implications of the distortion of The GCN4 basic region leucine zipper binds
DNA by TBP 158 DNA as a dimer of two uninterrupted
159 a helices 193
TFIIA and TFIIB bind to both TBP and DNA
Homeodomain proteins are involved in the GCN4 binds to DNA with both specific and
nonspecific contacts 194
development of many eucaryotic organisms 159
Monomers of homeodomain proteins bind The HLH motif is involved in homodimer
and heterodimer associations 196
to DNA through a helix-turn-helix motif 160
In vivo specificity of homeodomain The a-helical basic region of the b/HLH
motif binds in the major groove of DNA 198
transcription factors depends on interactions
The b/HLH/zip family of transcription factors
with other proteins 162
have both HLH and leucine zipper
POU regions bind to DNA by two tandemly
dimerization motifs 199
oriented helix-turn-helix motifs 164
Max and MyoD recognize the DNA HLH
Much remains to be learnt about the function of
consensus sequence by different specific
homeodomains in vivo 166
protein–DNA interactions 201
Understanding tumorigenic mutations 166 201
Conclusion
The monomeric p53 polypeptide chain is Selected readings 203
divided in three domains 167
The oligomerization domain forms tetramers 167 11. An Example of Enzyme Catalysis:
The DNA-binding domain of p53 is an Serine Proteinases 205
antiparallel b barrel 168 Proteinases form four functional families 205
Two loop regions and one a helix of p53 bind The catalytic properties of enzymes are
to DNA 169 reflected in Km and kcat values 206
Tumorigenic mutations occur mainly in three Enzymes decrease the activation energy of
regions involved in DNA binding 170 chemical reactions 206
Conclusions 172 Serine proteinases cleave peptide bonds by
Selected readings 172 forming tetrahedral transition states 208
Four important structural features are required
10. Specific Transcription Factors Belong 175 for the catalytic action of serine proteinases 209
to a Few Families Convergent evolution has produced two
Several different groups of zinc-containing 176 different serine proteinases with similar
motifs have been observed catalytic mechanisms 210
The classic zinc fingers bind to DNA in tandem 177 The chymotrypsin structure has two antiparallel
along the major groove b-barrel domains 210
xi
The active site is formed by two loop regions Transmembrane a helices can be predicted
from each domain 211 from amino acid sequences 244
Did the chymotrypsin molecule evolve by Hydrophobicity scales measure the degree
gene duplication? 212 of hydrophobicity of different amino acid
Different side chains in the substrate side chains 245
specificity pocket confer preferential Hydropathy plots identify transmembrane
cleavage 212 helices 245
Engineered mutations in the substrate Reaction center hydropathy plots agree with
specificity pocket change the rate of catalysis 213 crystal structural data 246
The Asp 189-Lys mutation in trypsin causes Membrane lipids have no specific interaction
unexpected changes in substrate specificity 215 with protein transmembrane a helices 246
The structure of the serine proteinase subtilisin Conclusion 247
is of the a/b type 215 Selected readings 248
The active sites of subtilisin and chymotrypsin
are similar 216 13. Signal Transduction 251
A structural anomaly in subtilisin has G proteins are molecular amplifiers 252
functional consequences 217 Ras proteins and the catalytic domain
Transition-state stabilization in subtilisin is of Ga have similar three-dimensional 254
dissected by protein engineering 217 structures
Catalysis occurs without a catalytic triad 217 Ga is activated by conformational changes of
Substrate molecules provide catalytic groups three switch regions 257
in substrate-assisted catalysis 218 GTPases hydrolyze GTP through nucleophilic
Conclusion 219 attack by a water molecule 259
Selected readings 220 The Gb subunit has a seven-blade propeller fold,
built up from seven WD repeat units 261
12. Membrane Proteins 223 The GTPase domain of Ga binds to Gb in the
Membrane proteins are difficult to crystallize 224 heterotrimeric Gabg complex 263
Novel crystallization methods are being Phosducin regulates light adaptation in
developed 224 265
retinal rods
Two-dimensional crystals of membrane Phosducin binding to Gbg blocks binding of Ga 265
proteins can be studied by electron
The human growth hormone induces
microscopy 225
dimerization of its cognate receptor 267
Bacteriorhodopsin contains seven
Dimerization of the growth hormone receptor
transmembrane a helices 226
is a sequential process 268
Bacteriorhodopsin is a light-driven proton
The growth hormone also binds to the
pump 227
prolactin receptor 269
Porins form transmembrane channels by
Tyrosine kinase receptors are important
b strands 228
enzyme-linked receptors 270
Porin channels are made by up and down
Small protein modules form adaptors for a
b barrels 229
signaling network 272
Each porin molecule has three channels 230
SH2 domains bind to phosphotyrosine-
Ion channels combine ion selectivity with
containing regions of target molecules 273
high levels of ion conductance 232
SH3 domains bind to proline-rich regions of
The K+ channel is a tetrameric molecule with target molecules 274
one ion pore in the interface between the Src tyrosine kinases comprise SH2 and SH3
four subunits 232
domains in addition to a tyrosine kinase 275
The ion pore has a narrow ion selectivity filter 233
The two domains of the kinase in the
The bacterial photosynthetic reaction center inactive state are held in a closed
is built up from four different polypeptide conformation by assembly of the
chains and many pigments 234
regulatory domains 277
The L, M, and H subunits have transmembrane Conclusion 278
a helices 236
Selected readings 280
The photosynthetic pigments are bound to the
L and M subunits 237
14. Fibrous Proteins 283
Reaction centers convert light energy into Collagen is a superhelix formed by three
electrical energy by electron flow through parallel, very extended left-handed helices 284
the membrane 239
Coiled coils are frequently used to form
Antenna pigment proteins assemble into oligomers of fibrous and globular proteins 286
multimeric light-harvesting particles 240
Amyloid fibrils are suggested to be built up
Chlorophyll molecules form circular rings from continuous b sheet helices 288
in the light-harvesting complex LH2 241
Spider silk is nature’s high-performance fiber 289
The reaction center is surrounded by a ring of 16 Muscle fibers contain myosin and actin which
antenna proteins of the light-harvesting slide against each other during muscle
complex LH1 242
contraction 290
xii
Myosin heads form cross-bridges between the Complex spherical viruses have more than one
actin and myosin filaments 291 polypeptide chain in the asymmetric unit 329
Time-resolved x-ray diffraction of frog muscle Structural versatility gives quasi-equivalent
confirmed movement of the cross-bridges 292 packing in T = 3 plant viruses 331
Structures of actin and myosin have been The protein subunits recognize specific parts
determined 293 of the RNA inside the shell 332
The structure of myosin supports the swinging The protein capsid of picornaviruses contains
cross-bridge hypothesis 295 four polypeptide chains 333
The role of ATP in muscular contraction has There are four different structural proteins in
parallels to the role of GTP in G-protein picornaviruses 334
activation 296 The arrangement of subunits in the shell of
Conclusion 297 picornaviruses is similar to that of T = 3
Selected readings 298 plant viruses 334
The coat proteins of many different spherical
15. Recognition of Foreign Molecules by the plant and animal viruses have similar
Immune System 299 jelly roll barrel structures, indicating an
The polypeptide chains of antibodies are evolutionary relationship 335
divided into domains 300 Drugs against the common cold may be
Antibody diversity is generated by several designed from the structure of rhinovirus 337
different mechanisms 302 Bacteriophage MS2 has a different subunit
All immunoglobulin domains have similar structure 339
three-dimensional structures 303 A dimer of MS2 subunits recognizes an RNA
The immunoglobulin fold is best described as packaging signal 339
two antiparallel b sheets packed tightly The core protein of alphavirus has a
against each other 304 chymotrypsin-like fold 340
The hypervariable regions are clustered SV40 and polyomavirus shells are constructed
in loop regions at one end of the from pentamers of the major coat protein
variable domain 305 with nonequivalent packing but largely
The antigen-binding site is formed by close equivalent interactions 341
association of the hypervariable regions Conclusion 343
from both heavy and light chains 306 Selected readings 344
The antigen-binding site binds haptens in
crevices and protein antigens on large 17. Prediction, Engineering, and Design of
flat surfaces 308 Protein Structures 347
The CDR loops assume only a limited range Homologous proteins have similar structure
of conformations, except for the heavy and function 348
chain CDR3 311 Homologous proteins have conserved structural
An IgG molecule has several degrees of cores and variable loop regions 349
conformational flexibility 312 Knowledge of secondary structure is necessary
Structures of MHC molecules have provided for prediction of tertiary structure 350
insights into the molecular mechanisms Prediction methods for secondary structure
of T-cell activation 312 benefit from multiple alignment of
MHC molecules are composed of antigen- homologous proteins 351
binding and immunoglobulin-like domains 313 Many different amino acid sequences give
Recognition of antigen is different in MHC similar three-dimensional structures 352
molecules compared with immunoglobulins 314 Prediction of protein structure from sequence
Peptides are bound differently by class I and is an unsolved problem 352
class II MHC molecules 315 Threading methods can assign amino acid
T-cell receptors have variable and constant sequences to known three-dimensional folds 353
immunoglobulin domains and Proteins can be made more stable by
hypervariable regions 316 engineering 354
MHC–peptide complexes are the ligands for Disulfide bridges increase protein stability 355
T-cell receptors 318 Glycine and proline have opposite effects on
Many cell-surface receptors contain stability 356
immunoglobulin-like domains. 318 Stabilizing the dipoles of a helices increases
Conclusion 320 stability 357
Selected readings 321 Mutants that fill cavities in hydrophobic cores
do not stabilize T4 lysozyme 358
16. The Structure of Spherical Viruses 325 Proteins can be engineered by combinatorial
The protein shells of spherical viruses have methods 358
icosahedral symmetry 327 Phage display links the protein library
The icosahedron has high symmetry 327 to DNA 359
The simplest virus has a shell of 60 protein Affinity and specificity of proteinase inhibitors
subunits 328 can be optimized by phage display 361
xiii
Structural scaffolds can be reduced in size Building a model involves subjective
while function is retained 363 interpretation of the data 381
Phage display of random peptide libraries Errors in the initial model are removed
identified agonists of erythropoetin receptor 364 by refinement 383
DNA shuffling allows accelerated evolution Recent technological advances have greatly
of genes 365 influenced protein crystallography 383
Protein structures can be designed from first X-ray diffraction can be used to study the
principles 367 structure of fibers as well as crystals 384
A b structure has been converted to an a structure The structure of biopolymers can be studied
by changing only half of the sequence 368 using fiber diffraction 386
Conclusion 370 NMR methods use the magnetic properties of
Selected readings 371 atomic nuclei 387
Two-dimensional NMR spectra of proteins are
18. Determination of Protein Structures 373 interpreted by the method of sequential
Several different techniques are used to study assignment 389
the structure of protein molecules 373 Distance constraints are used to derive possible
Protein crystals are difficult to grow 374 structures of a protein molecule 390
X-ray sources are either monochromatic or Biochemical studies and molecular
polychromatic 376 structure give complementary functional
X-ray data are recorded either on image plates information 391
or by electronic detectors 377 Conclusion 391
The rules for diffraction are given by Bragg’s law 378 Selected readings 392
Phase determination is the major
crystallographic problem 379 Protein Structure on the World Wide Web 393
Phase information can also be obtained by
Multiwavelength Anomalous Diffraction
experiments 381
xiv
Basic Part 1
Structural
Principles
1
This page is intentionally left blank.
The Building Blocks
1
Recombinant DNA techniques have provided tools for the rapid determination
of DNA sequences and, by inference, the amino acid sequences of proteins
from structural genes. The number of such sequences is now increasing
almost exponentially, but by themselves these sequences tell little more
about the biology of the system than a New York City telephone directory
tells about the function and marvels of that city.
The proteins we observe in nature have evolved, through selective pres-
sure, to perform specific functions. The functional properties of proteins Figure 1.1 The amino acid sequence of a
depend upon their three-dimensional structures. The three-dimensional protein’s polypeptide chain is called its
primary structure. Different regions of the
structure arises because particular sequences of amino acids in polypeptide
sequence form local regular secondary
chains fold to generate, from linear chains, compact domains with specific structures, such as alpha (a) helices or beta (b)
three-dimensional structures (Figure 1.1). The folded domains can serve as strands. The tertiary structure is formed by
modules for building up large assemblies such as virus particles or muscle packing such structural elements into one or
fibers, or they can provide specific catalytic or binding sites, as found in several compact globular units called domains.
enzymes or proteins that carry oxygen or that regulate the function of DNA. The final protein may contain several
polypeptide chains arranged in a quaternary
To understand the biological function of proteins we would therefore
structure. By formation of such tertiary and
like to be able to deduce or predict the three-dimensional structure from the quaternary structure amino acids far apart in
amino acid sequence. This we cannot do. In spite of considerable efforts over the sequence are brought close together in
the past 25 years, this folding problem is still unsolved and remains one of three dimensions to form a functional region,
the most basic intellectual challenges in molecular biology. an active site.
F T P A
V C C
L F A H C
D
K F L A N N
S N N
V T S V N
L
C
T S K Y C
3
Protein folding remains a problem because there are 20 different amino
acids that can be combined into many more different proteins than there are
atoms in the known universe. In addition there is a vast number of ways in
which similar structural domains can be generated in proteins by different
amino acid sequences. By contrast, the structure of DNA, made up of only
four different nucleotide building blocks that occur in two pairs, is relatively
simple, regular, and predictable.
Since the three-dimensional structures of individual proteins cannot be
predicted, they must instead be determined experimentally by x-ray crystal-
lography, electron crystallography or nuclear magnetic resonance (NMR)
techniques. Over the past 30 years the structures of more than 6000 proteins
have been solved, and the sequences of more than 500,000 have been deter-
mined. This has generated a body of information from which a set of basic
principles of protein structure has emerged. These principles make it easier
for us to understand how protein structure is generated, to identify common
structural themes, to relate structure to function, and to see fundamental
relationships between different proteins. The science of protein structure is at
the stage of taxonomy where we can begin to discern patterns and motifs
among the relatively small number of proteins whose three-dimensional
structure is known. Figure 1.2 Proteins are built up by amino
The first six chapters of this book deal with the basic principles of pro- acids that are linked by peptide bonds to form
tein structure as we understand them today, and examples of the different a polypeptide chain. (a) Schematic diagram of
an amino acid, illustrating the nomenclature
major classes of protein structures are presented. Chapter 7 contains a brief
used in this book. A central carbon atom (Ca)
discussion on DNA structures with emphasis on recognition by proteins of is attached to an amino group (NH2), a
specific nucleotide sequences. The remaining chapters illustrate how during carboxyl group (COOH), a hydrogen atom (H),
evolution different structural solutions have been selected to fulfill particular and a side chain (R). (b) In a polypeptide chain
functions. the carboxyl group of amino acid n has formed
a peptide bond, C–N, to the amino group
of amino acid n + 1. One water molecule is
Proteins are polypeptide chains eliminated in this process. The repeating units,
which are called residues, are divided into
All of the 20 amino acids have in common a central carbon atom (Ca) to main-chain atoms and side chains. The
which are attached a hydrogen atom, an amino group (NH2), and a carboxyl main-chain part, which is identical in all
group (COOH) (Figure 1.2a). What distinguishes one amino acid from residues, contains a central Ca atom attached
another is the side chain attached to the Ca through its fourth valence. There to an NH group, a C¢=O group, and an H atom.
The side chain R, which is different for
are 20 different side chains specified by the genetic code; others occur, in rare
different residues, is bound to the Ca atom.
cases, as the products of enzymatic modifications after translation.
Amino acids are joined end-to-end during protein synthesis by the for-
(a) side chain
mation of peptide bonds when the carboxyl group of one amino acid con-
denses with the amino group of the next to eliminate water (Figure 1.2b).
R
This process is repeated as the chain elongates. One consequence is that the
amino group of the first amino acid of a polypeptide chain and the carboxyl
group of the last amino acid remain intact, and the chain is said to extend H OH
Cα
from its amino terminus to its carboxy terminus. The formation of a succes- N C¢
sion of peptide bonds generates a “main chain,” or “backbone,” from which
project the various side chains. H
H O
The main-chain atoms are a carbon atom Ca to which the side chain is
attached, an NH group bound to Ca, and a carbonyl group C¢=O, where the amino group carboxyl group
carbon atom C¢ is attached to Ca. These units, or residues, are linked into a
polypeptide by a peptide bond between the C¢ atom of one residue and the
nitrogen atom of the next (see Figure 1.2b). The basic repeating unit along (b) peptide bond
the main chain from a biochemical or genetic viewpoint is thus
(NH–CaH–C¢=O), which is the residue of the common parts of amino acids
after peptide bonds have been formed (see Figure 1.2b).
R1 H H O
4
The amino acids are usually divided into three different classes defined R R
by the chemical nature of the side chain. The first class comprises those with
strictly hydrophobic side chains: Ala (A), Val (V), Leu (L), Ile (I), Phe (F), Pro
(P), and Met (M). The four charged residues, Asp (D), Glu (E), Lys (K), and Arg Cα Cα
(R), form the second class. The third class comprises those with polar side CO H N N H CO
chains: Ser (S), Thr (T), Cys (C), Asn (N), Gln (Q), His (H), Tyr (Y), and Trp L-form D-form
(W). The amino acid glycine (G), which has only a hydrogen atom as a side
chain and so is the simplest of the 20 amino acids, has special properties and Figure 1.3 The “handedness” of amino acids.
is usually considered either to form a fourth class or to belong to the first Looking down the H–Ca bond from the
hydrogen atom, the L-form has CO, R, and
class.
N substituents from Ca going in a clockwise
The four groups attached to the Ca atom are chemically different for all direction. There is a mnemonic to remember
the amino acids except glycine, where two H atoms bind to Ca. All amino this; for the L-form the groups read CORN in
acids except glycine are thus chiral molecules that can exist in two different clockwise direction.
forms with different “hands,” L- or D-form (Figure 1.3).
Biological systems depend on specific detailed recognition of molecules
that distinguish between chiral forms. The translation machinery for protein
synthesis has evolved to utilize only one of the chiral forms of amino acids,
the L-form. All amino acids that occur in proteins therefore have the L-form.
There is, however, no obvious reason why the L-form was chosen during evo-
lution and not the D-form.
cysteine cysteine
S
H
+
H
S
oxidation reduction
5
(a) Hydrophobic amino acids
H
C
HC CH
CH3 H 3C CH3 HC CH
– CH C
H3N+ C COO
CH2
H
A Ala, Alanine V Val, Valine F Phe, Phenylalanine
H3C CH3
CH3
CH
H2C CH3
CH CH 2
OH HO CH3
CH
CH2 CH2
O NH2
O NH2 C
SH C
CH2
CH2 CH2
CH2
6
Panel 1.1 The 20 different amino acids that
CH3 occur in proteins. Only side chains are shown,
except for the first amino acid, alanine, where
S all atoms are shown. The bond from the side
chain to Ca is red. A ball-and-stick model, the
CH2 CH2 CH2 chemical formula, the full name, and the
three-letter and one-letter codes are given for
CH2 C αH CH2 each amino acid.
N There are some easy ways of remembering
the one-letter code for amino acids. If only one
amino acid begins with a certain letter, that
P Pro, Proline M Met, Methionine letter is used:
C = Cys = Cysteine
H = His = Histidine
I = Ile = Isoleucine
M = Met = Methionine
S = Ser = Serine
V = Val = Valine
A = Ala = Alanine
G = Gly = Glycine
L = Leu = Leucine
NH2 P = Pro = Proline
+ T = Thr = Threonine
C=NH2
Some of the others are phonetically suggestive:
NH
F = Phe = Phenylalanine (“Fenylalanine”)
CH 2 R = Arg = Arginine (“aRginine”)
Y = Tyr = Tyrosine (“tYrosine”)
CH 2 W = Trp = Tryptophan (double ring in
the molecule)
CH 2
R Arg, Arginine In other cases a letter close to the initial is
used. Amides have letters from the middle of
the alphabet. The smaller molecules (D, N, B)
are earlier in the alphabet than the larger ones
(E, Q, Z).
H His, Histidine
H (d) Glycine
N
H
CH2
G Gly, Glycine
W Trp, Tryptophan
7
Figure 1.5 Part of a polypeptide chain that
O R2 H is divided into peptide units, represented
as blocks in the diagram. Each peptide unit
H C¢ Cα H contains the Ca atom and the C¢=O group of
N
N residue n as well as the NH group and the
Cα C¢ Ca atom of residue n + 1. Each such unit is a
H Cα
planar, rigid group with known bond distances
H O and bond angles. R1, R2, and R3 are the side
R1 R3 chains attached to the Ca atoms that link the
peptide units in the polypeptide chain. The
peptide bond
peptide group is planar because the additional
electron pair of the C=O bond is delocalized
over the peptide group such that rotation
around the C–N bond is prevented by an
Disulfide bridges stabilize three-dimensional structure. In some proteins energy barrier.
these bridges hold together different polypeptide chains; for example, the A
and B chains of insulin are linked by two disulfide bridges between the
chains. More frequently intramolecular disulfide bridges stabilize the folding
of a single polypeptide chain, making the protein less susceptible to degra-
dation. There are many examples of this, including proteins with short
polypeptide chains, such as snake venom toxins and protease inhibitors, that
need additional stabilizing factors to produce a stable fold. Much effort is
currently spent on introducing extra intramolecular disulfide bridges into
enzymes by site-directed mutagenesis in order to make them more thermo-
stable and hence more useful for industrial applications as catalysts, as
described in Chapter 17.
8
Glycine residues can adopt many different conformations
Most combinations of f and y angles for an amino acid are not allowed
because of steric collisions between the side chains and main chain. It is rea-
sonably straightforward to calculate those combinations that are allowed.
Since the D- and L-forms of the amino acids have their side chain oriented
differently with respect to the CO group, they have different allowed f and
y angles. Proteins built from D-amino acids would thus be expected to have
different conformations from those found in nature that are exclusively
made of L-amino acids. Since the L- and D-forms of each amino acid are mir-
ror images of one another, would a protein made exclusively of D-form
residues produce a structure that is the mirror image of the natural protein?
Stephen Kent and his colleagues at the Scripps Institute chemically synthe-
sized both the L- and the D-forms of HIV-1 protease. The D-enzyme proved
indeed to be the mirror image of the L-enzyme. Furthermore the D-enzyme
and L-enzyme had reciprocal chiral specificity on peptide substrates, the
D-enzyme only recognizing and cutting peptides made of D-amino acids. Per-
haps the choice of the L-form at the outset of the evolution of life on earth
was random and irrevocable.
The angle pairs f and y are usually plotted against each other in a
diagram called a Ramachandran plot after the Indian biophysicist
G.N. Ramachandran who first made calculations of sterically allowed regions.
Figure 1.7 shows the results of such calculations and also a plot for all amino
Figure 1.7 Ramachandran plots showing
(a) allowed combinations of the conformational
angles phi and psi defined in Figure 1.6. Since
+180 phi (f) and psi (y) refer to rotations of two
rigid peptide units around the same Ca atom,
β most combinations produce steric collisions
either between atoms in different peptide
groups or between a peptide unit and the side
L chain attached to Ca. These combinations are
therefore not allowed. (a) Colored areas show
sterically allowed regions. The areas labeled
psi 0 a, b, and L correspond approximately to
conformational angles found for the usual
α
right-handed a helices, b strands, and
left-handed a helices, respectively. (b)
Observed values for all residue types except
glycine. Each point represents f and y values
for an amino acid residue in a well-refined
–180 x-ray structure to high resolution. (c) Observed
values for glycine. Notice that the values
–180 0 phi +180 include combinations of f and y that are not
allowed for other amino acids. (From J.
Richardson, Adv. Prot. Chem. 34: 174–175,
1981.)
(b) (c)
+180 +180
psi 0 psi 0
–180 –180
–180 0 phi +180 –180 0 phi +180
9
acids except glycine from a number of accurately determined protein struc-
tures. It is apparent that the observed values are clustered in the sterically
allowed regions. There is one important exception. Glycine, with only a
hydrogen atom as a side chain, can adopt a much wider range of conforma-
tions than the other residues, as seen in Figure 1.7c. Glycine thus plays a
structurally very important role; it allows unusual main-chain conformations
in proteins. This is one of the main reasons why a high proportion of glycine
residues are conserved among homologous protein sequences.
Regions in the Ramachandran plot are named after the conformation
that results in a peptide if the corresponding f and y angles are repeated in
successive amino acids along the chain. The major allowed regions in Figure
1.7a are the right-handed a-helical cluster in the lower left quadrant (see
Chapter 3); the broad region of extended b strands of both parallel and
antiparallel b structures (see Chapter 4) in the upper left quadrant; and the
small, sparsely populated left-handed a-helical region in the upper right
quadrant. Left-handed a helices are usually found in loop regions or in small
single-turn a helices.
10
(a) (b)
Tyr (free radical)
Cys
coenzyme S Cys
O O
Glu S
μ-oxo bridge
H2O H2O O Zn
O O O
O 2–
Glu His
Fe Fe N
Asp O
O –
O O N
N N alcohol
conformations. These are called rotamers. Today, collections of these favored Figure 1.9 Examples of functionally important
conformations, or rotamer libraries, are a standard tool in computer programs intrinsic metal atoms in proteins. (a) The
used for modeling protein structures. di-iron center of the enzyme ribonucleotide
reductase. Two iron atoms form a redox center
that produces a free radical in a nearby tyrosine
Many proteins contain intrinsic metal atoms side chain. The iron atoms are bridged by a
glutamic acid residue and a negatively charged
The side chains of the 20 different amino acids listed in Panel 1.1 (pp. 6–7) oxygen atom called a m-oxo bridge. The
have very different chemical properties and are utilized for a wide variety of coordination of the iron atoms is completed
by histidine, aspartic acid, and glutamic acid
biological functions. However, their chemical versatility is not unlimited,
side chains as well as water molecules. (b) The
and for some functions metal atoms are more suitable and more efficient. catalytically active zinc atom in the enzyme
Electron-transfer reactions are an important example. Fortunately the side alcohol dehydrogenase. The zinc atom is
chains of histidine, cysteine, aspartic acid, and glutamic acid are excellent coordinated to the protein by one histidine
metal ligands, and a fairly large number of proteins have recruited metal and two cysteine side chains. During catalysis
atoms as intrinsic parts of their structures; among the frequently used metals zinc binds an alcohol molecule in a suitable
are iron, zinc, magnesium, and calcium. Several metallo proteins are dis- position for hydride transfer to the coenzyme
moiety, a nicotinamide. [(a) Adapted from
cussed in detail in later chapters and it suffices here to mention briefly a few
P. Nordlund et al., Nature 345: 593–598, 1990.]
examples of iron and zinc proteins.
The most conspicuous use of iron in biological systems is in our blood,
where the erythrocytes are filled with the oxygen-binding protein hemoglo-
bin. The red color of blood is due to the iron atom bound to the heme group
in hemoglobin. Similar heme-bound iron atoms are present in a number
of proteins involved in electron-transfer reactions, notably cytochromes. A
chemically more sophisticated use of iron is found in an enzyme, ribo-
nucleotide reductase, that catalyzes the conversion of ribonucleotides to
deoxyribonucleotides, an important step in the synthesis of the building
blocks of DNA.
Ribonucleotide reductase from Escherichia coli and mammals contains a
di-iron center (Figure 1.9a) that reacts with oxygen and oxidizes a nearby
tyrosine side chain, producing a tyrosyl free radical that is essential for the
catalysis. The two iron atoms in this iron center are close to each other and
bridged by the oxygen atoms of a glutamic acid side chain as well as by an
oxygen ion called a m-oxo bridge. Glutamic acid, aspartic acid, and histidine
side chains from the protein, as well as water molecules, complete the co-
ordination sphere of the octahedrally coordinated iron atom.
Zinc is used to stabilize the DNA-binding regions of a class of transcrip-
tion factors called zinc fingers, which are discussed in Chapter 10. Zinc ions
also participate directly in catalytic reactions in many different enzymes by
binding substrate molecules and providing a positive charge that influences
the electronic arrangement of the substrate and thereby facilitates the cat-
alytic reaction. One such example is found in the enzyme alcohol dehydro-
genase, which in yeast produces alcohol during fermentation and in our
livers detoxifies the alcohol we have consumed by oxidizing it. The enzyme
provides a scaffold containing three zinc ligands—one histidine and two cys-
teine side chains—which sequester a zinc atom so that it binds alcohol as a
fourth ligand in a tetrahedral coordination (Figure 1.9b).
11
Another random document with
no related content on Scribd:
Eräs vieras mies — kongolainen — hiipi yöllä hänen majaansa
teroitettu partaveitsi mukanaan.
Tambeli oli rikas mies, sillä hänellä oli vuohia ja putkia ja suolaa
säkeittäin. Hänellä oli kuusi vaimoa, jotka hoitivat hänen
puutarhaansa ja keittivät hänen ruokansa, ja he olivat ylpeitä
herrastaan ja upeilivat hänen hämäräperäisten keinottelujensa
avulla. Sillä Tambeli oli kauppias, vaikka harvat tiesivät sitä, ja hänen
tapanaan oli olla poissa kolme kuukautta vuodesta omilla asioillaan.
*****
Tambeli epäröi.
Kun kersantin ote tiukkeni, niin vahva Tambeli tarttui häntä puvun
liepukoihin ja heitti hänet menemään; sitten hän seisoi jäykkänä, sillä
Sandersin revolverin lämmin suu oli painettu hänen vatsaansa, ja
Tambeli, joka oli oman väitteensä mukaan kulkenut paljon ja nähnyt
monta, tiesi Sandersin mieheksi, joka kunnioitti ihmishenkeä vähän,
tuskin ensinkään.
Ei ollut, niin kuin mies oli sanonutkin, mitään lakia, joka kieltäisi
väkevien juomien myynnin hänen alaisellaan alueella, mutta
Sanders ei välittänyt perustuslaista. Hän oli pitänyt maansa vapaana
viinan kirouksesta, eikä hänellä ollut aikomusta lisätä vastuutaan.
Hän oli aseman ainoa valkoihoinen mies ja oli kiitollinen. Hän oli
pannut yhden vieraistaan, Isisin jonkin verran pelästyneen
kuninkaan, vartioinnin alaiseksi, mutta se ei häntä tyydyttänyt, sillä
Tambeli oli rikkonut lakia ja karannut viidakkoon.
Vieras nyökkäsi.
*****
— Tämä on hyvä merkki, sillä jokin sanoo minulle, että olet tehnyt
tehtäväsi.
— Hän oli minun isäni, vaikersi Bosambo, — ja minun heimoni;
hän oli suuri herra, ja kun hän on kuollut, niin valkeat miehet tulevat
kivääreineen, jotka sanovat »ha-haha», ja syövät minut.
— Ja, sanoi hän, — pitkä aika on kulunut siitä, kun sain viinaa.
Tambeli nyökkäsi.
— Kuka on, joka pettää minut? kysyi hän. — Sillä jos suurten
herrain korviin tulee tieto, että minä tapoin Sandin…
— Koira!
Mies yritti nousta, kaatui jälleen, ja pian hän myöskin vaipui uneen,
ensin rauhattomaan, sitten raukeaan kuin väsynyt mies.
Pitkän hetken kuluttua hän otti lapion, jonka oli tuonut tullessaan.
Hän näet tiesi, että sateet olivat olleet ankaria ja pikku joet tulvivat
saattaen pääjoenkin pian tulvimaan, ja edelleen hän tiesi Tembolinin
olevan niin korkealla paikalla, ettei sen asukkailla ollut mitään
vaaraa.
Kun hän viikkoa myöhemmin kuuli Mdalin uneksineen, että viljan
veisi kato, ei hän ollut millänsäkään. Hän ajatteli, että Mdali on hyvä
arvaamaan, ja antoi asian olla.
Sanders oli näinä aikoina puuhakas mies, eikä hänellä ollut aikaa
eikä halua pohtia sellaisia asioita.