You are on page 1of 29



Rev. Biochem. 1984.53,' 293-321
Copyright © 1984 by Annual Reviews Inc. All rights reserved


Quick links to online content



Annu. Rev. Biochem. 1984.53:293-321. Downloaded from
Access provided by Kangwon National University on 03/08/15. For personal use only.

CarlO. Pabo

Department of Biophysics, The lohns Hopkins University
School of Medicine, Baltimore, Maryland, 21205
Robert T. Sauer

Department of Biology, Massachusetts Institute of Technology,
Cambridge, Massachusetts, 02139


A Model for the Cro-Operator Complex.............. ......... .............................................. ..............

LAMBDA CRO.........



Evidence Supporting the Cro-Operator Model............................................. ........ ....................


LAMBDA REPRESSOR................................................................................................................................


A Model for the Repressor-Operator Complex .. ...................................................... ..............
Repressor Mutants Defective in DNA Binding ..... ........................... ........................ ....... .......


CAP PROTEIN............................................................................................................................................






Modelsfor CAP-DNA Interactions .................................................................................. "........




OTHER MODES OF INTERACTION .................................... .......................................................................



Sequence Homologies.................. ...................:.................................................................................


PROSPECTS FOR A "RECOGNITION CODE" .................... .......... . .................... ...... ..................................


SUMMARY ..... ........................................ ............................. .... .................... ................................ .... ...........


Sequence-specific DNA-binding proteins regulate gene expression and also
serve structural and catalytic roles in other cellular processes. How do these
proteins bind to double-helical DNA and how do they recognize a
particular base sequence? Here we review recent crystallographic, bio­
chemical, and genetic studies that address these questions. For the most
part, we emphasize work published between 1980 and 1983, since the first

Annu. Rev. Biochem. 1984.53:293-321. Downloaded from
Access provided by Kangwon National University on 03/08/15. For personal use only.

29 4


.three-dimensional structures of site-specific DNA-binding proteins were
reported during this period. Crystal structures are now available for the
Cro and cI repressors of bacteriophage lambda and for the CAP protein of
Escherichia coli (1-3). Each of these three proteins can turn off expression of
specific genes by prev�nting initiation of transcription; CAP and lambda
repressor can also enhance gene expression by stimulating transcription
from certain promoters. Other recent reviews discuss these proteins and
their physiological actions (4-10), the mechanism and control of pro­
karyotic transcription (11), and general aspects of protein-DNA interac­
tions (12, 13).
Cro, lambda repressor, and CAP interact with DNA in a basically similar
manner. Despite differences in size, domain organization and tertiary
structure, each of these proteins binds to operator DNA as a dimer and uses
a-helices to contact adjacent major grooves along one face of the double
helix. Moreover, sequence homologies suggest that many other DNA­
binding proteins use similar a-helical regions for DNA recognition. How
does each of these proteins recognize its proper binding site? What forces
are involved? Is there a "recognition code"? We consider these questions
after discussing the structures of Cro, lambda repressor, and CAP and
describing the models proposed for the respective protein-DNA complexes.

Lambda Cro binds to six operator sites in the double-stranded phage DNA
(14, 15). These sites are clustered in two operator regions, and each region
contains three 17-bp (base pair) sites. The DNA sequences of the six sites are
similar but not identical, and Cro's affinity for the different sites varies over
a tenfold range (15-17). The sequence of each operator site has approximate
two-fold symmetry, and the consensus sequence, shown in Figure 1, is
symmetric. The Cro monomer contains 66 amino acid residues (18, 19). Cro
exists as a dimer in solution (20) and this is the form active in DNA binding.
A crystallographic study at 2.8 A resolution by Anderson, Ohlendorf,
Takeda, & Matthews (1) showed that the Cro monomer contains three


2 3 456

7 e 9 8


+ + +





7 6 543 2



Figure 1




- -










+ +

Consensus Operator Sequence for Lambda Cro and Lambda Repressor.

A Modelfor the Cro-Operator Complex The structure of the Cro dimer immediately suggested its basic mechanism of DNA binding ( 1). 1984. This data. This arrangement provides an excellent fit between the surface of the protein and the surface of the DNA. 22). . Biochem. 39-45.annualreviews. G's indicate guanines implicated as contacts. Rev. The four C-terminal residues of Cro (residues 6366) are poorly represented in the electron density map and are probably somewhat disordered both in the crystal and in solution. had suggested that Cro bound in a symmetric manner and that it contacted ero Figure 2 Sketch of the lambda operator site OR3 and the Cro dimer. and 48-55) and three a-helices (residues 7-14.PROTEIN-DNA RECOGNITION 295 strands of antiparallel fJ-sheet (residues 2-6. The angle between the two Cro helices allows them to fit neatly into successive major grooves of the operator. the two copies of a-helix 3 form protruding ridges that are separated by the same center-to-center distance. which is shown in schematic form in Figure 2. Adapted with permission from (1). and accounts nicely for the observed DNA modification and protection Access provided by Kangwon National University on 03/08/15. Annu. 15-23. Downloaded from www. and 27-36).53:293-321. 34 A. that separates successive major grooves of B-DNA (Figure 2) (21. For personal use only. P's indicate phosphates that have been implicated as Cro contacts. The Cro dimer is stabilized by a region of antiparallel fJ-sheet that is formed by pairing Glu 54-Val 55-Lys 56 from each monomer. In the dimer.

The proposed complex also seems chemically reasonable.Annu. 296 PABO & SAUER adjacent major grooves along one face of the double helix (Figure 3) ( 15-17). since DNA in solution adopts this conformation (25). This bending seems plausible since it should require only a few kcal of energy (26).53:293-321. . However. and in the best model for the complex the operator DNA was smoothly bent with a radius of curvature of 75 A. 22). Rev. Model building started with a B­ form operator (21. Biochem. since a number of hydrophilic and charged residues can interact with exposed hydrogen­ bonding groups in the major groove and with the negatively charged phosphates. During this modeling of hydrogen-bonding interac­ tions. the DNA is bent around Cro so that each end of the operator is 5 A closer to the protein than it would be if the DNA were straight. Downloaded from www. Although the rest of the protein structures are quite different. but Ohlendorf et al (23) point out that a Figure 3 Alpha carbons from helices two and three of the proposed Cro-operator Access provided by Kangwon National University on 03/08/15. energy minimization (24) was used to ensure that the stereochemistry of the proposed complex was reasonable. Refinement of the Cro structure at 2. the corresponding helical regions of repressor and CAP are quite similar and may contact the DNA in a similar manneL Used with permission from (8). For personal use only.annualreviews. Thus. minor adjustments in the DNA structure were allowed. 1984.2 A and further model building by Ohlendorf et al (23) have allowed a detailed analysis of possible Cro­ operator interactions.

which are in or beyond helix 3.5. but the overall potential is dominated by a positive region that straddles the two-fold axis. Several additional polar interactions might also be made by residues Asn61-Lys62-Lys63-Thr64-Thr65 in Cro's flexible C-terminal region. As the model predicts. D. They also include Asn31. For personal use only. 5. and Lys62/63 are protected from chemical modification when Cro is bound to DNA.6. similar fit could also be obtained by a "hinge-bending" motion of the protein dimer that would allow it to contact a straight operator site. B. Rev.4.3. This interaction is illustrated in Figure 4 (top). The proposed Cro-operator complex also places a large number of residues near the sugar-phosphate DNA backbone. Lys32. manuscript in preparation). are listed below. Asn31. and Lys21 in helix 2. W. which all involve side chains in or near a-helix 3. The side-chain amide of Gln27 donates a hydrogen bond to N7 of the adenine at + 2. Steer. as shown in Figure 4 (center). Davis. Ohlendorf. The amino group of Lys32 donates one hydrogen bond to 04 of the thymine at . 1. Takeda. and Glu54 and Lys56 in the C-terminal /3-region (23). donates a second hydrogen bond to 06 of guanine . Lys56.PROTEIN-DNA RECOGNITION 297 Annu. 4. E. and accepts a hydrogen bond from N6 of the same adenine. Gln27. In addition. D. There is a weak negative potential on the far side of the dimer. Kim. 2. . as shown in Figure 4 (bottom). This region of positive charge coincides remarkably well with the presumed DNA­ binding site.annualreviews. and may donate a third hydrogen bond to N7 of guanine . The Cro-operator model predicts several specific contacts between each Cro monomer and the edges of base pairs in the major groove. Biochem. and Lys32 seem to make a few van der Waals contacts in the major groove. and Lys39. Protein modification studies provide general support for the model of the Cra-operator complex (Y. C. Lys21. Thr17. His35. Anderson. which is just above the major groove. 3.53:293-321. Those with hydrogen­ bonding potential include Gln16. but Tyr26 and Lys39. Evidence Supporting the ero-Operator Model The overall fit of Cro against the operator site is supported by a calculation of the electrostatic potential around the Cra dimer (27. The hydroxyl group of Tyr26 donates a hydrogen bond to 04 of the thymine at + 1. Caday. The hydroxyl group of Ser28 forms two hydrogen bonds with N6 and N7 of adenine . The base-numbering scheme is that for the consensus operator site shown in Figure 1: 1. Access provided by Kangwon National University on 03/08/15. These contacts. 28). Matthews. The guanidinium group of Arg38 donates two hydrogen bonds to 06 and N7 of guanine .4. Downloaded from www.

PABO & SAUER Annu. 298 Figure 4 (top) Sketch indicating possible hydrogen bonds between glutamine and an A: T base pair (86). .53:293-321. Access provided by Kangwon National University on 03/08/15. Downloaded from www. (bottom) Sketch indicating the hydrogen bonds that arginine could form with a G: C base pair (86). 1984.annualreviews. Biochem. (Center) Sketch indicating how serine could form a pair of hydrogen bonds with an A: T base pair (23). For personal use only.

and 6) (15-17. The repressor monomer contains 236 amino acids (33) and is thus considerably larger than Cro. but the Cro contacts seem to be a subset of the repressor contacts (compare Figures 2. M. Downloaded from www. 1984. Caruthers. Gln27 -+ Leu/ eys/Arg. S. and this result. Sauer. However.annualreviews. are not protected. the site called OR3 is one of the weakest binding sites for repressor but is the strongest binding site for Cro. detailed interpretation of these results is complicated by the fact that the ero is largely bound to nonoperator DNA in these studies. Glu54 -+ Lys/Ala. manuscript in preparation).· Other experiments show that the affinity of ero for operator DNA is reduced by carboxypeptidase digestion. Noble. The affinity of repressor for the six sites varies over a 50-fold range. Gln27 -+ His. but the sites for which repressor has highest affinity are not the sites for which Cro has highest affinity (15-17. R. Most of the proposed DNA contact residues are represented in a collection of mutations that are phenotypically defective in operator binding (A. LAMBDA REPRESSOR Lambda repressor recognizes the same six operator sites (14) that lambda Cro recognizes.32). Ser28 -+ Arg/Asn. 30). These include each of the five residues implicated in specific base contacts (Tyr26 -+ Asp. Chemical protection and modification studies show that Cro and repressor contact many of the same functional groups in the operators. it is likely that most of them owe their reduced operator binding to a defect in DNA binding. The different affinities of repressor and Cro for the six operator sites help explain the contrasting physiological roles of these two proteins (4-6). PROTEIN-DNA RECOGNITION 299 which are also predicted to be contact residues. suggests that the flexible C­ terminal region of Cro makes some DNA contacts. and it is likely that the structures of operator and nonoperator complexes may be somewhat different. The mutations introduced by this procedure (Tyr26 -+ Phe/Leu/Asp. For personal use only. 5. Pakula. . together with the protection of Lys62/63. Lys56 -+ Asn/Gln/Thr). For example. Although fut:ther analysis of all the ero mutations is needed to show that they do not disrupt Cra Access provided by Kangwon National University on 03/08/15. Biochem. Arg38 -+ GIn) and many of the residues implicated in l?ackbone contacts (Glnl6 -+ His . three of the proposed specific contact residues have been altered by oligonucleotide replacement of the appropriate region of the era gene (M. Lys39 -+ Thr. The correspondence between the positions of the mutations and the proposed DNA contact residues supports the model for the Cro-operator complex. Genetic studies also support the model proposed for the ero-operator complex. In addition. Lys32 -+ Thr/Gln. Rev. and repressor also binds to each 17-bp operator site as a dimer (29. Nasoff. 31). manuscript in preparation). Ser28 -+ Ala) all reduce the operator affinity of era.

and is a dimer in . form a compact. The first four helices. along with the irregular regions of chain that connect them.the crystal. 6 1-69 (helix 4). Biochem. which can be separated by cleavage with papain: or other proteases (34). . The N-terminal domain consists of an N-terminal arm and five a-helices. 1984. 300 PABO & SAUER The repressor monomer folds into two d0mains of similar size. globular Figure 5 The proposed lambda repressor-operator complex. 33-39 (helix 2). For personal use only.annualreviews. Rev. Thus the regulatory activities of the intact protein are retained by the N-terminal fragment.2 A resolution by Pabo & Lewis (3). Panel on right summarizes the results of chemical protection experiments at the site of ORl. Intact lambda repressor has not been crystallized. The N­ terminal domain. as an isolated proteolytic fragment of residues 1-92. 44-52 (helix 3) Access provided by Kangwon National University on 03/08/15. Most of this arm packs against another molecule in the protein crystal. b:lnds specifically to the lambda operator sites and mediates both positive and negative control of transcription (35).53:293-321. Downloaded from www. but the structure of the N-terminal operator-binding domain has been solved at 3. However. and 79-92 (helix 5). but residues 1-3 are disordered and not visible in the electron density map. The first eight residues of the domain form an arm that extends away from the globular region. The o:-helices of the domain include residues 9-23 (helix 1). Intact repressor binds to the operator more tightly becailse the C-terminal domain stabilizes the dimer and thereby stabilizes the protein-DNA complex. the N-terminal fragment binds to the operator less tightly than intact repres­ sor because the fragment dimer readily dissociates in solution (34).

Rev. Weiss. D. containing residues 4-92 and thus missing the first three residues of the arm. Figure 5 also summarizes the chemical protection data. A Model for the Repressor-Operator Complex Possible structures for the complex of the N-terminal dimer with operator DNA were evaluated by model building (3). Karplus. four of the protected sites are visible in the major groove on the "front" of the operator site. 3 1). like intact repressor. manuscript in preparation). and its N-terminal region is next to the sugar phosphate backbone of the DNA. R. . are polar and could readily make hydrogen bonds to bases in the major groove or interact with the sugar-phosphate backbone.annualreviews. the two N-terminal arms of repressor are set slightly to the sides of the operator helix and extend towards the "back" of the DNA near the center of the 17-bp site. 17. the N­ terminal portion of helix 3 fits directly into the major groove. The model building was also guided by chemical modification experiments that suggested that repressor contacted the major groove and that most contacts were on one side of the double-helical site (16. the different protection patterns imply that the first three residues of the arm must contact the major groove on the back of the operator site. allows each subunit of the dimer to contact one half of the operator site. Biochem. The fifth helix extends off to one side of the molecule and folds against helix 5 of a neighboring monomer. Serl-Thr2-Lys3Lys4-Lys5. which is shown schematically in Figure Access provided by Kangwon National University on 03/08/15. 1984. Patel. The 92-residue N-terminal domain. Downloaded from www. and repressor does not significantly wind or unwind the operator DNA (36). protects only the guanines on the front of the operator site.PROTEIN-DNA RECOGNITION 301 domain.53:293-321. The first five residues of the arm. In the proposed complex. M. a shorter N-terminal fragment. Biochemical studies show that these arms actually make major groove contacts on the back of the operator site (37). Model building indicates that repressor's arms are long enough to encircle the double helix. With these constraints. Since NMR studies show that the 1-92 and 4-92 fragments have the same conformation (M. Sauer. In each half site. possibly by wrapping around the DNA in the major groove. 32). and two are visible on the "back" (Figures 5 and 6). 17. B-form operator DNA was used for these studies because DNA in solution is B-form (25). protects several operator guanines from chemical methylation. This complex. showing the guanine N7 and phosphate groups implicated as repressor contacts on the front side of the DNA helix (16. Weiss et ai. Helix 2 is just above the major groove. NMR studies have subsequently shown that the operator conformation in the repressor­ operator complex is similar to the expected B-conformation (M. However. This helix-helix contact seems to be the major interaction stabilizing the N-terminal dimer. model building yielded only one arrangement that gave a good fit between the surface of the protein and the surface of the DNA. manuscript in preparation). For personal use only. Annu. as shown in Figure 6.

302 PABO & SAUER Figure 6 Proposed interaction between lambda repressor's N-terminal arm and the back of the operator site. and Ala49.Annu. Three of these side chains. the initial model building (3) simply used the crystallographic coordinates of the N-terminal dimer and searched for an overall fit of the repressor against the DNA. In a subsequent study. Biochem. The proposed contacts with the consensus site (Figure 1) are summarized below: 1.annualreviews. Gln44. Ser45. Panel on right shows guanines that are protected on the back of operator site ORl.53:293-321. Asn55.­ helix 3.4. The hydroxyl of Ser45 donates a hydrogen bond to the N7 position of guanine . It accepts a hydrogen bond from the N6 of adenine and donates a hydrogen bond to the N7 [Figure 4 (top)]. Downloaded from www. The side-chain amide of Gln44 makes two contacts with adenine + 2. is in the irregular region of protein chain just beyond helix 3. After these Access provided by Kangwon National University on 03/08/15. For personal use only. 1984. Except for adjustments in the position of the flexible N-terminal arm. This phase of model building started with the previous complex but then allowed surface side chains to move and allowed minor adjustments of the relative orientation of the subunits within the dimer. and the fourth. . Lewis et al (38) made more detailed predictions about the contacts between repressor and the consensus operator site. Rev. it appeared that four side chains from each subunit of the dimer could make specific major groove contacts on the front of the operator site. 2. are on (1.

Weiss et aI. and Asn61. The side-chain amide of Asn55 donates two hydrogen bonds to the 06 and N7 positions of guanine . 1984. 3. Ethylation of any of ten phosphates in the operator interferes with repressor binding (16.PROTEIN-DNA RECOGNITION 303 Annu. Downloaded from www. These residucs could interact with the phosphates on the outer edge of the site if the operator DNA were allowed to partially bend around the protein. 4. seven of the residue positions affected by these "DNA-binding" mutations cluster in the a2-a3 region of the N-terminal domain. Tn addition to the major groove contacts. Detailed predictions of contacts that involve repressor's N-terminal arm were not attempted. which is the first residue in helix 2.53:293-321. The methyl group of Ala49 makes van der Waals contacts with the methyl groups of thymine + 3 and thymine . decrease thc operator affinity of repressor but do not disrupt the structure of the mutant N-terminal domain. in the manner proposed for the ero-operator complex (23). several contacts between repressor and the sugar phosphate backbone of the DNA seem possible (38). these phosphates are symmetrically disposed on the front face of the operator site. 41). NMR studies show that the arm is flexible in solution (M.annualreviews. and residues 24-26 may be responsible for these ion pairs. contact the two phosphates near the outer edge of the operator site. which is part of the loop of irregular chain between helices 1 and 2 (see Figure 5). None of the proposed contacts between phosphates and amino acids involve ion pairs. Biochem. The salt-dependence of binding (30. Without reliable structural constraints. 39) suggests that a few ion pairs are formed between the repressor dimer and operator Access provided by Kangwon National University on 03/08/15. and Asn52.6. In the proposed complex. appear to contact the phosphates near the center of the operator site. Six are clustered near the center of the site and two are near each end of the site. Repressor Mutants Defective in DNA Binding Genetic and biochemical studies of repressor mutants provide strong support for the fundamental features of the proposed repressor-operator complex (40. affecting eight residue positions in the N-terminal domain. Residues Asn58. in the irregular region between helices 3 and 4. For personal use only. is near the DNA. Gln33. detailed model building gives too many possibilities to be useful. Twelve mutations. the Lys24-Lys25-Lys26 sequence. 17) and.5. which is the last residue in helix 3. As shown in Figure 7. Here. Tyr60. the mutations affect each of the four residues predicted to make specific major . as shown in Figure 5. The conformation of the arm in the crystal seems to be determined by crystal packing forces and there is no reason to believe that it adopts a similar structure in solution or in the repressor-operator complex. In fact. However. Rev. manuscript in preparation).

Downloaded from www..Annu.53:293-321. This issue has been Tyr ). we have referred to protein-DNA "contacts" without explicit reference to the energy provided by each contact. Ala49 -t ValjAsp.38). and thus the mutations provide strong experimental support for the model of the repressor­ operator complex.. Val I Gin Glu Sor Val Ala Asp Lys Mot Gly Mot Gly Gin Sor Gly Val Gly Ala Lou Pho Asn Gly 110 Asn Ala 35 40 ::=: 50 His ). 304 PABO & SAUER groove contacts (Gln44 -t Leu/Ser/Tyr. Rev. In Figure 8.. and 61. Biochem.. For personal use only. .. 60. The only significant differences involve residues 58. This figure also shows the position of the "DNA contact" residues predicted from model building (3. Thus. There is clearly a striking correspondence between the genetic and model-building results. Access provided by Kangwon National University on 03/08/15. C ro I Asp Helix 3 Sequence of the Helix 2/3 Regions from Repressor and ero. almost every residue that has been proposed as a DNA contact residue is altered by one or more repressor mutations. Tyr Leu Sor Glu Sor Lou I I .annualreviews. which are near the center of the repressor dimer (Figure 8) and which were implicated as backbone contacts by model building. the positions of the "DNA-binding" mutations are shown on a space-filling model of the N-terminal dimer. two of these sites were identified by mutations (Asn58 -t Ile . Asn55 -t Lys). and one residue that· is dose to the operator in the proposed complex (Gly43 -t Glu). Residues underlined twice are predicted to make contacts with the DNA backbone (23. 43). Those underlined four times are predicted to make major groove contacts. Ser45 -t Leu. Lys I I Arg Gin Thr I I I Illn Thr I I Gin Thr Lys Thr Ala Lys Asp Leu Gly Val Ty r Gin Ser Ala lie Asn Lys Ala lie His Ala Gly Ma Lys 25 == == == 30 20 Helix 2 Figure 7 Asp I Asp His Asn I I . two of the residues predicted to make phosphate con­ tacts (Gln33 -t Ser/Tyr. Although mutations at these positions were not obtained in the study cited above. Re p .. This excellent overall corre­ spondence between the genetic results and the model building would not be expected if the model were seriously wrong. The only "DNA-binding" mutation outside the 0(2-0(3 region alters a residue in the N-terminal arm (Lys4 -t Gin). Thus far.. The positions of surface mutations that decrease operator binding are shown. 38). Asn61 -t Ser) in a study where the mutant proteins were not characterized (42. Asn52 -t Asp).

Nelson. and Gln44 side chains each contribute about 2. For personal use only. the Lys4 --+ GIn. manuscript in preparation). Downloaded from Access provided by Kangwon National University on 03/08/15. Ser45 --+ Leu). PROTEIN-DNA RECOGNITION 305 addressed by studying the affinities of the purified mutant repressors for operator DNA. Gln33. This suggests that the Lys4. Even if the Lys4. Mutants may have reduced affinity either because favor­ able contacts are removed or because unfavorable sterk or electrostatic contacts are introduced. The operator affinities of these three mutants are about 100fold less than the wild-type affinity (H. R. Feldman. and Gln44 contacts are somewhat stronger than average. Positions of "DNA-binding" mutations are highlighted in right panel. it seems that the total repressor-operator binding energy can be reasonably explained by the contacts proposed in the model. If these energies are additive. Sauer. MODEL Figure 8 MUTANTS Space-filling models of lambda repressor's N-terrninal domain. . However.annualreviews. Rev.53:293-321. Both effects may contribute when the wild-type side chain is replaced by a larger side chain in the mutant (e. Gln33 --+ Ser and Gln44 --+ Ser mutations replace the wild-type side chain with a smaller side chain and thus their decreased operator affinity is likely to reflect the loss of favorable interactions. or half of the 16 kcaljmole free-energy change that occurs upon binding (30).g.Annu. then these three side chains contribute a total of about 8 kcaljmole.7 kcaljmole of free energy to the interaction between the dimer and the operator. Biochem. Gln33. Residues predicted to contact the operator DNA are highlighted in left panel. 1984. Computer graphics were provided by Richard J.

'Nild­ type repressor and the mutant Gln33 � Ser repressor have the same aflinjty ' for nonoperator DNA (H. Biochem. 1984. manuscript in preparation).gene activator protein (CAP). and substitution of serine at this position reduces the operator affinity of repressor about 100-fold. For personal use only. How cap. However. R. Presumably the reduced affinity is caused by electrostatic repulsion between the mutant side chain and the phosphate backbone. A consensus sequence has been suggested for the CAP binding sites (46).org Access provided by Kangwon National University on 03/08/15. it forms a complex with CAP. Sauer. The Asn52 -> Asp repressor mutant also affects a proposed phosphate contact but in this case the mutant has reduced affinity for both pperator and p. When cyclic AMP is present at a sufficient concentration. and this complex is active in specific DNA binding (45). Crystallographic studies by McKay. The CAP .53:293-321.Annu. E. and this suggests that Asn52 is close to the DNA in both the operator and nqnoperator complexes. regulates several catabolite-sensitive gene operons inE. Gln33 makes a contact with one of the phosphates near the outer' edge of the operator site (3. The C-terminal domain includes residues 136-209 and contains three a-helices (D. we rationalize this in rqolecular terms? One possibility is that Gln33 only contacts the DNA backbone in the specific repressor-operator complex. and a long a-helix (C). Since specificity depends on the ratio of operator to nonoperator binding ( 12). W�ber. 48) and has two domains (49). The N-terminal domain contains 135 residues and consists of a pair of short helice§ (A and B). This polint is illustrated by the Gln33 � Ser mutant of lambda repressor. such contacts could readily contribute to the specificity of binding. Nelson. but many of the individual sites differ considerably from this sequence. 44). 51) have determined the structure of the intact CAP dimer in a complex with cAMP. also called the cyclic AMP receptor protein (CRP). an eight-stranded antiparallel p-roll. However.onoperator DNA. and F) and two pairs of short antiparallel p-strands.ove might pr� vent Gln33 from approaching the backbone closely enough to make a cont ii ct. & Steitz (2. steric interference in the major gro. 306 PABO & SAUER Contacts between a protein side chain and the DNA sugar phosphate backbone are often referred to as "nonspecific" contacts. coli (7. Downloaded from www. CAP is a stable dimer in solution and this dimer is the active DNA binding species (50). " CAP PROTEIN l]1e catabolite. The CAP protein contains 209 residues (47. In the proposed complex. A sketch of the CAP monomer is shown in Figure 9.annualreviews. 38). In complexes of repressor with nonoperator DNA. while the N-terminal domain binds cyclic AMP and provides most of the dimer contacts. Rev. The C-terminal domain binds DN�. Gln33 increases the specificity of operator recognition.

This conformation of DNA (which is quite distinct from Z-DNA) has never been observed. it seems conformationally plausible (56) and in model- . while the other subunit has a "closed" conformation. Biochem. The cyclic AMP also occupies part of this dimer Access provided by Kangwon National University on 03/08/15. Rev.. 54).... 3. dimer contacts. .. DIMER CONTACTS cAMP Figure 9 BINDING Sketch of the CAP monomer. since the only regions of net positive charge are near the amino-terminal portions of the F-helices. and the primary functions of each domain are listed. McKay & Steitz (2) had proposed that CAP binds to left-handed B-form DNA. the CAP dimer is somewhat asymmetric. Calculation of the electrostatic potential at the surface of the CAP dimer provides some support for this model (55). 1984. . One subunit has an "open" conformation with a cleft between the domains.annualreviews..PROTEIN-DNA RECOGNITION . C TERMINAL DOMAIN DNA BINDING N TERMINAL DOMAIN Annu. However.. . . 51-54) but current results suggest that CAP binds to right­ handed B-DNA and uses the N-terminal portions of its F-helices to contact the major groove as shown in Figure 10 (3. Originally. . involve a pairing of the C-helices and some additional contacts between the C-helix of one subunit and the /3-ro11 of the other subunit. which are in the N-terminal domain.53:293-321. 307 . F===� . but it forms hydrogen bonds that bridge the dimer interface.. In the crystal form studied. . The approximate extent of each domain is outlined. since there are different orientations of the N-terminal and C-terminal domains in the two subunits. Models/or CAP-DNA Interactions Several different models have been proposed for the interaction of CAP with DNA (2. For personal use only.. Adapted with permission from (51). It is completely buried within the interior of the N-terminal domain. Downloaded from www.

Steitz. The specific contacts are listed below using the base numbering scheme of Figure 11: 1. Downloaded from www. For personal use only. However. 1984. If CAP were to bind to left-handed DNA. biochemical experiments rule out this model.annualreviews. Black dots indicate contacts with phosphates (45). Weber. The guanidinium group of ArgI80 donates two hydrogen bonds to 06 and N7 of guanine 3 [(Figure 4 (bottom)] and donates two hydrogen bonds to the symmetrically related guanine 16. T. This does not occur. building Access provided by Kangwon National University on 03/08/15. 2. then it should unwind the right-handed DNA found in solution by four turns (1440 degrees).Annu. Adaptecl with permission from (54).53:293-321. the two F-helices of the CAP dimer can fit neatly into successive major grooves of this left-handed DNA. Kolb & Buc (57) have shown that CAP binding unwinds DNA by no more than 30 degrees. In this model the DNA is bent around the protein with a radius of curvature similar to that predicted for the Cro­ operator complex. In fact. Several specific CAP-operator interactions have recently been proposed on the basis of model building with right-handed B-DNA (I. Biochem. The side chain of GluI81 in one monomer accepts one hydrogen bond . 308 Figure 10 PABO & SAUER The proposed interaction between the C-terminal domain of CAP and the CAP binding site in the lac operon. Rev. personal communication).

and also shows the symmetrically related L8 and L29 mutations in this site. The amino group of Lys188 donates two hydrogen bonds to 06 and N7 of guanine 5. 4. 3. Glu181 --+ Leu.annualreviews. change the same residue in helix F. manuscript in preparation) has suggested specific interactions between CAP and its binding site and also supports the model in which CAP binds to right­ handed B-DNA. personal communications). B. It is not known how cAMP increases the affinity of CAP for its specific DNA sites. the mutants bind equally well to the L8 site and the L29 site. but the crystal structure suggests some possibilities and rules 2 4 6 8 10 12 14 16 18 5' AT G TGAG T TAG eTCACT C 3' 3' TA CAeTeAATeGAG TGAG 5' � � A T T A La Figure 1 1 L29 Sequence of the CAP binding site in the lac operon. CAP mutants with increased affinity for the L8 site or the L29 site. while Leu181. P. Ebright. Arg185 donates one hydrogen bond to N7 of adenine 6. to make if CAP bound to left-handed DNA (1 Weber. Cossart. Two of the CAP mutations were selected using the L8 mutation and one was selected using the L29 mutation. if not impossible. Downloaded from www. A genetic study of CAP mutants with altered DNA-binding specificity (R. For personal use only. were selected and three different mutations were obtained. 1984. Va1181. but with reduced affinity for the wild-type site. Biochem.PROTEIN-DNA RECOGNITION 309 Annu.53:293-321. and donates one hydrogen bond to the symmetrically related adenine 13. which have symmetrically related base Access provided by Kangwon National University on 03/08/15. T. Rev. The side chain of Glu181 in the other monomer accepts one hydrogen bond from N6 of adenine 15 and may accept a second hydrogen bond from the N4 of cytosine 14. from N4 of cytosine 5 and accepts a second hydrogen bond from N6 of adenine 4. Glu181 --+ Val and Glu181 --+ Lys. it is likely that Glu181 normally recognizes the G: C base pairs at 5 and 14. Analysis of these CAP mutations also indicates that CAP interacts with its binding site in a symmetric fashion. These contacts can be accommodated in the complex of CAP with right-handed DNA shown in Figure 10. Beckwith. and two to the symmetrically related guanine 14. but they would be difficult. and Lys181 recognize the mutant A: T base pairs at these positions. . Steitz. Figure 11 shows the CAP binding site in the lac operon. Nevertheless. Since model building suggests that helix F makes major groove contacts. All three mutations. J. Gicquel-Sanzey.

Alpha-helices arranged in this way may be unique to DNA-binding proteins as they have not been found in any other protein structures (58. repressor. and in CAP these are designated helices E and F. and several research groups have predicted that these . this homology is not extensive and is far less precise than the other homology. However. It had been proposed. since the cyclic AMP is buried in the interior of the large domain. 59). Downloaded from www. The agreement between lambda repressor and Cro is slightly better. and CAP. In all other regions. This is illustrated in Figure 3. Biochem. Thus. For personal use only. that the adenosine ring of cAMP interacts directly with the DNA (52). Sequence Homologies ( A number of DNA-binding proteins share sequence homologies with ero. from cAMP analog studies.310 PABO & SAUER Access provided by Kangwon National University on 03/08/15. The structures of these (X-helical units within the three proteins are nearly identical. Here. out others (51). The structure of the CAP-cAMP complex shows that this proposal must be incorrect. 24 ex-carbons from the exE-exF region and 24 (X-carbons from the (X2-(X3 unit can be superimposed with a deviation of only 1. Superimposing the strictly conserved bihelical unit also reveals a limited structural homology among parts of helix 1 from Cro and repressor and parts of helix D from CAP (58. 20 (X-carbons from the two (X2-(X3 units superimpose with an average deviation of only 0. and CAP with DNA.annualreviews.7 A (59). the helical units of the protein dimers cannot be superimposed as preciisely as the helical units of the monomers. many of the DNA contacts are made by two (X-helices that are linked by a tight turn. In each of the three models. It should also be noted that the arrangements of the conserved helical units with respect to the dimer axes are not identical in the three proteins. since the buried cAMP interacts with both N-terminal subunits of the dimer.53:293-321. and the different orientations of these regions imply that the proteins could not be docked with their bihelical units contacting the DNA in precisely the same manner. 59). it might affect DNA binding by changing the relative orientation of the two subunits or by changing the orientation of the domains within a subunit. repressor.1 A per (X-carbon (58). For CAP and Cro. Howt:ver. 1984. these are helices 2 and 3. the tertiary folds of the three proteins are completely different. the first helix (2 or E) sits above the groove near the DNA backbone while the second helix (3 or F) fits partly or completely into the major groove. Rev. In both Cro and repressor. A CONSERVED ex-HELICAL STRUCTURE FOUND IN MANY DNA-BINDING PROTEINS In the proposed complexes of Cro.

Alpha-helices on the surface of a protein often have a characteristic pattern of nonpolar residues that face the hydrophobic core of the protein. then the strong conservation at positions 5. repressor and CAP contain such Access provided by Kangwon National University on 03/08/15. suggests that the homologous proteins also form similar helix-tum-helix structures. Downloaded from www. and LexA repressor sequences are significantly related (61). In the proteins of known structure. Biochem. the residues at positions 1-3. The pattern of conserved residues and residue types.annualreviews. Thus. there are only limited regions of homology. and most listings of homologies excluded . If the homologous sequences also form bihelical structures. 18. the homology almost certainly implies an evolutionary relationship. P22. glycine predominates at 9. Originally. As discussed below. 9. shown in the alignment of Figure 12. 434. This means that the homologous proteins could form similar bihelical units and have predominantly nonpolar side chains facing the hydrophobic core. position 9 forms part of the tight turn between the helices. 6-7. which is part of the hydrophobic core of Cro. entire protein sequences are homologous. Alanine is favored at position 5. ' illustrates an interesting problem in trying to predict structural homology from sequence homology. repressor and CAP. For personal use only. imd in these cases. 11-14.53:293-321. positions 5. A set of such sequences is shown in Figure 12.6 residues/turn. PROTEIN-DNA RECOGNITION 311 proteins also use helix-turn-)1elix structures for DNA interactions (60--62). and in the numbering scheme of Figure 12. 1984. in addition. the side chains at 5 and 15 in the helix 2/3 unit are in van der Waal's contact and probably help to maintain the proper angle between the two helices. and 16-17 are solvent exposed and hydrophilic. For example. The homologous DNA-binding proteins also tend to have nonpolar residues at these positions and. and either valine or isoleucine usually occupies 15. and 19 in the second helix. and 8 in the first helix and positions 15. 5. but many DNA-binding proteins have regions that are homo­ logous to the 0:2-0:3 sequences of Cro and repressor and the o:E-o:F sequence of CAP. have nonpolar residues at position 10. and 15 are among the most highly conserved. 9. these triplets occupy positions 4. and 15 could be rationalized in structural terms. bihelical units in the homologous proteins would have a number of exposed polar residues that might be used for DNA interactions. The bihelical units of Cro. Because the helical repeat is 3. the lambda.Annu. and the homologous proteins also tend to have hydrophilic residues at these positions. it was thought that glycine was required at this position. Rev. In some cases. In the alignment of Figure 12. In other cases. The highly conserved glycine at position 9 of the alignment (Figure 12). In Cro and repressor. Each of these residues seems to have an important role in maintaining the structure of the bihelical unit. these residues usually occur at relative helical positions 1-4-5 or 1-2-5 (63).

" ..11 e-Thr-Pro-Leu-G 1 n-Va l-Arg-Va1-Trp-Cys-Asn • 117 · .G1n-Leu-A 1a.312 PABO & SAUER 10 · . 10 ..Phe-Arg-Ser. " " " " 35 *•• " " " 41 ..Phe-G1y.Lys -Va1 . 35 118 119 120 121 *** 122 123 12 ..11e-A1 a-Gln-Arg..** Thr-Glu-lys-Thr-A1 a-G1u-A1 a-Va1-G1y.8 20 33 *** 34 " 28 !. Access provided by Kangwon National University on 03/08/15.Va1-A1 a-Arg-Leu-A1a-G1y-Va1-Ser-Va1-A1 a-Thr-Val-Ser-Arg-Val-11e-Asn. Bertrand. Leu-Tyr-Asp-Va1-A 1a-G1u-Tyr-A1a-G1y-Va1-Ser-Tyr-G1n-Thr-Va 1-Ser-Arg-Va 1-Va1-Asn.. 3 Sequences homologous to the a2-a3 sequences of lambda Cro and lambda repressor. CAP 36 ••* 181+ 185 186 *** 187 188 202 203 20'+ *** 205 · . 30 A cI I 32 *** ..11. 70 *** · .Val -Ser-Gln-Trp -Lys-G1u .* .*.Leu-Ser. and arabinose C protein (96). . K. . 162 163 164 !!! 166 167 168 !!! 170 171 17 2 173 171t. Rep 40 Arg-G1n-G1 u-Ile-G1y-G1 n-Ile. 12 G1 n-G1 u-Ser-Va1-A1 a-Asp-lys-Met-G1 y-Met-G1Y-G1n-Ser-G1Y-Va1-G1 y-A1 a-leu-Phe-Asn • 37 *** · . . Nguyen. ••* · • . H-inversion protein (93).. " 20 24 25 " " " *** 31 *** 170 171 172 173 *** 11 .1)6 . !!.Leu.Va1-Asp-Lys-Ser-G1n-I1 e-Ser-Arg.. .Ser-His -11e.Lys -Asn-Glu -Leu -G 1y-A1 a-G1y-I 1e-A 1. . P22 Cro " *. • Tet TnlO Tet pSCIOI 136 Th r-Arg-Lys -leu-A 1 a-Gln-Lys-Leu-Gly-Val-G1u-G1n-Pro-Thr-Leu-Tyr-T rp-His -Va1-Lys .Gly-I1e -Asn..I 1e-G1u-A 1 a • · . 175 176 177 *** " 179 180 181 182 183 *** 188 189 190 I'll ** * 192 193 1')4 195 *** 196 191 198 199 200 201 *** . . .annualreviews. 162 · • . 18 19 20 *** 21 2.inversion 161 A 1 a. . · 19 *** 30 *** " ••* · 18 .. Rev.. 61 n-Thr-61 u-Leu-A1 a.53:293-321.*. 21 Fnr 2.I 1e.1 a-G 1u... Th r-Arg-Arg-Leu-A 1 a -Glu -Arg-Leu-Gly-Va 1 -Gln-Gln.Pro-A 1a -Leu-Tyr -Trp-Hi s-Phe-Lys .Lys -Thr.I1e ..G1y-Asn-Tyr-Leu-G 1 y-Leu-Thr-Va1-G1u-Thr-Ile-Ser-Arg-Leu -Leu-G1y . * " *** 16 3 16 4 165 166 *** 167 16 6 169 170 *** 171 172 173 174 175 176 *** 162 163 164 165 *** 166 161 168 16!1 *** 110 171 112 173 114 l' 5 *** 177 176 179 *** 160 116 1" l' a 119 *** . • 13 1.A rg... 161 T rp Rep " **.Hi s . . transposon gamma­ delta resolvase (95). • 434 Rep ..T rp-lys.. Postle.I 1e-G1y-Val-Ser-Thr-Leu-Tyr-Arg-Tyr-Phe-Pro .-G1n-Leu. 1984. Downloaded from www. . 22 Annu. 28 30 . transposon Tn3 resolvase (94). P22 cl 36 Lys-Gl u-Gl u-Va l-A 1 a-Lys-Lys-Cys-G 1y.Il e-Ser-As p-Ala-A 1 a. 36 Gl n-Arg-G 1u-Leu.Th r-Lys-Ala. 187 · . 17 20 *** 23 *** 24 • • 28 *** • • • • • • 31 32 33 34 . Il e-A1 a-Ser. 21 · .* 45 " " . " · 14 G1n-A 1a-A 1a -Leu-G1y-lys-Met-Va1-G1 y-Va1-Ser-Asn-Va1-A 1a-I1 e-Ser-G1 n-Trp-G1 n-Arg . tetracyline repressors from TnlO and pSClOl (T. I. 125 *** 126 127 128 129 130 131 *** 132 133 134 *** 135 " • • Mat " . 196 H.. 178 P22 Rep " A rg-G1 y-As p-I 1 e. and the aE-aF sequence of CAP.. 37 Arg-G1n. . .A1 a-As p-A 1 a -le u. 13 G1n-Thr-lys-Thr-A1 a-Lys-Asp-Leu-G1y-Va1-Tyr-G1n-Ser-A 1a-11 e-Asn-lys-A1 a-11e-His • 17 · .Leu..Ser. G 1n.Met-Asn...4 *** 26 21 2. " "' **• Helix Resolvase y6 Resol vase Ara 215 .Se r-Arg. !!! 176 177 !! ! 179 1 • • e0 · ..Va l-A1 a -Gln -His -Va1.Jl e-A1 a-Arg ..Leu-Ser..I1e. . 4 2 " 44 .A1a -A1 a-Glu-G 1 u-H i s-Le u .· Thr-11e. . 197 198 199 � 200 *** 201 202 203 204 *** 205 206 207 208 209 210 *** 211 212 213 *** 211t " Heli x 2 Figure 12 " **.. K.Arg-Lys -Va 1. · 17 '+7 *** "1 *** 21 *** · 16 15 G1n-Arg-A 1 a -Va 1-A 1a-Lys-A 1 a-Leu-Gly.' • 434 Cro . manuscript in preparation). 11 Lex R .G 1y. 30 3') • • Cro 15 1. . G1 n-A 1a-G1u-Leu-A 1a-G1n-Lys-Va1-G1y-Thr-Thr-G1n-G1n-Ser.Ile-G1 u-G1n-Leu-G1u'Asn • · .Jl e-Asn .61Y-Va1-Lys-61n-61 n-Ser. . 29 Ga1 1• .Ser-Thr-Va1-Tyr.Thr-Arg -G1 y-Se r-Asn .Trp-Lys-Arg 28 Lac " 11 e-Lys-Asp..Pro -Asn.. . • 35 .Ser-11e-A1 a-Arg-Ser-Thr-Val-Tyr-Lys -Ile-Leu-Glu • 161 · .Phe -Arg.Pro -Ser-Arg.Thr-Glu-Ile-A1 a-Hi s-Gln-Leu. ..Gl u-Ser-Gl n. Ala. Tn3 160 .Cys . Sequence references: Fnr protein (92). For personal use only..Gl y . .Va 1-G1Y-Cys-Ser-Arg-G1u-Thr-Va1-G1Y-Arg-1le-Leu-Lys • 169 · • • . Citations for other sequences are list'ed in (60-62).

For example. lambda clI protein. structural homology can be present in proteins that lack sequence homology (74) and circular dichroism studies show that both of these small DNA-binding proteins are substantially a-helical (A. R o. NMR studies have identified tertiary interactions that are predicted by the model (71) and have identified two linked a-helices in the predicted positions (72). this glycine assumes a backbone conformation (phi 60. Y. repressor. it has been shown that a double mutant of lambda repressor.annualreviews. Struc­ tural studies will be required to determine whether the IX-helices of these proteins are used for DNA binding and. The strongest physical evidence ih support of a bihelical unit is for lac repressor.PROTEIN-DNA RECOGNITION sequences like AraC and Tn3 repressor for this reason. Circular dichroism shows that lac repressor. and CAP. What then do we make of the strong conservation of glycine at position 9? In Cro. Cro. 69. M. How­ ever. For personal use only. psi 44) commonly observed for glycine but only rarely observed for all other amino acids (65). It seems very likely that many DNA-binding proteins use helix-tum­ helix units and the question even arises whether there are any specific DNA­ binding proteins that do not use bihelical units. This suggests that residues other than glycine might be accommodated at position 9 but at the cost of some modest conformational strain. 70. R. and P22 repressor contain substantial regions of !i-helix in their DNA-binding domains (66-68) and there are mutants of each protein whose properties can be explained by the proposed helix-tum-helix model (64. There are many DNA-binding proteins that lack obvious homology with the sequences shown in Figure 12. manuscript in preparation). 313 = USE OF a-HELICES IN DNA RECOGNITION Several early model-building studies predicted that a-helices could fit into the major groove of B-form DNA (75-78). Biochem. Rev. containing Asp38 -+ Asn (position 6) and Gly41 -+ Glu (position 9) is functional (64). whether the helical regions resemble those of CAP. This proves that formation of the proper turn between helices 2 and 3 does not require glycine. Wulff. for recognition. = Annu. the Mnt and Arc repressors of bacteriophage P22 are two such proteins (73). Youderian. However. Sauer. whereas the larger side chains of other residues cause some steric hindrance. P. The side-chain hydrogen atom of glycine allows it to readily assume this Access provided by Kangwon National University on 03/08/15. and the structural information now available shows that this is a common mode of DNA recognition. or at least a-helical regions.53:293-321. and lambda repressor. 1984. unpublished). D. For some ofthe homologous proteins. if so. Vershon. Here. An a-helix with side chains has a diameter of about 12 A while the major groove . M. Rosenberg. further evidence suggests that they actually do form bihelical units like those of Cro. Downloaded from www. Susskind.

For example. fits directly into the major groove and contacts the edge of a base pair. the first residue in helix 3.Annu. helix 3 is not as closely parallel to the major groove. an ex-helix can be positioned with one of its ends. However. this arrangement should help orient the side chains for interactions with the major groove. Moreover. still allows the helix to contact four or five base pairs. since an ex-helix has a partial positive charge at its N-terminal end. The number of base pairs that could be contacted by a single ex-helix depends on the orientation of the ex-helix with respect to the groove and on the length of the side chains. since the side chains of an a-helix point toward the N-terminal end.whereas the major groove curves away in both directions. it is instructive to compare the way in which they use their a-helices for DNA binding. including the pattern of side­ chain contacts. For personal use only. Half of the common contact positions have different residues and even where the contact residues are identical there are im­ portant differences in the proposed complexes. This arrangement. In the repressor-operator complex (38). but the overall arrangement. 314 PABO & SAUER of B-DNA is about 12 A wide and 6-8 A deep. model building suggests that the ex-helix does not need to be precisely parallel to the major groove. 80). and the major groove carries a partial negative charge ( Access provided by Kangwon National University on 03/08/15. Consider the contacts made by the Gln27. 79. Gln27. This is clear from the pattern of side-chain contacts. each residue position in the ex2-ex3 region of repressor that is proposed as a specific: or backbone contact is also proposed to make a similar type of contact in Cro. which is similar to the one proposed for lambda repressor. Alpha-helix 3 in the Cro-operator complex is almost p:trallel to the major groove.annualreviews. near the C-terminus of helix 3. A variety of different orientations would still be sterically reasonable and would allow extensive contacts with the DNA. The ex-helical backbone can be viewed as a scaffold from which side chains can contact the edges of the base pairs in the major groove. Maximum contact is obtained if the helix is parallel to the local direction of the major groove (Figure 3). Biochem. one side of an ex-helix can fit snuggly into the major groove. but its N-terminal end is somewhat closer to the DNA than its C-terminal end (23). Rev. For example. is farther from the groove and appears to contact a phosphate. closest to the double-helical axis of the DNA and the helix can be tilted by 15-20 degrees with respect to the major groove. the individual side chain contacts made by the two proteins are actually quite different. is similar. 1984. rather than its center.) However. as shown in Figure 7. However His35. (More extensive contacts are not possible because the a-helix is straighJ .8 er28 side chains of Cro and the homologous Gln44-Ser45 . Downloaded from www. and with this arrangement side chains on the helix could contact fOllr to six base pairs. Since lambda repressor and Cro recognize the same operator sites. An arrangement with the N-terminal rather than the C-terminal end closest to the groove is probably preferred. Thus.

53:293-321. the backbone contacts could increase the specificity of the interactions with the base pairs. (23. some contacts seem to be made by regions with an irregular or extended structure. OTHER MODES OF INTERACTION Although the recent structural and genetic studies suggest that most of the site-specific contacts are made by residues from the a-helical regions of Cro.. and helix F of CAP are clearly used for recognition of sites in the major groove. The glutamines are both predicted to make the same contacts with adenine + 2 [Figure 4 (top) . Although helix 3 of Access provided by Kangwon National University on 03/08/15. whereas Ser45 of respressor appears to donate a single hydrogen bond to the N7 of guanine . However the contacts predicted for the serines are different. helix 3 of repressor. but these residues can still make the same contacts with adenine + 2 because the glutamine side chains. As discussed with respect to the Gln33 . Rev. can reach this base. repressor and CAP. These different predictions arise from differences in the structures of the two protein dimers. Downloaded from www. 38)]. Gln16 at the N-terminal end of Cro's helix 2 and the corresponding Gln33 at the N-terminal end of repressor's helix 2 appear to hydrogen bond to the phosphates. the axis of helix 2 or helix E is almost perpendicular to the sugar phosphate backbone and the partial positive charges at the N-terminal ends of these helices are close to the phosphates. In the proposed complexes. such backbone contacts appear to be directly responsible for some binding specificity. In the Cro dimer the a-carbons of Ser28 are some 6 A farther apart than are the (X-carbons of Ser45 in the repressor dimer (59). By correctly orienting the helices and side chains in the major groove. For example. Lambda repressor's flexible N-terminal arm provides another example of an extended structure that is used in protein-DNA recognition. Since the Cro and repressor (X-helices contact only four to five adjacent bases within the major groove. The (X-carbons of Gln27 (in Cro) and Gln44 (in repressor) are also separated by different distances in the two proteins.. being longer.3 [Figure 4 (center)].4. these contacts may serve as "clamps" that keep helix 3 from rolling in the groove.. PROTEIN-DNA RECOGNITION 315 side chains of repressor (Figure 7).annualreviews. the use of a contact from a nonhelical region allows each protein to contact an additional contiguous base-pair. 1984. Biochem.Annu. Ser28 of Cro appears to contact adenine . it is less clear what role the preceding helices (2 or E) serve and why the two helices are so strictly conserved as a bihelical unit. This should provide a favorable electrostatic interaction. The use of .. Ser mutant of lambda repressor. Arg38 of Cro and Asn55 of repressor are thought to make specific major groove contacts and both lie beyond the C-terminal end of helix 3.. In addition. For personal use only. From a structural point of view.

and arginine can make two hydrogen bonds to guanine . model-building studies. For personal use only. since hydrogen bonds are highly dependent on the position and orientation of the donor and acceptor groups." The structural information. Initial inspection of the Cro structure (1) had suggested that the antiparallel f3-ribbon that is involved in the Cro dimer contacts might bind to the minor groove. 1984. In principle. This has always seemed reasonable. Model-building studies have also suggested that f3-sheets might be used to bind double-stranded DNA.53:293-321. and may also limit interaction energies (82). For example. their net binding energies will be reduced by the entropic cost of fixing their positions in the complex. However. Although these proposals seem plausible. PROSPECTS FOR A "RECOGNITION CODE" Even in the absence of high-resolution information from co-crystals it is possible to make a number of reasonable guesses about the general nature of any "recognition code. this use of a flexible arm allows repressor to contact both sides of the DNA helix without creating a large kinetic barrier to association. 84) or the major groove (85) of B-form DNA. since a glutamine side chain in the major groove can make two hydrogen bonds to adenine. 316 PABO & SAUER such a structure was first suggested by Feughelman et al (81).org Access provided by Kangwon National University on 03/08/15. who proposed that an extended polypeptide chain could wrap around a double helix and bind in one ofthe grooves. Downloaded from www. Rev. The right-handed twist of a f3-sheet should allow a pair of antiparallel f3-strands to fit into the minor (83. there is no structural evidence to indicate that /I-sheets are actually used in site-specific recognition. However. the use of flexible binding regions may limit specificity by allowing alternative contacts with different sequences. as the arm can wrap around the DNA after the globular portion of the protein has bound to the front of the operator site. Biochem. and genetic data make it almost certain that hydrogen bonds between side chains and the edges of base pairs are responsible for much of the specificity in protein-DNA interactions. As previously noted (37). and since hydrogen bonds are responsible for specificity in so many other biological interactions. more detailed model building studies (23) suggest that the f3-ribbon does not lie in the minor groove and suggest that contacts from this region are limited to interactions between the side chains and the phosphate backbone.annualreviews. it could have been possible that site-specific binding proteins used a simple "recognition code.Annu. The C-terminal residues of Cro apparently provide a second example of a flexible region of protein that is used to make DNA contacts. For example. since repressor's N­ terminal arms are disordered in solution but adopt a specific conformation upon binding the operator." involving a one-to-one correspondence between the amino acid side chains and the bases in the DNA.

and CAP seem to indicate that the "recognition code" is degenerate . 1984. it seems that each base pair can be recognized by several different amino acids and that each amino acid can bind to several different bases. Gly48 in repressor's helix 3 seems to play a passive role in recognition since a larger side chain at this position would cause an unfavorable steric contact between the protein and DNA. Thr. Downloaded from www. since adenine is recognized by glutamine (in both complexes). "degeneracy" ofthis type implies that the "meaning" of a particular amino acid will depend on the conformation and orientation of the protein backbone. the methyl group of Ala49 makes one of the specific major groove interactions and. However. repressor. At this stage. Thus far. no one-to-one recognition code is consistent with the current data.Annu. it is still conceivable that the list of possible interactions will be small enough so that the "code" will have a . PROTEIN-DNA RECOGNITION 317 [see Figure 4 (top and bottom) . Thus the "lock and key" analogy that describes the fit of substrates to enzymes also seems to apply to protein­ DNA interactions. i. but serine also binds to guanine (in the repressor model). but it is also recognized by serine (in the Cro complex). Serine binds to adenine (in the Cro model). Arg. in CAP. Although the favorable energies obtained from van der Waals interactions. it was conceivable that glutamine would always be used to recognize an A: T base pair and that arginine would always be used to recognize a G : C pair. Ser. This is not surprising." the overall fit of the protein and DNA surfaces is also extremely important. since the periodicity of an IX-helix has no simple relationship to the periodicity of B-DNA.annualreviews. van der Waals interactions also seem to be an important part of the recognition process. it is difficult to guess how many different bonding patterns will be used in recognition. If specificity is still to be maintained. Lys. It actually would be impossible to have any simple repeating pattern of contacts made by one IX-helix. Asn. or electro­ static interactions are important aspects of "recognition. Glu. However.53:293-321. Several of the hydrogen-bonding side chains in ero also seem to make significant van der Waals contacts with the operator. in the three proposed complexes a variety of side chains induding those of GIn. and it seems inconceivable that any simple code could have escaped notice with all of the structural and genetic information that is now available. and His have been used to make hydrogen bonds in the major groove or with the DNA backbone.e. Rev. For personal use only. For Access provided by Kangwon National University on 03/08/15. and thus we cannot know how "degenerate" the "recognition code" actually is. It is likely that these residues will be commonly used for DNA recognition. replacement of Glu181 by Leu or Val changes the specificity of DNA binding. Biochem. The repressor­ operator and Cro-operator models certainly support this idea. In repressor. However. hydrogen bonding. The proposed hydrogen-bonding interactions for Cro. Tyr. and (86)].

there is no requirement for symmetry in protein-DNA interactions. This use of a-helical regions for DNA binding appears to be a common mode of recognition. SUMMARY Annu. and CAP occur on one side of the double helix. repressor. Recognition of specific base sequences involves hydrogen bonds and van der Waals interactions between side chains and the edges of base pairs. repressor. repressor. For personal use only. However. Cro. stabilize the protein-DNA complexes. and CAP. Cro. together with backbone interactions and electrostatic interactions. like RNA polymerase. the use . The current models for the complexes of Cro.There seem to be proteins that bind left-handed Z-DNA (87) and DNA in other con­ formations. Many other DNA­ binding proteins are dimers or tetramers and their operator sequences have approximate two-fold symmetry. The DNA-binding sites are recognized in a form similar to B-DNA. and CAP use symmetrically related subunits to interact with two-fold related sites in the operator sequences. cannot be expected to provide a completely reliable "high-resolution" view of the protein-DNA complex. Most of the contacts made by Cro. repressor. These specific interactions.3 18 PABO & SAUER predictive value and could be used to locate DNA-binding regions within a protein sequence or to predict the preferred DNA sites to which a particular protein Access provided by Kangwon National University on 03/08/15.annualreviews. repressor. even when coupled with genetic and biochemical studies. They do not form cruciforms or other novel DNA structures. Some sequence-specific DNA-binding proteins. Several general principles emerge from the studies of Cro. 3. An adjacent <x-helical region contacts the DNA backbone and may help to orient the "recognition" helices. For example. 2. and CAP.53:293-321. lambda repressor contacts both sides of the double helix by using a flexible region of protein to wrap around the DNA. but it remains to be seen how these structures are recog­ nized or how proteins recognize specific sequences in single-stranded DNA. and CAP with operator DNA are probably fundamentally correct. 1. but it should be emphasized that model building alone. 4. and CAP use <x-helices for many of the contacts between side chains and bases in the major groove. form symmetric complexes. However. Biochem. Downloaded from www. 1984. It seems likely that these proteins will. like Cro. Rev. lambda repressor. 5. do not have symmetrically related subunits and do not bind to sym­ metric recognition sequences.

. H. A. R. W. 1981 . helpful discussions. Bloomfield. 1 2 : 259-84 9. Biophys. . Adhya. Acad. R. D. should be extremely helpful in evaluating and refining the current models. F. Y. 1982. which may occur upon binding to the DNA. D. Prog.. H. A. Matthews. Biornol. A. McKay. Bear. W. R. Crick. 1977. Biophys. pp. B. D. Jeffrey. Nature 294 : 2 1 7-23 6. A . Cell 1 9 : 1-1 1 5. Y. S. Roberts. Takeda. 20. ACKNOW LEDGMENTS We thank our many colleagues for advice.. AI-15706). Ohlendorf..... Takeda. W.. Poteete. 1980. M. C. G. S.. Gussin. T. Anderson. C. Wang. Takeda.. W. R.. Maniatis.. Takeda. L. 0. Meyer. Ann.. A.. J. W.. H.. Nature 290 : 744-49 3. W. V. In Biological Regulation and Development. P. 1984. 1983. B... 17. B. Rev. W. 93-121. Y. D. A 34 : 93 1-35 25. D. In press 29. 1982. S. Harvard Univ. 1953. 0. Dynam. Ohlendorf. I.. J. Fisher.. Ptashne. B. and unpublished information.. W. Proc. ed. Y. Bio­ chern. M. D.. M.. Stahl. R. 23. W. Res. J. von Hippe\. Pirrotta. Sauer. 1. M. Rosenberg.. Echols. 1977. Y. McSwiggen. W.. A. 1 : 279-347.. c. Crystallographic studies of the Cro-operator (91) and repressor-operator complexes (38). R. In press Hsiang. For personal use only. M. 1 970. P. Hopkins. Johnson. D. 24. A.. B . S... 198 1 . Jeffrey. Johnson. D.. H. Richards. USA 75 : 1 783-87 16. G. M. Recent studies of B-DNA duplexes have revealed sequence-dependent variations in local structure that could affect protein recognition (8890). H.. Lauer. A. H. R. T. F. Matthews... R. J. Ptashne. F. USA 76 : 200--3 26. Proc. ed. B. Downloaded from www. could also affect the detailed structure of the complex. Cambridge.. D. Ohlendorf. D.. 1983. Sci. M. C. Rev. Ptashne. 247 : 1 28-40 7. Johnson. O. Y.. PhD thesis. In Lambda II. M. D.. Johnson. J.. New York : Plenum 1 3.. Lancelot. Acta Cryst. T. D. J. R. Anderson. Am. 1... Nature 270 : 274-75 Takeda. Crothers. 1982. Hukins.. 1975. B.. Pabo. Takeda. M. 1982. F. N. Ackers. Goldberger. 1971. H. A.. A. Science 221 : 1020--26 1 1. F.. Biochem. Mass. G. Tinoco. of standard B-DNA geometry for the operator is clearly an approximation. A. Maurer. Steinberg. F. Matthews.. Brady. A. 22.. Mol. Biophys. Lewis.. W. A. Anderson. Garges.PROTEIN-DNA RECOGNITION 3 19 Annu. Backman. D. Sci. H. Commun. 39 : 1-68 14. R. W. Physical Chemistry of Nucleic Acids. Cole. 252 : 6177-83 Watson.. Ptashne. Struct. Weisberg. D. New York : Harper & Row 27. M. H. 1982... 47 : 1 504-9 Ohlendorf. G. Pabo. Work performed in our laboratories was sup­ ported by grants to C. D. T. M. G. Cell 29 : 287- 89 8. Bioi. A. D. Ohlendorf.. 1983. Literature Cited 1 . Shimatake. D. 1. C. 5 3 : 389-446 12. Hendrix.. M. Cold . 1984. Nature 290 : 754-58 2. F. Chem. Cold Spring Harbor. T. Matthews. Ptashne. 1974. F. from the American Cancer Society and the NIH (GM-3147 1) and to R.. R. Nature 270 : 275-77 Roberts. D.. H.. 21.. Helene. Acad. Bioi. S. 1983. M. H. G. Cell S : 109-13 15.. which are now in progress.. 1983. Mol. Nature 298 : 443-47 4. W. C. J. 19. P. Bioi. Nature 1 7 1 : 736-38 Arnott. Natl. 1982. B. O. T. Echols. Pabo. 1977. Rev: Biochern. Bioeng. A. Anderson. Steitz. Meyer. A. 1979. D . Sauer. Matthew. from the NIH (AI-16892. 1 978. 1980. Natl. et al. 1979. Sci. Biochemistry 2 1 : 4989-99 28. Ann. 1 8. P. D. Chadwick.annualreviews. NY : Cold Spring Harbor Press 10. Johnson. Ptashne. Morgan. K.. Nature 298 : 718-23 Jack. Johnson.. W. Johnson. Maurer.53:293-321. V. von Hippel. Levitt. c. Kleid. Folkmanis. Access provided by Kangwon National University on 03/08/15. 1972. Matthews.. B. Small changes in protein structure. W. 1981.

33. Yocum. Acad.annualreviews. S. USA 79 : 3097-3100 59. Quant. Matthews. Krakow.. Bioi. K. 3 1.. K.. Ptashne. New York : Wiley.... M. Weber. Nucleic Acids Res. F.. 40. Crothers. USA 78 : 401 1. A. M. Pabo.. C. Matthews. 32.. Acad. Pabo. R. 1979. R. Rosenberg. M. Z. T . G. Sci. Boschelli. M. 1973.. J. M. B. 1983. pp. Biochemistry 20 : 6 1 09-18 72. A. H. D. T... DeAnda.-88. J. D. Sci. W. Sci. T. In press 55.. Anderson. Ohlendorf. D. T. W. 38. T A. 168 : 699-7 1 3 74. Nucleic Acids Res. S.. D. 1976. Proc. Dixon. Harvard Univ. Ebright. 1978. P. Ptashne. 69. D. 1982. Irwin. Nature 298 : 441-43 Lewis. J. Quant. F.1 5 Salemme. 1982. J. Gupta. D. 1983. pp. 1978. T. Natl. C. T. Ozaki.. See Ref. Weber. 1982. H. K. J. M. Wong. T.... 1982. Lewis... Proc. R.. C.. Natl. Access provided by Kangwon National University on 03/08/15. J. 36. G. 1983.. F. Struct. Dynam. Sci. K. A . B. 1. 1 970. R. McKay. 1 0 : 1 363-68 Aiba. M. Zubay. 1983. Ho. 1983.. Natl. W. Meyer. T. 1981. K. Vol. Skopek. BioI.. 1983. 1981. 42. Y. Beyreuther. USA 79 : 5263-67 54.. Meyer. Buc. Rev. Proc.. W. Gicquel-Sanzey. Sauer. I. ed. W. I. 1982. BioI. H. Sung. 51.. M. S.. Nature 298 : 447-45 1 62. Bio­ chemistry 1 6 : 938-43 67.. Res. Nelson. 1 980. M. T. Zuiderweg. T.. D. BioI. G. H... R... 49. Fujimoto. M. 1983. B. J. W. H. Pabo.. Schiffer. T. USA 80 : 2676-80 Skopek. 1982. Anderegg. J. 1983. A.. Nucleic Acids Res. Mol. 1983. F. Busby. Pabo. Matthews. BioI. Ohlendorf. A. Natl. L.. Nature 290 : 334-35 75. M.. Takeda. Sauer. J. 47 : 419-26 56.. R. W. Cold Spring Harbor Symp. Sci. B. Mol. D. Miller. W. 1979. 41. B. J. R. Sauer. 1984. Doty. Chern. USA 67 : 1616-23 77. Mass. Natl. T. Harvard 31. K. Acad. 1983. H. 9 1 . Grutter. R. R. T. M.. 39. Cold Spring Harbor Symp. H. R. 0. Q uant. Sci. Youderian. In Molecular Structure and Biological Function. M. Buc.. Gimble. W. D. J. Anderson. H. Lewis. Hehir. Matthews. N. S. Acad. B. Proc. P.. A. J. Cambridge. 107 : 1 45-58 Nelson.. J. B. Sci. 73. 0. T. 257 : 9 1 28-34 68. O. B. 46.. Ohlendorf. V. F. 48. USA 80 : 5837-41 . P.. 1983. 1 1 : 141-58 McKay. . M. R. 1977. M. N. Kleid. B.. Natl. W. 0. Nucleic Acids Res. T. Bioi. Proc. J. Arndt. H. Remington. Poteete. P..... New York : Plenum. M. K... J. Sauer. H. F. B. T. Proc. J. 1982. P. 45. F. USA 77 : 6486-90 57.. J. Sci.. J. Biomol. R.. Anderson. Lohman.. Ohlendorf. B. 1 0 : 5085-5102 63. 34. M. Matthews. C. Steitz. Duax.. Susskind. E. Jeffrey. Mol. Nat!. Proc. Natl.. For personal use only.. Ptashne. C. Y. Natl. Steitz. 257 : 95 1 8-24 Ebright. Ptashne.. 47. Sauer. H.. T. Lewis. Bioi. Sturtevant. F. Sauer. M. Miller. R. S. Doolittle. R . Sci. Acad.98 1 . R. F. Sauer.. 7 : 121-35 64.. Hutchinson. R. Biophys. Acad. Natl. P. Jurnak. Mol. C. Ptashne. N. B. 1979. M ... Cambridge. R. 37. Lu. Krovatin. Downloaded from www. 1983. In press 69. W. 1978. USA 79 : 1428-32 6 1 . Molecular Charac­ terization of the Lambda Repressor and its Gene cI. W.. Geisler. Natl.. R. Bansal. Acad.. M. In press Majors. 1981. Wuthrich. In Structures of Biological Molecules and Assemblies. A. M. M. G. Anderson. ed. J. 53. Backman. T. Adler. 1983. Nelson. H. F. PhD thesis. Bioi. J. 36. J.. 10 : 1345-61 Cossart.. Fanning. McPherson. Mol. T.. T. Hecht.. 44... 35. A.. Bioi. Griffen. Jeffrey. 47 : 441-49 Hecht. Kaptein. Acad. I. H. M. G. 159 : 1 9-33 Wood. D. M. DeAnda. C. Proc. Acad. USA 76 : 5061-5 Humayun. Annu. R. 0.. Cold Spring Harbor Symp... Weber. M. 1977. Sasisekharan. H. Weber. W.. Cell 32 : 3 1 9-25 65. Sci. E. Nature 279 : 396-400 Maniatis. In press de Crombrugghe. Nucleic Acids. 19 : 109-14 66. 47 : 435-40 Record. 1 982. 1982. R. BioI. Steitz. D. l. Quant.. 50.. F . J.. M.. B. 169 : 757-69 60. Evai'. T. Weber. J. Acad.. R. Mol. ed. NY : Cold Spring Harbor Lab. 1 : 1-20 76. Kolb. M. K. Biochem. M. C. Matthew. R. J.. p. Ptashne. 1959. J. O. Hecht. B.. Acad. H. 35 : 283-94 30. Anderson. A.. Wang.. Bioi. 12354 71. McKay. Chern. 1978. Yamomoto. C. A. Ptashne. Proc. In Biological Regulation and Development. 52.53:293-321. 1 0 : 473-85 58. ed. Univ. deHaseth.320 PABO & SAUER Spring Harbor Symp. Mass. T. Miller. Johnson. Krovatin. BioI.. 1982. W. R. PhD thesis. Bio­ chemistry 20 : 4774-80 Fried.. T. 4 : 1595-607 Sauer. 1967. Proc. USA 70 : 1 5 3 1-35 Pabo. A. Sci. Hutchinson. K. Steitz. Proc. 1982. Nucleic Acids Res. 1983. C.. 1979. R. H. Edmundson. Beyreuther. Reznikoff.. Bio­ chemistry 17 : 1092-1 100 Pabo. H. A. Ladner. Mol. T. 43. 70. In The Operon. T. New York : ElsevierjNorth Holland Aiba. Cold Spring Harbor. M. H. Steitz. Sauer. R. 1982.. USA 76 : 1 608-12 Sauer. H. J.

S. R. Dickerson. Seeds. Acad. H. J. G.53:293-321. F. A. M. 1978. Natl.. Bioi. Rich.. R. 1982. R. USA 79 : 7729· 33 88. Nature 271 : 1 30-35 Hoi. K. Doolittle. G. N. J. J. E. Timko. R. Anderson. Kraut. 1978. L. Rich. Vandon­ felaar. Sci. 1982.. S. Wilson.. Quart. 85. Simon.. Kim. van Duijnen. J. McCarthy. J. 87. 1983... 82. E. 1979. Bioi. Sci.. J. Nature 273 : 443-47 Pullman. J.. 1 0 : 6 1 1 9-30 93. B. B. B. J.. W. 168 : In press 92. Acad. Nature 268 : 1 1 5-20 Seeman. F.. 83.. F. 1979. A.. G. R. C. Steitz. G. Ikuta. Stokes. Proc. Horwitz. Proc... Nature 237 : 322-27 Warrant. S. C. Kozlowski. H. J. 1984... P. D. Miyada. Cygler. Zieg. H. Sci. Sci. C. D.... J. H. L.annualreviews. M. et al. Annu. Proc. et al. USA 73 : 804-8 Nordheim. Acad. F. Mol. Langridge. 1983. Sussman. A.. 1983.-H. 1977.-H. W. 80. 1972. C. Itakura. 1955. Geisler. R.. USA 71 : 283-87 Church. Nature 1 75 : 38 Huber.. Cass. J. Azorin. eell l 8 : 1 1 53-63 95.PROTEIN-DNA RECOGNITION 78.. Acad. Natl. G. A. 198 1 . Mol. Pullman.. Silverman.. A. Heffron. J. C.. T. Patel. W. Mol. Rev. G. et al. 149 : 761-86 89. 166 : 41 9-41 90. 81. F. Nature 300 : 38 1-83 96. Acad. S. R. Berendsen. W. Ohtsubo. A.. Rev. N. 1 982. 1 974.. J.. Y.. H. Dickerson.. R. A. Matthews. J. W. M. R. I.. Gronenborn. Oatley. Sci. Shaw. Bioi... C. 8 : 5267-65 .. 84. Drew. Science 209 : 1 370-74 94. Proc. Sci. D. Reed. R.. 1980. Nucleic Acids Res. Ohtsubo.. 4 : 27 1-76 Access provided by Kangwon National University on 03/08/15. Ohlendorf. Shibuya. Natl. M. J.. Mandel. Tesser. Guest.. R. E.. 1977.. Trends Biochem. Kim. Biophys. 1 9 81 .. M. USA 80 : 21 84-88 9 1 . Rosenberg. 1 4 : 289-380 Feughelman. Nucleic Acids Res.. M. Proc. Biochem. E. S. P. 1. Natl. H. Moller. Downloaded from www. B. R. W. 86. 321 Kwon. For personal use only. Natl. 1976. Wilcox. 79. 19K1. A.. USA 74 : 1458-62 Blake.