The specificity of the signal sequence cleavage reaction has been postulated to
reside in a signal peptidase active site that can bind only to particular (i, i+2)
pairs of amino acids. In this paper, we present further patterns of non-random
amino acid utilization in a region around in vivo cleavage sites, and show that
they can be interpreted in terms of selection acting to reduce the number of
potential competing sites in the vicinity of the correct one.
1. Introduction
Proteins destined for export are generally synthesized with a 15 to 25 amino acids
long, hydrophobic N-terminal extension that somehow initiates the export
process. This so-called signal sequence is removed from the protein once export is
under way through the action of an endoproteolytic “signal peptidase”. Usually,
the specificity of the cleavage reaction is very high, an observation that has been
hard to reconcile with the very limited degree of sequence homology found
amongst different signal sequences.
In a recent study (von Heijne, 1983), based on a collection of 78 eukaryotic
signal sequences, we showed that the region around the cleavage site shows strong
preferences for particular amino acids in particular positions. Small, neutral
residues abound in positions -1 and -3 (counting from the cleavage site between
positions -1 and +1) but are rare in -2. Conversely, aromatic, charged, and
large polar residues are absent from positions -1 and -3 (except Gln in -1), and
abundant in -2. Pro is absent from the region -3 to +1 but quite common in
-5, and Gly is found predominantly in positions -4 and -1. Upstream from
position -5 (eukaryotes) or -6 (prokaryotes), finally, hydrophobic residues
dominate strongly, forming a hydrophobic core in the middle part of the signal
sequences.
These observations led us to propose that an acceptable cleavage site must fulfil
a “(-3, -1) rule”, i.e. it must have either Ala, Ser, Gly, Cys, Thr or Gln in
position -1, and must not have an aromatic (Phe, His, Tyr, Trp), charged (Asp,
Glu, Lys, Arg), or large polar (Asn, Gln) residue in position -3, as well as no Pro
244 G. VON HEIJNE
calculated for binomial distributions with mean amino acid frequencies as given by Levitt
(1978) for eukaryotic proteins, and by Lehninger (1970) for Escherichia coli proteins. In
positions -3 and -1, only acceptable residues have been included in the expected
distributions (with the same relative frequencies as in the tabulations of Levitt and
Lehninger).
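The expected distributions described above can be illustrated with a short sketch. It assumes one background frequency per residue and n aligned sequences, and computes binomial tail probabilities for the number of occurrences at a single position. The function names and the frequencies in `background` are hypothetical placeholders chosen for illustration; they are not the values tabulated by Levitt (1978) or Lehninger (1970). The `renormalize` helper shows one way to restrict the expected distribution to acceptable residues while keeping their relative frequencies, as described for positions -3 and -1.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k occurrences in n sequences at frequency p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_most(k: int, n: int, p: float) -> float:
    """Lower tail: probability of k or fewer occurrences."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


def prob_at_least(k: int, n: int, p: float) -> float:
    """Upper tail: probability of k or more occurrences."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


def renormalize(freqs: dict, acceptable: set) -> dict:
    """Keep only acceptable residues, preserving their relative frequencies."""
    total = sum(freqs[r] for r in acceptable)
    return {r: freqs[r] / total for r in acceptable}


# Placeholder background frequencies (assumed values, NOT from Levitt or Lehninger).
background = {"A": 0.08, "S": 0.07, "G": 0.07, "L": 0.09}

# Expected distribution for a position where only A, S and G are acceptable.
restricted = renormalize(background, {"A", "S", "G"})

# Chance of seeing a residue of frequency 0.08 at most once among 78 sequences.
p_low = prob_at_most(1, 78, background["A"])
```

A small lower-tail probability would indicate that a residue is rarer at that position than its background frequency predicts, and a small upper-tail probability that it is more common.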
CLEAVAGE OF SIGNAL SEQUENCES 247
1 MMAAGPRTSLLLAFALLCLPWTQVVG-A*FPAMSLSGLF
2 MNSQVSARKACTLLLLMMSNLLFCQN-VQT"LPVCSGGDCQ
3 MKLAITLALVTLALLCSPASA'G-ICPRFAXVI
4 MILCSYWHVGLVLLLFSCCCLVLG*S"EHETRLVAN
5 MENVRRMALGLVFMMALALSGVG-A*S-VMEDTLLSV
6 MGNIHFVYLLISCLYYSGCS-G*VNEEERLIND
7 NGLEKSLFLFSLLJLVLGWVQPSLG*G-ESSRDK
8 MLLQAFLFLLAGFAAKISA*S-MXXXXXXXX
9 MRYMILGLLALAAVCS-A'A-KKVEFKEPA
FIG. 1. Signal sequences with known (nos 1 and 2) or possible alternative cleavage sites in positions
-2 and +1. Sequence 1, bovine growth hormone; 2, rat preprolactin; 3, rabbit uteroglobin;
4, acetylcholine receptor α subunit; 5, acetylcholine receptor β subunit; 6, acetylcholine receptor δ
subunit; 7, pancreatic RNase; 8, yeast invertase; 9, adenovirus glycoprotein. References are cited by
von Heijne (1983) or in Materials and Methods. * The normal (or predominant) cleavage site;
*, acceptable alternative sites. Amino acids are given in the one-letter code; i.e. A, Ala; C, Cys; D, Asp;
E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser;
T, Thr; V, Val; W, Trp; Y, Tyr; X, unknown.
cleavage site, as indeed they are not (Table 1), since they do not compete well
with Ala for position -1.
Still, there are examples of ambiguous cleavage to be found in the literature.
Bovine growth hormone is perhaps the best example, where 65% of the cleavage
takes place at an Ala residue and the remaining 35% at a neighbouring site with an
acceptable Gly residue in position -1 (Fig. 1; Lingappa et al., 1977). Human
interferon, cloned into yeast cells, is also cleaved at more than one acceptable site
(Hitzeman et al., 1983). Finally, rat preprolactin is miscleaved when a more bulky
Thr analogue is incorporated at the normal cleavage site, cleavage now taking
place three residues upstream at an Asn residue (the nearest acceptable site.
TABLE 2
Number of forbidden and acceptable sites (forbidden : acceptable)
with Ala, Ser and Gly at the potential cleavage site

Position   -5    -4    -3    -2    +1    +2    +3    +4    +5
           1:0   0:3   1:2   1:0   0:0   0:0   0:0   0:0   0:0
           4:5   2:0   7:2   3:2   2:3   1:4   3:1   0:0   1:4
           5:5   2:3   8:4   4:2   2:3   1:4   3:1   0:0   1:4
1a MIQKAKRTVSFRLVLMCTLLFVSLPITKTSA*VNGTLMQYFEWYTP
1b QACPPETLVKVKDAEDPLCA
2a MSIQHFRVALIPFFAAFCLPVFA*HPETLVKVKDAEDQ
2b AFLF
2c S
2d L
3a MKKSLVLKASVAVATLVPMLSFA*AEGDDPAKAAFDSL
3b L
3c Y
(Pro to Ser), or not at all (Pro to Leu). In these two cases, the (-3, -1) rule
is not violated. Exchanging Pro for a more hydrophobic residue, however, has
the effect of extending the hydrophobic core towards the mature protein, and the
window for cleavage defined by the core C terminus now encompasses not the
original cleavage site but part of the mature chain, where no potential site exists
(see Fig. 2).
A similar case has been described for the phage M13 coat protein, where an Asp
to Leu or Asp to Tyr (i.e. charged to hydrophobic) replacement in position +2 in
the mature protein slows down cleavage (Boeke et al., 1980). Again, cleavage
might be impaired in these mutants as the result of an erroneous window rather
than as the result of a direct effect on the signal peptidase active site.
In this discussion, we have deliberately left out cleavage mutants obtained for
the E. coli lipoprotein (Inouye et al., 1983), since it is known that this and similar
lipoproteins, as well as a number of membrane-bound penicillinases (Nielsen &
Lampen, 1982), are processed by a special signal peptidase that does not cleave
other exported bacterial proteins (Tokunaga et al., 1982).
4. Conclusion
The (-3, -1) signal peptidase recognition site suggested in an earlier paper
(von Heijne, 1983), and also by Perlman & Halvorson (1983), is well suited to
explain both the results from studies on cleavage-deficient mutant proteins and
the patterns of amino acid selection around the cleavage site presented here. In
particular, we suggest that the very low incidence of Ala in positions -2 and +2
to +5, and the observation that 15 out of 16 alanine residues found in position
+1 do not make acceptable cleavage sites according to the (-3, -1) rule, show
that a region approximately between positions -2 and +5 is subject to selection
aimed at reducing cleavage site ambiguity. A stretch of about seven residues
would thus be accessible to the signal peptidase active site; its position would be
determined by the position of the C terminus of the hydrophobic core (the
cleavage window would start 4 to 5 residues downstream from the core C
terminus), and cleavage would take place at the “strongest” site allowed by the
(-3, -1) rule inside the cleavage window. In some cases, more than one site of
comparable strength may be used in vivo, giving rise to N-terminal heterogeneity
among the mature chains. A first attempt at formulating a quantitative
prediction scheme along these lines has been published (von Heijne, 1983).
Finally, the very low incidence of Gly in position -3 is noteworthy in the
context of a (-3, -1) recognition site.
This work was supported by a grant from the Swedish Natural Sciences Research
Council.
REFERENCES
Bell, G. I., Santerre, R. F. & Mullenbach, G. T. (1983). Nature (London), 302, 716-718.
Bernstein, K. E., Premkumar Reddy, E., Alexander, C. B. & Mage, R. G. (1982). Nature
(London), 300, 74-76.
Boeke, J. D., Russel, M. & Model, P. (1980). J. Mol. Biol. 144, 103-116.
Chandra, T., Stackhouse, R., Kidd, V. J. & Woo, S. L. C. (1983). Proc. Nat. Acad. Sci.,
U.S.A. 80, 1845-1848.
Godine, J. E., Chin, W. & Habener, J. F. (1982). J. Biol. Chem. 257, 8368-8371.
Goodman, R. H., Aron, D. C. & Roos, B. A. (1983). J. Biol. Chem. 258, 5570-5573.
Gurr, J. A., Catterall, J. F. & Kourides, I. A. (1983). Proc. Nat. Acad. Sci., U.S.A. 80,
2122-2126.
Hitzeman, R. A., Leung, D. W., Perry, L. J., Kohr, W. J., Levine, H. L. & Goeddel, D. V.
(1983). Science, 219, 620-625.
Hortin, G. & Boime, I. (1981). Cell, 24, 453-461.
Inouye, S., Hsu, C. S., Itakura, K. & Inouye, M. (1983). Science, 221, 59-61.
Kenten, J. H., Molgaard, H. V., Houghton, M., Derbyshire, R. B., Viney, J., Bell, L. O. &
Gould, H. J. (1982). Proc. Nat. Acad. Sci., U.S.A. 79, 6661-6665.
Koshland, D., Sauer, R. T. & Botstein, D. (1982). Cell, 30, 903-914.
Land, H., Grez, M., Ruppert, S., Schmale, H., Rehbein, M., Richter, D. & Schutz, G.
(1983). Nature (London), 302, 342-344.
Lee, J. S., Trowsdale, J., Travers, P. J., Carey, J., Grosveld, F., Jenkins, J. & Bodmer,
W. F. (1982). Nature (London), 299, 750-752.
Lehninger, A. L. (1970). Biochemistry, p. 93, Worth, New York.
Levitt, M. (1978). Biochemistry, 17, 4277-4285.
Lingappa, V. R., Devillers-Thiery, A. & Blobel, G. (1977). Proc. Nat. Acad. Sci., U.S.A. 74,
2432-2436.
Litman, G. W., Berger, L., Murphy, K., Litman, R., Hinds, K., Jahn, C. L. & Erickson,
B. W. (1983). Nature (London), 303, 349-352.
Lofdahl, S., Guss, B., Uhlen, M., Philipson, L. & Lindberg, M. (1983). Proc. Nat. Acad. Sci.,
U.S.A. 80, 697-701.
Long, E. O., Wake, C. T., Gorski, J. & Mach, B. (1983). EMBO J. 2, 389-394.
MacDonald, R. J., Stary, S. J. & Swift, G. H. (1982). J. Biol. Chem. 257, 14582-14585.
Marks, M. D. & Larkins, B. A. (1982). J. Biol. Chem. 257, 9976-9983.
Michaelis, S. & Beckwith, J. (1982). Annu. Rev. Microbiol. 36, 435-465.
Mizuno, T., Chou, M.-Y. & Inouye, M. (1983). FEBS Letters, 151, 159-164.
Moriuchi, T., Chang, H.-C., Denome, R. & Silver, J. (1983). Nature (London), 301, 80-82.
Nielsen, J. B. K. & Lampen, J. O. (1982). J. Biol. Chem. 257, 4490-4495.
Noda, M., Takahashi, H., Tanabe, T., Toyosato, M., Kikyotani, S., Furutani, Y., Tadaaki,
H., Takashima, H., Inayama, S., Miyata, T. & Numa, S. (1983). Nature (London), 302,
528-532.
Overbeeke, N., Bergman, H., van Mansfield, F. & Lugtenberg, B. (1983). J. Mol. Biol. 163,
513-532.
Palva, I., Sarvas, M., Lehtovaara, P., Sibakov, M. & Kaariainen, L. (1982). Proc. Nat.
Acad. Sci., U.S.A. 79, 5582-5586.
Perlman, D. & Halvorson, H. O. (1983). J. Mol. Biol. 167, 393-409.
Tokunaga, M., Loranger, J. M., Wolfe, P. B. & Wu, H. C. (1982). J. Biol. Chem. 257, 9922-
9925.
Uhler, M. & Herbert, E. (1983). J. Biol. Chem. 258, 257-261.
von Heijne, G. (1983). Eur. J. Biochem. 133, 17-21.
Wiebauer, K., Domdey, H., Diggelmann, H. & Fey, G. (1982). Proc. Nat. Acad. Sci., U.S.A.
79, 7077-7081.
Yelverton, E., Norton, S., Obijeski, J. F. & Goeddel, D. V. (1983). Science, 219, 614-620.
Edited by S. Brenner
Note added in proof: We have recently made an observation further underlining the
special status of Ala residues near the cleavage site (von Heijne & Flinta, unpublished
results): Ala, when in position +1, is remarkably often followed in position +2 by either
Pro (9 out of 20 cases found in our current data base of 130 eukaryotic and 29 prokaryotic
signal sequences) or Glu (6 out of 20 cases). Other residues in position +1, such as Gly or
Ser, do not seem to correlate with the kind of residue found in position +2.