
Synthetic Peptides and Proteins to Elucidate Biological Function

Christian F. W. Becker and Roger S. Goody, Max-Planck Institut für Molekulare Physiologie, Dortmund, Germany

doi: 10.1002/9780470048672.wecb583

Advanced Article

Article Contents
Synthesis of Biological Macromolecules
Chemical Methods for the Generation of Large Polypeptides
HIV-1 Protease as a Paradigm for Elucidating Biological Function by Chemical Protein Synthesis
Semisynthetic Proteins of the Ras-Superfamily
Split-Inteins for Protein Semisynthesis in vitro and in vivo
Prospects

The enormous progress made in the use of gene technological techniques over the past several decades has been the main driving force in the accumulation of our knowledge of biology at the molecular level. This progress has at times tended to push more classic approaches, such as those stemming from synthetic chemistry, into the background, and there has even been a tendency to regard contributions from this area as being superfluous. This attitude has begun to change recently, with the emergence of the field now referred to as chemical biology, and it is now appreciated that synthetic chemistry can make a unique contribution to the outstanding problems in fundamental biological and medically oriented research. The full potential of these methods is beginning to be realized in the area of peptide and protein synthesis, and this will be the topic of this article.

Synthesis of Biological Macromolecules


One of the early dreams of synthetic chemists was to achieve the total synthesis of important complex biological molecules (1). At the level of polymeric molecules, this includes proteins, nucleic acids, and polysaccharides. In all cases, early work initially involved synthesis of small fragments of the polymeric molecules (peptides, oligonucleotides, oligosaccharides) and addressed, and partially solved, the initially formidable synthetic obstacles, in particular those concerning protection and deprotection to prevent reactions occurring at unwanted positions of the molecules involved.

The seminal breakthrough that led to extension of these methods to longer polymers in reasonably short periods of time was made by Merrifield (2), who was the first to show that synthesis of polymeric biological molecules could be achieved on a solid support, thus removing or at least dramatically simplifying the need for time-consuming purification and isolation of intermediates after the addition of each monomer. Merrifield introduced this principle for peptide synthesis, but in fact polynucleotide synthesis, in particular DNA synthesis, proved to work at least as well, and in terms of reaching the long-term aim of total synthesis of biological macromolecules was the first to be accomplished successfully and in relatively routine fashion (3). This is largely due to the fact that oligonucleotide synthesis of fragments of a length of ca. 50 nucleotides is relatively facile on a solid support, and that enzymes can be used to ligate such fragments in a directed fashion to achieve the goal of total gene synthesis. Although this is not the most routinely used method for generation of complete coding regions for specific proteins, there are often situations where this is the method of choice, because it allows complete control of codon usage to optimize protein expression in the organism to be used. Gene synthesis is now offered on a commercial basis and plays a significant role in modern biological research.

Progress toward total synthesis of proteins has been slower, mainly due to the lack of easy availability of an enzymatic procedure equivalent to DNA ligation that would allow coupling of peptides of a length that can be conveniently prepared by solid-phase synthesis (depending on sequence, the largest fragments that can be produced are between 50 and 100 residues long). This situation has changed significantly over the past 10–15 years with the introduction and widespread use of methods for the ligation of protein fragments together with the combination of the methods of synthetic chemistry with techniques originating in biology. In the following, we initially discuss the advances that have been made at the technical level, and then introduce some of the many applications that exploit the new methods for the study of biologically important processes.
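The control over codon usage that gene synthesis offers can be illustrated with a minimal Python sketch. It back-translates a protein sequence into DNA using one chosen codon per amino acid and checks the result by translating it again. The `PREFERRED` table is a hypothetical, illustrative choice and not a measured codon-usage table for any organism; the function names are likewise assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical "preferred" codon for each amino acid (illustrative only).
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str, preferred: dict = PREFERRED) -> str:
    """Return a DNA sequence encoding `protein` with one fixed codon per residue."""
    return "".join(preferred[aa] for aa in protein)


def translate(dna: str) -> str:
    """Translate DNA (reading frame 1) with the standard genetic code."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


peptide = "MKTAYIAKQR"  # arbitrary example sequence
gene = back_translate(peptide)
print(gene, translate(gene) == peptide)
```

A real design would draw the codon choice from usage data for the expression host and would also avoid unwanted restriction sites or secondary structure, which this sketch ignores.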

WILEY ENCYCLOPEDIA OF CHEMICAL BIOLOGY 2008, John Wiley & Sons, Inc.


Chemical Methods for the Generation of Large Polypeptides


The principles of solid-phase peptide synthesis have been reviewed extensively and will not be repeated here, except to remind the reader that this usually involves attachment of a suitably protected amino acid, which will become the C-terminal residue in the finished sequence, via its carboxyl group to a polymeric support. After exposing the N-terminus of this residue, this is allowed to react with the next protected and activated amino acid to form a peptide bond between the last and the penultimate amino acids in the target sequence. Repetition of this cycle allows the stepwise construction of the desired polypeptide from the direction of the C-terminus toward the N-terminus. Removal of protecting groups and cleavage from the solid support leads to the free polypeptide. The procedure outlined here is limited to oligopeptides and polypeptides of up to ca. 50 amino acids, and thus, it limits the availability of fully synthetic proteins, because most proteins or functional domains are at least ca. 100 residues long. A solution to this problem would be to generate polypeptides with a length of several tens of amino acids and then to couple (or ligate) them to produce significantly larger proteins. In earlier work, this principle was used in a block condensation approach using fully protected polypeptides, but this did not prove to be a viable procedure in most cases. Another approach is to connect fragments of the protein through non-peptide linkages, using chemistry that obviates the need for side-chain protection. An example of this approach is given below, and it can be put to good use in certain cases.

A major breakthrough was the introduction of the method known as native chemical ligation in 1994 (4). In this procedure, a peptide or polypeptide bearing a C-terminal thioester is mixed in aqueous solution under mild conditions with another peptide or polypeptide harboring an N-terminal cysteine residue (Fig. 1). The ligation reaction involves a thioester exchange reaction followed by an S→N acyl transfer to generate a native peptide bond, a reaction that had been reported much earlier (5) but that had not been considered as a ligation method. Chemical ligation has been used for the total synthesis of a large number of proteins in recent years, as described in several reviews (6–9), and recent examples extend the size range to the order of 200 amino acids, in this case using multiple ligation steps (10, 11). Despite this progress, an attractive approach that is being used increasingly is that of a combination of synthetic and molecular biological methods in the technique referred to as expressed protein ligation, as discussed in a later section.
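The two constraints described above (fragments short enough for solid-phase synthesis, and an N-terminal cysteine on every fragment that is ligated onto a thioester) can be turned into a simple planning exercise. The following Python sketch is an illustration under stated assumptions, not a published algorithm: the function name, the greedy strategy, and the default limit of 50 residues are choices made for this example.

```python
def plan_ligation_fragments(sequence: str, max_len: int = 50) -> list:
    """Split `sequence` into fragments of at most `max_len` residues such that
    every fragment after the first starts with Cys ("C").

    Greedy strategy: from the current start, cut in front of the right-most
    Cys that still keeps the fragment within `max_len`.
    Raises ValueError if no Cys is available within reach.
    """
    fragments = []
    start = 0
    n = len(sequence)
    while n - start > max_len:
        cut = -1
        # Search from the farthest allowed position back toward the start.
        for i in range(start + max_len, start, -1):
            if sequence[i] == "C":
                cut = i
                break
        if cut == -1:
            raise ValueError(
                f"no Cys within {max_len} residues after position {start + 1}"
            )
        fragments.append(sequence[start:cut])
        start = cut
    fragments.append(sequence[start:])
    return fragments


# Hypothetical 92-residue test sequence with Cys at positions 31 and 72.
demo = "A" * 30 + "C" + "A" * 40 + "C" + "A" * 20
for frag in plan_ligation_fragments(demo):
    print(len(frag), frag[:5] + "...")
```

Sequences without a suitably placed cysteine raise an error here; in practice such cases motivate either introducing a cysteine or using a different ligation chemistry.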

Figure 1 Native chemical ligation (NCL) between two unprotected peptide segments. The initial transthioesterification reaction leads to an intermediate that undergoes an S→N acyl shift via a five-membered cyclic transition state and generates a native amide bond at the ligation site.

HIV-1 Protease as a Paradigm for Elucidating Biological Function by Chemical Protein Synthesis


Chemical protein synthesis and semisynthesis have been used to study the molecular basis of protein function in numerous cases. One of the very early and most impressive applications of chemical synthesis to the production of functional as well as site-specifically modified enzymes concerns the protease from human immunodeficiency virus 1 (HIV-1 PR). This enzyme cleaves the gag-pol polypeptide into functional proteins during virion budding from host cells and is essential for replication of the virus (12). Inhibitors of HIV-1 PR are an important class of anti-HIV drugs, and their development is at least partially based on the availability of structural and molecular information obtained with chemically synthesized HIV-1 PR.

First Access to HIV-1 PR


HIV-1 PR is a homodimer made up of 99 amino acids (per monomer) that was made accessible for the first time in 1988 by Schneider and Kent, who synthesized this protein using solid-phase peptide synthesis (SPPS) (13). An automated, rapid, and highly efficient procedure in combination with purification by size exclusion chromatography was used to generate a partially purified HIV-1 PR (14), which then also became available later on in 1988 by recombinant expression in Escherichia coli (15). Proteins generated by these two procedures had the same enzymatic properties. After the initial synthesis of HIV-1 PR, one advantage of this methodology, namely the possibility to incorporate unnatural amino acids during chemical synthesis, was demonstrated by replacing all cysteine residues in HIV-1 PR by α-amino-n-butyric acid. The resulting enzyme was fully active and was crystallized to obtain one of the first three-dimensional structures of HIV-1 PR that formed the basis for structure-assisted design of HIV-1 PR inhibitors (16, 17). At the same time this structure confirmed that chemically synthesized proteins can fold and crystallize identically with proteins from natural sources. Three different crystal structures of chemically synthesized HIV-1 PR with bound peptide inhibitors were subsequently published and contributed to the further development of HIV-1 protease inhibitors (18–20).

Backbone Engineering of HIV-1 PR


The flexibility of chemical protein synthesis was used to introduce changes into the protein backbone that could not be incorporated by other means. This paved the way for a general protein engineering approach and at the same time introduced the possibility of joining two fully unprotected peptide segments by a chemoselective reaction that generated an unnatural thioester bond between Gly51 and Gly52 of each HIV-1 PR subunit (Fig. 2a). The thioester linkage was generated by reacting an N-terminal HIV-1 PR peptide segment (aa 1–51) carrying a C-terminal thioacid with a C-terminal segment (aa 52–99) having the N-terminal glycine replaced by bromoacetic acid and all additional cysteine residues replaced by α-amino-n-butyric acid (Fig. 2a) (21). This constitutes an early example of a chemoselective ligation reaction that provided access to a medium-sized protein by linking two smaller unprotected polypeptides (easily accessible by SPPS) without the need for elaborate protection schemes as used in fragment condensation reactions. The resulting enzyme exhibited full activity, even though the thioester bond was placed inside a flexible β-hairpin loop (flap region) of HIV-1 PR, a region that undergoes drastic conformational changes during substrate and inhibitor binding. This is due to the positioning of the two glycine residues on the outside of the flaps, away from the substrate. However, the synthesis of another HIV-1 PR analog by Kent and Baca placed the thioester bond between Gly49 and Ile50, leading to a reduction in catalytic activity by a factor of 3000 (Fig. 2b) (22). This constituted the first experimental evidence that hydrogen bonds between the backbone of the flap region and the substrate are important for catalytic activity. However, substrate specificity and affinity were not affected. These particular hydrogen bonds are transmitted from the protease backbone to the substrate via an internal water molecule and are believed to contribute to the distortion of the scissile bond of the substrate (23).

Figure 2 Chemical synthesis of backbone engineered HIV-1 protease. The peptide segments were synthesized by SPPS and harbored a C-terminal thioacid (HIV-1 PR, aa 1-49/51, shown in blue) or an N-terminal bromoacetic acid modification (HIV-1 PR 51/53-99, shown in red). These unique functional groups lead to an unnatural thioester bond either between Gly51 and Gly52 (Strategy A; the ligation site is shown in yellow in the cartoon representation of the HIV-1 PR dimer) or between Gly49 and Ile50 (Strategy B; the ligation site is located at the end of the N-terminal peptide segment depicted in blue). The chemoselective ligation reaction is followed by purification steps and folding of the protein into its functional conformation. Strategy A led to a fully functional HIV-1 PR, whereas strategy B led to a severe reduction in catalytic activity. The functional dimer of the HIV-1 PR is drawn as a cartoon with only one subunit showing the modifications introduced during synthesis. The second subunit (in green) is shown unmodified for clarity. Aspartic acid 25, site-specifically labeled with 13C for NMR spectroscopy studies, is shown in magenta in one HIV-1 PR subunit.

The applicability of the thioester-forming chemoselective ligation approach was broadened by the fact that this chemistry can be carried out under acidic conditions in the presence of sulfhydryl groups. By taking advantage of this selectivity of the alkylation reaction, two different HIV-1 PR monomers were prepared. These monomers carried a free sulfhydryl group at their N- or C-terminus, respectively, and were, subsequent to the thioester-forming ligation step, joined together by a disulfide linkage to generate tethered dimers of two distinct HIV-1 PR subunits (24). This tethering of the two subunits produced one of the largest functional proteins prepared by chemical synthesis at that time and allowed the preparation of HIV-1 PR molecules with asymmetrically placed subunits. One example of such asymmetrical HIV-1 PR analogs was constructed with one subunit having a thioester bond between Gly51 and Gly52, which did not interfere with the biological activity of the protease, and a subunit that had a thioester bond between Gly51 and Gly52 and an additional ester bond instead of an amide bond between Gly49 and Ile50 (23). By replacing an amide with an oxygen atom in a unique position, no backbone hydrogen bond to a substrate carbonyl (via a water molecule) can be formed. Therefore, such a construct should exhibit a highly reduced catalytic activity if both flap regions are required to form hydrogen bonds. However, the ester analog of HIV-1 PR showed a reduction of kcat by only a factor of 2 upon this atom replacement. This demonstrated that only one flap region is used by the enzyme for catalysis and that the slightly reduced enzymatic activity of the ester analog is caused by the fact that, in such an asymmetric dimer, only one substrate orientation leads to productive binding. This is a vivid example of chemical protein synthesis as a unique tool in the quest of elucidating the molecular basis of enzyme catalysis.
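To put the 3000-fold and 2-fold changes in catalytic activity discussed above on an energy scale, a fold change in rate can be converted into an apparent change in activation free energy using the transition-state-theory relation ΔΔG‡ = RT ln(fold change). The Python sketch below is only an illustration: the temperature of 298.15 K is an assumption (the text does not state the assay temperature), and treating the whole activity change as a change in a single rate constant is a simplification.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def activation_energy_penalty(fold_reduction: float,
                              temperature_k: float = 298.15) -> float:
    """Apparent ddG (kJ/mol) for a given fold reduction in rate.

    Uses ddG = R * T * ln(fold_reduction); the temperature default is an
    assumed value, not one taken from the article.
    """
    if fold_reduction <= 0:
        raise ValueError("fold_reduction must be positive")
    return R_GAS * temperature_k * math.log(fold_reduction) / 1000.0


cases = (
    ("thioester between Gly49 and Ile50", 3000.0),
    ("ester in one subunit of the tethered dimer", 2.0),
)
for label, fold in cases:
    print(f"{label}: {activation_energy_penalty(fold):.1f} kJ/mol")
```

Under these assumptions the 3000-fold loss corresponds to roughly 20 kJ/mol, whereas the 2-fold loss is below 2 kJ/mol, which is consistent with the statistical explanation given in the text rather than with the loss of a catalytically important interaction.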

Site-Specific Side Chain Labeling of HIV-1 PR


The incorporation of an aspartic acid residue with a 13C atom at the side-chain carboxyl function at position 25 into this aspartyl protease has made this catalytically essential group visible for nuclear magnetic resonance (NMR) spectroscopy (Fig. 2) (25). The chemical shifts of this 13C atom were observed as a function of the pH and the presence and absence of substrate or inhibitor molecules. These titration experiments provided additional evidence for the suggested working model of aspartyl proteases and confirmed that HIV-1 PR is a member of this class of enzymes (26). The two aspartyl side-chain carboxyl groups (one from each subunit) act as general base and acid, respectively, thereby leading to the breakdown of the enzyme-substrate intermediate. The work on HIV protease demonstrates how chemical protein synthesis allowed isotope labeling of a 22-kDa protein with atomic precision and provided further insights into the chemical basis of the proteolytic cleavage reaction. Isotope labeling with atomic precision has since then been used to reveal structural features of other either chemically synthesized or semisynthetic proteins (27–29).
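A pH titration of a 13C chemical shift, as described for the labeled Asp25 carboxyl group, is commonly analyzed with a population-weighted average of the protonated and deprotonated states. The Python sketch below shows this relation for a single titrating site in fast exchange. It is a generic model and not the analysis used in the cited work; the pKa and the limiting chemical shifts are hypothetical numbers chosen only to make the example run.

```python
def observed_shift(ph: float, pka: float,
                   delta_protonated: float,
                   delta_deprotonated: float) -> float:
    """Observed chemical shift (ppm) for one titrating group in fast exchange.

    The deprotonated fraction follows the Henderson-Hasselbalch equation,
    and the observed shift is the population-weighted average of the two
    limiting shifts.
    """
    frac_deprotonated = 1.0 / (1.0 + 10.0 ** (pka - ph))
    return delta_protonated + (delta_deprotonated - delta_protonated) * frac_deprotonated


# Hypothetical parameters, for illustration only.
PKA = 5.0
DELTA_HA = 176.0  # ppm, protonated carboxyl
DELTA_A = 180.0   # ppm, deprotonated carboxylate

for ph in (2.0, 4.0, 5.0, 6.0, 8.0):
    print(f"pH {ph:.1f}: {observed_shift(ph, PKA, DELTA_HA, DELTA_A):.2f} ppm")
```

A carboxyl group that shows no such sigmoidal transition over the accessible pH range is either shifted far in pKa or held in one protonation state, which is the kind of information such titrations provide about the two catalytic aspartates.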


A Mirror Image HIV-1 PR


A characteristic feature of many biomolecules is their chirality and the stereochemical specificity that is conferred to proteins and especially enzymes by being constructed from monomers with uniform stereochemical configuration. This fact has inspired chemists and biochemists to generate mirror images of proteins (as well as other biomolecules) to test the properties of proteins made up of D- rather than the naturally occurring L-amino acids with regard to their biophysical behavior, enzymatic activity, and specificity. Currently, it is still not possible to modify ribosomal protein synthesis so that all-D-polypeptides can be produced, and this would in fact be a daunting undertaking. However, chemical protein synthesis and its ability to link peptides produced by solid-phase peptide synthesis via chemoselective reactions to form medium-sized proteins allows the synthesis of peptides from D-amino acids. Milton et al. demonstrated this capability of chemical protein synthesis by producing a mirror image HIV-1 PR using their already described thioester-forming ligation approach (30). When compared with the L-form of this enzyme (also produced by chemical protein synthesis), both proteins exhibited full catalytic activity but inverse chiral specificity, meaning that the D-form only cleaves D-substrates and the L-form only L-substrates. A crystal structure of the D-HIV-1 PR revealed that it was the mirror image of the L-form, and in the presence of a substrate-based D-inhibitor (D-MVT101), all major interactions between enzyme and substrate were clearly visible (31). In addition, all secondary structure elements clearly exhibited mirrored relationships such as the inverse handedness of alpha-helices and twists of anti-parallel beta-sheets (6). The synthesis of D-HIV-1 PR impressively demonstrates the basic determinants of protein structure and emphasizes the freedom and power of chemical protein synthesis.

So far only a few D-proteins have been prepared, but potential applications are mirror image-based screenings where one screens a large library of L-peptides (generated by phage display) against a D-protein for high affinity binders (32). Any hits out of such a screen could be translated into D-peptides that would bind to naturally occurring L-proteins and possess highly interesting properties such as high stability against proteolytic cleavage and possibly low immunogenicity.
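The mirror relationship between L- and D-forms can be made concrete with a few lines of geometry. The Python sketch below computes the signed volume spanned by three substituents around a central atom and shows that reflecting all coordinates through a plane inverts its sign, which is the numerical signature of inverted handedness. The coordinates are arbitrary hypothetical points and are not taken from any protein structure.

```python
def signed_volume(center, a, b, c):
    """Triple product (a - center) . ((b - center) x (c - center)).

    The sign encodes the handedness of the arrangement a, b, c around center.
    """
    u = [a[i] - center[i] for i in range(3)]
    v = [b[i] - center[i] for i in range(3)]
    w = [c[i] - center[i] for i in range(3)]
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return u[0] * cross[0] + u[1] * cross[1] + u[2] * cross[2]


def mirror(point):
    """Reflect a point through the yz-plane (x -> -x)."""
    x, y, z = point
    return (-x, y, z)


# Hypothetical stereocenter with three substituent positions.
center = (0.0, 0.0, 0.0)
subs = [(1.0, 0.2, 0.0), (-0.3, 1.0, 0.1), (0.0, -0.2, 1.0)]

vol_original = signed_volume(center, *subs)
vol_mirrored = signed_volume(mirror(center), *[mirror(p) for p in subs])
print(vol_original, vol_mirrored)
```

Distances and angles are unchanged by the reflection, so the mirrored molecule has the same internal geometry and differs only in handedness, in line with the identical activity but inverse chiral specificity reported for the D-enzyme.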

Semisynthetic Proteins of the Ras-Superfamily


Although the total synthesis of a protein allows complete control over the structure, including posttranslational modifications and introduction of labels at desired sites in the sequence, it is still a major undertaking for which most laboratories whose main interest is in the biology of their target proteins are not equipped. In certain cases, for example when the site of introduction of a specific chemical modification is near the C-terminus, a combination of molecular biological and chemical methods has proved to be very powerful. With the Ras-family of guanine nucleotide binding proteins, where the C-terminus plays a critical role in localization to specific membranes, two approaches have been used to solve the problem of generating a C-terminus that is either naturally or unnaturally modified. In one of these, C-terminal peptides have been linked by a chemical method leading to an unnatural link. The chemistry used was based on the reaction of a truncated protein carrying a C-terminal cysteine with peptides carrying an N-terminal maleimidocaproyl group (33–37). In this manner, Ras derivatives containing C-terminal lipids (farnesyl in the case of K-Ras and farnesyl and palmitoyl in the case of H- and N-Ras) could be prepared as well as those containing fluorescent or reactive groups. The most important result to emerge from these studies concerns the reversible modification of a cysteine residue by palmitate (38). Ras proteins seem to display weak and nonspecific general interactions with membranes via their farnesyl group (or a polybasic domain in K-Ras), but they are palmitoylated on Golgi membranes, leading to their capture there. From there, they can be shuttled or reshuttled to their location on the plasma membrane by vesicular transport. This specific localization at the Golgi and plasma membranes did not occur when the palmitoyl group was replaced by a stable hexadecyl thioether, thus demonstrating the importance of a cycle of acylation and deacylation in the mode of action of these proteins.

In the example described, which uses chemistry to create an unnatural linkage between the C-terminal region of Ras and the rest of the protein, there was no apparent detrimental effect of this departure from the natural peptide backbone, as shown by various tests of biological activity. This is presumably because the most important function of the region in the experiments discussed is to provide a flexible linkage to the lipidated terminal residues. In other cases, there is a reason to believe that such a modification might be less well tolerated.
In the case of the Rab proteins, which are members of the Ras-family involved in the regulation of vesicular transport, it is clear that the exact structure (sequence) of the hypervariable C-terminus is of critical importance for directing the individual members of the family of over 60 Rab proteins to distinct membrane targets. For this reason, and because one question to be investigated involved structural studies on complexes between Rab proteins and their partners, a method for producing posttranslationally modified Rab proteins with a natural polypeptide backbone throughout the whole protein was needed. This was achieved using the technique of expressed protein ligation (EPL), a procedure introduced by Muir et al. (39–41). The procedure has been used in the Rab field for the construction of a number of C-terminally modified proteins which have been used in biochemical, biophysical and cell biological studies (42–46). In a specific case, as shown in Fig. 3a, a yeast Rab protein, Ypt1, was expressed in C-terminally truncated form in E. coli as a fusion protein with an intein domain and a chitin-binding domain (46). This construct could be purified by affinity chromatography on chitin-agarose. The C-terminal thioester of the truncated Ypt1 was cleaved from this support using a thiol reagent, a procedure that emulates the attack of a serine or cysteine residue in the C-extein, which is normally present in natural intein proteins (47). This thioester could be used for an in vitro ligation reaction with monogeranylgeranylated di-cysteine to generate the C-terminus in monolipidated form. As both the prenylated peptide and the reaction product (prenylated Ypt1) are insoluble in an aqueous environment, the ligation reaction was performed in detergent solution. Using the expressed protein ligation approach, both singly and doubly prenylated Ypt1 molecules could be produced. The complexes of these proteins with their solubilizing protein, GDI (GDP-dissociation inhibitor), could be crystallized, and their three-dimensional structures were determined (46–48). This revealed for the first time the nature of the lipid interaction with a binding site in an unexpected part of the GDI molecule (Fig. 3b). In the previously determined structure of GDI without a bound Rab molecule, this binding site was not detected, because a movement of one of the α-helices of the lower domain of GDI has to occur to create space for lipid binding, and this seems only to occur when the lipid residues, or possibly the whole prenylated Rab molecule, binds. The position of binding was essentially the same for single or double geranylgeranyl groups, and Fig. 3b shows only the physiologically more relevant doubly prenylated structure (most Rab molecules are doubly prenylated).

Figure 3 (a) Preparation of prenylated Ypt1 (a yeast Rab-protein) by expressed protein ligation. A C-terminal thioester of the truncated Rab protein was allowed to react with a doubly geranylgeranylated tricysteine peptide, leading to transthioesterification and an S→N acyl shift to generate a native peptide bond. (b) Interaction of the C-terminus of semisynthetic doubly geranylgeranylated Ypt1 with the lower domain of yeast GDI. GDI is shown in green as a ribbon structure, the C-terminus of Ypt1 in magenta, and the geranylgeranyl groups in red and blue CPK representation. Several residues of the C-terminus of Ypt1 were not visible in the electron density map, so that the connection to the prenyl groups is not observed directly. One prenyl group (in red) is buried deeply into the hydrophobic core of GDI, whereas the other (in blue) is more superficially bound and shows interaction with the other prenyl group. The lipid binding site is generated by an opening movement of two α-helices.

The structural determination of the complex between GDI and prenylated Rab molecules has provided considerable information on the mechanism of action of GDI in the recycling of Rab proteins between target and donor membranes (48, 49). It also sheds light on the molecular basis of a form of X-linked non-syndromic mental retardation, in which there is an L92P mutation in GDI, which is highly expressed in brain, and which results in a reduced ability to extract Rabs from membranes. It was previously thought that this residue would be in the lipid binding site, but the structure depicted in Fig. 3b shows that the corresponding residue in yeast GDI (I100) is not in the lipid binding site but makes an important hydrophobic interaction with a conserved hydrophobic motif in the Rab C-terminal hypervariable domain.

The same technology was used to create Rab proteins bearing a variety of fluorescent groups at the C-terminus. This approach allowed introduction of such reporter groups near to the reactive SH groups, which are the site of prenylation, while leaving these groups free for the prenylation reaction, a process that results in large fluorescence signal changes in certain cases. Experiments on the prenylation of such selectively modified Rab proteins allowed insights into the molecular basis of another hereditary disease, namely X-linked degradation of chorioretinal cells in choroideremia, a disease caused by underprenylation of certain Rab proteins (50).

Split-Inteins for Protein Semisynthesis in vitro and in vivo


The technique of expressed protein ligation has been exploited extensively during the last couple of years to produce semisynthetic proteins with tailor-made properties (4, 39). Examples are described above, and the method has been reviewed in detail recently (41, 51–53). The discovery that naturally occurring inteins, protein splicing domains that can excise themselves from a given polypeptide and join the flanking domains via a peptide bond, can be split into two pieces that possess the ability to spontaneously associate and form a functional intein has further extended the utility of intein technology (54–56). In particular, two split inteins (the DnaE and DnaB inteins) that do not require a denaturation and renaturation step to become fully functional are highly useful for the semisynthesis of specifically modified proteins in vitro and in vivo (57, 58).

The DnaE Intein


The DnaE intein from Synechocystis sp. is a naturally occurring split intein and consists of a longer N-terminal segment (123 amino acids) that can be C-terminally fused to almost any given protein sequence and expressed. The C-terminal segment consists of only 36 aa and is easily accessible by chemical synthesis and therefore allows the addition of specifically modified peptides to its C-terminus that are, upon trans-splicing, transferred onto the N-terminal protein that was expressed as a fusion protein with the N-terminal intein segment. This split intein system has enabled the first semisynthesis of a GFP–FLAG fusion protein in vivo (59). To achieve this goal, the N-terminal DnaE segment was fused to GFP and expressed in CHO cells. These cells were complemented with a chemically synthesized C-terminal part of the intein together with a FLAG tag and a protein transduction domain (PTD) for efficient uptake into the cells. The GFP–FLAG fusion protein that was generated upon successful trans-splicing was unambiguously identified by GFP- and FLAG-specific antibodies. Such a system allows the in vivo incorporation of biophysical probes, as long as the chemically synthesized part can be brought into the cells of interest. Detailed insights into the mechanism of the trans-splicing reaction of the DnaE intein were provided by crystal structures of this protein after excision and of a splicing-deficient precursor protein (60).

Further applications of the DnaE split intein include the development of a tandem trans-splicing system that is based on a combination of the DnaE split intein and the engineered, inducible VMA split intein (61). Such a system allows the segmental labeling of proteins with specific isotopes [as demonstrated by Otomo et al. with the artificial PI-Pfu I and PI-Pfu II split inteins (62)] and fluorophores. The DnaE split intein was also used by Camarero et al. to achieve the site-specific, oriented immobilization of proteins such as maltose binding protein (MBP) and enhanced green fluorescent protein (EGFP) onto glass surfaces (63). A covalent bond to the glass surface was established by thioether formation between a maleimide group on the surface and a thiol-bearing PEG linker that also carried four amino acids, including a cysteine residue, which could act as a nucleophile in trans-splicing reactions, and the C-terminal segment of the DnaE intein (36 aa). Upon addition of an MBP- or EGFP-N-intein fusion construct that was produced by either recombinant or cell-free expression, the intein halves associated and trans-splicing occurred, leading to the immobilization of MBP or EGFP on the surface. The associated DnaE intein halves were washed away, and the proteins remained, covalently bound via a PEG spacer, on the surface. The advantage of this approach is that no purification of the expressed proteins is necessary because only intein fusion constructs undergo the highly specific immobilization reaction. Furthermore, only low concentrations are needed to achieve an efficient trans-splicing reaction [the dissociation constant of the DnaE split intein halves is 43 nM, and trans-splicing occurs at a rate of ca. 7 × 10⁻⁵ s⁻¹ (61, 64)], which constitutes an advantage over immobilization techniques that rely on chemoselective reactions and strongly depend on reactant concentrations (65–68). Thus, this approach points to a new route to produce protein chips without the need for large amounts of purified protein.
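The two numbers quoted above, a dissociation constant of 43 nM and a splicing rate of ca. 7 × 10⁻⁵ s⁻¹, allow a rough estimate of how much split-intein complex forms at a given concentration and how long splicing takes. The Python sketch below assumes simple 1:1 binding at equilibrium followed by a first-order splicing step; both assumptions, the function names, and the example concentrations are illustrative and are not taken from the cited studies.

```python
import math

KD_NM = 43.0            # dissociation constant of the DnaE halves (from the text)
K_SPLICE_PER_S = 7e-5   # trans-splicing rate constant (from the text)


def complex_concentration(total_a_nm: float, total_b_nm: float,
                          kd_nm: float = KD_NM) -> float:
    """Equilibrium concentration (nM) of the 1:1 complex A.B.

    Solves (A_tot - x)(B_tot - x) = Kd * x for the physically meaningful root.
    """
    s = total_a_nm + total_b_nm + kd_nm
    return (s - math.sqrt(s * s - 4.0 * total_a_nm * total_b_nm)) / 2.0


def fraction_spliced(time_s: float, rate_per_s: float = K_SPLICE_PER_S) -> float:
    """Fraction of complex converted to product after `time_s` (first order)."""
    return 1.0 - math.exp(-rate_per_s * time_s)


half_life_s = math.log(2.0) / K_SPLICE_PER_S
print(f"half-life of the splicing step: {half_life_s / 3600:.2f} h")
for conc_nm in (10.0, 100.0, 1000.0):  # hypothetical equimolar concentrations
    bound = complex_concentration(conc_nm, conc_nm) / conc_nm
    print(f"{conc_nm:7.0f} nM each half: {bound:.0%} in complex")
```

In this simple model most of the material is already in complex at low micromolar concentrations, so the slow first-order splicing step, with a half-life of a few hours, rather than association limits the overall reaction.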

The DnaB Intein


In its native form, the DnaB intein from Synechocystis spp. consists of 429 amino acids, including a homing endonuclease domain. The removal of 275 amino acids leads to

WILEY ENCYCLOPEDIA OF CHEMICAL BIOLOGY 2008, John Wiley & Sons, Inc.


Figure 4 Mechanism of trans-protein splicing. (a) Initial association of the intein halves to form a functional intein. (b) Activation of the N-terminal splice junction via an N→S acyl shift. (c) Formation of a branched intermediate upon transthioesterification. (d) Branch resolution and intein release by succinimide formation. Spontaneous S→N acyl rearrangement yields the processed product with a native peptide backbone.

a functional mini intein (154 aa) that can also be split into two halves that undergo trans-splicing when co-expressed in E. coli (57). To test whether trans-splicing also occurs in vitro, Mootz et al. expressed a fusion protein consisting of MBP and the N-terminal half of the DnaB intein (104 aa) and a fusion construct of the C-terminal half (47 aa) and a hexa-histidine tag (69). Upon mixing in stoichiometric amounts, successful trans-splicing produced the MBP-His-tag fusion protein. This constituted the first case of an artificial split intein that spontaneously assembled to form the active intein and underwent trans-splicing without the need for a denaturation-renaturation step. The only other artificial split intein reported previously that does not require such a denaturation-renaturation step was the VMA intein from Saccharomyces cerevisiae. However, the N- and C-terminal segments of this intein do not assemble spontaneously to form a functional intein. They require a dimerization domain that brings both halves in close proximity to each other, which induces trans-splicing (70-72). This renders the DnaB split intein highly interesting for protein engineering approaches, and in combination with the DnaE split intein or with an inducible split intein such as the VMA intein, it provides a valuable tool to combine three protein segments with each other by two concomitant or subsequent trans-splicing reactions. An additional advantage of the DnaB split intein is the occurrence of a serine residue, instead of a cysteine residue, as the C-terminal nucleophile for the splicing reaction. Cysteine residues might not be desirable in some cases because they can interfere with folding or labeling of the newly generated protein. Nevertheless, a cysteine can replace the serine as a nucleophile at this position, as demonstrated by the fact that the DnaB intein has been used to generate protein segments with N-terminal


cysteine residues. This was achieved by expressing the desired DnaB intein as a fusion construct with the target protein in inclusion bodies and by taking advantage of the pH sensitivity of the DnaB intein to prevent premature cleavage during work-up (73). To extend the utility of the DnaB split intein, Liu et al. have tested 13 different sites to split this intein into two segments of different length (58). Until this series of experiments, all known artificial split inteins had been split at the endonuclease domain. Out of 13 tested sites, 3 gave functional split inteins that would undergo trans-splicing, including 1 whose N-terminal half consisted of only 11 amino acids. Such a short N-terminal split intein half is accessible by chemical synthesis, and the introduction of chemically modified peptides at the N-terminus via trans-splicing was recently demonstrated (74). Such a system nicely complements the already established C-terminal modification approach via the DnaE split intein.

Prospects
The work reviewed here illustrates that, in the century since Fischer formulated his vision that the synthesis of proteins should be achievable using the methods of organic chemistry (1), this prediction has been largely fulfilled. What he could not possibly have predicted was the role that molecular biological techniques would play in combination with chemical methods, although he was realistic enough to imply that chemistry would not be the method of choice if biotechnological methods were available. Future developments in the area of synthetic and semisynthetic proteins are likely to include the extension of ligation methods to amino acids other than cysteine and the increased use of strategies for generating proteins with precisely engineered properties in cells. One such approach is conditional splicing, a technique in which a specific protein activity is generated intracellularly by exposure to a small membrane-permeable molecule (70-72).

References
1. Fischer E. Proteins and Polypeptides. Angew. Chem. 1907; 20: 913-917.
2. Merrifield RB. Solid phase peptide synthesis. 1. Synthesis of Tetrapeptide. J. Am. Chem. Soc. 1963; 85: 2149-2154.
3. Nambiar KP, Stackhouse J, Stauffer DM, Kennedy WP, Eldredge JK, Benner SA. Total synthesis and cloning of a gene coding for the ribonuclease S protein. Science 1984; 223(4642): 1299-1301.
4. Dawson PE, Muir TW, Clark-Lewis I, Kent SBH. Synthesis of proteins by native chemical ligation. Science 1994; 266: 776-779.
5. Wieland T, Bokelmann E, Bauer L, Lang HU, Lau H. Über Peptidsynthesen. 8. Bildung von S-haltigen Peptiden durch intramolekulare Wanderung von Aminoacylresten. Ann. Chem. Justus Liebig 1953; 583(2): 129-149.
6. Kent S. Total chemical synthesis of enzymes. J. Pept. Sci. 2003; 9(9): 574-593.
7. Dawson PE, Kent SBH. Synthesis of native proteins by chemical ligation. Annu. Rev. Biochem. 2000; 69: 923-960.
8. Muir TW, Dawson PE, Kent SB. Protein synthesis by chemical ligation of unprotected peptides in aqueous solution. Meth. Enzymol. 1997; 289: 266-298.
9. Goody RS, Alexandrov K, Engelhard M. Combining chemical and biological techniques to produce modified proteins. ChemBioChem 2002; 3(5): 399-403.
10. Kochendoerfer GG, Chen SY, Mao F, Cressman S, Traviglia S, Shao H, et al. Design and chemical synthesis of a homogeneous polymer-modified erythropoiesis protein. Science 2003; 299(5608): 884-887.
11. Becker CFW, Hunter CL, Seidel R, Kent SBH, Goody RS, Engelhard M. Total chemical synthesis of a functional interacting protein pair: the protooncogene H-Ras and the Ras-binding domain of its effector c-Raf1. Proc. Natl. Acad. Sci. U.S.A. 2003; 100(9): 5075-5080.
12. Kohl NE, Emini EA, Schleif WA, Davis LJ, Heimbach JC, Dixon RA, et al. Active human immunodeficiency virus protease is required for viral infectivity. Proc. Natl. Acad. Sci. U.S.A. 1988; 85(13): 4686-4690.
13. Schneider J, Kent SB. Enzymatic activity of a synthetic 99 residue protein corresponding to the putative HIV-1 protease. Cell 1988; 54(3): 363-368.
14. Kent SB. Chemical synthesis of peptides and proteins. Annu. Rev. Biochem. 1988; 57: 957-989.
15. Graves MC, Lim JJ, Heimer EP, Kramer RA. An 11-kDa form of human immunodeficiency virus protease expressed in Escherichia coli is sufficient for enzymatic activity. Proc. Natl. Acad. Sci. U.S.A. 1988; 85(8): 2449-2453.
16. Wlodawer A, Miller M, Jaskólski M, Sathyanarayana BK, Baldwin E, Weber IT, et al. Conserved folding in retroviral proteases: crystal structure of a synthetic HIV-1 protease. Science 1989; 245: 616-621.
17. Wlodawer A, Vondrasek J. Inhibitors of HIV-1 protease: a major success of structure-assisted drug design. Annu. Rev. Biophys. Biomol. Struct. 1998; 27: 249-284.
18. Miller M, Schneider J, Sathyanarayana BK, Toth MV, Marshall GR, Clawson L, et al. Structure of complex of synthetic HIV-1 protease with a substrate-based inhibitor at 2.3 Å resolution. Science 1989; 246(4934): 1149-1152.
19. Swain AL, Miller MM, Green J, Rich DH, Schneider J, Kent SB, et al. X-ray crystallographic structure of a complex between a synthetic protease of human immunodeficiency virus 1 and a substrate-based hydroxyethylamine inhibitor. Proc. Natl. Acad. Sci. U.S.A. 1990; 87(22): 8805-8809.
20. Jaskolski M, Tomasselli AG, Sawyer TK, Staples DG, Heinrikson RL, Schneider J, et al. Structure at 2.5-Å resolution of chemically synthesized human immunodeficiency virus type 1 protease complexed with a hydroxyethylene-based inhibitor. Biochemistry 1991; 30(6): 1600-1609.
21. Schnölzer M, Kent SBH. Constructing proteins by dovetailing unprotected synthetic peptides: backbone-engineered HIV protease. Science 1992; 256: 221-225.
22. Baca M, Kent SBH. Catalytic contribution of flap-substrate hydrogen bonds in HIV-1 protease explored by chemical synthesis. Proc. Natl. Acad. Sci. U.S.A. 1993; 90: 11638-11642.
23. Baca M, Kent SBH. Protein backbone engineering through total chemical synthesis: new insight into the mechanism of HIV-1 protease catalysis. Tetrahedron 2000; 56(48): 9503-9513.
24. Baca M, Muir TW, Schnölzer M, Kent SBH. Chemical ligation of cysteine-containing peptides: synthesis of a 22 kDa tethered dimer of HIV-1 protease. J. Am. Chem. Soc. 1995; 117: 1881-1887.
25. Smith R, Brereton IM, Chai RY, Kent SB. Ionization states of the catalytic residues in HIV-1 protease. Nat. Struct. Biol. 1996; 3(11): 946-950.
26. Suguna K, Padlan EA, Smith CW, Carlson WD, Davies DR. Binding of a reduced peptide inhibitor to the aspartic proteinase from Rhizopus chinensis: implications for a mechanism of action. Proc. Natl. Acad. Sci. U.S.A. 1987; 84(20): 7009-7013.
27. Kochendoerfer GG, Jones DH, Lee S, Oblatt-Montal M, Opella SJ, Montal M. Functional characterization and NMR spectroscopy on full-length Vpu from HIV-1 prepared by total chemical synthesis. J. Am. Chem. Soc. 2004; 126(8): 2439-2446.
28. Romanelli A, Shekhtman A, Cowburn D, Muir TW. Semisynthesis of a segmental isotopically labeled protein splicing precursor: NMR evidence for an unusual peptide bond at the N-extein-intein junction. Proc. Natl. Acad. Sci. U.S.A. 2004; 101(17): 6397-6402.
29. Cowburn D, Muir TW. Segmental isotopic labeling using expressed protein ligation. Meth. Enzymol. 2001; 339: 41-54.
30. Milton RCL, Milton SCF, Kent SBH. Total chemical synthesis of a D-enzyme: the enantiomers of HIV-1 protease show demonstration of reciprocal chiral substrate specificity. Science 1992; 256: 1445-1448.
31. Miller M, Baca M, Rao JKM, Kent SBH. Probing the structural basis of the catalytic activity of HIV-1 PR through total chemical protein synthesis. J. Mol. Struct.-Theochem 1998; 423(1-2): 137-152.
32. Schumacher TN, Mayr LM, Minor DL Jr, Milhollen MA, Burgess MW, Kim PS. Identification of D-peptide ligands through mirror-image phage display. Science 1996; 271(5257): 1854-1857.
33. Kuhn K, Owen DJ, Bader B, Wittinghofer A, Kuhlmann J, Waldmann H. Synthesis of functional Ras lipoproteins and fluorescent derivatives. J. Am. Chem. Soc. 2001; 123(6): 1023-1035.
34. Bader B, Kuhn K, Owen DJ, Waldmann H, Wittinghofer A, Kuhlmann J. Bioorganic synthesis of lipid-modified proteins for the study of signal transduction. Nature 2000; 403(6766): 223-226.
35. Reents R, Wagner M, Schlummer S, Kuhlmann J, Waldmann H. Synthesis and application of fluorescent ras proteins for live-cell imaging. ChemBioChem 2005; 6(1): 86-94.
36. Kuhlmann J, Tebbe A, Volkert M, Wagner M, Uwai K, Waldmann H. Photoactivatable synthetic Ras proteins: baits for the identification of plasma-membrane-bound binding partners of Ras. Angew. Chem. Int. Ed. Engl. 2002; 41(14): 2546-2550.
37. Volkert M, Uwai K, Tebbe A, Popkirova B, Wagner M, Kuhlmann J, et al. Synthesis and biological activity of photoactivatable N-ras peptides and proteins. J. Am. Chem. Soc. 2003; 125(42): 12749-12758.
38. Rocks O, Peyker A, Kahms M, Verveer PJ, Koerner C, Lumbierres M, et al. An acylation cycle regulates localization and activity of palmitoylated Ras isoforms. Science 2005; 307(5716): 1746-1752.
39. Muir TW, Sondhi D, Cole PA. Expressed protein ligation: A general method for protein engineering. Proc. Natl. Acad. Sci. U.S.A. 1998; 95(12): 6705-6710.
40. Severinov K, Muir TW. Expressed protein ligation, a novel method for studying protein-protein interactions in transcription. J. Biol. Chem. 1998; 273(26): 16205-16209.
41. Muir TW. Semisynthesis of proteins by expressed protein ligation [Review]. Annu. Rev. Biochem. 2003; 72: 249-289.
42. Iakovenko A, Rostkova E, Merzlyak E, Hillebrand AM, Thoma NH, Goody RS, et al. Semi-synthetic Rab proteins as tools for studying intermolecular interactions. FEBS Lett. 2000; 468(2-3): 155-158.
43. Alexandrov K, Heinemann I, Durek T, Sidorovitch V, Goody RS, Waldmann H. Intein-mediated synthesis of geranylgeranylated Rab7 protein in vitro. J. Am. Chem. Soc. 2002; 124(20): 5648-5649.
44. Durek T, Alexandrov K, Goody RS, Hildebrand A, Heinemann I, Waldmann H. Synthesis of fluorescently labeled mono- and diprenylated Rab7 GTPase. J. Am. Chem. Soc. 2004; 126(50): 16368-16378.
45. Brunsveld L, Watzke A, Durek T, Alexandrov K, Goody RS, Waldmann H. Synthesis of functionalized rab GTPases by a combination of solution- or solid-phase lipopeptide synthesis with expressed protein ligation. Chemistry 2005; 11(9): 2756-2772.
46. Rak A, Pylypenko O, Durek T, Watzke A, Kushnir S, Brunsveld L, et al. Structure of Rab GDP-dissociation inhibitor in complex with prenylated YPT1 GTPase. Science 2003; 302(5645): 646-650.
47. Paulus H. Protein splicing and related forms of protein autoprocessing. Annu. Rev. Biochem. 2000; 69(1): 447-496.
48. Pylypenko O, Rak A, Durek T, Kushnir S, Dursina BE, Thomae NH, et al. Structure of doubly prenylated Ypt1:GDI complex and the mechanism of GDI-mediated Rab recycling. EMBO J. 2006; 25(1): 13-23.
49. Goody RS, Rak A, Alexandrov K. The structural and mechanistic basis for recycling of Rab proteins between membrane compartments. Cell. Mol. Life Sci. 2005; 62(15): 1657-1670.
50. Rak A, Pylypenko O, Niculae A, Pyatkov K, Goody RS, Alexandrov K. Structure of the Rab7:REP-1 complex: insights into the mechanism of Rab prenylation and choroideremia disease. Cell 2004; 117(6): 749-760.
51. David R, Richter MP, Beck-Sickinger AG. Expressed protein ligation. Method and applications. Eur. J. Biochem. 2004; 271(4): 663-677.
52. Durek T, Becker CF. Protein semi-synthesis: new proteins for functional and structural studies. Biomol. Eng. 2005.
53. Muralidharan V, Muir TW. Protein ligation: an enabling technology for the biophysical analysis of proteins. Nat. Methods 2006; 3(6): 429-438.
54. Southworth MW, Adam E, Panne D, Byer R, Kautz R, Perler FB. Control of protein splicing by intein fragment reassembly. EMBO J. 1998; 17(4): 918-926.
55. Mills KV, Lew BM, Jiang S, Paulus H. Protein splicing in trans by purified N- and C-terminal fragments of the Mycobacterium tuberculosis RecA intein. Proc. Natl. Acad. Sci. U.S.A. 1998; 95(7): 3543-3548.
56. Shingledecker K, Jiang SQ, Paulus H. Molecular dissection of the Mycobacterium tuberculosis RecA intein: design of a minimal intein and of a trans-splicing system involving two intein fragments. Gene 1998; 207(2): 187-195.
57. Wu H, Hu ZM, Liu XQ. Protein trans-splicing by a split intein encoded in a split DnaE gene of Synechocystis sp. PCC6803. Proc. Natl. Acad. Sci. U.S.A. 1998; 95(16): 9226-9231.
58. Sun W, Yang J, Liu XQ. Synthetic two-piece and three-piece split inteins for protein trans-splicing. J. Biol. Chem. 2004; 279(34): 35281-35286.
59. Giriat I, Muir TW. Protein semi-synthesis in living cells. J. Am. Chem. Soc. 2003; 125(24): 7180-7181.
60. Sun P, Ye S, Ferrandon S, Evans TC, Xu MQ, Rao Z. Crystal structures of an intein from the split dnaE gene of Synechocystis sp. PCC6803 reveal the catalytic model without the penultimate histidine and the mechanism of zinc ion inhibition of protein splicing. J. Mol. Biol. 2005; 353(5): 1093-1105.
61. Shi J, Muir TW. Development of a tandem protein trans-splicing system based on native and engineered split inteins. J. Am. Chem. Soc. 2005; 127(17): 6198-6206.
62. Otomo T, Ito N, Kyogoku Y, Yamazaki T. NMR observation of selected segments in a larger protein: central-segment isotope labeling through intein-mediated ligation. Biochemistry 1999; 38(49): 16040-16044.
63. Kwon Y, Coleman MA, Camarero JA. Selective immobilization of proteins onto solid supports through split-intein-mediated protein trans-splicing. Angew. Chem. Int. Ed. Engl. 2006; 45(11): 1726-1729.
64. Martin DD, Xu MQ, Evans TC Jr. Characterization of a naturally occurring trans-splicing intein from Synechocystis sp. PCC6803. Biochemistry 2001; 40(5): 1393-1402.
65. Soellner MB, Dickson KA, Nilsson BL, Raines RT. Site-specific protein immobilization by Staudinger ligation. J. Am. Chem. Soc. 2003; 125(39): 11790-11791.
66. Watzke A, Kohn M, Gutierrez-Rodriguez M, Wacker R, Schroder H, Breinbauer R, et al. Site-selective protein immobilization by Staudinger ligation. Angew. Chem. Int. Ed. Engl. 2006; 45(9): 1408-1412.
67. de Araujo AD, Palomo JM, Cramer J, Kohn M, Schroder H, Wacker R, et al. Diels-Alder ligation and surface immobilization of proteins. Angew. Chem. Int. Ed. Engl. 2005; 45(2): 296-301.
68. Camarero JA, Kwon Y, Coleman MA. Chemoselective attachment of biologically active proteins to surfaces by expressed protein ligation and its application for protein chip fabrication. J. Am. Chem. Soc. 2004; 126(45): 14730-14731.
69. Brenzel S, Kurpiers T, Mootz HD. Engineering artificially split inteins for applications in protein chemistry: biochemical characterization of the split Ssp DnaB intein and comparison to the split Sce VMA intein. Biochemistry 2006; 45(6): 1571-1578.
70. Mootz HD, Muir TW. Protein splicing triggered by a small molecule. J. Am. Chem. Soc. 2002; 124(31): 9044-9045.
71. Mootz HD, Blum ES, Tyszkiewicz AB, Muir TW. Conditional protein splicing: a new tool to control protein structure and function in vitro and in vivo. J. Am. Chem. Soc. 2003; 125(35): 10561-10569.
72. Mootz HD, Blum ES, Muir TW. Activation of an autoregulated protein kinase by conditional protein splicing. Angew. Chem. Int. Ed. Engl. 2004; 43(39): 5189-5192.
73. Hackenberger CP, Chen MM, Imperiali B. Expression of N-terminal Cys-protein fragments using an intein refolding strategy. Bioorg. Med. Chem. 2006; 14(14): 5043-5048.
74. Ludwig C, Pfeiff M, Linne U, Mootz HD. Ligation eines synthetischen Peptids an den N-Terminus eines rekombinanten Proteins durch semisynthetisches trans-Proteinspleißen (p NA). Angew. Chem. (Engl.) In press.

See Also
Chemistry and Chemical Reactivity of Proteins
Structure, Function and Stability of Proteins
Lipidated Peptide Synthesis
Synthesis of Natural and Unnatural Amino Acids
