
Production of Recombinant Proteins in Escherichia coli

UNIT 5.1

The genetics and biochemistry of Escherichia coli are probably the best understood of any known organism. The knowledge gained in the study of E. coli biology has been applied to the development of many of today's molecular cloning techniques. Most cloning vectors and methods utilize E. coli or its phages as a preferred host, primarily because of the ease with which the bacterium can be grown and genetically manipulated. These same characteristics made E. coli an attractive early choice as a host for the production of large quantities of protein encoded by cloned genes. Aside from its well-studied biology, E. coli is suitable as the basis of an expression system because of its rapid doubling time and its ability to grow in inexpensive media. Years of study devoted to gene expression in E. coli have provided numerous choices for transcriptional and translational control elements that can be applied to the expression of foreign genes (UNITS 5.2 & 5.3). As a result, E. coli has been and continues to be the expression system of choice, and a substantial body of literature has accumulated on the successful expression of foreign genes in this host. Several problems with protein expression in E. coli have been encountered, and many have ultimately been solved. This unit describes methods that have been developed for production of recombinant proteins in E. coli and potential pitfalls that may be encountered.

The basic approach used to express foreign genes in E. coli begins with insertion of the gene into an expression vector, usually a plasmid (obtained from a commercial supplier or from the author of a published study). The next step involves transforming a suitable E. coli host strain with the plasmid, for example, by electroporation (UNIT 5.2). Transformed cells are then evaluated for plasmid stability and for foreign protein expression following induction of the controllable transcriptional promoter present in the expression vector (UNIT 5.2). Once small-scale shaker flask experiments have identified successful expression systems (UNIT 5.2), the transformed E. coli strain can be used in large-scale fermentation systems (UNIT 5.3), which can be designed to produce recombinant proteins with altered amino acids (e.g., selenomethionine in place of methionine) or stable isotope tags to facilitate structural studies. Production is followed by protein purification (Chapter 6) and characterization (Chapter 7).

If the gene to be expressed in E. coli is of eukaryotic origin, it must be a cDNA copy, as E. coli will not recognize and splice out introns from transcripts of genomic copies of eukaryotic genes. Typical E. coli expression vectors contain the following elements.

Selectable Marker
Expression plasmids contain sequences encoding a selectable marker to ensure maintenance of the vector in the host cell. Commonly used selectable markers in E. coli include bla (which encodes β-lactamase and confers resistance to ampicillin and other β-lactam antibiotics), cat (which encodes chloramphenicol acetyltransferase and confers resistance to chloramphenicol), and tet (which encodes a membrane protein that confers resistance to tetracycline).

Origin of Replication
Replication of a plasmid as an independent extrachromosomal element is controlled by its origin of replication. Some plasmids are under so-called stringent control, wherein the plasmid's replication is coupled to replication of the host chromosome. As such, only one or at most a few copies of the plasmid are maintained in each cell. More commonly, E. coli vectors used for expression of cloned genes utilize relaxed control, such as the ColE1 replicon found in pBR322 and derivatives. These plasmids generally have copy numbers of 10 to 200 per cell. The presence of multiple copies of the gene and its associated control elements can provide advantages in mRNA production and, as a result, may be reflected in an increase in the total amount of protein that accumulates. Nevertheless, the reduced copy number of stringent replicons may sometimes be advantageous: it allows for modulation of gene dosage, which can be helpful for altering the kinetics of expression or reducing the basal transcriptional levels of the plasmid-encoded promoter.

Contributed by Edward R. LaVallie

Current Protocols in Protein Science (1995) 5.1.1-5.1.8 Copyright 2000 by John Wiley & Sons, Inc.


Transcriptional Promoter
An essential component of expression vectors is a controllable transcriptional promoter which, when induced, can direct the production of large amounts of mRNA from the cloned gene. A variety of controllable promoters are routinely used.

The lac promoter system utilizes the transcriptional control elements from the E. coli β-galactosidase gene (Miller and Reznikoff, 1978). The lac promoter (like the tac and trc promoters, which have optimized lac promoter RNA polymerase recognition sequences) is controlled by binding of lac repressor (the lacI gene product) to an operator sequence in the promoter region. The repressor can be expressed either from the host genome (single copy, in which case an overexpressing repressor allele called lacIq should be used) or from the expression vector (multiple copies, with resultant tighter transcriptional control). Induction of the promoter is generally accomplished by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG), a lactose analog that binds to the lac repressor and prevents its binding to the lac operator. An example of a commercially available lac promoter vector is pBluescript (Stratagene). The pKK223-3 vector from Pharmacia Biotech is a source of the tac promoter. The trc promoter is used in the vectors pSE280, pSE380, and pSE420 from Invitrogen.

Another commonly used promoter system utilizes the major leftward promoter of bacteriophage lambda, pL, and its control elements (Shimatake and Rosenberg, 1981). The pL promoter contained on an expression vector can be controlled by the phage-encoded cI repressor, which is typically expressed from an integrated copy of the phage in the host genome. A temperature-sensitive cI repressor (called cI857) is usually used; this allele encodes a repressor that is functional at lower temperatures but denatures at temperatures above 37.5°C. Thus, pL-mediated protein synthesis can be induced by a simple temperature shift. Alternatively, temperature-independent pL promoter systems have been developed that utilize a cI repressor gene under the control of a separate inducible promoter (Mieschendahl et al., 1986; LaVallie et al., 1993a). A commercial source for the pL vector is the pPL-Lambda vector from Pharmacia Biotech.

The T7 RNA polymerase promoter is another popular transcriptional control element for heterologous expression. The RNA polymerase from bacteriophage T7 is highly selective for specific T7 phage promoter sequences that are uncommon in other DNAs (Studier et al., 1990). A gene of interest placed under the control of a T7 promoter can be selectively expressed by induction of host-encoded T7 polymerase synthesis, which itself is under the control of an inducible promoter such as lac. Induction of T7 polymerase synthesis causes almost exclusive expression of the gene under the control of the T7 promoter. In some instances, T7 transcription can outcompete transcription by the host RNA polymerase, resulting in accumulation of large amounts of the gene product of interest. There are many commercially available expression vectors that utilize the T7 polymerase/promoter system, such as the pGEMEX vectors from Promega, the pRSET vectors from Invitrogen, and the pET vectors from Novagen.
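As a compact summary, the promoter systems described above can be collected in a small lookup structure. The following Python sketch is only an illustration: the class name `PromoterSystem`, the field names, and the helper `systems_induced_by` are hypothetical, and the entries restate what the text says rather than adding any vendor or protocol detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromoterSystem:
    """One controllable promoter system as described in the text (illustrative)."""
    name: str
    control: str            # what keeps the promoter off before induction
    induction: str          # how expression is switched on
    example_vectors: tuple  # vectors named in the text


PROMOTER_SYSTEMS = (
    PromoterSystem("lac", "lac repressor (lacI)", "IPTG", ("pBluescript",)),
    PromoterSystem("tac", "lac repressor (lacI)", "IPTG", ("pKK223-3",)),
    PromoterSystem("trc", "lac repressor (lacI)", "IPTG",
                   ("pSE280", "pSE380", "pSE420")),
    PromoterSystem("pL", "temperature-sensitive cI857 repressor",
                   "temperature shift", ("pPL-Lambda",)),
    PromoterSystem("T7", "inducible synthesis of T7 RNA polymerase",
                   "inducer of the promoter driving T7 RNA polymerase",
                   ("pGEMEX", "pRSET", "pET")),
)


def systems_induced_by(induction):
    """Return the names of promoter systems that share an induction method."""
    return [s.name for s in PROMOTER_SYSTEMS if s.induction == induction]


print(systems_induced_by("IPTG"))
print(systems_induced_by("temperature shift"))
```

The temperature-independent pL variants mentioned in the text are deliberately left out, since their inducer depends on the separate promoter that drives the cI gene.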

Translation Initiation Sequence

Initiation of translation on mRNAs requires the presence of a so-called Shine and Dalgarno sequence or ribosome binding site (RBS) in close proximity to an initiator methionine codon (Shine and Dalgarno, 1974). The RBS consists of a purine-rich stretch of nucleotides complementary to the 3′ end of 16S rRNA, located 5 to 13 bases 5′ to an initiator ATG. RBS elements typically used in expression vectors derive from well-translated E. coli or bacteriophage genes. For instance, the pTrcHis and pRSET vectors (Invitrogen) and the pGEMEX (Promega) vectors use the T7 gene 10 RBS.
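The spacing rule above (a purine-rich stretch located 5 to 13 bases upstream of the initiator ATG) can be checked on a DNA sequence with a short scan. This is a minimal sketch under stated assumptions: the function name, the six-base window, and the threshold of five purines are illustrative choices that do not come from this unit, and the spacing is interpreted here as the number of bases between the end of the stretch and the A of the ATG.

```python
PURINES = frozenset("AG")


def find_rbs_candidates(dna, atg_index, window=6, min_purines=5,
                        min_gap=5, max_gap=13):
    """List purine-rich windows positioned min_gap..max_gap bases 5' of an ATG.

    Returns tuples of (window start, window sequence, gap), where gap is the
    number of bases between the end of the window and the A of the ATG.
    The window size and purine threshold are illustrative assumptions.
    """
    dna = dna.upper()
    if dna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    hits = []
    for gap in range(min_gap, max_gap + 1):
        start = atg_index - gap - window
        if start < 0:
            break  # ran off the 5' end of the supplied sequence
        candidate = dna[start:start + window]
        if sum(base in PURINES for base in candidate) >= min_purines:
            hits.append((start, candidate, gap))
    return hits


# Toy sequence (not taken from any real vector): AGGAGG placed 7 bases 5' of ATG.
toy = "TTTTAAGGAGG" + "T" * 7 + "ATGAAA"
for start, seq, gap in find_rbs_candidates(toy, toy.index("ATG")):
    print(start, seq, gap)
```

A purine count is a crude stand-in for complementarity to the 16S rRNA 3′ end; a real analysis would score base pairing against that sequence directly.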

SPECIFIC EXPRESSION STRATEGIES

Direct Intracellular Expression

Direct expression refers to the fusing of the coding sequence of interest to transcriptional and translational control sequences on an expression vector, with an initiator methionine codon preceding the open reading frame. This approach can be used to produce cytoplasmic proteins, and it can also be used for the intracellular expression of normally secreted proteins. In the latter case, the DNA sequence encoding the signal peptide is replaced by the initiator methionine codon.

Success with the direct approach is often variable. First, translation initiation is inconsistent because sequences 3′ to the initiator methionine codon can influence the efficiency of ribosome binding (Looman et al., 1987; Bucheler et al., 1990). For reasons that are not fully understood, maximizing the A+T content of the 5′ end of the coding sequence (taking advantage of the degeneracy of the genetic code) can sometimes improve the efficiency of translation initiation (De Lamarter et al., 1985; Devlin et al., 1988). Second, recombinant proteins produced in the cytoplasm often form dense, insoluble aggregates of protein called inclusion bodies (Schein, 1989). In some ways, this can be viewed as an advantage because inclusion bodies are easily purified from the soluble proteins (UNIT 6.3). In addition, proteins in inclusion bodies are usually resistant to proteolytic degradation. In cases where activity of the expressed protein is unnecessary (e.g., in production of protein to be used as an antigen to produce antibodies), inclusion body formation and resultant insolubility of the protein may actually be preferred. If active protein is desired, however, recovery of properly folded protein from inclusion bodies requires the total denaturation of the protein using reagents such as urea or guanidine, followed by refolding using protocols that must be determined empirically for each protein (UNIT 6.5). The success of refolding protocols is variable, depending on the particular protein, and may yield little or no correctly folded material.

If protein insolubility is encountered and is undesired, production of soluble protein can sometimes be enhanced by simply lowering the growth temperature during protein synthesis (Bishai et al., 1987; Schein and Noteborn, 1988). If the protein is normally secreted, then a secretory construct may produce more soluble protein (see section on Secretion). Alternatively, production of the protein as a fusion to a highly soluble partner may alleviate the problem (see section on Fusion Proteins). If the gene is expressed poorly, translation efficiency may be a problem. If this is the case, it is often advantageous to maximize the A+T content of the 5′ coding sequence and/or replace the RBS with a sequence that initiates translation more efficiently (Olins and Rangwala, 1989).
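The idea of raising the A+T content of the 5′ coding sequence through the degeneracy of the genetic code can be sketched as a synonymous recoding step. The Python example below is illustrative only: the function names and the choice to recode the first six codons after the ATG are assumptions, the sequence is a toy, and the sketch ignores codon usage, so the A+T-richest synonym it picks may be a codon that E. coli uses rarely.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def at_count(codon):
    """Number of A or T bases in a codon."""
    return sum(base in "AT" for base in codon)


def translate(cds):
    """Translate an in-frame coding sequence with the standard code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def maximize_at(cds, n_codons=6):
    """Recode the first n_codons after the initiator codon for maximal A+T.

    A codon is replaced only when a synonymous codon has strictly more
    A/T bases, so the encoded protein is unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for i in range(1, min(n_codons + 1, len(codons))):
        amino_acid = CODON_TABLE[codons[i]]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        best = max(synonyms, key=at_count)
        if at_count(best) > at_count(codons[i]):
            codons[i] = best
    return "".join(codons)


original = "ATGGCCCTGAGCGGC"          # toy sequence encoding M-A-L-S-G
recoded = maximize_at(original)
print(recoded, translate(recoded) == translate(original))
```

In practice the recoded region would also need to be checked against the rare-codon concern raised in the next paragraph of the text.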
Another potential cause of poor translation is a preponderance of rare codons in the coding sequence, that is, codons that are underutilized by E. coli (Robinson et al., 1984). Stretches of contiguous rare codons may contribute to lower expression levels and should be changed to codons that are frequently used by E. coli. Finally, poor expression levels may be caused by product instability. In this instance, use of host strains and induction methods that minimize proteolytic degradation should be attempted (UNIT 5.2). In instances where the gene is expressed well and soluble protein can be produced, the protein must then be purified from an abundant and diverse mixture of other E. coli cytoplasmic proteins. This can be a difficult and time-consuming process, and the purification scheme will be different for each protein based on its physical and biochemical characteristics (Chapters 6 and 8).
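Stretches of contiguous rare codons, as discussed above, can be located with a simple scan over the reading frame. The sketch below is hedged in two ways: the function name is hypothetical, and the rare-codon set is an illustrative assumption (this unit does not list specific codons), so a codon usage table for the actual host strain should be substituted in real use.

```python
# Illustrative set only; not taken from this unit. Replace with codons that a
# codon usage table identifies as rare for the host strain in question.
ILLUSTRATIVE_RARE_CODONS = frozenset({"AGA", "AGG", "ATA", "CTA", "CGA", "CCC"})


def rare_codon_runs(cds, rare=ILLUSTRATIVE_RARE_CODONS, min_run=2):
    """Return (first codon index, run length) for runs of contiguous rare codons.

    Codon indices are zero-based, counted from the start of the reading frame.
    Trailing bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    runs = []
    run_start = None
    # The trailing None acts as a sentinel that closes a run at the 3' end.
    for index, codon in enumerate(codons + [None]):
        if codon in rare:
            if run_start is None:
                run_start = index
        else:
            if run_start is not None and index - run_start >= min_run:
                runs.append((run_start, index - run_start))
            run_start = None
    return runs


# Toy sequence: ATG AGA AGG AAA CTA GGC (one run of two rare codons, one singleton).
print(rare_codon_runs("ATGAGAAGGAAACTAGGC"))
```

Isolated rare codons are not reported with the default `min_run=2`, which mirrors the emphasis in the text on contiguous stretches.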

Secretion
Secretion of proteins in E. coli is mediated by the presence of an N-terminal signal sequence that is cleaved after translocation of the protein. Expression of cloned gene products as secreted proteins in E. coli has been utilized as an alternative to cytoplasmic expression for proteins that are normally secreted. In E. coli the protein is secreted to the periplasmic space between the cytoplasmic and outer membranes, in contrast to the extracellular secretion that occurs in gram-positive bacteria and eukaryotic cells. The result is that in E. coli the secreted protein remains cell-associated, although in a compartment separated from the cytoplasmic proteins that make up the vast majority of the total cellular protein. This can be advantageous in terms of protein purification if techniques are used that release only periplasmic contents while leaving the cytoplasmic membrane intact (Neu and Heppel, 1965). Secretion of heterologous gene products has been successfully employed for various proteins that are difficult to produce in the cytoplasm of E. coli as soluble and active proteins, including various growth factors (Cheah et al., 1994), receptors (Fuh et al., 1990), and recombinant Fab fragments (Skerra, 1994). Although eukaryotic signal peptides have been reported to function in E. coli, most secretion vectors utilize signal peptides derived from prokaryotic genes such as the ompA (Takahara et al., 1988; Cheah et al., 1994), pelB (Power et al., 1992), phoA (Oka et al., 1985), or hisJ (Vasquez et al., 1989) signal peptides. Although secretion provides an alternative to cytoplasmic expression that can sometimes result in the production of properly folded and active protein, the yield of desired protein is often low. Also, overexpressed gene products have been reported to form inclusion bodies even in the periplasm (Bowden and Georgiou, 1990), so secretion is not a panacea for the problem of insolubility.

Fusion Proteins
The problems that have plagued successful overexpression of cloned gene products in E. coli (namely, inconsistent expression levels, protein insolubility, and difficult purification of the gene product from E. coli contaminants) have been most successfully addressed by the use of fusion proteins. Fusion proteins are created via a translational fusion of the coding sequence for the protein of interest to a gene for a highly expressed protein partner (or carrier protein). Typically, the gene encoding the protein of interest is inserted in-frame 3′ to the coding sequence for the carrier protein, in place of the usual termination codon. This allows uniform translational initiation of the carrier protein regardless of the coding sequence fused to its 3′ end, which helps to ensure consistent expression levels. Carrier proteins are usually chosen based on specific attributes that make them suitable in this role. The most successful fusion systems employ the maltose-binding protein (MBP; Maina et al., 1988), glutathione S-transferase (GST; Smith and Johnson, 1988), or thioredoxin (TRX; LaVallie et al., 1993a). The genes for these proteins are well expressed, and the proteins are highly soluble and provide specific physical characteristics or affinities to aid purification. These qualities also extend to the protein sequences fused to them, thereby enhancing the solubility and ease of purification of the entire fusion protein.

Maltose-binding protein is a 43-kDa secreted protein from E. coli. As its name implies, it binds specifically to maltose or amylose, a property that can be exploited in purification schemes (Guan et al., 1988). MBP fusions can be secreted into the periplasm, or they can be expressed without the MBP signal peptide, which results in accumulation of the fusion protein in the cytoplasm. MBP fusions are usually well expressed, and a high proportion of MBP fusion proteins are soluble. MBP expression plasmids and reagents can be purchased from New England Biolabs. The MBP expression vectors utilize the tac promoter and a plasmid-encoded lacIq gene for transcriptional control. These vectors contain a recognition sequence for the site-specific protease factor Xa to allow removal of the carrier protein following purification (see section on Fusion Protein Cleavage Methods).

Glutathione S-transferase is a 26-kDa cytoplasmic protein from Schistosoma japonicum. Carboxy-terminal protein fusions to GST are usually soluble and well expressed (Smith and Johnson, 1988). GST binds specifically to glutathione, and GST fusion proteins can be purified in a single step from crude bacterial lysates by affinity chromatography on immobilized glutathione. The GST expression plasmids are available from Pharmacia Biotech, and utilize the tac promoter and plasmid-encoded lacIq for inducible transcriptional control. A variety of vectors are available that contain either the factor Xa or the thrombin recognition sequence to allow cleavage and removal of the GST from the protein of interest following affinity purification (see section on Fusion Protein Cleavage Methods).

Thioredoxin is a 12-kDa intracellular E. coli protein. It is very soluble, can be highly overexpressed from plasmid vectors (Lunn et al., 1984), and has been shown to be a very successful fusion partner. A wide variety of gene products can be produced abundantly in soluble fashion when fused to TRX (LaVallie et al., 1993a). Two different characteristics of TRX can be exploited to allow specific purification of some TRX fusion proteins. TRX accumulates at specific sites along the inner surface of the cytoplasmic membrane, called Bayer's patches or adhesion zones. These sites constitute an osmotically sensitive compartment, and TRX (and some TRX fusions) can be selectively released from the cytoplasm and separated from the bulk of E. coli proteins by osmotic shock or freeze/thaw procedures. In addition, TRX is thermostable, and some TRX fusions can be purified by selective thermal denaturation of contaminants. In instances where osmotic shock or heat treatments do not provide adequate purification, an altered TRX protein has been developed (E.R.L., manuscript in preparation) that allows specific purification by metal-chelate affinity chromatography. The TRX fusion vector (available from Invitrogen) contains an enterokinase recognition sequence positioned at the junction between TRX and the C-terminal fusion partner.

In addition to the carrier proteins listed above, there are many other proteins that have been used to produce fusions for the purpose of generating large amounts of protein in E. coli. Among them are E. coli β-galactosidase (Ruther and Muller-Hill, 1983) and TrpE (Yansura, 1990), Staphylococcus aureus protein A (Nilsson et al., 1985; vector available from Pharmacia Biotech), chloramphenicol acetyltransferase (Knott et al., 1988), bacteriophage lambda cII protein (Nagai and Thøgersen, 1987), and various carbohydrate-binding proteins (Taylor and Drickamer, 1991; Hellman and Mantsala, 1992). Table 5.1.1 lists many fusion partners that have been described in the literature.


Table 5.1.1  Common Fusion Proteins

Fusion protein                          Specific purification method (a)
Maltose-binding protein                 Amylose binding
Glutathione S-transferase               Glutathione binding
Thioredoxin                             Selective release, heat stability, MCAC
Protein A                               IgG binding
β-Galactosidase                         APTG or anti-β-galactosidase antibody binding
Chloramphenicol acetyltransferase       Chloramphenicol binding
lac repressor                           lac operator binding (Lundeberg et al., 1990)
Galactose-binding protein               Galactose binding
Cyclomaltodextrin glucanotransferase    Cyclodextrin binding
Lambda cII protein                      None
TrpE protein                            None

(a) MCAC, metal-chelate affinity chromatography; IgG, immunoglobulin G; APTG, p-amino-β-D-thiogalactoside.
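The in-frame construction described in the Fusion Proteins section (target gene placed 3′ to the carrier coding sequence, in place of the carrier's termination codon, with a protease recognition sequence at the junction) can be sketched as a string operation with frame checks. Everything in this Python example is illustrative: the function names are hypothetical, the carrier and target sequences are short toy sequences rather than real thioredoxin or target genes, and the enterokinase site is written with one arbitrary choice of Asp and Lys codons.

```python
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})


def split_codons(seq):
    """Split an in-frame DNA sequence into codons, rejecting partial codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(carrier_cds, site_cds, target_cds):
    """Join carrier, cleavage site, and target into one open reading frame.

    The carrier's termination codon (if present) is dropped so that
    translation continues into the cleavage site and the target.
    """
    carrier = split_codons(carrier_cds)
    site = split_codons(site_cds)
    target = split_codons(target_cds)
    if carrier and carrier[-1] in STOP_CODONS:
        carrier = carrier[:-1]
    if not target or target[-1] not in STOP_CODONS:
        raise ValueError("target coding sequence must end with a stop codon")
    if any(codon in STOP_CODONS for codon in carrier + site + target[:-1]):
        raise ValueError("premature stop codon in the fusion reading frame")
    return "".join(carrier + site + target)


# Toy inputs (assumptions for illustration, not real gene sequences).
toy_carrier = "ATGAGCGATAAATAA"          # M-S-D-K-stop
enterokinase_site = "GATGATGATGATAAA"    # Asp-Asp-Asp-Asp-Lys
toy_target = "GCTCCGTAA"                 # A-P-stop

fusion = fuse_in_frame(toy_carrier, enterokinase_site, toy_target)
print(fusion, len(fusion) % 3 == 0)
```

Because the frame checks raise an error rather than silently padding, a target whose length is not a multiple of three is rejected before any construct is returned.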

Table 5.1.2  Fusion Tags

Amino acid tag                  Ligand                          Reference/supplier
Polyhistidine                   Metal-chelate affinity resin    Qiagen
Polyaspartic acid               Anion-exchange resin            Dalbøge et al. (1987)
Polyarginine                    Cation-exchange resin           Brewer and Sassenfeld (1985)
Polyphenylalanine               HIC resin (a)                   Persson et al. (1988)
Polycysteine                    Thiol                           Persson et al. (1988)
In vivo biotinylated peptide    Avidin or streptavidin          Schatz (1993)
Flag peptide                    Anti-Flag antibody              International Biotechnologies (IBI)

(a) HIC, hydrophobic interaction chromatography (UNIT 8.4).

Fusion Tags
Fusion tags are small stretches of amino acids added to the N-terminal or C-terminal end of a protein. Although they do not usually help to increase expression levels or protein solubility, they can be advantageous in protein purification and detection. The tags are generally chosen either because they encode an epitope that can be detected and purified using an antibody that binds to it, or because the tag amino acids provide a physical characteristic that can be exploited for easy and specific purification. A popular example is the polyhistidine tag, usually a stretch of six consecutive histidine residues added to either the N or C terminus of a protein, that provides specific binding to metal-chelate resins. There are a number of polyhistidine vectors on the market, such as the pTrcHis vector from Invitrogen. Other examples, such as polyarginine (Brewer and Sassenfeld, 1985) or polyaspartic acid tags (Dalbøge et al., 1987), can be used to alter the binding behavior of a protein on ion-exchange resins. A noteworthy fusion tag is a small stretch of amino acids recognized and biotinylated in vivo by the biotin-protein ligase in E. coli (Schatz, 1993). This allows specific capture of the biotinylated protein on immobilized avidin or streptavidin. Table 5.1.2 lists a number of the more popular fusion tags.

Fusion Protein Cleavage Methods

Whereas fusion proteins and fusion tags have proved useful for the reasons outlined above, the end result is that the protein of interest will carry additional sequences that may hamper its functional activity. There are several methods for removal of the carrier protein, which can be divided into chemical cleavage methods and enzymatic (proteolytic) cleavage methods.

Chemical cleavage of fusion proteins can be accomplished with reagents such as cyanogen bromide (Met; Itakura et al., 1977; Villa et al., 1989), 2-(2-nitrophenylsulfenyl)-3-methyl-3-bromoindolenine or BNPS-skatole (Trp; Dykes et al., 1988), hydroxylamine (Asn↓Gly; Bornstein and Balian, 1977), or low pH (Asp↓Pro; Szoka et al., 1986). The use of these reagents, as described in UNIT 11.4, can be applied to the site-specific cleavage of fusion proteins provided that the fusion is designed so that the chemically labile bond is positioned at the desired point of scission. Chemical cleavage reagents tend to be inexpensive and efficient, and many of the reactions can be performed under denaturing conditions so that even insoluble proteins can be cleaved. However, one disadvantage of using chemical cleavage reagents for the site-specific cleavage of fusion proteins is that the reactions are generally performed under extremes of pH and/or temperature that can result in unwanted amino acid side-chain modifications. Chemical cleavage methods also have the disadvantage of low specificity, so it is more likely that the protein of interest will contain an internal cleavage site.

Enzymatic digestion is usually the method of choice for soluble fusion protein cleavage, as the reactions are carried out under relatively mild conditions. Proteases such as trypsin (Lys or Arg), endoproteinase Asp-N (Asp), and S. aureus V8 protease (Glu or Asp) can be used in this role. High-quality trypsin, S. aureus V8 protease (endoproteinase Glu-C), and endoproteinase Asp-N can all be obtained from Boehringer Mannheim. These proteases, like chemical cleaving reagents, are limited by their low degree of specificity (recognizing and cleaving at single amino acids). Alternatively, thrombin (Leu-Val-Pro-Arg↓Gly-Ser; Gearing et al., 1989), factor Xa [Ile-Glu(or Asp)-Gly-Arg; Nagai and Thøgersen, 1984, 1987; Gardella et al., 1990], renin (Pro-Phe-His-Leu↓Leu-Val-Tyr; Haffey et al., 1987), collagenase (Pro-X↓Gly-Pro-Y; Germino and Bastia, 1984), and enterokinase (Asp-Asp-Asp-Asp-Lys; Dykes et al., 1988; LaVallie et al., 1993b) are much more suitable in this role. All of these enzymes have extended substrate recognition sequences (up to seven amino acids in the case of renin), which greatly reduces the likelihood of unwanted cleavages elsewhere in the protein. Factor Xa and enterokinase are most useful for cleaving off C-terminal fusion partners because they cleave on the carboxyl-terminal side of their respective recognition sequences. This allows the release of fusion partners containing their authentic amino terminus. In addition, the catalytic subunit of bovine enterokinase has been cloned and expressed (LaVallie et al., 1993b), and the recombinant enzyme has been shown to be approximately 100-fold more efficient than the native intestinally derived enzyme in fusion protein cleavage reactions (Racie et al., 1995). The recombinant enterokinase is available from New England Biolabs.
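The difference in specificity between single-residue proteases and proteases with extended recognition sequences can be illustrated by scanning a protein sequence for predicted cut sites. This Python sketch makes several simplifying assumptions: the recognition sequences are transcribed from the text into one-letter code as regular expressions, the names `PROTEASES` and `cleavage_positions` are hypothetical, the fusion sequence is a toy, and sequence context effects (for example, residues that block cleavage) are ignored.

```python
import re

# Recognition patterns in one-letter amino acid code; the cut is placed
# immediately after the matched residues. Thrombin is written with a
# lookahead so that the cut falls between Arg and Gly of LVPR|GS.
PROTEASES = {
    "trypsin": r"[KR]",
    "thrombin": r"LVPR(?=GS)",
    "factor Xa": r"I[ED]GR",
    "enterokinase": r"DDDDK",
}


def cleavage_positions(protein, protease):
    """Return predicted cut positions for one protease.

    Each position is the number of residues on the N-terminal side of the
    cut. A match at the very C terminus is skipped because no bond follows.
    """
    protein = protein.upper()
    pattern = PROTEASES[protease]
    return [m.end() for m in re.finditer(pattern, protein)
            if m.end() < len(protein)]


# Toy fusion: carrier-like segment, linker, enterokinase site, then target.
fusion = "MKTAYR" + "GSG" + "DDDDK" + "APTRSLQK"

print(cleavage_positions(fusion, "enterokinase"))
print(cleavage_positions(fusion, "trypsin"))
cut = cleavage_positions(fusion, "enterokinase")[0]
print(fusion[cut:])
```

With this toy input, the enterokinase pattern yields a single cut that releases the target with its own first residue at the new amino terminus, whereas the trypsin pattern also predicts a cut inside the target.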

CONCLUSION

Methods for the overexpression of cloned gene products in E. coli have improved significantly since it was first attempted. Common problems such as variable expression levels, inclusion body formation, and purification difficulties have been successfully addressed by advancements in expression technology. Probably the most significant of these advancements has been the development of fusion proteins and fusion tag expression and purification techniques. These methods have resulted in more consistent production of soluble and active protein, and have allowed for simple and efficient purification of the proteins from bacterial lysates. Although the production of soluble, properly folded, and active recombinant proteins in E. coli is still not guaranteed, the likelihood of success is far greater than it was just a few years ago. This progress should help ensure that E. coli will continue to be the host organism of choice for recombinant protein production.

LITERATURE CITED

Bishai, W.R., Rappuoli, R., and Murphy, J.R. 1987. High-level expression of a proteolytically sensitive diphtheria toxin fragment in Escherichia coli. J. Bacteriol. 169:5140-5151.
Bornstein, P. and Balian, G. 1977. Cleavage at Asn-Gly bonds with hydroxylamine. Methods Enzymol. 47:132-145.
Bowden, G.A. and Georgiou, G. 1990. Folding and aggregation of β-lactamase in the periplasmic space of Escherichia coli. J. Biol. Chem. 265:16760-16766.
Brewer, S.J. and Sassenfeld, H.M. 1985. The purification of recombinant proteins using C-terminal poly-arginine fusions. Trends Biotechnol. 3:119-122.
Bucheler, U.S., Werner, D., and Schirmer, R.H. 1990. Random silent mutagenesis in the initial triplets of the coding region: A technique for adapting human glutathione reductase-encoding cDNA to expression in Escherichia coli. Gene 96:271-276.
Cheah, K.C., Harrison, S., King, R., Crocker, L., Well, J.R., and Robins, A. 1994. Secretion of eukaryotic growth hormones in Escherichia coli is influenced by the sequence of the mature proteins. Gene 138:9-15.
Dalbøge, H., Dahl, H.H.M., Pedersen, J., Hansen, J.W., and Christensen, T. 1987. A novel enzymatic method for production of authentic hGH from an Escherichia coli-produced hGH precursor. Bio/Technology 5:161-164.
De Lamarter, J.F., Mermod, J.J., Liang, C.M., Eliason, J.F., and Thatcher, D.R. 1985. Recombinant murine GM-CSF from E. coli has biological activity and is neutralized by a specific antiserum. EMBO J. 4:2575-2581.
Devlin, P.E., Drummond, R.J., Toy, P., Mark, D.F., Watt, K.W.K., and Devlin, J.J. 1988. Alteration of the amino-terminal codons of human granulocyte colony-stimulating factor increases expression levels and allows efficient processing by methionine aminopeptidase in Escherichia coli. Gene 65:13-22.
Dykes, C.W., Bookless, A.B., Coomber, B.A., Noble, S.A., Humber, D.C., and Hobden, A.N. 1988. Expression of atrial natriuretic factor as a cleavable fusion protein with chloramphenicol acetyltransferase in Escherichia coli. Eur. J. Biochem. 174:411-416.
Fuh, G., Mulkerrin, M.G., Bass, S., McFarland, N., Brochier, M., Bourell, J.H., Light, D.R., and Wells, J.A. 1990. The human growth hormone receptor. Secretion from Escherichia coli and disulfide bonding pattern of the extracellular binding domain. J. Biol. Chem. 265:3111-3115.
Gardella, T.J., Rubin, D., Abou-Samra, A.-B., Keutmann, H.T., Potts, J.T. Jr., Kronenberg, H.M., and Nussbaum, S.R. 1990. Expression of human parathyroid hormone (1-84) in Escherichia coli as a factor X-cleavable fusion protein. J. Biol. Chem. 265:15854-15859.
Gearing, D.P., Nicola, N.A., Metcalf, D., Foote, S., Willson, T.A., Gough, N.M., and Williams, R.L. 1989. Production of leukemia factor in Escherichia coli by a novel procedure and its use in maintaining embryonic stem cells in culture. Bio/Technology 7:1157-1161.
Germino, J. and Bastia, D. 1984. Rapid purification of a gene product by genetic fusion and site specific proteolysis. Proc. Natl. Acad. Sci. U.S.A. 81:4692-4696.
Guan, C., Li, P., Riggs, P.D., and Inouye, H. 1988. Vectors that facilitate the expression and purification of foreign peptides in Escherichia coli by fusion to maltose-binding protein. Gene 67:21-30.
Haffey, M.L., Lehman, D., and Boger, J. 1987. Site-specific cleavage of a fusion protein by renin. DNA 6:565-571.
Hellman, J. and Mantsala, P. 1992. Construction of an E. coli export-affinity vector for expression and purification of foreign proteins by fusion to cyclomaltodextrin glucanotransferase. J. Biotechnol. 23:19-34.
Itakura, K., Hirose, T., Crea, R., Riggs, A.D., Heyneker, H.L., Bolivar, F., and Boyer, H.W. 1977. Expression in Escherichia coli of a chemically synthesized gene for the hormone somatostatin. Science 198:1056-1063.
Knott, J.A., Sullivan, C.A., and Weston, A. 1988. The isolation and characterisation of human atrial natriuretic factor produced as a fusion protein in Escherichia coli. Eur. J. Biochem. 174:405-410.
LaVallie, E.R., DiBlasio, E.A., Kovacic, S., Grant, K.L., Schendel, P.F., and McCoy, J.M. 1993a. A thioredoxin gene fusion system that circumvents inclusion body formation in the E. coli cytoplasm. Bio/Technology 11:187-193.
LaVallie, E.R., Rehemtulla, A., Racie, L.A., DiBlasio, E.A., Ferenz, C., Grant, K.L., Light, A., and McCoy, J.M. 1993b. Cloning and functional expression of a cDNA encoding the catalytic subunit of bovine enterokinase. J. Biol. Chem. 268:23311-23317.
Looman, A.C., Bodlaender, J., Comstock, L.J., Eaton, D., Jhurani, P., de Boer, H.A., and van Knippenberg, P.H. 1987. Influence of the codon following the AUG initiation codon on the expression of a modified lacZ gene in Escherichia coli. EMBO J. 6:2489-2492.
Lundeberg, J., Wahlberg, J., and Uhlen, M. 1990. Affinity purification of specific DNA fragments using a lac repressor fusion protein. Genet. Anal. Techn. Appl. 7:47-52.
Lunn, C.A., Kathju, S., Wallace, B.J., Kushner, S.R., and Pigiet, V. 1984. Amplification and purification of plasmid-encoded thioredoxin from Escherichia coli K12. J. Biol. Chem. 259:10469-10474.
Maina, C.V., Riggs, P.D., Grandea, A.G., Slatko, B.E., Moran, L.S., Tagliamonte, J.A., McReynolds, L.A., and Guan, C. 1988. An Escherichia coli vector to express and purify foreign proteins by fusion to and separation from maltose-binding protein. Gene 74:365-373.
Mieschendahl, M., Petri, T., and Hanggi, U. 1986. A novel prophage independent trp regulated lambda pL expression system. Bio/Technology 4:802-808.
Miller, J.H. and Reznikoff, W.S. (eds.) 1978. The Operon. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Nagai, K. and Thøgersen, H.C. 1984. Generation of β-globin by sequence-specific proteolysis of a hybrid protein in Escherichia coli. Nature 309:810-812.
Nagai, K. and Thøgersen, H.C. 1987. Synthesis and sequence-specific proteolysis of hybrid proteins produced in Escherichia coli. Methods Enzymol. 153:461-481.
Neu, H.C. and Heppel, L.A. 1965. Release of enzymes from Escherichia coli by osmotic shock during the formation of spheroplasts. J. Biol. Chem. 240:3685-3692.
Nilsson, B., Abrahmsen, L., and Uhlen, M. 1985. Immobilization and purification of enzymes with staphylococcal protein A gene fusion vectors. EMBO J. 4:1075-1080.
Oka, T., Sakamoto, S., Miyoshi, K., Fuwa, T., Yoda, K., Yamasaki, M., Tamura, G., and Miyake, T. 1985. Synthesis and secretion of human epidermal growth factor by Escherichia coli. Proc. Natl. Acad. Sci. U.S.A. 82:7212-7216.
Olins, P.O. and Rangwala, S.H. 1989. A novel sequence element derived from bacteriophage T7 mRNA acts as an enhancer of translation of the lacZ gene in Escherichia coli. J. Biol. Chem. 264:16973-16976.
Persson, M., Bergstrand, M.G., Bulow, L., and Mosbach, K. 1988. Enzyme purification by genetically attached polycysteine and polyphenylalanine tags. Anal. Biochem. 172:330-337.
Power, B.E., Ivancic, N., Harley, V.R., Webster, R.G., Kortt, A.A., Irving, R.A., and Hudson, P.J. 1992. High-level temperature-induced synthesis of an antibody VH-domain in Escherichia coli using the PelB secretion signal. Gene 113:95-99.
Racie, L.A., McColgan, J.M., Grant, K.L., DiBlasio-Smith, E.A., McCoy, J.M., and LaVallie, E.R. 1995. Production of recombinant bovine enterokinase catalytic subunit in Escherichia coli using the novel secretory fusion partner DsbA. Bio/Technology. In press.
Robinson, M., Lilley, R., Little, S., Emtage, J.S., Yarranton, G., Stephens, P., Millican, A., Eaton, M., and Humphreys, G. 1984. Codon usage can affect efficiency of translation of genes in Escherichia coli. Nucl. Acids Res. 12:6663-6671.
Ruther, U. and Muller-Hill, B. 1983. Easy identification of cDNA clones. EMBO J. 2:1791-1794.
Schatz, P.J. 1993. Use of peptide libraries to map the substrate specificity of a peptide-modifying enzyme: A 13 residue consensus peptide specifies biotinylation in Escherichia coli. Bio/Technology 11:1138-1143.
Schein, C.H. 1989. Production of soluble recombinant proteins in bacteria. Bio/Technology 7:1141-1149.
Schein, C.S. and Noteborn, M.H.M. 1988. Formation of soluble recombinant proteins in Escherichia coli is favored by lower growth temperature. Bio/Technology 6:291-294.
Shimatake, H. and Rosenberg, M. 1981. Purified regulatory protein cII positively activates promoters for lysogenic development. Nature 292:128-132.

Shine, J. and Dalgarno, L. 1974. The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: Complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. U.S.A. 71:1342-1346.
Skerra, A. 1994. A general vector, pASK84, for cloning, bacterial production, and single-step purification of antibody Fab fragments. Gene 141:79-84.
Smith, D.B. and Johnson, K.S. 1988. Single-step purification of polypeptides expressed in Escherichia coli as fusions with glutathione S-transferase. Gene 67:31-40.
Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W. 1990. Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185:60-89.
Szoka, P.R., Schreiber, A.B., Chan, H., and Murthy, J. 1986. A general method for retrieving components of a genetically engineered fusion protein. DNA 5:11-20.
Takahara, M., Sagai, H., Inouye, S., and Inouye, M. 1988. Secretion of human superoxide dismutase in Escherichia coli. Bio/Technology 6:195-198.
Taylor, M.E. and Drickamer, K. 1991. Carbohydrate-recognition domains as tools for rapid purification of recombinant eukaryotic proteins. Biochem. J. 274:575-580.
Vasquez, J.R., Evnin, L.B., Higaki, J.N., and Craik, C.S. 1989. An expression system for trypsin. J. Cell. Biochem. 39:265-276.
Villa, S., DeFazio, G., and Canosi, U. 1989. Cyanogen bromide cleavage at methionine residues of polypeptides containing disulfide bonds. Anal. Biochem. 177:161-164.
Yansura, D.G. 1990. Expression as trpE fusion. Methods Enzymol. 185:161-166.

Contributed by Edward R. LaVallie
Genetics Institute, Inc.
Cambridge, Massachusetts

Production of Recombinant Proteins in Escherichia coli

Current Protocols in Protein Science