
644 PROTEIN FOLDING Vol. 7

PROTEIN FOLDING
Introduction

Protein folding is one of the most important processes that are vital for the ex-
istence of every living system. The correctly folded state of proteins is essential
for the tremendous array of protein functions at all levels of complexity, from the
subcellular machinery to organs and macroscopic structural elements. A very in-
complete list of protein functions includes their activity as enzymes (biocatalysts),
sensors of chemical and physical signals, regulators of expression of the genetic
information, structural building blocks (ranging from the microscopic cytoskele-
ton to the macroscopic fingernails and hair), and mediators of the immunological
phenomenon (as antibodies, receptors, and antigen-presenting agents).
As correct protein folding is required for normal cell function, protein
misfolding has grave consequences. An incorrect folded state of proteins is the
hallmark of a large number of diseases of unrelated origin, some of which are the
most common diseases in Western society. Such diseases include cystic fibrosis,
the most common genetic disease in the Caucasian population; amyloid diseases,
including Alzheimer’s disease and Type II diabetes; and infectious prion diseases
such as bovine spongiform encephalopathy (“mad cow disease”). Since amyloid
diseases are age-related, and the average age of the population will increase
significantly in the next decades, the prevalence of such diseases is predicted to
increase accordingly.
The “protein-folding problem” is related to the determination of how and
why a protein of a given amino acid sequence adopts a certain three-dimensional
structure. A recent extension to the problem includes the question of how and
why a protein adopts the misfolded or unfolded conformations. This article de-
scribes the different hierarchies of protein folding, the classical models and current

Encyclopedia of Polymer Science and Technology. Copyright John Wiley & Sons, Inc. All rights reserved.

understanding of the folding process through defined pathways or funnels, the assistance of molecular chaperons in the folding process, the experimental methodologies that are being used for the study of protein folding, and the attempts and advances in the prediction of secondary and tertiary structures of proteins. Finally, the article will describe recent advancements in the understanding of protein unfolding and misfolding.

Proteins as Polymers

Proteins are generally composed of a linear combination of the 20 naturally occurring L-α-amino acids (see PROTEINS). The polymerization of the amino acids occurs through the formation of an amide linkage between the carboxyl group of a given amino acid and the amino group of the next amino acid, with the resultant elimination of a molecule of water. This amide linkage is more commonly called a peptide bond and proteins are polypeptides, and are thereby polyamides. A typical length of a protein can range from tens to thousands of building blocks. Shorter polymers composed of amino acids (from 2 to around 40 amino acids long) are usually termed peptides. Some of the peptides are products of cleavage of longer ribosome-translated precursors. Other peptides (usually those shorter than 10 amino acids) may be synthesized by cellular enzymes, each devoted to the synthesis of one specific peptide. In many cases, those short peptides may be composed of amino acids that are different from the 20 naturally occurring amino acids. These peptides may be composed of nonnatural amino acids or D-α-amino acids. Furthermore, ribosome-synthesized proteins may have building blocks that are different from the 20 natural amino acids. This is due to post-translational modifications (such as the enzymatic formation of hydroxyproline by the prolyl hydroxylase protein enzyme).

The molecular basis for the enormously diverse functions of the proteins is the significant diversity of their building blocks. The chemical nature of the 20 natural amino acids is extremely versatile. The side chain of the amino acids can be either negatively or positively charged, polar, aliphatic (branched or unbranched), or aromatic (substituted or nonsubstituted). It may include various functional groups such as thiol, hydroxyl, carboxyl, amine, phenyl, and amide groups.

Moreover, the properties of this biopolymer are a result of the amino acid sequence, that is, the linear arrangement of the building blocks, rather than the composition of the protein. Two proteins with the very same amino acid composition may be completely different in their folded-state and molecular properties. Therefore, it is clear that the polymerization of a protein must be directed by a specific molecular plan, rather than a random polymerization process. The information for the polymerization of a linear sequence of a given protein is encoded in the DNA, which is transcribed to a messenger RNA, which is in turn translated into the protein polymer by cellular ribosomes (see POLYNUCLEOTIDES; GENETIC METHODS OF POLYMER SYNTHESIS).

The incredible degree of diversity presented by proteins can be appreciated by the calculation of the number of linear combinations available even for very short proteins. For a very small protein of 50 amino acids, the number of linear combinations of the 20 natural amino acids is about 10^65. For a more typical-sized protein of 200 amino acids, the number of linear combinations is more than 10^260.
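The two figures quoted for 50 and 200 residues are simply 20^n expressed as a power of ten. The short Python sketch below reproduces those orders of magnitude; the function name `log10_sequence_count` is illustrative rather than taken from the article, and the calculation assumes only the 20 natural amino acids with no post-translational modifications.

```python
import math


def log10_sequence_count(length: int, alphabet_size: int = 20) -> float:
    """Return log10 of the number of linear sequences of `length` residues."""
    if length < 0 or alphabet_size < 1:
        raise ValueError("length must be >= 0 and alphabet_size >= 1")
    # alphabet_size ** length, expressed as a base-10 exponent
    return length * math.log10(alphabet_size)


if __name__ == "__main__":
    for n_residues in (50, 200):
        exponent = log10_sequence_count(n_residues)
        print(f"{n_residues} residues: about 10^{exponent:.0f} possible sequences")
```

Working with the exponent rather than the integer itself keeps the result readable and makes it easy to compare chain lengths.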

All these calculations are done without taking into account any post-translational modifications. This immense magnitude of structural diversity is the core of the central role of proteins in all living systems.

The Structures of Proteins

The structures of proteins are usually described in four levels of organization: primary, secondary, tertiary, and quaternary. In the context of the protein-folding question the secondary and tertiary structures are the most important.

Primary Structure. This is the linear amino acid sequence of the polypeptide chain as described above. Primary structure is usually denoted by either one- or three-letter conventional codes. Determination of the primary structure of proteins was performed for many years by N-terminal sequencing, an approach pioneered by Frederick Sanger, who received his first Nobel Prize in Chemistry in 1958 for his protein-sequencing work. (His second Nobel Prize in 1980 was for the development of DNA-sequencing techniques.) The method is based on the Edman degradation reaction, which removes the N-terminal amino acid of the protein chain, leaving a new N-terminus on the chain. The identity of the removed amino acid is then determined by HPLC analysis. In the last few years, there has been an increased tendency to use mass spectrometry techniques to determine the sequence of proteins. These sequencing procedures are based on tandem mass spectrometry (MS/MS) in which proteins, or more commonly protein fragments, are further fragmented in the mass spectrometer by collision with gas molecules. This results in an ensemble of many molecules with defined differences in mass. The high accuracy of the mass spectrometers and the ability to specifically select a precursor ion (through an ion trap or a quadrupole) allow the determination of the amino acid sequence of the chain. Only two amino acids could not be distinguished using conventional MS techniques. These are leucine and isoleucine, which have the same molecular mass. However, recent techniques such as TOF mass spectrometry identify side-chain fragmentations that allow the selective identification of leucine and isoleucine. Mass spectrometry (qv) is also a very powerful method for the determination of post-translational modifications.

The Secondary Structure. The arrangement of the amino acids in a protein into local structural elements is termed secondary structure. The two main structural elements that are seen in proteins are the α-helix and β-sheet structures. The α-helical structure, which was originally suggested by the Nobel Laureate Linus Pauling and Robert Corey (1), could be described as a spring coiled right-handed about an imaginary cylinder. The helical structure has 3.6 residues per turn and the main forces that stabilize the structure are hydrogen bonds between amide hydrogens of peptide bonds and carbonyl oxygens of residues at the next turn of the helix. The size of a helix can range from 5 to 50 amino acids. The β-sheet structure is formed by the stacking of individual β-strands. The β-strands are usually from 5 to 15 residues long and are in a fully extended conformation. This secondary structure is also stabilized by hydrogen bonding between amide hydrogens and carbonyl oxygens of stacked chains. The β-strands can be arranged either in a parallel manner (in each strand the amino acid sequence of the chain is arranged toward the same direction) or in an antiparallel manner, which is more stable
energetically. Other secondary structure elements are β-turns, short turns stabilized by a specific pattern of hydrogen bonds, and loops, which are flexible linkers that connect secondary structure elements.

Tertiary Structure. This is the three-dimensional structure of a protein. The tertiary structure is made up from the interaction of the secondary structure elements to form the overall folding pattern of the polypeptide chain. The tertiary structure of a protein is formally described by the coordinates in space of all (or most) atoms of the protein molecules [see description below of the protein databank (PDB)]. The tertiary structure is usually quite compact and often globular, but can also be elongated or possess other geometries. As with secondary structure elements, in spite of the fact that the number of possible folds (ie, the specific arrangement of secondary structure elements) is almost infinite, there is a basic set of a few thousand unique folds. This observation has a crucial role in our ability to predict the tertiary structure of proteins through fold-recognition methods as described below.

The main driving force for formation of three-dimensional structures of proteins appears to be hydrophobic interactions (2–5). The arrangement of the protein in a way that minimizes the energetically unfavorable orientations (ie, hydrophobic moieties on the surface of the protein and hydrophilic moieties buried inside the core) appears to provide an overall structural pattern for the folded state. Interestingly, binary models of proteins, which present the protein as a linear arrangement of hydrophobic and hydrophilic elements, are quite successful in low resolution prediction of the three-dimensional structure of proteins and folding pathways (3–5). However, it is very clear that the fine structure of a protein is a result of much more complex interactions. Even the hydrophobic core could not be regarded purely as an unstructured entity, and there is certainly a core-packing process that may involve more specific interactions (eg, stacking of aromatic residues in a specific orientation within the hydrophobic core). Other elements like salt bridges, disulfide bonds, and hydrogen bonds are also very important in the stabilization of the folded proteins and fine-tuning of the folded state.

Figure 1 shows the primary, secondary, and tertiary structures of the bovine pancreatic trypsin inhibitor (BPTI). This small protein is being widely used as a model for folding studies, because of its small size (58 amino acids), the fact that it contains both α-helical and β-sheet structural elements, and its commercial availability.

Quaternary Structure. This is the spatial arrangement of noncovalently linked protein subunits to form a functional protein assembly. One of the best-known examples of a quaternary structure of protein is the assembly of functional hemoglobin that is made of four noncovalently linked subunits. Extensive discussion of the quaternary structure is beyond the scope of this article.

The Thermodynamic Hypothesis

A guiding principle in the study of protein folding is the “thermodynamic hypothesis” (6), established by Christian B. Anfinsen in the 1950s and early 1960s. The thermodynamic hypothesis suggests that a particular three-dimensional structure of a protein occurs because this molecular arrangement is the most stable
thermodynamically, in a given environment. The development of the hypothesis was based on Anfinsen’s experiments with the ribonuclease (RNase) protein. The RNase is a relatively small protein of 124 amino acids that contains four disulfide bridges. Therefore, there are 105 (7 × 5 × 3 × 1) possible combinations for the arrangement of the disulfide bonds in the folded protein. However, only one orientation is enzymatically active. Anfinsen reduced the disulfide bridges of RNase and unfolded the protein under extreme chemical conditions using high concentrations of urea. Under these conditions the protein was completely unfolded and could be regarded as a random polymer. However, when the urea was removed, followed by oxidation of the disulfide bridges, the protein spontaneously refolded back into its original form. On the other hand, when the protein was oxidized in the presence of a high concentration of urea (that was only later removed), only about 1% activity was gained. This is consistent with a random formation of various 105 “scrambled” RNase structures, only one of which has enzymatic activity. The Anfinsen hypothesis could be best summarized in his own words from his 1972 Nobel Prize acceptance speech: “The native conformation is determined by the totality of interatomic interactions and hence by the amino acid sequence.”

Fig. 1. Levels of structural organization of the BPTI protein. (a) Primary structure, the amino acid sequence of protein: NH2-RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA-COOH. (b) Secondary structure. α-helices are denoted by shaded boxes and β-strands are denoted by arrows. Secondary structure elements are presented according to their determination as they appear in the file containing the BPTI coordinates (1BPI) as deposited in the protein databank (PDB). (c) Tertiary structure, wireframe chemical structure (right), and schematic strand representation (left). The tertiary structures were prepared using rasmol software with BPTI coordinates.

Chaperon-Assisted Protein Folding. The Anfinsen theory was a very clear turning point in our understanding of protein folding and it appears to be valid in the typical time frame of the folding process (milliseconds to seconds) for most studied proteins. However, more recent studies have clearly demonstrated
that in the cellular environment, the folding process may be assisted or may require the help of so-called molecular chaperons (7–9). The chaperons, which are large protein assemblies, are associated with the target protein during part of the folding process. However, once folding is complete (or even before), the correctly folded protein is released from the chaperon assembly into the cellular environment. The physiological role of the molecular chaperons is not fully understood. It is widely speculated that the main role of these molecular assemblies is to avoid protein aggregation. Chaperons are found in all living organisms including bacteria, but they are mainly expressed under conditions of stress (such as heat shock). They also seem to serve as “quality control” machinery under normal growth conditions. Nevertheless, for most cellular proteins the “thermodynamic hypothesis” seems to hold during their initial process of folding upon synthesis by the ribosome.

Levinthal’s Paradox

The thermodynamic hypothesis describes the energetic end point of the folding process, but it does not deal with the pathway by which the molecule reaches its global energetic minimum. The rate of protein folding is usually in the order of milliseconds to seconds (10). However, the number of possible orientations of the dihedral angles of the various peptide bonds that compose the protein is astronomical. Assuming 10 possible conformations for each peptide bond in a very short 40 amino acid protein will result in 10^40 possible conformations. Levinthal argued that if we consider an average rotation frequency around each bond, one could assume that a protein can sample about 10^14 structures per second. Therefore, it would take a 40 amino acid protein 10^26 seconds, or about 10^18 years, to examine all the possible conformations. This time scale is much larger than the present age of the universe. The apparent contradiction between the number of possible configurations and the fast folding rate of proteins is known as “Levinthal’s paradox” (11), which proved to be highly instrumental in the conceptual realization of the folding pathways notion. The clear conclusion from Levinthal’s paradox is that folding cannot occur by sampling the entire conformational space and that there must be another way to reach the folded state of low energy. Indeed, Cyrus Levinthal suggested in the late 1960s, on the basis of his theoretical argument, that there are specific folding pathways.

Folding Pathways

It took several years after Levinthal’s and others’ theoretical arguments until folding pathways could actually be demonstrated. The very rapid folding reaction made the observation of folding intermediates quite difficult. The first experimental proof for the existence of specific folding pathways came from experiments done by Thomas Creighton in the mid-1970s (12,13) on the folding of bovine pancreatic trypsin inhibitor (BPTI), the protein that is presented in Figure 1. BPTI has three disulfide bridges, and Creighton was able to trap folding intermediates of BPTI with one or two disulfide bridges formed, by irreversibly blocking all the free thiol groups at different folding stages.
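Two of the counting arguments in this part of the article are easy to check numerically: Levinthal's estimate of the time needed for an exhaustive conformational search, and the number of ways a set of cysteines can be paired into disulfide bridges (105 for the eight cysteines of RNase). The Python sketch below is a minimal illustration; the function names and default parameters are chosen here for the example, and the year length of 365.25 days is an assumption of the sketch rather than a value from the article.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length, about 3.16e7 s


def levinthal_search_time_years(
    n_residues: int = 40,
    conformations_per_residue: int = 10,
    samples_per_second: float = 1e14,
) -> float:
    """Years needed to sample every conformation once (exhaustive search)."""
    total_conformations = conformations_per_residue ** n_residues
    seconds = total_conformations / samples_per_second
    return seconds / SECONDS_PER_YEAR


def disulfide_pairings(n_cysteines: int) -> int:
    """Number of ways to pair all cysteines into disulfide bridges.

    For n cysteines this is (n - 1) * (n - 3) * ... * 1.
    """
    if n_cysteines < 0 or n_cysteines % 2 != 0:
        raise ValueError("n_cysteines must be a non-negative even number")
    count = 1
    for k in range(n_cysteines - 1, 0, -2):
        count *= k
    return count


if __name__ == "__main__":
    print(f"Exhaustive search: about {levinthal_search_time_years():.1e} years")
    print("RNase, 8 cysteines:", disulfide_pairings(8))  # 7 * 5 * 3 * 1 = 105
    print("BPTI, 6 cysteines:", disulfide_pairings(6))   # 5 * 3 * 1 = 15
```

With the default values the search time comes out at a few times 10^18 years, consistent with the order of magnitude quoted in the text.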

Analysis of the nature of the intermediates allowed the construction of a folding pathway of BPTI. Later experiments by Weissman and Kim (14), using different methodologies, suggested that the nature of the folding intermediate might be different from that originally described. However, there was an agreement on the concept of intermediate species in the pathway from the unfolded to the folded state.

Fig. 2. Energy of folding reaction. A. A pathway view of the folding process. The folding process is represented as a chemical reaction. The reaction coordinate is deterministic and intermediates along the pathway are well-defined. B. Funnel view of the energetic landscape. The trajectory of the protein along the energy landscape is a probabilistic statistical mechanics function. [Figure labels: Unfolded, Intermediate, Folded; axes: Energy, Conformation, Reaction coordinate.]

In more recent years, other groups have suggested that the pathway toward the minimum of free energy might be less deterministic than previously assumed (15–20). This view is based on statistical mechanics rather than a chemical reaction view of the folding process (Fig. 2). Figure 2 presents a schematic view of the energetic landscape funnel and the folding pathway models. In this more modern point of view the folding process is seen as a multiplicity of routes down a folding funnel rather than a distinct pathway of discrete intermediates. This approach suggests that different partially folded species are distributed in the energetic landscape of the folding reaction according to the probability of occupation at a finite temperature in a way that is proportional to the Boltzmann factor. According to this view, the molecular dynamics within a folding funnel involves the progressive formation of an ensemble of partially ordered structures. The funnel approach does not contradict Levinthal’s argument that a given protein does not sample the entire energy landscape of its various conformations. However, in spite of sampling only a small part of the energy landscape, the exact path is the result of a probability function rather than a deterministic function. The practical meaning of this approach is that there is no way to clearly know the folded state of a molecule during the folding process. Furthermore, since in a typical experiment of protein folding there is a very large number of molecules, there is no way to predict the exact conformation of each molecule, and they can only be represented as a probability function. Very recent studies that use single-molecule techniques should help further explore the energy landscape of protein folding (21).
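The statement that partially folded species are populated in proportion to the Boltzmann factor can be made concrete with a small calculation. The Python sketch below is illustrative only: the three energy levels are hypothetical numbers chosen for the example, not measured values, and energies are assumed to be molar free energies in kcal/mol so that the gas constant R is used in the exponent.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(energies_kcal_per_mol, temperature_k=298.15):
    """Return equilibrium populations p_i proportional to exp(-E_i / RT)."""
    if not energies_kcal_per_mol:
        raise ValueError("at least one energy level is required")
    rt = R_KCAL_PER_MOL_K * temperature_k
    # Shift by the minimum energy for numerical stability; ratios are unchanged.
    e_min = min(energies_kcal_per_mol)
    weights = [math.exp(-(e - e_min) / rt) for e in energies_kcal_per_mol]
    partition_sum = sum(weights)
    return [w / partition_sum for w in weights]


if __name__ == "__main__":
    # Hypothetical energies for unfolded, intermediate, and folded species.
    species = ["unfolded", "intermediate", "folded"]
    energies = [0.0, -2.0, -8.0]
    for name, p in zip(species, boltzmann_populations(energies)):
        print(f"{name:>12}: {p:.6f}")
```

The populations always sum to one, and lowering the temperature concentrates the ensemble further into the lowest-energy state, which is the qualitative picture behind the funnel view.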

Models for the Order of Events in the Folding Process

Regardless of the view of the folding process as discrete steps with defined intermediates or as an energetic funnel, one of the interesting questions that still remains open is the order of events during the folding pathways. The three major hypotheses for the order of events are (1) the framework model (22–24), (2) the nucleation-growth mechanism (25), and (3) the hydrophobic collapse model (3,4).

The framework model suggests that the first stage in the folding reaction involves the formation of elements of secondary structures without any significant compaction. These secondary structure elements are then assembled to form the compact and folded final structure of the protein. The nucleation-growth mechanism suggests that the first stage in the formation of the folded state involves the formation of a small nucleus that is well-folded and compact. This nucleus involves only a small fraction of the protein polypeptide chain. The formation of the nucleus is followed by hierarchical assembly of further structural elements to form the well-ordered three-dimensional structure. The hydrophobic collapse model suggests that the first stage of the folding process involves the rapid collapse of the protein chain into a compact conformation that does not have well-ordered secondary structures. The secondary structure elements then grow in the general framework of the collapsed structure.

There is no clear indication which of the models correctly describes the folding mechanism. It may be that different protein molecules are being folded in different ways and/or that the actual sequence of events may be different from or a combination of the three models. The statistical mechanics model also may suggest that all three models are actually different discrete trajectories on the energy landscape and each one of them can occur even in the same folding reaction of an ensemble of protein molecules.

Experimental Methodologies

As described above, the folding reaction is a very rapid event (10). Therefore, in order to be able to follow the kinetics of folding and the sequence of events, there is a need for rapid stopped-flow techniques. The properties that are usually monitored using these techniques are either aromatic residue fluorescence or circular dichroism spectra. The fluorescence of the aromatic residues (especially of tryptophan) is largely dependent on the dielectric constant of its environment. Furthermore, aromatic residues tend to be part of the hydrophobic core of the protein. Therefore, monitoring the change of fluorescence during folding and unfolding reactions allows insight into the reaction mechanism. Circular dichroism (CD) spectra of proteins provide direct information on the secondary structure of proteins. The CD spectrum of a protein in the far-UV region (180–250 nm) shows a clear indication of secondary structure (especially of α-helical structure). Thus it allows the direct observation of the formation of secondary structures. These two techniques can be used not only to study the kinetics of protein folding but also the thermodynamics. By steady-state determination of the fraction of folded protein upon titration with a denaturing agent such as urea, the change in free energy (ΔG) of the folding reaction can be calculated.
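The step from a measured fraction of folded protein to a free energy change can be sketched as follows. This example assumes a simple two-state equilibrium between unfolded and folded protein, which the article does not state explicitly; under that assumption the equilibrium constant is K = f/(1 - f) and ΔG = -RT ln K. The function name and the sample fractions are illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def folding_free_energy(fraction_folded: float, temperature_k: float = 298.15) -> float:
    """Return the folding free energy in kcal/mol for a two-state equilibrium.

    Assumes unfolded <-> folded with K = f_folded / (1 - f_folded), so that
    delta_G = -R * T * ln(K). Negative values favor the folded state.
    """
    if not 0.0 < fraction_folded < 1.0:
        raise ValueError("fraction_folded must be strictly between 0 and 1")
    k_fold = fraction_folded / (1.0 - fraction_folded)
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(k_fold)


if __name__ == "__main__":
    # Hypothetical fractions read from a urea titration curve.
    for f in (0.1, 0.5, 0.9, 0.99):
        print(f"fraction folded {f:.2f}: delta G = {folding_free_energy(f):+.2f} kcal/mol")
```

At the titration midpoint (half of the molecules folded) the free energy change is zero, and only fractions measured within the transition region give reliable values, since K diverges as the fraction approaches 0 or 1.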

Fourier-transform infrared (FTIR) spectroscopy is another excellent method to study protein folding. Unlike the well-known use of FTIR as a method for the identification of functional groups, in terms of protein structure this method allows the determination of secondary structure. The frequency of vibration of the amide I band of the peptide chain (1600–1700 cm^-1) heavily depends on the structure of the protein. FTIR has the advantage of being more sensitive for the study of proteins that contain β-sheet elements as compared to CD. Furthermore, since FTIR spectroscopy can be applied to solids also, it allows the structural analysis of aggregated protein deposits. The availability of the rapid step-scan method for FTIR is also very useful for the study of rapid folding reactions (see VIBRATIONAL SPECTROSCOPY).

The Protein Databank (PDB). One of the most instructive tools available for researchers studying both basic and applied aspects of protein folding is the protein databank (PDB) (26,27). The PDB was established at the Brookhaven National Laboratories (BNL) in the 1970s and maintained there for many years. Several years ago, the maintenance and development of the PDB was transferred to Rutgers, The State University of New Jersey; the San Diego Supercomputer Center at the University of California, San Diego; and the National Institute of Standards and Technology, three members of the Research Collaboratory for Structural Bioinformatics (RCSB). Databank data is freely available via the Internet (http://www.pdb.org) and it contains the coordinates of protein structures as determined by X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. The rate of structure submission to the PDB has increased steadily over the years (Fig. 3). By the end of 2001 there were nearly 17,000 deposited structures in the PDB.

Prediction of Protein Folding

The importance of the correct protein fold for the understanding of protein activity and design of novel proteins has led to much interest in the possibility of predicting correct protein folds. In spite of the great importance and interest in the prediction of the three-dimensional structure of proteins from the primary structure there are no such algorithms available to date. However, the attempts to predict secondary structures of proteins have been much more (although not fully) successful.

One of the earliest and most pivotal attempts to predict the secondary structure of proteins was made by Peter Chou and Gerald Fasman (28). The Chou–Fasman scale for secondary structure prediction is based on the statistical occurrences of amino acids in different secondary structure elements. Their classification was based on the structures of protein as deposited in the PDB, taking into account the relative occurrence of the different amino acids in various proteins (which can be quite diverse). The various amino acids were classified according to their ability to form or break the two major secondary structures. The categories were strong former (H), former (h), weak former (I), indifferent former (i), breaker (b), and strong breaker (B) of the specific secondary structure elements. Proteins are scanned through a fixed-size window and each area of the polymer is scaled for its tendency to form the various secondary structures. Further developments of the Chou–Fasman method, such as the GOR method (29), improved the prediction potential by taking into account not only the identity of the specific amino acid but
also their context (ie, specific secondary structure patterns). Another approach for the prediction of secondary structures is based on neural networks analysis. According to this method, computational neural networks are trained by sequences of known secondary structures. One very successful example of the neural networks method is the development of the secondary structure prediction tool in a mail server called PHD (30).

Fig. 3. New structures versus new folds in the PDB. There is a constant increase in the number of new structures submitted to the PDB. The number of new folds shows significantly different behavior, with a sharp decrease in 2001. The data was taken from the official PDB statistics. [Figure panels: number of new structures at the PDB by year, 1972–2000; number of new folds at the PDB by year, 1980–2000.]

Prediction of the three-dimensional structures of proteins is much more complex. The success rate of this process is formally determined by a contest known as the Critical Assessment of Techniques for Protein Structure Prediction (CASP). The CASP contest showed a significant advance in the ability to predict three-dimensional structures of proteins over the years by homology modeling and fold recognition, as described below, but ab initio methods are still not very accurate. One of the major problems is finding a way to distinguish between the global minimum of free energy of a given protein molecule and local energetic minima. Most of the successful models of protein structures are based on homology modeling. Such
methods start by forcing the target sequence into the fold of its closest relative with a known folded state, followed by energy minimization of the structure using force fields. The Swiss Model, An Automated Comparative Protein Modeling Server, is available freely on-line (http://www.expasy.ch/swissmod/SWISS-MODEL.html). More advanced homology-modeling software is available commercially.

Another of the successful methods that is based on a reverse approach relies on protein fold recognition by “threading” (31). Instead of trying to predict the fold of the protein on the basis of the amount of free energy, the method determines whether a given sequence can be fitted to a known fold. This method takes advantage of the fact that the number of folds is limited. As the number of folds (Fig. 3) appears to be finite, it provides a way to look for the structure of new proteins using known folds. Another method for the prediction of three-dimensional structures of proteins is based on the in silico assembly of protein structure from shorter structural elements with more defined structures (32).

Protein Unfolding and Misfolding

There is an increased interest in recent years in protein unfolding and misfolding (33–37). Indeed, the formation of misfolded protein aggregates is the hallmark of various unrelated diseases and therefore attracts much medical attention. Other research activities are directed toward the search for the physiological significance of the unfolding and misfolding phenomena. For example, there is a clear realization that small but significant parts of protein molecules may be natively unfolded (38). Furthermore, recent studies have indicated that protein unfolding and misfolding might also have a physiological role. One instance is that of unstable and short-lived proteins, in which their instability is an integral part of a regulatory mechanism, as in the case of toxin-antitoxin systems (38). Another recent example is the formation of curli amyloid fibril in Escherichia coli, which is related to the formation of biofilms (39). There is also a correlation between the unfolded and misfolded states of proteins. Two amyloid-forming proteins, the diabetes-related islet amyloid polypeptide (IAPP) and the Parkinson’s disease-related α-synuclein polypeptides, have clearly been shown to be natively unfolded.

The formation of amyloid fibrils is probably one of the most important cases of protein misfolding. In this self-assembly process, soluble cellular proteins form large and ordered fibrillar structures. This process is also accompanied by a structural transition of the aggregated proteins from their native fold into a predominantly β-sheet secondary structure. The fibrillar structures are well ordered in the long axis direction, and X-ray fiber diffraction shows a clear 0.48-nm reflection on the meridian. Such reflection corresponds to the hydrogen bonding distance between β-strands. This is consistent with the secondary-structure transition toward a β-sheet structure upon amyloid fibril formation as is observed using CD and FTIR spectroscopy.

The “Correctly Folded State” as a Metastable State

The folded state of protein, as determined by X-ray crystallography or NMR and deposited in the PDB, is considered the state of lowest free energy for a given

Rev. 1501 (1985). For example. 579 (1974). Therefore if each protein molecule is not regarded as an independent thermodynamic system. 6. W. 1852 (2002). Biol. A. Fenton. B. and most notably the amyloid form. and D. G. Biochemistry 24. 7 PROTEIN FOLDING 655 isolated protein molecule. the fibrillation time of the aggregation-prone human calcitonin at physiological pH is 5 min at a concentration of 5 mg/mL. U. 13. Natl. P. 9. Acad. G. Myers and T. G. a solution of aggregation-prone but “correctly folded” proteins represent a system in a transient state that will eventually reach its global free- energy minimum of the aggregated state (35). Cell 107. E. K. F. G. Kim. Farr. but we consider the ensemble of proteins in aqueous solution as one large thermo- dynamic system. 252 (1951). and A. whereas the fibrilla- tion time of non–aggregation-prone salmon calcitonin under the same condition is about 7 months (40). U. Mol. 5. 223 (1973). Dill. Sci. E. K. 1619 (1995). Levinthal. A. 381 (1993). Acad. Onuchic. 14249 (1996). Dill. E. Biomol. N. Proc. 10. Proc. Phys.A. Natl. U. 17. Montal. amyloid-related proteins tend to undergo spontaneous aggregation in solution. 112 (1992). S. Struct. F. Curr. T. 11. Corey. Many (if not all) proteins may undergo a process of aggregation and misfolding at infinite time (35). J. T. However. 12. Chim. D. U. N. 235 (2001). Science 295. C. Biochemistry 29. Creighton.A. A. Acad. J. 44 (1968). form aggregated structures in solution. Biophys. For some pro- teins this may take minutes or hours and for others it can take months or years. J. Rev. This leads to the suggestion that the aggregated form. 71. P. in recent years it has been realized that disease-related proteins are not the only ones to undergo an aggregation process in solution. S. C. Science 250. 7133 (1990). Sci.A. Natl.S.” as coined by Jarrett and Lansbury (33)]. 15. Dill. Anfinsen. 8721 (1992). Science 181. 297 (1990). Chaudhuri. J. 2. 65. 22. as described above. Wolfenden. Rose and R. 
Oas. 37. Hayer-Hartl.Vol. J. It has been demonstrated that disease-unrelated proteins also. Thirumalai. recent studies have raised doubt about the validity of this assumption for concentrated ensembles of proteins in aqueous solution. of proteins may represent a generic form of proteins. 8. Proc. Science 256. R. R1038 (2001). L. 87. Biochem. S. Leopold. G. M. 11. 93. 87. Aggregation is also noted in preparations of many protein solutions after long stor- age. However. B. Wolynes. It appears that many (or perhaps all) proteins will sooner or later undergo the aggregation process. 7. Hartl and M. K. BIBLIOGRAPHY 1. Annu. 4. Science 267. 783 (2002). Annu. K. Pauling and R. Mol. A. P. 3. 14. myo- globin. Weissman and P. 89. . Horwich. but after this long time it does form aggregated fibrillar structures. 563 (1974). such as the SH3 domain. For example. 16. Biol. K. T.S. J. Creighton. and J. L. Rospert. Sci. Ellis. and a bacterial cold shock protein. Biol.S. Onuchic. It is therefore as- sumed that the nonaggregated state of polypeptides such as IAPP is a kinetically trapped metastable state [“kinetic solubility. Wolynes. J.

B. 739 (2002). 393 (1995). B. Ed. 28. 1 (1975). Bourne. H. Dobson. 90. Fasman. Normark. 36. O. 20. B. . C. Annu. Natl. S. Daggett. Curr. D. and R. Chem. 31. and S. 459 (1982). H. 112. 851 (2002). Biochem. Heuser. 41. A. 235 (2000). Jarrett and P. Robson. Z.A. Fiebig. K. K. 2652 (1999). 126 (1996). and B. E. Protein Sci. Struct. Dill. 329 (1999). T. F. I. Proc. Onuchic. Ptitsyn and A. 274. Biochem. Eisenberg. R. 24. A. 573 (2002). Biol. J. Int. 29. M. Uversky. Hultgren. FASEB J. S. Mol. Sci. Luthey-Schulten. Proc. Mol. Fersht.A.S. D. Biol. Chem. and D. T. M. Wolynes. 51. Fersht and V. Tsai. Sci. Hammar. U. Robinson. Annu. Koetzle. 13222 (1974). Chou and G. C. Meyer Jr.. 38. Natl. G. N. V. Trends Biochem. 10. 40. 1055 (1993). Osguthorpe. Bhat.S. and P. Shimanouchi. Science 269. Rodgers. 257 (2002). F. Natl. M. 32. 120. T. Biol. L. Sci. Mol. Westbrook. and H. Boczko and C. Nucleic Acids Res. B. J. Kim and R. E. U. 1942 (1993). Biol. P. Acad. Kanaori and A. Rev. T. E. 3626 (1995). FASEB J. S. Williams. Garnier. Bernstein. J. R. J. 23. J. Z. E. Sauer. Rice. U. T. S. 3. M. Kumar. 25. N. 11.A. Baldwin. 28. 37. 24. Biochemistry 34. K. R. 77 (2002). 34. Baldwin. L. Wolfson. J. Nosaka. Gazit. 7558 (1993). Nussinov. Biochemistry 15. 18. Gilliland. T. H. Shindyalov. E. Tasumi. Brice. 22. D. Biol. M. and N. R. C. 30. Chan. O. N. J. 21. 27. J. S. Proc. Gazit and R. Pinkner. Kennard. Rost and C. Rev. Fischer. 399 (2001). J. Sci. 97 (1978). Crit. 92. S. J. 19. M. 3 (1997). A. D. L. R. Bowie. Rev. 33. Biophys. 35. 12138 (1995). 7.S. Brooks. Acad. J. Ma. and M. 535 (1977). Biochem. Opin. Feng. Cell 108. 36. 59. Cell 73. J. Acad. D.
J. Lansbury Jr. Roth. 26. Y. Chapman. M. Berman. Rashin. Weissig. P. Science 295.. G. P. 16. N. Angew. U. P. Socci. 631 (1990). 39. S. E. F. L. Kim and R. G. Chem. D. Y. 90. J.

EHUD GAZIT VERONICA GLATTAUER JEROME A. WERKMEISTER Tel Aviv University