
RACHELLE GAUDET: My name is Rachelle Gaudet.

I'm a professor of molecular and cellular biology at Harvard University.

And my research centers on structural biology, and the structure

of proteins in particular.

So proteins are responsible for many of the essential functions of life.

And their function comes from their structure,

which is determined by their sequence.

So how does this linear sequence of amino acids

dictate the fold, or the three dimensional arrangement of a protein?

Let's start by examining the basic building blocks of proteins.

Recall that a protein is a linear polymer, or polypeptide,

of amino acids, each linked by a peptide bond.

And these peptide bonds are highlighted here in yellow.

There are 20 common natural amino acids that

are the letters that make up most of the proteins in nature.

So each amino acid has four different groups coming off

of its central carbon, which is called the alpha carbon, or C-alpha.

First is just a hydrogen atom.

Then there's an amino group, a carboxylic acid group, and finally,

the R group or side chain.

And that's the variable one.

That's the one that varies between the 20 different common amino acids

in proteins.

So at the physiological pH of about 7.4, the amino group

is going to be positively charged.

And the carboxylic acid group is going to be a carboxylate,

or negatively charged.

And that means that, overall, the amino acid is actually

going to be uncharged, or neutral, and a zwitterion because it has


both a negative and a positive charge.

The other important feature is that the amino acid is chiral

because there are four different groups coming off of the C-alpha carbon.

And the natural amino acids that are used by the ribosome

to synthesize proteins are all L-amino acids.

AUDIENCE: Do you have any tricks to remember

what an L-amino acid looks like?

RACHELLE GAUDET: Yes I do.

So if you take the hydrogen that's coming off of the C-alpha carbon,

and then you put the nitrogen on the left and the carboxylate on the right

as you would if you're reading an amino acid sequence from N-terminus

to C-terminus, then the side-chain should be pointing up

and towards you like it is in this aspartic acid.

To speak the language of proteins, it's important to first learn its alphabet.

And that's the amino acids.

The 20 common natural amino acids are shown here.

There are many ways of grouping the amino acids.

This arrangement, which is shown by polarity and charge,

is one of these groupings.

Classifying amino acids by polarity is important

because their polarity affects which noncovalent interactions they can form.

And these interactions are largely what gives proteins their shape.

There are three main kinds of noncovalent interactions.

The weakest ones are van der Waals interactions,

such as this one between an aliphatic isoleucine

and an aliphatic leucine side chain.

As illustrated on the energy diagram on the right,

van der Waals interactions are weak and act only

over short distances, although they are present between any pair of atoms
in close proximity.

The distance at which the energy is minimal

represents the van der Waals radius that's illustrated here

by the transparent spheres.
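
The shape of the energy diagram described above can be sketched with a Lennard-Jones 12-6 potential, a common textbook model for van der Waals interactions. This is an illustrative assumption rather than the curve shown in the video, and the well depth `epsilon` and minimum-energy distance `r_min` are hypothetical placeholder values.

```python
def lennard_jones(r, epsilon=0.2, r_min=4.0):
    """Lennard-Jones 12-6 energy at separation r (same length units as r_min).

    epsilon (well depth) and r_min (distance of minimum energy) are
    placeholder values chosen for illustration, not data from the lecture.
    """
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)


# Steeply repulsive at short range, minimal at r_min,
# and rapidly approaching zero at longer distances.
for r in (3.5, 4.0, 5.0, 8.0):
    print(f"r = {r:.1f}  E = {lennard_jones(r):+.4f}")
```

The energy is lowest at `r_min` and is already close to zero at twice that distance, which is what "weak and short range" means in practice.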

The strongest noncovalent interactions are salt bridges

between pairs of charged ions.

Here a lysine side chain is paired to the C-terminal carboxylate of the protein.

Depending on the polarity of the environment,

a salt bridge can provide more than 10 times the binding energy

of a van der Waals interaction.

Finally, hydrogen bonds are two to five times

stronger than van der Waals interactions.

But they only occur between polar groups with permanent dipoles.

And one of these polar groups is acting as a hydrogen bond donor,

and the other one is a hydrogen bond acceptor.

Here you can see a hydrogen bond within the backbone of a protein,

within an alpha helix.

Hydrogen bonds are unique because they are directional.

They are strongest when the two dipoles are aligned.

In contrast, both the van der Waals interactions and salt bridge

interactions are non-directional.

So now let's survey a few of the amino acids.

We won't examine them all.

That's something you should do on your own.

But we'll get a sampling so that we can see

some of the different functionalities that each R group can have.

So let's start with the two acidic amino acids, aspartate and glutamate.

They both have a carboxylic acid group at the end of their side chain.

Or maybe I should say a carboxylate because at physiological pH


they are ionized and charged-- negatively charged.

Now, glutamate is longer than aspartate by one methylene group.

And you might think that that's not very much-- that's not a big difference.

But it actually makes a big difference, especially

in the types of conformations, or rotamers,

that each of these side-chains can achieve.

In this animation, you can see that the glutamate

has many conformations that it can achieve,

many more than the aspartate side-chain.

And that means that it might be better able to position itself exactly

in the right position to interact with a substrate or a ligand

in the active site of an enzyme.

So although the glutamate can optimally position itself,

that can come at an entropic cost.

And that's because the conformational flexibility of the many rotamers

will then be limited once it reaches its bound conformation, reducing

its entropy.

So now let's turn to histidine, which is another interesting amino acid that's

often found in the active sites of enzymes.

Now histidine has a pKa for the imidazole group

of its side-chain of about 6, which means that it can either

be uncharged or charged at physiological pH, depending on its environment.

So on the left is a deprotonated, or uncharged, form of the histidine.

Whereas, on the right is a protonated and positively charged

form of the histidine.

Now, in the neutral state the proton can actually

be on either nitrogen atom of the imidazole group.

And that's illustrated here by the two alternative structures--

one of them on the extreme left, and one on the right of the screen.
These two neutral states have different hydrogen bonding properties,

as suggested by the red and blue arrows here.

The transitions between the different states

can be used to shuttle protons in active sites.
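
To make the link between pKa and charge state concrete, here is a minimal sketch using the Henderson-Hasselbalch relation, which is standard acid-base chemistry but is not spelled out in the lecture. The function name is hypothetical, and the pKa of 6 is the approximate value quoted above for the free imidazole group.

```python
def fraction_protonated(pka, ph):
    """Fraction of a titratable group in its protonated form.

    Uses the Henderson-Hasselbalch relation:
    [protonated] / [deprotonated] = 10 ** (pKa - pH).
    """
    return 1.0 / (1.0 + 10 ** (ph - pka))


# Histidine side chain with the approximate pKa of 6 quoted in the lecture.
for ph in (5.0, 6.0, 7.4):
    print(f"pH {ph}: {fraction_protonated(6.0, ph):.1%} protonated")
```

With a pKa of 6, only about 4% of the side chains are protonated at pH 7.4. A modest shift in pKa caused by the local protein environment changes that fraction substantially, which is why histidine can be found in either state.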

So from histidine, we saw that the pKas of the protein groups

reflect their chemical properties.

Several amino acids have polar groups that

have pKas spanning a wide range of values, as shown here.

As food for thought, consider why tyrosine would

be so much more acidic than threonine and serine,

even though they're all alcohols?

So most amino acids are formed of carbon, nitrogen, oxygen, and hydrogen.

But two of them have a sulfur atom.

The first one is methionine, which is shown here.

And methionine is essentially a hydrophobic residue.

And it's very similar in size and shape to leucine, shown here.

The second one is cysteine, which has a sulfhydryl group.

And this group is interesting because it is actually

quite reactive under physiological conditions.

One of the reactions that it can undertake

is to be oxidized to form a disulfide bond.

So cysteines can react to form these disulfide bonds

under oxidizing conditions.

And those are conditions that are often found on the extracellular side.

Whereas, inside cells, conditions tend to be

more on the reducing side, which means that the cysteines will

be found in the reduced free form.

So I told you that amino acids that are used

by the ribosome, the natural amino acids, are chiral and L-amino acids.
Well, there are actually two exceptions to this.

One is glycine.

Glycine has a hydrogen as a side chain, which

means that it now has two hydrogens coming off of its alpha carbon.

So it's not chiral.

But also, the small side-chain means that it

has fewer conformational restrictions.

And that's going to be important in the process of protein folding.

The second exception is proline.

And proline is a cyclic amino acid.

And that's because its side-chain is actually

covalently linked to its amino group.

Now this linkage, this covalent linkage, means

that proline is actually more conformationally

restricted than most amino acids.

And again, this interesting conformational property

comes into play when we think about protein folding.

So now that we've looked at some of the amino acids in detail,

let's revisit how we can classify the amino acids.

This Venn diagram shows some of the many ways to classify amino acids.

For example, if we look at lysine, it is charged at physiological pH

because its side-chain amino group carries a positive charge.

It can also readily form hydrogen bonds, and therefore, it's also polar.

Why is it also classified as nonpolar?

Good question.

But take a look at the structure again, and I'll give you a hint.

Now, aside from the charged amino group at the end of its side-chain,

the rest of the side-chain is aliphatic, or nonpolar.

So that means that lysine can sometimes act as nonpolar in certain situations.
So the sequence of amino acids that make up a protein

is called its primary structure.

And there are four levels of structure that

make up the full three-dimensional structure of proteins.


2. RACHELLE GAUDET: In this video, we'll move to the next level of structure

3. within protein molecules.

4. And it's the chemical nature of the natural polypeptide

5. that puts constraints on the shapes that it can take in three dimensions.

6. This means that proteins will have a defined secondary structure.

7. And there are three types of secondary structure--

8. the beta sheet, the alpha helix, and loops.

9. So let's see how these elements come about.

10. But we'll first see how the constraints on the polypeptide

11. will give rise to the different secondary structure elements.

12. So recall that proteins are polymers of amino acids.

13. And these polymers come about from condensation reactions

14. that are catalyzed by the translation machinery in the cell.

15. And that leads to an unbranched polymer of amino acid residues

16. that are linked by amide bonds.

17. And in proteins, these amide bonds are called peptide bonds.

18. In the early days of protein biochemistry,

19. measurements of the dimensions of the peptide bond

20. led to a surprising finding.

21. The C alpha atom has the expected tetrahedral geometry,

22. and the carbonyl carbon has angles corresponding to a trigonal geometry,
23. suggesting that the peptide bond is a double bond.

24. But the peptide bond, the bond between the carbonyl carbon

25. and the amino nitrogen, measured 1.32 angstroms.

26. And therefore, it is shorter than a typical carbon-to-nitrogen single bond,

27. but longer than a carbon-to-nitrogen double bond.

28. This unusual bond length is due to the partial double-bonded character

29. of the peptide bond because of resonance,

30. as shown here in this chemical equilibrium.

31. An important consequence is that the peptide bond is planar.

32. The nitrogen and hydrogen of the amino group

33. and the carbon and oxygen of the carbonyl all lie in the same plane.

34. A second important consequence is that the peptide bond cannot rotate freely,

35. unlike normal single bonds.

36. So the peptide bond can be in only one of two conformations, cis or trans.

37. And I'm talking about the dihedral angle, which

38. states the position of the two C alpha atoms

39. that are linked through that peptide bond.

40. In the trans conformation, shown here at the top,

41. the two C alpha atoms are on opposite sides of the peptide bond.

42. Whereas in the cis conformation, they're on the same side of the peptide bond.

43. But as you see here, when they're on the same side of the peptide bond,

44. the two side chains tend to clash.

45. And that means that the cis conformation is unfavored.

46. So almost all peptide bonds are in the trans conformation,

47. with one exception, and that's the peptide bonds

48. that are N-terminal to a proline.

49. Recall that prolines are cyclic amino acids,

50. and that means that the steric clash is similar in both
51. the cis and trans conformations.

52. And since the trans conformation is now as bad as the cis one,

53. it is much more likely for a cis peptide bond to occur before a proline.

54. And note that it is the peptide bond on the N-terminal

55. side of prolines that is affected.

56. So now we know that the peptide bond is fixed.

57. But the other two dihedral angles of the backbone on either side of each C alpha

58. carbon are free to rotate.

59. Each residue, therefore, has two such dihedral angles.

60. One named phi, between N and the alpha carbon, and the other named psi,

61. between the alpha and the carbonyl carbons.

62. An easy way to remember the order is that it goes alphabetically.

63. In proteins, N always comes before C, and P-H-I, or phi, comes before P-S-I,

64. or psi.

65. Peptide bonds are also named.

66. They're named omega.

67. Omega is either zero or 180 degrees for cis or trans bonds respectively.

68. Let's see how the phi and psi angles are measured.

69. To measure the phi angle, you look down the length

70. of the nitrogen to C alpha bond from its most N-terminal end.

71. And then you measure the angle between the two carbonyl carbons.

72. Similarly, to measure the psi angle, you look down the length of the C alpha

73. to carbonyl carbon bond from its N-terminal end,

74. and then you measure the angle between the two nitrogens.
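
The measurement just described can be sketched in code as a dihedral angle between four atoms. This is a minimal illustration using the standard vector formula, not something shown in the video; the function names are hypothetical and the coordinates in the example are made up for demonstration. For phi the four atoms would be C(i-1), N, C-alpha, C; for psi they would be N, C-alpha, C, N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    0 means p0 and p3 are eclipsed (cis); +/-180 means they are
    on opposite sides (trans).
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the central bond runs along the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans arrangement
```

The same function applied to the two C-alpha atoms across a peptide bond gives omega, close to 0 for cis and close to 180 for trans.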

75. Each residue, therefore, has two such dihedral angles.

76. So this isoleucine has dihedral angles phi of minus 138 degrees

77. and psi of 132 degrees.

78. Here I've plotted the dihedral angle values


79. for this isoleucine on a phi and psi plot.

80. Now, a famous Indian structural biologist, G.N. Ramachandran,

81. realized that not all dihedral angle values are allowed,

82. and there are some steric clashes.

83. So using molecular models, he tested all possible values of dihedral angles.

84. And then he realized that there are only three regions that

85. contain allowed values of psi and phi angles for normal natural proteins.

86. And this plot that you now see is called the Ramachandran plot.

87. Let's take a closer look at how the plot was built.

88. This animation illustrates a full rotation

89. around the phi bond of a valine residue.

90. The corresponding dihedral angles are indicated in the Ramachandran plot

91. on the top left.

92. Steric clashes are illustrated by cylinders

93. that you see appearing at some angles.

94. Note how the allowed angles, here within the blue region of the plot,

95. minimize those clashes.

96. And we can do the same process doing a full rotation around the psi bond.

97. Ramachandran looked at every single possible position to build his plot.

98. We can see now again the three regions of allowable phi and psi angle values

99. on the Ramachandran plot.
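
As a rough sketch of how a phi-psi pair maps onto the plot, the function below assigns angles to one of three regions. The rectangular boundaries and the region labels are simplifying assumptions made for illustration; the real allowed regions have irregular shapes and are not given numerically in the lecture.

```python
def ramachandran_region(phi, psi):
    """Assign a (phi, psi) pair, in degrees, to a rough Ramachandran region.

    The numeric boundaries are assumed, coarse rectangles, not the
    contours of the actual plot.
    """
    if -180 <= phi <= -45 and (psi >= 90 or psi <= -150):
        return "beta"
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "alpha"
    if 20 <= phi <= 120 and -20 <= psi <= 90:
        return "positive-phi"
    return "other"


# The isoleucine measured earlier: phi = -138, psi = 132.
print(ramachandran_region(-138, 132))  # beta
```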

100. Now, to form the two familiar regular secondary structures, alpha

101. helices and beta sheets, we need several residues in series

102. within one given region of this Ramachandran plot.

103. So let's see a specific example.

104. For an alpha helix, we need multiple residues

105. in a row to lie within the phi and psi region shown here in green.

106. And here is a cartoon representation of such an alpha helix.


107. This cartoon representation, which was invented by Jane Richardson,

108. shows a helical ribbon for helices.

109. Now, natural L-amino acids form right-handed helices.

110. So imagine if you're going up a staircase that's

111. shaped like this helical ribbon.

112. The railing that's on the outside should be in your right hand

113. if it's a right-handed helix.

114. Now let's look underneath this ribbon at the backbone atoms.

115. You can see that the backbone atoms, when

116. they're in this phi-psi region of the Ramachandran plot,

117. can form hydrogen bonds with their polar groups,

118. so that the carbonyl of the residue N will hydrogen bond

119. to the amino of the residue N plus 4, as highlighted here in yellow.

120. Side chains, which are illustrated here as spheres, will then point outwards

121. and slightly downwards along the helix, towards its N terminus.

122. Now, the periodicity of an alpha helix is

123. that it contains 3.6 residues per turn.

124. And that will lead to important patterns in the primary structure

125. or sequence of a protein.

126. And I've illustrated it here with an example from ATP synthase.

127. So on this helix you can see the yellow residues are the hydrophobic residues,

128. and they all fall on one face of the helix.

129. And if you count them in the primary structure,

130. they're all about three or four residues apart.

131. And that allows this particular helix to form a hydrophobic core

132. by packing against another helix, which I'm now showing here in blue.
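
The three-or-four spacing follows directly from the 3.6 residues per turn: each residue is rotated 360 / 3.6 = 100 degrees around the helix axis relative to the previous one. The sketch below, with a hypothetical function name, computes where residue i sits relative to residue 0 when looking down the axis.

```python
def offset_from_first(i, degrees_per_residue=100.0):
    """Angular position of residue i around the helix axis, relative to
    residue 0, folded into the range (-180, 180] degrees.

    100 degrees per residue comes from 360 / 3.6 residues per turn.
    """
    angle = (i * degrees_per_residue) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


# Positions spaced three or four apart stay close to residue 0's face.
for i in (0, 3, 4, 7, 11):
    print(i, offset_from_first(i))

# Positions one or two residues away point in very different directions.
for i in (1, 2):
    print(i, offset_from_first(i))
```

Residues 0, 3, 4, 7 and 11 all land within 60 degrees of each other, so hydrophobic side chains at those positions line up along one face of the helix.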

133. Here's a quick brain teaser for you.

134. Why do we sometimes say that prolines break helices?


135. Can you think of why that is?

136. There'll be a question about this at the end of the video.

137. Let's now turn to beta sheets.

138. Now, multiple residues in a series that fall within the blue region,

139. that's illustrated here on the Ramachandran plot,

140. can form a beta strand.

141. And then you need multiple beta strands to hydrogen bond

142. together to form a sheet.

143. Here is a four-stranded beta sheet in cartoon representation.

144. Each strand is illustrated by an arrow going from the N to the C terminus.

145. Sheets can contain parallel strands, anti-parallel strands, or a mixture.

146. As you can see here, this beta sheet is mixed.

147. Anti-parallel strands often come from adjacent parts

148. of the primary structure, and then they would be linked by a short loop.

149. But adjacent strands can also come from a whole other part

150. of the same polypeptide or from another protein altogether.

151. Again, let's look at the backbone structure behind these cartoons.

152. In a single beta strand, the backbone is in an extended conformation.

153. And then the hydrogen bonding of backbone amino

154. and carbonyl groups between adjacent strands is what forms a beta sheet.

155. Looking at the edge of a beta sheet, you can

156. see how the side chains extend perpendicularly to the plane containing

157. the hydrogen bonds.

158. The side chains within one beta strand alternate above and below the plane.

159. So in beta sheets, we often see an alternating pattern

160. in amino acid properties.

161. For example, alternating aliphatic and polar residues

162. could form a sheet with one hydrophobic side and one polar side.
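
The alternation can be illustrated by splitting a strand sequence into its two faces. This is a minimal sketch: the example sequence is hypothetical, and the hydrophobic residue set is a simple assumed classification, not one taken from the lecture.

```python
# Assumed, simplified set of hydrophobic one-letter codes.
HYDROPHOBIC = set("AVLIMFW")


def strand_faces(sequence):
    """Split a beta strand into the residues pointing to each side of the sheet.

    Side chains alternate, so even and odd positions face opposite sides.
    """
    return sequence[0::2], sequence[1::2]


# Hypothetical strand with alternating aliphatic and polar residues.
one_side, other_side = strand_faces("VTISVKL")
print(one_side, all(r in HYDROPHOBIC for r in one_side))
print(other_side, any(r in HYDROPHOBIC for r in other_side))
```
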
163. STUDENT: What about the last region on the Ramachandran plot?

164. RACHELLE GAUDET: Yes, so this combination of phi-psi angles is also

165. allowed, but it cannot be repeated for many residues, unlike the other ones.

166. And so this region will correspond to loop regions.

167. But be careful, for loops, any allowable phi-psi angles are OK.

168. So you'll find loop residues in all three regions of the Ramachandran plot.

169. So remember that to obtain an alpha helix

170. you need a repeating sequence of amino acids

171. that all fall within the green region of the Ramachandran plot,

172. and it would be the same for the beta sheets in the blue region

173. of the Ramachandran plot.

174. So a residue that's in the alpha region, that doesn't mean it's in the helix.

175. It could also be in a loop.

176. So to get a regular secondary structure you

177. need phi-psi to be within the same region all in a series.

178. Loops and turns will connect adjacent secondary structure elements.

179. Our first example here is two strands of an anti-parallel beta sheet

180. which are connected by a small loop.

181. And then the second example is a helix followed

182. by a loop followed by a strand.

183. Most loops will fall at the surface of proteins

184. and contain a lot of polar and small residues.

185. And they'll also form the shapes of active sites

186. because they have a lot of conformational flexibility

187. to really shape themselves around the ligands.

188. Remember, we talked about two special amino acids-- proline and glycine,

189. both of which have a lot of interesting conformational properties--

190. glycine being very flexible, and proline being restricted, but being
191. able to form cis peptide bonds.

192. And so proline and glycine are often found within loop regions of proteins.

193. Let's wrap up by looking at alcohol dehydrogenase,

194. an enzyme important in energy metabolism.

195. It contains a mixture of alpha helices, beta sheets, and loops.

196. You can see that accordingly its residues are distributed in all three

197. regions of the Ramachandran plot.

198. So we've now seen how the backbone of a protein

199. can be arranged into defined elements-- the alpha helices, beta sheets,

200. and loops.

201. This level of organization is the secondary structure.


2. RACHELLE GAUDET: In this video, we'll look at the final levels

3. of protein structure, the tertiary and quaternary structure.

4. Why are we doing tertiary and quaternary structure together?

5. As we'll see, they're actually quite similar.

6. They both describe the arrangement of a protein

7. at a higher level than simple secondary structure.

8. In both cases, the specific three-dimensional shape of the proteins

9. results from assembling parts through largely non-covalent interactions,

10. the van der Waals, salt bridges and hydrogen bonds.

11. And that's why tertiary and quaternary structure are often described together.

12. Now, more specifically, the tertiary structure

13. is the arrangement of helices, sheets, and loops

14. within the single polypeptide.


15. That is, their three-dimensional arrangement.

16. Whereas the quaternary structure is the arrangement of several polypeptides

17. into a protein complex that performs a specific cellular function.

18. The tertiary structure is built from motifs to domains to proteins.

19. What do I mean by that?

20. When we describe how the secondary structure elements are arranged

21. spatially, it's useful to define hierarchies of structure.

22. And we'll explore this using alcohol dehydrogenase as an example.

23. Alcohol dehydrogenase is an enzyme that we'll discuss again in a future video.

24. Here is the full structure of active alcohol dehydrogenase, shown

25. as a cartoon, with the helices in green, the sheets in blue, loops in gray.

26. The smallest grouping of secondary structure

27. is a motif, made up of several secondary structure

28. elements that assemble in a consistent way.

29. As their name suggests, motifs recur in many different proteins

30. with varying sequences.

31. A very common motif, which we isolated here,

32. is the beta alpha beta motif, where two adjacent beta strands

33. are connected by an alpha helix.

34. Notice how this allows the formation of a parallel beta sheet.

35. One or more motifs can assemble to form a compact globular structure, a domain.

36. Here, the C terminal beta alpha beta motif, highlighted in yellow,

37. interacts with additional beta strands and several other helices

38. to form a domain.

39. Domains are independent folding units within proteins.

40. That is, an isolated domain can usually fold on its own

41. and maintain its tertiary structure independent of the rest of the protein.

42. Proteins often have multiple domains, which may then interact with each other
43. to form the functional unit.

44. In alcohol dehydrogenase, we can see how the domain we were examining,

45. now in yellow, interacts here with a second domain which

46. features a Rossmann fold, as we'll see in more detail in a minute.

47. The active sites of enzymes are often found in loop regions,

48. as I mentioned in the last video.

49. And these loop regions can be shaped both chemically and structurally

50. to provide specificity.

51. Often the active sites are in crevices at the interface between two domains.

52. And these crevices are rich in loops.

53. The active site of alcohol dehydrogenase lies between these two domains.

54. Here you can see in spheres representation both substrates,

55. NAD+ and ethanol, bound at the active site in the interface.

56. Active alcohol dehydrogenase actually contains two identical protein

57. molecules, as you can now see here.

58. Its quaternary structure is therefore a homodimer of two subunits,

59. with each one containing one active site at the interface of their two domains.

60. Note how part of the dimer interface is formed by two

61. interacting anti-parallel beta strands to generate one large mixed beta

62. sheet from the two parallel beta sheets in each Rossmann fold domain.

63. So the arrangement of the secondary structure elements in a domain

64. corresponds to a fold.

65. Like motifs, folds are found in many different proteins.

66. For example, the Rossmann fold that we just

67. saw is a very common fold that's found in thousands

68. of proteins and often proteins that bind nucleotides as cofactors.

69. OFF CAMERA SPEAKER: What's the difference between a domain and a fold?

70. Aren't they the same thing?


71. RACHELLE GAUDET: Not quite.

72. Every domain has a fold.

73. But some folds, like the Rossmann fold, appear

74. in thousands of different domains and different proteins

75. with different sequences.

76. So alcohol dehydrogenase and glutathione reductase, for example,

77. each have a domain with the Rossmann fold.

78. Folds are used to classify protein structure.

79. So the first order of classification in protein structures

80. are three different big types of folds-- the alpha folds,

81. the beta folds, and the alpha beta folds.

82. Within alcohol dehydrogenase, we already saw an example of a mixed alpha beta

83. fold, the Rossmann fold.

84. The Rossmann fold is a particular arrangement

85. of beta strands and alpha helices, including several beta alpha beta

86. motifs.

87. This arrangement yields a six stranded parallel beta sheet

88. that is sandwiched between helices on either side.

89. An example of a mostly beta fold is carbonic anhydrase, shown here.

90. Carbonic anhydrase catalyzes the inter-conversion

91. of carbonic acid, the result of dissolved carbon dioxide,

92. and bicarbonate.

93. It is therefore a critical enzyme for maintaining pH balance in the blood.

94. A common alpha fold is the globin fold.

95. An example is myoglobin, which is shown here,

96. the oxygen binding protein in muscles.

97. It releases oxygen under anaerobic conditions,

98. which is important during strenuous activity.


99. The globin fold contains eight helices and is often

100. associated with binding of the heme cofactor, shown here in orange.

101. Hemoglobin is another example of the globin fold.

102. Note how similar this subunit of hemoglobin

103. is to myoglobin, despite sharing less than 30% sequence identity.

104. These two domains share one fold.

105. But unlike myoglobin, hemoglobin is made of four separate polypeptide chains,

106. and is therefore said to be a tetramer.

107. This assembly of multiple polypeptides into one functional unit

108. is called the quaternary structure of a protein.

109. We often call these individual polypeptides

110. subunits. These subunits can have different primary structures.

111. Here, hemoglobin has two subunits of alpha hemoglobin

112. and two subunits of beta hemoglobin.

113. As we'll explore later in the course, the quaternary structure of hemoglobin

114. leads to very important functional properties

115. that are not possible from monomeric proteins like myoglobin.

116. Quaternary structure can also be formed of multiple copies

117. of the same polypeptide subunit.

118. Alcohol dehydrogenase, which we examined earlier in the video,

119. has two identical subunits, which are highlighted here in red and yellow.

120. As one might expect, proteins with similar sequences

121. have similar structures.

122. But more remarkably, proteins with highly divergent sequences often

123. still have the same structure, as we saw for hemoglobin and myoglobin.

124. It seems that there are a finite number of protein folds.

125. New proteins tend to reuse existing folds as opposed

126. to using a completely new one.


127. This is why a classification of protein folds is useful.

128. Shown here are just a few examples of roughly 1,500 folds

129. that have been identified.

130. This graph here shows the progress of structural biology over the last 40

131. years.

132. The number of folds that have been determined using crystallography

133. or NMR or other structure determination techniques

134. has risen, but has actually plateaued in the last few years,

135. where almost no new folds have been discovered.

136. And this is in contrast to the all-time high number of protein structures

137. that have been solved.

138. So there's a lot we still don't know about protein structures,

139. even if we know the general fold of a protein.

140. A lot of its function is determined by the details in its loops

141. and specific side chains, which shape how the protein functions.

142. So the arrangement of proteins into multiple independently folding domains

143. can increase their diversity.

144. And that's because domains are modular.

145. Gene recombination events can assemble new genes

146. that code for new combinations of domains

147. and therefore increase the diversity of the genome.

148. OFF CAMERA SPEAKER: But what holds these folds together?

149. Maybe that is conserved across proteins that

150. have the same fold but with different sequences.


2. RACHELLE GAUDET: In this video, we're going to talk about the forces that drive protein folding.

3. And we're going to see that the primary structure, or the sequence of a protein, will dictate

4. its three dimensional structure.

5. We have to remember that proteins don't fold in a vacuum.

6. They fold in an aqueous environment.

7. And that aqueous environment is critical to them

8. achieving their native conformation.

9. So here I'm showing you a molecule of RNase A.

10. And RNase A is a small protein that is a classic model

11. system that's used by protein chemists to study the process of protein folding.

12. Now I've added thousands of water molecules.

13. So the protein is now in its aqueous environment.

14. It's surrounded by water molecules and also a lot of other ions,

15. like sodium and chloride for example, that are illustrated here.

16. And these water molecules interact with the protein.

17. They do so transiently because they are tumbling in solution.

18. But these thousands of transient interactions are critical to the process of protein folding.

19. Let's consider the forces that dictate how proteins fold.

20. And so now I've gone to a schematic diagram that illustrates an unfolded protein with

21. all the little colored spheres as the side chains of the amino acids.

22. And then when the protein is unfolded like this,

23. it will have many, many different conformations that it can take.

24. And these conformations can interchange.


25. And that means all that conformational flexibility provides the unfolded protein a lot of conformational entropy.

26. Now once it folds, it will achieve one single native conformation.

27. And that greatly reduces the conformational entropy of the protein.

28. And it turns out that this is the major force against protein folding.

29. So that opposes protein folding because you have a loss in entropy.

30. So this is where the water comes into play.

31. Free water molecules will tumble in solution.

32. And in all the different orientations, they can see other molecules

33. and form hydrogen bonds.

34. So they have a lot of conformational entropy.

35. Now we can contrast this to a water molecule that's on the hydrophobic surface.

36. For example, the hydrophobic surface of a hydrophobic residue, a side

37. chain that's exposed to solvent.

38. That water molecule will have fewer conformations, or rotations,

39. that will enable it to have the same number of hydrogen bonds.

40. And so it will have reduced conformational entropy in comparison

41. to free water molecules.

42. So now if we return to our schematic of protein folding,

43. we can see that upon folding, a lot of the hydrophobic side chains-- which

44. are the little yellow spheres in our diagram-- are buried in the hydrophobic core of
proteins

45. upon protein folding.

46. And that means that now a lot of the water molecules

47. that were interacting with those side chains, are now free to interact with other water

48. molecules.

49. So there's a great increase in the conformational entropy of water

50. upon protein folding, and actually of the whole system.

51. So there's a favorable gain in entropy in the system by releasing


52. water molecules from hydrophobic surfaces when you bury those surfaces in the hydrophobic
core.

53. And this is called the hydrophobic effect.

54. AUDIENCE: So are hydrophobic residues always buried

55. in the hydrophobic core of a protein?

56. RACHELLE GAUDET: Actually not all of them.

57. Because about half of the residues on the surface of proteins

58. are also hydrophobic residues.

59. And these residues are often important for interactions with other proteins,

60. with ligands, or with substrates.

61. Another situation where we find a lot of hydrophobic residues also on the surface

62. of the protein is in membrane proteins, where these surface

63. residues will interact with the hydrophobic lipids in membranes.

64. So let's get back to the hydrophobic effect.

65. Another reason that the hydrophobic effect is really important to protein folding is

66. that it's a highly cooperative process.

67. So if you imagine two hydrophobic side chains that

68. will start contacting each other, they constrain the backbone

69. that surrounds them and will lead to other hydrophobic side chains now

70. being in close proximity and also interacting with each other.

71. So you get sort of a snowball effect.

72. And the hydrophobic core will zip up rapidly to hide all these hydrophobic side chains

73. on the inside of the protein.

74. However, recall that the peptide backbone contains

75. polar amide and carbonyl groups.

76. How are these groups buried inside the hydrophobic core of proteins?

77. As we learned in earlier videos, hydrogen bonding within the main chain

78. stabilizes secondary structures like the alpha helices and beta sheets.
79. And these hydrogen bonding patterns also serve to balance those partial charges on
the polar

80. groups of the peptide bonds.

81. And that is what allows the polar groups to be

82. buried within the hydrophobic core, because they're already

83. hydrogen bonded.

84. And similarly to the hydrophobic effect, the zippering up

85. of the secondary structure elements is also highly cooperative.

86. These non covalent interactions that form within a protein-- including

87. also the salt bridges and the van der Waals interactions--

88. are the final major contributor to the energetics of protein folding.

89. They contribute to a favorable net decrease in enthalpy of the protein

90. as it folds.

91. So now that we know about the three major forces that drive protein

92. folding, let's put them together.

93. So I'm going to show you some estimates of the energy contributions from each

94. of these forces to folding a protein of about 100 amino acids.

95. It turns out that the average size of a domain--the independently folding unit

96. of proteins-- is about 100 amino acids.

97. So the hydrophobic effect due to the burial of the hydrophobic amino acids

98. into the core of proteins, will yield minus 140 kilojoules

99. per mole of protein upon folding.

100. So that's a favorable energetic component.

101. The second favorable energetic component comes from the non covalent interactions,

102. which provide again about minus 150 kilojoules per mole of stabilization energy.

103. And then the force that opposes these two is the huge reduction in entropy

104. going from many, many conformations accessible to the unfolded protein

105. to the single conformation of the native protein.


106. So when we add up those three components, we still come out with a net
favorable free

107. energy, with about minus 70 kilojoules per mole of stabilization energy upon
folding of a 100 amino acid protein.

108. And that means that the process should be spontaneous.

109. These numbers are estimates.

110. And they are done with a few simplifying assumptions.

111. But it turns out that they agree quite well with experimental results.

112. So it turns out to be a simple matter of thermodynamics.

113. Protein folding is thermodynamically favorable.

114. Still, you can see that the net effect is rather small because we have a few large
forces

115. that almost cancel out.
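
The bookkeeping behind these estimates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two favorable terms and the net value are the rounded figures quoted above, while the unfavorable conformational entropy term is not stated in the lecture and is simply backed out here so that the three terms sum to the quoted net. The constant and function names are made up for this sketch.

```python
# Rounded estimates quoted in the lecture for a ~100-residue domain (kJ/mol).
HYDROPHOBIC_EFFECT = -140.0        # favorable: burial of hydrophobic side chains
NONCOVALENT_INTERACTIONS = -150.0  # favorable: H-bonds, salt bridges, van der Waals
NET_FOLDING_FREE_ENERGY = -70.0    # quoted net stabilization upon folding


def implied_entropy_penalty(hydrophobic: float, noncovalent: float, net: float) -> float:
    """Back out the unfavorable term, assuming net = hydrophobic + noncovalent + penalty."""
    return net - (hydrophobic + noncovalent)


# Assumption: the three contributions are simply additive.
entropy_penalty = implied_entropy_penalty(
    HYDROPHOBIC_EFFECT, NONCOVALENT_INTERACTIONS, NET_FOLDING_FREE_ENERGY
)
favorable_total = HYDROPHOBIC_EFFECT + NONCOVALENT_INTERACTIONS

print(f"Favorable terms combined:        {favorable_total:+.0f} kJ/mol")
print(f"Implied conformational penalty:  {entropy_penalty:+.0f} kJ/mol")
print(f"Net folding free energy:         {favorable_total + entropy_penalty:+.0f} kJ/mol")
```

Under these assumptions the implied penalty is large and positive, comparable in size to the favorable terms, which is the sense in which a few large forces almost cancel out.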

116. So now we know that thermodynamics favor the folding

117. of proteins in an aqueous solution.

118. But I mentioned earlier that there's one native conformation to proteins.

119. How do the proteins reach that one single native conformation?


2. RACHELLE GAUDET: So now we have a basic idea that protein folding is

3. energetically favorable.

4. But we also know that the unfolded state can take on many, many conformations.

5. So it could be reasonable to assume that the folded state could also take

6. on multiple different conformations.

7. But we know that a native protein will have typically just one native, folded
conformation.
8. So this raises another fundamental question--how do proteins achieve that one single
native fold?

9. So to answer this, we're going to turn to a classic experiment

10. by Christian Anfinsen, one of the famous early protein biochemists.

11. To do his experiments, Christian Anfinsen used RNase A.

12. And there's a story behind why he used RNase A. It turns out

13. that the Armour Packing Company of hot dog fame

14. had purified a whole kilogram of this protein, and was distributing small samples to
interested scientists.

15. So because of that, RNase has become a classic model

16. for studying protein folding.

17. So another reason that RNase A is very useful is that it's a nuclease, which means that
it's actually relatively easy to assay its activity.

18. So RNase A is a 124 residue protein that contains a mixture of alpha helices and beta
sheets,

19. and four disulfide bonds.

20. The disulfide bonds are pictured here in yellow.

21. So Anfinsen, before he got started, got some baseline information:

22. he just measured the activity of his protein sample,

23. and set that as 100% protein activity.

24. So after measuring the baseline activity, Anfinsen added 8 molar urea to his sample.

25. And that's illustrated here as sort of the gray shading.

26. And when he measured the activity in the presence of urea, he found 0% activity.

27. And that's because urea unfolds the protein into sort

28. of random coils, random conformations.

29. And then he removed urea, using a technique called dialysis.

30. And once the urea was completely removed, he measured the activity again, and found

31. about 100% activity.

32. So the activity had returned, suggesting that the protein had once again
33. refolded into its native conformation.

34. But he didn't stop there.

35. So thinking of those disulfide bonds that I told you about,

36. he repeated the experiment with one major difference.

37. So when he added the 8 molar urea, he also added a reducing agent, called beta
mercaptoethanol.

38. So what beta mercaptoethanol does is reduce each disulfide bond to release
two free cysteines.

39. So now there's going to be eight free cysteines on each RNase A chain.

40. And then he performed the dialysis again, which removed both the urea and the
mercaptoethanol.

41. And in the oxidizing environment of the test tube that he had that was just open to the

42. atmosphere, the disulfide bonds could now reform.

43. So after all the urea was removed, he measured the activity again

44. and found about 90% activity, which meant that the protein had again

45. taken on its native structure.

46. So even though he could reduce the disulfide bonds,

47. they could reform, again, into the native conformation

48. without destroying the protein.

49. Anfinsen then performed a great control experiment.

50. Again, he denatured the protein, and reduced all of its disulfide bonds.

51. But then, before dialyzing the protein, he added an oxidizing agent.

52. So this neutralized the mercaptoethanol and led to reforming of the disulfide bonds in
the denatured protein.

53. Then, when Anfinsen removed the urea by dialysis, he found that the protein was only
1% active.

54. Why is this?

55. He hypothesized that disulfide bonds formed at random within the protein

56. when it was denatured.


57. Most of these bonds formed between the wrong pairs of cysteines.

58. When the urea was removed, then the protein could no longer fold back into the native

59. state, because it was constrained by these errant

60. disulfide bonds.

61. Using some back of the envelope math, we can calculate that the correct combination
of

62. four disulfide bonds needed to form active RNase A would form, by chance, about 1%
of the time.

63. And that matches quite well the results that Anfinsen

64. obtained when he measured about 1% activity in his control experiment.
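
The back-of-the-envelope math can be written out explicitly. This is a minimal sketch, assuming that all eight cysteines pair up and that every possible pairing is equally likely in the denatured chain, which is a simplification. The function name is hypothetical.

```python
def count_disulfide_pairings(n_cysteines: int) -> int:
    """Count the ways to pair n cysteines into n/2 disulfide bonds."""
    if n_cysteines < 0 or n_cysteines % 2 != 0:
        raise ValueError("need an even, non-negative number of cysteines")
    ways = 1
    remaining = n_cysteines
    # Take any unpaired cysteine: it can bond with any of the (remaining - 1)
    # others. Pairing it removes two cysteines from the pool.
    while remaining > 1:
        ways *= remaining - 1
        remaining -= 2
    return ways


# RNase A has eight cysteines, so the count is 7 * 5 * 3 * 1.
total_pairings = count_disulfide_pairings(8)
# Assumption: only one of these pairings is the native, active one.
chance_correct = 1 / total_pairings

print(f"{total_pairings} possible pairings")
print(f"chance of the native pairing: {chance_correct:.2%}")
```

With eight cysteines there are 105 possible sets of four disulfide bonds, so the native set would form by chance a little under 1% of the time.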

65. And then there's a part two to this control.

66. And that's Anfinsen's final insight.

67. So he took this RNase A that now had the mismatched disulfides-- or that he

68. hypothesized had them-- and it only had 1% activity.

69. And then he added to this solution just a small amount of beta mercaptoethanol.

70. And what this small amount of beta mercaptoethanol will do

71. is it will allow disulfide bonds to both break and reform.

72. And so they can exchange to different cysteines, and there's a chance of the disulfide
bonds

73. forming the correct cysteine pairings.

74. After he waited long enough-- and by long enough,

75. I mean about 10 hours-- he found that he could regain a lot of the RNase

76. activity, so that the sample regained as much as 90% of the original activity.

77. And that suggests that the protein, with time, can again take on its native conformation,

78. and that it clearly wasn't destroyed.

79. So to summarize the result of Anfinsen's critical control experiment,

80. the RNase A protein could not reach its native fold after the disulfide bonds were
allowed

81. to form at random in the unfolded peptide chain in the presence of urea.
82. However, when the urea and the reducing agent were removed at the same time,

83. almost all the protein was active.

84. And this suggested that the native fold of the protein

85. placed the correct cysteine residues in close proximity.

86. And then the correct disulfide bonds then formed, and stabilized the structure and
activity of the

87. RNase A.

AUDIENCE: Can you explain how urea denatures proteins?

88. RACHELLE GAUDET: So the answer actually goes back to the previous video.

89. Urea is a very polar molecule.

90. So when it's at very high concentrations-- such as 8 molar--

91. it will alter the hydrogen bonding patterns of water,

92. and that will decrease the impact of the hydrophobic effect we talked about,

93. which drives a lot of the protein folding.

94. And so reducing the hydrophobic effect actually shifts

95. the equilibrium of protein folding back towards the unfolded state.

96. So that's why 8 molar urea will lead to unfolding of proteins.

97. From the results of his series of experiments, Anfinsen formulated an important
concept called

98. the thermodynamic hypothesis.

99. And the thermodynamic hypothesis states that the most stable

100. thermodynamic conformation of a protein is its native fold.

101. And that basically implies that folding is a spontaneous process in cells.

102. Another important lesson from his experiments is that the information contained
in the primary

103. structure, or the sequence, of a protein is sufficient to dictate its

104. three dimensional fold.

105. So the protein doesn't need any templates or any other instructions,

106. other than the presence of a favorable, physiological-like buffer.

107. So the thermodynamic hypothesis suggests that, if our understanding


108. of the energetic forces that drive protein folding

109. is sufficiently sophisticated and precise, we could actually perhaps predict the
structure of proteins, just from knowing their sequence, or primary structure.

110. And actually, for some simple small proteins, that is the case.

111. So here in yellow, I'm showing you the predicted structure

112. of a small bacterial protein.

113. And the structure was predicted using a powerful computational algorithm that

114. includes a lot of information about the energetics of proteins.

115. And after those predictions were finished, the structure was determined using
X-ray crystallography.

116. So here in gray is the experimental structure, superimposed onto the predicted
structure, in yellow.

117. And you can see that there's actually a pretty impressive agreement

118. between the predicted and experimental structures, because they look very
similar, and they superimpose nicely.

119. And that actually says that there's strong, independent support

120. from these predictions for the thermodynamic hypothesis,

121. because the fact that we can predict it means that we actually

122. understand protein folding quite well.

123. So when we think about the process of folding of proteins,

124. we often illustrate it using a protein folding funnel, like the one shown here.

125. And the protein folding funnel shows the thermodynamic landscape-- so

126. the energetic landscape-- of the folding process.

127. On the x-axis is conformational space.

128. And that sort of illustrates all the many different conformations

129. that the protein can take.

130. In some sense, you could illustrate that with a large, multi-dimensional space--

131. one dimension for each of the dihedral angles that

132. can freely rotate in the backbone.


133. But that would just be impossible to show, so we just simplify it by saying
conformational

134. space is the x dimension.

135. And then, in the y dimension, we have the free energy.

136. So then you can imagine that the protein starts at the high energy

137. level of that unfolded state, and then it will slowly travel down the funnel,

138. towards its folded conformation, its native structure.

139. And the native structure will be at the minimum of that energy landscape,

140. at the minimum of the funnel.

141. And that's the key of the thermodynamic hypothesis.

142. Now in some cases, on some of the paths, the protein

143. might encounter a thermodynamic trap, like the one that's

144. shown here, where there's a semi-stable folded intermediate

145. on the way towards the native state, the most stable state.

146. The protein folding funnels can actually take on many different shapes

147. for different proteins.

148. So each primary structure or sequence will have its own folding funnel.

149. So in this smooth funnel case, the protein will simply start in an unfolded state,
and

150. rapidly reach its native conformation at the minimum.

151. So in this second example, there's a large flat region

152. at the top of the graph, where there's high energy.

153. So that means that there are many conformations in the unfolded state that
have about the

154. same amount of energy-- of high energy.

155. And so the protein will sample many of these conformations

156. before it reaches the point where the funnel suddenly drops.

157. And so the protein will then fold rapidly into its native state,

158. after having sampled many unfolded conformations.


159. In this third scenario, a protein might need to pass through a high energy
intermediate

160. before folding.

161. And that's indicated here by the shaded regions on this blue plot.

162. But many proteins probably have a rough folding funnel,

163. like the first funnel we discussed, which is shown here in yellow again.

164. And it has these thermodynamic traps that we discussed,

165. with semi-stable folding intermediates.

166. The thermodynamic landscape, or folding funnel, of each protein

167. is likely to be unique.

168. As you might imagine, the shape of the funnel influences the kinetics of the
process, or

169. how long a protein will take to fold.

170. So now we know that protein structure is determined by the primary sequence.

171. And that's a good thing for cells, because life is complicated enough.

172. So because the protein sequence dictates the three dimensional shape,

173. it allows the protein to fold into the right shape

174. almost every time, without any other templates or information coming

175. from the cellular components.

176. But those protein folding funnels that we just looked at also hint at another
important concept in protein folding,

177. and that's the kinetics of protein folding.


2. RACHELLE GAUDET: Protein folding kinetics is a quite complicated process, and it's actually an

3. active field of research. So in this video, we'll show you some of the basic

4. concepts that dictate some of the kinetics of protein folding just to get

5. you a sense of how protein folding happens. Remember that we already have a tool to

6. visualize the process of protein folding, and that's the folding funnel. When we

7. look at the protein funnel, we think of the protein starting at a high energy

8. level when it's unfolded and then moving down in trajectory towards the minimum,

9. and the minimum is the native fold of the protein. But how does a protein

10. actually change its shape as it's undergoing this folding process? Remember

11. the hydrophobic effect? We talked about how the hydrophobic side chains will

12. bury themselves into the hydrophobic core of the protein. Well this hydrophobic

13. collapse is actually often one of the first steps in the folding process. When

14. looking at these folding funnels, remember that the fraction of residues

15. in the native conformation increases as the protein moves towards its native

16. state, and at the same time the number of possible conformations decreases and

17. that's the funnel effect here, as the side chain becomes more and more

18. constrained. So let's dig deeper into these conformations as the protein is

19. undergoing the process of folding to better understand the kinetics. Remember

20. that these phi and psi bonds can rotate, but not quite freely all the way around

21. the 360 degrees of rotation. Some of the angles are disallowed because there are

22. some steric clashes, and we saw that the Ramachandran plot is a great way to

23. visualize the regions that are allowed, the regions that prevent a lot of the

24. steric clashes in proteins. And so we already know now that there's some

25. constraints on the conformations of proteins, but the protein still needs to
26. take on very precise dihedral angles to maintain its native fold, because just

27. one or two errant dihedrals can actually disrupt the fold

28. of the protein, and we can illustrate this with a simple example of an alpha

29. helix. Remember that a series of residues within the same region of the phi-psi

30. angles in the Ramachandran plot, the green region here, can generate an alpha

31. helix. So say if just two residues within this alpha helix change their dihedral

32. angle values to other parts of the Ramachandran plot.

33. Well that would actually disrupt the whole helix, so therefore that tells us

34. that protein folding relies on very precise conformations of the backbone

35. through their phi-psi angles. So now let's get an idea of the magnitude of the

36. problem, the magnitude of the task that the protein has to go through to get to

37. its native fold. So we're going to do a thought experiment making some estimates

38. about the number of conformations that the protein might be able to adopt.

39. So we're going to take an example of a protein of 100 residues, so that's about

40. the average size of a protein domain. So now we know that its phi-psi angles are

41. restricted, so that we can approximate that there's roughly three different

42. conformations for each residue corresponding to the three different

43. regions of the Ramachandran plot. So that would yield three to the 100

44. conformations available to the whole protein, assuming that each of these

45. dihedral angles can behave independently. Three to the 100 is actually an enormous

46. number, and to get an idea of the magnitude of this number we're going to

47. suppose that the protein samples each of these conformations for a very

48. short amount of time, 0.1 picoseconds. Well that would

49. yield 10 to the 27 years to sample every single conformation, and that's orders of

50. magnitude longer than the age of the universe. So that brings us to

51. a paradox, right, that the protein has no time to explore all the possible

52. conformations on the way to its native fold. This is called Levinthal's paradox

53. after Cyrus Levinthal, who was the first one to state this
54. concept as a paradox, and what he hypothesized is that this paradox means

55. that there must be some conformations that are just never explored. There must

56. be some paths that are more common as proteins start to fold. So how do proteins

57. then figure out which conformations might be the best ones to lead them

58. towards their native fold? That's where cooperativity comes into

59. play, because this cooperativity brings some residues in proximity, and so it

60. makes choices for the protein. So let's see a more specific example. Remember

61. when the hydrophobic core forms you might have at first a contact between

62. two residues, that's labeled c1 here, but that first contact will then bring other

63. residues in close proximity. So then the second contact that's labeled c2 here

64. now has a higher probability of forming. So that's the cooperativity being

65. illustrated for the hydrophobic effect that leads to the collapse of the

66. hydrophobic core, but this cooperativity is also important in forming the

67. secondary structure elements. So let's see this example now for a secondary

68. structure element, looking at the folding of a helix. So the largest kinetic

69. barrier for the folding of an alpha helix is the formation of the first hydrogen

70. bond, and that's called the nucleation event. It's analogous to when you

71. crystallize something, the first seeds of a crystal might take a long time, but

72. then the growth of the crystal is a lot faster. So here in an alpha helix, the

73. first hydrogen bond might take a long time to form, but once it's formed it

74. brings into proximity the other residues that want to form hydrogen bonds, and

75. then they will extend the helix by forming these hydrogen bonds on either

76. side of that first hydrogen bond to form the complete helix.

77. So this cooperativity is key to the assembly of secondary structure elements

78. as well. So protein scientists who study folding have generated sort of two

79. extreme models of protein folding to help explain it.

80. In the first one, the secondary structure forms first, and then the secondary

81. structure elements will then form together the tertiary structure by
82. moving until they achieve the correct fold. And this is illustrated here as a

83. time course with the energy on the y axis, and you can see that first the

84. secondary structure elements are formed and then the full fold of the protein.

85. And this process or this model is called the diffusion collision model. In the

86. other model, the hydrophobic core of the protein that's indicated here in yellow

87. is what first collapses in a relatively random way, and once the core has

88. collapsed, this allows the secondary structure elements to form, because now

89. the residues are in close proximity, and the protein will adopt its native

90. conformation. And this second model is called the nucleation condensation

91. process. The currently available experimental results on protein folding

92. indicate that both models can operate in protein folding, and which model is

93. favored depends on the protein, depends on the primary sequence. So as with the

94. thermodynamics of protein folding, the kinetics of protein folding is a process

95. that we understand better and better, and a

96. lot of that is due to computer simulations. Now one type of computer

97. simulation is molecular dynamics simulation, which simulates the motion of

98. atoms, and it's powerful enough to simulate the folding of some of the

99. small proteins in the aqueous solution, because remember that the aqueous

100. solution is key to the folding process. So now we're going to look at a

101. simulation of the folding of a small protein, a 35 residue protein. It's the

102. headpiece domain of villin, which is a cytoskeletal protein, and this protein is

103. actually experimentally one of the classic models of protein folding and

104. it's one of the fastest folding proteins that's been determined experimentally. So

105. in the simulations, there are actually thousands of water molecules that are

106. included in the aqueous solution, but they're not going to be shown here just

107. for clarity. And the simulation covers just six microseconds of real time,

108. that's how long it takes for this villin headpiece to fold. You can see

109. that the villin headpiece is a small fold that contains three helices, and
110. they're colored here from a blue N-terminus to a red C-terminus. So this

111. simulation starts with an extended peptide, and the peptide chain rapidly

112. collapses onto itself, and that's similar to the nucleation condensation model of

113. protein folding that we just saw. Then two helices form soon after, while that

114. middle helix takes a lot longer to form to yield the final fold. Remarkably the

115. final fold is very similar to the experimentally determined structure of

116. the headpiece. But it's important to note that this is just one sample trajectory

117. that shows us one example of how the villin protein might fold, and actually

118. every simulation of this villin headpiece yields slightly different

119. results, although there are themes that emerge. So protein folding kinetics are

120. quite complicated, but we've already reached a few take-home messages,
some

121. key ideas. And one of them is that there's going to be a limited number of

122. paths that the protein is actually going to take to reach its fold. We

123. came to that by looking at the Levinthal paradox and how a lot of these

124. conformations are probably not energetically accessible. So the idea is

125. that the landscape of the energetics of protein folding is what's going to

126. dictate the path towards folding. And the other important idea is that the

127. cooperativity of the process is also really important, so the cooperativity

128. will sort of dictate which steps to take along the way, one after the other, and so

129. that will ensure that the protein folds in a timely manner.
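
The Levinthal estimate from earlier in this video can be reproduced with a short calculation. This is a sketch using the same simplifying assumptions as the thought experiment: three independent conformations per residue, 100 residues, and 0.1 picoseconds spent on each conformation. The constant names and the 365.25-day year are choices made for this illustration.

```python
import math

CONFORMATIONS_PER_RESIDUE = 3      # roughly three allowed Ramachandran regions
RESIDUES = 100                     # about the size of an average protein domain
SECONDS_PER_SAMPLE = 0.1e-12       # 0.1 picoseconds per conformation
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year

# Assumption: every residue chooses its conformation independently,
# so the counts multiply across the chain.
total_conformations = CONFORMATIONS_PER_RESIDUE ** RESIDUES

# Time needed to visit every conformation exactly once, one after another.
search_seconds = total_conformations * SECONDS_PER_SAMPLE
search_years = search_seconds / SECONDS_PER_YEAR

print(f"conformations: about 10^{math.log10(total_conformations):.0f}")
print(f"exhaustive search: about 10^{math.log10(search_years):.0f} years")
```

The result comes out on the order of 10 to the 27 years, matching the figure quoted in the thought experiment, whereas the villin headpiece simulation folds in microseconds. A random, exhaustive search cannot be how proteins fold.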


2. RACHELLE GAUDET: So we've learned quite a bit about how proteins fold,

3. but we've done so under assumptions that are based on a homogeneous solution of
dilute

4. protein that's folding in an aqueous solution.

5. For example, in an aqueous buffer in our test tube in the lab.

6. So now we're going to consider folding in the environment of the cell--

7. so how the protein deals with the cellular environment.

8. We often think that the cytosol is this big wide open space for the molecules

9. to float around.

10. But, in fact, it's actually jam-packed with molecules-- particularly proteins and RNA.

11. So this image represents a simulation of the crowded cytosolic environment

12. in a bacterium.

13. So this crowded environment poses some unique challenges to proteins

14. as they fold.

15. And that's the subject of this video.

16. So the challenge here is that the protein has to fold onto itself

17. to achieve its own tertiary structure.

18. But instead, it might very well interact with other proteins or other molecules

19. in the process of trying to fold.

20. And that would inhibit its folding ability.

21. Here's our model of the protein folding funnel, where the protein can achieve its native
fold--

22. the energy minimum.

23. And we've seen that this is true in dilute solutions-- for example,

24. in the Anfinsen experiment.

25. But in the cytosol, there are many other proteins that are in close proximity to the
nascently
26. folding proteins-- the proteins that are just synthesized by the ribosomes.

27. So if the protein chain actually interacts with other proteins that are also nascent

28. folding proteins, they could adopt other conformations that

29. have lower energy-- in particular, some multi-protein aggregates.

30. And this potential problem is illustrated here by extending the conformational space
available to the protein in the

31. process of searching for its native fold.

32. So we've added a whole other side to this graph showing the potential for aggregates.

33. The same energetic forces that dictate native protein folding

34. dominate, also, in aggregates.

35. Namely, the benefit of burying hydrophobic residues through the hydrophobic effect.

36. So this suggests that in the cytosol there's a risk that proteins aggregate rather than

37. adopt their native fold.

38. And then they couldn't perform their biological functions.

39. But aggregates are even more nefarious than that.

40. For example, here's an electron micrograph of aggregates of a peptide

41. called amyloid beta.

42. And these aggregates are one of the pathogenic features

43. of Alzheimer's disease.

44. So you can see that aggregates can be really deleterious to cellular functions.

45. So how do cells actually prevent these aggregates from forming?

46. So cells can prevent the aggregation using proteins called chaperones.

47. And these chaperone proteins will actually aid protein folding.

48. But let's not assume that each protein has its own chaperone

49. to bend it into its native state.

50. We know that the native state is thermodynamically favorable.

51. That's the thermodynamic hypothesis.

52. But what chaperones can do is prevent the newly forming proteins
53. from aggregating with other proteins.

54. So we'll start with one of those chaperones, HSP70.

55. It plays an important role in folding of the proteins

56. as they are being translated-- right as they're coming off the ribosomes.

57. And mRNA here is translated by a ribosome.

58. And so you can see the polypeptide coming out.

59. And the polypeptide that is synthesized is at risk of forming some aggregates.

60. And that's especially true with large proteins because they often have complicated
folds.

61. So for example, if the fold of a protein would require interactions

62. between distant elements in the sequence in the primary structure-- for example,

63. between the N-terminus and the C-terminus-- the N-terminus is translated first.

64. And it may form some aggregates with other proteins

65. before the C-terminus can be translated.

66. And that's where HSP70 comes in.

67. So when HSP70 is there, it can bind to the nascent protein

68. before it's fully translated and before it aggregates.

69. So the protein will then be more likely to reach its proper fold.

70. But how would it do that?

71. So the HSP70 interacts preferentially with hydrophobic residues.

72. It will transiently mask these hydrophobic residues

73. and prevent aggregation.

74. But the key word here is transient because HSP70 also

75. needs to release the substrate to allow it to try to fold.

76. If it was stably bound, it would effectively be just another kind of aggregate instead

77. of the aggregates that it should prevent.

78. So how does HSP70 interact only transiently with its substrates?

79. It uses cycles of ATP hydrolysis and exchange of ADP for ATP.


80. ATP hydrolysis causes a massive conformational change

81. that allows substrate binding to HSP70.

82. The substrate binding, in turn, will promote the exchange of ADP

83. for a new molecule of ATP.

84. And now HSP70 changes its conformation back to a state that is not permissive for substrate binding.

85. That will cause the release of the substrate.

86. The released protein then has another chance to fold on its own

87. and bury its hydrophobic surfaces.
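
The cycle just described can be sketched as a small two-state model. This is only an illustrative Python sketch, not part of the lecture: the class name `Hsp70`, its method names, and the string labels are hypothetical, and the model ignores kinetics entirely. It encodes only the two steps stated above: hydrolysis leads to substrate binding, and ADP-for-ATP exchange leads to release.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hsp70:
    """Toy model of the HSP70 cycle (hypothetical names, no kinetics)."""

    nucleotide: str = "ATP"          # "ATP" or "ADP"
    substrate: Optional[str] = None  # label of the bound client, if any

    @property
    def lid_closed(self) -> bool:
        # As described in the lecture, the lid is closed in the ADP-bound state.
        return self.nucleotide == "ADP"

    def hydrolyze_and_bind(self, client: str) -> None:
        """ATP hydrolysis: the lid closes and the client is held."""
        if self.nucleotide != "ATP":
            raise ValueError("hydrolysis requires the ATP-bound state")
        self.nucleotide = "ADP"
        self.substrate = client

    def exchange_and_release(self) -> Optional[str]:
        """ADP-for-ATP exchange: the lid opens and the client is released."""
        if self.nucleotide != "ADP":
            raise ValueError("exchange requires the ADP-bound state")
        self.nucleotide = "ATP"
        released, self.substrate = self.substrate, None
        return released


chaperone = Hsp70()
chaperone.hydrolyze_and_bind("nascent chain")
released = chaperone.exchange_and_release()
print(released, chaperone.nucleotide, chaperone.lid_closed)
```

After one full cycle the model is back in the ATP-bound, substrate-free state, which is the point of the transient binding: the client gets another chance to fold on its own.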

88. This schematic here illustrates that HSP70 can exist in either a substrate-bound or a

89. substrate-free state.

90. And the structures of these two states have actually been determined.

91. So let's take a closer look at the structures to see how ATP hydrolysis and exchange affect substrate affinity.

92. Here is HSP70 in its ATP-bound substrate-free state.

93. It has an ATPase domain, that's shown in blue, and a substrate-binding domain, that's shown

94. in red with a yellow lid domain.

95. Now with a transparent ATPase domain surface, you can see that the ATP is actually quite

96. deeply buried at the center of the molecule-- at the center

97. of this ATPase domain.

98. And upon ATP hydrolysis, the yellow lid actually moves into close contact with the rest of

99. the substrate-binding domain-- the orange part of the protein here.

100. And that forms a substrate-binding pocket that sequesters some of the substrate's hydrophobic surfaces.

101. So this is the part where HSP70 is sequestering the nascent protein.

102. Now if we return to the ATP-bound structure, notice how the exchange of ADP for ATP moves

103. the lid domain away and abolishes this pocket.

104. It is remarkable how the state of the nucleotide affects a very distal part of the structure.

105. With the ATP- and ADP-bound structures superimposed here,

106. you can see just how massive that conformational change

107. is between the ADP-bound state, which is the solid surface here,

108. and the ATP-bound state, which is the transparent surface.

109. The movement of the lid domain is what creates a binding pocket that is crucial for that

110. transient affinity of HSP70 for its substrates.

111. So let's unpack the name.

112. HSP70 means Heat Shock Protein 70.

113. There's actually a whole class of heat shock proteins, and most of them

114. are chaperones.

115. But where does the name heat shock protein come from?

116. It's because the expression of these heat shock proteins, or HSP proteins,

117. increases with exposure to high temperature.

118. It's usually a transient exposure to high temperatures-- so a heat shock.

119. So here you can see it in a Western blot experiment.

120. So the cells were heat shocked at time 0, and then the expression of HSP70

121. was monitored over eight hours.

122. And you can see that, especially at two and four hours,

123. there's more HSP70-- there's a darker band-- than there was at time 0.

124. That's in contrast to Actin, which is the control here,

125. whose expression is about constant throughout the experiment.

126. But it's important to see that HSP70 is also present at pretty high levels in unstressed

127. cells at time 0 here.


128. And that shows that it's important in folding-- in helping proteins fold-- even without a

129. heat stress.

130. So why is HSP70 expression increased so much with heat stress?

131. So heat can cause proteins to denature.

132. This plot shows the fraction of protein that's folded versus temperature for a representative

133. protein within the cell.

134. So as the temperature increases, there's more denatured proteins,

135. or fewer folded proteins.
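
A curve like the one in this plot can be sketched with a simple two-state (folded versus unfolded) model. None of the following comes from the lecture; it is a minimal sketch under stated assumptions: the unfolding free energy is taken as ΔG(T) = ΔH(1 - T/Tm) with a temperature-independent ΔH, and the default values Tm = 325 K and ΔH = 400 kJ/mol are hypothetical, chosen only to give a sharp transition above body temperature.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def fraction_folded(temp_k: float, tm_k: float = 325.0, delta_h: float = 400e3) -> float:
    """Fraction of protein folded in a two-state model.

    Assumes delta_g_unfold = delta_h * (1 - T / Tm), i.e. no heat-capacity
    term. tm_k and delta_h are hypothetical illustration values.
    """
    delta_g_unfold = delta_h * (1.0 - temp_k / tm_k)
    # Equilibrium constant for folded -> unfolded.
    k_unfold = math.exp(-delta_g_unfold / (R * temp_k))
    return 1.0 / (1.0 + k_unfold)


# Print the curve from 300 K to 345 K in 5 K steps.
for temp in range(300, 350, 5):
    print(f"{temp} K  folded fraction = {fraction_folded(temp):.4f}")
```

In this model the folded fraction is exactly one half at Tm and drops steeply over a few degrees, which is why a modest heat shock can denature a large share of a given protein.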

136. And that's dangerous because these denatured proteins

137. could form aggregates through intermolecular hydrophobic interactions

138. when the cells cool.

139. So the cells will upregulate chaperone expression to prevent this aggregation

140. and allow proteins to refold intramolecularly so that they

141. regain their native tertiary structure.

142. SPEAKER 1: Are chaperone proteins important in other situations?

143. RACHELLE GAUDET: Yes, actually, they are.

144. But let's just recap first.

145. So we've already seen that under basal conditions, they're important to maintain the homeostasis

146. of the cellular protein content.

147. And that under heat stress-- that's a condition that

148. actually led to their discovery--

149. they're very important under heat stress.

150. But they're also important in response to other stresses,

151. and you'll see increased chaperone expression in these situations too.

152. One example is in secretory cells.

153. When they're very active, when they're generating a lot of secreted proteins,

154. that can cause some protein folding stress.

155. Translating so much of the secreted protein could lead to aggregates.

156. So part of the stress response to this increase in secretory activity

157. is to increase chaperone expression to assist with protein folding.

158. So remember that HSP70 is not the only chaperone.

159. Actually, mammalian genomes have on the order of 200 chaperone genes.

160. One really interesting chaperone is called HSP60, or also

161. called GroEL in bacteria.

162. And it's shown here.

163. It forms a huge, multimeric cage structure.

164. And this view here is a cutaway view that shows the inside of that cage structure.

165. A whole protein can go inside this cage to fold into its native shape

166. in a safe environment that's free from hydrophobic interactions

167. with other proteins.

168. Let's recall the thermodynamic hypothesis.

169. And that's the hypothesis that the most thermodynamically stable state

170. of a protein is this protein's native fold.

171. And for most proteins that are folding in dilute solutions, this is true.

172. In the crowded environment of the cell, chaperones assist folding in several ways.

173. One of them is that they help proteins-- as they fold-- overcome the kinetic traps that are

174. along the pathway.

175. But even more important is that chaperones prevent the thermodynamically favorable aggregates

176. from forming.

177. And that's why many of the chaperone genes are actually essential genes.

178. Here's a fun analogy to close out this video.


179. So you can think of the crowded environment of the cell as the mosh pit

180. at a rock concert filled with teenagers having a lot of fun.

181. So the chaperones are just preventing the teenagers

182. from getting too tangled up before they've grown into full, mature adults.
