
Molecular Weight Distributions and Averages

Victor A. Bloomfield
Department of Biochemistry, Molecular Biology, and Biophysics
University of Minnesota
December 12, 1999

Introduction
Knowledge of molecular weights of biological macromolecules is important for many reasons. Molecular weight is a fundamental characteristic of molecules, an identifying mark of their chemical nature. It serves as a distinguishing label. Because of the linear nature of the genetic code and the colinearity of protein and nucleic acid sequences, it indicates the genetic information content. Differences in molecular weight between that expected from the DNA sequence and that observed for the resulting RNA or protein can indicate cleavage, or post-transcriptional or post-translational processing. Finally, molecular weight measurements can indicate subunit association, a key mechanism of biological regulation.

There are many ways to measure molecular weights [1, 2]. Some, such as mass spectrometry and chemical analysis of amino acid or nucleotide sequence, are of very high precision for covalently bonded molecules, but destroy the bonds that stabilize noncovalently associated polymers. Other techniques of enormous importance in molecular biology, such as gel electrophoresis and size exclusion chromatography, are often used to determine molecular weight by comparison with standards of known molecular weight $M$ and relative mobility $\mu$, typically according to an equation like $\mu = KM^a$, where $K$ and $a$ are empirically determined constants. Although these techniques are very convenient and, when run under nondenaturing conditions, can preserve subunit association, they are of limited precision and depend on the suitability of the standards.

More traditional physical chemical techniques may be best for determining subunit association of biological polymers. Osmotic pressure is useful for thermodynamic determination of the number average molecular weight and degree of association of moderate-size polymers, up to $M_r \approx 100{,}000$. Sedimentation equilibrium can cover a very wide range of molecular weights, and is arguably the best way to determine macromolecular association, since the weight average molecular weight is measured as a continuous function of concentration in a single experimental run. Light scattering is also an important technique for determining polymer molecular weight and size. Osmotic pressure, sedimentation equilibrium, and light scattering are absolute methods, firmly based on equilibrium thermodynamics and not requiring calibration with standards. However, they require specialized equipment and careful attention to sample purity.

Transport methods, such as sedimentation velocity and dynamic laser light scattering (which measures diffusion), are relative methods by themselves, requiring calibration with appropriate standards; but the sedimentation and diffusion coefficients can be combined to eliminate the frictional coefficient and give molecular weight with relatively high precision.

Finally, electron microscopy, STEM (scanning transmission electron microscopy), and various scanning probe microscopies are often able to give direct visualization of macromolecular complexes, though generally not with very high precision of molecular weight.

1

Molecular weight averages

1.1

Notation

There are a variety of ways of symbolizing the concentrations of species in a complex mixture. We shall use the following notation.

$N$ or $[\ ]$ denotes molar concentration.
$c$ denotes weight concentration. Remember that $c = NM$, where $M$ is molecular weight.
$X$ denotes mole fraction.
$w$ denotes weight fraction.
$i$ is a running index that denotes the degree of polymerization of the $i$th species in a mixture.

When we write a summation over species, such as $\sum_i M_i N_i$, we include just the species that are present in significant amounts, such as $i = 1, 2, 4$ in a hemoglobin solution that might contain monomers, dimers, and tetramers. If the molecular weight of the monomer is $M_1$, then $M_i = iM_1$.

1.2

Types of molecular weight averages

The number average molecular weight $\langle M \rangle_n$ is measured in experiments that determine the number of particles. This includes colligative properties, such as osmotic pressure, and electron microscope counting of particles.

$$\langle M \rangle_n = \frac{\sum_i M_i N_i}{\sum_i N_i} = \sum_i M_i X_i = \frac{\sum_i c_i}{\sum_i c_i/M_i} \tag{1.2.1}$$

The weight average molecular weight $\langle M \rangle_w$ is measured in experiments that determine the mass of particles. This includes sedimentation equilibrium and light scattering, scanning transmission electron microscopy (STEM), and gel electrophoresis in which the staining of a given band is proportional to the number of subunits (amino acids, nucleotides) in the band.

$$\langle M \rangle_w = \frac{\sum_i M_i c_i}{\sum_i c_i} = \sum_i M_i w_i = \frac{\sum_i M_i^2 N_i}{\sum_i M_i N_i} \tag{1.2.2}$$

The z-average molecular weight $\langle M \rangle_z$ is measured in some sedimentation equilibrium experiments. It emphasizes large particles even more strongly than does the weight average molecular weight.

$$\langle M \rangle_z = \frac{\sum_i M_i^2 c_i}{\sum_i M_i c_i} = \frac{\sum_i M_i^3 N_i}{\sum_i M_i^2 N_i} \tag{1.2.3}$$

Average degrees of polymerization $\langle i \rangle$ or $\langle dp \rangle$ are obtained simply by replacing $M_i$ by $i$ in these equations.
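The three averages in eqs. 1.2.1 to 1.2.3 are ratios of successive moments of the molar distribution, which the following minimal Python sketch makes explicit. The function name and the example mixture (equimolar monomer, dimer, and tetramer with a hypothetical monomer weight of 16,000) are illustrative assumptions, not values from the text.

```python
def molecular_weight_averages(masses, molar_concs):
    """Return (Mn, Mw, Mz) from molecular weights M_i and molar concentrations N_i."""
    # k-th moment of the molar distribution: sum_i N_i * M_i**k
    def moment(k):
        return sum(n * m ** k for m, n in zip(masses, molar_concs))

    m0, m1, m2, m3 = (moment(k) for k in range(4))
    # Number, weight, and z averages are ratios of successive moments.
    return m1 / m0, m2 / m1, m3 / m2


# Hypothetical mixture: monomer, dimer, tetramer (i = 1, 2, 4), equimolar.
M1 = 16000.0                      # assumed monomer molecular weight
masses = [1 * M1, 2 * M1, 4 * M1]
molar_concs = [1.0e-6, 1.0e-6, 1.0e-6]

Mn, Mw, Mz = molecular_weight_averages(masses, molar_concs)
print(f"Mn = {Mn:.0f}, Mw = {Mw:.0f}, Mz = {Mz:.0f}")
```

For any mixture with more than one species the ordering Mn < Mw < Mz holds, which is a quick sanity check on measured or computed values.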

1.3

Averages of other quantities


Averages of other quantities are defined similarly. The mean-square radius of gyration $\langle R_G^2 \rangle_z$ determined from the angular dependence of light scattering intensity is a z-average [1]

$$\langle R_G^2 \rangle_z = \frac{\sum_i M_i c_i (R_G^2)_i}{\sum_i M_i c_i} \tag{1.3.1}$$

as is the average translational diffusion coefficient $\langle D \rangle_z$ measured in dynamic laser light scattering [3]

$$\langle D \rangle_z = \frac{\sum_i D_i M_i c_i P_i(q)}{\sum_i M_i c_i P_i(q)} \tag{1.3.2}$$

where $P_i(q)$ is the scattering structure factor of species $i$ at scattering vector $q$.

2

Polymerization mechanisms

We survey here some of the simple mechanisms that are most pertinent to biological systems.

2.1

Monomer-n-mer equilibrium

This is a good model for association of some oligomeric proteins, and for micelle formation. The reaction is

$$nP_1 \rightleftharpoons P_n, \tag{2.1.1}$$

the equilibrium association constant (with units of molar$^{-(n-1)}$) is

$$K = \frac{[P_n]}{[P_1]^n}, \tag{2.1.2}$$

and the equation for the conservation of total monomer is

$$[P]_{tot} = [P_1] + n[P_n]. \tag{2.1.3}$$

The number average and weight average molecular weights and degrees of polymerization are, from the combination of eqs. 1.2.1 and 1.2.2 with eq. 2.1.2,

$$\langle M \rangle_n = \frac{M_1[P_1] + M_n[P_n]}{[P_1] + [P_n]} = M_1\,\frac{1 + nK[P_1]^{n-1}}{1 + K[P_1]^{n-1}} = M_1 \langle dp \rangle_n, \tag{2.1.4}$$

$$\langle M \rangle_w = \frac{M_1^2[P_1] + M_n^2[P_n]}{M_1[P_1] + M_n[P_n]} = M_1\,\frac{1 + n^2K[P_1]^{n-1}}{1 + nK[P_1]^{n-1}} = M_1 \langle dp \rangle_w, \tag{2.1.5}$$

while the polydispersity ratio $\langle M \rangle_w / \langle M \rangle_n$ is

$$\frac{\langle M \rangle_w}{\langle M \rangle_n} = \frac{\left(1 + K[P_1]^{n-1}\right)\left(1 + n^2K[P_1]^{n-1}\right)}{\left(1 + nK[P_1]^{n-1}\right)^2}. \tag{2.1.6}$$

As $K[P_1]^{n-1}$ approaches either 0 or $\infty$, the polydispersity ratio approaches 1. A transition between monomer and n-mer occurs when $K[P_1]^{n-1}$ is approximately unity. This behavior is plotted in Figure 1, which also shows the very slow increase in monomer concentration once the critical concentration is reached.

Figure 1: Left: Number average and weight average degrees of polymerization as a function of $[P]_{tot}$ for the monomer-n-mer model with $n = 12$ and $K = 2^{n-1}/n$. Right: Dependence of free monomer concentration on $[P]_{tot}$. The inflection point corresponds to the critical micelle concentration.
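The curves described for Figure 1 can be regenerated numerically. The sketch below solves the conservation equation 2.1.3 (with $[P_n] = K[P_1]^n$) for the free monomer concentration by bisection and then evaluates eqs. 2.1.4 and 2.1.5. It assumes the caption's constant reads $K = 2^{n-1}/n$; the function names and the sampled total concentrations are illustrative choices.

```python
def free_monomer(p_tot, K, n, rel_tol=1e-12):
    """Solve p_tot = P1 + n*K*P1**n for the free monomer concentration P1."""
    lo, hi = 0.0, p_tot              # P1 can never exceed the total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + n * K * mid ** n > p_tot:
            hi = mid                 # too much material: lower the guess
        else:
            lo = mid
        if hi - lo < rel_tol * p_tot:
            break
    return 0.5 * (lo + hi)


def dp_averages(P1, K, n):
    """Number and weight average degrees of polymerization (eqs. 2.1.4, 2.1.5)."""
    x = K * P1 ** (n - 1)
    return (1 + n * x) / (1 + x), (1 + n * n * x) / (1 + n * x)


n = 12
K = 2 ** (n - 1) / n                 # assumed reading of the Figure 1 caption
results = {}
for p_tot in (0.1, 0.5, 1.0, 5.0):
    P1 = free_monomer(p_tot, K, n)
    results[p_tot] = (P1,) + dp_averages(P1, K, n)
    print(p_tot, results[p_tot])
```

Bisection is used rather than Newton's method because the left-hand side is monotonic in $[P_1]$ and very steep near the critical concentration, where a bracketing method is more robust.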

2.2

Unlimited association with equal probability of bond formation

In contrast to the previous model, in which only monomer and n-mer were allowed, this model allows monomer, dimer, trimer, ... to form in linear array, with equal equilibrium constant at each step. This mechanism is used in the synthetic polymer literature to model condensation polymerization (e.g. of nylon). In biochemistry, it can be used to model the stacking of monomeric nucleic acid bases in aqueous solution, or the noncovalent association of protein monomers into rodlike polymers (although essentially all such polymers, such as F-actin, tobacco mosaic virus, and microtubules, are multistart helices that obey a more complicated mechanism). It also predicts the polymer size distribution for random chain degradation by endolytic cleavage, in which all bonds have the same probability of being broken. The mechanism can be written

$$\begin{aligned} P_1 + P_1 &\rightleftharpoons P_2 \\ P_1 + P_2 &\rightleftharpoons P_3 \\ &\;\;\vdots \\ P_1 + P_{i-1} &\rightleftharpoons P_i \\ &\;\;\vdots \end{aligned} \tag{2.2.1}$$

The equilibrium constant is the same for all steps, a condition that is sometimes called isodesmic polymerization:

$$K = \frac{[P_{i+1}]}{[P_1][P_i]}, \qquad i = 1, 2, \ldots, \infty. \tag{2.2.2}$$

Thus $[P_2] = K[P_1]^2$, $[P_3] = K[P_1][P_2] = K^2[P_1]^3$, ..., or in general

$$[P_i] = K^{i-1}[P_1]^i = K^{-1}(K[P_1])^i = K^{-1}p^i, \qquad i = 1, 2, \ldots, \infty, \tag{2.2.3}$$

where $p = K[P_1]$.

The conservation of total monomer and the number and weight average degrees of polymerization are expressed by the equations

$$[P]_{tot} = \sum_{i=1}^{\infty} i[P_i] = K^{-1}\sum_{i=1}^{\infty} ip^i, \tag{2.2.4}$$

$$\langle dp \rangle_n = \frac{\sum_{i=1}^{\infty} i[P_i]}{\sum_{i=1}^{\infty} [P_i]} = \frac{\sum_{i=1}^{\infty} ip^i}{\sum_{i=1}^{\infty} p^i}, \tag{2.2.5}$$

$$\langle dp \rangle_w = \frac{\sum_{i=1}^{\infty} i^2[P_i]}{\sum_{i=1}^{\infty} i[P_i]} = \frac{\sum_{i=1}^{\infty} i^2p^i}{\sum_{i=1}^{\infty} ip^i}. \tag{2.2.6}$$

The summations are geometric series, which can be solved by a combination of a review of high-school algebra and a trick. The review is of the basic geometric series itself. If $S_n$ is the series carried out to $n$ terms, rather than to infinity,

$$S_n = 1 + p + p^2 + p^3 + \ldots + p^n = \sum_{i=0}^{n} p^i. \tag{2.2.7}$$

Multiplying both sides by $p$,

$$pS_n = p + p^2 + p^3 + \ldots + p^n + p^{n+1}. \tag{2.2.8}$$

Subtracting eq. 2.2.8 from eq. 2.2.7, we see that all terms cancel except the first and last:

$$(1 - p)S_n = 1 - p^{n+1}, \tag{2.2.9}$$

so

$$S_n = \frac{1 - p^{n+1}}{1 - p}. \tag{2.2.10}$$

If $p < 1$, $p^{n+1} \to 0$ as $n \to \infty$, so

$$S_\infty = S = \frac{1}{1 - p}, \qquad p < 1. \tag{2.2.11}$$

The sums in the equations for $\langle dp \rangle$ and $[P]_{tot}$ start at 1 rather than 0, so

$$\sum_{i=1}^{n} p^i = \frac{1 - p^{n+1}}{1 - p} - 1 = \frac{p - p^{n+1}}{1 - p} \to \frac{p}{1 - p} \quad \text{as } n \to \infty. \tag{2.2.12}$$

The trick is the following way to introduce powers of $i$ into the sums. Since differentiation and summation are commutative,

$$\frac{d}{dp}\sum_{i=1}^{\infty} p^i = \sum_{i=1}^{\infty} ip^{i-1}. \tag{2.2.13}$$

Then multiplying by $p$, using eq. 2.2.12, and rearranging,

$$\sum_{i=1}^{\infty} ip^i = p\,\frac{d}{dp}\sum_{i=1}^{\infty} p^i = p\,\frac{d}{dp}\left(\frac{p}{1 - p}\right) = \frac{p}{(1 - p)^2}. \tag{2.2.14}$$

Repeating the trick once more,

$$\sum_{i=1}^{\infty} i^2p^i = p\,\frac{d}{dp}\sum_{i=1}^{\infty} ip^i = \frac{p(1 + p)}{(1 - p)^3}. \tag{2.2.15}$$
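The three closed forms (eqs. 2.2.12, 2.2.14, and 2.2.15) are easy to check against brute-force partial sums. The following sketch does so for one illustrative value, $p = 0.9$; the function name and the number of terms are arbitrary choices made here, picked so that the neglected tail is far below floating-point precision.

```python
def partial_sums(p, n_terms=2000):
    """Brute-force sums of p**i, i*p**i and i**2*p**i for i = 1..n_terms."""
    s0 = sum(p ** i for i in range(1, n_terms + 1))
    s1 = sum(i * p ** i for i in range(1, n_terms + 1))
    s2 = sum(i * i * p ** i for i in range(1, n_terms + 1))
    return s0, s1, s2


p = 0.9
numeric = partial_sums(p)
closed = (
    p / (1 - p),                  # eq. 2.2.12, infinite-n limit
    p / (1 - p) ** 2,             # eq. 2.2.14
    p * (1 + p) / (1 - p) ** 3,   # eq. 2.2.15
)
for num, exact in zip(numeric, closed):
    print(num, exact)
```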

Thus we find

$$[P]_{tot} = K^{-1}\frac{p}{(1 - p)^2} = \frac{[P_1]}{(1 - K[P_1])^2}, \tag{2.2.16}$$

which rearranges after solving the quadratic to

$$K[P_1] = p = 1 + \frac{1}{2K[P]_{tot}} - \left(\frac{1}{K[P]_{tot}} + \frac{1}{4K^2[P]_{tot}^2}\right)^{1/2}. \tag{2.2.17}$$

Substituting into eqs. 2.2.5 and 2.2.6, we find

$$\langle dp \rangle_n = \frac{1}{1 - p} \tag{2.2.18}$$

and

$$\langle dp \rangle_w = \frac{1 + p}{1 - p}. \tag{2.2.19}$$

Note that large degrees of polymerization occur only when $p$ is very close to 1. This is similar to a multi-step synthesis having a very high yield only when each step has a yield close to 100%. The polydispersity index $\langle dp \rangle_w / \langle dp \rangle_n = 1 + p \to 2$ when $p \to 1$. The behavior of $K[P_1]$ and $\langle dp \rangle_n$ is plotted as a function of $[P]_{tot}$ in Figure 2. One can also calculate the distribution of mole fractions and weight fractions:

$$X_i = \frac{[P_i]}{\sum_{j=1}^{\infty} [P_j]} = \frac{p^i}{\sum_{j=1}^{\infty} p^j} = p^{i-1}(1 - p), \tag{2.2.20}$$

$$w_i = \frac{i[P_i]}{\sum_{j=1}^{\infty} j[P_j]} = \frac{ip^i}{\sum_{j=1}^{\infty} jp^j} = ip^{i-1}(1 - p)^2. \tag{2.2.21}$$

Another derivation of these results, due to Flory [4], is simple and instructive. Let $p$ be the probability that one monomer will be bonded to the next, so $(1 - p)$ is the probability of no bond. Then the probability of finding a molecule with exactly $i - 1$ bonds (which equals the mole fraction of i-mers) is

$$X_i = p^{i-1}(1 - p). \tag{2.2.22}$$

This can be compared with eq. 2.2.20, showing that $p = K[P_1]$ is the probability of bond formation. Flory has called eq. 2.2.20 the most probable distribution.

Figure 2: Left: Free monomer concentration $K[P_1]$ (solid line, left ordinate) and number average degree of polymerization (dotted line, right ordinate) as functions of the total monomer concentration $K[P]_{tot}$ for the isodesmic polymerization model. Right: Mole fraction $X_i$ (solid line) and weight fraction $w_i$ (dotted line) as functions of degree of polymerization $i$ for $p = 0.9$. Note that, according to eq. 2.2.16, this corresponds to $K[P]_{tot} = 90$.
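As a worked check of the isodesmic results, the sketch below evaluates the bond probability from the dimensionless total concentration $K[P]_{tot}$ (eq. 2.2.17), then the averages (eqs. 2.2.18 and 2.2.19) and the mole and weight fraction distributions (eqs. 2.2.20 and 2.2.21). The function names and the cutoff `i_max` are illustrative assumptions; the input value 90 is the one quoted in the Figure 2 caption.

```python
import math


def bond_probability(kp_tot):
    """p = K[P1] as a function of the dimensionless total K[P]tot (eq. 2.2.17)."""
    t = kp_tot
    return 1 + 1 / (2 * t) - math.sqrt(1 / t + 1 / (4 * t * t))


def isodesmic_distribution(p, i_max):
    """Mole fractions X_i and weight fractions w_i for i = 1..i_max."""
    X = [p ** (i - 1) * (1 - p) for i in range(1, i_max + 1)]
    w = [i * p ** (i - 1) * (1 - p) ** 2 for i in range(1, i_max + 1)]
    return X, w


p = bond_probability(90.0)          # K[P]tot = 90, as in the Figure 2 caption
dp_n = 1 / (1 - p)                  # eq. 2.2.18
dp_w = (1 + p) / (1 - p)            # eq. 2.2.19
X, w = isodesmic_distribution(p, i_max=500)
print(p, dp_n, dp_w, sum(X), sum(w))
```

Both distributions should sum to 1 once `i_max` is large enough that the neglected tail is negligible, which is a convenient check on any truncated numerical treatment.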

2.3

Phosphorylase-catalyzed polymerization

A biologically relevant variant of the unlimited, equal-probability mechanism is the linear polymerization of polysaccharides or polyribonucleotides by phosphorylase enzymes. It differs from the previous mechanism by starting with a primer, so that the minimum length is not 1, and by consuming phosphorylated monomer and liberating inorganic phosphate, so the bond formation probability depends on the concentration of free monomer and phosphate in solution. It is instructive to consider these additional details, as they have been worked out by Peller [5], whose work we follow closely. The reaction for the $i$th step can be written

$$AX_{i-1} + XP \rightleftharpoons AX_i + \mathrm{P_i} \tag{2.3.1}$$

and the equilibrium constant

$$K_i = \frac{[AX_i][\mathrm{P_i}]}{[AX_{i-1}][XP]}, \tag{2.3.2}$$

where $AX_i$ is the polymer containing $i$ reactive X units, XP is phosphorylated monomer (e.g., glucose-1-phosphate or nucleoside diphosphate), and $\mathrm{P_i}$ is inorganic phosphate. Proceeding as before, with $K_i = K$ for all steps, we find

$$[AX_i] = [A](K\rho)^i = [A]p^i, \tag{2.3.3}$$

where $A = AX_0$ is the primer of minimum size and $\rho = [XP]/[\mathrm{P_i}]$. The probability of bond formation $p = K\rho$ now depends on the equilibrium ratio of monomer feed to phosphate product. Three conservation equations must be obeyed. Conservation of chain ends, if polymerization takes place on pre-existing primer chains:

$$\sum_{i=0}^{\infty} [AX_i] = [A]\sum_{i=0}^{\infty} p^i = \frac{[A]}{1 - p} = \sum_{i=0}^{\infty} [AX_i]_0; \tag{2.3.4}$$

conservation of monomer:

$$\sum_{i=0}^{\infty} i[AX_i] = [A]\sum_{i=0}^{\infty} ip^i = \frac{[A]p}{(1 - p)^2} = \sum_{i=0}^{\infty} i[AX_i]_0 + [XP]_0 - [XP]; \tag{2.3.5}$$

and conservation of phosphate:

$$[XP] + [\mathrm{P_i}] = [XP]_0 + [\mathrm{P_i}]_0. \tag{2.3.6}$$

In eqs. 2.3.4 and 2.3.5 we have used the geometrical series summations we derived earlier. We can eliminate $[A]$ from these equations and derive an expression for $p$:

$$p = K\rho = \frac{\nu_0\left[1 + \dfrac{1}{\gamma_0}\,\dfrac{1 - \rho/\rho_0}{1 + \rho}\right]}{\nu_0\left[1 + \dfrac{1}{\gamma_0}\,\dfrac{1 - \rho/\rho_0}{1 + \rho}\right] + 1}, \tag{2.3.7}$$

where

$$\nu_0 = \frac{\sum_{i=0}^{\infty} i[AX_i]_0}{\sum_{i=0}^{\infty} [AX_i]_0}, \qquad \gamma_0 = \frac{\sum_{i=0}^{\infty} i[AX_i]_0}{[XP]_0}, \qquad \rho_0 = \frac{[XP]_0}{[\mathrm{P_i}]_0}. \tag{2.3.8}$$

That is, $\nu_0$ is the initial number average degree of polymerization of the primer molecules (beyond the minimal primer size $dp_{min}$), which may be polydisperse, so the initial $\langle dp \rangle_n = dp_{min} + \nu_0$. The parameter $\gamma_0$ is the initial ratio of monomer building blocks X incorporated in primer to the monomer in solution. Proceeding as before, we obtain the number average $dp$ including primer,

$$\langle dp \rangle_n = \frac{\sum_{i=0}^{\infty} (i + dp_{min})[AX_i]}{\sum_{i=0}^{\infty} [AX_i]} = \nu + dp_{min} = \nu_0\left[1 + \frac{1}{\gamma_0}\,\frac{1 - \rho/\rho_0}{1 + \rho}\right] + dp_{min}, \tag{2.3.9}$$

so that the number average extension beyond the initial primer is

$$\nu - \nu_0 = \frac{\nu_0}{\gamma_0}\,\frac{1 - \rho/\rho_0}{1 + \rho}. \tag{2.3.10}$$

This formula describes both polymerization ($\nu > \nu_0$ and $\rho < \rho_0$) and phosphorolysis ($\nu < \nu_0$ and $\rho > \rho_0$). From these equations we find that the analog of eq. 2.2.22 is

$$X_i = \frac{[AX_i]}{\sum_{i=0}^{\infty} [AX_i]} = p^i(1 - p), \tag{2.3.11}$$

where the power $i$ rather than $i - 1$ arises because the i-mer contains $i$ bonds, rather than the $i - 1$ it would contain without a primer. It can be shown [5] that with the initial conditions of no inorganic phosphate and only the limit primer present ($\nu_0 = 0$),

$$p \approx \frac{1}{1 + [A]_0/[XP]_0}, \tag{2.3.12}$$

under the common experimental condition of $[A]_0/[XP]_0 \ll 1$ and $K$ of order unity or larger. This low molar ratio of chain ends compared to monomer feed leads to $p \approx 1$ and relatively long polymer chains.
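The limiting result for primer-initiated growth can be explored numerically. The sketch below is a reconstruction from the three conservation conditions for the special case of limit primer only and no initial phosphate, not a transcription of Peller's derivation. With $x$ the concentration of monomer incorporated (equal to the phosphate released), chain-end conservation gives $p/(1-p) = x/[A]_0$, and the equilibrium condition gives $p = K([XP]_0 - x)/x$; equating the two yields a quadratic in $x$. The function name, argument names, and numerical values are assumptions made for illustration.

```python
import math


def bond_probability_limit_primer(K, A0, XP0):
    """Bond probability p for limit primer only and no initial phosphate.

    x = monomer incorporated = phosphate released.
    Chain ends:  p / (1 - p) = x / A0        ->  p = x / (x + A0)
    Equilibrium: p = K * [XP] / [Pi]         =   K * (XP0 - x) / x
    Equating:    (1 + K) x^2 - K (XP0 - A0) x - K XP0 A0 = 0
    """
    a = 1.0 + K
    b = -K * (XP0 - A0)
    c = -K * XP0 * A0
    x = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)   # the positive root
    return x / (x + A0)


A0, XP0 = 0.001, 1.0                 # hypothetical concentrations, [A]0/[XP]0 << 1
approx = 1.0 / (1.0 + A0 / XP0)      # limiting form, eq. 2.3.12
exact = {K: bond_probability_limit_primer(K, A0, XP0) for K in (1.0, 10.0, 100.0)}
for K, p in exact.items():
    print(K, p, approx, 1 / (1 - p))
```

In this reconstruction the limiting formula becomes more accurate as $K$ grows, because a larger equilibrium constant drives a larger share of the monomer feed into polymer.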

2.4

Polymerization on preexisting nuclei

This may occur with irreversible polymerization on primers (if there is reversibility then the most probable distribution of the previous section is found), or when there is a critical nucleus that must be formed and the rate of nucleation is much slower than subsequent polymerization. In polymer chemistry, this type of mechanism is known as monomer addition without termination. In biology, examples may be extension of DNA or RNA primers, and polymerization of bacteriophage tails on baseplate structures. Let there be $n$ nuclei, hence $n$ polymer chains, with $m$ monomers to be distributed among them. The average chain length is

$$r = \langle dp \rangle_n = m/n. \tag{2.4.1}$$

The probability that a given polymer chain will be chosen by any given monomer is $1/n$, so the probability of exactly $i$ particular monomers associating with a given chain, while the other $m - i$ do not, is $(1/n)^i(1 - 1/n)^{m-i}$. According to the standard combinatorial formula, there are $m!/[i!(m - i)!]$ ways of choosing these $i$ monomers from the $m$ available. Thus the fraction of chains that contain $i$ monomers is

$$X_i = \frac{m!}{i!(m - i)!}\left(\frac{1}{n}\right)^i\left(1 - \frac{1}{n}\right)^{m-i}, \tag{2.4.2}$$

which is a Bernoulli (binomial) distribution. For highly polymerized systems one will have $m \gg n$ and $m \gg i$, so $m!/(m - i)! = m(m - 1)(m - 2)\cdots(m - i + 1) \approx m^i$, and since in molecular systems we can also assume $n \gg 1$, we obtain

$$\left(1 - \frac{1}{n}\right)^{m-i} \approx \left(1 - \frac{1}{n}\right)^{m} \approx e^{-m/n}, \tag{2.4.3}$$

so using eq. 2.4.1 we find

$$X_i \approx \frac{m^i}{i!}\left(\frac{1}{n}\right)^i e^{-m/n} = \frac{r^i e^{-r}}{i!}, \tag{2.4.4}$$

which is a Poisson distribution. The corresponding weight distribution is

$$w_i = \frac{r}{r + 1}\,\frac{(i + 1)\,e^{-r}\,r^{i-1}}{i!}. \tag{2.4.5}$$

This is a narrow distribution for large $r$, as we see by calculating the number and weight average degrees of polymerization and the polydispersity index:

$$\langle dp \rangle_n = \sum_{i=0}^{\infty} iX_i = e^{-r}\sum_{i=0}^{\infty} \frac{ir^i}{i!} = e^{-r}\,r\,\frac{d}{dr}\sum_{i=0}^{\infty} \frac{r^i}{i!}. \tag{2.4.6}$$

But by the familiar Taylor's series,

$$\sum_{i=0}^{\infty} \frac{r^i}{i!} = e^r, \tag{2.4.7}$$

so

$$\langle dp \rangle_n = r, \tag{2.4.8}$$

which is of course consistent with the definition of $r$ in eq. 2.4.1. By a similar series of manipulations we obtain

$$\langle dp \rangle_w = \frac{\sum_{i=0}^{\infty} i^2X_i}{\sum_{i=0}^{\infty} iX_i} = \frac{1}{r}\sum_{i=0}^{\infty} \frac{i^2r^i}{i!}\,e^{-r} = e^{-r}\sum_{i=1}^{\infty} \frac{ir^{i-1}}{(i - 1)!} = e^{-r}\sum_{j=0}^{\infty} \frac{(j + 1)r^j}{j!}, \tag{2.4.9}$$

or

$$\langle dp \rangle_w = r + 1. \tag{2.4.10}$$

Thus the polydispersity index is

$$\frac{\langle dp \rangle_w}{\langle dp \rangle_n} = 1 + \frac{1}{r} \approx 1 \quad \text{for } r \gg 1. \tag{2.4.11}$$

This shows that nucleation-limited polymerization is an effective mechanism for sharp control of size in biomolecular assembly reactions. The distribution function for mole fractions is shown in Figure 3.

Figure 3: Mole fraction of species containing $i$ monomers, according to the nucleated polymerization model, for number average degree of polymerization $r = 10$ (solid) and 20 (dotted). The weight fraction distribution is essentially coincident with the mole fraction.
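The passage from the exact combinatorial distribution to the Poisson limit, and the resulting averages $r$ and $r + 1$, can be checked numerically. The sketch below uses hypothetical counts of 1000 nuclei and 10,000 monomers (so $r = 10$); the function names and the cutoff of 200 terms are illustrative choices.

```python
import math


def binomial_fraction(i, m, n):
    """Exact fraction of chains holding i of the m monomers (eq. 2.4.2)."""
    return math.comb(m, i) * (1.0 / n) ** i * (1.0 - 1.0 / n) ** (m - i)


def poisson_fractions(r, i_max):
    """Poisson mole fractions X_0..X_i_max (eq. 2.4.4), built term by term."""
    fractions = [math.exp(-r)]
    for i in range(1, i_max + 1):
        fractions.append(fractions[-1] * r / i)   # X_i = X_(i-1) * r / i
    return fractions


def dp_averages(fractions):
    """Number and weight average degrees of polymerization from mole fractions."""
    first = sum(i * x for i, x in enumerate(fractions))
    second = sum(i * i * x for i, x in enumerate(fractions))
    return first, second / first


n_nuclei, m_monomers = 1000, 10000       # hypothetical counts, r = 10
r = m_monomers / n_nuclei
X = poisson_fractions(r, 200)
dp_n, dp_w = dp_averages(X)
exact_10 = binomial_fraction(10, m_monomers, n_nuclei)
print(dp_n, dp_w, dp_w / dp_n, exact_10, X[10])
```

Building the Poisson terms by recursion avoids forming very large factorials, which would otherwise overflow when converted to floating point.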

2.5

Nucleated polymerization of helical proteins

We saw in the previous sections that very long polymers are impossible to achieve at equilibrium unless the probability $p$ of bond formation is essentially equal to unity, or unless polymerization is irreversible. This limitation can be overcome if polymerization is controlled by the formation of a nucleus of marginal stability and very low concentration, but upon which extensive polymerization can occur once it is formed.

Such a situation can arise in protein polymerization when the first several subunits are less strongly bound than those added later. This could occur simply because the first subunits have fewer neighbors to which to bind (for example, each monomer in a dimer has only one neighbor, while each monomer in a closed trimer has two neighbors) or because some strain energy must be expended to distort the initial oligomer into a conformation suitable for subsequent extension. We can construct a generally useful framework for considering nucleated polymerization without specifying the mechanistic details [7]. It suffices to specify two free energies, $\Delta G^0_{nuc}$ per subunit for nucleating the polymer, and $\Delta G^0_{prop}$ per subunit for propagating it. If the number of monomers in the nucleus is $i_0$, then the total free energy of the polymer containing $i$ monomers is

$$\Delta G^0_i = i_0\,\Delta G^0_{nuc} + (i - i_0)\,\Delta G^0_{prop}, \qquad i > i_0. \tag{2.5.1}$$

Of course, $\Delta G^0_{nuc}$ will be an average over the individual free energies of adding a subunit up to the critical nucleus size. However, since we are interested in the size distribution of the helical polymers rather than of the nucleus precursors, this will make no difference to the conclusions. This equation can be rewritten

$$\Delta G^0_i = i_0\left(\Delta G^0_{nuc} - \Delta G^0_{prop}\right) + i\,\Delta G^0_{prop}, \qquad i > i_0, \tag{2.5.2}$$

so using the familiar $\Delta G^0 = -RT \ln K_{eq}$ and proceeding as in our discussion of unlimited association with equal probability of bond formation, we see that for $i > i_0$,

$$[P_i] = \sigma K^{i-1}[P_1]^i = \sigma K^{-1}(K[P_1])^i, \tag{2.5.3}$$

where

$$K = e^{-\Delta G^0_{prop}/RT} \tag{2.5.4}$$

and

$$\sigma = e^{-i_0\left(\Delta G^0_{nuc} - \Delta G^0_{prop}\right)/RT}. \tag{2.5.5}$$

Since $\Delta G^0_{nuc} > \Delta G^0_{prop}$ (both are negative if they lead to stable bonding among the subunits, but $\Delta G^0_{prop}$ is more negative), $\sigma$ is less than one. In fact, reasonable estimates for the parameters are $i_0 = 4$, $\Delta G^0_{nuc} - \Delta G^0_{prop} = 5$ kcal/mol, and $RT = 0.6$ kcal/mol, leading to $\sigma \approx 3 \times 10^{-15}$. Other estimates of $\sigma$ might vary by orders of magnitude, but the inescapable conclusion is that $\sigma \ll 1$. Therefore, the ratio of nucleating to propagating species will also be $\ll 1$. Even though the concentration of nuclei will be very small, they must be present if high polymers are to be formed. As we shall now see, this can happen only when the concentration of monomer is very close to a critical concentration. At that point, polymerization occurs abruptly, producing very long polymers. If the concentrations of nuclei and pre-nuclei with $i \le i_0$ can be neglected, the total concentration of monomers $[P]_{tot}$ is
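The size of the nucleation parameter follows directly from the quoted estimates. The short sketch below evaluates $\exp[-i_0(\Delta G^0_{nuc} - \Delta G^0_{prop})/RT]$ with $i_0 = 4$, a free energy difference of 5 kcal/mol, and $RT = 0.6$ kcal/mol; the function and argument names are illustrative, and the symbol $\sigma$ for this parameter is the conventional choice assumed here.

```python
import math


def nucleation_parameter(i0, ddG_kcal, RT_kcal):
    """sigma = exp(-i0 * (dG_nuc - dG_prop) / RT), all energies in kcal/mol."""
    return math.exp(-i0 * ddG_kcal / RT_kcal)


sigma = nucleation_parameter(i0=4, ddG_kcal=5.0, RT_kcal=0.6)
print(f"sigma = {sigma:.2e}")

# Sensitivity: changing the free energy difference by 1 kcal/mol either way
# shifts sigma by several orders of magnitude, yet it stays far below 1.
for ddG in (4.0, 5.0, 6.0):
    print(ddG, nucleation_parameter(4, ddG, 0.6))
```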

$$[P]_{tot} = [P_1] + \sum_{i=i_0+1}^{\infty} i[P_i] = [P_1] + \sigma K^{-1}\sum_{i=i_0+1}^{\infty} ip^i, \tag{2.5.6}$$

where $p = K[P_1]$. Summing the geometric series beginning at $i = 1$ and then subtracting the extra terms, we obtain

$$\begin{aligned} [P]_{tot} &= [P_1] + \sigma K^{-1}\left[\frac{p}{(1 - p)^2} - \sum_{i=1}^{i_0} ip^i\right] \\ &= [P_1] + \frac{\sigma[P_1]}{(1 - K[P_1])^2} - \sigma[P_1] - 2\sigma K[P_1]^2 - \ldots \end{aligned} \tag{2.5.7}$$

All terms except $[P_1]$ will be negligible until $[P_1]$ increases to $[P_1]_{crit} \approx 1/K$. At this critical concentration the denominator approaches zero, so the term representing polymer increases very rapidly while the terms representing species with $i \le i_0$ remain negligible. If we define the concentration of monomers in helical polymers as $[P_h]$, we have

$$[P]_{tot} = [P_1] + [P_h], \tag{2.5.8}$$

where

$$[P_h] = \frac{\sigma[P_1]}{(1 - K[P_1])^2}. \tag{2.5.9}$$

Substitution of $[P_1]_{crit}$ for $[P_1]$ at or above the critical concentration, and some algebraic rearrangement, then leads to

$$\langle i \rangle_n = \frac{1}{1 - p} = \frac{1}{1 - K[P_1]} = \sigma^{-1/2}\left(\frac{[P]_{tot}}{[P_1]_{crit}} - 1\right)^{1/2} = \sigma^{-1/2}\left(\frac{[P_h]}{[P_1]_{crit}}\right)^{1/2}, \tag{2.5.10}$$

where we have used eq. 2.2.5. The weight average molecular weight is likewise given by eq. 2.2.6 (although in both cases the summations start at $i_0 + 1$ rather than at 1), so the polydispersity index $\langle i \rangle_w / \langle i \rangle_n \approx 2$, just as it is for unnucleated polymerization. The behavior of this system is plotted in Figure 4. We observe that even if only a small fraction of subunits are in the helical form ($[P_h]/[P_1]_{crit} < 1$), the factor $\sigma^{-1/2}$ drives the degree of polymerization abruptly up.

Figure 4: Schematic behavior of a system undergoing nucleated polymerization, after Oosawa and Asakura [7].
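To see how abruptly the chain length rises above the critical concentration, the following sketch evaluates the closing relation of this section, $\langle i \rangle_n = \sigma^{-1/2}([P]_{tot}/[P_1]_{crit} - 1)^{1/2}$, with the nucleation parameter written here as $\sigma$. The value $\sigma = 3 \times 10^{-15}$ is the estimate quoted in the text; the function name and the sampled concentration ratios are illustrative assumptions.

```python
import math


def mean_polymer_length(ptot_over_crit, sigma):
    """Number average length of helical polymers above the critical concentration.

    ptot_over_crit is the ratio [P]tot / [P1]crit, which must exceed 1.
    """
    if ptot_over_crit <= 1.0:
        raise ValueError("no helical polymer below the critical concentration")
    return math.sqrt((ptot_over_crit - 1.0) / sigma)


sigma = 3.0e-15                      # estimate quoted in the text
lengths = {ratio: mean_polymer_length(ratio, sigma) for ratio in (1.01, 1.1, 2.0)}
for ratio, length in lengths.items():
    print(f"[P]tot/[P1]crit = {ratio}: <i>_n = {length:.3g}")
```

Even a 1% excess over the critical concentration gives chains of order a million subunits with this value of $\sigma$, which illustrates the abrupt onset described above.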

References
[1] Tanford, C. (1961), Physical Chemistry of Macromolecules, Wiley, New York.
[2] van Holde, K.E., Johnson, W.C. and Ho, P.S. (1998), Principles of Physical Biochemistry, Prentice Hall.
[3] Koppel, D.E. (1972) J. Chem. Phys. 57: 4814-4820.
[4] Flory, P.J. (1953), Principles of Polymer Chemistry, Cornell University Press, Ithaca, Ch. 8.
[5] Peller, L. (1961) Biochim. Biophys. Acta 47: 61-65.
[6] Peller, L. (1966) Proc. Natl. Acad. Sci. USA 55: 1025-1031.
[7] Oosawa, F. and Asakura, S. (1975), Thermodynamics of the Polymerization of Protein, Academic Press, New York, pp. 28-35.
