

Non-Darwinian Evolution

Most evolutionary change in proteins may be due to neutral mutations and genetic drift.

Jack Lester King and Thomas H. Jukes

Dr. King is a biophysicist and geneticist for the Donner Laboratory and Dr. Jukes is associate director of the Space Sciences Laboratory, University of California, Berkeley 94720.

Darwinism is so well established that it is difficult to think of evolution except in terms of selection for desirable characteristics and advantageous genes. New technical developments and new knowledge, such as the sequential analysis of proteins and the deciphering of the genetic code, have made a much closer examination of evolutionary processes possible, and therefore necessary. Patterns of evolutionary change that have been observed at the phenotypic level do not necessarily apply at the genotypic and molecular levels. We need new rules in order to understand the patterns and dynamics of molecular evolution.

Evolutionary change at the morphological, functional, and behavioral levels results from the process of natural selection, operating through adaptive changes in DNA. It does not necessarily follow that all, or most, evolutionary change in DNA is due to the action of Darwinian natural selection. There appears to be considerable latitude at the molecular level for random genetic changes that have no effect upon the fitness of the organism. Selectively neutral mutations, if they occur, become passively fixed as evolutionary changes through the action of random genetic drift.

The idea of selectively neutral change at the molecular level has not been readily accepted by many classical evolutionists, perhaps because of the pervasiveness of Darwinian thought. Change in DNA and protein, when it is thought of at all, is thought to be limited to a response to activities at a higher level. For example, Simpson (1) quotes Weiss (2) as stating that there is a cellular control of molecular activities, and Simpson adds that there is also an organismal control of cellular activities and a populational control of organismal activities, and concludes (1):

  The consensus is that completely neutral genes or alleles must be very rare if they exist at all. To an evolutionary biologist, it therefore seems highly improbable that proteins, supposedly fully determined by genes, should have nonfunctional parts, that dormant genes should exist over periods of generations, or that molecules should change in a regular but nonadaptive way . . . [natural selection] is the composer of the genetic message, and DNA, RNA, enzymes, and other molecules in the system are successively its messengers.

We cannot agree with Simpson that DNA is a passive carrier of the evolutionary message. Evolutionary change is not imposed upon DNA from without; it arises from within. Natural selection is the editor, rather than the composer, of the genetic message. One thing the editor does not do is to remove changes which it is unable to perceive.

The view that mutations cannot be selectively neutral is not confined to organismal evolutionists. Smith (3) states:

  One of the objectives of protein chemistry is to have a full and comprehensive understanding of all the possible roles that the 20 amino acids can play in function and conformation. Each of these amino acids must have a unique survival value in the phenotype of the organism, the phenotype being manifested in the structures of the proteins. This is as true for a single protein as for the whole organism.
788 SCIENCE, VOL. 164
To hold that selectively neutral isoalleles cannot occur is equivalent to maintaining that there is one and only one optimal form for every gene at any point in evolutionary time. We think that life is not so inflexible.

Fixation of Selectively Neutral Isoalleles

Drift is slow but effective in the fixation of neutral mutations. As pointed out by Kimura (4), the rate of random fixation of neutral mutations in evolution (per species per generation) is equal to the rate of occurrence of neutral mutations (per gamete per generation).

Of the 2N copies of a gene in a population of N individuals at one point in evolutionary time, only one is destined to become the ancestor (through replication) of all copies of the gene that will be in existence in the species in the distant evolutionary future. The process by which one line becomes fixed has been called "genetic drift," "random walk," or "branching process." If all copies of the gene are selectively equivalent, all have equal chances of becoming the common ancestor. Thus, if a newly occurring mutation is selectively neutral, its probability of becoming fixed through random drift is 1/2N (5). If m1 is the rate of occurrence of selectively neutral mutations per functional gamete, the expected number of newly occurring neutral isoalleles in the species is 2Nm1 per generation. Only a small proportion of these will become fixed by chance; the rate of occurrence of neutral isoalleles destined to become so fixed is 1/2N × 2Nm1, or m1 per generation. Thus, the rate of non-Darwinian evolutionary change is a function only of the rate of occurrence of neutral mutations and is independent of population size.

The gene frequency of a neutral allele fluctuates randomly from generation to generation. Eventually the "random walk" of the gene frequency goes to the ground states of loss or fixation. The evolutionary time scale allows for many such fixations in the divergence of species.

Mutations to Synonymous Codons

Because of the degeneracy of the genetic code, some DNA base-pair changes in structural genes are without effect on protein structure. Specifically, there are 61 amino-acid-specifying codons. Since each of the three base pairs can mutate in any of three ways, each codon can mutate in any of nine ways by single substitution. Of the 549 possible single-base substitutions, 134 (one-fourth) are substitutions to synonymous codons (6). These are heritable changes in the genetic material, hence true mutations. As far as is known, synonymous mutations are truly neutral with respect to natural selection.

Comparing Evolution in Protein and in DNA

Species divergence can be measured at the protein level through sequence analysis, and independently at the DNA level through in vitro hybridization (7, 8). The measurement of DNA species divergence is complicated by the existence of repetitive DNA sequences of unknown function; discovery of these sequences has been the most important finding of the hybridization experiments (9). Laird et al. (8) find that the slowly reassociating "unique-sequence" DNA of the mouse and the rat have diverged to the extent that almost half is unable to form interspecific hybrid molecules. They estimate that, in the 54 percent of mouse unique-sequence DNA that does form hybrid molecules with rat DNA, about 15 percent of the nucleotide bases are improperly paired, due to divergent evolution. If the hybridizable fraction of unique-sequence DNA is typical of structural DNA in the mouse and rat, evidently 15 percent is a minimum estimate of nucleotide divergence.

If most DNA species divergence were due to adaptive evolution, then one should expect that the first two nucleotide positions of each codon would change more rapidly than the third position, since synonymous mutations are unlikely to be adaptive. But if DNA divergence in evolution includes the random fixation of neutral mutations, then the third-position nucleotides should change more rapidly, because synonymous mutations are more likely to be neutral.

If the 15 percent of base differences between the DNA's of mice and rats were distributed randomly in structural genes, (0.85)^3, or 61 percent, of all codons would remain identical. About one-fourth of the remaining codons would be synonymous in the two species, so that one would expect about 70 percent of the amino acid positions in mouse and rat proteins to be identical and 30 percent to be different. Unfortunately, there have been no studies of amino acid sequence reported for homologous proteins in the two species. But, if the time since divergence from the last mouse-rat common ancestor is 9 million years (7), as estimated, a difference in 30 percent of the amino acid positions would represent an evolutionary rate of 17 × 10^-9 substitution per codon per year. This is ten times the estimated average rate of protein evolution in mammalian species (see Table 1).

Walker (7), using quite different procedures and criteria, estimated that 13 percent of the nucleotide positions are occupied by different bases in the DNA's of the mouse and the rat. He concluded that the large discrepancy between the evolutionary rate for DNA and that for protein sequences implied

Table 1. Rates of amino acid substitutions in mammalian evolution. [From data in Table 5; for sources, see (53)]

Columns: (a) total number of comparisons of amino acids; (b) observed number of amino acid differences; (c) observed number of differences per codon; (d) estimated number of substitutions per codon; (e) 10^-10 substitutions per codon per year.*

Protein                                            (a)    (b)    (c)      (d)      (e)
1. Insulin A and B                                 510    24     0.047    0.049     3.3
2. Cytochrome c                                   1040    63     0.061    0.063     4.2
3. Hemoglobin α-chain                              423    58     0.137    0.149     9.9
4. Hemoglobin β-chain                              438    63     0.144    0.155    10.3
5. Ribonuclease                                    124    40     0.323    0.390    25.3
6. Immunoglobulin light chain (constant half)      102    40     0.392    0.498    33.2
7. Fibrinopeptide A                                160    76     0.475    0.644    42.9
8. Bovine hemoglobin fetal chain                   438    97     0.221    0.250    22.9†
9. Guinea pig insulin                              255    86     0.337    0.411    53.1‡

* The estimate for time elapsed since the divergence of euplacental mammalian orders is 75 million years. The average rate of evolution for the seven protein species represented by entries 1 through 7 is 16 × 10^-10 substitution per codon per year.
† Bovine line of descent only.
‡ Guinea pig line of descent only.
Positive natural selection has probably been a factor in the evolution of bovine fetal hemoglobin and guinea pig insulin.
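The derived columns of Table 1 can be approximated from the raw counts. The sketch below rests on two assumptions that are inferred from the tabulated values rather than stated in the text: that the estimated number of substitutions per codon is the Poisson-corrected quantity -ln(1 - p), where p is the observed fraction of differing sites, and that the yearly rate divides this by twice the 75-million-year divergence time, since change accumulates along both lines of descent. The function and variable names are illustrative only.

```python
import math

DIVERGENCE_YEARS = 75e6  # divergence time given in the footnote to Table 1

# (protein, comparisons, observed differences) for entries 1 through 7 of Table 1.
# The alpha-chain count of 423 follows Table 5 (141 residues x 3 comparisons).
TABLE_1 = [
    ("Insulin A and B", 510, 24),
    ("Cytochrome c", 1040, 63),
    ("Hemoglobin alpha-chain", 423, 58),
    ("Hemoglobin beta-chain", 438, 63),
    ("Ribonuclease", 124, 40),
    ("Immunoglobulin light chain (constant half)", 102, 40),
    ("Fibrinopeptide A", 160, 76),
]


def substitution_rate(comparisons, differences, years=DIVERGENCE_YEARS):
    """Return (differences per codon, estimated substitutions per codon, rate per year)."""
    p = differences / comparisons
    # Assumed Poisson correction for repeated substitutions at the same site.
    k = -math.log(1.0 - p)
    # Assumed: both lines of descent contribute, hence twice the divergence time.
    rate = k / (2.0 * years)
    return p, k, rate


if __name__ == "__main__":
    for name, n, d in TABLE_1:
        p, k, rate = substitution_rate(n, d)
        print(f"{name:45s} {p:.3f} {k:.3f} {rate / 1e-10:5.1f}")
```

Under these assumptions the computed values track the table closely, although a few entries differ from the printed figures in the last digit.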
16 MAY 1969 789
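The codon counts given under "Mutations to Synonymous Codons" and the mouse-rat arithmetic under "Comparing Evolution in Protein and in DNA" can be checked mechanically. The sketch below assumes the standard genetic code, written in the conventional TCAG ordering, and assumes that the yearly rate is obtained by dividing the fraction of differing positions by twice the divergence time, which is the reading that reproduces the figure of 17 × 10^-9. All function names are hypothetical and chosen for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a terminator.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_single_base_substitutions(code=CODE):
    """Return (total, synonymous) single-base substitutions over all sense codons."""
    total = synonymous = 0
    for codon, aa in code.items():
        if aa == "*":
            continue  # only amino-acid-specifying codons are counted as starting points
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if code[mutant] == aa:
                    synonymous += 1
    return total, synonymous


def expected_identical_positions(base_divergence, synonymous_fraction=0.25):
    """Fraction of amino acid positions expected to stay identical."""
    unchanged_codons = (1.0 - base_divergence) ** 3
    return unchanged_codons + synonymous_fraction * (1.0 - unchanged_codons)


def rate_per_codon_per_year(fraction_different, years_since_divergence):
    """Substitutions per codon per year, assuming both lineages accumulate change."""
    return fraction_different / (2.0 * years_since_divergence)
```

With these definitions, count_single_base_substitutions() gives 549 substitutions in total, of which 134 are synonymous; expected_identical_positions(0.15) is about 0.71; and rate_per_codon_per_year(0.30, 9e6) is about 1.7 × 10^-8, in line with the figures quoted in the text.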
that most evolutionary changes in DNA are concentrated in synonymous third positions. Presumably the third-position nucleotides are not inherently more mutable, but mutations occurring in the first two positions of a structural gene codon usually cause amino acid substitutions and are frequently eliminated by natural selection, whereas third-position mutations are usually selectively neutral and are incorporated into evolutionary lines through random processes.

This may not be the only source of the discrepancy, however. A second circumstance is that the matching properties of homologous DNA sequences may be thrown abruptly out of phase by deletions of one or more codons. Consider the case of human and carp α-hemoglobins. These differ in that site 47 is occupied by an alanine residue in the carp α-chain and is a gap in human and other mammalian α-chains. The homology of the α-chains is readily adjusted by replacing the alanine residue of the carp α-chain with a gap in the α-chains of other species, but the corresponding human and carp DNA would fail to reassociate in the regions beyond the 47th codon.

A third possibility is that most of the DNA for which measurements of homology have been made is not in the form of structural genes, since, as is discussed below, it is probable that not much more than 1 percent of mammalian DNA codes for proteins.

The data indicate that changes in amino acid sequences occur much more slowly than changes in total DNA. Presumably, changes in DNA which cause changes in proteins are held in check by natural selection to a far greater degree than are those which do not.

The Treffers Mutator Gene

Cox and Yanofsky (10) have shown that when Escherichia coli of "mut T" strain is repeatedly subcultured, the presence of the Treffers mutator (mut T) gene produces a trend toward DNA of a guanine-cytosine content higher than that in the original stock. The gene favors the substitution A·T → C·G (11). The effect of the gene, which may possibly be mediated through an altered DNA polymerase, is apparently to fill the third positions of synonymous codons with C or G and also to produce neutral substitutions in the amino-acid content of proteins in which A- and T-rich codons are changed to G- and C-rich codons. Thousands of such mutations accumulated in laboratory cultures without markedly impairing the viability of the mutated strains. Moreover, as noted by Cox and Yanofsky, genes in other organisms, such as bacteriophage T4, may exert a bias in the opposite direction, G·C → A·T (12). It seems that mutator genes can be an evolutionary phenomenon, explaining to some extent the amino acid differences between A·T-rich and G·C-rich bacterial species, first noted by Sueoka (13) and discussed elsewhere (14).

Proteins in Evolution

Cytochrome c. We now consider those changes in DNA which do result in altered proteins but which nonetheless are fully equivalent to the original form with respect to natural selection (15). Cytochrome c is a protein that appears to have identical and well-defined functions in the cells of all eukaryotes. The cytochromes c of various organisms were fully interchangeable when compared in vitro in studies of intact mitochondria (16). The substitutions in homologous amino acid residues of cytochrome c have repeatedly been scrutinized by Margoliash and Smith (17), who have discussed the nature of the substitutions with respect to the properties of the relevant amino acids. Smith (3) defines conservative substitutions as "the replacement of one amino-acid residue by another with similar properties, the locus of such substitutions being such that no disturbance of function will occur because of minor differences in structure." But this definition is quite elastic, for he points out that it can fit the case in which nine different residues occupy a homologous site in different cytochromes c when the locus has its side chain on the outside of the molecule. Cytochrome c is devoid, or almost devoid, of helical regions, but the interior of the molecule "shows a low density area that must consist entirely or almost entirely of hydrophobic side

Table 2. Distribution of numbers of amino acid changes compared for 148 sites in globin chains, 110 sites in cytochrome-c chains (19), and 111 sites in specificity regions of immunoglobulin-G light chains (26).

Columns for each protein group: (a) number of sites having the specified number of changes; (b) the same after subtracting the invariable sites (six for globins, 29 for cytochromes c, nine for the specificity regions of light chains of immunoglobulins G); (c) Poisson distribution for the indicated m.

Changes     Globins (m = 3.5)      Cytochromes c (m = 2.6)    Immunoglobulin-G light chains (m = 2.4)
per site    (a)    (b)    (c)      (a)    (b)    (c)          (a)    (b)    (c)
0            7      1      4       35      6      6           17      9      9
1           21     21     15       17     17     16           19     19     22
2           23     23     27       18     18     20           28     28     27
3           33     33     31       19     19     18           20     20     21
4           29     29     27       10     10     12           12     12     13
5           20     20     19        6      6      6            5      5      6
6            7      7     11        3      3      3            5      5      2
7            5      5      6        1      1      1.0          4      4      0.8
8            2      2      2        1      1      0.3          1      1      0.3
9            1      1      1.0      0      0      0.1          0      0      0.1

Table 3. Suggested neutral interchanges in cytochrome c. The sites are numbered consecutively, starting with the first residue in the wheat cytochrome-c sequence.

Cytochrome c                 Site number
                             17     19     43     65     66     93     103
Neurospora                   Leu    Lys    Leu    Ile    Thr    Leu    Ile
Bakers' yeast                Leu    Lys    Ile    Val    Leu    Leu    Ile
Yeast (Candida krusei)       Leu    Lys    Ile    Val    Glu    Leu    Val
Wheat                        Ile    Lys    Leu    Val    Glu    Leu    Ile
Moth (Samia cynthia)         Ile    Val    Phe    Ile    Thr    Leu    Ile
Horse                        Ile    Val    Leu    Ile    Thr    Ile    Ile
Other species                Val, Ile, Val, Ile, Val (site assignment not recoverable from the source layout)
chains" (3). This requirement would restrict but not prevent interchanges among the amino acid residues that provide these side chains.

It is our view that, while there are some restrictions on the replacements at variable sites in cytochrome c, the possibilities for such replacements are extensive, and that many of the existing replacements are neutral. Furthermore, the possibilities for replacements are by no means exhausted by the list that is available as a result of analyses of cytochromes c from about 25 different species. It is quite likely that all of the remaining possible replacements are present in the cytochromes c of the many millions of species which have not yet been examined. It may also be presumed that the cytochromes c are still evolving, and it is possible that many neutral evolutionary replacements of their amino acid residues are yet to be made. Thus, two views are expressed regarding the number and distribution of amino acid replacements in the evolution of homologous proteins. The first is that of the protein chemist, who sees the replacements as being related solely to function. The external regions of the protein molecule are less restricted with respect to change than are the internal regions, which must often be occupied by hydrophobic side chains. Certain residues are invariant because they are essential to enzymatic function. It is the necessary properties of the protein that dictate its primary structure. This view tends to push DNA, as the driving force in evolution, into the background.

The second view, to which we subscribe, is that the protein molecule is continually challenged by mutational changes resulting from base substitutions and other mutational events in DNA. Natural selection screens these changes. The fact that some variable amino acid sites are more subject to change than others in a set of homologous proteins is an expression primarily of the random nature of point mutations and only secondarily of protein function. As shown in Table 2, the five so-called "hypermutable sites" (18) in cytochrome c, which have six or more changes per site, are predictable in terms of the Poisson distribution.

About 29 of the amino acid residues in the cytochrome c are invariant (18, 19). These residues are needed for combining with the heme group, for interacting with cytochrome c oxidase, and possibly for other functions. The remaining 74 to 81 residues are variable, and substitutions in the variable sites of the cytochromes c seem to follow the Poisson distribution (Table 2); this would indicate that there is very little restriction on the type of amino acid that can be accommodated at most of the variable sites. This conclusion is supported by the observation that many of the variable sites show interchanges between neutral, acidic, and basic amino acid residues, or between hydrophobic and hydrophilic amino acid residues.

Leucine, isoleucine, and valine are similar to each other in structure and properties. We suggest that the leucine-isoleucine-valine substitutions in the cytochromes c at sites 17, 19, 43, 65, 66, 93, and 103 (Table 3) are neutral rather than adaptive, and that many other neutral substitutions exist in the cytochromes c, particularly at sites where there are many interchanges.

Matsubara and Smith (20) reported a variant human cytochrome c in which the leucine residue at site 65 was replaced by a methionine. The source material was a composite sample obtained from approximately 70 individuals. Matsubara and Smith concluded that a single mutant would account for the observation. Information is much more extensive on the occurrence of hemoglobin variants, as discussed below.

Hemoglobins. The structure and functional relationships of hemoglobins have been studied more extensively than those of any other proteins, and no other proteins are known to have a comparable variability of molecular structure. This variability occurs despite the fact that all of these polypeptide chains are of approximately the same length. The oxygen dissociation constants of various mammalian hemoglobins do not vary significantly (21). These considerations make it appear that most of the interspecies differences between the hemoglobins are functionally neutral.

This view is supported by studies of human hemoglobin mutants reported by Perutz and Lehmann (22). Because of the screening methods used, only mutations involving electrophoretic changes were generally available for study. Such changes have proved to be harmful when they occur in the interior of the molecule; a number of these interior changes were considered in detail with respect to their effects on clinical symptoms, on chemical properties, and on molecular structure. Most changes in residues occurring on the exterior of the hemoglobin molecule appear to be harmless, at least in the heterozygous state. Many of the 59 different external replacements which have been found to occur in human hemoglobin are the counterparts of variations at the corresponding sites in normal hemoglobins of other mammalian species (Table 4). The inference is drawn that these replacements are functionally equivalent and selectively neutral, despite the fact that they involve changes in net charge.

Table 4. Human hemoglobin variants which correspond to mutations that have become incorporated into the normal hemoglobins of other species.

Position     Residue in human hemoglobin A     Residue in normal
in chain     Normal       Mutant               animal hemoglobin
α22          Gly          Asp                  Carp Asp
α57          Gly          Asp                  Orangutan Asp
α68          Asn          Lys                  Rabbit Lys, sheep Lys
α68          Asn          Asp                  Carp Asp
β16          Gly          Asp                  Horse Asp
β69          Gly          Asp                  Bovine Asp
β87          Thr          Lys                  Pig Lys, rabbit Lys
β95          Lys          Glu                  Pig Glu

The occurrence of hemoglobin variants in the human population has been discussed by Sick et al. (23). They found ten hemoglobin variants among 8000 Europeans examined by a screening procedure in which histidine was not distinguishable from the neutral amino acids. Of 2217 theoretically possible amino acid substitutions in α- and β-chains, only 700 would cause a change in charge, so possibly the ten detected variants represented a total of 32 occurrences in 8000 subjects, an incidence of 0.4 percent. Additional surveys (24) brought the total number of subjects to 20,000. The incidence of variants found by electrophoresis was 1 per 1800, corresponding to an actual occurrence of 1 in 600.

On this basis it is estimated that there are 5 million hemoglobin A variants in the total human population of 3 billion. The total number of possible amino acid replacements in hemoglobin A is about 2217. If half of these replacements could occur without greatly disrupting the secondary and tertiary structure of the hemoglobin molecule, the number of different variants should be about 1100. Perutz and Lehmann (22) listed 82 identified mutations involving single amino acids in the α- and β-chains; this number would be 7.5 percent of 1100 mutations, or 3.9 percent of the 2217 total theoretically possible mutations in the 0.0007 percent of the population so far examined. Recently Lehmann and Carrell (25) have increased this listing to 94 identified mutations. These calculations show a high probability that all the theoretically possible variants of hemoglobin exist.

The distribution of changes shown in Table 2 shows that there are no "hypermutable sites"; the numbers of sites with seven, eight, and nine changes fit the Poisson distribution.

Immunoglobulins. An examination of the distribution of changes of amino acids in the specificity regions (S-regions) of the immunoglobulin-G light chains which have been analyzed shows that the changes are distributed in a random manner, similar to the distributions in the globins and cytochromes c (Table 2) (26). The S-regions are presumed to combine with antigenic determinants in immunological reactions. Consequently it is advantageous for an animal to have a large number of different S-regions for defense against numerous antigens. This could be the case if there were thousands of copies of the S-region cistron in the genome, so that numerous mutational variants could be stored for use. It can be argued that, in this special case, most mutations would be potentially beneficial rather than neutral or deleterious. Once again, generation of evolutionary changes appears to originate primarily from random point mutations.

The distribution of changes in the S-regions shows the presence of four hypervariable sites with five changes. These are present in the "hinge region" (27) (see Table 2).

Fibrinopeptide A. Fibrinopeptide A is one of two peptide fragments that are removed enzymatically from fibrinogen in the formation of the blood-clotting protein fibrin. Its function is to block a site of polymerization. The relative rapidity of evolutionary change in fibrinopeptide A (Tables 1 and 5) would seem to imply that its primary structure is not very critical, and that a relatively large proportion of substitutional mutations are not rejected by natural selection. Even within the short fibrinopeptide-A fragment, however, some positions are notably less changeable than others. It is quite likely that only a minority of the changes that occur in this portion of the fibrinogen gene are selectively neutral. But from the observation that fibrinopeptide A has evolved ten times as fast per codon as cytochrome c has, one can conclude that at least 90 percent of all substitutional mutations at the cytochrome-c locus are harmful and are rejected by natural selection.

Table 5. Amino acid substitutions in mammalian evolution. Each entry gives a comparison and the number of observed differences.

Insulin A and B (except for guinea pig insulin): 51 amino acids; 510 comparisons of homologous sites
  Human: horse            2
  Human: rabbit           1
  Human: sei whale        3
  Human: bovine           3
  Horse: rabbit           3
  Horse: sei whale        3
  Horse: bovine           3
  Rabbit: sei whale       3
  Rabbit: bovine          3
  Bovine: sei whale       1

Cytochrome c: 104 amino acids; 1040 comparisons of homologous sites
  Human: horse           12
  Human: rabbit           9
  Human: pig             10
  Human: gray whale      10
  Horse: rabbit           6
  Horse: pig              3
  Horse: gray whale       5
  Rabbit: pig             4
  Rabbit: gray whale      2
  Pig: gray whale         2

Hemoglobin α: 141 amino acids; 423 comparisons of homologous sites
  Human: horse           17
  Human: mouse           18
  Horse: mouse           23

Hemoglobin β: 146 amino acids; 438 comparisons of homologous sites
  Human: horse           25
  Human: rabbit          14
  Horse: rabbit          24

Ribonuclease: 124 amino acids; 124 comparisons of homologous sites
  Bovine: rat            40

Immunoglobulin (constant half of light chain): 102 amino acids; 102 comparisons of homologous sites
  Human: mouse           40

Fibrinopeptide A: 16 amino acids; 160 comparisons of homologous sites
  Human: donkey           7
  Human: rabbit          [illegible]
  Human: bovine          [illegible]
  Human: dog              5
  Donkey: rabbit         10
  Donkey: bovine          8
  Donkey: dog            [illegible]
  Rabbit: bovine         10
  Rabbit: dog            10
  Bovine: dog             8

Bovine fetal-hemoglobin β-chain: 146 amino acids; 438 comparisons of homologous sites
  Bovine fetal: human β  33
  Bovine fetal: rabbit β 33
  Bovine fetal: horse β  31

Guinea pig insulin (51 amino acids) compared with other mammalian insulins; 255 comparisons of homologous sites
  Guinea pig: human      18
  Guinea pig: horse      [illegible]
  Guinea pig: rabbit     18
  Guinea pig: whale      16
  Guinea pig: bovine     17

Fibrinopeptide B, the other fragment removed from fibrinogen, is so changeable in evolution and so subject to gaps and terminal deletions that we have made no attempt to calculate its evolutionary rate.

Histone IV. Histone IV, a nucleoprotein, shows remarkable evolutionary conservatism (28). On the basis of incomplete sequence analysis of this 101 amino-acid protein, there appear to have been only two substitutions in the evolutionary lines of peas and cattle since deviation from their common ancestor, perhaps a billion years ago. This is a rate of change of one substitution per line per codon every 10^11 years. It must be that virtually all mutations at the histone-IV locus are rejected.

The concept of neutral mutations makes it possible to resolve certain dilemmas in the study of evolution. For example, primates and guinea pigs are unable to convert 2-keto-L-gulonolactone to ascorbic acid, hence are subject to scurvy when placed on a diet lacking in vitamin C. All other animals that have been examined are free from this metabolic defect and are able to synthesize ascorbic acid. Evidently the defect in primates and guinea pigs is the result of an evolutionary change. How could such a nonadaptive change pass into the species?

The probable answer is that the change was a neutral one when it occurred and when it entered the genome. Primates and guinea pigs under "natural" conditions have diets that contain adequate amounts of vitamin C. Man does not develop scurvy unless he subsists on a diet in which dried foods, refined foods, and grain products predominate; the guinea pig is known to develop scurvy only when, as a laboratory animal, it is deprived of its customary supply of fresh green leaves. Here, therefore, is an instance of a neutral change becoming detrimental as the result of an "artificial" change in the environment.

Apparently Neutral Mutations in Escherichia coli

Certain revertants of the tryptophan synthetase-A protein in Escherichia coli appear to be neutral changes (29). These were discovered as follows. Mutations in the glycine residue at site 210 (Gly210) to arginine or glutamic acid produced nonfunctional tryptophan synthetase. Revertants of the arginine residue to serine, or of the glutamic acid residue to alanine, were fully functional. Therefore, direct mutations of Gly210 to serine or alanine would be undetectable, and the changes were found only because of the intervening stage of arginine or glutamic acid. It is to be expected that neutral mutations may occur even more readily at other

gene inactivation may generally be too low by an order of magnitude. Since the assay distinguished only between functional and nonfunctional alleles, it is not possible to say what proportion of the unrecovered amino acid substitutions, if any, were fully functionally equivalent to the original form. This experiment suggests that the total mutation rate is perhaps ten times the mutation rate detectable by standard means.

Although detectable per-locus mutation rates vary considerably, geneticists

guanine was incorporated, as a "mistake," at a frequency of one per 2,000 to 25,000 adenine and thymine nucleotides polymerized. Subsequently Hall and Lehman (33) found that, during the synthesis of poly-dG on a dC template by T4 bacteriophage DNA polymerase, T was incorporated instead of G at a level of 10^-5 to 10^-6. The error rate was increased fourfold when a mutant form of DNA polymerase was used.

While fidelity of replication is neces-
sites in this protein; Gly210 is evidently are accustomed to think of a convenient sary for the hereditary process, it is
at a site that is part of the active center. "standard" mutation rate as 10-5 muta- probable that this small amount of in-
tion per gamete per locus. This accom- fidelity is the major driving force in
modates Drosophila recessive lethal and evolution.
Rate of Spontaneous visible mutations, and human and other
Amino Acid Substitutions mammalian recessive mutations. If the
work of Whitfield et al. in Salmonella Rates of Amino Acid Substitution
So far, all direct studies of mutation is at all relevant to higher organisms, a in Mammalian Evolution
rate have depended on the detection reasonable approximation for the total
of mutant genes through some grossly mutation rate, including all mutations Table 5 presents the observed amino
observable effect on function, such as with immeasurably small effects, might acid differences for several proteins in
a change in morphology or viability. well be 10-4 per locus per gamete (30). comparisons between representatives of
Neutral and nearly neutral mutations This contention would seem to be different mammalian orders. All the
have not been systematically observed supported by Mukai's work on "viability proteins have been completely se-
in mutation-rate studies, although they polygenes" (31). Mukai has shown by quenced. In Table 1, evolutionary rates
are potentially observable through careful experiments, involving the are given in terms of substitutions per
modern biochemical techniques. counting of 2.5 million flies, that the codon per year per evolutionary line.
Some indirect estimates of the total rate of spontaneous mutation for the Not all evolutionary changes that
rate of amino acid substitutions are whole genome for slightly deleterious have occurred in the divergence of two
available. Whitfield et al. (30) developed mutations (with an average relative lines can be observed in a direct com-
techniques by which they were able to fitness of homozygotes greater than 98 parison of living representatives. A
analyze the molecular bases of condi- percent of normal) is at least 20 to 30 given site may be changed more than
tional lethal mutations recovered at the times as high as the total rate for once in one evolutionary line, with the
histidine-C locus in Salmonella. Of 65 recessive lethal genes. At least 35 per- result that there is only one observed
such mutations recovered, 22 were base- cent of all Drosophila gametes carry a amino acid difference where there were
substitution mutations resulting in chain- new, slightly harmful mutation. Some two evolutionary events. If the second
terminating codons, and 21 were base- of these slightly deleterious mutations change should happen to have been a
substitutions resulting in nonfunctional may represent the complete loss of func- return to the original amino acid, which
proteins. But, according to the genetic tion of genes which have, at most, only is likely if the two events are function-
code table, there are 549 possible marginal effects on fitness in the ally equivalent at a particular site, no
single-base-substitution mutations; of laboratory. Other slightly deleterious evidence of evolutionary change would
these, 392 result in amino acid changes; mutations are probably changes to remain. Similarly, both diverging lines
there are only 23 kinds of single-base- slightly less effective alleles of vital may have incorporated evolutionary
substitution mutation which result in genes which are also capable of mutat- changes at a homologous site, resulting
the replacement of an amino-acid- ing to fully lethal alleles. Still unde- in only one observed difference or none;
specifying codon with one of the three tected, even in Mukai's work, are the for example
chain-terminating codons. Whitfield et selectively neutral biochemical muta-
al. (30) reasoned that base changes
were probably random, and that only
tions.
The replication of DNA takes place Ala
IGly A Va 1
23/549 of all such substitutions would with astonishing fidelity, so that the
or Ala
be expected to have resulted in chain- daughter strands are complementary to \Va Va1
termination mutants. Thus the recovery the parent strands. This accuracy of It is difficult, in comparisons of
of 22 chain-termination mutants implies replication is essential to heredity and,
that an estimated 525 base-substitution homologous sites, to correct for back
indeed, to the continuation of terrestrial mutations, which show spurious identi-
mutations actually occurred, of which life. Trautner et al. (32) found that the ties. Some corrections can be made for
375 resulted in amino acid changes. frequency of incorporation of G during other sequential changes, however; if
Most of these mutants were not re- enzymatic replication of d(AT) copoly-
covered, presumably because the altered evolutionary substitutions are assumed
mer was less than one residue per to be randomly distributed throughout
enzyme remained functional. Since only 28,000 to 580,000 adenine and thymine the gene, single and multiple "hits" are
about 10 percent of mutants of all kinds nucleotides polymerized. In the replica- distributed according to the Poisson
were recoverable, mutation-rate esti- tion of an analogous polymer contain- distribution. The frequency of un-
mates based on the usual criterion of ing bromouracil instead of thymine, changed sites would be e-P, where p is
16 MAY 1969
793
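As a rough numerical illustration of the Poisson correction introduced at this point: if p is the true number of substitutions per site and the fraction of unchanged sites is $e^{-p}$, then an observed fraction d of differing sites gives $p = -\ln(1 - d)$. The sketch below is an assumption about how such a correction can be computed, not the authors' own procedure for Table 1, and the function names are hypothetical. The cytochrome c figures (12 differences in 104 sites, human against horse) come from Table 5, and the 150 million year divisor is the one the text uses for two lines of descent.

```python
import math


def corrected_substitutions_per_site(differences: int, sites: int) -> float:
    """Poisson-corrected substitutions per site, p = -ln(1 - d)."""
    if sites <= 0 or not 0 <= differences < sites:
        raise ValueError("need 0 <= differences < sites")
    observed_fraction = differences / sites  # d, fraction of sites that differ
    return -math.log(1.0 - observed_fraction)


def rate_per_codon_per_year(differences: int, sites: int,
                            total_years: float = 150e6) -> float:
    """Corrected substitutions per codon divided by the summed branch time."""
    return corrected_substitutions_per_site(differences, sites) / total_years


if __name__ == "__main__":
    # Cytochrome c, human against horse (Table 5): 12 differences in 104 sites.
    p = corrected_substitutions_per_site(12, 104)
    print(f"observed d = {12 / 104:.4f}, corrected p = {p:.4f}")
    print(f"rate = {rate_per_codon_per_year(12, 104):.2e} per codon per year")
```

The corrected value is always at least as large as the raw fraction, and the gap widens as d grows, which is why the correction matters most for rapidly evolving proteins. It does not recover back mutations to the original residue beyond what the Poisson assumption implies.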
the true frequency of evolutionary substitutions per site (34). This correction has been used in Table 1.

The assumption of a random distribution of evolutionary amino acid substitutions must be modified, of course, by recognition that some sites are invariant and others are restricted to, for example, hydrophobic side chains. The capacity of the highly changeable sites to reflect evolutionary divergence may eventually be exhausted, so that the amount of evolutionary change will be underestimated. For example, the rate of change in fibrinopeptide A in closely related artiodactyls (35) appears to be greater than the rate calculated from comparisons between more distantly related mammals.

All major euplacental orders diverged from a common ancestor in a relatively short period, approximately 70 to 80 million years ago [G. G. Simpson, cited in (36)]. In Table 1, the evolutionary rate is calculated as the adjusted frequency of evolutionary differences per codon, in comparisons between representatives of pairs of mammalian orders, divided by 150 million (75 million years for each line of descent).

Different proteins evolve at different rates, and different sites within specific proteins evolve at different rates. It is possible that these differences reflect differential mutability of the DNA itself, but to us this seems unlikely. It is more likely that proteins, and sites within proteins, differ with regard to the stringency of their requirements. The average rate of evolutionary change as shown in Table 1 is $16 \times 10^{-10}$ substitution per codon per species per year.

Kimura (4) has estimated, in agreement with Jukes (37), that total molecular evolution in vertebrate species proceeds at the rate of about one amino acid substitution every 2 years. Arguing that Darwinian evolution at that rate would require greater selection pressure than any species can afford, Kimura concluded that most amino acid changes must be due to the passive fixation of selectively neutral mutations.

While we tend to agree with this conclusion, there are several reasons for questioning the arguments on which it was based. Kimura's estimate was deliberately conservative in some respects. The estimate was based (i) on comparisons of the beta chains of horse and human hemoglobins, which appear to have about an average rate of evolutionary change; (ii) on studies of cytochrome c, which is a relatively slowly evolving protein; and (iii) on a minimum estimate based on unsequenced analysis of triosephosphate dehydrogenase, this probably being a gross underestimate of the true evolutionary rate for that enzyme. The average rate of evolution per codon in the completely sequenced proteins listed in Table 1 is five times Kimura's conservative underestimate. If the rate per codon is extrapolated to the entire haploid DNA genome of $4 \times 10^{9}$ nucleotide pairs, as has been done previously (4, 37), it would appear that mammalian evolution is proceeding at the rate of about two allele substitutions per year. In relatively long-lived mammals this may be 20 substitutions per species per generation; in the human species, this is an evolutionary rate of nearly 60 amino acid substitutions per generation, implying a genome mutation rate including 60 neutral amino acid substitutions per gamete. For several reasons this seems much too high.

For one thing, about 4 percent of base substitutions result in chain-terminating codons; 60 amino acid substitutions imply about three chain-terminating mutations per gamete. Most chain-terminating mutations, if they occur in structural genes, are lethal, or at least produce nonfunctional alleles which have to be eliminated through natural selection. No organism having three lethal or severely deleterious mutations per gamete can survive. In addition, frame-shift mutations, also lethal in structural genes, appear to occur about as frequently as chain-terminating mutations (30), and certainly some of the amino acid substitutions are lethal or biologically harmful. Indeed, as we attempt to demonstrate below, it is unlikely that more than about 10 percent of all mutations are selectively neutral.

A second error is the assumption that all or most mammalian DNA consists of structural genes. Older estimates (see 38) of maximum gene number in mammals rarely exceed 40,000 genes per haploid genome. If the average gene consists of 1000 nucleotide pairs, extrapolation from the estimated evolutionary rate of $16 \times 10^{-10}$ substitution per codon per year gives one amino acid substitution per species per 50 years. This is a far more believable figure. But only $4 \times 10^{7}$ nucleotide pairs, or 1 percent of the mammalian genome, is thus accounted for. Either 99 percent of mammalian DNA is not true genetic material, in the sense that it is not capable of transmitting mutational changes which affect the phenotype, or 40,000 genes is a gross underestimate of the total gene number.

Rates of spontaneous mutation to recessive lethal and visible mutants in mammals are of the order of $10^{-6}$ to $10^{-5}$ per locus per generation (38). If there are 40,000 genes, the total rate of mutation to lethal or nonfunctional alleles would be between 4 and 40 percent per gamete. From this consideration alone, it is clear that there cannot be many more than 40,000 genes.

In extensive studies of the spontaneous mutation rate of Drosophila melanogaster, the average lethal mutation rate was $3 \times 10^{-6}$ per locus and $10^{-2}$ per genome (39). Thus, the fruit fly has about 3000 loci that are capable of mutating to lethal alleles. If only a third of all loci are capable of mutating to lethal alleles under laboratory conditions, there may be perhaps 10,000 Drosophila cistrons. If the average cistron size is 1000 nucleotides, this accounts for about 10 percent of Drosophila DNA (8), since drosophilas have much less DNA per cell than mammals have.

There is more direct evidence for the existence of nongenetic DNA. Heterochromatin is known to be nearly devoid of specific genetic information, yet it accounts for about a third of the DNA of those species in which it is cytologically detectable. About 30 percent of mammalian DNA consists of highly repetitive sequences of unknown function (9). In some species there are varying numbers of supernumerary chromosomes that appear to be of no survival value to the organism.

Perhaps the most compelling argument for the existence of superfluous DNA is the wide range in the DNA content of vertebrate cells (40, 41). The average mammalian cell contains more than twice the DNA of the chicken cell and almost four times that of the cell of the gar pike. The cell of the bullfrog contains twice as much DNA as that of the toad, and two and a half times as much as that of a man, while the cell of a lungfish has a DNA content 17 times that of the human cell and almost 60 times that of the pike cell. Can it be that these wide divergences in DNA content reflect wide divergences in the number of functional genes? This hardly seems likely.

On the other hand, a substantial proportion of mammalian DNA is
SCIENCE, VOL. 164
794
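The extrapolations in this part of the argument reduce to a few multiplications, and it can help to retrace them. The sketch below uses only figures quoted in the text (a rate of $16 \times 10^{-10}$ per codon per year, $4 \times 10^{9}$ nucleotide pairs per haploid genome, 40,000 genes of 1000 nucleotide pairs, and per-locus rates of $10^{-6}$ to $10^{-5}$). The constant and function names are hypothetical, and three nucleotide pairs per codon is assumed throughout.

```python
RATE_PER_CODON_PER_YEAR = 16e-10   # average rate quoted from Table 1
GENOME_NUCLEOTIDE_PAIRS = 4e9      # haploid mammalian genome, as quoted
GENE_COUNT = 40_000                # older estimate of maximum gene number
NUCLEOTIDE_PAIRS_PER_GENE = 1000   # assumed average gene size, as quoted
NUCLEOTIDES_PER_CODON = 3


def substitutions_per_year(nucleotide_pairs: float,
                           rate: float = RATE_PER_CODON_PER_YEAR) -> float:
    """Substitutions per year if every codon in the span evolves at `rate`."""
    codons = nucleotide_pairs / NUCLEOTIDES_PER_CODON
    return codons * rate


def lethal_mutations_per_gamete(per_locus_rate: float,
                                genes: int = GENE_COUNT) -> float:
    """Total rate of lethal or nonfunctional mutations summed over all loci."""
    return per_locus_rate * genes


whole_genome = substitutions_per_year(GENOME_NUCLEOTIDE_PAIRS)
genic_pairs = GENE_COUNT * NUCLEOTIDE_PAIRS_PER_GENE
genes_only = substitutions_per_year(genic_pairs)

if __name__ == "__main__":
    print(f"whole genome: {whole_genome:.2f} substitutions per year")
    print(f"genes only: one substitution per {1 / genes_only:.0f} years")
    print(f"genic share of genome: {genic_pairs / GENOME_NUCLEOTIDE_PAIRS:.0%}")
    low, high = (lethal_mutations_per_gamete(r) for r in (1e-6, 1e-5))
    print(f"lethal mutations per gamete: {low:.0%} to {high:.0%}")
```

Under these assumptions the whole-genome figure comes to roughly two substitutions per year and the genes-only figure to about one per 47 years, in line with the rounded values "about two" and "50 years" given in the text.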
capable of forming hybrids with specific messenger RNA in vitro (42). Possibly, as Callan suggests (40), numerous nonheritable copies of the essential genetic material are created anew each generation. These multiple copies would transmit specific information by way of messenger RNA, but would not be true genetic material in that they would not transmit information to future generations and would not be directly involved in evolutionary processes. Another important possibility is that much of mammalian DNA is involved in the complexities of the immune response (26).

What Proportion of All Mutations Is Selectively Neutral?

Since the rate of fixation of selectively neutral mutations per species is equal to the mutation rate for neutral mutations per gamete, the observed rate of evolutionary change represents the upper limit of the neutral-mutation rate. Thus the neutral-mutation rate in mammalian structural genes cannot be higher than about $16 \times 10^{-10}$ mutation per codon per year, the observed rate of protein evolution. If the average locus consists of about 1000 nucleotide pairs, the upper limit to the neutral-mutation rate is about $5 \times 10^{-7}$ per year, or $3 \times 10^{-6}$ per locus in such mammals as have an average generation span of 6 years. This is approximately the mutation rate per locus of recessive lethals. From the work of Mukai (31) and Whitfield et al. (30) it appears that very slightly deleterious mutations are some ten times as frequent as recessive lethals; thus it would appear that something of the order of 80 or 90 percent of spontaneous mutations are mildly deleterious, 5 to 10 percent are lethal, and 5 to 10 percent are selectively neutral.

The apparent discrepancy between calculated evolutionary rates for DNA and protein (7, 8) is consistent with this interpretation. If base substitutions in a significant proportion of mammalian DNA are not subject to natural selection, while base substitutions in structural DNA (that is, DNA that codes for proteins) are usually eliminated by natural selection, structural DNA will diverge at a rate slower than the rate of divergence for total DNA. Again the difference is of one order of magnitude. Finally, the rapidly evolving fibrinopeptides indicate something about the mutability potential of structural DNA itself, and imply that most base substitutions occurring in the structural genes of more slowly evolving proteins are deleterious.

Natural selection is indirectly operative in the patterns of neutral evolutionary change in that only functionally equivalent isoalleles are allowed the small possibility of fixation through random genetic drift. Those alleles which do become fixed through drift are not a random selection of all substitutional mutations, but alleles which have been "selected" for innocuousness.

Allele Selection through Darwinian Evolution

One amino acid substitution every 50 years is still too rapid a rate to be accounted for by classical genetic theory unless most substitutions are selectively neutral. This is the argument from which Kimura (4) derived the conclusion that molecular evolution was primarily through drift. Haldane (43) calculated that Darwinian evolution cannot proceed at a rate greater than about one allele substitution every 300 generations; a higher rate of adaptive evolution would produce an unbearable "genetic load" associated with the elimination of the older, less-favored alleles. This tends to support our principal hypothesis, but the idea of an unbearable genetic load has been strongly challenged recently (44, 45) since it depends on the erroneous assumption of independent action of genetic and environmental factors affecting fitness. Sved and Maynard Smith have shown independently (45) that even the high rate of evolution calculated by Kimura (4) is not incompatible with Darwinian adaptive evolution.

Adaptive change, wherein the new allele increases to evolutionary fixation because the carrier of the new form is more fit than the homozygote of the old form, can be inferred to have occurred at the molecular level, from the indisputable fact of adaptive evolution at the morphological and physiological levels. Direct evidence of such change at the molecular level, however, has been rather scanty, perhaps because fitness is so difficult to measure.

Allele replacement through positive selection can be the result of any of several rather different situations. One is the occurrence of a new, unprecedented mutation which is immediately and unconditionally superior to the older allele or alleles. Such unconditionally adaptive new mutations, which must be very rare, have relatively high probabilities of eventual fixation. Specifically, the probability of fixation is $2s(N_e/N)$, where $1 + s$ is the relative fitness of the new heterozygote and $N_e$ and $N$ are, respectively, the effective and the actual number of the population (46). If $u$ is the rate of occurrence of favorable mutations, per gamete, the rate of Darwinian evolutionary fixation is $4usN_e$. Gene duplications and partial duplications that have become fixed in evolution are quite good candidates for this class of mutations. The rate of occurrence of such evolutionary fixation is a direct function of the total occurrence of such beneficial mutations in the population, and is thus a function of the population size of the species. In this situation evolution waits on mutation.

In other cases, allele changes depend upon environmental or other extrinsic changes, including other changes in the genetic background. Specific mutations which may have occurred repeatedly have been nonadaptive or deleterious in previous environments; in a new environment the same mutations become advantageous, and increase to fixation. The rate of this kind of evolutionary change is a function of environmental change, and is nearly independent of either population size or rate of mutation of any kind (47).

Rather small selective advantages for relatively rare favorable mutations are required to account for rates of Darwinian selection consistent with the observed and calculated evolutionary rates. As a numerical example, suppose that the probability of a favorable mutation (or of the combination of a mutation and an appropriate change in environment) were only $10^{-10}$ per gamete for a certain locus. That is, about one mutation in 100,000 mutations would be favorable. Suppose that the average selective advantage of the new isoallele over the old were 0.0005, a very small advantage. If the effective total number of the species were 500,000, the expected rate of Darwinian evolutionary fixation at this locus would be $10^{-7}$ per generation. This is not in the range of observed evolutionary rates, but the expected rate becomes an acceptable $10^{-6}$ per generation with an effective species number of 5 million, or a favorable mutation rate of $10^{-9}$ per generation, or an average selective advantage of 0.005. It would appear that the observed rates of evolutionary change at the molecular level are consonant either with predominantly non-Darwinian fixation of random neutral change, or with predominantly Darwinian positive selection for favorable mutations, or with any mixture of the two.

Fig. 1. Graph showing the similarity between the observed frequencies of amino acids in 53 completely sequenced mammalian proteins and the frequencies predicted by the genetic code and random permutations of DNA nucleotides. The frequencies are in percentages of total amino acid content. The straight line represents an idealized equality of expectation and observation.

Expectations for Models of Darwinian and Non-Darwinian Evolution

The rate of non-Darwinian change equals the rate of selectively neutral mutation and is independent of environmental fluctuations and of population size. For a given protein, the rate of such change should be nearly constant. Darwinian change, in contrast, is under the influence of changing environment, adaptive radiation, fluctuations in population size, and such factors as adjustment to major changes in the genetic background. Thus it might well be subject to bursts of rapid change in some species and relative stability in others.

Sarich and Wilson (48) have reported that the rate of evolutionary change in the immunological properties of primate albumin seems to be remarkably constant in numerous species. The rates of evolutionary change in the primary structures of hemoglobin and of cytochrome c also appear to be relatively constant (Table 5). Insulin appears to be stable in most lines of descent. Guinea pig insulin, however, has markedly more substitutions than any other mammalian insulin studied; Darwinian change is therefore indicated in this evolutionary development (Tables 1 and 5).

It is fortunate for the biochemical taxonomist that most proteins studied exhibit relatively uniform rates of change, as this is a required feature of most models of biochemical taxonomy. Uniform rates of evolutionary change also lend credence to the proposition that a substantial proportion of evolutionary change at the molecular level is due to the random incorporation of functionally insignificant change.

Amino Acid Composition

Another difference in the expectations based on the Darwinian and non-Darwinian models pertains to amino acid composition. In the non-Darwinian model the amino acid composition should be strongly influenced by the genetic code, since, by hypothesis, a significant proportion of the amino acids present have arisen by random mutation and drift. In the Darwinian model, one particular amino acid will be optimum at a given site in a given organism, and it matters little whether there are six possible codons (as there are for serine) or only one (as there is for methionine). However, if one allows for numerous sites, within proteins, at which amino acid composition is not critical, then a given site at a given point in evolutionary time is six times more likely to be serine than methionine. Other amino acids will be present in rough accordance with their numbers of synonymous codons, weighted by the frequencies of the nucleic acid bases involved. And this is what is found when total amino acid compositions of large numbers of proteins are analyzed (6, 49).

The amino acid compositions of 53 vertebrate (mostly mammalian) polypeptides were taken from data of Dayhoff and Eck (50). Several pairs of related polypeptides were included, but none with greater than 80 percent homology. The total number of amino acid residues involved was 5492, distributed as shown in Table 6. For the first two positions of the codons making up the relevant messenger RNA, the base composition is as follows: uracil, 22.0 percent; adenine, 30.3 percent; cytosine, 21.7 percent; guanine, 26.1 percent.

Note that in this sample, which

Table 6. Amino acid frequencies among 5492 residues in 53 vertebrate polypeptides, compared with the frequencies expected with random permutations of nucleic acid bases.

Amino acid      Codons                          Number of     Observed    Expected
                                                occurrences   frequency   frequency
Serine          UCU, UCA, UCC, UCG, AGU, AGC    443           8.1         8.6
Leucine         CUU, CUA, CUC, CUG, UUA, UUG    417           7.6         7.9
Arginine        CGU, CGA, CGC, CGG, AGA, AGG    229           4.2         10.7
Glycine         GGU, GGA, GGC, GGG              408           7.4         7.2
Alanine         GCU, GCA, GCC, GCG              406           7.4         6.0
Valine          GUU, GUA, GUC, GUG              375           6.8         6.1
Threonine       ACU, ACA, ACC, ACG              339           6.2         6.9
Proline         CCU, CCA, CCC, CCG              275           5.0         5.0
Isoleucine      AUU, AUA, AUC                   209           3.8         5.2
Lysine          AAA, AAG                        394           7.2         5.5
Glutamic acid   GAA, GAG                        317           5.8         4.7
Aspartic acid   GAU, GAC                        322           5.9         3.6
Phenylalanine   UUU, UUC                        222           4.0         2.2
Asparagine      AAU, AAC                        243           4.4         4.2
Glutamine       CAA, CAG                        203           3.7         3.9
Tyrosine        UAU, UAC                        183           3.3         3.1
Cysteine        UGU, UGC                        181           3.3         2.6
Histidine       CAU, CAC                        158           2.9         3.0
Methionine      AUG                             96            1.8         1.8
Tryptophan      UGG                             72            1.3         1.6
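The expected column of Table 6 can be approximated from the quoted base composition alone. The sketch below is one reading of that calculation, not the authors' own code: it assumes the three codon positions are independent and share the stated base frequencies (the article itself adopts that assumption for the third position), it uses the standard genetic code, and it renormalizes over the 61 sense codons, which plays the role of the correction factor of 1.057 that the article quotes. All names are hypothetical.

```python
from itertools import product

# Base composition quoted for the first two codon positions (fractions).
BASE_FREQ = {"U": 0.220, "C": 0.217, "A": 0.303, "G": 0.261}

# Standard genetic code, codons ordered U, C, A, G at each position;
# one-letter amino acid codes, "*" marks the three chain-terminating codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def expected_frequencies(base_freq: dict) -> dict:
    """Expected amino acid frequencies under random base permutations."""
    raw = {}
    for (b1, b2, b3), amino in zip(product(BASES, repeat=3), AMINO):
        codon_prob = base_freq[b1] * base_freq[b2] * base_freq[b3]
        raw[amino] = raw.get(amino, 0.0) + codon_prob
    raw.pop("*")                      # chain terminators specify no amino acid
    sense_total = sum(raw.values())   # renormalize over sense codons only
    return {amino: prob / sense_total for amino, prob in raw.items()}


if __name__ == "__main__":
    freqs = expected_frequencies(BASE_FREQ)
    for code, name in (("Y", "tyrosine"), ("R", "arginine"), ("M", "methionine")):
        print(f"{name}: {100 * freqs[code]:.1f} percent expected")
```

Run this way, tyrosine comes out near 3.1 percent and arginine near 10.7 percent, close to the tabulated expectations. Small differences from the printed column are to be expected, because the quoted base frequencies sum to 100.1 percent and the normalization here is an assumption.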
presumably reflects one of the two DNA strands, G + A is not equal to C + U. The implied asymmetry of the composition of the transcribed and nontranscribed strands of structural DNA is of considerable interest in itself. The G + C content is 47.8 percent. We will make the assumption that the distribution of third-position bases in this sample is the same as that of the first- and second-position bases. A hypothesis can then be tested: are the amino acid residues distributed according to random permutations of the nucleic acid bases?

For example, the codons for tyrosine are UAU and UAC. With the messenger RNA base composition calculated, the random expectation for the frequency of tyrosine is $(0.220)(0.303)(0.220) + (0.220)(0.303)(0.217)$; that is, 0.0292. Since not all codons specify amino acids, this value should be multiplied by a correction factor of 1.057. The expected frequency of tyrosine is thus 3.09 percent; the observed frequency is 3.33 percent. [For a similar approach with other data, see (6).]

Expected and observed frequencies of all the amino acids are presented in Table 6. Although the distribution of amino acids is not completely random (notably in the case of arginine, which occurs at a frequency less than half that expected), for the most part the fit is remarkably good, which indicates a very strong influence of the genetic code on protein composition. When arginine is disregarded, the coefficient of correlation (r) between the expected and the observed frequencies is 0.89 (see Fig. 1). The opposing hypothesis, that all evolutionary change depends upon natural selection, predicts that there should be no relationship between amino acid frequencies and the genetic code.

Comparative Rarity of Arginine

The conspicuous disparity of the observed and expected frequencies of occurrence for arginine (Table 6) is actually to be expected from predictions made by Subak-Sharpe et al. (51, 52). Their investigations focused at-

from the base composition. Subak-Sharpe et al. (51) have suggested that mammalian cells rarely use the arginine codons CGU, CGC, CGA, and CGG, and they have also suggested (52) that "the CpG shortage observed in mammalian DNA has a magnitude which virtually precludes the use of CpG for general coding for amino acids."

Various possibilities suggest themselves in explanation of the comparative rarity of CpG doublets. One is that mutation to CpG-containing codons is relatively rare, because of some unknown aspect of mutation-producing mechanisms. A second possibility is that such mutations do occur, but that CpG doublets are regularly back-mutated to other forms during DNA replication. A third possibility is that CpG-containing codons, although synonymous with other normal codons, are in some way disadvantageous and are eliminated by natural selection. A fourth possibility is that the amount of arginine that can be tolerated in animal proteins is less than the amount which would result from the occurrence of all six arginine codons at a random rate, so that the CpG content of animal DNA has been lowered by natural selection. There is some evidence that CGN arginine codons are present in mammalian DNA; for example, the occurrence in hemoglobin of mutations between arginine and histidine, leucine, proline, and glutamine (22), all of which mutations require CGN codons for single-base changes.

It has been argued (49) that the genetic code evolved to its definitive form because this form best matches the amino acid composition of living material; we suggest that the relationship is the other way around, and that the average amino acid composition of proteins reflects, more or less passively, the genetic code.

From these considerations it is not difficult to conclude that the stream of spontaneous alterations in DNA, continuously fed into the genetic pool, should include far more acceptable changes that are neutral than changes that are adaptive. Protein molecules are subjected to incessant probing as a result of point mutations and other DNA alterations. The genome becomes

in protein function. The principal evidence for this is the astounding variability in primary structure of homologous proteins from various species, and the rapid rate at which molecular changes accumulate in evolution.

References and Notes

1. G. G. Simpson, Science 146, 1535 (1964).
2. P. Weiss, in The Molecular Control of Cellular Activity, J. M. Allen, Ed. (McGraw-Hill, New York, 1961), p. 1.
3. E. L. Smith, Harvey Lectures Ser. 62 (1965-1966), 231 (1967).
4. M. Kimura, Nature 217, 624 (1968).
5. R. A. Fisher, Proc. Roy. Soc. Edinburgh Sect. B 50 (1928-29), 205 (1930).
6. M. Kimura, Genet. Res. 11, 247 (1968).
7. P. M. B. Walker, Nature 219, 228 (1968).
8. C. Laird, B. L. McConaughy, B. J. McCarthy, in preparation.
9. R. J. Britten and D. E. Kohne, Science 161, 529 (1968).
10. E. C. Cox and C. Yanofsky, Proc. Nat. Acad. Sci. U.S. 58, 1895 (1967).
11. The following abbreviations are used in this article: A, adenine; C, cytosine; G, guanine; T, thymine; U, uracil; A·T, base pair in DNA (adenine in one strand paired with thymine in the complementary strand); CpG, a nucleotide doublet (cytidylic acid and guanylic acid in a 3'-5' linkage); d(AT) copolymer, synthetic DNA consisting of alternating A and T bases in each complementary strand; Hb, hemoglobin; Ala, alanine; Arg, arginine; Asn, asparagine; Asp, aspartic acid; Cys, cysteine; Gln, glutamine; Glu, glutamic acid; Gly, glycine; His, histidine; Ile, isoleucine; Leu, leucine; Lys, lysine; Met, methionine; Phe, phenylalanine; Pro, proline; Ser, serine; Thr, threonine; Trp, tryptophan; Tyr, tyrosine; Val, valine; N, any nucleotide.
12. J. F. Speyer, Biochem. Biophys. Res. Commun. 21, 6 (1965).
13. N. Sueoka, Proc. Nat. Acad. Sci. U.S. 47, 1141 (1961).
14. For further discussion, see T. H. Jukes, Molecules and Evolution (Columbia Univ. Press, New York, 1966).
15. E. Freese and A. Yoshida, in Evolving Genes and Proteins, V. Bryson and H. J. Vogel, Eds. (Academic Press, New York, 1965).
16. E. E. Jacobs and D. R. Sanadi, J. Biol. Chem. 235, 53 (1960).
17. E. Margoliash and E. L. Smith, in Evolving Genes and Proteins, V. Bryson and H. J. Vogel, Eds. (Academic Press, New York, 1965).
18. W. M. Fitch and E. Margoliash, Biochem. Genet. 1, 65 (1967).
19. T. H. Jukes and C. R. Cantor, in Mammalian Protein Metabolism, vol. 3, H. N. Munro, Ed. (Academic Press, New York, in press).
20. H. Matsubara and E. L. Smith, J. Biol. Chem. 238, 2732 (1963).
21. A. Riggs, Nature 183, 1037 (1959).
22. M. F. Perutz and H. Lehmann, ibid. 219, 902 (1968).
23. K. Sick, D. Beale, D. Irvine, H. Lehmann, P. T. Goodall, S. MacDougal, Biochim. Biophys. Acta 140, 231 (1967).
24. H. Lehmann, personal communication.
25. H. Lehmann and R. W. Carrell, Brit. Med. Bull. 25, 14 (1969).
26. T. H. Jukes, Biochem. Genet. 3, 109 (1969).
27. C. Milstein, Nature 216, 330 (1967).
28. R. J. DeLange and D. M. Fambrough, Fed. Proc. 27, 392 (1968).
29. C. Yanofsky, Cold Spring Harbor Symp. Quant. Biol. 28, 581 (1963).
30. H. J. Whitfield, Jr., R. G. Martin, B. Ames, J. Mol. Biol. 21, 335 (1966).
31. T. Mukai, Genetics 50, 1 (1964).
32. T. A. Trautner, M. N. Swartz, A. Kornberg, Proc. Nat. Acad. Sci. U.S. 48, 449 (1962).
33. Z. W. Hall and I. R. Lehman, J. Mol. Biol.
tention on the anomalous rarity of the virtually saturated with such changes 36, 321 (1968)
doublet CpG in vertebrate DNA, first as are not thrown off through natural 34. E. Zuckerkandl and L. Pauling, in Evolving
Genes and Proteins, V. Bryson and H. J.
noted by Josse et al. (53) and Swartz selection. We conclude that most pro- Vogel, Eds. (Academic Press, New York,
et al. (54). The sequence CpG occurs 1965).
teins contain regions where substitu- 35. R. F. Doolittle, D. Schubert, S. A. Schwartz,
in human DNA at a frequency less tions of many amino acids can be made Arch. Biochem. Biophys. 118, 456 (1967).
36. E. L. Smith and E. Margoliash, Fed. Proc.
than 10 percent of that anticipated without producing appreciable changes 23, 1243 (1964).
16 MAY 1969 797
37. T. H. Jukes, Amer. Scientist 53, 477 (1965).
38. C. Stern, Principles of Human Genetics (Freeman, San Francisco, 1960).
39. H. J. Muller, Studies in Genetics (Indiana Univ. Press, Bloomington, 1962).
40. H. Callan, J. Cell Sci. 2, 1 (1967).
41. C. Bresch, Klassische und Molekulare Genetik (Springer, Berlin, 1964); D. B. Comings and R. O. Berger, Biochem. Genet. 2, 319 (1969).
42. J. Paul and R. S. Gilmour, J. Mol. Biol. 34, 305 (1968).
43. J. B. S. Haldane, Genetics 55, 511 (1957).
44. J. L. King, ibid., p. 403; R. D. Milkman, ibid., p. 493; J. A. Sved, T. E. Reed and W. P. Bodmer, ibid., p. 469.
45. J. A. Sved, Amer. Naturalist 102, 283 (1968); J. Maynard Smith, Nature 219, 1114 (1968).
46. J. B. S. Haldane, Proc. Cambridge Phil. Soc. 23, 838 (1927); M. Kimura, J. Appl. Probability 1, 177 (1964).
47. G. L. Stebbins, Processes of Organic Evolution (Prentice-Hall, Englewood Cliffs, N.J., 1966).
48. V. M. Sarich and A. C. Wilson, Proc. Nat. Acad. Sci. U.S. 58, 142 (1967).
49. A. L. MacKay, Nature 216, 159 (1967).
50. M. O. Dayhoff and R. V. Eck, Atlas of Protein Sequence and Structure 1967-1968 (National Biomedical Research Foundation, Silver Spring, Md., 1968).
51. H. Subak-Sharpe, W. M. Shepherd, J. Hay, Cold Spring Harbor Symp. Quant. Biol. 31, 583 (1966).
52. H. Subak-Sharpe, R. R. Burk, L. V. Crawford, J. M. Morrison, J. Hay, H. M. Keir, ibid., p. 737.
53. J. Josse, A. D. Kaiser, A. Kornberg, J. Biol. Chem. 236, 861 (1961).
54. M. N. Swartz, T. A. Trautner, A. Kornberg, ibid. 237, 1961 (1962).
55. We thank Dr. Motoo Kimura for suggestions and comments. The work discussed here was done with support from the U.S. Atomic Energy Commission and from the National Aeronautics and Space Administration (grant NGR 05-003-020 to the University of California, Berkeley).

Birth Control for
Economic Development

Reducing human fertility can raise
per capita income in less-developed countries.

Stephen Enke

There is a growing interest in the possibilities of lowering birth rates in order to raise per capita incomes in many of the less-developed countries. Described below is one economic-demographic method of assessing what reduced human fertility might contribute to increased economic development. Justifications of government programs to increase voluntary contraception are also considered (1).

In less-developed countries, one-half or more of annual increases in national output is being "swallowed" by annual increases in population, with income per head rising very slowly. Most of these countries have natural increases of from 2 percent to 3 percent a year. Hence they are doubling their populations every 35 to 23 years. This results not from rising birthrates but from falling death rates during the past 25 to 40 years, mostly attributable to improved health measures.

Some of their governments have decided that they cannot afford to wait for a spontaneous decline in fertility, resulting perhaps from more education, greater urbanization, and improved living. Instead, a few governments are encouraging voluntary use of contraceptives. The objective is economic development.

Many questions remain. How effective in raising incomes per head is reducing fertility as compared with other investments of resources? Could and should governments of less-developed countries encourage voluntary contraception?

Income per Head

One measure of successful economic development is a rising income (output) per head of population (2). It is ordinarily associated with other indicators of increasing welfare such as greater annual investment. Another measure is fewer people living in poverty.

Income (output) per head is a ratio. Governments have sought to raise this ratio by increasing its numerator (investing in factories, dams, highways, and the like) in order to increase the annual national output of goods and services. However, where politically feasible, governments can also raise the ratio of output per head by decreasing the denominator. A comparison of economic effectiveness can be made of changing the denominator as well as the numerator.

In a very simple arithmetic calculation, an imaginary less-developed country may be expected, in 1980, to have a national output (V) of $2500 million and a population (P) of 12.5 million for a yearly output per head (V/P) of $200. The government may decide to spend an extra $2.5 million a year for 10 years starting in 1970 to raise V/P. It can use these funds to increase output (ΔV) or to decrease population (ΔP) from what they would otherwise be (3). If the significant rate of return on traditional investments is 10 percent annually, an investment of $25 million from 1970 to 1980 will yield a ΔV in 1980 of $2.5 million, so that ΔV/V is 0.1 percent, or 1 in 1000.

Alternatively, the $2.5 million per year might have been spent on birth control. If the annual cost of an adult practicing contraception is $5 (4) and the annual fertility of contraceptive users is otherwise typically 0.25 live births, then in 1980 the population (12.5 million) would be 1.25 million smaller than expected. Thus ΔP/P is 10 percent or 1 in 10.

Apparently the amount of money spent each year on birth control can be 100 times more effective in raising output per head than the amount of money spent each year on traditional productive investments, for VΔP/PΔV here equals 100. Had the rate of return on investments been 20 percent annually instead of 10 percent, had the annual cost of birth control been $10 instead of $5, or had the otherwise fertility of "contraceptors" (5) been 0.125 instead of 0.25, this superior effectiveness ratio would have been 50 to 1 instead of 100 to 1. Had all three parameters been altered by a factor of two to weaken the argument, the expenditures on birth control would still appear 12.5 times more effective. The explanation is that it costs fewer

The author is manager of economic development programs at TEMPO, General Electric's Center for Advanced Studies, Santa Barbara, California.
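The arithmetic of the imaginary country can be retraced in a few lines of code. The following is only a minimal sketch under the article's own simplifications (no compounding of returns, no deaths among the births averted); the function and parameter names are hypothetical and chosen for illustration, and the default values are the figures quoted in the text.

```python
# Sketch of the effectiveness ratio V*dP / (P*dV) described in the text.
# All names are illustrative assumptions; defaults are the article's figures.

def investment_gain(output=2500e6, annual_spend=2.5e6, years=10,
                    rate_of_return=0.10):
    """Fractional rise in national output (dV/V) if the funds are invested."""
    delta_v = annual_spend * years * rate_of_return  # extra output in 1980
    return delta_v / output


def birth_control_gain(population=12.5e6, annual_spend=2.5e6, years=10,
                       cost_per_user=5.0, fertility=0.25):
    """Fractional fall in population (dP/P) if the funds buy contraception."""
    users = annual_spend / cost_per_user          # adults covered each year
    births_averted = users * fertility * years    # summed over the decade
    return births_averted / population


def effectiveness_ratio(rate_of_return=0.10, cost_per_user=5.0,
                        fertility=0.25):
    """Ratio (dP/P) / (dV/V), i.e. V*dP / (P*dV)."""
    dp_over_p = birth_control_gain(cost_per_user=cost_per_user,
                                   fertility=fertility)
    dv_over_v = investment_gain(rate_of_return=rate_of_return)
    return dp_over_p / dv_over_v


if __name__ == "__main__":
    print(f"base case:           {effectiveness_ratio():.1f}")
    print(f"20 percent return:   {effectiveness_ratio(rate_of_return=0.20):.1f}")
    print(f"all three weakened:  "
          f"{effectiveness_ratio(0.20, 10.0, 0.125):.1f}")
```

Changing any one of the three parameters by a factor of two halves the ratio, and changing all three divides it by eight, which is how the figures of 100, 50, and 12.5 in the text arise.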
SCIENCE, VOL. 164
798
