Non-Darwinian Evolution

Most evolutionary change in proteins may be due to neutral mutations and genetic drift.

Jack Lester King and Thomas H. Jukes

Dr. King is a biophysicist and geneticist for the Donner Laboratory and Dr. Jukes is associate director of the Space Sciences Laboratory, University of California, Berkeley 94720.

SCIENCE, VOL. 164, 16 MAY 1969

Darwinism is so well established that it is difficult to think of evolution except in terms of selection for desirable characteristics and advantageous genes. New technical developments and new knowledge, such as the sequential analysis of proteins and the deciphering of the genetic code, have made a much closer examination of evolutionary processes possible, and therefore necessary. Patterns of evolutionary change that have been observed at the phenotypic level do not necessarily apply at the genotypic and molecular levels. We need new rules in order to understand the patterns and dynamics of molecular evolution.

Evolutionary change at the morphological, functional, and behavioral levels results from the process of natural selection, operating through adaptive changes in DNA. It does not necessarily follow that all, or most, evolutionary change in DNA is due to the action of Darwinian natural selection. There appears to be considerable latitude at the molecular level for random genetic changes that have no effect upon the fitness of the organism. Selectively neutral mutations, if they occur, become passively fixed as evolutionary changes through the action of random genetic drift.

The idea of selectively neutral change at the molecular level has not been readily accepted by many classical evolutionists, perhaps because of the pervasiveness of Darwinian thought. Change in DNA and protein, when it is thought of at all, is thought to be limited to a response to activities at a higher level. For example, Simpson (1) quotes Weiss (2) as stating that there is a cellular control of molecular activities, and Simpson adds that there is also an organismal control of cellular activities and a populational control of organismal activities, and concludes (1):

"The consensus is that completely neutral genes or alleles must be very rare if they exist at all. To an evolutionary biologist, it therefore seems highly improbable that proteins, supposedly fully determined by genes, should have nonfunctional parts, that dormant genes should exist over periods of generations, or that molecules should change in a regular but nonadaptive way . . . [natural selection] is the composer of the genetic message, and DNA, RNA, enzymes, and other molecules in the system are successively its messengers."

We cannot agree with Simpson that DNA is a passive carrier of the evolutionary message. Evolutionary change is not imposed upon DNA from without; it arises from within. Natural selection is the editor, rather than the composer, of the genetic message. One thing the editor does not do is to remove changes which it is unable to perceive.

The view that mutations cannot be selectively neutral is not confined to organismal evolutionists. Smith (3) states:

"One of the objectives of protein chemistry is to have a full and comprehensive understanding of all the possible roles that the 20 amino acids can play in function and conformation. Each of these amino acids must have a unique survival value in the phenotype of the organism - the phenotype being manifested in the structures of the proteins. This is as true for a single protein as for the whole organism."

To hold that selectively neutral isoalleles cannot occur is equivalent to maintaining that there is one and only one optimal form for every gene at any point in evolutionary time. We think that life is not so inflexible.

Fixation of Selectively Neutral Isoalleles

Drift is slow but effective in the fixation of neutral mutations. As pointed out by Kimura (4), the rate of random fixation of neutral mutations in evolution (per species per generation) is equal to the rate of occurrence of neutral mutations (per gamete per generation).

Of the 2N copies of a gene in a population of N individuals at one point in evolutionary time, only one is destined to become the ancestor (through replication) of all copies of the gene that will be in existence in the species in the distant evolutionary future. The process by which one line becomes fixed has been called "genetic drift," "random walk," or "branching process." If all copies of the gene are selectively equivalent, all have equal chances of becoming the common ancestor. Thus, if a newly occurring mutation is selectively neutral, its probability of becoming fixed through random drift is 1/2N (5). If m1 is the rate of occurrence of selectively neutral mutations per functional gamete, the expected number of newly occurring neutral isoalleles in the species is 2Nm1 per generation. Only a small proportion of these will become fixed by chance; the rate of occurrence of neutral isoalleles destined to become so fixed is 1/2N × 2Nm1, or m1 per generation. Thus, the rate of non-Darwinian evolutionary change is a function only of the rate of occurrence of neutral mutations and is independent of population size.

The gene frequency of a neutral allele fluctuates randomly from generation to generation. Eventually the "random walk" of the gene frequency goes to the ground states of loss or fixation. The evolutionary time scale allows for many such fixations in the divergence of species.

Mutations to Synonymous Codons

Because of the degeneracy of the genetic code, some DNA base-pair changes in structural genes are without effect on protein structure. Specifically, there are 61 amino-acid-specifying codons. Since each of the three base pairs can mutate in any of three ways, each codon can mutate in any of nine ways by single substitution. Of the 549 possible single-base substitutions, 134 (one-fourth) are substitutions to synonymous codons (6). These are heritable changes in the genetic material, hence true mutations. As far as is known, synonymous mutations are truly neutral with respect to natural selection.

Comparing Evolution in Protein and in DNA

Species divergence can be measured at the protein level through sequence analysis, and independently at the DNA level through in vitro hybridization (7, 8). The measurement of DNA species divergence is complicated by the existence of repetitive DNA sequences of unknown function; discovery of these sequences has been the most important finding of the hybridization experiments (9). Laird et al. (8) find that the slowly reassociating "unique-sequence" DNA's of the mouse and the rat have diverged to the extent that almost half is unable to form interspecific hybrid molecules. They estimate that, in the 54 percent of mouse unique-sequence DNA that does form hybrid molecules with rat DNA, about 15 percent of the nucleotide bases are improperly paired, due to divergent evolution. If the hybridizable fraction of unique-sequence DNA is typical of structural DNA in the mouse and rat, evidently 15 percent is a minimum estimate of nucleotide divergence.

If most DNA species divergence were due to adaptive evolution, then one should expect that the first two nucleotide positions of each codon would change more rapidly than the third position, since synonymous mutations are unlikely to be adaptive. But if DNA divergence in evolution includes the random fixation of neutral mutations, then the third-position nucleotides should change more rapidly, because synonymous mutations are more likely to be neutral.

If the 15 percent of base differences between the DNA's of mice and rats were distributed randomly in structural genes, (0.85)^3, or 61 percent, of all codons would remain identical. About one-fourth of the remaining codons would be synonymous in the two species, so that one would expect about 70 percent of the amino acid positions in mouse and rat proteins to be identical and 30 percent to be different. Unfortunately, there have been no studies of amino acid sequence reported for homologous proteins in the two species. But, if the time since divergence from the last mouse-rat common ancestor is 9 million years (7), as estimated, a difference in 30 percent of the amino acid positions would represent an evolutionary rate of 17 × 10^-9 substitution per codon per year. This is ten times the estimated average rate of protein evolution in mammalian species (see Table 1).

Table 1. Rates of amino acid substitutions in mammalian evolution. [From data in Table 5; for sources, see (53)]

Protein                                          Comparisons   Differences   Differences   Substitutions   10^-10 substitutions
                                                 of amino      observed      per codon     per codon       per codon
                                                 acids (total)               (observed)    (estimated)     per year*
1. Insulin A and B                                    510           24          0.047         0.049             3.3
2. Cytochrome c                                      1040           63          0.061         0.063             4.2
3. Hemoglobin α-chain                                 423           58          0.137         0.149             9.9
4. Hemoglobin β-chain                                 438           63          0.144         0.155            10.3
5. Ribonuclease                                       124           40          0.323         0.390            25.3
6. Immunoglobulin light chain (constant half)         102           40          0.392         0.498            33.2
7. Fibrinopeptide A                                   160           76          0.475         0.644            42.9
8. Bovine hemoglobin fetal chain                      438           97          0.221         0.250            22.9†
9. Guinea pig insulin                                 255           86          0.337         0.411            53.1‡

* The estimate for time elapsed since the divergence of euplacental mammalian orders is 75 million years. The average rate of evolution for the seven protein species represented by entries 1 through 7 is 16 × 10^-10 substitution per codon per year.
† Bovine line of descent only.
‡ Guinea pig line of descent only.
Positive natural selection has probably been a factor in the evolution of bovine fetal hemoglobin and guinea pig insulin.

Walker (7), using quite different procedures and criteria, estimated that 13 percent of the nucleotide positions are occupied by different bases in the DNA's of the mouse and the rat. He concluded that the large discrepancy between the evolutionary rate for DNA and that for protein sequences implied that most evolutionary changes in DNA are concentrated in synonymous third positions. Presumably the third-position nucleotides are not inherently more mutable, but mutations occurring in the first two positions of a structural gene codon usually cause amino acid substitutions and are frequently eliminated by natural selection, whereas third-position mutations are usually selectively neutral and are incorporated into evolutionary lines through random processes.

This may not be the only source of the discrepancy, however. A second circumstance is that the matching properties of homologous DNA sequences may be thrown abruptly out of phase by deletions of one or more codons. Consider the case of human and carp hemoglobins. These differ in that site 47 is occupied by an alanine residue in the carp α-chain and is a gap in human and other mammalian α-chains. The homology of the α-chains is readily adjusted by replacing the alanine residue of the carp α-chain with a gap in the α-chains of other species, but the corresponding human and carp DNA would fail to reassociate in the regions beyond the 47th codon.

A third possibility is that most of the DNA for which measurements of homology have been made is not in the form of structural genes, since, as is discussed below, it is probable that not much more than 1 percent of mammalian DNA codes for proteins.

The data indicate that changes in amino acid sequences occur much more slowly than changes in total DNA. Presumably, changes in DNA which cause changes in proteins are held in check by natural selection to a far greater degree than are those which do not.

The Treffers Mutator Gene

Cox and Yanofsky (10) have shown that when Escherichia coli of "mut T" strain is repeatedly subcultured, the presence of the Treffers mutator (mut T) gene produces a trend toward DNA of a guanine-cytosine content higher than that in the original stock. The gene favors the substitution A·T → C·G (11). The effect of the gene, which may possibly be mediated through an altered DNA polymerase, is apparently to fill the third positions of synonymous codons with C or G and also to produce neutral substitutions in the amino-acid content of proteins in which A- and T-rich codons are changed to G- and C-rich codons. Thousands of such mutations accumulated in laboratory cultures without markedly impairing the viability of the mutated strains. Moreover, as noted by Cox and Yanofsky, genes in other organisms, such as bacteriophage T4, may exert a bias in the opposite direction, G·C → A·T (12). It seems that mutator genes can be an evolutionary phenomenon, explaining to some extent the amino acid differences between A·T-rich and G·C-rich bacterial species, first noted by Sueoka (13) and discussed elsewhere (14).

Proteins in Evolution

Cytochrome c. We now consider those changes in DNA which do result in altered proteins but which nonetheless are fully equivalent to the original form with respect to natural selection (15). Cytochrome c is a protein that appears to have identical and well-defined functions in the cells of all eukaryotes. The cytochromes c of various organisms were fully interchangeable when compared in vitro in studies of intact mitochondria (16). The substitutions in homologous amino acid residues of cytochrome c have repeatedly been scrutinized by Margoliash and Smith (17), who have discussed the nature of the substitutions with respect to the properties of the relevant amino acids. Smith (3) defines conservative substitutions as "the replacement of one amino-acid residue by another with similar properties, the locus of such substitutions being such that no disturbance of function will occur because of minor differences in structure." But this definition is quite elastic, for he points out that it can fit the case in which nine different residues occupy a homologous site in different cytochromes c when the locus has its side chain on the outside of the molecule. Cytochrome c is devoid, or almost devoid, of helical regions, but the interior of the molecule "shows a low density area that must consist entirely or almost entirely of hydrophobic side chains" (3). This requirement would restrict but not prevent interchanges among the amino acid residues that provide these side chains.

It is our view that, while there are some restrictions on the replacements at variable sites in cytochrome c, the possibilities for such replacements are extensive, and that many of the existing replacements are neutral. Furthermore, the possibilities for replacements are by no means exhausted by the list that is available as a result of analyses of cytochromes c from about 25 different species. It is quite likely that all of the remaining possible replacements are present in the cytochromes c of the many millions of species which have not yet been examined. It may also be presumed that the cytochromes c are still evolving, and it is possible that many neutral evolutionary replacements of their amino acid residues are yet to be made.

Thus, two views are expressed regarding the number and distribution of amino acid replacements in the evolution of homologous proteins. The first is that of the protein chemist, who sees the replacements as being related solely to function. The external regions of the protein molecule are less restricted with respect to change than are the internal regions, which must often be occupied by hydrophobic side chains. Certain residues are invariant because they are essential to enzymatic function. It is the necessary properties of the protein that dictate its primary structure. This view tends to push DNA, as the driving force in evolution, into the background.

The second view, to which we subscribe, is that the protein molecule is continually challenged by mutational changes resulting from base substitutions and other mutational events in DNA. Natural selection screens these changes. The fact that some variable amino acid sites are more subject to change than others in a set of homologous proteins is an expression primarily of the random nature of point mutations and only secondarily of protein function. As shown in Table 2, the five so-called "hypermutable sites" (18) in cytochrome c, which have six or more changes per site, are predictable in terms of the Poisson distribution.

Table 2. Distribution of numbers of amino acid changes compared for 148 sites in globin chains, 110 sites in cytochrome-c chains (19), and 111 sites in specificity regions of immunoglobulin-G light chains (26). For each group of proteins the three columns give the number of sites having the specified number of changes, that number minus the invariable sites, and the Poisson distribution for the stated mean m.

                 Globins                          Cytochromes c                    Specificity regions of light
                                                                                   chains of immunoglobulins G
Changes    No. of   Minus six    Poisson    No. of   Minus 29     Poisson    No. of   Minus nine   Poisson
per site   sites    invariable   m = 3.5    sites    invariable   m = 2.6    sites    invariable   m = 2.4
0             7         1           4         35         6           6         17         9           9
1            21        21          15         17        17          16         19        19          22
2            23        23          27         18        18          20         28        28          27
3            33        33          31         19        19          18         20        20          21
4            29        29          27         10        10          12         12        12          13
5            20        20          19          6         6           6          5         5           6
6             7         7          11          3         3           3          5         5           2
7             5         5           6          1         1           1.0        4         4           0.8
8             2         2           2          1         1           0.3        1         1           0.3
9             1         1           1.0        0         0           0.1        0.0       0.0         0.10

About 29 of the amino acid residues in the cytochrome c are invariant (18, 19). These residues are needed for combining with the heme group, for interacting with cytochrome c oxidase, and possibly for other functions. The remaining 74 to 81 residues are variable, and substitutions in the variable sites of the cytochromes c seem to follow the Poisson distribution (Table 2); this would indicate that there is very little restriction on the type of amino acid that can be accommodated at most of the variable sites. This conclusion is supported by the observation that many of the variable sites show interchanges between neutral, acidic, and basic amino acid residues, or between hydrophobic and hydrophilic amino acid residues.

Leucine, isoleucine, and valine are similar to each other in structure and properties. We suggest that the leucine-isoleucine-valine substitutions in the cytochromes c at sites 17, 19, 43, 65, 66, 93, and 103 (Table 3) are neutral rather than adaptive, and that many other neutral substitutions exist in the cytochromes c, particularly at sites where there are many interchanges.

Table 3. Suggested neutral interchanges in cytochrome c. The sites are numbered consecutively, starting with the first residue in the wheat cytochrome-c sequence.

Cytochrome c                Site number
                            17     19     43     65     66     93     103
Neurospora                  Leu    Lys    Leu    Ile    Thr    Leu    Ile
Bakers' yeast               Leu    Lys    Ile    Val    Leu    Leu    Ile
Yeast (Candida krusei)      Leu    Lys    Ile    Val    Glu    Leu    Val
Wheat                       Ile    Lys    Leu    Val    Glu    Leu    Ile
Moth (Samia cynthia)        Ile    Val    Phe    Ile    Thr    Leu    Ile
Horse                       Ile    Val    Leu    Ile    Thr    Ile    Ile
Other species: Val, Ile, Val, Ile; Val

Matsubara and Smith (20) reported a variant human cytochrome c in which the leucine residue at site 65 was replaced by a methionine. The source material was a composite sample obtained from approximately 70 individuals. Matsubara and Smith concluded that a single mutant would account for the observation. Information is much more extensive on the occurrence of hemoglobin variants, as discussed below.

Hemoglobins. The structure and functional relationships of hemoglobins have been studied more extensively than those of any other proteins, and no other proteins are known to have a comparable variability of molecular structure. This variability occurs despite the fact that all of these polypeptide chains are of approximately the same length. The oxygen dissociation constants of various mammalian hemoglobins do not vary significantly (21). These considerations make it appear that most of the interspecies differences between the hemoglobins are functionally neutral.

This view is supported by studies of human hemoglobin mutants reported by Perutz and Lehmann (22). Because of the screening methods used, only mutations involving electrophoretic changes were generally available for study. Such changes have proved to be harmful when they occur in the interior of the molecule; a number of these interior changes were considered in detail with respect to their effects on clinical symptoms, on chemical properties, and on molecular structure. Most changes in residues occurring on the exterior of the hemoglobin molecule appear to be harmless, at least in the heterozygous state. Many of the 59 different external replacements which have been found to occur in human hemoglobin are the counterparts of variations at the corresponding sites in normal hemoglobins of other mammalian species (Table 4). The inference is drawn that these replacements are functionally equivalent and selectively neutral, despite the fact that they involve changes in net charge.

Table 4. Human hemoglobin variants which correspond to mutations that have become incorporated into the normal hemoglobins of other species.

Position      Residue in human hemoglobin A      Residue in normal
in chain      Normal         Mutant              animal hemoglobin
α22           Gly            Asp                 Carp Asp
α57           Gly            Asp                 Orangutan Asp
α68           Asn            Lys                 Rabbit Lys, sheep Lys
α68           Asn            Asp                 Carp Asp
β16           Gly            Asp                 Horse Asp
β69           Gly            Asp                 Bovine Asp
β87           Thr            Lys                 Pig Lys, rabbit Lys
β95           Lys            Glu                 Pig Glu

The occurrence of hemoglobin variants in the human population has been discussed by Sick et al. (23). They found ten hemoglobin variants among 8000 Europeans examined by a screening procedure in which histidine was not distinguishable from the neutral amino acids. Of 2217 theoretically possible amino acid substitutions in α- and β-chains, only 700 would cause a change in charge, so possibly the ten detected variants represented a total of 32 occurrences in 8000 subjects, an incidence of 0.4 percent. Additional surveys (24) brought the total number of subjects to 20,000. The incidence of variants found by electrophoresis was 1 per 1800, corresponding to an actual occurrence of 1 in 600.

On this basis it is estimated that there are 5 million hemoglobin A variants in the total human population of 3 billion. The total number of possible amino acid replacements in hemoglobin A is about 2217. If half of these replacements could occur without greatly disrupting the secondary and tertiary structure of the hemoglobin molecule, the number of different variants should be about 1100. Perutz and Lehmann (22) listed 82 identified mutations involving single amino acids in the α- and β-chains; this number would be 7.5 percent of 1100 mutations, or 3.9 percent of the 2217 total theoretically possible mutations in the 0.0007 percent of the population so far examined. Recently Lehmann and Carrell (25) have increased this listing to 94 identified mutations. These calculations show a high probability that all the theoretically possible variants of hemoglobin exist.

The distribution of changes shown in Table 2 shows that there are no "hypermutable sites"; the numbers of sites with seven, eight, and nine changes fit the Poisson distribution.

Immunoglobulins. An examination of the distribution of changes of amino acids in the specificity regions (S-regions) of the immunoglobulin-G light chains which have been analyzed shows that the changes are distributed in a random manner, similar to the distributions in the globins and cytochromes c (Table 2) (26). The S-regions are presumed to combine with antigenic determinants in immunological reactions. Consequently it is advantageous for an animal to have a large number of different S-regions for defense against numerous antigens. This could be the case if there were thousands of copies of the S-region cistron in the genome, so that numerous mutational variants could be stored for use. It can be argued that, in this special case, most mutations would be potentially beneficial rather than neutral or deleterious. Once again, generation of evolutionary changes appears to originate primarily from random point mutations.

The distribution of changes in the S-regions shows the presence of four hypervariable sites with five changes. These are present in the "hinge region" (27) (see Table 2).

Fibrinopeptide A. Fibrinopeptide A is one of two peptide fragments that are removed enzymatically from fibrinogen in the formation of the blood-clotting protein fibrin. Its function is to block a site of polymerization. The relative rapidity of evolutionary change in fibrinopeptide A (Tables 1 and 5) would seem to imply that its primary structure is not very critical, and that a relatively large proportion of substitutional mutations are not rejected by natural selection. Even within the short fibrinopeptide-A fragment, however, some positions are notably less changeable than others. It is quite likely that only a minority of the changes that occur in this portion of the fibrinogen gene are selectively neutral. But from the observation that fibrinopeptide A has evolved ten times as fast per codon as cytochrome c has, one can conclude that at least 90 percent of all substitutional mutations at the cytochrome-c locus are harmful and are rejected by natural selection.

Table 5. Amino acid substitutions in mammalian evolution. Each entry gives the comparison followed by the observed number of differences; entries with no number are illegible in the source.

Insulin A and B (except for guinea pig insulin): 51 amino acids; 510 comparisons of homologous sites
    Human: horse            2
    Human: rabbit           1
    Human: sei whale        3
    Human: bovine           3
    Horse: rabbit           3
    Horse: sei whale        3
    Horse: bovine           3
    Rabbit: sei whale       3
    Rabbit: bovine          3
    Bovine: sei whale       1

Cytochrome c: 104 amino acids; 1040 comparisons of homologous sites
    Human: horse           12
    Human: rabbit           9
    Human: pig             10
    Human: gray whale      10
    Horse: rabbit           6
    Horse: pig              3
    Horse: gray whale       5
    Rabbit: pig             4
    Rabbit: gray whale      2
    Pig: gray whale         2

Hemoglobin α: 141 amino acids; 423 comparisons of homologous sites
    Human: horse           17
    Human: mouse           18
    Horse: mouse           23

Hemoglobin β: 146 amino acids; 438 comparisons of homologous sites
    Human: horse           25
    Human: rabbit          14
    Horse: rabbit          24

Ribonuclease: 124 amino acids; 124 comparisons of homologous sites
    Bovine: rat            40

Immunoglobulin (constant half of light chain): 102 amino acids; 102 comparisons of homologous sites
    Human: mouse           40

Fibrinopeptide A: 16 amino acids; 160 comparisons of homologous sites
    Human: donkey           7
    Human: rabbit
    Human: bovine
    Human: dog              5
    Donkey: rabbit         10
    Donkey: bovine          8
    Donkey: dog
    Rabbit: bovine         10
    Rabbit: dog            10
    Bovine: dog             8

Bovine fetal-hemoglobin β-chains: 146 amino acids; 438 comparisons of homologous sites
    Bovine fetal: human β  33
    Bovine fetal: rabbit β 33
    Bovine fetal: horse β  31

Guinea pig insulin (51 amino acids) compared with other mammalian insulins; 255 comparisons of homologous sites
    Guinea pig: human      18
    Guinea pig: horse
    Guinea pig: rabbit     18
    Guinea pig: whale      16
    Guinea pig: bovine     17

Fibrinopeptide B, the other fragment removed from fibrinogen, is so changeable in evolution and so subject to gaps and terminal deletions that we have made no attempt to calculate its evolutionary rate.

Histone IV. Histone IV, a nucleoprotein, shows remarkable evolutionary conservatism (28). On the basis of incomplete sequence analysis of this 101 amino-acid protein, there appear to have been only two substitutions in the evolutionary lines of peas and cattle since deviation from their common ancestor, perhaps a billion years ago. This is a rate of change of one substitution per line per codon every 10^11 years. It must be that virtually all mutations at the histone-IV locus are rejected.

The concept of neutral mutations makes it possible to resolve certain dilemmas in the study of evolution. For example, primates and guinea pigs are unable to convert 2-keto-L-gulonolactone to ascorbic acid, hence are subject to scurvy when placed on a diet lacking in vitamin C. All other animals that have been examined are free from this metabolic defect and are able to synthesize ascorbic acid. Evidently the defect in primates and guinea pigs is the result of an evolutionary change. How could such a nonadaptive change pass into the species?

The probable answer is that the change was a neutral one when it occurred and when it entered the genome. Primates and guinea pigs under "natural" conditions have diets that contain adequate amounts of vitamin C. Man does not develop scurvy unless he subsists on a diet in which dried foods, refined foods, and grain products predominate; the guinea pig is known to develop scurvy only when, as a laboratory animal, it is deprived of its customary supply of fresh green leaves. Here, therefore, is an instance of a neutral change becoming detrimental as the result of an "artificial" change in the environment.

Apparently Neutral Mutations in Escherichia coli

Certain revertants of the tryptophan synthetase-A protein in Escherichia coli appear to be neutral changes (29). These were discovered as follows. Mutations in the glycine residue at site 210 (Gly210) to arginine or glutamic acid produced nonfunctional tryptophan synthetase. Revertants of the arginine residue to serine, or of the glutamic acid residue to alanine, were fully functional. Therefore, direct mutations of Gly210 to serine or alanine would be undetectable, and the changes were found only because of the intervening stage of arginine or glutamic acid. It is to be expected that neutral mutations may occur even more readily at other sites in this protein; Gly210 is evidently at a site that is part of the active center.

Rate of Spontaneous Amino Acid Substitutions

So far, all direct studies of mutation rate have depended on the detection of mutant genes through some grossly observable effect on function, such as a change in morphology or viability. Neutral and nearly neutral mutations have not been systematically observed in mutation-rate studies, although they are potentially observable through modern biochemical techniques.

Some indirect estimates of the total rate of amino acid substitutions are available. Whitfield et al. (30) developed techniques by which they were able to analyze the molecular bases of conditional lethal mutations recovered at the histidine-C locus in Salmonella. Of 65 such mutations recovered, 22 were base-substitution mutations resulting in chain-terminating codons, and 21 were base-substitutions resulting in nonfunctional proteins. But, according to the genetic code table, there are 549 possible single-base-substitution mutations; of these, 392 result in amino acid changes;

gene inactivation may generally be too low by an order of magnitude. Since the assay distinguished only between functional and nonfunctional alleles, it is not possible to say what proportion of the unrecovered amino acid substitutions, if any, were fully functionally equivalent to the original form. This experiment suggests that the total mutation rate is perhaps ten times the mutation rate detectable by standard means.

Although detectable per-locus mutation rates vary considerably, geneticists are accustomed to think of a convenient "standard" mutation rate as 10^-5 mutation per gamete per locus. This accommodates Drosophila recessive lethal and visible mutations, and human and other mammalian recessive mutations. If the work of Whitfield et al. in Salmonella is at all relevant to higher organisms, a reasonable approximation for the total mutation rate, including all mutations with immeasurably small effects, might well be 10^-4 per locus per gamete (30).

This contention would seem to be supported by Mukai's work on "viability polygenes" (31). Mukai has shown by careful experiments, involving the counting of 2.5 million flies, that the rate of spontaneous mutation for the whole genome for slightly deleterious mutations (with an average relative fitness of homozygotes greater than 98 percent of normal) is at least 20 to 30 times as high as the total rate for recessive lethal genes. At least 35 percent of all Drosophila gametes carry a new, slightly harmful mutation. Some of these slightly deleterious mutations may represent the complete loss of function of genes which have, at most, only marginal effects on fitness in the laboratory. Other slightly deleterious mutations are probably changes to

guanine was incorporated, as a "mistake," at a frequency of one per 2,000 to 25,000 adenine and thymine nucleotides polymerized. Subsequently Hall and Lehman (33) found that, during the synthesis of poly-dG on a dC template by T4 bacteriophage DNA polymerase, T was incorporated instead of G at a level of 10^-5 to 10^-6. The error rate was increased fourfold when a mutant form of DNA polymerase was used.

While fidelity of replication is necessary for the hereditary process, it is probable that this small amount of infidelity is the major driving force in evolution.

Rates of Amino Acid Substitution in Mammalian Evolution

Table 5 presents the observed amino acid differences for several proteins in comparisons between representatives of different mammalian orders. All the proteins have been completely sequenced. In Table 1, evolutionary rates are given in terms of substitutions per codon per year per evolutionary line.

Not all evolutionary changes that have occurred in the divergence of two lines can be observed in a direct comparison of living representatives. A given site may be changed more than once in one evolutionary line, with the result that there is only one observed amino acid difference where there were two evolutionary events. If the second change should happen to have been a return to the original amino acid, which is likely if the two events are functionally equivalent at a particular site, no evidence of evolutionary change would remain.
Similarly, both diverging lines there are only 23 kinds of single-base- slightly less effective alleles of vital may have incorporated evolutionary substitution mutation which result in genes which are also capable of mutat- changes at a homologous site, resulting the replacement of an amino-acid- ing to fully lethal alleles. Still unde- in only one observed difference or none; specifying codon with one of the three tected, even in Mukai's work, are the for example chain-terminating codons. Whitfield et selectively neutral biochemical muta- al. (30) reasoned that base changes were probably random, and that only tions. The replication of DNA takes place Ala IGly A Va 1 23/549 of all such substitutions would with astonishing fidelity, so that the or Ala be expected to have resulted in chain- daughter strands are complementary to \Va Va1 termination mutants. Thus the recovery the parent strands. This accuracy of It is difficult, in comparisons of of 22 chain-termination mutants implies replication is essential to heredity and, that an estimated 525 base-substitution homologous sites, to correct for back indeed, to the continuation of terrestrial mutations, which show spurious identi- mutations actually occurred, of which life. Trautner et al. (32) found that the ties. Some corrections can be made for 375 resulted in amino acid changes. frequency of incorporation of G during other sequential changes, however; if Most of these mutants were not re- enzymatic replication of d(AT) copoly- covered, presumably because the altered evolutionary substitutions are assumed mer was less than one residue per to be randomly distributed throughout enzyme remained functional. Since only 28,000 to 580,000 adenine and thymine the gene, single and multiple "hits" are about 10 percent of mutants of all kinds nucleotides polymerized. In the replica- distributed according to the Poisson were recoverable, mutation-rate esti- tion of an analogous polymer contain- distribution. 
The frequency of un- mates based on the usual criterion of ing bromouracil instead of thymine, changed sites would be e-P, where p is 16 MAY 1969 793 the true frequency of evolutionary sub- lutionary change; (ii) on studies of 99 percent of mammalian DNA is not stitutions per site (34). This correction cytochrome c, which is a relatively true genetic material, in the sense that has been used in Table 1. slowly evolving protein; and (iii) on a it is not capable of transmitting muta- The assumption of a random dis- minimum estimate based on unse- tional changes which affect the pheno- tribution of evolutionary amino acid quenced analysis of triosephosphate type, or 40,000 genes is a gross under- substitutions must be modified, of dehydrogenase, this probably being a estimate of the total gene number. course, by recognition that some sites gross underestimate of the true evolu- Rates of spontaneous mutation to are invariant and others are restricted tionary rate for that enzyme. The aver- recessive lethal and visible mutants in to, for example, hydrophobic side age rate of evolution per codon in the mammals are of the order of 10-6 to chains. The capacity of the highly completely sequenced proteins listed in 10-5 per locus per generation (38). If changeable sites to reflect evolutionary Table 1 is five times Kimura's conser- there are 40,000 genes, the total rate divergence may eventually be ex- vative underestimate. If the rate per of mutation to lethal or nonfunctional hausted, so that the amount of evolu- codon is extrapolated to the entire alleles would be between 4 and 40 per- tionary change will be underestimated. haploid DNA genome of 4 X 109 cent per gamete. From this considera- For example, the rate of change in fi- nucleotide pairs, as has been done pre- tion alone, it is clear that there cannot brinopeptide A in closely related artio- viously (4, 37), it would appear that be many more than 40,000 genes. 
dactyls (35) appears to be greater than mammalian evolution is proceeding at In extensive studies of the spontane- the rate calculated from comparisons the rate of about two allele substitu- ous mutation rate of Drosophila mela- between more distantly related mam- tions per year. In relatively long-lived nogaster, the average lethal mutation mals. mammals this may be 20 substitutions rate was 3 x 10-6 per locus and 10-2 All major euplacental orders diverged per species per generation; in the hu- per genome (39). Thus, the fruit fly from a common ancestor in a relatively man species, this is an evolutionary has about 3000 loci that are capable of dtt period, approximately 70 to 80 rate of nearly 60 amino acid substitu- mutating to lethal alleles. If only a third minion years ago [G. G. Simpson, cited tions per generation, implying a ge- of all loci are capable of mutating to in (36)]. In Table 1, the evolutionary rate nome mutation rate including 60 neu- lethal alleles under laboratory condi- is calculated as the adjusted frequency of tral amino acid substitutions per gamete. tions, there may be perhaps 10,000 evolutionary differences per codon, in For several reasons this seems much Drosophila cistrons. If the average comparisons between representatives of too high. cistron size is 1000 nucleotides, this pairs of mammalian orders, divided by For one thing, about 4 percent of accounts for about 10 percent of Dro- 150 million (75 million years for each base substitutions result in chain-ter- sophila DNA (8), since drosophilas line of descent). minating codons; 60 amino acid substi- have much less DNA per cell than Different proteins evolve at different tutions imply about three chain-termi- mammals have. rates, and different sites within specific nating mutations per gamete. Most There is more direct evidence for proteins evolve at different rates. It is chain-terminating mutations, if they the existence of nongenetic DNA. 
possible that these differences reflect occur in structural genes, are lethal, or Heterochromatin is known to be nearly differential mutability of the DNA it- at least produce nonfunctional alleles devoid of specific genetic information, self, but to us this seems unlikely. It is which have to be eliminated through yet it accounts for about a third of the more likely that proteins, and sites with- natural selection. No organism having DNA of those species in which it is in proteins, differ with regard to the three lethal or severely deleterious mu- cytologically detectable. About 30 per- stringency of their requirements. The tations per gamete can survive. In addi- cent of mammalian DNA consists of average rate of evolutionary change as tion, frame-shift mutations, also lethal highly repetitive sequences of unknown shown in Table 1 is 16 X 10-10 sub- in structural genes, appear to occur function (9). In some species there are stitution per codon per species per about as frequently as chain-terminat- varying numbers of supernumerary year. ing mutations (30), and certainly some chromosomes that appear to be of no Kimura (4) has estimated, in agree- \ of the amino acid substitutions are survival value to the organism. ment with Jukes (37), that total molec- lethal or biologically harmful. Indeed, Perhaps the most compelling argu- ular evolution in vertebrate species as we attempt to demonstrate below, ment for the existence of superfluous proceeds at the rate of about one amino it is unlikely that more than about 10 DNA is the wide range in the DNA acid substitution every 2 years. Argu- percent of all mutations are selectively content of vertebrate cells (40, 41). ing that Darwinian evolution at that neutral. 
The average mammalian cell contains rate would require greater selection A second error is the assumption more than twice the DNA of the pressure than any species can afford, that all or most mammalian DNA con- chicken cell and almost four times that Kimura concluded that most amino sists of structural genes. Older estimates of the cell of the gar pike. The cell of acid changes must be due to the pas- (see 38) of maximum gene number in the bullfrog contains twice as much sive fixation of selectively neutral mammals rarely exceed 40,000 genes DNA as that of the toad, and two and mutations. per haploid genome. If the average a half times as much as that of a man, While we tend to agree with this gene consists of 1000 nucleotide pairs, while the cell of a lungfish has a DNA conclusion, there are several reasons extrapolation from the estimated evo- content 17 times that of the human cell for questioning the arguments on which lutionary rate of 16 X 10-10 substitu- and almost 60 times that of the pike it was based. Kimura's estimate was tion per codon per year gives one cell. Can it be that these wide diver- deliberately conservative in some re- amino acid substitution per species per gences in DNA content reflect wide spects. The estimate was based (i) on 50 years. This is a far more believable divergences in the number of functional comparisons of the beta chains of horse figure. But only 4 X 107 nucleotide genes? This hardly seems likely. and human hemoglobins, which appear pairs, or 1 percent of the mammalian On the other hand, a substantial to have about an average rate of evo- genome, is thus accounted for. Either proportion of mammalian DNA is SCIENCE, VOL. 164 794 capable of forming hybrids with spe- tural DNA itself, and imply that most older allele or alleles. Such uncondi- cific messenger RNA in vitro (42). 
base substitutions occurring in the tionally adaptive new mutations, which Possibly, as Callan suggests (40), nu- structural genes of more slowly evolv- must be very rare, have relatively high merous nonheritable copies of the es- ing proteins are deleterious. probabilities of eventual fixation. Spe- sential genetic material are created Natural selection is indirectly oper- cifically, the probability of fixation is anew each generation. These multiple ative in the patterns of neutral evolu- 2s(NJ1N), where 1 + s is the relative copies would transmit specific informa- tionary change in that only functionally fitness of the new heterozygote and Ne tion by way of messenger RNA, but equivalent isoalleles are allowed the and N are, respectively, the effective would not be true genetic material in small possibility of fixation through and the actual number of the popula- that they would not transmit informa- random genetic drift. Those alleles tion (46). If u is the rate of occurrence tion to future generations and would which do become fixed through drift of favorable mutations, per gamete, the not be directly involved in evolutionary are not a random selection of all sub- rate of Darwinian evolutionary fixation processes. Another important possibility stitutional mutations, but alleles which is 4usN,. Gene duplications and partial is that much of mammalian DNA is have been "selected" for innocuousness. duplications that have become fixed in involved in the complexities of the im- evolution are quite good candidates for mune response (26). this class of mutations. The rate of Allele Selection through occurrence of such evolutionary fixa- Darwinian Evolution tion is a direct function of the total What Proportion of AJI Mutations occurrence of such beneficial mutations Is Selectively Neutral? One amino acid substitution every in the population, and is thus a function 50 years is still too rapid a rate to be of the population size of the species. 
Since the rate of fixation of selec- accounted for by classical genetic In this situation evolution waits on tively neutral mutations per species is theory unless most substitutions are mutation. equal to the mutation rate for neutral selectively neutral. This is the argument In other cases, allele changes de- mutations per gamete, the observed from which Kimura (4) derived the pend upon environmental or other ex- rate of evolutionary change represents conclusion that molecular evolution trinsic changes, including other changes the upper limit of the neutral-mutation was primarily through drift. Haldane in the genetic background. Specific mu- rate. Thus the neutral-mutation rate in (43) calculated that Darwinian evolu- tations which may have occurred repeat- mammalian structural genes cannot be tion cannot proceed at a rate greater edly have been nonadaptive or deleteri- higher than about 16 X 10-10 mutation than about one allele substitution every ous in previous environments; in a new per codon per year, the observed rate 300 generations; a higher rate of adap- environment the same mutations be- of protein evolution. If the average tive evolution would produce an un- come advantageous, and increase to locus consists of about 1000 nucleotide bearable "genetic load" associated with fixation. The rate of this kind of evo- pairs, the upper limit to the neutral- the elimination of the older, less-favored lutionary change is a function of en- mutation rate is about 5 X 10-7 per alleles. This tends to support our prin- vironmental change, and is nearly year, or 3 X 10-8 per locus in such cipal hypothesis, but the idea of an independent of either population size mammals as have an average genera- unbearable genetic load has been or rate of mutation of any kind (47). tion span of 6 years. 
This is approx- strongly challenged recently (44, 45) Rather small selective advantages imately the mutation rate per locus of since it depends on the erroneous as- for relatively rare favorable mutations recessive lethals. From the work of sumption of independent action of are required to account for rates of Mukai (31) and Whitfield et dl. (30) genetic and environmental factors Darwinian selection consistent with the it appears that very slightly deleterious affecting fitness. Sved and Maynard observed and calculated evolutionary mutations are some ten times as fre- Smith have shown independently (45) rates. As a numerical example, suppose quent as recessive lethals; thus it would that even the high rate of evolution that the probability of a favorable mu- appear that something of the order of calculated by Kimura (4) is not in- tation (or of the combination of a mu- 80 or 90 percent of spontaneous mu- compatible with Darwinian adaptive tation and an appropriate change in tations are mildly deleterious, 5 to 10 evolution. environment) were only 10-10 per percent are lethal, and 5 to 10 percent Adaptive change, wherein the new gamete for a certain locus. That is, are selectively neutral. allele increases to evolutionary fixation about one mutation in 100,000 muta- The apparent discrepancy between because the carrier of the new form is tions would be favorable. Suppose that calculated evolutionary rates for DNA more fit than the homozygote of the the average selective advantage of the and protein (7, 8) is consistent with old form, can be inferred to have oc- new isoallele over the old were this interpretation. If base substitutions curred at the molecular level, from the 0.0005-a very small advantage. 
If the in a significant proportion of mam- indisputable fact of adaptive evolution effective total number of the species malian DNA are not subject to natural at the morphological and physiological were 500,000, the expected rate of selection, while base substitutions in levels. Direct evidence of such change Darwinian evolutionary fixation at this structural DNA (that is, DNA that at the molecular level, however, has locus would be 10-7 per generation. codes for proteins) are usually elim- been rather scanty, perhaps because This is not in the range of observed inated by natural selection, structural fitness is so difficult to measure. DNA will diverge at a rate slower than Allele replacement through positive evolutionary rates, but the expected rate becomes an acceptable 10-6 per the rate of divergence for total DNA. selection can be the result of any of Again the difference is of one order of several rather different situations. One generation with an effective species number of 5 million, or a favorable magnitude. Finally, the rapidly evolv- is the occurrence of a new, unprece- mutation rate of 10-0 per generation, ing fibrinopeptides indicate something dented mutation which is immediately or an average selective advantage of about the mutability potential of struc- and unconditionally superior to the 0.005. It would appear that the ob- 16 MAY 1969 795 l Expectations for Models of Darwinian any other mammalian insulin studied; and Non-Darwinian Evolution Darwinian change is therefore indi- 8 Ala Gly Oe cated in this evolutionary development y8Ve *LeX The rate of non-Darwinian change (Tables 1 and 5). X6 Asp Glu* * hr- equals the rate of selectively neutral It is fortunate for the biochemical -o* Pro mutation and is independent of en- taxonomist that most proteins studied As4 Ph 7'n,l Arg - vironmental fluctuations and of popu- exhibit relatively uniform rates of Cys //Gln lation size. 
For a given protein, the rate change, as this is a required feature 0 2 - Met/ H of such change should be nearly con- of most models of biochemical tax- Trp stant. Darwinian change, in contrast, onomy. Uniform rates of evolutionary I2 is under the influence of changing en- change also lend credence to the propo- 0 2 4 6f 10 12 vironment, adaptive radiation, fluctua- sition that a substantial proportion of tions in population size, and such fac- evolutionary change at the molecular Fig. 1. Graph showing the similarity be- tors as adjustment to major changes in level is due to the random incorpora- tween the observed frequencies of amino the genetic background. Thus it might tion of functionally insignificant change. acids in 53 completely sequenced mam- g . malian proteins and the frequencies pre- well be subject to bursts of rapid change dicted by the genetic code and random in some species and relative stability in permutations of DNA nucleotides. The fre- others. Amino Acid Composition quencies are in percentages of total amino Sarich and Wilson (48) have re- acid content. The straight line represents Another difference in the expecta- an idealized equality of expectation and ported that the rate of evolutionary tions based on the Darwinian and non- observation. change in the immunological properties of primate albumin seems to be re- Darwinian models pertains to amino markably constant in numerous species. acid composition. In the non-Darwin- served rates of evolutionary change at The rates of evolutionary change in the ian model the amino acid composition the molecular level are consonant either primary structures of hemoglobin and should be strongly influenced by the with predominantly non-Darwinian fix- of cytochrome c also appear to be rela- genetic code, since, by hypothesis, a ation of random neutral change, or tively constant (Table 5). 
Insulin ap- significant proportion of the amino with predominantly Darwinian positive pears to be stable in most lines of acids present have arisen by random selection for favorable mutations, or descent. 'Guinea pig insulin, however, mutation and drift. In the Darwinian with any mixture of the two. has markedly more substitutions than model, one particular amino acid will be optimum at a given site in a given organism, and it matters little whether Table 6. Amino acid frequencies among 5492 residues in 53 vertebrate polypeptides, compared there are six possible codons (as there with the frequencies expected with random permutations of nucleic acid bases. are for serine) or only one (as there Observed Expected is for methionine). However, if one Number of, allows for numerous sites, within pro- Aoino acid Codons ccurrences frequency frequency teins, at which amino acid composition UCU, UCA 443 8.1 8.6 is not critical, then a given site at a Serine UCC, UCG given point in evolutionary time is six AGU, AGC times more likely to be serine than Leucine CUU, CUA 417 7.6 7.9 methionine. Other amino acids will be CUC, CUG present in rough accordance with their UUA, UUG Arginine CGU, CGA 229 4.2 10.7 numbers of synonymous codons, CGC, CGG weighted by the frequencies of the AGA, AGG nucleic acid bases involved. And this Glycine GGU, GGA 408 7.4 7.2 is what is found when total amino acid GGC, GGG compositions of large numbers of pro- Alanine GCU, GCA 406 7.4 6.0 GCC, GCG teins are analyzed (6, 49). Valine GUU, GUA 375 6.8 6.1 The amino acid compositions of 53 GUC, GUG vertebrate (mostly mammalian) poly- Threonine ACU, ACA 339 6.2 6.9 peptides were taken from data of Day- ACC, ACG hoff and Eck (50). Several pairs of Proline CCU, CCA 275 5.0 5.0 CCC, CCG related polypeptides were included, but Isoleucine AUU, AUA 209 3.8 5.2 none with greater than 80 percent AUC 5.5 homology. 
The total number of amino Lysine AAA, AAG 394 7.2 GAA, GAG 317 5.8 4.7 acid residues involved was 5492, dis- Glutamic acid Aspartic acid GAU, GAC 322 5.9 3.6 tributed as shown in Table 6. For the Phenylalanine UUU, UUC 222 4.0 2.2 first two positions of the codons making Asparagine AAU, AAC 243 4.4 4.2 up the relevant messenger RNA, the Glutamine CAA, CAG 203 3.7 3.9 base composition is as follows: uracil, Tyrosine UAU, UAC 183 3.3 3.1 Cysteine UGU, UGC 181 3.3 2.6 22.0 percent; adenine, 30.3 percent; Histidine CAU, CAC 158 2.9 3.0 cytosine, 21.7 percent; guanine, 26.1 Methionine AUG 96 1.8 1.8 percent. Tryptophan UGG 72 1.3 1.6 Note that in this sample, which 796 SCIENCE, VOL. 164 presumably reflects one of the two from the base composition. Subak- in protein function. The principal evi- DNA strands, G + A is not equal to Sharpeet al. (51) have suggested that dence for this is the astounding vari- C + U. The implied asymmetry of mammalian cells rarely use the arginine ability in primary structure of homol- the composition of the transcribed and codons CGU, CGC, CGA, and CGG, ogous proteins from various species, nontranscribed strands of structural and they have also suggested (52) that and the rapid rate at which molecular DNA is of considerable interest in it- "the CpG shortage observed in mam- changes accumulate in evolution. self. The G + C content is 47.8 per- malian DNA has a magnitude which cent. We will make the assumption References and Notes virtually precludes the use of CpG for that the distribution of third-position general coding for amino acids." 1. G. G. Simpson, Scienice 146, 1535 (1964). 2. P. Weiss, in The Molecular Control of Cellu- bases in this sample is the same as that Various possibilities suggest them- lar Activity, J. M. Allen, Ed. (McGraw-Hill, of the first- and second-position bases. selves in explanation of the comparative New York, 1961), p. 1. 3. E. L. Smith, Harvey Lectures Ser. 
62 (1965- A hypothesis can then be tested: are rarity of CpG doublets. One is that (1960), 231 (1967). the amino acid residues distributed ac- mutation to CpG-containing codons is 4. M. Kimura, Nature 217, 624 (1968). 5. R. A. Fisher, Proc. Roy. Soc. Edinburgh Sect. cording to random permutations of the relatively rare, because of some un- B 50 (1928-29), 205 (1930). nucleic acid bases? 6. M. Kimura, Genet. Res. 11, 247 (1968). known aspect of mutation-producing 7. P. M. B. Walker, Nature 219, 228 (1968). For example, the codons for tyro- mechanisms. A second possibility is 8. C. Laird, B. L. McConaughy, B. J. McCarthy, in preparation. sine are UAU and UAC. With the that such mutations do occur, but that 9. R. J. Britten and D. E. Kohne, Science 161, messenger RNA base composition CpG doublets are regularly back- 529 (1968). 10. E. C. Cox and C. Yanofsky, Proc. Nat. Acad. calculated, the random expectation for mutated to other forms during DNA Sc. U.S. 58, 1895 (1967). the frequency of tyrosine is (0.220) replication. A third possibility is that 11. The following abbreviations are used in this article: A, adenine; C, cytosine; G, guanine; (0.303) (0.220) + (0.220) (0.303) CpG-containing codons, although syn- T, thymine; U, uracil; A * T, base pair in (0.217)-that is, 0.0292. Since not all DNA-adenine in one strand paired with onymous with other normal codons, thymine in the complementary strand; CpG, codons specify amino acids, this value are in some way disadvantageous and a nucleotide doublet-cytidylic acid and guany- should be multiplied by a correction lic acid in a 3'-5' linkage; d(AT) copolymer, are eliminated by natural selection. A synthetic DNA consisting of alternating A factor of 1.057. 
The expected fre- fourth possibility is that the amount and T bases in each complementary strand; quency of tyrosine is thus 3.09 percent; Hb, hemoglobin; Ala, alanine; Arg, arginine; of arginine that can be tolerated in Asn, asparagine; Asp, aspartic acid; Cys, the observed frequency is 3.33 percent. animal proteins is less than the amount cysteine; Gln, glutamine; Glu, glutamic acid; [For a similar approach with other Gly, glycine; His, histidine; Ile, isoleucine; which would result from the occurrence Leu, leucine; Lys, lysine; Met, methionine; data, see (6).] of all six arginine codons at a random Phe, phenylalanine; Pro, proline; Ser, serine; Thr, threonine; Trp, tryptophan; Tyr, tyro- Expected and observed frequencies rate, so that the CpG content of animal sine; Val, valine; N, any nucleotide. of all the amino acids are presented 12. J. F. Speyer, Biochem. Biophys. Res. Com- DNA has been lowered by natural mun. 21, 6 (1965). in Table 6. Although the distribution selection. There is some evidence that 13. N. Sueoka, Proc. Nat. Acad. Scl. U.S. 47, 1141 (1961). of amino acids is not completely ran- CGN arginine codons are present in 14. For further discussion, see T. H. Jukes, Mole- dom-notably in the case of arginine, mammalian DNA-for example, the cules and Evolution (Columbia Univ. Press, New York, 1966). which occurs at a frequency less than occurrence in hemoglobin of mutations 15. E. Freese and A. Yoshina, in Evolving Genes half that expected-for the most part between arginine and histidine, leucine, and Proteins, V. Bryson and H. J. Vogel, Eds. (Academic Press, New York, 1965). the fit is remarkably good, which indi- proline, and glutamine (22), all of 16. E. E. Jacobs and D. R. Sanadi, J. Biol. Chem. cates a very strong influence of the ge- which mutations require CGN codons 235, 53 (1960). 17. E. Margoliash and E. L. Smith, in Evolving netic code on protein composition. When for single-base changes. Genes and Proteins, V. Bryson and H. J. 
arginine is disregarded, the coefficient of correlation (r) between the expected and the observed frequencies is 0.89 (see Fig. 1). The opposing hypothesis, that all evolutionary change depends upon natural selection, predicts that there should be no relationship between amino acid frequencies and the genetic code.

Comparative Rarity of Arginine

The conspicuous disparity of the observed and expected frequencies of occurrence for arginine (Table 6) is actually to be expected from predictions made by Subak-Sharpe et al. (51, 52). Their investigations focused attention on the anomalous rarity of the doublet CpG in vertebrate DNA, first noted by Josse et al. (53) and Swartz et al. (54). The sequence CpG occurs in human DNA at a frequency less than 10 percent of that anticipated

It has been argued (49) that the genetic code evolved to its definitive form because this form best matches the amino acid composition of living material; we suggest that the relationship is the other way around, and that the average amino acid composition of proteins reflects, more or less passively, the genetic code.

From these considerations it is not difficult to conclude that the stream of spontaneous alterations in DNA, continuously fed into the genetic pool, should include far more acceptable changes that are neutral than changes that are adaptive. Protein molecules are subjected to incessant probing as a result of point mutations and other DNA alterations. The genome becomes virtually saturated with such changes as are not thrown off through natural selection. We conclude that most proteins contain regions where substitutions of many amino acids can be made without producing appreciable changes

Vogel, Eds. (Academic Press, New York, 1965).
18. W. M. Fitch and E. Margoliash, Biochem. Genet. 1, 65 (1967).
19. T. H. Jukes and C. R. Cantor, in Mammalian Protein Metabolism, vol. 3, H. N. Munro, Ed. (Academic Press, New York, in press).
20. H. Matsubara and E. L. Smith, J. Biol. Chem. 238, 2732 (1963).
21. A. Riggs, Nature 183, 1037 (1959).
22. M. F. Perutz and H. Lehmann, ibid. 219, 902 (1968).
23. K. Sick, D. Beale, D. Irvine, H. Lehmann, P. T. Goodall, S. MacDougal, Biochim. Biophys. Acta 140, 231 (1967).
24. H. Lehmann, personal communication.
25. H. Lehmann and R. W. Carrell, Brit. Med. Bull. 25, 14 (1969).
26. T. H. Jukes, Biochem. Genet. 3, 109 (1969).
27. C. Milstein, Nature 216, 330 (1967).
28. R. J. DeLange and D. M. Fambrough, Fed. Proc. 27, 392 (1968).
29. C. Yanofsky, Cold Spring Harbor Symp. Quant. Biol. 28, 581 (1963).
30. H. J. Whitfield, Jr., R. G. Martin, B. N. Ames, J. Mol. Biol. 21, 335 (1966).
31. T. Mukai, Genetics 50, 1 (1964).
32. T. A. Trautner, M. N. Swartz, A. Kornberg, Proc. Nat. Acad. Sci. U.S. 48, 449 (1962).
33. Z. W. Hall and I. R. Lehman, J. Mol. Biol. 36, 321 (1968).
34. E. Zuckerkandl and L. Pauling, in Evolving Genes and Proteins, V. Bryson and H. J. Vogel, Eds. (Academic Press, New York, 1965).
35. R. F. Doolittle, D. Schubert, S. A. Schwartz, Arch. Biochem. Biophys. 118, 456 (1967).
36. E. L. Smith and E. Margoliash, Fed. Proc. 23, 1243 (1964).
16 MAY 1969 797
37. T. H. Jukes, Amer. Scientist 53, 477 (1965).
38. C. Stern, Principles of Human Genetics (Freeman, San Francisco, 1960).
39. H. J. Muller, Studies in Genetics (Indiana Univ. Press, Bloomington, 1962).
40. H. Callan, J. Cell Sci. 2, 1 (1967).
41. C. Bresch, Klassische und Molekulare Genetik (Springer, Berlin, 1964); D. B. Comings and R. O. Berger, Biochem. Genet. 2, 319 (1969).
42. J. Paul and R. S. Gilmour, J. Mol. Biol. 34, 305 (1968).
43. J. B. S. Haldane, Genetics 55, 511 (1957).
44. J. L. King, ibid., p. 403; R. D. Milkman, ibid., p. 493; J. A. Sved, T. E. Reed and W. P. Bodmer, ibid., p. 469.
45. J. A. Sved, Amer. Naturalist 102, 283 (1968); J. Maynard Smith, Nature 219, 1114 (1968).
46. J. B. S. Haldane, Proc. Cambridge Phil. Soc. 23, 838 (1927); M. Kimura, J. Appl. Probability 1, 177 (1964).
47. G. L. Stebbins, Processes of Organic Evolution (Prentice-Hall, Englewood Cliffs, N.J., 1966).
48. V. M. Sarich and A. C. Wilson, Proc. Nat. Acad. Sci. U.S. 58, 142 (1967).
49. A. L. MacKay, Nature 216, 159 (1967).
50. M. O. Dayhoff and R. V. Eck, Atlas of Protein Sequence and Structure 1967-1968 (National Biomedical Research Foundation, Silver Spring, Md., 1968).
51. H. Subak-Sharpe, W. M. Shepherd, J. Hay, Cold Spring Harbor Symp. Quant. Biol. 31, 583 (1966).
52. H. Subak-Sharpe, R. R. Burk, L. V. Crawford, J. M. Morrison, J. Hay, H. M. Keir, ibid., p. 737.
53. J. Josse, A. D. Kaiser, A. Kornberg, J. Biol. Chem. 236, 861 (1961).
54. M. N. Swartz, T. A. Trautner, A. Kornberg, ibid. 237, 1961 (1962).
55. We thank Dr. Motoo Kimura for suggestions and comments. The work discussed here was done with support from the U.S. Atomic Energy Commission and from the National Aeronautics and Space Administration (grant NGR 05-003-020 to the University of California, Berkeley).
Birth Control for Economic Development

Reducing human fertility can raise per capita income in less-developed countries.

Stephen Enke

The author is manager of economic development programs at TEMPO, General Electric's Center for Advanced Studies, Santa Barbara, California.

There is a growing interest in the possibilities of lowering birth rates in order to raise per capita incomes in many of the less-developed countries. Described below is one economic-demographic method of assessing what reduced human fertility might contribute to increased economic development. Justifications of government programs to increase voluntary contraception are also considered (1).

In less-developed countries, one-half or more of annual increases in national output is being "swallowed" by annual increases in population, with income per head rising very slowly. Most of these countries have natural increases of from 2 percent to 3 percent a year. Hence they are doubling their populations every 35 to 23 years. This results not from rising birthrates but from falling death rates during the past 25 to 40 years, mostly attributable to improved health measures.

Some of their governments have decided that they cannot afford to wait for a spontaneous decline in fertility, resulting perhaps from more education, greater urbanization, and improved living. Instead, a few governments are encouraging voluntary use of contraceptives. The objective is economic development.

Many questions remain. How effective in raising incomes per head is reducing fertility as compared with other investments of resources? Could and should governments of less-developed countries encourage voluntary contraception?

Income per Head

One measure of successful economic development is a rising income (output) per head of population (2). It is ordinarily associated with other indicators of increasing welfare such as greater annual investment. Another measure is fewer people living in poverty.

Income (output) per head is a ratio. Governments have sought to raise this ratio by increasing its numerator (investing in factories, dams, highways, and the like) in order to increase the annual national output of goods and services. However, where politically feasible, governments can also raise the ratio of output per head by decreasing the denominator. A comparison of economic effectiveness can be made of changing the denominator as well as the numerator.

In a very simple arithmetic calculation, an imaginary less-developed country may be expected, in 1980, to have a national output (V) of $2500 million and a population (P) of 12.5 million for a yearly output per head (V/P) of $200. The government may decide to spend an extra $2.5 million a year for 10 years starting in 1970 to raise V/P. It can use these funds to increase output (ΔV) or to decrease population (ΔP) from what they would otherwise be (3). If the significant rate of return on traditional investments is 10 percent annually, an investment of $25 million from 1970 to 1980 will yield a ΔV in 1980 of $2.5 million, so that ΔV/V is 0.1 percent, or 1 in 1000.

Alternatively, the $2.5 million per year might have been spent on birth control. If the annual cost of an adult practicing contraception is $5 (4) and the annual fertility of contraceptive users is otherwise typically 0.25 live births, then in 1980 the population (12.5 million) would be 1.25 million smaller than expected. Thus ΔP/P is 10 percent or 1 in 10.

Apparently the amount of money spent each year on birth control can be 100 times more effective in raising output per head than the amount of money spent each year on traditional productive investments, for VΔP/PΔV here equals 100. Had the rate of return on investments been 20 percent annually instead of 10 percent, had the annual cost of birth control been $10 instead of $5, or had the otherwise fertility of "contraceptors" (5) been 0.125 instead of 0.25, this superior effectiveness ratio would have been 50 to 1 instead of 100 to 1. Had all three parameters been altered by a factor of two to weaken the argument, the expenditures on birth control would still appear 12.5 times more effective. The explanation is that it costs fewer

SCIENCE, VOL. 164 798