Professional Documents
Culture Documents
36 De la Rúa, P. et al. (1998) Mitochondrial DNA variability in the Canary 44 Francisco-Ortega, J. et al. (1996) Chloroplast DNA evidence of
Islands honeybees (Apis mellifera L.). Mol. Ecol. 7, 1543–1547 colonization, adaptive radiation, and hybridization in the evolution
37 Widmer, A. et al. (1998) Population genetic structure and colonization of the Macaronesian flora. Proc. Natl. Acad. Sci. U. S. A. 93,
history of Bombus terrestris s.l. (Hymenoptera: Apidae) from the 4085–4090
Canary Islands and Madeira. Heredity 81, 563–572 45 Francisco-Ortega, J. et al. (1996) Isozyme differentiation in the
38 Arnedo, M.A. et al. (1996) Radiation of the genus Dysdera (Araneae, endemic genus Argyranthemum (Asteraceae: Anthemideae)
Haplogynae, Dysderidae) in the Canary Islands: the western islands. in the Macaronesian Islands. Plant Syst. Evol. 202, 137–152
Zool. Scripta 25, 241–274 46 Francisco-Ortega, J. et al. (1997) Molecular evidence for a
39 Avanzati, A.M. et al. (1994) Molecular and morphological Mediterranean origin of the Macaronesian endemic genus
differentiation between steganacarid mites (Acari: Oribatida) from the Argyranthemum (Asteraceae). Am. J. Bot. 84, 1595–1613
Canary Islands. Biol. J. Linn. Soc. 52, 325–340 47 Francisco-Ortega, J. et al. (1997) Origin and evolution of
40 Kim, S.C. et al. (1996) A common origin for woody Sonchus and five Argyranthemum (Asteraceae: Anthemideae) in Macaronesia. In
related genera in the Macaronesian islands: molecular evidence for Molecular Evolution and Adaptive Radiation (Givnish, T.J. and
extensive radiation. Proc. Natl. Acad. Sci. U. S. A. 93, 7743–7748 Sytsma, K.J., eds), pp. 407–431, Cambridge University Press
41 Böhle, U.R. et al. (1996) Island colonization and evolution of the insular 48 Roderick, G.K. and Gillespie, R.G. (1998) Speciation and
woody habit in Echium L. (Boraginaceae). Proc. Natl. Acad. Sci. phylogeography of Hawaiian terrestrial arthropods. Mol. Ecol. 7,
U. S. A. 93, 11740–11745 519–531
42 Francisco-Ortega, J. et al. (1995) Chloroplast DNA evidence for 49 Oromí, P. et al. (1991) The evolution of the hypogean fauna in the
intergeneric relationships of the Macaronesian endemic genus Canary Islands. In The Unity of Evolutionary Biology (Dudley, E.C., ed.)
Argyranthemum (Asteraceae). Syst. Not. 20, 413–422 pp. 380–395, Dioscorides Press
43 Francisco-Ortega, J. et al. (1995) Genetic divergence among 50 Carracedo, J.C. et al. (1998) Hotspot volcanism close to a
Mediterranean and Macaronesian genera of the subtribe passive continental margin: the Canary Islands. Geol. Mag. 135,
Crysanthemidae (Asteraceae). Am. J. Bot. 82, 1321–1328 591–604
M
olecular evolution leaves lysozyme should be evident in a
Tracing the history of molecular changes using
behind a trail of amino acid phylogenetic methods can provide powerful phylogeny of primate lysozyme
substitutions potentially insights into how and why molecules work thesequences, on the lineage leading
rich in information about mol- to colobine monkeys. By inferring
way they do. It is now possible to recreate
ecular function. Tracing changes and comparing ancestral lysozyme
inferred ancestral proteins in the laboratory
in protein structure along the sequences, Messier and Stewart1
and study the function of these molecules.
branches of a phylogenetic tree were able to demonstrate adap-
This provides a unique opportunity to study
can provide important insights the paths and the mechanisms of functionaltive change in this lineage. Specifi-
into molecular function, and the change during molecular evolution. cally, they estimated the numbers
role of selection in shaping the What insights have already emerged from of nonsynonymous (dN) and syn-
relationship between molecular such phylogenetic studies of protein onymous (dS) substitutions along
structure and function. Recreation evolution and function, what are the each branch using Li’s method2,
of inferred ancestral proteins impediments to progress and what are the and found a dN/dS ratio much
using gene synthesis and protein prospects for the future? greater than one in the lineage
expression methods, whose bio- leading to the colobine monkeys
chemical functions can then be (Fig. 1). This analysis also re-
directly measured in vitro, pro- Belinda S.W. Chang and Michael J. Donoghue are vealed a previously unsuspected
vides a powerful approach to this at the Dept of Organismic and Evolutionary Biology, episode of positive selection in
Harvard University Herbaria, 22 Divinity Avenue,
problem. Cambridge, MA 02138, USA
the ancestral hominoid lineage.
(chang24@oeb.harvard.edu; Subsequently, Yang3 devel-
Phylogenies, molecular mdonoghue@oeb.harvard.edu). oped a more rigorous statistical
function and natural selection approach to this problem, using
Phylogenies can be used in several a codon-based maximum likeli-
ways to infer the effects of natural hood model of evolution4 to de-
selection on molecular function. tect elevated dN/dS ratios along
Because directional selection is known to elevate the ratio lineages in a phylogeny. Yang’s method3 uses likelihood
of nonsynonymous to synonymous nucleotide substitu- ratio tests to compare the performance of different likeli-
tions, this ratio can be used as a tool to detect lineages in a hood models, and to determine if the dN/dS ratio for the lin-
molecular phylogeny along which selection has occurred. eage of interest is elevated compared with other lineages
Cows and langur monkeys convergently evolved foregut in the phylogeny. This approach avoids using reconstructed
fermentation as a mechanism to digest the large amounts ancestral states as if these were actual observations in the
of plant material in their diets; a key component of this calculations of nonsynonymous and synonymous substi-
involved the recruitment of the lysozyme enzyme to digest tution rates. When applied to the primate lysozyme data,
foregut bacteria. Selection for this specialized function of this method showed strong evidence for positive selection
TREE vol. 15, no. 3 March 2000 0169-5347/00/$ – see front matter © 2000 Elsevier Science Ltd. All rights reserved. PII: S0169-5347(99)01778-4 109
REVIEWS
monkeys
Colobine
lineage Dusky langur
sifying selection during the course of HIV infection. Using
4.7 Francois' langur their likelihood approach, they found strong support for
Co Proboscis monkey variable selection intensity among amino acid sites, and
Guereza colobus
Angolan colobus
identified positively selected sites not only in the V3 region
but also in the flanking regions of the external glycoprotein
OW Patas monkey (gp120). Their analysis indicated that selection acted on a
Cercopithecine
Vervet broader region of the envelope gene than previously imag-
monkeys
Talapoin
Allen's monkey ined, and that most of the variability in the hypervariable
Ce Olive baboon V3 region appeared to be neutral.
Ca Sooty mangabey The likelihood approach developed by Nielsen and
Rhesus macaque
Yang5 has recently been used in identifying specific sites in
plant chitinases that have been subject to selection. In this
5.1 Lar gibbon
case, diversifying selection has been implicated in connec-
Hominoids
Ho Gorilla
Human tion with defense against fungal pathogens. Using this ap-
Chimpanzee proach5 on a phylogeny of crucifers related to Arabidopsis,
Ancestral Bonobo
Orangutan
Bishop and colleagues6 were able to identify residues
hominoid
lineage within the active site cleft with elevated dN/dS ratios, thus
Trends in Ecology & Evolution
implying positive selection. They suggest that this selec-
Fig. 1. The phylogeny of primate lysozyme sequences, showing episodes tion at the active site results from changes in pathogen
of inferred adaptive evolution along the lineages leading to colobine mon- defenses against plant chitinolytic activity.
keys and hominoids. d N /d S ratios of lineages significantly higher than one Finally, in addition to determining when in the course of
are shown above the thicker lines. Nodes representing the ancestors of
extant primate groups of interest are labeled as follows: Co, colobines; Ce, molecular evolution selection might have acted, and where
cercopithecines; Ho, hominoids; OW, Old World monkeys; Ca, catarrhines. in the protein this might have occurred, one would like to
Ancestral nucleotide sequences were reconstructed using maximum likeli- establish with what functional changes specific amino acid
hood, and pairwise d N /d S ratios were calculated using Li’s method2. Modi- substitutions were associated? These structure–function
fied, with permission, from Ref. 1.
questions can be addressed using phylogeny-based com-
parative methods, but such approaches have received sur-
prisingly little attention in this context. An analysis of the
in the hominoid lineage; however, although higher than the evolution of opsin pigments provides an example of the
ratio calculated over all other lineages, the elevated dN/dS insights that can be gained in this way7.
ratio in the colobine lineage was not significantly greater Visual pigments, which mediate vision in the eye, oc-
than one. cur with a diversity of wavelength sensitivities that re-
These studies1–4 attempt to detect selection along par- flect the visual needs of an animal in its particular environ-
ticular lineages in a phylogeny. Such statistical analyses ment. Using a comparative phylogenetic approach, Chang
can establish when in the evolution of a protein selection and colleagues7 were able to correlate specific amino
might have acted, but not where in the protein it might acid substitutions with changes in visual pigment spectral
have caused amino acid changes. Nielsen and Yang5 devel- sensitivities. This involved reconstructing amino acid
oped a codon-based likelihood method to address the changes at particular sites in the protein and looking for
problem of detecting positively selected amino acid sites those amino acid changes that were repeatedly associ-
in a protein. Because positive selection should increase ated with shifts in sensitivity towards shorter wavelengths
the number of nonsynonymous substitutions relative to (Fig. 2). Four sites were identified as showing evidence
synonymous substitutions at a site, they developed likeli- of convergently evolved replacements. At these sites,
hood models that incorporated variable dN/dS ratios across changes from nonpolar to polar amino acid residues were
sites (Box 1). They applied this method to HIV-1 envelope repeatedly associated with blue shifts in opsin wavelength
sensitivity. Moreover, the location of these amino acid
residues within the chromophore-binding pocket of the
opsin protein suggested a mechanism for these blue shifts,
Box 1. Likelihood models for detecting positively selected
amino acid sites in a protein involving stabilization of the ground state chromophore
through interactions of polar side chains with its proto-
Positively selected amino acid sites are usually identified by an elevated nated Schiff base moiety. This hypothesis has subse-
ratio of nonsynonymous-to-synonymous substitutions (d N / d S or v), with a
ratio significantly greater than one implying strong evidence of selection at quently been confirmed in laboratory experiments using in
that site44. This is in contrast to sites that are highly conserved owing to vitro expression techniques and resonance Raman spec-
functional constraints and thus have few nonsynonymous substitutions, troscopy8. This kind of approach, combining comparative
and to sites evolving neutrally, which would tend towards more equal ratios sequencing and mutagenesis techniques, is increasingly
of nonsynonymous and synonymous substitutions. In their codon-based
maximum likelihood model, Nielsen and Yang5 allow for three categories popular for identifying sites in the protein that affect opsin
of amino acid sites: sites that are conserved (v 5 0), sites evolving neutrally wavelength sensitivity9,10.
(v 5 1) and sites that are the targets of positive selection (v .1). The
numerical value of v for the category of positively selected sites is opti- Recreating ancestral proteins
mized in the likelihood analysis, which indicates the strength of selection Another approach that has become feasible in recent years
for this category of sites. The accuracy of the inferred assignment of amino
acid sites can be assessed using an empirical Bayesian approach to calcu- provides an opportunity to examine the function of in-
late the posterior probabilities of a site belonging to a particular category. ferred ancestral proteins directly in the laboratory. In
This provides a measure of confidence that a particular amino acid site was the studies reviewed here, ancestral gene sequences have
the target of selection. first been inferred, then synthesized, incorporated into an
expression vector and expressed in cell culture. The resulting
9 Fasick, J.I. et al. (1998) The visual pigments of the bottlenose dolphin 30 Kimura, M. (1980) A simple method for estimating evolutionary rate of
(Tursiops truncatus). Vis. Neurosci. 15, 643–651 base substitutions through comparative studies of nucleotide
10 Yokoyama, S. et al. (1999) Adaptive evolution of color vision of the sequences. J. Mol. Evol. 16, 111–120
Comoran coelacanth (Latimera chalumnae). Proc. Natl. Acad. Sci. 31 Yang, Z. (1994) Maximum likelihood phylogenetic estimation from
U. S. A. 96, 6279–6284 DNA sequences with variable rates over sites: approximate methods.
11 Adey, N.B. et al. (1994) Molecular resurrection of an extinct ancestral J. Mol. Evol. 39, 306–314
promoter for mouse L1. Proc. Natl. Acad. Sci. U. S. A. 91, 1569–1573 32 Yang, Z. (1994) Estimating the pattern of nucleotide substitution.
12 Jermann, T.M. et al. (1995) Reconstructing the evolutionary history of J. Mol. Evol. 39, 105–111
the artiodactyl ribonuclease superfamily. Nature 374, 57–59 33 Galtier, N. and Gouy, M. (1998) Inferring pattern and process: maximum-
13 Schluter, D. (1995) Uncertainty in ancient phylogenies. Nature 377, likelihood implementation of a nonhomogeneous model of DNA
108–110 sequence evolution for phylogenetic analysis. Mol. Biol. Evol. 15, 871–879
14 Chandrasekharan, U.M. et al. (1996) Angiotensin II – forming activity in 34 Bishop, M.J. and Friday, A.E. (1985) Evolutionary trees from nucleic
a reconstructed ancestral chymase. Science 271, 502–505 acid and protein sequences. Proc. R. Soc. London B Biol. Sci. 226,
15 Dean, A.M. and Golding, G.B. (1997) Protein engineering reveals 271–302
ancient adaptive replacements in isocitrate dehydrogenase. Proc. Natl. 35 Hasegawa, M. and Fujiwara, M. (1993) Relative efficiencies of the
Acad. Sci. U. S. A. 94, 3104–3109 maximum likelihood, maximum parsimony, and neighbor-joining
16 Swofford, D.L. et al. (1996) Phylogenetic inference. In Molecular methods for estimating protein phylogeny. Mol. Phylog. Evol. 2, 1–5
Systematics (2nd edn) (Hillis, D.M. et al., eds), pp. 407–514, Sinauer 36 Yang, Z. (1997) PAML: a program package for phylogenetic analysis by
17 Felsenstein, J. (1988) Phylogenies from molecular sequences: maximum likelihood. Comput. Appl. Biosci. 13, 555–556
inference and reliability. Annu. Rev. Genet. 22, 521–565 37 Dayhoff, M.O. (1978) A model of evolutionary change in proteins.
18 Schluter, D. et al. (1997) Likelihood of ancestor states in adaptive Matrices for detecting distant relationships. In Atlas of Protein
radiation. Evolution 51, 1699–1712 Sequence and Structure (Dayhoff, M.O., ed.), pp. 345–358, National
19 Cunningham, C.W. et al. (1998) Reconstructing ancestral character Biomedical Research Foundation
states: a critical reappraisal. Trends Ecol. Evol. 13, 361–366 38 Kishino, H. et al. (1990) Maximum likelihood inference of protein
20 Zhang, J. and Nei, M. (1997) Accuracies of ancestral amino acid phylogeny and the origin of chloroplasts. J. Mol. Evol. 31, 151–160
sequences inferred by the parsimony, likelihood, and distance 39 Jones, D.T. et al. (1992) The rapid generation of mutation data
methods. J. Mol. Evol. 44, S139–S146 matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282
21 Donoghue, M.J. and Ackerly, D.D. (1996) Phylogenetic uncertainties 40 Cao, Y. et al. (1994) Phylogenetic place of guinea pigs: no support of
and sensitivity analyses in comparative biology. Philos. Trans. R. Soc. the rodent-polyphyly hypothesis from maximum-likelihood analyses
London Ser. B 351, 1241–1249 of multiple protein sequences. Mol. Biol. Evol. 11, 593–604
22 Maddison, W.P. (1995) Calculating the probability distributions of 41 Adachi, J. and Hasegawa, M. (1996) Model of amino acid substitution
ancestral states reconstructed by parsimony on phylogenetic trees. in proteins encoded by mitochrondrial DNA. J. Mol. Evol. 42, 459–468
Syst. Biol. 44, 474–481 42 Muse, S.V. and Gaut, B.S. (1994) A likelihood approach for comparing
23 Lewis, P.O. (1998) Maximum likelihood as an alternative to parsimony synonymous and nonsynonymous nucleotide substitution rates, with
for inferring phylogeny using nucleotide sequence data. In Molecular application to the chloroplast genome. Mol. Biol. Evol. 11, 715–724
Systematics of Plants II: DNA Sequencing (Soltis, P.S. et al., eds), 43 Yang, Z. et al. (1998) Models of amino acid substitution and
pp. 132–163, Kluwer applications to mitochondrial protein evolution. Mol. Biol. Evol. 15,
24 Yang, Z. (1996) Phylogenetic analysis using parsimony and likelihood 1600–1611
methods. J. Mol. Evol. 42, 294–307 44 Nei, M. (1987) Molecular Evolutionary Genetics, Columbia University
25 Huelsenbeck, J.P. (1997) Is the Felsenstein zone a fly trap? Syst. Biol. Press
46, 69–74 45 Huelsenbeck, J.P. (1998) Systematic bias in phylogenetic analysis: is
26 Felsenstein, J. (1978) Cases in which parsimony or compatibility the Strepsiptera problem solved? Syst. Biol. 47, 519–537
methods will be positively misleading. Syst. Zool. 27, 401–410 46 Huelsenbeck, J.P. and Rannala, B. (1997) Phylogenetic methods come
27 Yang, Z. et al. (1995) A new method of inference of ancestral of age: testing hypotheses in an evolutionary context. Science 276,
nucleotide and amino acid sequences. Genetics 141, 1641–1650 227–232
28 Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a 47 Goldman, N. (1993) Statistical tests of models of DNA substitution.
maximum likelihood approach. J. Mol. Evol. 17, 368–376 J. Mol. Evol. 36, 345–361
29 Jukes, T.H. and Cantor, C.R. (1969) Evolution of protein molecules. In 48 Navidi, W.C. et al. (1991) Methods for inferring phylogenies from
Mammalian Protein Metabolism (Munro, H.N., ed.), pp. 21–132, nucleic acid sequence data by using maximum likelihood and linear
Academic Press invariants. Mol. Biol. Evol. 8, 128–143