
4272–4277 Nucleic Acids Research, 2002, Vol. 30 No. 19
© 2002 Oxford University Press

Synonymous codon usage is subject to selection in thermophilic bacteria
David J. Lynn, Gregory A. C. Singer and Donal A. Hickey*
Department of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON K1N 6N5, Canada

Received May 14, 2002; Revised July 24, 2002; Accepted August 1, 2002

Downloaded from http://nar.oxfordjournals.org/ by guest on June 11, 2014

ABSTRACT

The patterns of synonymous codon usage, both within and among genomes, have been extensively studied over the past two decades. Despite the accumulating evidence that natural selection can shape codon usage, it has not been possible to link a particular pattern of codon usage to a specific external selective force. Here, we have analyzed the patterns of synonymous codon usage in 40 completely sequenced prokaryotic genomes. By combining the genes from several genomes (more than 80 000 genes in all) into a single dataset for this analysis, we were able to investigate variations in codon usage, both within and between genomes. The results show that synonymous codon usage is affected by two major factors: (i) the overall G+C content of the genome and (ii) growth at high temperature. This study focused on the relationship between synonymous codon usage and the ability to grow at high temperature. We have been able to eliminate both phylogenetic history and lateral gene transfer as possible explanations for the characteristic pattern of codon usage among the thermophiles. Thus, these results demonstrate a clear link between a particular pattern of codon usage and an external selective force.

INTRODUCTION

The 20 amino acids that commonly occur in proteins are encoded by 61 different codons. This redundancy in the genetic code means that several 'synonymous' codons may encode the same amino acid. Consequently, one might argue that mutational changes affecting these codons would not be subject to natural selection, since the encoded protein sequence would be unaffected by such changes. A large body of indirect molecular evidence has accumulated, however, against such a simple assumption. First, it has been shown that different genomes each have their own characteristic patterns of synonymous codon usage (1). Secondly, and more convincingly, it has been shown that within genomes, highly expressed genes have shifted their codon usage toward a more restricted set of 'preferred' synonymous codons than other, less highly expressed genes (2–8). In many cases, it has been shown that codon usage mirrors the distribution of tRNA abundances (2–4), indicating that the 'preferred' codons are those that tend to match the more abundant anticodons. This correlation between the abundance of codons and their matching anticodons suggests that relative tRNA abundance is the selective force that determines synonymous codon usage (2–4). Although the relative tRNA abundances may well be the short-term determinants of codon usage, it has been suggested that over the course of long-term evolutionary change the tRNA abundances themselves may also evolve to match the genomic patterns of codon and nucleotide frequencies (9). In other words, it is not clear if, in the long term, the codon usage pattern is selected to match the relative abundances of the isoaccepting tRNAs or vice versa. In any case, there is strong evidence for a co-adaptation of the relative frequencies of codons and their respective anticodons within a genome. Despite this evidence, however, we cannot explain why a particular codon–anticodon combination might have a selective advantage over alternative synonymous codon–anticodon pairs that are also perfectly matched. Thus, although we have ample indirect evidence that a particular pattern of synonymous codon usage has biological significance, it is not as clear why that particular pattern is favored by selection in a given genome. Despite the accumulation of data on non-random patterns of synonymous codon usage, both between and within genomes, it has been difficult to identify an external selective force acting on synonymous codon usage.

Here, we present evidence that the pattern of synonymous codon usage within thermophilic prokaryotes is different from that within the mesophilic prokaryotes and that this difference is the result of natural selection linked to thermophily. Moreover, we show that this phenomenon affects all of the genes within the genome, that the pattern cannot be explained by a simple accident of phylogenetic history and that it is not due to horizontal gene transfer between mesophiles and thermophiles. This result indicates that natural selection acting through external environmental factors can indeed shape the genomic pattern of synonymous codon usage.

MATERIALS AND METHODS

We analyzed the patterns of synonymous codon usage in a total of 40 completely sequenced bacterial genomes (listed in Table 1). This set of genomes includes 32 eubacteria and eight archaea. Although the majority of the eubacterial species are mesophiles and the majority of the archaea are thermophiles,

*To whom correspondence should be addressed. Tel: +1 613 562 5800; Fax: +1 613 562 5744; Email: dhickey@uottawa.ca

Table 1. Total genomic G+C contents and optimal growth temperatures for the 40 genomes analyzed in our study



Note that the variation in G+C contents is only weakly correlated with phylogenetic relatedness. Although the
majority of thermophilic organisms are Archaea, there are two exceptions (A.aeolicus and T.maritima). In addition,
there is one mesophilic archaeal species in our dataset (Halobacterium sp.).

the list does include two eubacterial thermophiles (Aquifex aeolicus and Thermotoga maritima) and one mesophilic archaeal species (Halobacterium sp.). These three genomes have enabled us to distinguish between the effects of environmental selection and phylogenetic history.

In our analysis, we combined the genes from all 40 genomes (a total of 83 985 coding sequences) and calculated the relative synonymous codon usage (10) for each gene. We used correspondence analysis (11) to characterize the patterns of codon usage among this large set of genes and to map this pattern onto the distribution of codons on which the pattern is based (Fig. 1A and B). Correspondence analysis was carried out using the program CodonW1.4.2 (J. Peden, 2000; http://www.molbiol.ox.ac.uk/cu/). This 'transgenomic' analysis allowed us to gain information on both the intra-genomic and inter-genomic patterns of codon usage simultaneously.
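As a rough illustration of the relative synonymous codon usage (RSCU) calculation mentioned above, the following Python sketch uses the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the use of the standard genetic code, and the choice to return NaN for amino acids absent from the gene are assumptions made for this example; it is not the CodonW implementation.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one coding sequence (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for fam in families.values():
        if len(fam) < 2:  # Met and Trp have no synonymous alternatives
            continue
        total = sum(counts[c] for c in fam)
        for c in fam:
            # observed / expected under equal usage = x * n / total
            values[c] = counts[c] * len(fam) / total if total else float("nan")
    return values


example = rscu("GCTGCTGCCGCA")  # four alanine codons
print(example["GCT"], example["GCC"], example["GCA"], example["GCG"])
```

For this toy sequence of four alanine codons, GCT (used twice) has an RSCU of 2.0, GCC and GCA each have 1.0, and the unused GCG has 0.0.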

Moreover, it allowed us to directly compare the magnitude of the within-genome and between-genome variations in codon usage.

Figure 1. Correspondence analysis of the relative synonymous codon usage in 83 985 genes from 40 bacterial genomes (see Table 1). (A) Genes from thermophilic bacteria are shown in red while those from mesophilic bacteria are colored blue. For each of these two sub-groups, means and standard deviations are shown. Note that there is extensive overlap between thermophilic and mesophilic genes along the horizontal axis, whereas there is relatively little overlap between the two groups along the vertical axis. The difference between mesophiles and thermophiles along this latter axis is highly significant (P < 0.0001). (B) The distribution of synonymous codons along the first and second axes of the correspondence analysis. Here, we see that the GC-ending codons (shown in red) cluster to the right and the AT-ending codons (shown in green) cluster to the left.

Figure 2. Variation in codon usage within and between genomes. Genes shown are identified by genome and the means (±99.99% confidence intervals) for each genome are shown (the abbreviations for each organism are shown in Table 1). From this plot it is clear that the among-genome variation far exceeds the within-genome variation along both axes. Amongst the thermophiles we have circled the two eubacterial genomes (T.maritima and A.aeolicus). We have also circled the single mesophilic archaeal species (Halobacterium sp.).

RESULTS

The genes from all 40 genomes were combined for the correspondence analysis of relative synonymous codon usage. Although all of the genes were combined, each gene could be identified in the output. Thus, we could, a posteriori, identify genes by genome or by type. For instance, in Figure 1, genes are identified based on whether they came from thermophilic or mesophilic species. Figure 1A shows the distribution of all of the genes on the first two axes of inertia of the correspondence analysis. Genes from thermophiles are shown in red, whereas those from mesophiles are shown in blue. Both mesophilic and thermophilic genes show a broad distribution along the horizontal axis (the first axis of inertia). It is clear from Figure 1A, however, that these groups of genes are significantly different with respect to their position along the vertical axis (the second axis of inertia). By looking at the corresponding distribution of codons (Fig. 1B), we see that the first axis of inertia is due to the separation of codons ending in A or T (shown in green) from codons ending in G or C (shown in red). Thus, the separation of genes along the horizontal axis is highly correlated with the overall G+C content of the genome to which they belong (Table 1). This can be seen more clearly in Figure 2, where we have grouped the genes by genome. Genes from GC-rich genomes, such as Mycobacterium tuberculosis and Pseudomonas aeruginosa, cluster to the right of Figure 2, whereas genes from the AT-rich species, such as Methanococcus jannaschii and Borrelia burgdorferi, appear on the far left. Species with an intermediate G+C content, such as Escherichia coli and T.maritima, appear near the middle of the distribution.

While variations in genomic G+C content explain most of the variation along the first axis of inertia, simple changes in nucleotide content do not explain the separation of the thermophilic and mesophilic genomes on the vertical axis. In Figure 2, it can be seen that all of the thermophilic genomes, including the two eubacterial thermophiles, are clearly separated along the second axis of inertia (vertical axis, Fig. 2). Moreover, we can quantify this effect by comparing the position of each species on the second axis of inertia with its optimal growth temperature (see Table 1). The results of our regression analysis showed that this relationship is highly statistically significant (P << 0.00001). By examining the distribution of codons in Figure 1B, we can see that the major contributors to this pattern are the arginine (AGR and CGN) and isoleucine (ATH) codons, although many other codon groups also contribute to the separation between the thermophiles and the mesophiles (see Discussion below).

Figure 3. Evidence for selection on synonymous codon usage. Synonymous codon bias in highly expressed genes. Each arrow represents one genome, with the base of the arrow at the mean position for the whole genome and the arrowhead ending at the mean for the ribosomal genes within that genome. Note that the arrows tend to point away from the division between the thermophiles and the mesophiles; in other words, the ribosomal protein genes have a greater degree of synonymous codon bias than the genomes as a whole.

Figure 4. Summary of phylogenetic analyses based on the concatenated sequences of 10 ribosomal protein genes from each of the 40 genomes used in this study. Two possible phylogenetic groupings were compared. The grouping shown on the left side of the figure represents the accepted organismal tree. The tips of the branches are color coded (red for thermophiles and blue for mesophiles); it shows that both thermophiles and mesophiles are polyphyletic. The alternative hypothesis (shown on the right) describes the horizontal gene transfer hypothesis, whereby the thermophilic Eubacteria (A.aeolicus and T.maritima) would have acquired portions of their genomes, including the ribosomal protein genes, from thermophilic Archaea. In this case, the tree based on the ribosomal protein sequences would indicate that all thermophilic species would be monophyletic (again, as indicated by the color coding on the tips of the branches). Our results, using both distance-based and maximum likelihood methods of phylogenetic reconstruction (J. Felsenstein, PHYLIP v.3.6a2.1; http://evolution.genetics.washington.edu/phylip.html), greatly favor the first grouping (left) over the second (right), with 100% bootstrapping support.

Since the difference in synonymous codon usage between mesophiles and thermophiles is not due to a simple difference in the nucleotide content of the genomes, we investigated the possibility that it might be due to natural selection. To date, the best evidence for selection acting on codon usage among prokaryotes comes from the work of Ikemura and his colleagues, who demonstrated that highly expressed genes tend to have significantly different codon frequencies than other genes in the same genomes (2,3). Selection for optimal codon usage is not, however, the only evolutionary force acting on these genes. As stated by Gouy and Gautier (12), for each gene within the genome there is a balance between selection for optimal codons and other evolutionary forces such as mutation and genetic drift. These other forces are expected to affect all genes equally, whereas there is a predicted correlation between the strength of selection and the level of expression of each gene. Thus, although all genes are subject to some degree of selection, it is only among the most highly expressed genes that selection is strong enough to constitute the dominant evolutionary force (12). This, in turn, leads to a testable hypothesis: it has been proposed that if selection is the underlying cause of synonymous codon usage bias, then the bias should be more pronounced in the highly expressed genes than in the rest of the genome (13). To test this prediction, we compared the average codon usage of all genes within a genome with the average for the ribosomal protein genes from the same genome (Fig. 3). Among the thermophiles, the highly expressed ribosomal protein genes had a more extreme value on the second axis of inertia (the vertical axis) for all nine species. The same was true for a majority of the mesophilic genomes as well. These trends were statistically highly significant (P = 1 × 10⁻⁷ for the mesophiles and P = 2.6 × 10⁻³ for the thermophiles in paired t-tests). Essentially, the data show that the force responsible for the difference in codon usage between thermophiles and mesophiles acts more strongly upon highly transcribed genes than other genes within the genome.

The fact that the difference in codon usage between thermophiles and mesophiles is more pronounced in the highly expressed genes provides strong evidence for selection. Nevertheless, we wanted to eliminate the possibility that such a pattern was simply due to the fact that most of the thermophiles studied are Archaea rather than Eubacteria. We can eliminate phylogenetic history as an explanation because the two eubacterial thermophiles (T.maritima and A.aeolicus) show a typically thermophilic pattern of codon usage in both their genomes as a whole (Fig. 2) and in their highly expressed genes (Fig. 3). Likewise, the mesophilic archaeal species, Halobacterium, shows a typically mesophilic pattern of codon usage. It should be noted that three 'exceptional' genomes represent more than 1000 individual genes in this analysis. Therefore, we can conclude that the separation in codon usage is between thermophiles and mesophiles, and not between eubacteria and archaea.

One might still argue that the common pattern of codon usage in eubacterial and archaeal thermophiles could be explained by horizontal gene transfer between the archaea and the thermophilic eubacteria (14–16). Indeed, codon usage has been used as an indicator of gene transfer between bacterial lineages (16,17) and there is evidence for such transfers between thermophiles (14,18). We addressed this question in two ways. First, we used the concatenated amino acid sequences of 10 ribosomal genes in both distance-based and maximum likelihood phylogenetic analyses (Fig. 4). We deliberately chose ribosomal proteins since they are included among the class of highly expressed genes that show the most pronounced differences between thermophiles and mesophiles in the patterns of synonymous codon usage (Fig. 3). The results of the phylogenetic test show very clearly that the Halobacterium ribosomal protein sequences, despite their mesophilic pattern of synonymous codon usage, group with

the other archaea based on the amino acid sequences of the encoded proteins. Likewise, the two eubacterial thermophiles group with the Eubacteria. This is consistent with other recent reports based on whole genome analyses of sequence data (19,20). In other words, the synonymous codon usage patterns of these ribosomal protein genes identify them unambiguously as mesophiles or thermophiles, while the amino acid sequences encoded by these same genes group them into a conventional taxonomic arrangement of Eubacteria and Archaea. This provides convincing evidence that the nucleotides at the synonymous sites have undergone convergent evolutionary change and it shows that the nature of this change is directly related to thermophily or mesophily.

In addition to studying the ribosomal protein genes, we wished to do an independent and more general test of the possibility that horizontal gene transfer might be a significant factor in determining the codon usage of the thermophilic Eubacteria. For instance, we wondered if the genes within these genomes might show a bimodal distribution, reflecting the fact that a fraction of their genome had been derived from archaeal thermophiles by horizontal gene transfer. First, we plotted the frequency distributions for the values of the second axis of inertia for all genes from thermophiles and mesophiles (Fig. 5). This figure shows quite dramatically how separate both sets of genes are in terms of their codon usage. Our primary interest, however, was in looking for evidence of bimodality within a single genome, particularly within the genomes of the eubacterial thermophiles. The results show no trace of such bimodality. To illustrate this, we have plotted the distribution of genes for the A.aeolicus genome in Figure 5. As can be seen in Figure 5, the entire gene set for this eubacterial thermophile is unimodal and matches almost exactly with that of the entire set of non-AT-rich thermophile genes. This means that all of the genes within this genome have converged to a single pattern of codon usage regardless of their long-term evolutionary history.

Figure 5. Frequency distributions of synonymous codon usage among thermophilic (red) and the mesophilic genes (blue) along the second axis of inertia. We have also plotted the A.aeolicus gene frequencies as a proportion of all thermophilic genes (shown in green). This genome, despite being eubacterial, shows a unimodal distribution that fits the distribution pattern of the other thermophilic genomes. The 'shoulder' in each curve is due to the AT-rich genomes in each category.

DISCUSSION

By combining the genes from all 40 genomes into a single data set, we were able to make a direct comparison between the intra-genomic and inter-genomic variations in codon usage. These results show very clearly that the inter-genomic differences can be very large relative to the variations between genes within a particular genome. This is illustrated in Figure 2, where we can see that the distribution of values for all genes within a genome is relatively tightly clustered around the mean of that genome. Examination of these results can also give us an insight into how rapidly codon usage patterns may change over the course of evolution. For instance, by exploiting the fact that these 40 species represent a wide range in divergence times, we can ask if codon usage is an evolutionarily conserved character. From Figure 2, it is clear that very closely related species, e.g. different species of Chlamydia, have similar patterns of codon usage. When we consider broader phylogenetic groupings such as the Proteobacteria, however, we see that this clustering of related taxa no longer holds. In fact, P.aeruginosa (Paer) and Buchnera sp. (Buch), both Proteobacteria, are found on opposite extremes of the scale for the first axis of inertia in Figure 2. We also see some dramatic cases of evolutionary convergence along the horizontal axis in Figure 2. For example, the codon usage pattern of the GC-rich archaeal species Halobacterium is very similar to that of the GC-rich gram-positive eubacterium M.tuberculosis and the gram-negative P.aeruginosa. This indicates that codon usage, while stable in the short term, is a labile character over the longer evolutionary term. It is particularly obvious that the codon usage of genes within a genome can 'track' the evolutionary changes in nucleotide content of the entire genome (compare Table 1 and Fig. 2). Given that codon usage is responsive to evolutionary changes in nucleotide composition, it is not surprising that it should also be responsive to other evolutionary pressures, such as the action of temperature-dependent selection.

Essentially, our results show that codon usage among these 40 genomes is determined by two major factors: nucleotide content and optimal growth temperature. Of these two factors, the G+C content of the genome explains more than 25% of the variation between genomes, whereas optimal growth temperature explains a further 10% of the variation. In this analysis, there are more than 50 axes in all, and the remaining variation is spread over a large number of the remaining axes. No other single axis explains even 5% of the variation in codon usage.

While it is clear that the second major factor affecting codon usage on a genome-wide scale is optimal growth temperature, it is not obvious what the nature of the selective force might be. At first glance, it seemed that the difference between thermophiles and mesophiles lay solely in their usage of arginine and isoleucine codons (Fig. 1B). However, when we recalculated the codon frequencies in the absence of these two codon groups, the difference between thermophiles and mesophiles remained. We also re-analyzed the data using 2-fold, 4-fold and 6-fold degenerate codon groups separately. In all cases, there was a difference between the genomes of the thermophiles and the mesophiles. This means that the effect is very pervasive. This pervasiveness, in turn, leads one to

wonder if the selection is for some general property of the mRNAs that is particularly important under conditions of high temperature, rather than for specific codon–anticodon pairings. One possibility is that the process is driven by selection for increased mRNA stability at high temperature, rather than selection for translational efficiency. Increased mRNA stability would result in increased levels of translated protein per mRNA molecule. Thus mRNA stability could be subject to similar selection pressures as translational efficiency. Interestingly, both forms of selection would be more pronounced for highly expressed genes. It has been suggested that thermophilic genomes are purine rich (21) and such a purine preference could affect both mRNA stability and the frequency of synonymous codons within these genomes.

In summary, we have shown that the patterns of synonymous codon usage within a genome can change dramatically during the course of evolution. Our results show that the two major forces affecting the broad patterns of codon usage among prokaryote genomes are (i) the nucleotide composition of the genome and (ii) some form of natural selection linked to optimal growth temperature. It will be of interest to ask if those genomes that have changed their synonymous codon usage in response to these evolutionary forces have undergone a corresponding change in the relative abundances of isoaccepting tRNAs. A second question that merits further study is the biochemical basis of the selective advantage of certain codons under high temperature conditions and, in particular, if such selective forces are related to the selection on non-synonymous sites among thermophiles (22). The main conclusion that can be drawn from the results presented here is that synonymous codon usage patterns can be subject to natural selection and, specifically, that a particular environmental factor such as high temperature can underlie selection for a specific subset of codons in both eubacterial and archaeal lineages.

ACKNOWLEDGEMENTS

This work was supported by a Research Grant from NSERC Canada (D.A.H.) and graduate scholarships from the University of Ottawa (D.J.L.) and NSERC (G.A.C.S.).

REFERENCES

1. Grantham,R., Gautier,C. and Gouy,M. (1980) Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Res., 8, 1893–1912.
2. Ikemura,T. (1981) Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes. J. Mol. Biol., 146, 1–21.
3. Ikemura,T. (1981) Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol., 151, 389–409.
4. Ikemura,T. (1982) Differences in synonymous codon choice patterns of yeast and correlation between the abundance of yeast transfer RNAs and the occurrence of the respective codons in protein genes. Differences in synonymous codon choice patterns of yeast. J. Mol. Biol., 158, 573–597.
5. Shields,D.C. and Sharp,P.M. (1987) Synonymous codon usage in Bacillus subtilis reflects both translational selection and mutational biases. Nucleic Acids Res., 15, 8023–8040.
6. Shields,D.C., Sharp,P.M., Higgins,D.G. and Wright,F. (1988) "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol. Biol. Evol., 5, 704–716.
7. Stenico,M., Lloyd,A.T. and Sharp,P.M. (1994) Codon usage in Caenorhabditis elegans: delineation of translational selection and mutational biases. Nucleic Acids Res., 22, 2437–2446.
8. McInerney,J.O. (1998) Replicational and transcriptional selection on codon usage in Borrelia burgdorferi. Proc. Natl Acad. Sci. USA, 95, 10698–10703.
9. Bulmer,M. (1991) The selection-mutation-drift theory of synonymous codon usage. Genetics, 129, 897–907.
10. Sharp,P.M. and Li,W.-H. (1987) The codon adaptation index: a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res., 15, 1281–1295.
11. Greenacre,M.J. (1984) Theory and Applications of Correspondence Analysis. Academic Press, London, UK.
12. Gouy,M. and Gautier,C. (1982) Codon usage in bacteria: correlation with gene expressivity. Nucleic Acids Res., 10, 7055–7074.
13. Xia,X. (1998) How optimized is the translational machinery in Escherichia coli, Salmonella typhimurium and Saccharomyces cerevisiae? Genetics, 149, 37–44.
14. Aravind,L., Tatusov,R.L., Wolf,Y.I., Walker,D.R. and Koonin,E.V. (1998) Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles. Trends Genet., 14, 442–444.
15. Nelson,K.E., Clayton,R.A., Gill,S.R., Gwinn,M.L., Dodson,R.J., Haft,D.H., Hickey,E.K., Peterson,J.D., Nelson,W.C., Ketchum,K.A. et al. (1999) Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature, 399, 323–329.
16. Kanaya,S., Kinouchi,M., Abe,T., Kudo,Y., Yamada,Y., Nishi,T., Mori,H. and Ikemura,T. (2001) Analysis of codon usage diversity of bacterial genes with a self-organizing map (SOM): characterization of horizontally transferred genes with emphasis on the E. coli O157 genome. Gene, 276, 89–99.
17. Wang,H.C., Badger,J., Kearney,P. and Li,M. (2001) Analysis of codon usage patterns of bacterial genomes using the self-organizing map. Mol. Biol. Evol., 18, 792–800.
18. Ochman,H., Lawrence,J.G. and Groisman,E.A. (2000) Lateral gene transfer and the nature of bacterial innovation. Nature, 405, 299–304.
19. Clarke,G.D., Beiko,R.G., Ragan,M.A. and Charlebois,R.L. (2002) Inferring genome trees by using a filter to eliminate phylogenetically discordant sequences and a distance matrix based on mean normalized BLASTP scores. J. Bacteriol., 184, 2072–2080.
20. House,C.H. and Fitz-Gibbon,S.T. (2002) Using homolog groups to create a whole-genomic tree of free-living organisms: an update. J. Mol. Evol., 54, 539–547.
21. Lao,P.J. and Forsdyke,D.R. (2000) Thermophilic bacteria strictly obey Szybalski's transcription direction rule and politely purine-load RNAs with both adenine and guanine. Genome Res., 10, 228–236.
22. Kreil,D.P. and Ouzounis,C.A. (2001) Identification of thermophilic species by the amino acid compositions deduced from their genomes. Nucleic Acids Res., 29, 1608–1615.
