Protein-Coding Genes
J. J. Wernegreen and N. A. Moran
Department of Ecology and Evolutionary Biology, University of Arizona
Introduction
The rate of fixation of mutations with fitness consequences depends not only on the strength of selection
for or against them but also on the effectiveness of such
selection as influenced by effective population size. In
populations with low rates of recombination and small
effective sizes, slightly deleterious mutations may experience increased rates of fixation through drift (Ohta
1973). This predicted relationship between population
structure and rate of fixation of slightly deleterious mutations can be tested among prokaryotes. Free-living
bacteria are thought to have large effective population
sizes (Selander, Caugant, and Whittam 1987), and even
clonal groups experience recombination that is important in their evolutionary dynamics (Maynard Smith,
Dowson, and Spratt 1991; Dykhuizen and Green 1993;
Maynard Smith et al. 1993).
In contrast, endosymbiotic bacteria associated with
several insect groups have relatively small effective population sizes and have restricted opportunities for interstrain recombination because of their mode of transmission. Bacteria associated with specialized insect cells
(i.e., mycetocytes) are maternally transmitted by the infection of ovaries or of internally developing embryos
(reviewed in Buchner 1965; Moran and Baumann 1994;
Baumann et al. 1995). The effective population size of
the bacteria is reduced by the bottleneck at each inoculation of progeny, where relatively few bacteria are
Abbreviations: CAI = Codon Adaptation Index; Nc = effective number of codons; GC3 = percent G+C content at third-codon positions.
Key words: Buchnera, endosymbionts, codon bias, drift, population
size.
Address for correspondence and reprints: Jennifer Wernegreen, Department of Ecology and Evolutionary Biology, University of Arizona,
Biological Sciences West, Room 310, Tucson, Arizona 85721. E-mail:
werjen@u.arizona.edu.
Mol. Biol. Evol. 16(1):83-97. 1999
© 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
transmitted (Hinde 1971; A. Mira, personal communication). In addition, modeling indicates that insect host
population sizes may be the primary determinant of the
effective population size of intracellular genomes (C.
Rispe, personal communication). Insect population sizes, while relatively large among animals, are much
smaller than those of free-living bacteria (reviewed in
Lambert and Moran 1998). Finally, any lateral gene
transfer among endosymbionts would be confined to the
bacterial genotypes present in the same host individual,
and the tight bottleneck at transmission implies that
these would be similar or identical. Buchnera, the endosymbionts of aphids, are particularly well characterized, and the perfect congruence between symbiont and
host phylogenies supports anatomical evidence for their
stable, vertical inheritance (Munson et al. 1991; Moran
and Baumann 1994). The goals of this study are to explore the effects of this strict asexuality and small population size on sequence evolution in Buchnera and to
test the hypothesis that Buchnera lineages experience
increased rates of substitution of slightly deleterious mutations.
Codon Bias
The use of alternative codons may be shaped by
biases in mutation rates among the four bases (Sueoka
1961; Muto and Osawa 1987), by selection for the use
of optimal codons to maximize rates and efficiency of
translation (Ikemura 1981, 1985), or by a combination
of these processes. Studies of codon usage often attempt
to distinguish the relative importance of genome nucleotide composition and selection for translational efficiency by testing alternative predictions of these models.
In cases where patterns of codon usage largely reflect
mutational pressure and drift rather than translational selection, codon bias is expected to correspond with local
base compositional biases. This pattern characterizes genomes of vertebrates and some bacterial genomes with
Buchnera, the bacterial endosymbionts of aphids, undergo severe population bottlenecks during maternal transmission through their hosts. Previous studies suggest an increased effect of drift within these strictly asexual, small
populations, resulting in an increased fixation of slightly deleterious mutations. This study further explores sequence
evolution in Buchnera using three approaches. First, patterns of codon usage were compared across several homologous Escherichia coli and Buchnera loci, in order to test the prediction that selection for the use of optimal
codons is less effective in small populations. A χ²-based measure of codon bias was developed to adjust for the
overall A+T richness of silent positions in the endosymbionts. In contrast to E. coli homologues, adaptive codon
bias across Buchnera loci is markedly low, and patterns of codon usage lack a strong relationship with gene
expression level. These data suggest that codon usage in Buchnera has been shaped largely by mutational pressure
and drift rather than by selection for translational efficiency. One exception to the overall lack of bias is groEL,
which is known to be constitutively overexpressed in Buchnera and other endosymbionts. Second, relative-rate tests
show elevated rates of sequence evolution of numerous protein-coding loci across Buchnera, compared to E. coli.
Finally, consistently higher ratios of nonsynonymous to synonymous substitutions in Buchnera loci relative to the
enteric bacteria strongly suggest the accumulation of nonsynonymous substitutions in endosymbiont lineages. Combined, these results suggest a decreased effectiveness of purifying selection in purging endosymbiont populations
of slightly deleterious mutations, particularly those affecting codon usage and amino acid identity.
sample of Buchnera and E. coli homologues are included to represent a wide range of gene expression levels,
and the A+T richness of the Buchnera genome (Ishikawa 1987) is considered in testing for codon bias.
Previous studies of codon usage in A+T- and
G+C-rich genomes highlight methods for assessing codon usage in genomes with strong mutational biases
(Shields and Sharp 1987; Ohtaka, Nakamura, and Ishikawa 1992; Wright and Bibb 1992; Ohtaka and Ishikawa
1993; Andersson and Sharp 1996). For example, the effective number of codons, Nc, is reduced by preferences
for particular codons or biased base composition. In order to test the null hypothesis that codons are used randomly except for the influence of local mutational bias,
expected values of Nc may be adjusted to account for
local base composition. In the A+T-rich Rickettsia genome, Nc-plots show an agreement between observed
Nc values and those expected, given the GC3, indicating
that codon usage reflects local base composition and
may therefore be attributed largely to mutational bias
(Andersson and Sharp 1996). Likewise, similar levels of
codon bias across Rickettsia genes with very different
expression levels indicate that mutational bias has a
stronger effect than translational selection. In other taxa,
the combined effects of mutational bias and translational
selection are apparent. Across several Streptomyces loci,
a strong effect of mutational bias is suggested by the
correspondence of GC3 and Nc and by a correlation between the GC3 of a locus and the locus position along
the major axis in correspondence analysis of codon usage (Wright and Bibb 1992). A slight effect of translational selection on the highly expressed Streptomyces tuf
gene is supported by the relatively low Nc for this locus,
its clear distinction from other loci in correspondence
analysis, and the fact that, apparently, preferred codons
in tuf are also preferred by another G+C-rich bacterium, Micrococcus luteus (Wright and Bibb 1992). This
combination of mutational bias and translational selection is also apparent for other genomes with mutational
biases, such as Micrococcus luteus (Ohtaka, Nakamura,
and Ishikawa 1992; Ohtaka and Ishikawa 1993), Dictyostelium discoideum (Sharp and Devine 1989), and
Bacillus subtilis (Shields and Sharp 1987). Organelle genomes may also show strong nucleotide biases. The relative importance of selection and genome composition
in shaping codon usage of several A+T-biased chloroplast genomes was recently tested by comparing an
observed CAI (Sharp and Li 1987), or bias toward a
pool of preferred codons (here, on the basis of a highly
expressed chloroplast gene), to an expected distribution
of CAIs based on genome-wide nucleotide composition
(Morton 1998).
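As a rough illustration of the Nc-plot logic described above, the sketch below computes the Nc expected from GC3 alone and compares it with an observed value. It uses the approximation published by Wright (1990), Nc = 2 + s + 29/(s² + (1 − s)²), where s is GC3 as a proportion; that formula comes from Wright's paper rather than from this text, and the function names and the example locus are hypothetical.

```python
def expected_nc(gc3):
    """Nc expected from base composition alone, for GC3 given as a proportion (0-1).

    Uses Wright's (1990) approximation; this formula is not stated in the article.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a proportion between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


def nc_shortfall(observed_nc, gc3):
    """Expected minus observed Nc; a large positive value hints at bias beyond composition."""
    return expected_nc(gc3) - observed_nc


# Hypothetical locus in an A+T-rich genome: GC3 = 0.15, observed Nc = 40.
print(round(expected_nc(0.15), 2))
print(round(nc_shortfall(40.0, 0.15), 2))
```

An observed Nc close to the expected value is consistent with codon usage that follows local base composition, which is the pattern reported for Rickettsia above.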
The analyses above test the null hypothesis that
codon usage may be explained solely by local base composition. However, Nc plots, correspondence analysis,
and CAI estimates may fail to detect slight preferences
among synonyms, since these methods derive a single
estimate across all amino acids in a locus. In addition,
CAI estimates are possible only when the optimal codons for a particular genome are known. In this study,
estimates of codon bias across Buchnera loci are also
adjusted for local base composition. However, in contrast to previous estimates, the χ²-based method developed here tests for nonrandom use of codons for single
amino acids and does not require prior knowledge of
preferred codons. This approach may be generally applicable to other organisms in which codon preferences
may be absent or subtle, such as in taxa with small effective population sizes and/or strong mutational biases.
Similar to the scaled χ² (Shields et al. 1988), as modified
by Akashi and Schaeffer (1997) to adjust for A+T content at silent positions, we compared observed codon
frequencies to those expected if codon usage reflects local base composition at synonymous sites. By applying
this method to several homologous loci in Buchnera and
in their free-living relative, E. coli, we test the hypothesis that translational selection is relatively ineffective
in the endosymbionts, so that codon usage in Buchnera
is shaped by A+T mutational bias and by the fixation
of nonoptimal codons through drift.
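The comparison of observed and expected codon frequencies can be sketched for a single amino acid in a single locus. This is a minimal illustration, not the authors' exact procedure: it assumes a two-class test of A-ending versus U-ending codons within one fourfold degenerate family, and it takes the background probability of A among A+U at silent sites (`p_a`) as a supplied parameter. The function names, the toy sequence, and the value of `p_a` are all hypothetical.

```python
from collections import Counter

# First two bases of the eight fourfold degenerate families (standard genetic code).
FOURFOLD_PREFIXES = {"A": "GC", "G": "GG", "L": "CT", "P": "CC",
                     "R": "CG", "S": "TC", "T": "AC", "V": "GT"}


def third_base_counts(cds, prefix):
    """Count third-position bases among in-frame codons that begin with `prefix`."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(codon[2] for codon in codons if codon[:2] == prefix)


def two_class_chi2(n_a, n_t, p_a):
    """Chi-square (1 df) for A- versus T(U)-ending codons.

    p_a is the assumed background probability of A among A+T at silent sites.
    """
    total = n_a + n_t
    exp_a = total * p_a
    exp_t = total * (1.0 - p_a)
    return (n_a - exp_a) ** 2 / exp_a + (n_t - exp_t) ** 2 / exp_t


# Invented toy sequence: six alanine codons, five GCA and one GCT.
toy = "GCAGCAGCAGCTGCAGCA"
counts = third_base_counts(toy, FOURFOLD_PREFIXES["A"])
chi2 = two_class_chi2(counts["A"], counts["T"], p_a=0.5)
print(counts["A"], counts["T"], round(chi2, 3))
```

Because the test is run per amino acid, it needs no prior list of preferred codons; the resulting statistic would be compared with the one-degree-of-freedom critical values (2.706 and 3.841) quoted with Table 2.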
Table 1
Genetic Loci of Buchnera Strains Included in Study

| Gene Name | Taxon (a) | GenBank Accession Number | Footnotes |
|---|---|---|---|
| argS | Acyrthosiphon pisum | L18933 | b |
| argS | Schlechtendalia chinensis | L18932 | b |
| aroA | Schizaphis graminum | L43549 | b |
| aroE | S. graminum | U09230 | b |
| aroH | S. graminum | U11066 | b |
| atpA | S. graminum | 2827020 | b |
| atpB | S. graminum | 2827024 | b |
| atpC | S. graminum | 2827018 | b |
| atpD | S. graminum | 2827017 | b |
| atpE | S. graminum | 2827023 | b |
| atpF | S. graminum | 2827022 | b |
| atpG | S. graminum | 2827019 | b |
| atpH | S. graminum | 2827021 | b |
| cysE | S. graminum | M90644 | b |
| cysS | S. graminum | U09230 | b |
| ddlB | S. graminum | 2738587 | b |
| dnaA | S. graminum | M80817 | b |
| dnaG(pt) | S. graminum | M90644 | b,c |
| dnaJ | A. pisum | D88673 | b |
| dnaK | A. pisum | D88673 | b |
| dnaN | S. graminum | M80817 | b |
| dnaQ | S. graminum | L18927 | b |
| fdx | S. graminum | 2827028 | b |
| ftsA | S. graminum | 2738589 | b |
| ftsZ | S. graminum | 2738588 | b |
| gapA | S. graminum | U11045 | b |
| gidA | S. graminum | 2827025 | b |
| groEL | A. pisum | X61150 | b,d |
| groEL | Myzus persicae | 2754808 | b,d |
| groEL | Rhopalosiphum padi | U77380 | b,d |
| groEL | Salmonella typhimurium | U01039 | c |
| groEL | Sitobion avenae | U77379 | b,d |
| groEL | S. graminum | D85628 | b,d |
| groES | M. persicae | 2754807 | b,d |
| groES | S. graminum | D85628 | b,d |
| gyrB | S. graminum | M80817 | b |
| himD | S. graminum | L43549 | b |
| hscA | S. graminum | 2827029 | b |
| hscB | S. graminum | 2827030 | b |
| ilvC | S. graminum | 2827034 | b |
| ilvD | S. graminum | 2827033 | b |
| infC | S. graminum | U11066 | b |
| leuA | Diuraphis noxia | AF041837 | b,d |
| leuA | R. padi | X71612 | b,d |
| leuA | S. typhimurium | 47968 | d |
| leuA | S. graminum | AF041836 | b,d |
| leuA | Thelaxes suberi | Y11966 | b,d |
| leuB | D. noxia | AF041837 | b,d |
| leuB | R. padi | X71612 | b,d |
| leuB | S. graminum | AF041836 | b,d |
| leuB | S. typhimurium | X53376 | d |
| leuB | T. suberi | Y11966 | b,d |
| leuC | D. noxia | AF041837 | b,d |
| leuC | R. padi | X71612 | b,d |
| leuC | S. graminum | AF041836 | b,d |
| leuC | S. typhimurium | M31047 | d |
| leuC | T. suberi | Y11966 | b,d |
| leuD | D. noxia | AF041837 | b,d |
| leuD | R. padi | X71612 | b,d |
| leuD | S. typhimurium | 47764 | d |
| leuD | S. graminum | AF041836 | b,d |
| leuD | T. suberi | Y11966 | b,d |
| murC | S. graminum | AF012886 | b |
| nifS | S. graminum | 2827032 | b |
| pfs | S. graminum | AF01288 | b |
| rep | R. padi | X71612 | b |
| rep | S. graminum | 2827035 | b |
| repA1 | D. noxia | AF041837 | b,g |
| repA1 | R. padi | X71612 | b,g |
| repA1 | Tetraneura caerulescens | Y11972 | b,g |
| repA1 | T. suberi | Y11966 | b,g |
| repA2 | D. noxia | AF041837 | b,g |
| repA2 | R. padi | X71612 | b,g |
| repA2 | T. suberi | Y11966 | b,g |
| rho | S. graminum | 2827037 | b |
| rpmH | S. graminum | M80817 | b |
| rnh | S. graminum | L18927 | b |
| rnpA | S. graminum | M80817 | b |
| rpoB | S. graminum | Z11913 | b |
| rpoC | S. graminum | Z11913 | b |
| rpoD | S. graminum | M90644 | b |
| rpsA | S. graminum | L43549 | b |
| secB | S. graminum | M90644 | b |
| sohB | S. chinensis | U09185 | b |
| thrS | S. graminum | U11066 | b |
| tpiA | S. graminum | L43549 | b |
| trmE | S. graminum | 2827009 | b |
| trpA | S. chinensis | U09185 | b |
| trpA | S. graminum | Z19055 | b |
| trpB | A. pisum | L46355 | b,d |
| trpB | D. noxia | AF038565 | b,d |
| trpB | Macrosiphoniella ludovicianae | AF058428 | e |
| trpB | Melaphis rhois | L46357 | b,d |
| trpB | Rhopalosiphum maidis | L46356 | b,d |
| trpB | R. padi | L46358 | b,d |
| trpB | S. typhimurium | J01810 | d |
| trpB | S. chinensis | U09185 | b,d |
| trpB | S. graminum | Z19055 | b,d |
| trpB(pt) | ECOR 17 | U23489 | e |
| trpB(pt) | ECOR 29 | U25425 | e |
| trpB(pt) | ECOR 31 | U23494 | e |
| trpB(pt) | ECOR 37 | U23496 | e |
| trpB(pt) | ECOR 46 | U23495 | e |
| trpB(pt) | ECOR 50 | U23497 | e |
| trpB(pt) | ECOR 51 | U25884 | e |
| trpB(pt) | ECOR 60 | U23499 | e |
| trpB(pt) | ECOR 71 | U23500 | e |
| trpB(pt) | ECOR 72 | U25429 | e |
| trpB(pt) | Uroleucon aeneum | AF058431 | d,e |
| trpB(pt) | Uroleucon ambrosiae | AF058432 | d,e |
| trpB(pt) | Uroleucon astronomus | AF058433 | d,e |
| trpB(pt) | Uroleucon caligatum | L81150 | d,e |
| trpB(pt) | Uroleucon erigeronense | L81151 | d,e |
| trpB(pt) | Uroleucon helianthicola | AF058434 | d,e |
| trpB(pt) | Uroleucon jaceae | AF058435 | d,e |
| trpB(pt) | Uroleucon jaceicola | AF058436 | d,e |
| trpB(pt) | Uroleucon obscurum | AF058437 | d,e |
| trpB(pt) | Uroleucon rudbeckiae | AF058439 | d,e |
| trpB(pt) | Uroleucon rapunculoidis | AF058438 | d,e |
| trpB(pt) | Uroleucon rurale | L81149 | d,e |
| trpB(pt) | Uroleucon solidaginis | AF058440 | d,e |
| trpB(pt) | Uroleucon sonchi | 1137716 | b,d,e |
| trpC | S. chinensis | U09185 | b |
| trpC | S. graminum | Z19055 | b |
| trpD | S. chinensis | U09185 | b |
| trpD | S. graminum | Z19055 | b |
| trpE | A. pisum | L43555 | b,d |
| trpE | D. noxia | L46769 | b,d |
| trpE | R. maidis | L43550 | b,d |
| trpE | R. padi | L43551 | b,d |
| trpE | S. typhimurium | V01378 | d |
| trpE | S. chinensis | U09184 | b,d |
| trpE | S. graminum | Z21938 | b,d |
| trpE | U. caligatum | L8124 | d |
| trpE | U. erigeronense | L81123 | d |
| trpE | U. rurale | L8112 | d |
| trpE | U. sonchi | 1137712 | d |
| trpG | A. pisum | L43555 | b,g |
| trpG | D. noxia | L46769 | b,g |
| trpG | R. maidis | L43550 | b,g |
| trpG | R. padi | L43551 | b,g |
| trpG | S. chinensis | U09184 | b,g |
| trpG | S. graminum | Z21938 | b,g |
| trxA | S. graminum | 2827036 | b |
| tufA | A. pisum | 2369691 | b,d |
| tufA | M. rhois | 2369697 | b,d |
| tufA | Pemphigus betae | 2369695 | b,d |
| tufA | S. typhimurium | X55116 | d |
| tufA | S. chinensis | L43549 | b,d |
| tufA | S. graminum | 2369693 | b,d |
| aroA | Aeromonas salmonicida | L05002 | f |
| dnaG | Pseudomonas putida | U85774 | f |
| dnaJ | Haemophilus ducreyi | U25996 | f |
| dnaK | Vibrio cholerae | Y14237 | f |
| dnaN | P. putida | X14791 | f |
| leuB | P. aeruginosa | U29655 | f |
| leuC | Azotobacter vinelandii | Y11280 | f |
| leuD | Azotobacter vinelandii | Y11280 | f |
| rpoB | P. putida | X15840 | f |

(a) Buchnera strains are labeled by the aphid species from which they were isolated. All sequences from Escherichia coli and Haemophilus influenzae are accessible from the full genome sequences of these two species (GenBank accession numbers U00096 and L42023, respectively). Individual loci of these two species are not listed.
(b) Sequences of Buchnera taxa (listed) and E. coli compared in codon usage analysis.
(c) (pt) indicates that only a partial sequence is available in GenBank.
(d) Sequences of Buchnera and enteric bacteria (including E. coli and the enteric species listed) used in comparison of Ka/Ks.
(e) Sequences used in mapping of nucleotide changes across phylogenies, in addition to the trpB sequence of E. coli K12.
(f) Taxon 3 in relative-rate tests.
(g) No homologue in E. coli with sufficient similarity; locus excluded from comparison of CAI values.
mann, and Clark 1996). In contrast, χ² values are relatively high for individual amino acids across most E.
coli loci, including those considered low expression in
E. coli, such as trp genes (Sharp, Tuohy, and Mosurski
1986).
Table 2
Significant Nonrandom Use of U- or A-Ending Codons Across Several Buchnera Loci for Each of Eight Fourfold Degenerate Families

| Locus | Taxon (a) | Amino Acid | No. of Amino Acid Residues in Protein | χ² Value for Nonrandom Use (b) | Codon Ending |
|---|---|---|---|---|---|
| tpiA | Schizaphis graminum | A | 14 | 6.401 | A |
| groES | S. graminum | A | 8 | 4.814 | A |
| trpG | S. graminum | A | 7 | 4.45 | A |
| leuA | S. graminum | A | 35 | 3.817 | A |
| tufA | Schlechtendalia chinensis | A | 22 | 3.532 | A |
| groEL | Acyrthosiphon pisum | A | 54 | 3.487 | A |
| trpD | S. graminum | A | 12 | 3.484 | A |
| trpA | S. graminum | A | 11 | 3.424 | A |
| groEL | Sitobion avenae | A | 51 | 2.803 | A |
| rpoD | S. graminum | A | 27 | 2.734 | A |
| atpA | S. graminum | A | 35 | 5.174 | U |
| trpE | S. chinensis | G | 18 | 9.044 | A |
| trpA | S. chinensis | G | 17 | 4.2 | A |
| tuf | S. chinensis | G | 30 | 3.452 | A |
| rpoB | S. graminum | G | 90 | 3.227 | A |
| rep | S. graminum | G | 17 | 3.039 | A |
| leuB | Diuraphis noxia | G | 19 | 2.754 | A |
| trpE | S. graminum | G | 16 | 2.858 | U |
| leuD | Thelaxes suberi | G | 8 | 3.125 | U |
| dnaN | S. graminum | L | 9 | 3.875 | A |
| trpA | S. chinensis | L | 5 | 2.995 | U |
| ftsA | S. graminum | L | 9 | 4 | U |
| hscA | S. graminum | L | 15 | 6.317 | U |
| trpE | Rhopalosiphum maidis | L | 18 | 6.368 | U |
| trpB | S. chinensis | P | 11 | 8.27 | A |
| groEL | A. pisum | P | 15 | 4.158 | A |
| dnaA | S. graminum | P | 10 | 2.9 | A |
| aroH | S. graminum | P | 10 | 2.756 | A |
| groEL | S. avenae | P | 15 | 2.708 | A |
| leuA | T. suberi | P | 7 | 2.734 | U |
| leuC | T. suberi | P | 12 | 2.76 | U |
| dnaG | S. graminum | P | 8 | 3.265 | U |
| murC | S. graminum | P | 11 | 3.777 | U |
| ilvD | S. graminum | R | 14 | 3.435 | A |
| groEL | S. graminum | R | 17 | 2.7299 | U |
| trpB | S. graminum | R | 6 | 2.7963 | U |
| rpoB | S. graminum | R | 35 | 3.0635 | U |
| dnaA | S. graminum | R | 17 | 3.184 | U |
| infC | S. graminum | R | 11 | 3.5566 | U |
| dnaK | A. pisum | R | 9 | 5.0547 | U |
| groEL | A. pisum | R | 18 | 6.5715 | U |
| dnaJ | A. pisum | R | 10 | 9.2069 | U |
| tuf | S. chinensis | S | 10 | 2.915 | U |
| ddlB | S. graminum | S | 16 | 3.0218 | U |
| trpE | D. noxia | S | 20 | 3.1565 | U |
| groEL | S. avenae | S | 29 | 3.2248 | U |
| repA1 | R. padi | S | 11 | 3.5667 | U |
| ftsZ | S. graminum | S | 17 | 4.5219 | U |
| rpoB | S. graminum | S | 64 | 5.8242 | U |
| trpG | S. graminum | S | 7 | 5.9269 | U |
| trpE | S. chinensis | S | 33 | 6.7696 | U |
| groEL | A. pisum | S | 30 | 7.9462 | U |
| rhn | S. graminum | T | 10 | 3.784 | A |
| leuA | T. suberi | T | 18 | 3.613 | A |
| ilvD | S. graminum | T | 26 | 3.2 | A |
| argS | A. pisum | T | 5 | 3 | A |
| tufA | Melaphis rhois | T | 22 | 2.8356 | U |
| rpoB | S. graminum | T | 58 | 3.8263 | U |
| ilvC | S. graminum | T | 18 | 7.2636 | U |
| trpA | S. graminum | V | 7 | 5.293 | A |
| tufA | Pemphigus betae | V | 30 | 4.417 | A |
| gidA | S. graminum | V | 26 | 3.81 | A |
| tufA | M. rhois | V | 31 | 3.624 | A |
| leuD | D. noxia | V | 9 | 3.531 | A |
| trpA | S. chinensis | V | 11 | 2.967 | U |

(a) Buchnera taxa are labeled by the aphid species from which they were isolated.
(b) Critical values for these two-class χ² tests are 2.706 (P < 0.1) and 3.841 (P < 0.05).
FIG. 1. Relationship between the χ²-based estimate of codon bias and the Codon Adaptation Index (CAI) for several E. coli or Buchnera
loci. Points represent a single locus and are positioned on the y-axis by the χ² value averaged across each fourfold degenerate amino acid.
Points are positioned on the x-axis by the CAI of either the same gene (E. coli) or the homologous E. coli gene (Buchnera). Average χ² includes
only χ² values for amino acids with five or more residues in a given locus. In contrast to Buchnera genes, E. coli loci have consistently higher
average χ² values and a strong relationship with CAI.
FIG. 2. Comparison of χ² values for individual amino acids, across several homologous E. coli [x] and Buchnera [.] loci. Buchnera loci
are labeled by the taxon from which a gene was sampled (Sg = Schizaphis graminum, Ap = Acyrthosiphon pisum, Mr = Melaphis rhois, Pb
= Pemphigus betae, Sc = Schlechtendalia chinensis, Rp = Rhopalosiphum padi, Sa = Sitobion avenae, Mp = Myzus persicae, Dn = Diuraphis
noxia, Rm = R. maidis, Usn = Uroleucon sonchi, Ts = Thelaxes suberi). For illustrative purposes only, χ² values of E. coli are graphed as
negative values if the preferred codon is a nonoptimal codon as defined by the E. coli Relative Synonymous Codon Usage table (Sharp et al.
1988). χ² values of Buchnera are graphed as negative values if the A-ending codon is preferred. For each locus, levels of bias in E. coli are much
higher than those in Buchnera. The locus showing the strongest evidence for bias in Buchnera is groEL, particularly that of Acyrthosiphon pisum.
may be attributed to decreased nonsynonymous divergence that is more extreme than the observed depression
at synonymous sites. Compared to other genes in Buchnera, purifying selection is apparently more effective
against replacement substitutions at this highly expressed locus.
Table 3
Number of Buchnera Loci Showing a Slight Preference for U- or A-Ending Codons for Each of Eight Fourfold Degenerate Families

| Amino Acid | No. of Buchnera Loci (a): xxU | No. of Buchnera Loci (a): xxA | Total Loci Considered | Expected xxA | χ² Value (b) | P Value (<) |
|---|---|---|---|---|---|---|
| A | 37 | 53 | 90 | 39.8 | 7.77 | 0.01 |
| G | 52 | 47 | 99 | 43.8 | 0.42 | |
| L | 33 | 37 | 70 | 31.0 | 2.09 | |
| P | 37 | 48 | 85 | 37.6 | 5.12 | 0.025 |
| R | 32 | 23 | 55 | 24.3 | 0.13 | |
| S | 72 | 31 | 103 | 45.6 | 8.32 | 0.01 |
| T | 41 | 54 | 95 | 42.0 | 6.07 | 0.01 |
| V | 55 | 51 | 106 | 46.9 | 0.64 | |

(a) Number of Buchnera loci showing slight preference for the A-ending codon of that amino acid and number showing a preference for the U-ending codon.
(b) The χ² value expresses the deviation of the observed from the expected number of loci with preference for A-ending codons.
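The χ² values in Table 3 can be approximately recomputed from the tabulated counts. The sketch below assumes that the column of values 39.8, 43.8, 31.0, and so on gives the expected number of loci preferring the A-ending codon, and that the statistic is an ordinary two-class χ²; both points are inferred from footnote b rather than stated outright. Small differences from the reported values are to be expected because the expected counts are rounded.

```python
# (amino acid, loci preferring xxU, loci preferring xxA, expected xxA, reported chi-square)
TABLE3 = [
    ("A", 37, 53, 39.8, 7.77),
    ("G", 52, 47, 43.8, 0.42),
    ("L", 33, 37, 31.0, 2.09),
    ("P", 37, 48, 37.6, 5.12),
    ("R", 32, 23, 24.3, 0.13),
    ("S", 72, 31, 45.6, 8.32),
    ("T", 41, 54, 42.0, 6.07),
    ("V", 55, 51, 46.9, 0.64),
]


def chi2_two_class(obs_a, obs_u, exp_a):
    """Two-class chi-square from observed counts and the expected A-ending count."""
    total = obs_a + obs_u
    exp_u = total - exp_a
    return (obs_a - exp_a) ** 2 / exp_a + (obs_u - exp_u) ** 2 / exp_u


recomputed = {aa: chi2_two_class(a, u, exp_a) for aa, u, a, exp_a, _ in TABLE3}

for aa, _, _, _, reported in TABLE3:
    print(aa, round(recomputed[aa], 2), reported)
```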
Table 4
Frequencies of Buchnera Loci in the Original Sample Compared to Their Representation in the Pool of Loci for Which There Is Evidence of Significant Codon Bias

| Locus | No. of χ² Tests | Fraction of Total χ² Tests Performed (a) | No. Significant χ² Tests | Fraction of Significant χ² Tests (b) |
|---|---|---|---|---|
| groEL (c) | 32 | 0.041 | 8 | 0.123 |
| groES | 8 | 0.010 | 1 | 0.015 |
| leu genes | 119 | 0.154 | 7 | 0.108 |
| trpB | 64 | 0.083 | 2 | 0.031 |
| trpA | 15 | 0.019 | 5 | 0.077 |
| trpD | 14 | 0.018 | 1 | 0.015 |
| trpE | 46 | 0.060 | 5 | 0.077 |
| trpG | 36 | 0.047 | 2 | 0.031 |
| tuf | 39 | 0.051 | 6 | 0.092 |
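One way to read Table 4 is as an enrichment ratio: a locus's share of the significant tests divided by its share of all tests performed. The sketch below is an illustrative summary and not an analysis from the paper. It takes 65 as the total number of significant tests, which is the number of rows in Table 2 and reproduces the tabulated fractions (for example, 8/65 = 0.123). The function name is hypothetical.

```python
# locus: (fraction of all chi-square tests, number of significant tests), from Table 4
TABLE4 = {
    "groEL": (0.041, 8),
    "groES": (0.010, 1),
    "leu genes": (0.154, 7),
    "trpB": (0.083, 2),
    "trpA": (0.019, 5),
    "trpD": (0.018, 1),
    "trpE": (0.060, 5),
    "trpG": (0.047, 2),
    "tuf": (0.051, 6),
}
TOTAL_SIGNIFICANT = 65  # number of significant tests listed in Table 2


def enrichment(fraction_of_all, n_significant, total_significant=TOTAL_SIGNIFICANT):
    """Share of significant tests divided by share of all tests (1.0 = proportional)."""
    return (n_significant / total_significant) / fraction_of_all


for locus, (fraction, n_sig) in TABLE4.items():
    print(locus, round(enrichment(fraction, n_sig), 2))
```

Ratios above 1 mark loci that are over-represented among the significant tests, and ratios below 1 mark loci that are under-represented.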
Table 5
Relative-Rates Test for Substitutions at Nondegenerate Sites in Loci of Buchnera Versus Escherichia coli
Gene
No.
Codons
Taxon 3a
K12
K13
K23 K13K23
Haemophilus influenzae
H. influenzae
H. influenzae
H. influenzae
Pseudomonas aeruginosa
Azotobacter vinelandii
A. vinelandii
Aeromonas salmonicida
H. influenzae
Haemophilus ducreyi
Vibrio cholerae
Pseudomonas putida
P. putida
H. influenzae
P. putida
H. influenzae
H. influenzae
H. influenzae
H. influenzae
H. influenzae
0.27
0.25
0.22
0.24
0.30
0.31
0.42
0.25
0.43
0.21
0.12
0.59
0.45
0.44
0.12
0.16
0.11
0.53
0.18
0.31
0.29
0.28
0.25
0.29
0.45
0.58
0.55
0.42
0.41
0.37
0.16
0.71
1.06
0.58
0.25
0.22
0.14
0.56
0.23
0.32
0.12
0.17
0.22
0.24
0.40
0.37
0.34
0.36
0.21
0.30
0.10
0.38
0.77
0.36
0.21
0.14
0.07
0.20
0.09
0.18
0.17
0.10
0.02
0.06
0.05
0.20
0.21
0.05
0.20
0.07
0.06
0.33
0.29
0.21
0.04
0.09
0.07
0.36
0.14
0.14
zb
K01/K02
7.65***
5.24***
1.20
2.36**
1.33
5.18***
3.66***
1.21
4.83***
2.23*
3.84***
6.26***
3.34**
3.13**
3.37**
5.01***
4.48***
7.21***
6.00***
6.38***
4.3
2.4
1.3
1.6
1.4
4.7
2.9
1.5
2.8
2.0
2.6
3.6
4.7
2.9
2.0
3.4
3.7
5.3
6.9
2.6
a In each test, taxon 1 is Buchnera of Schizaphis graminum, except for dnaJ and dnaK, which are Buchnera of Acyrthosiphon pisum, and taxon 2 is always
Escherichia coli. Taxon 3 is a more distantly related reference taxon.
b z scores were calculated as described by Muse and Weir (1992). Probabilities for one-tailed t-test (H :K
0
01 # K02) are * P , 0.05, ** P , 0.01, *** P ,
0.0001.
c (pt) indicates that only a partial sequence is available in GenBank.
More problematic is the possibility that Ks in Buchnera is underestimated because of the strong A1T bias
across loci and more rapid saturation at silent sites.
However, the calculation of Ka/Ks ratios across shallow
taxonomic levels avoided the problem of saturation at
synonymous sites and the large standard errors that accompany high divergence estimates. In addition, comparisons of changes at first- and second- vs. third-codon
positions across very shallow taxonomic levels (Buchnera isolates of Uroleucon and members of the ECOR
collection of E. coli) also suggest higher rates of fixation
at replacement sites, relative to synonymous sites, in
Buchnera.
Conclusions
Because of their small population sizes and limited
opportunities for recombination, vertically inherited endosymbionts provide a good model system to test the
effects of increased drift on sequence evolution in bacteria. In this study, the lack of adaptive codon bias
across several Buchnera loci suggests that codon usage
is shaped primarily by A+T mutational bias rather than
by translational selection. In addition, relative-rate tests
and comparisons of Ka/Ks ratios support previous conclusions that Buchnera lineages experience rapid sequence evolution at nonsynonymous sites, compared to
their free-living relative, E. coli. These results suggest
that selection is ineffective in eliminating two types of
weakly deleterious mutations from Buchnera populations: those resulting in nonoptimal codons and those
resulting in amino acid replacements.
A set of loci that might be suspected of experiencing unusual selection in Buchnera are those encoding
Gene: ilvC, ilvD, ilvI, leuA, leuB, leuC, leuD, aroA, cysE, dnaJ, dnaK, dnaN, dnaG (pt)c, secB, rpoB, rpsA, atpD, tpiA, gapA, gidA
Function
in endosymbionts is less certain, it may function to stabilize proteins that have accumulated amino acid substitutions, as suggested by Moran (1996). In this study,
sequence evolution of these functionally important
Buchnera genes shows the same patterns as housekeeping genes that are not overexpressed. In particular, genes
in the tryptophan (trpABC(F)DE) and leucine (leuABCD) biosynthetic pathways and groEL each show
low levels of codon bias and rapid rates of nonsynonymous substitution, relative to E. coli homologues.
The fact that the observed trends of depressed codon bias and accelerated evolutionary rates occur across
all Buchnera loci included, even functionally important
genes, argues for an increased effect of drift within endosymbiont populations. One alternative explanation for
elevated evolutionary rates at nonsynonymous sites is
positive selection for amino acid changes; however, such
selection typically acts at specific loci rather than across
the genome. In addition, there is no evidence that the
acceleration of nonsynonymous substitution rates is any
higher across Buchnera biosynthetic genes than across
other Buchnera loci. Combining loci from table 5 with
FIG. 3. Comparison of levels of nonsynonymous substitutions (Ka) and synonymous substitutions (Ks) across several loci of the enteric
bacteria and Buchnera. (A) Pairwise divergence at nonsynonymous and synonymous sites, calculated across Buchnera [.] taxa and across E.
coli versus Salmonella typhimurium [x]. Ks values higher than 1.0 were excluded. (B) Ratios of nonsynonymous to synonymous substitutions,
on the basis of pairwise divergence across Buchnera taxa [.] and across E. coli versus S. typhimurium [x]. With the exception of groEL, all Ka/
Ks estimates for Buchnera exceed the ratio calculated for the homologous gene of the enteric bacteria. All Ka/Ks ratios are based on Ks values
less than one.
groEL. While the addition of more loci would be desirable, the consistency of the results across a variety of
loci best supports the view that the effects of mutational
bias and drift in Buchnera are sufficiently strong to
override the effect of purifying selection against the
nonoptimal codons and amino acid replacements. In
contrast to trp and leu genes, groEL does not appear to
be duplicated in Buchnera (Ohtaka and Ishikawa 1993),
and its constitutive overexpression may depend on rapid,
efficient translation of a limited number of mRNA molecules. It is not surprising, therefore, that slight codon
bias was detected for Buchnera groEL, albeit at much
lower levels than for the E. coli homologue.
Acknowledgments
We thank Joana Silva for her comments on an earlier version of this paper, and we thank two anonymous
reviewers for their helpful suggestions. This work was
FIG. 4. Mapping of nucleotide changes across genealogies of trpB for (A) strains in the ECOR collection of E. coli isolates and (B) several
Buchnera isolates from the aphid genus Uroleucon (Uae = Uroleucon aeneum, Uja = U. jaceae, Usl = U. solidaginis, Uam = U. ambrosiae,
Uas = U. astronomus, Urd = U. rudbeckiae, Usn = U. sonchi, Uo = U. obscurum, Urp = U. rapunculoidis, Ujl = U. jaceicola, Uc = U.
caligatum, Uh = U. helianthicola, Urr = U. rurale, Ue = U. erigeronense). TrpB phylogenies were estimated using parsimony analysis of all
sites (679 nucleotides). For both data sets, the tree presented is one of two most-parsimonious trees. Confidence in nodes was assessed using
bootstrapping (1,000 replicates). The number of unambiguous nucleotide changes along branches is given in parentheses. The number of changes
at first- and second-codon positions is followed by the number of changes at third-codon positions (in b). Nucleotide changes were summed
across the entire ECOR tree, and across subsets of taxa in the Buchnera tree (see figure insert). The percent change at first and second positions,
divided by the percent change at third positions, roughly approximates Ka/Ks.
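The approximation described in this legend can be sketched as follows. This is a minimal illustration, assuming that every codon contributes two first- and second-position sites and one third-position site; the function name and the counts in the usage line are hypothetical and do not come from figure 4.

```python
def approx_ka_ks(changes_pos12, changes_pos3, n_codons):
    """Approximate Ka/Ks from changes counted by codon position.

    changes_pos12: changes observed at first and second codon positions
    changes_pos3:  changes observed at third codon positions
    n_codons:      number of codons in the alignment

    Assumption: 2 * n_codons sites at positions 1 and 2, and n_codons
    sites at position 3. This only roughly tracks Ka/Ks, because not all
    third-position changes are synonymous and not all first-position
    changes are nonsynonymous.
    """
    if n_codons <= 0:
        raise ValueError("n_codons must be positive")
    if changes_pos3 == 0:
        raise ValueError("no third-position changes; ratio is undefined")
    frac_pos12 = changes_pos12 / (2 * n_codons)
    frac_pos3 = changes_pos3 / n_codons
    return frac_pos12 / frac_pos3


if __name__ == "__main__":
    # Hypothetical counts summed over a tree, for illustration only.
    print(approx_ka_ks(changes_pos12=10, changes_pos3=20, n_codons=200))
```

Because the two site classes differ in size, the ratio is taken between proportions of sites changed rather than between raw counts.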