
Copyright © 2004 by the Genetics Society of America

Population Genetics of the Wild Yeast Saccharomyces paradoxus
Louise J. Johnson,*,1 Vassiliki Koufopanou,* Matthew R. Goddard,†
Richard Hetherington,* Stefanie M. Schäfer*,2 and Austin Burt*
*Department of Biological Sciences and †NERC Centre for Population Biology, Imperial College at Silwood Park,
Ascot SL5 7PY, United Kingdom
Manuscript received November 4, 2002
Accepted for publication September 22, 2003
ABSTRACT
Saccharomyces paradoxus is the closest known relative of the well-known S. cerevisiae and an attractive model
organism for population genetic and genomic studies. Here we characterize a set of 28 wild isolates from
a 10-km² sampling area in southern England. All 28 isolates are homothallic (capable of mating-type
switching) and wild type with respect to nutrient requirements. Nine wild isolates and two lab strains of
S. paradoxus were surveyed for sequence variation at six loci totaling 7 kb, and all 28 wild isolates were
then genotyped at seven polymorphic loci. These data were used to calculate nucleotide diversity and
number of segregating sites in S. paradoxus and to investigate geographic differentiation, population
structure, and linkage disequilibrium. Synonymous site diversity is ~0.3%. Extensive incompatibilities
between gene genealogies indicate frequent recombination between unlinked loci, but there is no evidence
of recombination within genes. Some localized clonal growth is apparent. The frequency of outcrossing
relative to inbreeding is estimated at 1.1% on the basis of heterozygosity. Thus, all three modes of
reproduction known in the lab (clonal replication, inbreeding, and outcrossing) have been important in
molding genetic variation in this species.

Many fields in biology have progressed by the concentrated study of a select group of model systems. In population and evolutionary genetics, only a
few species such as Drosophila and humans have been
widely adopted, and it might make sense to consider
what other taxa might best complement these. The yeast
Saccharomyces cerevisiae has a number of characteristics
that would seem to make it ideal (Zeyl 2000): (i) It is
already a well-studied model system in biochemistry, cell
biology, classical genetics, and molecular biology; (ii)
genomes can be precisely altered by homologous recombination; and (iii) long-term experiments with large
population sizes and sensitive fitness assays are readily
possible in the laboratory. These features suggest that
one may be more likely to be able to investigate and
interpret the functional significance of natural DNA
sequence variation in this species than in any other
eukaryote. Moreover, it has a relatively small and gene-rich genome, reducing the size of the problem to be
solved. However, there is a problem: S. cerevisiae has
long been associated with humans, and in collecting
strains it is difficult to determine to what extent they

Sequence data from this article have been deposited with the
EMBL/GenBank Data Libraries under accession nos. AJ515177–
AJ515216, AJ515322–AJ515352, and AJ515430–AJ515449.
1 Corresponding author: Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham NG7 2UH, United Kingdom. E-mail: l.j.johnson@nottingham.ac.uk
2 Present address: Department of Infectious Disease Epidemiology, Imperial College, London W2 1PG, United Kingdom.
Genetics 166: 43–52 (January 2004)

are escaped domestics or otherwise greatly affected by
human activity (Vaughan-Martini and Martini 1995;
Naumov et al. 1992a). This could greatly affect their
population genetics, severely complicating interpretations and reducing the extent to which lessons learned
with this species are likely to be widely applicable. For
example, one survey of S. cerevisiae in wineries revealed
some surprising findings, including 31% of strains heterozygous for a lethal mutation and 23% heterozygous
or homozygous for heterothallism, i.e., an inability to
undergo mating-type switching (Mortimer 2000). The
association between Drosophila and humans has posed
similar problems (Andolfatto and Przeworski 2000;
Wall et al. 2002).
One way to circumvent this problem would be to study
a close relative that has the same advantages, but not
the disadvantage. S. paradoxus is (along with S. cariocanus) the closest known relative of S. cerevisiae (Goddard and Burt 1999). The two species appear to be
biochemically indistinguishable (Barnett et al. 1990),
have the same chromosome number, and appear to be
largely syntenic (Naumov et al. 1992b). Growth preferences in the lab are the same as for S. cerevisiae, and
genetic engineering by the same homologous gene replacement methods used in S. cerevisiae is possible (E.
Louis, personal communication). Thus, many of the
advantages still apply. Moreover, it has been isolated
from many natural locations worldwide (e.g., Sniegowski et al. 2002) and apparently has not been widely
domesticated. Gene flow between S. cerevisiae and S.
paradoxus is also unlikely; hybrids can be formed, but

L. J. Johnson et al.

are almost completely sterile (Naumov et al. 1997a).
Overall DNA sequence divergence between the two species is thought to be ~20% (Herbert et al. 1988), and
synonymous site divergence at the loci studied here is
~30%.
In the laboratory, the life cycle of S. paradoxus is the
same as that of S. cerevisiae (Herskowitz 1988). It normally reproduces mitotically as a diploid, but when
starved of nitrogen undergoes meiosis and produces
four haploid spores encapsulated in an ascus. There are
two mating types, and the spores usually mate within
the ascus upon germination, but if this does not happen,
they are able to reproduce mitotically as haploids. Haploid cells are constitutively ready to mate and can outcross. However, haploid mitoses are associated with a
sophisticated mechanism of mating-type switching, with
the result that cells can also mate with their clonemates,
producing an entirely homozygous diploid (“autodiploidization”). Thus, S. paradoxus may undergo two types
of self-fertilization: intra-ascus mating and autodiploidization. For a review of ascomycete mating systems, see
Nelson (1996).
In this article we describe a preliminary investigation
into the genetics of a single population of S. paradoxus,
focusing on quantifying levels of nucleotide variation
and analyzing the pattern of variation to infer mating
system (and, to a lesser extent, dispersal).

MATERIALS AND METHODS
Collections: S. paradoxus was isolated from the bark of oak
trees (Quercus, mainly Quercus robur ; Naumov et al. 1998) in
Silwood Park and Windsor Great Park. Bark scrapings (~1 g)
were collected from 86 oak trees on each of two dates, with
two scrapings on opposite sides of the tree on each date.
Scrapings were aseptically transferred to acidified malt medium [5% malt extract (Sigma, Dorset, UK), 0.4% lactic acid
(Sigma) w/v] in loosely capped vials and shaken for 2 days
at 30°. Many types of microbe were present in the medium, so
a selection procedure was incorporated to isolate S. paradoxus.
Dilutions of the 48-hr culture were plated on acidified malt
and incubated for 24 hr at 30°. The resulting colony-forming
units were visually inspected and colonies looking like S. paradoxus were picked, placed on YPD [1% yeast extract (Merck,
Dorset, UK), 2% peptone (Merck), 2% glucose (BDH, Leicestershire, UK)], and then subsamples were tested for their ability
to form tetrads when placed upon nitrogen-starving medium
(2% potassium acetate; BDH). Heterozygosity was maintained
in the original samples because they were not stimulated to
sporulate. For those that formed tetrads, the internal transcribed spacer region (ITS1-5.8rRNA-ITS2) was amplified using primers ITS1 and ITS4 (White et al. 1990) and then
visualized via electrophoresis through 1% agarose. ITS amplicons of roughly the correct size were sequenced (with an ABI
373) and compared to the ITS sequences from the S. paradoxus
(CBS 432) and S. cerevisiae type strains. Three types of sequence
were recovered. Two of these were largely unalignable to the
Saccharomyces sequences and were identified as Hanseniaspora osmophila (CBS 313) and Torulaspora delbrueckii (CBS
404), using BLAST (Altschul et al. 1990). All sequences in
the third category were very similar to the S. paradoxus sequence and were included in our sample. Our procedure
therefore allowed the isolation of both S. paradoxus and S.

cerevisiae strains with substantial variability within each species.
The initial collection of 344 bark scrapings yielded 28 isolates.
Other strains: The Centraalbureau voor Schimmelcultures
(CBS) supplied CBS 432, the type strain of S. paradoxus, and
the Danish lab strain CBS 5829, here referred to as “Type”
and “Danish,” respectively.
Two S. paradoxus isolates from the Russian Far East (FE),
CBS 8436 and CBS 8444, were included for comparison. These
isolates differ from European S. paradoxus at allozyme loci
(Naumov et al. 1997b) and show ~5% synonymous site divergence from the type strain of S. paradoxus at the six sequenced
loci. These strains, referred to herein as FE1 and FE2, respectively, were kindly provided by Edward Louis. All S. cerevisiae
sequence data were from the Yeast Genome Project (Goffeau
et al. 1996).
Phenotypic assays: To isolate individual spores for phenotypic assays, all wild isolates were grown on sporulation medium for 4 days, and resultant asci were enzymatically digested
(10 min in a 50-µl solution of 10 mg/ml sulfanotase,
10 mg/ml lyticase at 25°). Individual spores were removed with a Zeiss
micromanipulator and incubated at 25° for 4 days on YPD
agar to allow colony growth. Colonies were replica plated to
minimal and sporulation media and after 3 days examined
for growth or surveyed by microscopy for the presence of
tetrads. The presence of tetrads was considered indicative
of mating-type switching. All media were made according to
Sherman (1991).
Molecular methods: Nine wild isolates were chosen randomly for an initial survey of sequence variation. Total DNA
was extracted (Sherman 1991) and diluted 100-fold for use
as a PCR template. Six genes involved in mate recognition
were amplified from the nine wild isolates and from the Type
strain, Danish, FE1, and FE2 isolates. Details of genes and
primers are given in Table 1. All 28 wild isolates were then
genotyped at polymorphic sites by restriction at the MFA1 and
AGA2 loci, using enzymes Tsp45I and AseI, respectively, and
by sequencing fragments of MFα1, SAG1, STE2, and STE3.
Microsatellite locus: Twenty S. cerevisiae microsatellite
primer pairs (Field and Wills 1998) were tested on S. paradoxus. Of these only 3 gave a PCR product with S. paradoxus,
and 1 was found to be polymorphic, a variable-length repeat
in the TFA1 gene (chromosome XI in S. cerevisiae). The wild
isolates were genotyped at this locus by polyacrylamide gel
electrophoresis of radioactively end-labeled PCR products
(Sambrook et al. 1989). A representative of each mobility
group was sequenced to determine the length of each allele.
Statistical analysis and software used: Nucleotide diversity
π at synonymous and nonsynonymous sites, and synonymous
site divergence, were calculated using DnaSP (Rozas and Rozas
1999; available at http://www.ub.es/dnasp/). Parsimony analysis of gene trees and comparisons among them by the partition
homogeneity test (Farris et al. 1994) were performed using
PAUP (Swofford 2002). To test for deviations from neutrality, we compared the variance of branch lengths on the genealogy to that from 1000 random genealogies with the same total
branch length, constructed using N. Barton’s genealogies package (available at http://helios.bto.ed.ac.uk/evolgen/barton/
index.html) for Mathematica (Wolfram Research 1999). Tests
for overrepresentation of genotypes and linkage disequilibrium were performed using MultiLocus (Agapow and Burt
2001; available at http://www.bio.ic.ac.uk/evolve/software/
multilocus/index.html). The correlation between genetic and
geographical distance across all pairs of isolates was tested by
randomization, in Mathematica.
RESULTS

Isolations: S. paradoxus was isolated from 28 of 344
bark scrapings, a success rate of 8%. There was no obvious
difference in success rate between large and small
trees or samples with different aspect. From 4 bark
scrapings per tree (2 on each of two dates), 63 trees produced no
isolates, 18 produced one isolate, and 5 produced two
isolates. No S. cerevisiae strains were recovered although
they were not excluded by our procedure.

Population Genetics of S. paradoxus

TABLE 1
Primer sequences

Gene                                                    Primers, 5′–3′, forward first

MFA1 (YDR461w): a-pheromone, chromosome 4               MFL-5px: CTG TTG CTC GGA TAA AAT CAA G
                                                        MFL-6px: GGA TAA CAG TAA CAG CGC TAA G

MFα1 (YPL187w): α-pheromone, chromosome 16              sMFG1-U: AAA GCA ACA ACA GGT TTT GG
                                                        sMFG1-L: CAA ATT GAA ATA TGG CAG GC
                                                        MFAL-SEQF*: TTT TAA TAC ACA CAA ATA AAT TAT CC
                                                        MFAL-SEQR*: TGA GAA AGT TGA TTT TGT TAC GC

STE2 (YFL026w): α-pheromone receptor, chromosome 6      STE2-142F: ACT GTT ACT CAG GCT ATT ATG TTC G
                                                        STE2-1539R: TAA TCC AAT GAA AAA AAA TCA CTG C
                                                        STE2-497F*: TGA CAT CAA TAT CTT TCA CTT TCA CTT TAG G
                                                        STE2-1078F*: TCA GAA AGA ACT TTT GTT GCT GAG G
                                                        STE2-1148R*: CCT TGT ATT TTT TGA ACT CGT GG
                                                        STE2-235R*: AAA CTT GGT TGA TAA TGA AAA TTG G

STE3 (YKL178c): a-pheromone receptor, chromosome 11     STE3-F3: TGG ACA CAT TCA TTA CCT ACC ACG
                                                        STE3-ENDR: TTT CTG AAC TAA GCT CAT TTG AAC
                                                        STE3-530R*: GAA AAC GAA CAG CAC CAA GG
                                                        STE3-989F*: AGG ATT TAC AGC AGG TGG ATG G
                                                        STE3-997R*: TTT CAG AAT CGG TAG AGA ATG G

AGA2 (YGL032c): a-agglutinin subunit, chromosome 7      AGA2-7PX: CTT TTG TTG TTC GGG CAT TTC C
                                                        AGA2-8PX: GTT GGC TAT TAT GAT AGT CCA TCC

SAG1 (YJR004c): α-agglutinin, chromosome 10             SAG1-78F: GCT ATG TGA ACC AAA AAA AGA TAC C
                                                        SAG1-2005R: GCC TGA TGT TGA AGA ATA ATA TGC
                                                        SAG1-411R*: GTT TTT TGC GAT GAA TCT GAC AGC
                                                        SAG1-711F*: AAT GTC TGA TGT GGT GAA TTT CG
                                                        SAG1-1317F*: GTC GGA AGT AAT CAG TCA TGT GG
                                                        SAG1-1503R*: GAT GTT GAA GTC ACA ATA GGT ACG

* Internal sequencing primers.

Phenotypic variation: All 28 wild isolates were induced
to undergo meiosis, and the four haploid spores were
dissected from the asci. The resultant colonies were all
capable of growth on minimal medium, demonstrating
that none of the 28 strains carried an auxotrophic mutation. The frequency of auxotrophic mutants is thus 0,
with a 2-unit upper support limit of 0.069. In S. cerevisiae,
~60 genes can mutate to auxotrophy, as estimated by
counting gene names denoting amino acid auxotrophy
in the yeast genome (Goffeau et al. 1996). The spontaneous mutation rate in the lab is ~10⁻⁸/locus/mitotic
generation (Drake 1991; Zeyl and Devisser 2001). If
the same values apply to S. paradoxus in nature, and the
population is at mutation-selection balance (i.e., the
frequency of deleterious mutants is equal to q = u/s,
where u is the mutation rate and s is the selection coefficient), the minimum harmonic mean selection coefficient against auxotrophic mutants necessary to keep
them at the observed frequency is ~60 × 10⁻⁸/0.069 ~
10⁻⁵. Thus, even very small selection coefficients would
be sufficient to keep the mutants at the observed low
frequency.
All colonies grown from haploid spores were also
capable of forming tetrads on sporulation medium, indicating that they had autodiploidized following mating-type switching (i.e., were homothallic). In S. cerevisiae it
appears that there is only one locus that can mutate to
give a heterothallic phenotype (HO); making the same
calculations as above indicates that the minimum selection coefficient against such mutants in the wild is
~10⁻⁷.
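The arithmetic behind the two selection-coefficient bounds above can be checked with a short sketch. It assumes that the likelihood of observing no mutants among n isolates is (1 − q)^n, so that the 2-unit upper support limit solves n ln(1 − q) = −2, and it takes the number of loci and the per-locus mutation rate directly from the text. The function names are illustrative and are not part of the original analysis.

```python
import math

def upper_support_limit(n_isolates, drop=2.0):
    """Upper support limit on a frequency when 0 of n isolates carry the trait.

    Assumes a binomial likelihood (1 - q)**n; its log falls `drop` units
    below the maximum (at q = 0) when n * ln(1 - q) = -drop.
    """
    return 1.0 - math.exp(-drop / n_isolates)

def min_selection_coefficient(n_loci, mu_per_locus, q_max):
    """Smallest s consistent with q = u/s, where u = n_loci * mu_per_locus."""
    return n_loci * mu_per_locus / q_max

q_max = upper_support_limit(28)                      # 28 wild isolates, none mutant
s_aux = min_selection_coefficient(60, 1e-8, q_max)   # ~60 auxotrophy loci
s_ho = min_selection_coefficient(1, 1e-8, q_max)     # single HO locus
print(round(q_max, 3), s_aux, s_ho)
```

Under these assumptions the limit comes out close to 0.069, and the two bounds are of order 10⁻⁵ and 10⁻⁷, in line with the values quoted in the text.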
Molecular data set 1: DNA sequences from nine isolates: The initial survey of molecular variation involved
sequencing six loci from nine wild isolates plus the Type,
Danish, FE1, and FE2 isolates. Sequence variation was
discovered at each of the six loci, and there were a total
of 24 polymorphic sites and one polymorphic repeat in
~7000 bp of sequence from nine isolates (see Table 2).
None of the isolates was heterozygous at any of these
polymorphic sites. Three isolates (T8.1, T21.4, and
T32.1) had identical genotypes; subsequent analysis (described below for data set 2) suggests that they are part

TABLE 2
Polymorphisms

Gene    MFα1                MFA1      STE3                                              STE2                     SAG1           AGA2
Base    −130 −108 354  R    17   245  −10  792  805  1365 1544 1593 1679 1707 1718 1775 224  837  1106 1382 1387 −90  915  1578 346  347

Type    A    C    G    3    C    A    A    T    G    G    C    T    A    G    A    T    A    C    C    A    C    A    A    G    G    -
Danish  ?    .    .    4    T    -    G    .    .    A    T    .    .    .    .    .    G    .    T    T    T    .    .    .    T    T
T8.1    T    .    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    .    .    .    .    G    G    A    .    .
T21.4   T    .    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    .    .    .    .    ?    ?    A    .    .
T32.1   T    .    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    .    .    .    .    G    G    A    .    .
T62.1   T    .    .    4    T    -    .    C    .    A    T    A    C    C    T    A    G    .    .    .    .    .    .    .    .    .
T76.6   T    -    .    4    .    .    .    .    T    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    T    T
Q4.1    T    .    a    4    .    .    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    T    T
Q32.3   T    .    .    4    .    .    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    G    G    A    .    .
Q59.1   T    .    .    4    .    .    .    .    .    .    T    .    .    .    .    .    G    .    .    T    T    .    .    .    .    .
Q70.8   T    .    a    4    T    -    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    .    .
FE1     ?    ?    .    .    T    -    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    .    .
FE2     ?    ?    .    4    T    -    .    .    .    .    T    .    .    .    .    .    G    T    .    .    .    .    .    .    .    .
S. c    T    .    .    4    T         G    .    .    .    .    C    .    T    .    A    G    T    .    .    .    .    .    .    .    .

Nucleotide polymorphisms found in the initial survey of nine wild isolates and the Type and Danish strains are shown. Also shown, for comparison, are the nucleotides
found in the Far Eastern strains and in S. cerevisiae (S. c). Dots indicate identity to Type. Bases are numbered from the start of the coding sequence; negative numbers
indicate upstream positions. Column R shows number of repeats of the MFα1 pheromone unit. Noncoding regions are shown in italic type.


TABLE 3
Estimates of nucleotide diversity in S. paradoxus wild isolates

                    Coding sequence                    Noncoding sequence
Gene    Strains     bp      πa × 10³    πs × 10³       bp      π × 10³

MFA1    9           111     0           23.31          463     1.2
MFα1    9           534     0           6.94           150     2.6
STE2    9           1296    0           2.16           214     1.8
STE3    9           1413    0.21        1.42           450     2.5
AGA2    9           264     0           0              221     1.8
SAG1    8           1956    0           3.96           158     3.4
Total               5534    0.07        3.53           1656    1.7

Average pairwise diversity per nucleotide site at synonymous (πs) and nonsynonymous (πa) sites of coding
regions and of adjacent noncoding sequence is shown. Noncoding regions considered are upstream of MFα1
and SAG1; downstream of MFA1, STE2, and STE3; and 91 bp upstream + 130 bp downstream of AGA2. The
STE2 sequence does not include the first 200 bp.

of a single clone. No other pair of isolates had identical
genotypes. Table 3 shows the average pairwise diversity
per nucleotide site of these six genes in wild isolates.
Only one amino acid polymorphism is seen among the
nine wild isolates; the nonsynonymous nucleotide diversity at these loci is low (~0.01%), comparable to that
found in humans (Li and Sadler 1991). By contrast,
the synonymous and noncoding nucleotide diversity is
relatively high (~0.3%), comparable to that found in
Drosophila melanogaster (Begun and Aquadro 1992),
although this is still far lower than the diversity of ~5%
seen between sympatric isolates of Escherichia coli (Hall

and Sharp 1992). These results indicate that the six
genes are under purifying selection in S. paradoxus.
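As an illustration of the quantity reported in Table 3, nucleotide diversity π is the average number of pairwise differences per site among aligned sequences. The following is a minimal sketch under simplifying assumptions: it treats all sites alike and does not separate synonymous from nonsynonymous sites as DnaSP does, and the four toy sequences are invented for the example rather than taken from the study.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched positions for every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

# Hypothetical 10-bp alignment of four haploid sequences (not real data).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "ACGTTCGAAC"]
pi = nucleotide_diversity(toy_alignment)
print(pi)
```

In this toy alignment the six pairs differ at 7 positions in total, so π = 7/60 per site.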
Gene trees for each locus, rooted using the Far Eastern isolates and S. cerevisiae, are shown in Figure 1.
The data fit these trees perfectly—i.e., their consistency
index is 1 (Farris 1989): There is no homoplasy within
the European data. Far Eastern and European isolates,
however, share a polymorphism in MFα1 pheromone
repeat number. There are fixed differences between
Far Eastern and European MFα1 sequences at other
sites, so this homoplasy must have been created either
by recombination between alleles from the Far East and

Figure 1.—Gene trees for
11 European S. paradoxus isolates at six loci. Identical sequences at each locus are
grouped together, and branch
lengths are labeled with number of base changes. T21.4 and
T32.1 are in all cases identical
to T8.1 and have been omitted.
Arrows indicate ancestral state
as indicated by the Far Eastern
isolates.


Europe or by parallel mutations. Parallel mutation is a
plausible cause, as repeat number is highly variable in
Saccharomyces (Kitada and Hishinuma 1988) and varies from two to four repeats in our set of 28 wild isolates
(see below). Overall, then, there is no compelling evidence of recombination within any of these genes.
To test for recombination between genes, the data
from all six loci were combined for parsimony analysis.
The European isolates give a shortest tree of 30 steps,
7 steps longer than the minimum possible (consistency
index ⫽ 0.77), showing extensive homoplasy. Eight of
the 15 possible pairs of gene trees conflict, and no
branch is common to all 6 trees. Moreover, nucleotide
sites in the same gene are significantly more likely to
agree than sites in different genes (partition homogeneity test, P ⫽ 0.002). Recombination does therefore appear to have occurred between the six genes, each of
which is on a different chromosome.
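The consistency index and the number of gene-tree pairs quoted above follow from simple arithmetic, sketched here. The only assumption is the usual definition of the ensemble consistency index as the minimum possible number of steps divided by the length of the shortest tree; the function name is illustrative.

```python
from math import comb

def consistency_index(min_steps, observed_steps):
    """Minimum possible tree length divided by the observed tree length."""
    return min_steps / observed_steps

observed_steps = 30   # shortest tree for the combined six-locus data
extra_steps = 7       # steps beyond the minimum possible
ci = consistency_index(observed_steps - extra_steps, observed_steps)
n_tree_pairs = comb(6, 2)   # unordered pairs among six gene trees
print(round(ci, 2), n_tree_pairs)
```

A value of 1 would indicate no homoplasy, as found for each locus separately; the combined value is lower because the loci support conflicting trees.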
Interestingly, for none of the genes do our wild isolates form a monophyletic clade with respect to the
Type and Danish strains (with the possible exception
of SAG1). This indicates either gene flow on the scale
of thousands of kilometers or large populations since
divergence such that variation present at the time of
divergence has not sorted out.
To compare the gene trees to the expectation under
the null hypothesis of a neutral coalescent, we calculated
the variance of branch lengths in the genealogies and
compared them to those found on randomized genealogies with the same total number of mutations. For this
analysis the sample size was taken as seven (i.e., clonemates
were excluded). For STE3, seven of the eight differences
segregating within our wild isolates are on the same
branch and the variance of branch lengths is 4.1, significantly higher than that in random genealogies (P ~
0.005). For SAG1, all three segregating differences are
on the same branch, and the variance is 0.75, also significant (P ~ 0.05). This clumping of nucleotide
changes on the genealogies could have resulted from
nonindependent mutation (perhaps unlikely since the
changes occurred >600 bp apart), introgression from
other more divergent populations, or balancing selection at a linked locus.
Molecular data set 2: genotypes of 28 isolates at seven
loci: The second data set consists of all 28 isolates genotyped for at least one polymorphism per locus sequenced, plus a microsatellite locus (Table 4). Six isolates, including the three found to be identical in data
set 1, had identical genotypes. This is unlikely in a randomized data set (P < 0.001), and all 6 isolates were
collected within 600 m of one another over a 3-month
period (Figure 2). We interpret these 6 isolates as part
of a clone. If five of these six clonemates are removed from
the data set, there remain 5 pairs of identical isolates and
only 18 different genotypes. This is fewer than would be
expected in a randomized data set (P = 0.05), suggesting that one or more of these are also clonemates.

One such pair (Q15.1 and Q16.1) was collected from
the same tree at the same time and is the most likely
candidate; each other pair is separated by >500 m and
the data do not allow one to distinguish whether these
are clonemates or are identical just by chance.
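One way to test for overrepresentation of genotypes is sketched below. It is an assumption about the general approach, not a description of the MultiLocus implementation: alleles are shuffled among isolates independently at each locus, which mimics free recombination, and the number of distinct multilocus genotypes in each shuffled data set is compared with the observed number. The toy genotypes are invented.

```python
import random

def count_distinct(genotypes):
    """Number of distinct multilocus genotypes."""
    return len(set(genotypes))

def genotype_diversity_test(genotypes, n_perm=1000, seed=0):
    """Randomization test for fewer distinct genotypes than expected.

    Returns the observed count and the proportion of shuffled data sets
    (plus the observed one) with at most that many distinct genotypes.
    """
    rng = random.Random(seed)
    observed = count_distinct(genotypes)
    columns = [list(col) for col in zip(*genotypes)]  # one list per locus
    as_few = 0
    for _ in range(n_perm):
        shuffled = []
        for col in columns:
            col = col[:]
            rng.shuffle(col)  # break associations between loci
            shuffled.append(col)
        if count_distinct(zip(*shuffled)) <= observed:
            as_few += 1
    return observed, (as_few + 1) / (n_perm + 1)

# Hypothetical data: a four-member clone plus four other isolates, three loci.
toy_genotypes = [("A", "C", "1")] * 4 + [
    ("A", "T", "2"), ("G", "C", "2"), ("G", "T", "1"), ("G", "T", "2"),
]
observed_count, p_value = genotype_diversity_test(toy_genotypes)
print(observed_count, p_value)
```

A small P value indicates that repeated genotypes are more common than expected if alleles at different loci were combined at random, which is the signature of clonal growth.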
Apart from this localized clonal growth, there is no
obvious correlation between genotype and geographic
location. With all isolates included, there is a significant
positive regression, across all pairs of isolates, of genotypic distance (proportion of loci at which the isolates
differ) on geographical distance (slope = 0.01 km⁻¹,
P ~ 0.02). However, if only a single (randomly chosen)
isolate of each distinct genotype is included in the analysis, the regression is not significant (slope = 0.005 km⁻¹,
P ~ 0.25). It appears that this population experiences
frequent gene flow on a kilometer scale.
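The randomization test for an association between genetic and geographical distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the Mathematica code used in the study: genotypic distance is the proportion of loci at which two isolates differ, geographical distance is Euclidean distance between coordinates, and significance is judged by permuting the assignment of locations to isolates. Coordinates and genotypes are invented.

```python
import math
import random
from itertools import combinations

def genotypic_distance(g1, g2):
    """Proportion of loci at which two isolates differ."""
    return sum(a != b for a, b in zip(g1, g2)) / len(g1)

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def distance_regression_test(genotypes, coords, n_perm=999, seed=0):
    """One-sided permutation test for a positive slope of genetic on geographic distance."""
    rng = random.Random(seed)
    pairs = list(combinations(range(len(genotypes)), 2))
    gen = [genotypic_distance(genotypes[i], genotypes[j]) for i, j in pairs]

    def geo(order):
        return [math.dist(coords[order[i]], coords[order[j]]) for i, j in pairs]

    identity = list(range(len(coords)))
    observed = slope(geo(identity), gen)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # reassign locations to isolates at random
        if slope(geo(order), gen) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical isolates along a 3-km transect, scored at two loci.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
toy_genotypes = [("A", "A"), ("A", "A"), ("A", "B"), ("B", "B")]
observed_slope, p_value = distance_regression_test(toy_genotypes, toy_coords)
print(observed_slope, p_value)
```

Permuting isolates rather than individual pairs respects the nonindependence of pairwise distances, which is why an ordinary regression P value would not be appropriate here.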
Homozygosity and inbreeding: In the entire data set,
only a single isolate was heterozygous, at a single locus
(Table 4). Wright’s inbreeding coefficient, F, estimated
from the fixation index (Brown 1979) is 0.99. This
suggests a high level of inbreeding. In the appendix
we model a mixed-mating population in which diploid
individuals are derived either from intra-ascus mating
or from random outcrossing. Using this model, the maximum-likelihood estimate of the outcrossing rate is
1.1%, with 2-unit support limits of 0.06% and 5%. If autodiploidization occurs in the wild, this method will underestimate the true outcrossing rate, as autodiploidization
removes heterozygosity far more quickly than intra-ascus
mating does (appendix).
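The fixation index can be illustrated with a simple estimator, F = 1 − H_obs/H_exp, where H_obs is the observed proportion of heterozygotes and H_exp is the proportion expected under random mating given the allele frequencies. The sketch below sums both quantities over loci; this is one common form and may differ in detail from the estimator of Brown (1979). The toy genotypes are invented.

```python
from collections import Counter

def fixation_index(loci):
    """Estimate F = 1 - H_obs / H_exp from diploid genotypes.

    `loci` is a list of loci; each locus is a list of (allele1, allele2) tuples.
    Monomorphic loci contribute nothing to either heterozygosity.
    """
    h_obs = 0.0
    h_exp = 0.0
    for genotypes in loci:
        counts = Counter(allele for g in genotypes for allele in g)
        total = sum(counts.values())
        h_exp += 1.0 - sum((c / total) ** 2 for c in counts.values())
        h_obs += sum(g[0] != g[1] for g in genotypes) / len(genotypes)
    return 1.0 - h_obs / h_exp

# Hypothetical locus: 5 A/A, 4 G/G, and a single A/G heterozygote.
toy_locus = [("A", "A")] * 5 + [("G", "G")] * 4 + [("A", "G")]
f_toy = fixation_index([toy_locus])
print(round(f_toy, 3))
```

With only one heterozygous genotype call in the whole data set, as in Table 4, an estimator of this kind gives a value close to 1.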
Recombination: In both data sets, there is abundant
evidence of recombination between loci. Of the 21 possible pairs of loci, 18 of them are phylogenetically incompatible (i.e., show evidence of past recombination). Parsimony analysis of the entire data set gives a shortest
tree of 22 steps, compared to a minimum possible of
12 (consistency index ⫽ 0.54). Taken as a whole there
is significant multilocus linkage disequilibrium (I_A =
0.21, r_D = 0.035, P ~ 0.02), but not if each distinct
genotype is reduced to a single observation (I_A = −0.05,
r_D = −0.008, P ~ 0.6).
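For readers unfamiliar with the two indices, the sketch below computes them from haploid (or fully homozygous) multilocus genotypes, following the definitions as we understand them from Agapow and Burt (2001): the variance of the number of loci at which pairs of isolates differ is compared with the sum of the per-locus variances. It is an illustrative reimplementation, not the MultiLocus code, and it assumes every locus is polymorphic. The toy genotypes are invented.

```python
from itertools import combinations
from math import sqrt

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def association_indices(genotypes):
    """Return (I_A, r_D) for a list of multilocus genotypes (tuples of alleles)."""
    n_loci = len(genotypes[0])
    pairs = list(combinations(genotypes, 2))
    # per_locus[j][k] is 1 if pair k differs at locus j, else 0.
    per_locus = [[int(a[j] != b[j]) for a, b in pairs] for j in range(n_loci)]
    totals = [sum(d) for d in zip(*per_locus)]
    v_obs = variance(totals)                   # variance of pairwise distances
    var_j = [variance(d) for d in per_locus]   # per-locus variances
    v_exp = sum(var_j)                         # expected variance with no association
    i_a = v_obs / v_exp - 1.0
    denom = 2.0 * sum(
        sqrt(var_j[j] * var_j[k]) for j, k in combinations(range(n_loci), 2)
    )
    return i_a, (v_obs - v_exp) / denom

# Hypothetical extreme case: two loci in complete association.
toy_genotypes = [("A", "A"), ("A", "A"), ("B", "B"), ("B", "B")]
i_a, r_d = association_indices(toy_genotypes)
print(i_a, r_d)
```

Unlike I_A, the second index is scaled so that its maximum does not grow with the number of loci, which makes values comparable across data sets.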
DISCUSSION

Like S. cerevisiae, S. paradoxus is capable of three types
of reproduction in the laboratory: clonal replication,
inbreeding, and outcrossing. All three appear to be
important in molding the pattern of genetic variation in
our natural population. Evidence for clonal replication
comes from the repeated isolation of the same genotype,
more than would be expected by chance: Among our
28 wild isolates, 6 appear to be members of a single
clone, and at least one of the other five pairs of identical
genotypes is also likely to be clonemates. There may
have been inbreeding in the ancestry of these clonemates,
or even mating between clonemates, but inbreeding
alone without clonal replication would not lead to such


TABLE 4
Genotypes of 28 wild S. paradoxus isolates

          Month       MFα1          MFA1   STE3       STE2         SAG1   AGA2
ID        collected   333, 354, R   17     792, 805   1382, 1387   1578   346, 347   TFA1

Silwood Park
W7        10/96       TA4           C      TG         ?            G      G-         2
S36.7     12/97       TG4           C      CG         TT           G      G-         3
T4bᵃ      5/98        TG4           T      TG         AC           A      G-         1
T8.1ᵃ     5/98        TG4           T      TG         AC           A      G-         1
T18.2     5/98        TG4           C      TG         ?            A/G    G-         3
T21.4ᵃ    5/98        TG4           T      TG         AC           A      G-         1
T22.1ᵃ    5/98        TG4           T      TG         AC           A      G-         1
T26.3     7/98        ?             C      TG         AC           A      TT         1
T27.3ᵃ    7/98        TG4           T      TG         AC           A      G-         1
T32.1ᵃ    7/98        TG4           T      TG         AC           A      G-         1
T62.1     7/98        TA4           T      CG         AC           G      G-         2
T68.2ᵃ    7/98        TA4           T      TG         AC           G      G-         2
T76.6ᵃ    7/98        TG4           C      TT         AC           G      TT         2

Windsor Great Park
Q4.1      9/98        TA4           C      TG         AC           G      TT         1
Q6.1      9/98        TA4           T      TG         AC           A      TT         1
Q14.4ᵃ    9/98        TA2           C      TG         TT           G      G-         2
Q15.1ᵃ    9/98        AA3           C      TG         TT           A      TT         2
Q16.1ᵃ    9/98        AA3           C      TG         TT           A      TT         2
Q31.4ᵃ    9/98        TG4           C      TT         AC           G      TT         2
Q32.3     9/98        TG4           C      TG         AC           A      G-         1
Q43.5ᵃ    9/98        TA4           C      TG         ?            G      G-         1
Q59.1     10/98       TG4           C      TG         TT           G      G-         1
Q62.5     10/98       TG4           C      TG         TT           G      TT         2
Q69.8ᵃ    10/98       TA2           C      TG         ?            G      G-         2
Q70.8ᵃ    10/98       TA4           T      TG         AC           G      G-         2
Q74.4ᵃ    10/98       TA4           C      TG         ?            G      G-         1
Q89.8     10/98       TG4           C      CG         TT           A      G-         1
Q95.3     10/98       TG4           C      TG         AC           G      G-         1

Bases or repeat numbers are shown for polymorphic sites at seven loci. Numbers under the gene names
indicate polymorphic positions scored (see Table 2). Two MFα1 alleles were absent from the nine-isolate set:
both differ from Type sequence by the G → A change at base 354; allele 3 has a further T → A change at
base 333 and three pheromone repeats. Allele 4 has two pheromone repeats.
ᵃ Isolate IDs with indistinguishable genotypes.

an overrepresentation of genotypes. Evidence for inbreeding comes from the high homozygosity. An assumption in making this inference is that S. paradoxus
in the field behaves as it does in the lab, and in particular
that the diplophase predominates, and so the cells we
isolated are diploid. In principle, an alternative explanation for the lack of heterozygosity is that cells are haploids in nature, but autodiploidize in the early stages of
the isolation procedure. However, we do not consider
it likely that S. paradoxus should change its life cycle so
drastically in response to laboratory conditions. Finally,
evidence for outcrossing comes from the single heterozygote we found plus the genealogical incompatibility
between loci and absence of linkage disequilibrium.
This contrast between the great excess of homozygosity and the absence of linkage disequilibrium between
genes reflects the fact that even small amounts of outcrossing and recombination will randomize alleles at

different loci (Maynard Smith 1994). Nevertheless,
inbreeding reduces the effective rate of recombination
(r_e) in the population below the actual rate (r_a), according to the relation r_e = (1 − F)r_a (Dye and Williams 1997; Nordborg 2000). This is because recombination is effective only in heterozygous individuals, and
inbreeding reduces the frequency of heterozygotes. In
our population, F = 0.99, and so the effective recombination rate is 1% of what it would be in a random-mating population. This means that linkage disequilibrium should extend for greater distances along the
genome than would otherwise be the case and may have
contributed to the absence of evidence for recombination within any of the genes studied. This extension of
linkage disequilibrium along the genome means that
DNA sequences will be more informative for at least
some types of analyses than would otherwise be the case
(Nordborg 2000), which makes S. paradoxus yet more


Figure 2.—Locations of oak trees from which wild isolates were collected. Superimposed circles indicate isolates from the
same tree. Suspected clones are shown as open circles.

attractive as a model system for population genetics and
genomics. Also relevant, of course, is the actual rate of
recombination, and it is interesting that S. cerevisiae has
one of the highest known recombination rates per
megabase of DNA. One explanation is that this has
evolved to compensate for a low rate of outcrossing, as
is suggested to explain the high chiasmata frequency
seen in selfing plants (e.g., Zarchi et al. 1972). Alternatively, it is possible that the high rate of recombination
has evolved as a consequence of intense selection pressures imposed by domestication (Burt and Bell 1987).
It will be interesting to see whether S. paradoxus also
has a high rate of recombination in lab crosses and to
determine just how far linkage disequilibrium extends
along the genome.
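The effect of inbreeding on the effective recombination rate, and hence on the persistence of linkage disequilibrium, can be illustrated numerically. The sketch below applies the relation r_e = (1 − F)r_a from the text and then, as an added approximation that is not part of the original analysis, uses the standard random-mating result that disequilibrium decays by a factor (1 − r) per sexual generation, with r_e substituted for r.

```python
import math

def effective_recombination_rate(actual_rate, inbreeding_coefficient):
    """r_e = (1 - F) * r_a."""
    return (1.0 - inbreeding_coefficient) * actual_rate

def ld_half_life(recombination_rate):
    """Generations for disequilibrium to halve if it decays as (1 - r)**t.

    Treating r_e as r under partial inbreeding is an approximation.
    """
    return math.log(0.5) / math.log(1.0 - recombination_rate)

fraction_retained = effective_recombination_rate(1.0, 0.99)   # relative to random mating
r_e_unlinked = effective_recombination_rate(0.5, 0.99)        # unlinked loci, r_a = 0.5
half_life_inbred = ld_half_life(r_e_unlinked)
half_life_random = ld_half_life(0.5)
print(fraction_retained, half_life_inbred, half_life_random)
```

Under these assumptions, disequilibrium between unlinked loci halves in roughly 140 sexual generations rather than one, which is slow per generation but still fast enough on an evolutionary timescale to randomize alleles at different loci.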
The low effective rate of recombination over distances
of ~1 kb allowed us to reconstruct genealogies for each
gene. We compared the variance of branch lengths to
those found on random genealogies and detected significant deviations from neutrality in two genes, both
in the direction of changes being clumped on the genealogy. Nonindependent mutation, introgression, or balancing selection could give rise to such a pattern, al-

though formal theoretical work would be useful in
clarifying this. If balancing selection operates, it is probably not heterozygote advantage (given the low levels
of heterozygosity), but frequency-dependent selection.
Inbreeding in S. paradoxus can occur both by intra-ascus mating and by autodiploidization (as well as by
mating between other types of relatives) and it is not
possible with our data to determine the relative frequency of these alternatives. One possible approach
would be to compare heterozygosity at loci tightly linked
to the mating-type locus to that at unlinked loci; if there
has not been switching, heterozygosity near the mating-type locus will be maintained, even with selfing. Presumably switching does occur at least occasionally, as
otherwise selection would not maintain the underlying
mechanism.
Inbreeding species present some difficulties for interpreting sequence variability, due to genotypes being
nonindependent. Although inbreeding predominates
over outcrossing in S. paradoxus, it is not as extreme in
this regard as some other yeasts, at least in the laboratory—in many species, mating typically occurs between
a haploid mother cell and a daughter bud (Johannsen
and van der Walt 1980; Kurtzman and Fell 1998).
Other species probably outcross more than S. paradoxus—in particular, species that are vegetatively haploid and heterothallic (Kurtzman and Fell 1998). It
would be interesting to compare patterns of genetic
variation for such species with those found here.
Finally, the results reported here differ markedly from
those reported for S. cerevisiae from wineries, in which
there was a high frequency of heterozygous strains, recessive lethals, and heterothallism (Mortimer 2000).
These differences are presumably the effect of domestication, although the precise details remain obscure.
With the development of wild strain collections, such
as are available for Drosophila, and the identification
of more molecular markers in this species, S. paradoxus
may prove to be a valuable addition to the current suite
of model organisms available to the population geneticist.
Thanks go to Alexandra Eggington and Celine Vass for technical
help. This work was funded by the Natural Environment Research
Council through studentships to Louise Johnson, Matthew Goddard, and
Richard Hetherington and a grant to Austin Burt.

LITERATURE CITED
Agapow, P.-M., and A. Burt, 2001 Indices of multilocus linkage
disequilibrium. Mol. Ecol. Notes 1: 101–102.
Altschul, S. F., W. Gish, W. Miller, E. W. Myers and D. J. Lipman,
1990 Basic local alignment search tool. J. Mol. Biol. 215: 403–
410.
Andolfatto, P., and M. Przeworski, 2000 A genome-wide departure from the standard neutral model in natural populations of
Drosophila. Genetics 156: 257–268.
Barnett, J. A., R. W. Payne and D. Yarrow, 1990 Yeasts: Characteristics and Identification. Cambridge University Press, Cambridge,
UK/London/New York.
Begun, D. J., and C. F. Aquadro, 1992 Levels of naturally occurring
DNA polymorphism correlate with recombination rates in D.
melanogaster. Nature 356: 519–520.
Brown, A. H. D., 1979 Enzyme polymorphism in plant populations.
Theor. Popul. Biol. 15: 1–42.
Burt, A., and G. Bell, 1987 Mammalian chiasma frequencies as a
test of two theories of recombination. Nature 326: 803–805.
Drake, J. W., 1991 A constant rate of spontaneous mutation in DNA-based microbes. Proc. Natl. Acad. Sci. USA 88: 7160–7164.
Dye, C., and B. G. Williams, 1997 Multigenic drug resistance among
inbred malaria parasites. Proc. R. Soc. Lond. Ser. B Biol. Sci.
264: 61–67.
Farris, J. S., 1989 The retention index and rescaled consistency
index. Cladistics 5: 417–419.
Farris, J. S., M. Kallersjo, A. C. Kluge and C. Bult, 1994 Testing
significance of incongruence. Cladistics 10: 315–319.
Field, D., and C. Wills, 1998 Abundant microsatellite polymorphism in S. cerevisiae, and the different distributions of microsatellites in eight prokaryotes and S. cerevisiae, result from strong
mutation pressures and a variety of selective forces. Proc. Natl.
Acad. Sci. USA 95: 1647–1652.
Goddard, M. R., and A. Burt, 1999 Recurrent invasion and extinction of a selfish gene. Proc. Natl. Acad. Sci. USA 96: 13880–13885.
Goffeau, A., B. G. Barrell, H. Bussey, R. W. Davis, B. Dujon et
al., 1996 Life with 6000 genes. Science 274: 563–567.
Hall, B. G., and P. M. Sharp, 1992 Molecular population genetics
of Escherichia coli: DNA sequence diversity at the celC, crr and gutB
loci of natural isolates. Mol. Biol. Evol. 9: 654–665.
Herbert, C. J., G. Dujardin, M. Labouesse and P. P. Slonimski,
1988 Divergence of the mitochondrial leucyl transfer-RNA synthetase genes in 2 closely related yeasts, Saccharomyces cerevisiae
and Saccharomyces douglasii—a paradigm of incipient evolution.
Mol. Gen. Genet. 213: 297–309.
Herskowitz, I., 1988 Life cycle of the budding yeast Saccharomyces
cerevisiae. Microbiol. Rev. 52: 536–553.
Johannsen, E., and J. P. van der Walt, 1980 Hybridization studies
within the genus Schwanniomyces Klöcker. Can. J. Microbiol. 26:
1199–1203.
Kitada, K., and F. Hishinuma, 1988 Evidence for preferential multiplication of the internal unit in tandem repeats of MFalpha genes
in Saccharomyces yeasts. Curr. Genet. 13: 1–5.
Kurtzman, C. P., and J. W. Fell, 1998 The Yeasts: A Taxonomic Survey,
Ed. 4. Elsevier, Amsterdam.
Li, W.-H., and L. A. Sadler, 1991 Low nucleotide diversity in man.
Genetics 129: 513–523.
Maynard Smith, J., 1994 Estimating the minimum rate of genetic
transformation in bacteria. J. Evol. Biol. 7: 525–534.
Mortimer, R. K., 2000 Evolution and variation of the yeast (Saccharomyces) genome. Genome Res. 10: 403–409.
Naumov, G. I., E. Naumova and M. Korhola, 1992a Genetic identification of natural Saccharomyces sensu stricto yeasts from Finland,
Holland and Slovakia. Antonie van Leeuwenhoek 61: 237–243.
Naumov, G. I., E. S. Naumova, R. A. Lantto, E. J. Louis and M.
Korhola, 1992b Genetic homology between Saccharomyces cerevisiae and its sibling species S. paradoxus and S. bayanus: electrophoretic karyotypes. Yeast 8: 599–612.
Naumov, G. I., E. S. Naumova and A. Querol, 1997a Genetic study
of natural introgression supports delimitation of biological species in the Saccharomyces sensu stricto complex. Syst. Appl. Microbiol. 20: 595–601.
Naumov, G. I., E. S. Naumova and P. D. Sniegowski, 1997b Differentiation of European and Far East Asian populations of Saccharomyces paradoxus by allozyme analysis. Int. J. Syst. Bacteriol.
47: 341–344.
Naumov, G. I., E. S. Naumova and P. D. Sniegowski, 1998 Saccharomyces paradoxus and Saccharomyces cerevisiae are associated with
exudates of North American oaks. Can. J. Microbiol. 44: 1045–
1050.
Nelson, M. A., 1996 Mating systems in ascomycetes: a romp in the
sac. Trends Genet. 12: 69.
Nordborg, M., 2000 Linkage disequilibrium, gene trees and selfing:
an ancestral recombination graph with partial self-fertilization.
Genetics 154: 923–929.
Rozas, J., and R. Rozas, 1999 DnaSP version 3: an integrated program for molecular population genetics and molecular evolution
analysis. Bioinformatics 15: 174–175.
Sambrook, J., E. F. Fritsch and T. Maniatis, 1989 Molecular Cloning: A Laboratory Manual, Ed. 2. Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY.
Sherman, F., 1991 Getting started with yeast, pp. 3–21 in Guide to
Yeast Genetics and Molecular Biology, edited by C. Guthrie and
G. R. Fink. Academic Press, San Diego.
Sniegowski, P. D., P. G. Dombrowski and E. Fingerman, 2002 Saccharomyces cerevisiae and Saccharomyces paradoxus coexist in
a natural woodland site in North America and display different
levels of reproductive isolation from European conspecifics.
FEMS Yeast Res. 1: 299–306.
Swofford, D. L., 2002 PAUP*. Phylogenetic Analysis Using Parsimony
(*and Other Methods), Version 4. Sinauer Associates, Sunderland,
MA.
Vaughan-Martini, A., and A. Martini, 1995 Facts, myths and legends on the prime industrial microorganism. J. Indust. Microbiol.
14: 514–522.
Wall, J. D., P. Andolfatto and M. Przeworski, 2002 Testing models of selection and demography in Drosophila simulans. Genetics
162: 203–216.
White, T. J., T. Bruns, S. Lee and J. W. Taylor, 1990 Amplification
and direct sequencing of fungal rRNA genes for phylogenetics,
pp. 315–322 in PCR Protocols: A Guide to Methods and Applications,
edited by M. A. Innis, D. H. Gelfand, J. J. Sninsky and T. J.
White. Academic Press, San Diego.
Wolfram Research, 1999 Mathematica, Version 4. Wolfram Research, Champaign, IL.
Zarchi, Y., G. Simchen, J. Hillel and T. Schaap, 1972 Chiasmata
and the breeding system in wild populations of diploid wheats.
Chromosoma 38: 77–94.
Zeyl, C., 2000 Budding yeast as a model organism for population
genetics. Yeast 16: 773–784.
Zeyl, C., and J. A. G. M. DeVisser, 2001 Estimates of the rate
and distribution of fitness effects of spontaneous mutation in
Saccharomyces cerevisiae. Genetics 157: 53–61.
Communicating editor: D. Charlesworth

APPENDIX

To estimate the frequency of outcrossing compatible
with the observed level of heterozygosity, we first modeled a mixed-mating population in which haploid cells
either mate within the ascus with probability $s$ or mate
randomly in the population with probability $t$ ($= 1 - s$).
Note first that in such a population, the probability
that an individual chosen at random is derived from $x$
generations of selfing (i.e., there are exactly $x$ generations of selfing in its ancestry before one gets back to
an outcrossing event) is $s^x t$. Second, the probability that
an individual derived from $x$ generations of selfing is
homozygous at locus $i$ is $1 - HW_i (2/3)^x$, where $HW_i$ is
the Hardy-Weinberg proportion of heterozygotes in the
population at that locus. Note that in this system selfing
reduces heterozygosity by one-third every generation,
not by one-half, as in more familiar systems where selfing
gametes come from independent meioses (e.g., plants).
This is because, with intra-ascus mating, each haploid
spore produced from a heterozygous diploid shares an
allele with only one of its three potential mating partners. Finally, the overall probability that a random individual is homozygous at the ith locus is the product
of these two probabilities, summed over all possible
numbers of generations of selfing in its ancestry:
$$p(\text{homozygous}) = \sum_{x=0}^{\infty} s^x t \left(1 - HW_i \left(\tfrac{2}{3}\right)^x\right).$$
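As a worked check, the single-locus sum can be evaluated in closed form as a geometric series. This simplification is not stated in the original text; it follows from $t = 1 - s$ and assumes $0 < t \le 1$ so that both series converge.

```latex
\begin{align}
\sum_{x=0}^{\infty} s^x t &= \frac{t}{1 - s} = 1, \\
\sum_{x=0}^{\infty} s^x t \, HW_i \left(\tfrac{2}{3}\right)^x
  &= \frac{t \, HW_i}{1 - \tfrac{2}{3}s} = \frac{3 t \, HW_i}{1 + 2t}, \\
p(\text{homozygous}) &= 1 - \frac{3 t \, HW_i}{1 + 2t}.
\end{align}
```

With complete outcrossing ($t = 1$) this reduces to $1 - HW_i$, the Hardy-Weinberg expectation, and as $t$ approaches 0 the probability of homozygosity approaches 1.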

In our data set there are 7 loci, and the probability that
an individual will be homozygous at all of them is then
$$p(\text{all homozygous}) = \sum_{x=0}^{\infty} s^x t \prod_{i=1}^{7} \left(1 - HW_i \left(\tfrac{2}{3}\right)^x\right).$$

Note that this assumes the loci are independent. For
the six isolates with missing data, the inside product is
done over only the loci for which there are data. Finally,
isolate T18.2 is homozygous at 5 loci and heterozygous
at SAG1, and the probability of an individual being this
is
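$$p(\text{T18.2}) = \sum_{x=0}^{\infty} s^x t \left[\prod_{i=1}^{5} \left(1 - HW_i \left(\tfrac{2}{3}\right)^x\right)\right] HW_{SAG1} \left(\tfrac{2}{3}\right)^x.$$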
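The genotype probabilities for this intra-ascus model can be evaluated numerically. The following is a minimal sketch, not the authors' implementation (the reference list suggests Mathematica was used). The function name `genotype_probability`, its arguments, and the truncation at `n_terms` generations are assumptions made for illustration.

```python
from math import prod


def genotype_probability(t, hw_hom, hw_het=(), n_terms=200):
    """Probability of a multilocus genotype under intra-ascus selfing.

    t       : outcrossing rate (s = 1 - t is the intra-ascus mating rate)
    hw_hom  : Hardy-Weinberg heterozygosities of loci scored homozygous
    hw_het  : Hardy-Weinberg heterozygosities of loci scored heterozygous
    n_terms : number of selfing generations summed explicitly (assumption)
    """
    s = 1.0 - t
    total = 0.0
    for x in range(n_terms):
        decay = (2.0 / 3.0) ** x  # heterozygosity remaining after x selfings
        hom = prod(1.0 - hw * decay for hw in hw_hom)
        het = prod(hw * decay for hw in hw_het)
        total += (s ** x) * t * hom * het
    # Beyond n_terms generations (2/3)^x is negligible, so individuals are
    # effectively fully homozygous; the remaining weight sums to s**n_terms.
    if not hw_het:
        total += s ** n_terms
    return total
```

Passing all scored loci in `hw_hom` corresponds to the all-homozygous case, and passing one locus in `hw_het` corresponds to a genotype such as T18.2. Loci with missing data are simply left out of both lists.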

When we count only one isolate of each distinct genotype, the data consist of 14 completely homozygous genotypes, two homozygous isolates with unknown STE2
genotype, one homozygous isolate with unknown MFα1
genotype, and the heterozygote T18.2. The probability
of observing the entire data set is therefore
$$p(\text{data}) = p(\text{all homozygous})^{14} \times p(\text{missing } STE2)^{2} \times p(\text{missing } MF\alpha 1) \times p(\text{T18.2}).$$
This probability is maximized at an outcrossing rate of $t = 1.1\%$, with 2-unit support limits of
0.06 and 5%.
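The maximization and the 2-unit support limits can be illustrated with a simple grid search over $t$. This is a sketch under stated assumptions: the Hardy-Weinberg heterozygosities in `HW` are hypothetical placeholders (the real values depend on allele frequencies not given in this appendix), the names `locus4` to `locus7` are invented, and the choice of which locus is unscored in T18.2 is also an assumption. The numbers it produces therefore will not reproduce the estimates reported here.

```python
from math import log, prod

# HYPOTHETICAL Hardy-Weinberg heterozygosities, for illustration only.
HW = {"STE2": 0.4, "MFalpha1": 0.3, "SAG1": 0.5,
      "locus4": 0.3, "locus5": 0.2, "locus6": 0.4, "locus7": 0.3}


def genotype_probability(t, hw_hom, hw_het=(), n_terms=120):
    """Intra-ascus model: truncated sum over x generations of selfing."""
    s = 1.0 - t
    total = 0.0
    for x in range(n_terms):
        decay = (2.0 / 3.0) ** x
        hom = prod(1.0 - hw * decay for hw in hw_hom)
        het = prod(hw * decay for hw in hw_het)
        total += (s ** x) * t * hom * het
    if not hw_het:
        total += s ** n_terms  # tail: fully homozygous after many selfings
    return total


def hom_loci(exclude=()):
    """Heterozygosities of all loci except those named in `exclude`."""
    return [hw for name, hw in HW.items() if name not in exclude]


def log_likelihood(t):
    ll = 14 * log(genotype_probability(t, hom_loci()))
    ll += 2 * log(genotype_probability(t, hom_loci(("STE2",))))
    ll += log(genotype_probability(t, hom_loci(("MFalpha1",))))
    # T18.2: heterozygous at SAG1, homozygous at five loci; treating
    # "locus7" as the unscored locus is an assumption.
    ll += log(genotype_probability(t, hom_loci(("SAG1", "locus7")),
                                   [HW["SAG1"]]))
    return ll


# Grid of outcrossing rates from 0.05% to 50% in steps of 0.05%.
grid = [k * 0.0005 for k in range(1, 1001)]
lls = [log_likelihood(t) for t in grid]
best = max(range(len(grid)), key=lls.__getitem__)
t_hat = grid[best]

# 2-unit support limits: values of t within 2 log-likelihood units of the max.
supported = [t for t, ll in zip(grid, lls) if ll >= lls[best] - 2.0]
t_low, t_high = supported[0], supported[-1]
print(f"t_hat={t_hat:.4f}, support limits=({t_low:.4f}, {t_high:.4f})")
```

A grid search is used here only because it is transparent; any one-dimensional optimizer would serve equally well.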
We also modeled a mixed-mating population in which
individuals were derived either from mating between
clonemates (autodiploidization) with probability s or
from random outcrossing with probability t. In this case
individuals are either completely homozygous at all loci
or heterozygous at Hardy-Weinberg proportions, and
the probability an individual is homozygous at the ith
locus is
$$p(\text{homozygous}) = s + t(1 - HW_i).$$
With this model the maximum-likelihood outcrossing
rate is 6%, with 2-unit support limits of 0.3 and 23%,
higher than that in the previous model, as a greater
frequency of outcrossing is needed to counterbalance
the more intense inbreeding caused by autodiploidization.
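For comparison, the autodiploidization model needs no sum over generations. The sketch below extends the single-locus expression to a multilocus genotype by assuming that loci are independent among outcrossed individuals, as in the first model. That extension, and the function name `autodiploid_genotype_probability`, are assumptions rather than something stated in the text.

```python
from math import prod


def autodiploid_genotype_probability(t, hw_hom, hw_het=()):
    """Probability of a multilocus genotype under autodiploidization.

    t       : outcrossing rate (s = 1 - t is the autodiploidization rate)
    hw_hom  : Hardy-Weinberg heterozygosities of loci scored homozygous
    hw_het  : Hardy-Weinberg heterozygosities of loci scored heterozygous
    """
    s = 1.0 - t
    # Outcrossed individuals follow Hardy-Weinberg proportions at each locus.
    outcrossed = prod(1.0 - hw for hw in hw_hom) * prod(hw for hw in hw_het)
    # Autodiploidized individuals are homozygous everywhere, so they can
    # only produce genotypes with no heterozygous loci.
    selfed = 0.0 if hw_het else 1.0
    return s * selfed + t * outcrossed
```

With a single homozygous locus the function returns $s + t(1 - HW_i)$, matching the expression above.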