
Genome Evolution in Yeast

Gilles Fischer

27th January 2009 | European Course on

INTRODUCTION:
Comparative genomics
Yeasts as model organisms
GENOME EVOLUTION:
DNA duplications
Chromosome dynamics
Nucleotide composition

A brief introduction to the field of Comparative Genomics

Comparing genomes is a very old idea

DNA carries the genetic information: Avery (1943) and Hershey-Chase (


Vendrely and Vendrely (1950):
"There is no doubt that the systematic study of the absolute
deoxyribonucleic acid content of the nucleus, across many animal
species, can provide interesting suggestions concerning the problem
of evolution"
Jacques Monod:

"Anything that is true of E. coli is true of the elephant"

A brief introduction to the field of Comparative Genomics

identical

divergent

different

time
or
quantity of evolutionary
changes
Looking for differences

Looking for similarities

NEED FOR ADEQUATELY RELATED ORGANISMS

A brief introduction to the field of Comparative Genomics


Bio-informatics

Genome sequences

Looking for differences


Looking for similarities

Rules governing
genome evolution

SMALL GENOMES
AND
EXPERIMENTALLY TRACTABLE

Experimental Biology

Molecular
mechanisms

Genetic screens
functional genomics

Mechanistic
hypotheses

A brief introduction to the field of Yeast Genomics


Organisms with small genomes, phylogenetically related and
experimentally tractable = YEASTS
- Eukaryotic micro-organisms classified in the kingdom Fungi
- About 1,500 species currently described (estimated to be only 1% of all yeast species)
- Yeasts are unicellular, typically measuring 3-4 µm in diameter (up to over 40 µm)
- Saccharomyces cerevisiae has been used in baking and in fermenting alcoholic beverages for thousands of years
- Other species of yeast, such as Candida albicans, are opportunistic human pathogens
- Yeasts have recently been used to generate electricity in microbial fuel cells and to produce ethanol for the biofuel industry
- Yeasts are found in both divisions Ascomycota and Basidiomycota

The Tree of Eukaryotes (Keeling et al., 2005)

Saccharomycotina

A brief introduction to the field of Yeast Genomics

The first eukaryotic genome sequence: the genome of S. cerevisiae

André Goffeau: 8 years, 120 labs, 641 people

"Life with 6000 genes", Science (1996)

Saccharomyces paradoxus
Saccharomyces mikatae
Saccharomyces cerevisiae
Saccharomyces kudriavzevii
Saccharomyces bayanus
Saccharomyces pastorianus
Saccharomyces exiguus
Saccharomyces servazzii
Saccharomyces castellii
Candida glabrata
Vanderwaltozyma polyspora
Zygosaccharomyces rouxii
Lachancea thermotolerans
Lachancea waltii
Lachancea kluyveri
Kluyveromyces lactis
Kluyveromyces marxianus
Eremothecium gossypii
Saccharomycodes ludwigii
Brettanomyces bruxellensis
Pichia angusta
Candida lusitaniae
Debaryomyces hansenii
Pichia stipitis
Pichia sorbitophila
Candida guilliermondii
Candida tropicalis
Candida parapsilosis
Lodderomyces elongisporus
Candida albicans
Candida dubliniensis
Arxula adeninivorans
Yarrowia lipolytica
Schizosaccharomyces pombe

Saccharomycotina
A brief introduction to the field of Yeast Genomics

Events placed along the tree:
- Whole Genome Duplication
- Gain of megasatellites
- Gain of HO gene
- Gain of mating type cassettes and small centromeres
- Frequent tandem duplications
- Extensive loss of transposable elements and spliceosomal introns

A brief introduction to the field of Yeast Genomics

Genome annotation

Genome                      # chr   size (Mb)   # genes   # tRNA   # introns
Saccharomyces cerevisiae    16      12.1        5769      274      287
Candida glabrata            13      12.3        5204      207      131
Zygosaccharomyces rouxii    -       9.8         4998      272      167
Lachancea kluyveri          -       11.3        5308      258      322
Lachancea thermotolerans    -       10.4        5104      231      286
Kluyveromyces lactis        -       10.7        5084      162      175
Debaryomyces hansenii       -       12.1        6273      200      475
Yarrowia lipolytica         -       20.5        6434      510      1070

(WashU seq center, M. Johnston)

A brief introduction to the field of Yeast Genomics

Evolutionary scale

Figure: amino acid identity (%) and divergence times (MYr, from 100 MYr up to 300 - 1000 MYr) for yeasts (Saccharomyces cerevisiae, Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri, Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii, Yarrowia lipolytica) compared with chordates (Homo sapiens, Mus musculus, Takifugu rubripes, Tetraodon nigroviridis, Ciona intestinalis).
Identity values shown next to the yeast species: Candida glabrata 65, Kluyveromyces lactis 60, Debaryomyces hansenii 51, Yarrowia lipolytica 48.

Berbee and Taylor, 2006; James et al., 2006
* Dujon et al. and Jaillon et al., Nature, 2004

A brief introduction to the field of Yeast Genomics

Genome redundancy

Figure: mean family size (axis from 1.10 to 1.40) for YALI, DEHA, KLLA, LATH, LAKL, ZYRO, CAGL and SACE, with the WGD marked on the tree (Saccharomyces cerevisiae, Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri, Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii, Yarrowia lipolytica).
(WashU seq center, M. Johnston)

- important level of redundancy (in all eukaryotic phyla)
- gene order changes (differential loss of duplicates, translocation breakpoints)
- several mechanisms of duplication

Wolfe and Shields, 1997

Yeast Genomes

- Small, compact and specialized:
    - small intergenic sequences
    - few transposable elements
    - few introns
    - limited RNA interference
- Large evolutionary scale
- High level of genome redundancy
- Numerous evolutionary novelties in all clades
- High number of sequenced genomes

===> good model organisms to study genome evolution

Genome evolution: DNA duplications

Most eukaryotic genomes contain a high proportion of duplicated genes
(between 40% and 65% in S. c., A. t., H. s., C. e. and D. m.)

Possible fates after duplication:
- Pseudogenization: loss of function (most frequent fate)
- Neofunctionalization: gain of a new function
- Conservation: gene dosage increase, genetic robustness
- Degeneration-Complementation: specialization of the 2 copies

===> Strong evolutionary potential

Genome evolution: DNA duplications

Adaptive value of DNA duplications:

Adaptation to sulfate-limited conditions in chemostats for 200 generations:
- CGH (Comparative Genomic Hybridization)
- SDs (segmental duplications) containing between 1 and 22 genes
- No extended homology at the junctions (only microhomologies)

Gresham et al., PLoS Genet 2008

Genome evolution: DNA duplications

A duplication assay:

- Wild type: RPL20A (chr XV) and RPL20B (chr XIII) ==> WT growth rate
- rpl20A deletion: only RPL20B (chr XIII) remains ==> slow growth
- 3 days, YPD, 30°C: ??? ==> two copies of RPL20B (RPL20B RPL20B) ==> WT growth rate

Genome evolution: DNA duplications

A duplication assay: molecular characterization of segmental duplications
- Karyotype
- Hybridization (RPL20B)
- Comparative Genomic Hybridization
- Molecular combing
- PCR and sequence

Example shown: segment carrying RPL20B (143 kb), direct tandem, with the junction sequence AACCTAGAGCTT(GTT)14GTGGATTGTTT

Despite the selection of a single gene duplication event, only large segmental duplications were recovered

Genome evolution: DNA duplications

Molecular mechanisms:

strain             rate of SDs         (relative)   type of SDs          breakpoint sequences (%)
                   (/cell/division)                 (intra- / inter-     LTRs   microhomologies /
                                                    chromosomal)                microsatellites
WT                 10^-7               (1)          42                   48     52
REPLICATION
pol32              0                   (<0.07)
clb5               7 x 10^-5           (730)        66                   62     38
CPT                3 x 10^-5           (320)        22                   54     56
DSB REPAIR
rad52              3 x 10^-7           (3)          70                   0      100
rad52 rad1 dnl4    8 x 10^-8           (0.8)        15                   0      100

Breakpoint sequences: LTRs (300 bp), microhomologies (2 to 11 bp), microsatellites (poly A/T or trinucleotide repeats)

Figure: replication timing (time in min; Raghuraman et al., Science, 2001) with late-replicating regions, tRNAs, LTRs and microsatellites: a connection with replication?

Koszul et al., EMBO J., 2004

Replication-based mechanisms

strain   rate of SDs         (relative)   type of SDs          breakpoint sequences (%)
         (/cell/division)                 (intra- / inter-     LTRs   microhomologies /
                                          chromosomal)                microsatellites
WT       10^-7               (1)          42                   48     52
clb5     7 x 10^-5           (730)        66                   62     38
pol32    0                   (<0.07)

Clb5:
- defect in the firing of late replication origins (Schwob et al., 1993)
- S-phase lasts twice as long (Epstein et al., 1992)
- Rad9-dependent activation of the replication checkpoint, indicative of DNA damage (Gibson et al., 2004)
- RPL20B lies in a Clb5-dependent region (CDR; McCune et al., 2008)
=> replication perturbations strongly induce SD formation
Bloom and Cross, 2007

Pol32:
- Nick McElhinny, Cell 2008
- Pol32 is required for initiating the BIR reaction (Lydeard et al., 2007)
=> SDs are generated through replication-based mechanisms

Replication-based mechanisms

strain   rate of SDs         (relative)   type of SDs          breakpoint sequences (%)
         (/cell/division)                 (intra- / inter-     LTRs   microhomologies /
                                          chromosomal)                microsatellites
WT       10^-7               (1)          42                   48     52
CPT      3 x 10^-5           (320)        22                   54     56

Figure: CPT acting on Top1
=> broken forks promote SD formation

Broken forks as precursor lesions leading to SDs

The DSB repair pathways

- NHEJ (Dnl4): no homology, simple religation
- After resection:
    - HR (Rad52, Rad51): SDSA, DSBR, BIR (Pol32)
    - SSA and MMEJ (Rad1): >30 bp of homology for SSA, microhomologies (5-12 bp) for MMEJ

Two different replication-based mechanisms

strain   rate of SDs         (relative)   type of SDs          breakpoint sequences (%)
         (/cell/division)                 (intra- / inter-     LTRs   microhomologies /
                                          chromosomal)                microsatellites
WT       10^-7               (1)          42                   48     52
rad52    3 x 10^-7           (3)          70                   0      100

=> SDs with LTRs at breakpoints: HR-dependent
=> SDs with microhomologies or microsatellites at breakpoints: HR-independent

=> HR-mediated SDs result from Rad51-independent BIR

=> Non HR-mediated SDs result from ?

The DSB repair pathways

Figure: DSB repair pathways after resection, with Dnl4, Rad52 and Rad1 crossed out (X)

MMIR: microhomology/microsatellite-induced replication

strain             rate of SDs         (relative)   type of SDs          breakpoint sequences (%)
                   (/cell/division)                 (intra- / inter-     LTRs   microhomologies /
                                                    chromosomal)                microsatellites
WT                 10^-7               (1)          42                   48     52
rad52              3 x 10^-7           (3)          70                   0      100
rad52 rad1 dnl4    8 x 10^-8           (0.8)        15                   0      100

HR requires Rad52
MMEJ requires Rad1
NHEJ requires Dnl4

SDs are still being formed in the absence of all known DSB repair pathways:
existence of a new DSB repair pathway?

Sequences found at breakpoints:
- microhomologies between 2 and 11 bp
- poly (A/T)13-23
- trinucleotide repeats (GTT)3-20

Extremely high density of microhomologies and microsatellites in the genome, often intragenic
Formation of chimeric genes at breakpoints (in 13 out of 26 junctions)
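
The breakpoint motifs listed above can be searched for in a junction sequence with a few regular expressions. The following is only a minimal sketch: the function name `find_microsatellites` is hypothetical, "poly (A/T)" is assumed to mean homopolymer runs of A or of T, and only the lower bounds of the quoted ranges (13 nt, 3 GTT units) are enforced. The test sequence is the junction shown on the molecular characterization slide.

```python
import re

# Assumption: lower bounds come from the ranges quoted on the slide
# (poly A/T of at least 13 nt, at least 3 GTT units); upper bounds are
# not enforced. Each entry maps a motif name to (pattern, unit length).
PATTERNS = {
    "poly(A)": (re.compile(r"A{13,}"), 1),
    "poly(T)": (re.compile(r"T{13,}"), 1),
    "(GTT)n": (re.compile(r"(?:GTT){3,}"), 3),
}


def find_microsatellites(seq):
    """Return (kind, start, end, repeat_count) tuples sorted by start."""
    seq = seq.upper()
    hits = []
    for kind, (pattern, unit_len) in PATTERNS.items():
        for match in pattern.finditer(seq):
            n_units = (match.end() - match.start()) // unit_len
            hits.append((kind, match.start(), match.end(), n_units))
    return sorted(hits, key=lambda hit: hit[1])


# Junction sequence from the slide: AACCTAGAGCTT(GTT)14GTGGATTGTTT
junction = "AACCTAGAGCTT" + "GTT" * 14 + "GTGGATTGTTT"
print(find_microsatellites(junction))
```

Coordinates are 0-based and end-exclusive, as returned by Python's `re` module.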

The DSB repair pathways

Figure: DSB repair pathways after resection, with Dnl4, Rad52 and Rad1 crossed out (X); the remaining route is Microhomology/Microsatellite-Induced Replication

A new pathway? MMIR

- independent of all known DSB repair pathways (HR, NHEJ, MMEJ)
- dependent on Pol32
- replication template switching between microhomologies and microsatellites

Genome evolution: DNA duplications

Conclusions

- SDs are spontaneously generated at high frequency: 10^-7 SD/cell/division for the RPL20B locus
- SDs arise from two alternative replication-based mechanisms: BIR and MMIR
- MMIR represents a new mechanism different from known DSB repair pathways (HR, NHEJ):
    - occurs between microhomologies (2 to 11 nt) and microsatellites (poly A/T, trinucleotide repeats)
    - independent of Rad52
    - requires Pol32
- MMIR induces the formation of chimeric genes at the rearrangement junctions

Genome evolution: DNA duplications

In human, FoSTeS/MMBIR:
Hastings et al., Nature Reviews Genetics, 2009

Complex structural variations:
- Lissencephaly (Nagamani et al., J. Med Genet 2009)
- Miller-Dieker syndrome
- Charcot-Marie-Tooth disease (Lupski and Chance, 2005)
- Pelizaeus-Merzbacher disease (Lee et al., Cell 2007)
- XLMR syndrome (Bauters et al., Genome Res 2008)
- SDs and CNVs (Kim et al., Genome Res 2008)

Genome evolution: Chromosome Dynamics

- Duplications: high evolutionary potential (creation of new genes, adaptation, specialization, etc.)
- Translocations, inversions, deletions: very low evolutionary potential? (loss of genes, deregulation of gene expression, modification of sub-nuclear architecture, etc.)

Figure: Species 1 and Species 2 differ by translocations, inversions, duplications and deletions; counting these events (#) gives rates of rearrangements

Genome evolution: Chromosome Dynamics

Species on the tree: Saccharomyces sensu stricto (S. cerevisiae, S. cariocanus, S. paradoxus, S. mikatae, S. kudriavzevii, S. bayanus), Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri, Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii, Yarrowia lipolytica

Saccharomyces sensu stricto complex:
- monophyletic group
- very closely related species
- hybrids viable but sterile
- 16 chromosomes

Genome evolution: Chromosome Dynamics

Number of translocations relative to S. cerevisiae:
- S. cariocanus (4)
- S. paradoxus (0)
- S. mikatae (2)
- S. kudriavzevii (0)
- S. bayanus (4)

Only a few translocations:
- low reorganization
- recombination between repeated sequences
- no chromosomal speciation
- variable rate of rearrangements?

Fischer et al., Nature 2000

Genome evolution: Chromosome Dynamics

Figure: S. cerevisiae chrVIII compared with the chromosomes of S. bayanus, C. glabrata, K. lactis, D. hansenii and Y. lipolytica (chromosomes labelled by number or letter), alongside the species tree.
Percentages shown, in the same species order: 98%, 88%, 77%, 11%, 5%

F. Brunet
Fischer et al., PLoS Genet 2006

Genome evolution: Chromosome Dynamics

At genome scale:

S. cerevisiae vs C. glabrata: "UNSTABLE" GENOMES
- mean amino acid identity: 65%
- comprehensive reshuffling
- 509 translocations, 104 inversions
- no homologous chromosomes

L. kluyveri vs L. thermotolerans: "STABLE" GENOMES
- mean amino acid identity: 58%
- moderate reshuffling
- 91 translocations, 22 inversions
- large chromosomal segments (up to 670 kb)

Genome evolution: Chromosome Dynamics

Quantitative estimation of the relative genome stability:

- GOC (gene order conservation): for each pair of neighboring genes in species 1, are their orthologues also neighbors in species 2? If yes: +1; if no: 0 (example shown: species 1 = 5, species 2 = 5)

    GOC = (# neighboring orthologues) / (total # orthologues)

- GOL (gene order loss) = 1 - GOC

- Rate of rearrangements = GOL / phylogenetic distance (mean rate)

Rocha, Trends Genet, 2003
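
The GOC, GOL and rate definitions above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the slides: each genome is a single linear gene list, orthology is one-to-one, and genes without an orthologue are ignored when deciding adjacency. The names `gene_order_conservation` and `rearrangement_rate` are hypothetical.

```python
def gene_order_conservation(order1, order2, orthologs):
    """GOC = (# neighboring orthologues) / (total # orthologues).

    order1, order2: gene names in chromosomal order (assumed: one linear
    chromosome per species). orthologs: dict mapping a gene of species 1
    to its orthologue in species 2 (assumed one-to-one).
    """
    # Keep only genes that have an orthologue present in the other genome.
    targets = {orthologs[g] for g in order1 if g in orthologs}
    kept2 = [g for g in order2 if g in targets]
    pos2 = {gene: index for index, gene in enumerate(kept2)}
    shared = [g for g in order1 if g in orthologs and orthologs[g] in pos2]
    if not shared:
        raise ValueError("no orthologues shared by the two gene orders")

    conserved = 0
    for left, right in zip(shared, shared[1:]):
        # +1 if the two orthologues are also neighbors in species 2, else 0
        if abs(pos2[orthologs[left]] - pos2[orthologs[right]]) == 1:
            conserved += 1
    return conserved / len(shared)


def rearrangement_rate(goc, phylogenetic_distance):
    """Rate of rearrangements = GOL / phylogenetic distance, GOL = 1 - GOC."""
    gene_order_loss = 1.0 - goc
    return gene_order_loss / phylogenetic_distance


# Toy example with 5 orthologues; the distance 0.5 is an arbitrary value.
species1 = ["a", "b", "c", "d", "e"]
species2 = ["A", "B", "D", "C", "E"]
pairs = {"a": "A", "b": "B", "c": "C", "d": "D", "e": "E"}
goc = gene_order_conservation(species1, species2, pairs)
print(goc, rearrangement_rate(goc, 0.5))
```

In this toy case two of the four neighboring pairs are conserved (a-b and c-d), so with 5 orthologues in the denominator GOC is 2/5.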

Genome evolution: Chromosome Dynamics

Figure: rearrangement branch rates on the tree (branch values from 0.0 to 2.7, WGD marked) for Saccharomyces cerevisiae, Candida glabrata, Zygosaccharomyces rouxii, Lachancea kluyveri (WashU seq center, M. Johnston), Lachancea thermotolerans, Kluyveromyces lactis, Debaryomyces hansenii and Yarrowia lipolytica, with a species instability scale positioning D. hansenii, S. cerevisiae, C. glabrata, Z. rouxii, K. lactis, L. kluyveri and L. thermotolerans.

Fischer et al., PLoS Genet 2006

Genome evolution: Chromosome Dynamics

Figure: level of genome reshuffling (low, moderate, massive) along the tree
- Sensu stricto (S. cerevisiae, S. bayanus)
- Candida glabrata: differential gene loss, unstable genome
- Zygosaccharomyces rouxii
- Lachancea kluyveri (WashU seq center, M. Johnston) and Lachancea thermotolerans: stable genomes
- Kluyveromyces lactis
- Debaryomyces hansenii: TGA expansion
- Y. lipolytica: no synteny

Genome evolution: Chromosome Dynamics

Conclusions

- High level of chromosome plasticity:
    - hundreds of translocations and inversions
    - gene order is not very constrained
- Highly variable rates of chromosome rearrangements between lineages but also within a given lineage
- Is there a selective advantage associated with these rearrangements? Are they accumulated by genetic drift?
    - usually considered as deleterious
    - few examples of the adaptive role of rearrangements: proliferation of cancer cells (O'Neil and Look, 2007), growth advantage of translocated yeast cells (Colson et al., 2004), adaptive gene loss (Domergue, 2005)
- Does the creation of genetic novelties require chromosome plasticity?

Genome evolution: Nucleotide composition

Species                     GC%
Saccharomyces cerevisiae    38.3
Candida glabrata            38.8
Zygosaccharomyces rouxii    39.1
Lachancea kluyveri          41.5
Lachancea thermotolerans    47.3
Kluyveromyces lactis        38.8
Eremothecium gossypii       52.0
Debaryomyces hansenii       36.3
Yarrowia lipolytica         49.0

The Génolevures Consortium, Genome Res., 2009

Base substitution mutations:
- C→T transitions: cytosine deamination (Kreutzer and Essigmann, PNAS, 1998)
- G→T transversions: 8-oxo-guanine (Shibutani et al., Nature, 1991)
=> Global AT-enrichment

Biased Gene Conversion (BGC):
- AT→GC mutations (Duret and Galtier, Annu Rev Genomics Human Genet, 2009)
- not in yeast?
=> Global GC-enrichment
Marsolier-Kergoat and Yeramian, Genetics, 2009

Figure: GC% along the chromosomes (y axis: GC%, 20 to 80; x axis: position in Mb)
- Lachancea thermotolerans: 47.3
- Zygosaccharomyces rouxii: 39.1
- Lachancea kluyveri: 41.5, with a 1 Mb region (C-left) at 52.9
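
A GC% profile along a chromosome, like the plots summarized above, can be computed with a sliding window. The sketch below is only illustrative: the function names, window size and step are assumptions, not the settings used for the original figures.

```python
def gc_percent(seq):
    """GC% of a DNA sequence, ignoring characters other than A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def sliding_gc(seq, window, step):
    """Return (start, GC%) for each full window along the sequence."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
toy = "GGGGCCCCGGGG" + "AATTAATTAATT"
for start, value in sliding_gc(toy, window=12, step=12):
    print(start, value)
```

On a real chromosome, a region such as C-left would appear as a run of consecutive windows whose GC% stays well above the genome-wide mean.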

Genome evolution: Nucleotide composition

Figure: GC% in and out of C-left, with panels labelled DNA, RNA and Protein

DNA: GC% in C-left: 46.1; GC% out of C-left: 37.4
Other values shown: 54.2 and 42.0; 46.8 and 36.5

By codon position:
                      1st     2nd     3rd
GC% in C-left         53.3    41.0    68.3
GC% out of C-left     46.4    37.0    42.7

- global GC increase
- strong bias in codon usage (GC% in synonymous codons)
- bias in protein composition (relative use in C-left)

Payen et al., Genome Res., 2009
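
The GC% at the 1st, 2nd and 3rd codon positions reported above can be computed from a coding sequence as follows. This is a minimal sketch assuming an in-frame CDS whose length is a multiple of 3; the function name `gc_by_codon_position` is hypothetical.

```python
def gc_by_codon_position(cds):
    """Return GC% at the 1st, 2nd and 3rd codon positions of an in-frame CDS."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    result = []
    for offset in range(3):
        bases = cds[offset::3]  # every third base, starting at this position
        gc = bases.count("G") + bases.count("C")
        result.append(100.0 * gc / len(bases))
    return tuple(result)


# Toy CDS with two codons, ATG and GCC.
print(gc_by_codon_position("ATGGCC"))
```

Because most synonymous changes fall on the 3rd codon position, that position is the one expected to respond most strongly to a compositional shift, which fits the 3rd position showing the largest difference between C-left and the rest of the genome.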

Genome evolution: Nucleotide composition

Phylogeny:

Alignments of universally conserved proteins:
- 17 families (6688 residues) outside C-left
- 19 families (4631 residues) in C-left

Figure: trees for S. cerevisiae, C. glabrata, Z. rouxii, L. kluyveri, L. waltii, L. thermotolerans, K. lactis and E. gossypii (support values from 96 to 100; scale bar 0.05)

C-left has the same phylogenetic origin as the rest of the genome

Payen et al., Genome Res., 2009

Genome evolution: Nucleotide composition

Synteny:

Figure: synteny of LAKL_C with L. waltii sequences (LAWA_S33, LAWA_S27, LAWA_S56, LAWA_S55) and L. thermotolerans chromosomes (LATH_F, LATH_G, LATH_C, LATH_E, LATH_A); a 670 kb segment is indicated

C-left shares a common ancestral origin with the genomes of L. waltii (LAWA) and L. thermotolerans (LATH)

Genome evolution: Nucleotide composition

Replication:
- Design of custom microarrays (Agilent 2 x 105k): 200 bp fragments
- Time course analysis of copy number variation during S-phase, from G1 to G2 (DNA labelled with Cy3 and Cy5)

Figures: ChrA, ChrB, ChrC and ChrD

Genome evolution: Nucleotide composition


Conclusions

L. kluyveri offers a unique opportunity to understand the mechan


of genome nucleotide composition
C-left:
- shows a global GC increase (codon usage bias and protein composition bias)
- harbors a normal gene density
- has a phylogenetic origin consistent with the rest of the genome
- presents a very high level of synteny conservation with sister species genomes
- encompasses the MAT locus but has lost the silent cassettes HMR and HML
- is devoid of Transposable Elements (203 insertions in the rest of the genome)
- harbors the same compositional bias in all 11 L. kluyveri strains tested
The replication program is modified (more origins and

Thanks
- Unité de Génétique Moléculaire des Levures, Institut Pasteur: Celia Payen, Romain Koszul
- Unité de Génomique des Microorganismes, équipe Biologie des Génomes: Nicolas Agier, Guénola Drillon
- Génolevures consortium: Jean-Luc Souciet, Univ. Louis Pasteur, Strasbourg
- Centre National de Séquençage, Evry: Jean Weissenbach, Patrick Wincker
- Génopole Pasteur-Ile de France: Christiane Bouchier, Lionel Frangeul
- Plateforme Puces ADN, Génopole Pasteur: Odile Sismeiro, Jean-Yves Coppée
