
Article

Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability

Authors
Amandine Bonnet, Ana R. Grosso, Abdessamad Elkaoutari, ..., Vincent Geli, Sergio F. de Almeida, Benoit Palancade

Correspondence
benoit.palancade@ijm.fr

In Brief
By combining the genetic manipulation of
intron content with genome-wide
analyses in both yeasts and human cells,
Bonnet et al. reveal a function for introns
in counteracting DNA:RNA hybrid
(R-loop) formation and its deleterious
impact on genetic stability.

Highlights
• Introns prevent R-loop and DNA damage accumulation on highly expressed yeast genes
• Insertion of an intron in an R-loop-prone gene attenuates R-loop formation
• Spliceosome-dependent mRNP assembly, but not splicing, prevents R-loop formation
• The role of introns in R-loop prevention has been conserved from yeasts to human

Bonnet et al., 2017, Molecular Cell 67, 608–621
August 17, 2017 © 2017 Elsevier Inc.
http://dx.doi.org/10.1016/j.molcel.2017.07.002
Amandine Bonnet,1,5 Ana R. Grosso,2 Abdessamad Elkaoutari,3 Emeline Coleno,1 Adrien Presle,1 Sreerama C. Sridhara,2 Guilhem Janbon,4 Vincent Geli,3 Sergio F. de Almeida,2 and Benoit Palancade1,6,*
1Institut Jacques Monod, CNRS, UMR 7592, Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France
2Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, 1600-276 Lisboa, Portugal
3Cancer Research Center of Marseille (CRCM), Équipe Labellisée Ligue, U1068 INSERM, UMR7258 CNRS, Institut Paoli-Calmettes, Aix Marseille University, 13284 Marseille, France
4Institut Pasteur, Unité Biologie des ARN des Pathogènes Fongiques, Département de Mycologie, 75015 Paris, France
5Present address: Sorbonne Paris Cité, Université Paris Diderot, INSERM U944, CNRS UMR 7212, Institut Universitaire d'Hématologie, Hôpital St. Louis, 75010 Paris, France
6Lead Contact
*Correspondence: benoit.palancade@ijm.fr

SUMMARY

Transcription is a source of genetic instability that can notably result from the formation of genotoxic DNA:RNA hybrids, or R-loops, between the nascent mRNA and its template. Here we report an unexpected function for introns in counteracting R-loop accumulation in eukaryotic genomes. Deletion of endogenous introns increases R-loop formation, while insertion of an intron into an intronless gene suppresses R-loop accumulation and its deleterious impact on transcription and recombination in yeast. Recruitment of the spliceosome onto the mRNA, but not splicing per se, is shown to be critical to attenuate R-loop formation and transcription-associated genetic instability. Genome-wide analyses in a number of distant species differing in their intron content, including human, further revealed that intron-containing genes and the intron-richest genomes are best protected against R-loop accumulation and subsequent genetic instability. Our results thereby provide a possible rationale for the conservation of introns throughout the eukaryotic lineage.

INTRODUCTION

Gene expression exposes DNA to potentially harmful agents and comes into conflict with genome duplication and transmission. Indeed, pioneering studies in budding yeast revealed that highly expressed mRNA-coding loci display elevated rates of genomic rearrangements and mutations, identifying transcription by RNA polymerase II (Pol II) as an endogenous source of genetic instability (Thomas and Rothstein, 1989; Datta and Jinks-Robertson, 1995). Such transcription-associated recombination and mutagenesis events have been observed in several species and proposed to rely on multiple mechanisms. Among them, changes in chromatin or nuclear organization, local negative super-coiling, or high densities of transcription complexes have been shown to favor the formation of abnormal DNA structures and/or to provoke transcription/replication collisions. Elevated transcription would thereby ultimately increase the accessibility of the DNA to extrinsic genotoxic agents, but also to endogenous nucleases and DNA-modifying enzymes, subsequently leading to DNA double-strand breaks (DSBs) and mutations, the hallmarks of transcription-associated genetic instability (reviewed in Gaillard et al., 2013).

Among the unwanted intermediates that accumulate at transcribed loci and trigger genome instability, DNA:RNA hybrids (or R-loops) have recently drawn considerable attention (Sollier and Cimprich, 2015). R-loops are stable hybrids formed between the template strand of the transcribed DNA and the nascent mRNA species, thereby generating a displaced single-stranded DNA (ssDNA). Hybrid formation was proposed to be influenced by a number of parameters, including DNA sequence, e.g., GC skew and AT skew, topological constraints, and transcription levels (Roy and Lieber, 2009; Ginno et al., 2012; Chan et al., 2014a; El Hage et al., 2014; Wahba et al., 2016). At certain genomic locations, natural R-loop formation has been reported to contribute positively to different nuclear processes, including transcription initiation and termination (Costantino and Koshland, 2015). However, the R-loop accumulation observed in mutant or pathological situations can be deleterious for genome expression and stability. Indeed, R-loops decrease Pol II elongation efficiency, disturb DNA replication fork progression, and expose vulnerable ssDNA stretches, thereby triggering gene expression defects, DSBs, and unwanted recombination events in several distant eukaryotic species (Chan et al., 2014b; Santos-Pereira and Aguilera, 2015).

In view of the importance of controlling R-loop metabolism for maintaining genome homeostasis, hybrid accumulation is naturally restricted not only by dedicated factors, including nucleases of the RNase H family (Wahba et al., 2011) and helicases (e.g., Sen1; Mischo et al., 2011), but also through the tight coordination of transcription with the assembly and the nuclear export of mRNPs (messenger ribonucleoparticles) (Santos-Pereira and Aguilera, 2015). Among other examples, R-loop-dependent genetic instability has been scored in the absence of several factors contributing to proper formation of export-competent mRNPs, notably the conserved THO/TREX and THSC/TREX-2 complexes (Huertas and Aguilera, 2003; Gonzalez-Aguilera et al., 2008). In addition, previous studies, including systematic screens, uncovered elevated levels of R-loops or DNA damage upon inactivation of splicing factors in both budding yeast and mammals (Li and Manley, 2005; Paulsen et al., 2009; Chan et al., 2014a). These observations could reflect an indirect contribution of splicing to the expression of factors targeting R-loops. However, the fact that the artificial insertion of an intron in a reporter gene alleviates the transcriptional defects arising in a yeast mRNP biogenesis mutant (tho) previously argued in favor of a protective effect of introns in cis (Bonnet et al., 2015).

Spliceosomal introns are intervening sequences that are removed from the pre-mRNA molecule by two sequential transesterification reactions requiring the recognition of cis-acting elements by the spliceosome complex. Since their identification (Berget et al., 1977; Chow et al., 1977), these sequences have emerged as a distinctive feature of eukaryotic genomes, even though their frequency and length greatly vary between organisms. However, their functions, as well as the evolutionary constraints that drive their maintenance in genomes, have remained debated. On the one hand, the presence of introns increases the regulatability and the coding potential of the genome: introns can modulate mRNA synthesis rates and stability (Heyn et al., 2015), in particular through developmentally regulated intron retention (Braunschweig et al., 2014), and allow alternative splicing events that contribute to proteome diversity (Nilsen and Graveley, 2010). On the other hand, recent advances in transcriptome profiling have revealed that a meaningful fraction of exon-intron junctions is constitutively spliced in mammals and thereby unlikely to contribute to regulatory events (Ryu et al., 2015). Furthermore, a large proportion of introns can be removed from the yeast genome without altering gene expression or cell fitness (Parenteau et al., 2008, 2011). This paradox prompted us to question whether introns could have been selected and/or maintained in eukaryotic genes to counteract R-loop formation through spliceosome recruitment, further reducing transcription-associated genetic instability.

RESULTS

Intron-Containing Genes Display Decreased R-Loop and DNA Damage Levels in Yeast

We first analyzed a dataset of budding yeast R-loop-prone sites, mapped by DNA:RNA hybrid immunoprecipitation (DRIP) using the S9.6 hybrid-specific antibody in cells mutated for RNases H (RNH; Wahba et al., 2016). Since R-loops mainly stem from high levels of transcription (Wahba et al., 2016) and as intron-containing genes, although representing a minor fraction (4.4%) of protein-coding loci, are heavily transcribed in yeast (Ares et al., 1999), we focused our study on highly transcribed genes (>50 mRNAs/hr). Analysis of this category, which accounts for 50% of the transcriptome, allowed comparison of equivalent numbers of intronless and intron-containing genes of similar expression levels (Figure S1A). While the propensity to form R-loops increased gradually as a function of transcription for both intronless and intron-containing loci (Figure S1B), the occurrence of hybrids was significantly reduced among intron-containing as compared to intronless genes in the highest transcriptional class (>50 mRNAs/hr; Figure 1A). Furthermore, for R-loop-positive loci of this category, hybrid densities calculated from DRIP sequencing (DRIP-seq) data were strongly reduced for intron-containing genes (Figure 1B; Table S1). Differences in transcription levels, gene length, G-C content, GC skew, or AT skew did not appear to bias the formation of R-loops on highly expressed intron-containing genes, since these parameters were not distinguishable between R-loop-positive and R-loop-negative loci (Figures S1C–S1G). In addition, similar conclusions could be reached when re-analyzing an alternative dataset of R-loop-prone sites (Figure S1H), which could be detected in wild-type (WT) cells using a less stringent DRIP protocol (Chan et al., 2014a), and further associated with hotspots of genetic instability (Chen et al., 2016).

To strengthen the relevance of these observations, we then sought a natural situation where intron-containing and intronless versions of the same gene could be directly compared for R-loop formation. Among the numerous gene pairs arising from the whole-genome duplication event that has occurred in an ancestor of S. cerevisiae (Byrne and Wolfe, 2005), only one (RPP1A and RPP1B) corresponded to two highly expressed loci (>50 mRNAs/hr) with a distinct intron content. Strikingly, although being more transcribed, the intron-containing RPP1B/YDL130w gene displayed a lower R-loop density than its intronless paralog RPP1A/YDL081c (Figure 1B; Table S1). Intron-containing genes are thereby less prone to accumulate DNA:RNA hybrids than their intronless counterparts.

To further determine whether the presence of introns is the direct cause of decreased R-loop formation, we performed DRIP experiments on a yeast strain (Δi) in which introns have been removed from RPL7A and RPL7B, two intron-containing genes of the highest transcriptional class (Parenteau et al., 2011; Figure 1C). In WT cells, our DRIP assay readily detected RNH-sensitive hybrids on YEF3, one of the intronless genes of this transcriptional class identified in the DRIP-seq analysis, but it failed to score any RNH-sensitive signal on both intron-containing RPL7A and RPL7B loci as well as on an untranscribed region (Figures 1D–1G, top panels). However, intron deletion (Δi), while not causing an increased expression of RPL7A/B (Figure S1I), was sufficient to trigger a specific appearance of detectable RNH-sensitive R-loops at these two loci (Figures 1D and 1E, bottom panels).

We then wondered whether the decreased R-loop accumulation detected on intron-containing genes was associated with a lowered genetic instability. For this purpose, we intersected the dataset of R-loop-positive loci with the previously reported genome-wide map of H2A-Ser129 phosphorylation (γ-sites; Stirling et al., 2012). This histone modification is a landmark of DNA damage, including, but not restricted to, R-loop- and transcription-dependent DSBs (Stirling et al., 2012). Although H2A-Ser129 phosphorylation was not systematically detected on R-loop-prone loci in WT cells, which are likely to maintain non-genotoxic levels of hybrids, γ-sites were found to form with a lower occurrence (Figure 1H) and at a lower density (Figure 1I) on intron-containing as compared to intronless genes. This observation was not merely caused by a difference in the number of



Figure 1. Introns Prevent R-Loop and DNA Damage Accumulation on Highly Transcribed Yeast Genes
(A) Occurrence of DNA:RNA hybrids in intronless and intron-containing genes of the highest transcriptional category (>50 mRNAs/hr), evaluated according to a dataset of hybrid-prone sites detected in the rnh1Δ rnh201Δ mutant (Wahba et al., 2016).
(B) DNA:RNA hybrid densities on intronless and intron-containing hybrid-positive loci from (A). The positions of the pair of paralogous genes RPP1A and RPP1B are indicated.
(C) Organization of RPL7A and RPL7B loci in WT and Δi strains. The positions of the deleted introns (dark orange) are indicated.
(D–G) DNA:RNA hybrid detection by DRIP-qPCR (percentage of IP; means and SD are plotted; n = 4) at the indicated loci (D, RPL7A; E, RPL7B; F, intergenic; G, YEF3) in either WT or Δi yeast cells. Values from RNH-treated immunoprecipitations appear as hatched.
(H) Occurrence of γ-sites in intronless and intron-containing hybrid-positive loci, evaluated according to a ChIP-ChIP dataset of phosphorylated H2A (Stirling et al., 2012).
(I) γ-site densities on intronless and intron-containing hybrid-positive loci.
*p ≤ 0.05, **p ≤ 0.01, and ***p ≤ 0.001; Fisher's exact test (A and H), Mann-Whitney-Wilcoxon test (B), Welch's t test (D and E), and Bootstrapping Method (I).
See also Figure S1.
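
As an illustration of the kind of comparison reported in panels (A) and (H), the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 contingency table (gene class versus hybrid-positive or hybrid-negative status). It uses only the Python standard library. The function name and the counts in the usage line are hypothetical and are not taken from the study; the authors' actual analysis software is not specified here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be gene classes (e.g. intronless, intron-containing) and
    columns the outcome (e.g. hybrid-positive, hybrid-negative).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x positives in row 1
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts, for illustration only
p_value = fisher_exact_two_sided(40, 60, 15, 85)
```

A small p value would indicate that the proportion of hybrid-positive loci differs between the two gene classes more than expected by chance.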

intronless and intron-containing loci, as proved by data bootstrapping (Figure 1I), and it was further confirmed upon re-analysis of an alternative phospho-H2A chromatin immunoprecipitation (ChIP)-ChIP dataset (Capra et al., 2010; Figures S1J and S1K). Collectively, our results thereby demonstrate that intron-containing loci are specifically protected against R-loop and DNA damage accumulation and that removing introns can be sufficient to trigger R-loop formation.

A Set of R-Loop-Forming Reporters to Score R-Loop-Associated Phenotypes

We next asked whether, conversely, the insertion of an intron could protect against R-loop formation and transcription-associated genetic instability in pathological situations where R-loops accumulate. For this purpose, we first designed R-loop-forming reporters by placing the naturally intronless YAT1 gene, a locus prone to transcription-associated genetic instability (Chavez et al., 2001), under the control of promoters driving high levels of transcription in vitro and in vivo (pT7 and pGAL1, respectively; Figure 2A), since this gene is not expressed in standard growth conditions (Bonnet et al., 2015). In vitro transcription of the pT7-YAT1 reporter plasmid triggered the formation of RNH-sensitive species of decreased electrophoretic mobility (Figure 2B), a typical feature of R-loop-forming sequences (Yu et al., 2006). These species were further recognized by the hybrid-specific S9.6 antibody in a dot blot experiment, demonstrating that they indeed display R-loops (Figure 2C). In addition, our DRIP assay probed substantial levels of RNH-sensitive R-loops not only on this in-vitro-transcribed reporter gene (Figure 2D) but also in vivo, when YAT1 was expressed in WT yeast cells under the control of the GAL1 promoter (Figure 2E), confirming its R-loop-prone character.

The GAL-YAT1 reporter was then expressed in yeast cells lacking the Mft1 subunit of the THO complex, a conserved mRNP biogenesis factor previously shown to counteract R-loop formation (Huertas and Aguilera, 2003). Previous reports had indicated that THO complex inactivation (tho cells) triggers a pronounced decrease in the transcription of the GAL-YAT1 construct (Chavez et al., 2001; Bonnet et al., 2015). To evaluate the amounts of mRNAs available to form hybrids in this mutant, we measured the levels of chromatin-associated, nascent mRNAs by implementing a reported procedure (Figure S2A; Carrillo Oesterreich et al., 2010). In agreement with this study, the isolated chromatin fractions were depleted for ribosomal proteins, enriched for Pol II and for unspliced ACT1 transcripts (Figures S2B and S2C). mRNA quantification from these fractions further confirmed a decreased production of YAT1 mRNAs in tho cells (Figure 2F), together with a decreased Pol II occupancy on the YAT1 gene, as scored by ChIP (Figure 2I). To further quantify the fraction of nascent mRNAs that forms hybrids in this mutant, R-loop levels were measured by DRIP (Figure 2G) and further normalized to the amount of nascent YAT1 mRNAs (Figure 2H). This analysis revealed that nascent YAT1 mRNAs transcribed under the control of the GAL1 promoter have an increased propensity to form R-loops in the absence of the THO complex (Figure 2H). Importantly, decreasing the level of YAT1 transcription in WT cells by using shorter induction times on the same reporter gene (WTS; Figure 2F) did not lead to a similar increase in R-loop formation per mRNA (Figure 2H), demonstrating that the observed R-loop accumulation is not a mere consequence of normalization but a phenotype specific to THO inactivation.

To further evaluate the genetic instability triggered by those hybrids, we scored recombination occurring between two direct repeats flanking the GAL-YAT1 sequence (Figure 2A). This system detected a strong increase in the number of recombination events arising specifically when the YAT1 locus is transcribed in the absence of the THO complex (Figure 2J), in agreement with earlier reports (Chavez et al., 2001; Huertas and Aguilera, 2003). Importantly, this transcription-dependent hyper-recombination phenotype was significantly reduced by in vivo overexpression of RNase H (RNH1; Figure 2J), to a similar extent as observed in previous studies (Huertas and Aguilera, 2003; Santos-Pereira et al., 2013). These results thus demonstrate that hyper-recombination on the YAT1 reporter is a consequence of hybrid accumulation in tho cells. In sum, the use of these hybrid-forming reporters allows us to monitor R-loop levels and the deleterious consequences of their accumulation when mRNP biogenesis is further impaired (tho mutants).

Insertion of Introns within R-Loop-Prone Genes Is Sufficient to Attenuate R-Loop Formation

Having established that our assays score R-loop-dependent phenotypes at the GAL-YAT1 locus, we inserted a short intron derived from the RPL51A gene (Bonnet et al., 2015) at the YAT1 5′ end (Figure 3A), a position typical of intronic locations within the yeast genome (Lin and Zhang, 2005). We first confirmed that the RPL51A* intron retained its functionality in this configuration, by monitoring both spliceosome recruitment and splicing (Figures S3A and S3B).

We then compared both intron-containing and intronless reporters for R-loop formation, transcription, and transcription-associated recombination (Figures 3B–3E). While not modifying the basal, non-genotoxic, hybrid levels scored at the YAT1 locus in WT cells, the presence of the intron specifically reduced the propensity of nascent mRNAs to form the R-loops accumulating in the absence of the THO complex (Figure 3B). Consistently, the intron alleviated the deleterious effect of R-loops on transcription

Figure 2. R-Loop-Forming Reporters Allow Scoring of R-Loop-Associated Phenotypes
(A) Principle of the YAT1 reporter genes. The YAT1 gene was expressed either in vitro from the T7 promoter (pT7) or in yeast cells under the control of the GAL1-inducible promoter (pGAL1). The GAL-YAT1 reporter was integrated between two direct leu2 repeats (dark gray) to score transcription-associated recombination (J). The position of the 5′ and 3′ qPCR amplicons used in (I) is indicated.
(B) Electrophoretic mobility of the T7-YAT1 plasmid, either untranscribed or transcribed by the T7 RNA polymerase (T7 RNA pol) and further treated or not with RNase H (RNH). Nucleic acids were stained with ethidium bromide. The positions of the distinct T7-YAT1 plasmid species, as well as molecular weight markers (kb), are indicated.
(C) R-loop-forming YAT1 plasmids (from B) were extracted following electrophoresis and further used for dot blotting with antibodies against DNA:RNA hybrids (S9.6) or double-stranded DNA (dsDNA).
(D) Detection of DNA:RNA hybrids on in-vitro-transcribed YAT1 by DRIP-qPCR (percentage of IP; n = 3).
(E) Detection of DNA:RNA hybrids on the GAL-YAT1 gene expressed in WT yeast cells by DRIP-qPCR (percentage of IP; n = 3). Values from RNH-treated immunoprecipitations appear as hatched. DRIP signals obtained on two control loci, either highly transcribed (YEF3) or untranscribed (intergenic), are indicated.
(F) Nascent mRNA levels detected in WT or tho (mft1Δ) yeast cells expressing the GAL-YAT1 gene (qRT-PCR, normalized to ACT1 mRNA values; n = 3). WTS, the induction of the GAL-YAT1 gene was performed for 30 min instead of 5 hr.
(G) DNA:RNA hybrid detection by DRIP-qPCR in WT or tho yeast cells expressing the GAL-YAT1 gene (percentage of IP; n = 3).
(H) DNA:RNA hybrid formation per nascent mRNA in WT or tho yeast cells expressing the GAL-YAT1 gene. DRIP values (from G) were normalized to the amount of nascent mRNAs expressed from the corresponding strains (from F).
(I) RNA polymerase II (Pol II) distribution on the GAL-YAT1 gene as determined by chromatin immunoprecipitation (ChIP) in WT or tho yeast cells (percentage of IP; n = 4).
(J) Recombination frequencies (n = 3) for WT or tho yeast strains expressing the GAL-YAT1 reporter and either an empty vector or an RNH1-overexpressing construct. The frequency of Leu+ prototrophs arising from recombination upon transcriptional induction (Gal) or repression (Glu, hatched) is indicated.
Means and SD are plotted (*p ≤ 0.05 and **p ≤ 0.01; ns, not significant; Welch's t test). See also Figure S2.
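
The normalization described for panel (H), in which the DRIP signal from (G) is divided by the nascent mRNA level from (F), can be sketched as follows. This is only an illustrative sketch: the function name and all numerical values are hypothetical and chosen to show why a strain with a lower raw DRIP signal can still form more hybrids per nascent mRNA.

```python
def hybrids_per_nascent_mrna(drip_percent_ip, nascent_mrna_level):
    """Normalize a DRIP signal (% of IP) to the nascent mRNA level."""
    if nascent_mrna_level <= 0:
        raise ValueError("nascent mRNA level must be positive")
    return drip_percent_ip / nascent_mrna_level


# Hypothetical values, chosen only to illustrate the reasoning
samples = {
    "WT": {"drip": 0.40, "mrna": 1.00},
    "tho": {"drip": 0.30, "mrna": 0.20},
}
ratios = {
    name: hybrids_per_nascent_mrna(v["drip"], v["mrna"])
    for name, v in samples.items()
}
```

With these made-up inputs the raw DRIP signal is lower in the mutant, yet the hybrid signal per nascent mRNA is higher, which is the type of situation the normalization is meant to reveal.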



Figure 3. Insertion of an Intron within an R-Loop-Prone Gene Prevents R-Loop Accumulation and Transcription-Associated Genetic Instability
(A) Principle of the yeast intronless and intron-containing YAT1 reporter genes. Both reporters were integrated between leu2 repeats to score transcription-associated recombination.
(B) DNA:RNA hybrid formation per nascent mRNA in WT or tho (mft1Δ) yeast cells expressing the indicated reporter constructs. DRIP/nascent mRNA values were calculated as in Figure 2H (percentage of IP/mRNA; n = 3), and RNH-treated immunoprecipitations appear as hatched.
(C) Nascent mRNA levels detected in WT or tho yeast cells expressing the indicated reporter constructs (qRT-PCR, normalized to ACT1 mRNA values; n = 3).
(D) Pol II distribution on the reporter genes as determined by ChIP in WT or tho yeast cells (percentage of IP; n = 4). Values for the intronless reporter are the same as used in Figure 2I.
(E and F) Recombination frequencies (n = 3) for WT, tho (mft1Δ), sus1Δ, or sen1-1 yeast strains expressing the indicated constructs. The frequency of Leu+ prototrophs arising from recombination upon transcriptional induction (Gal) or repression (Glu, hatched) of each reporter is indicated. Note that SEN1 inactivation triggers hyper-recombination in both a transcription-independent and a transcription-dependent manner but that the intron mainly decreases the transcription-dependent phenotype.
Means and SD are plotted (*p ≤ 0.05, **p ≤ 0.01, and ***p ≤ 0.001; Welch's t test). See also Figures S3 and S4.
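
For readers unfamiliar with direct-repeat assays, a recombination frequency of the kind plotted in panels (E) and (F) is commonly expressed as the number of Leu+ prototrophs per viable cell. The sketch below assumes that definition and a simple plating scheme in which viable cells are counted on a more diluted plating than the selective one. The function name, the dilution handling, and the numbers are assumptions made for illustration and are not described in the text above.

```python
def recombination_frequency(leu_plus_colonies, viable_colonies, dilution_factor=1.0):
    """Leu+ recombinants per viable cell.

    dilution_factor is how many times more diluted the viability plating
    was than the selective (Leu-) plating.
    """
    total_viable = viable_colonies * dilution_factor
    if total_viable <= 0:
        raise ValueError("number of viable cells must be positive")
    return leu_plus_colonies / total_viable


# Hypothetical colony counts, for illustration only
freq_intronless = recombination_frequency(120, 200, dilution_factor=1000)
freq_with_intron = recombination_frequency(15, 210, dilution_factor=1000)
fold_reduction = freq_intronless / freq_with_intron
```

Comparing the two frequencies, as in the last line, gives the fold reduction in recombination attributed to the inserted sequence under these assumed counts.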

in tho mutants, as scored by nascent mRNA quantification (Figure 3C) and Pol II ChIP (Figure 3D). Finally, the intron virtually suppressed the unwanted recombination induced on the YAT1 gene by THO inactivation (Figure 3E).

Importantly, the presence of the RPL51A* intron did not affect R-loop formation in the in vitro transcription assay (Figures S3D and S3E). In contrast, its protective effect was observed in distinct in vivo mutant situations associated with R-loop accumulation. Indeed, loss of function of either Hpr1, another THO complex subunit (Figures S4A and S4B), or Sus1, a subunit of the THSC/TREX-2 that functions downstream of the THO complex in the mRNP biogenesis and export pathway (Figures 3F and S4C), similarly triggered transcription defects and hyper-recombination phenotypes that were reduced in the presence of the intron. Similarly, inactivation of the R-loop-resolving Sen1 helicase triggered transcription-dependent recombination on the YAT1 intronless gene (Figure 3F), in agreement with earlier studies using alternative reporter systems (Mischo et al., 2011; Stirling et al., 2012), and this phenotype was partially alleviated on the intron-containing construct (Figure 3F).

In addition, the effect of the intron was not restricted to the case of the YAT1 gene. LYS2 is a long gene similarly reported to display transcription defects and transcription-associated recombination in tho cells (Chavez et al., 2001), and insertion of the RPL51A* intron at its 5′ end partially rescued these R-loop-associated phenotypes (Figures S4E–S4G). Finally, other intronic sequences were also tested for their ability to protect against the consequences of R-loop accumulation. Strikingly, insertion of the RPL35A or SEC27 natural introns at the 5′ end of the YAT1 reporter (Figure 4A) similarly supported splicing (Figure S3B), and it further alleviated the deleterious effect of THO inactivation on transcription (Figure 4B) and recombination



Figure 4. Distinct Intronic Sequences Alleviate R-Loop-Associated Phenotypes
(A) Principle of the reporter constructs used in the different panels. All reporters were further integrated between leu2 repeats to score transcription-associated recombination.
(B) mRNA levels in tho (mft1Δ) yeast cells expressing the indicated reporter constructs (qRT-PCR, normalized to ACT1 mRNA values; n = 3).
(C) Recombination frequencies scored upon transcriptional induction for tho yeast strains expressing the indicated constructs (n = 3).
Means and SD are plotted (*p ≤ 0.05 and **p ≤ 0.01; ns, not significant; Welch's t test).
See also Figure S3.

(Figure 4C). In contrast, insertion of an exonic sequence originating from the intronless RPL4A gene and of similar length as the RPL51A* intron failed to rescue the decreased transcription and the hyper-recombination phenotype of tho cells (Figures 4B and 4C). This set of results therefore demonstrates that de novo insertion of introns in R-loop-prone genes can suppress the pathological formation of R-loops and their impact on transcription and genetic stability in vivo.

RNP Formation, but Not Splicing Per Se, Attenuates R-Loop Formation and Genetic Instability

Based on these findings, we used transcription and hyper-recombination of YAT1 reporters in R-loop-accumulating tho mutants as a readout of hybrid formation to further decipher the mechanisms by which introns attenuate R-loop accumulation (Figure 5A). On the one hand, introns could prevent R-loop formation by recruiting the spliceosome that would directly antagonize invasion of the pre-mRNA into its DNA template or further contribute to alternative mRNP assembly pathways specifically taking over upon impaired mRNP biogenesis (tho mutants). On the other hand, intron splicing could itself attenuate R-loop formation by decreasing the sequence homology between the mRNA and its template. To discriminate between these two hypotheses, we inserted within the YAT1 reporter different splice site (SS) mutants of the RPL51A* intron as follows: (1) a combined mutation of the 5′ SS and of the branchpoint (5II3I), which fully suppresses spliceosome recruitment and impairs splicing (Lacadie and Rosbash, 2005); (2) a deletion of the 5′ SS (Δ5), which affects early spliceosome assembly and, subsequently, inhibits the first step of splicing (Lacadie and Rosbash, 2005); and (3) a mutation of the 3′ SS (3′ ss*), which supports early stages of spliceosome formation but prevents the second step of splicing (Alexander et al., 2010). ChIP experiments revealed that the recruitment of spliceosomal U1 and U2 small nuclear (sn)RNPs was indeed defective for both 5′ SS mutants (Δ5 and 5II3I) but virtually unaffected for the 3′ SS mutant intron (Figure S3A). In addition, RT-PCR analyses confirmed that splicing was strongly reduced for the three mutants (Figure S3B). Strikingly, both 5′ SS mutants rescued neither the transcriptional defect (Figure 5B) nor the hyper-recombination (Figure 5C) arising at the intronless YAT1 gene in R-loop-forming tho cells. In contrast, the 3′ SS mutant intron fully rescued both R-loop-associated cellular phenotypes (Figures 5B and 5C), but it failed to suppress hybrid formation in the in vitro transcription system (Figures S3D and S3E), demonstrating that spliceosome recruitment in vivo, but not splicing per se, is the main cause of R-loop suppression.

To further confirm this finding, we next inserted within the YAT1 reporter a Tetrahymena group I self-splicing intron, which readily excises in the context of the YAT1 sequence (Figure S3C) without recruiting any protein machineries (Chalamcharla et al., 2010). Self-splicing occurred co-transcriptionally (i.e., on nascent mRNAs; Figure S3C), but it failed to rescue the impaired transcription and the hyper-recombination phenotype triggered by R-loop accumulation (Figures 5B and 5C), confirming that splicing per se was not sufficient to reduce R-loop formation. Conversely, artificial tethering of MS2-coat proteins (MS2-CPs) onto the intronless YAT1 mRNA through two or six MS2-loops (MS2L) inserted in place of the intron partially rescued the transcriptional defects of tho mutants, as probed by measurement of mRNA levels (Figure 5D) and Pol II recruitment (Figures S5F and S5G). The observation of an increased Pol II occupancy upon MS2-CP binding to MS2-loops rules out the possibility that the rescue of YAT1 mRNA levels would solely reflect the reported stabilization of MS2-CP-bound RNA species (Garcia and Parker, 2015). Furthermore, MS2-CP recruitment onto the YAT1 mRNA suppressed the hyper-recombination phenotype caused by THO inactivation (Figure 5E). In agreement with a direct



Figure 5. Spliceosome-Dependent mRNP Assembly, but Not Splicing Per Se, Is Required to Prevent R-Loop Formation
(A) Principle of the reporter constructs used in the different panels. All reporters were further integrated between leu2 repeats to score transcription-associated
recombination. The inserted sequences do not disturb transcription or recombination at the YAT1 locus in WT cells (see Figures S5BS5E).
(B) mRNA levels in tho (mft1D) yeast cells expressing the indicated reporter constructs (qRT-PCR, normalized to ACT1 mRNA values; n = 3).
(C) Recombination frequencies for tho yeast strains expressing the indicated constructs (n = 3). The frequency of recombination events arising upon tran-
scriptional induction (Gal) or repression (Glu, hatched) is scored for each reporter. The ability to properly recruit the spliceosome and to get spliced is indicated for
each intron tested (see also Figures S3A–S3C). 5II3I, Δ5, and 3′ ss* are splice site mutations within the RPL51A* intron (Lacadie and Rosbash, 2005; Alexander
et al., 2010).
(D) mRNA levels in WT or tho yeast cells expressing the indicated reporter constructs (as in B, n = 3).
(E) Recombination frequencies for WT or tho yeast strains expressing the indicated constructs (as in C, n = 3).
(F) DNA:RNA hybrid formation per mRNA in WT or tho yeast cells expressing the indicated reporter constructs (percentage of IP/mRNA; n = 3). The expression of
MS2-coat proteins (MS2-CPs) artificially tethered onto the mRNA through two or six MS2-loops is indicated.
Means and SD are plotted (*p ≤ 0.05, **p ≤ 0.01, and ***p ≤ 0.001; ns, not significant; Welch's t test). See also Figure S5.
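
The significance marks in these panels rely on Welch's t test applied to n = 3 replicates per condition. As a reminder of what that test computes, the following is a minimal Python sketch of the Welch statistic and its Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the replicate values are illustrative assumptions, not code or data from this study; obtaining a p value would additionally require the t distribution, for instance from a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    # Per-group squared standard errors (sample variance divided by n).
    sa, sb = variance(a) / na, variance(b) / nb
    se2 = sa + sb
    if se2 == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


# Hypothetical triplicate measurements (arbitrary units), for illustration only.
condition_1 = [1.00, 1.10, 0.95]
condition_2 = [0.40, 0.55, 0.45]
t_stat, dof = welch_t(condition_1, condition_2)
```

With only three replicates per group, the degrees of freedom are small, which is one reason the unequal-variance form is a cautious choice here.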

suppression of R-loop formation by the presence of MS2-CP bound to the mRNA, our DRIP assay detected decreased hybrid levels at the MS2L-YAT1 reporter upon MS2-CP expression (Figure 5F). Taken together, these experiments support a model in which introns prevent R-loop formation by recruiting protein factors that directly antagonize hybridization of the nascent mRNA onto its template (Figure S5H).

Intron-Rich Genomes Do Not Accumulate R-Loops upon Impaired mRNP Biogenesis
We then sought to determine whether this function of introns had been maintained throughout evolution. Since the function of the THO complex in preventing the accumulation of genotoxic R-loops on Pol II genes has been conserved from yeast to human (Huertas and Aguilera, 2003; Domínguez-Sánchez et al., 2011), we systematically triggered hybrid formation by inactivating its Hpr1 subunit in a panel of yeasts differing for their intron content, including intron-poor (C. glabrata and S. cerevisiae) and intron-rich (S. pombe and C. neoformans) species (Figure 6A). As previously observed in S. cerevisiae (García-Rubio et al., 2008), loss of HPR1 function triggered a noticeable growth defect in C. glabrata, S. pombe, or C. neoformans cells (Figures S6A–S6C). Using genomic DNA from cells of these distinct species, we then detected RNH-sensitive hybrids on dot blots (Figures S6D–S6G), as previously performed for templates forming R-loops in vitro (as shown in Figure 2C). This analysis readily detected DNA:RNA hybrids on genomic DNA from WT cells of these different species, possibly originating from transcription by all RNA polymerases. Strikingly, THO inactivation, which is likely to specifically increase Pol II-dependent hybrids, triggered a clear accumulation of R-loops in intron-poor genomes, e.g., in C. glabrata (3.3% of intron-containing genes) and S. cerevisiae (4.4%) (Figures 6B and 6C). However, loss of the THO complex had little or no effect in yeasts with an elevated proportion of intron-containing genes, S. pombe (47%) and C. neoformans (99.5%) (Figures 6D and 6E), revealing an inverse correlation



Figure 6. Intron-Rich Yeast Genomes Do Not Accumulate R-Loops upon Improper mRNP Biogenesis
(A) Occurrence of introns in the genomes of the indicated yeast species. The following metrics are indicated: percentage of intron-containing genes (among
protein-coding genes), mean number of introns (per intron-containing gene), positional bias of the introns (5′, introns preferentially located at the 5′ end
of the mRNA). Sources: (a) Linde et al., 2015; (b) Saccharomyces Genome Database (http://www.yeastgenome.org); (c) PomBase (http://www.pombase.org); (d) Janbon et al.,
2014; (e) Lin and Zhang, 2005.
(B–E) DNA:RNA hybrid detection by dot blot on genomic DNA from either WT or hpr1 (tho) cells in the indicated species (B, C. glabrata; C, S. cerevisiae; D,
S. pombe; E, C. neoformans). Decreasing amounts of DNA extracts from the indicated cells were probed using antibodies directed against DNA:RNA hybrids (left
panels) or dsDNA (right panels).
(F) DNA:RNA hybrid levels were quantified from (B)–(E) using serial dilutions of a reference sample as a standard and normalized to the amount of DNA in each
sample (for each species, n = 3). The accumulation of DNA:RNA hybrids upon THO inactivation (HPR1 knockdown, relative to WT) was plotted as a function of the
fraction of intron-containing genes (among protein-coding genes) in the genomes of the different species.
Means and SD are plotted. The Pearson correlation coefficient is indicated. See also Figure S6.
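
The correlation reported in (F) relates, for each species, the fraction of intron-containing genes to the accumulation of hybrids upon THO inactivation. The following Python sketch shows how such a Pearson coefficient is computed. The intron percentages are those quoted in the text for the four species; the hybrid accumulation values are hypothetical placeholders chosen for illustration only, so the resulting coefficient is not the published one, and the helper name `pearson_r` is likewise an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Percentage of intron-containing genes, as quoted in the text
# (C. glabrata, S. cerevisiae, S. pombe, C. neoformans).
intron_fraction = [3.3, 4.4, 47.0, 99.5]
# HYPOTHETICAL hybrid accumulation (tho relative to WT); not measured values.
hybrid_accumulation = [3.0, 2.5, 1.2, 1.0]

r = pearson_r(intron_fraction, hybrid_accumulation)
```

A negative coefficient corresponds to the inverse relationship described in the text: the higher the intron content, the smaller the increase in hybrids.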

between hybrid formation and intron content (Figure 6F; Pearson correlation coefficient = −0.89). Intron-richest genomes are thereby less prone to accumulate R-loops in a context of defective mRNP biogenesis.

Intron-Containing Human Genes Display Decreased Levels of R-Loops and DNA Damage
We then examined various datasets of genome-wide R-loop distribution obtained from human cells (Ginno et al., 2013; Lim et al., 2015; Nadel et al., 2015), and we calculated R-loop densities for intronless and intron-containing genes. The analysis of hybrid-positive regions in HEK293 cells revealed that R-loop densities do not vary much among transcribed genes in these cells, as shown previously (Nadel et al., 2015), but that they are systematically lower on intron-containing as compared to intronless loci (Figures 7A and S7A). This observation was not merely due to the difference in the number of intronless and intron-containing loci, as proved by data bootstrapping (Figure 7A), and it was also confirmed by examining independent datasets obtained from distinct cell types (IMR90, NTERA-2, and human fibroblasts; Figures S7C, S7E, and S7G). Furthermore, analysis of potential confounding effects (transcription levels, gene length, G-C content, and GC skew) for these distinct datasets did not identify any genomic feature susceptible to bias the formation of R-loops, as shown for highly expressed intron-containing genes (Figures S7B, S7D, S7F, and S7H). In particular, increased R-loop levels on intronless genes do not seem to be caused by their reduced length, since R-loop-forming intronless genes are frequently longer than intronless genes with no detectable R-loops (see, for example, Figures S7D, S7F, and S7H,



Figure 7. Intron-Containing Genes Are Protected from R-Loop and DNA Damage Accumulation in Humans
(A) DNA:RNA hybrid densities on intronless and intron-containing hybrid-positive loci, evaluated according to a dataset of hybrid-prone sites detected in HEK293
cells (Nadel et al., 2015) and ranked according to nascent mRNA levels (see also Figure S7A). L, low expression; M, medium expression; H, high expression; VH,
very high expression.
(B) γ-H2AX association to intronless and intron-containing hybrid-positive loci from (A), evaluated according to a γ-H2AX ChIP-seq dataset obtained from
HEK293 cells (Bunch et al., 2015) and displayed according to expression levels.
(C) Model for the role of introns in R-loop prevention. Intron-mediated recruitment of the spliceosome, and possibly of other factors, would favor RNP formation
and antagonize hybridization of the mRNA onto its DNA template.
***p ≤ 0.001, Bootstrapping Method. See also Figure S7.
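
Bootstrapping is used here because intronless loci are far fewer than intron-containing ones. The exact resampling scheme is given in the methods and is not reproduced in this section; the Python sketch below shows one common, size-matched variant under stated assumptions. It repeatedly draws from the larger group as many values as the smaller group contains and asks how often the resampled mean reaches the observed mean of the smaller group. The function name, the one-sided comparison, and all density values are hypothetical illustrations, not the authors' procedure or data.

```python
import random
from statistics import mean


def size_matched_bootstrap_p(small_group, large_group, n_resamples=10000, seed=0):
    """One-sided p value: fraction of size-matched resamples of the large group
    whose mean is at least as high as the observed mean of the small group."""
    if not small_group or not large_group:
        raise ValueError("both groups must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(small_group)
    k = len(small_group)
    hits = 0
    for _ in range(n_resamples):
        resample = rng.choices(large_group, k=k)  # k draws with replacement
        if mean(resample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# HYPOTHETICAL R-loop densities (arbitrary units), for illustration only.
intronless = [4.1, 3.8, 5.0, 4.4, 3.9]
intron_containing = [2.0, 2.4, 1.8, 2.9, 2.2, 2.6, 1.9, 2.1, 2.5, 2.3, 3.0, 1.7]
p_value = size_matched_bootstrap_p(intronless, intron_containing)
```

Matching the resample size to the smaller group addresses the concern raised in the text, namely that a difference could arise merely from the unequal numbers of loci in the two classes.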

length panels). Human intron-containing genes are thereby less prone to accumulate DNA:RNA hybrids as compared to their intronless counterparts.

We then compared DNA damage accumulation on intronless and intron-containing R-loop-forming loci using a genome-wide γ-H2AX map also obtained in the HEK293 cell line (Bunch et al., 2015). Although γ-H2AX accumulation was not systematic on R-loop-prone loci, we scored a significantly decreased signal on intron-containing genes of the highest transcription class (Figure 7B), a result that was still significant after correcting for multiple hypothesis testing (p = 0.0004). These observations support that, among other features, R-loop accumulation can be a cause of damage accumulation and that reduced R-loop formation on intron-containing genes also dampens transcription-associated genetic instability in the human genome.

DISCUSSION

In this study, we provide multiple lines of evidence supporting our initial hypothesis that introns protect eukaryotic genomes from R-loop accumulation and subsequent transcription-associated genetic instability. First, by modifying the intron content of budding yeast genes, we have demonstrated that removing the introns from intron-containing genes is sufficient to trigger hybrid formation; conversely, inserting an intron within an intronless R-loop-prone gene suppresses R-loop formation and hyper-recombination in R-loop-forming mutants. Second, by comparing distinct yeast species differing for their intron content in situations of impaired mRNP biogenesis, we have scored lowered R-loop accumulation in intron-rich genomes. Finally, genome-wide analyses revealed that intron-containing genes display decreased R-loop levels and DNA damage as compared to intronless genes of similar expression, in both yeast and human cells.

It is noteworthy that the presence of introns is not the only determinant of R-loop formation in our genome-wide analyses, as evidenced by the detection of both hybrid-positive intron-containing genes and hybrid-negative intronless genes (Figures 1A and 7A). However, when R-loops form on intron-containing loci, e.g., for highly expressed yeast genes or in hybrid-prone mutants, their levels are strongly attenuated as compared to those scored on their intronless counterparts (Figures 1B, 3B, and 7A). The genomic R-loop distribution is thereby likely to be modulated by the occurrence of introns, together with other cis-acting determinants previously reported to impact on hybrid formation. Among them, high levels of transcription from strong promoters have been shown to drive R-loop accumulation in yeast (Chan et al., 2014a; Wahba et al., 2016). In addition, the influence of sequence properties on the propensity to form R-loops has been pinpointed by different studies: while GC richness was shown to positively influence R-loop formation both in vitro on transcribed reporters (Roy and Lieber, 2009) and in vivo on human promoters (Ginno et al., 2013), A:T tracts were recently shown to facilitate hybrid accumulation (Wahba et al., 2016). However, these features are unlikely to account for the lowered R-loop formation scored on intron-containing genes in our study. Indeed, (1) intronless genes harbor higher R-loop densities than intron-containing genes of similar expression (Figures 1B, 7A, S1C, and S7A), (2) hybrid-positive and



hybrid-negative intron-containing loci do not exhibit differences in transcription levels or base content (Figures S1C–S1G and S7), and (3) the ability of different intron variants to rescue R-loop-associated phenotypes when inserted in the YAT1 gene does not correlate with particular sequence properties, besides their ability to recruit proteins onto the mRNA (Figure S5H). In addition to these cis-acting determinants, trans-acting factors such as the THO complex can also be predominant players in R-loop prevention, as shown in the case of the YAT1 gene (Figure 3). Notably, the functional importance of combining both trans-acting factors and cis-acting elements such as introns is further supported by the synergistic effect on R-loop formation of preventing both THO complex function and spliceosome binding (Figures 3, 4, and 5).

Regarding the mechanism of R-loop suppression, the only sequences that could attenuate hybrid formation on the R-loop-forming YAT1 reporter were either spliceosome-bound introns (Figures 3, 4, and 5) or MS2-loops bound by MS2-CPs (Figures 5D–5F). This last finding, together with the results obtained with the 3′ ss* intron, which recruits the spliceosome but does not complete splicing, demonstrates that the presence of proteins bound to the mRNA can be sufficient to restrain its hybridization onto its DNA template, either by steric hindrance or by facilitating mRNA folding, a process previously demonstrated to modulate R-loop formation (Chen et al., 2016). However, RNP packaging with MS2-CPs failed to fully rescue R-loop-associated phenotypes, notably the transcriptional defects caused by hybrid accumulation (Figures S5F and S5G).

As opposed to the artificial tethering of MS2-CPs, the intron-mediated binding of splicing factors is likely to facilitate the recruitment of additional proteins that contribute to the biogenesis of a fully packaged mRNP and to its release from the transcription site, thereby preventing R-loop formation. Candidate proteins that could be recruited onto mRNPs in a splicing-dependent manner and further preclude hybrid accumulation include the following: (1) the hnRNP Npl3 and the helicase Sub2, which contact the spliceosome (Gottschalk et al., 1998; Lardelli et al., 2010) and whose inactivation triggers R-loop-dependent genetic instability in yeast (Jimeno et al., 2002; Santos-Pereira et al., 2013), and (2) the exon junction complex, which specifically decorates spliced mRNAs in metazoans (Le Hir et al., 2016). In addition, several reports have pointed out that the splicing process can interfere with chromatin organization and/or transcription elongation (de Almeida and Carmo-Fonseca, 2014). In particular, splicing has been shown to trigger a pause in Pol II elongation following the 3′ SS (Alexander et al., 2010; Carrillo Oesterreich et al., 2010). Interestingly, Pol II slowdown has been shown to suppress certain phenotypes of the R-loop-forming tho mutants (Jensen et al., 2004; Jimeno et al., 2008). However, Pol II pausing does not appear to occur in the second exon of the intron-containing YAT1 reporter (Figure S5I). In addition, mutation of the intron 3′ SS prevents pausing (Alexander et al., 2010) yet suppresses R-loop phenotypes (Figures 5B and 5C). We therefore support a model in which the intron would prevent R-loop formation by favoring RNP assembly and not by solely modifying Pol II kinetics (Figure 7C). In the future, the systematic analysis of the composition of intronless or intron-containing mRNPs will likely give insights into this function of introns.

Our results also demonstrate that, being protected from R-loop accumulation, intron-containing loci display lowered levels of transcription-associated genetic instability (Figures 1H, 3E, 3F, and 7B). Notably, in our genome-wide analyses, DNA damage is not systematically detected on R-loop-forming loci, possibly in relation to their location in the genome, including their distance and orientation relative to replication origins (Stirling et al., 2012; Gaillard et al., 2013), or to the action of endogenous RNases H or other cellular factors that would maintain non-genotoxic levels of hybrids in WT cells. In situations characterized by stable and/or increased R-loop levels, the presence of introns would be more critical to dampen transcription-associated genetic instability, as evidenced in tho mutants. It is tempting to speculate that this function of introns could contribute to explain their evolutionary selection and/or maintenance at certain genomic locations, in particular within highly expressed yeast genes in which they are particularly enriched (Figure S1A). Their role in R-loop prevention could further account for their retention at the 5′ ends of genes in intron-poor species (Lin and Zhang, 2005), a positional bias also observed in intronogenesis assays (Lee and Stevens, 2016) and that could prevent hybrid formation at early stages of nascent mRNA synthesis. In certain genetic, physiological, or stress conditions in which R-loops could reach genotoxic levels, in particular due to the natural inactivation of cellular R-loop-antagonizing factors (see, for instance, Jackson et al., 2014), the presence of introns could thereby prevent both inappropriate gene expression and acquisition of heritable mutations.

STAR★METHODS

Detailed methods are provided in the online version of this paper and include the following:

• KEY RESOURCES TABLE
• CONTACT FOR REAGENT AND RESOURCES SHARING
• EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ◦ Yeast Strains and Growth
  ◦ Plasmids
• METHOD DETAILS
  ◦ DNA:RNA hybrid detection
  ◦ Gene expression and splicing analyses
  ◦ Bioinformatic analyses of yeast datasets
  ◦ Bioinformatic analyses of human datasets
• QUANTIFICATION AND STATISTICAL ANALYSIS
• DATA AND SOFTWARE AVAILABILITY

SUPPLEMENTAL INFORMATION

Supplemental Information includes seven figures and one table and can be found with this article online at http://dx.doi.org/10.1016/j.molcel.2017.07.002.

AUTHOR CONTRIBUTIONS

Conceptualization, A.B. and B.P.; Methodology, A.B. and B.P.; Investigation, A.B., E.C., A.P., S.C.S., G.J., and B.P.; Formal Analysis – Yeast Datasets,



A.E. and V.G.; Formal Analysis – Human Datasets, A.R.G. and S.F.d.A.; Writing, B.P.; Visualization, A.B. and B.P.; Funding Acquisition, S.F.d.A., V.G., and B.P.; Supervision, V.G., S.F.d.A., and B.P.

ACKNOWLEDGMENTS

We thank S. Abou Elela, A. Aguilera, M. Belfort, E. Bertrand, J.-M. Camadro, V. Doye, M. Garcia, P. Hieter, D. Koshland, P. Lesage, D. Libri, R. Rothstein, M. Rougemaille, G. Rabut, J. Rouviere, P. Stirling, F. Stutz, V. Vanoosthuyse, and M. Wickens for reagents and/or discussion and B. Quioc and F. Moyrand for technical assistance. This work was supported by CNRS (to B.P.), Fondation ARC pour la Recherche sur le Cancer (to B.P.), Ligue Nationale contre le Cancer (to B.P.; fellowship to A.B. and Equipe Labellisee 2014 to V.G.), EMBO (short-term fellowship to A.B.), Canceropole PACA (fellowship to A.E.), and Fundacao para a Ciencia e Tecnologia, Portugal (PTDC/BIM-ONC/0016-2014 to S.F.d.A. and IF/00510/2014 to A.R.G.). Bioinformatic and computing support at CRCM was provided by the CRCM Integrative Bioinformatics and Datacentre IT and Scientific Computing platforms.

Received: September 22, 2016
Revised: May 19, 2017
Accepted: June 30, 2017
Published: July 27, 2017

REFERENCES

Alexander, R.D., Innocente, S.A., Barrass, J.D., and Beggs, J.D. (2010). Splicing-dependent RNA polymerase pausing in yeast. Mol. Cell 40, 582–593.
Ares, M., Jr., Grate, L., and Pauling, M.H. (1999). A handful of intron-containing genes produces the lion's share of yeast mRNA. RNA 5, 1138–1139.
Bahler, J., Wu, J.Q., Longtine, M.S., Shah, N.G., McKenzie, A., 3rd, Steever, A.B., Wach, A., Philippsen, P., and Pringle, J.R. (1998). Heterologous modules for efficient and versatile PCR-based gene targeting in Schizosaccharomyces pombe. Yeast 14, 943–951.
Berget, S.M., Moore, C., and Sharp, P.A. (1977). Spliced segments at the 5′ terminus of adenovirus 2 late mRNA. Proc. Natl. Acad. Sci. USA 74, 3171–3175.
Bonnet, A., Bretes, H., and Palancade, B. (2015). Nuclear pore components affect distinct stages of intron-containing gene expression. Nucleic Acids Res. 43, 4249–4261.
Braunschweig, U., Barbosa-Morais, N.L., Pan, Q., Nachman, E.N., Alipanahi, B., Gonatopoulos-Pournatzis, T., Frey, B., Irimia, M., and Blencowe, B.J. (2014). Widespread intron retention in mammals functionally tunes transcriptomes. Genome Res. 24, 1774–1786.
Bray, N.L., Pimentel, H., Melsted, P., and Pachter, L. (2016). Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527.
Bunch, H., Lawney, B.P., Lin, Y.F., Asaithamby, A., Murshid, A., Wang, Y.E., Chen, B.P., and Calderwood, S.K. (2015). Transcriptional elongation requires DNA break-induced signalling. Nat. Commun. 6, 10191.
Byrne, K.P., and Wolfe, K.H. (2005). The Yeast Gene Order Browser: combining curated homology and syntenic context reveals gene fate in polyploid species. Genome Res. 15, 1456–1461.
Capra, J.A., Paeschke, K., Singh, M., and Zakian, V.A. (2010). G-quadruplex DNA sequences are evolutionarily conserved and associated with distinct genomic features in Saccharomyces cerevisiae. PLoS Comput. Biol. 6, e1000861.
Carrillo Oesterreich, F., Preibisch, S., and Neugebauer, K.M. (2010). Global analysis of nascent RNA reveals transcriptional pausing in terminal exons. Mol. Cell 40, 571–581.
Chalamcharla, V.R., Curcio, M.J., and Belfort, M. (2010). Nuclear expression of a group II intron is consistent with spliceosomal intron ancestry. Genes Dev. 24, 827–836.
Chan, Y.A., Aristizabal, M.J., Lu, P.Y., Luo, Z., Hamza, A., Kobor, M.S., Stirling, P.C., and Hieter, P. (2014a). Genome-wide profiling of yeast DNA:RNA hybrid prone sites with DRIP-chip. PLoS Genet. 10, e1004288.
Chan, Y.A., Hieter, P., and Stirling, P.C. (2014b). Mechanisms of genome instability induced by RNA-processing defects. Trends Genet. 30, 245–253.
Chavez, S., García-Rubio, M., Prado, F., and Aguilera, A. (2001). Hpr1 is preferentially required for transcription of either long or G+C-rich DNA sequences in Saccharomyces cerevisiae. Mol. Cell. Biol. 21, 7054–7064.
Chen, X., Yang, J.R., and Zhang, J. (2016). Nascent RNA folding mitigates transcription-associated mutagenesis. Genome Res. 26, 50–59.
Chow, L.T., Gelinas, R.E., Broker, T.R., and Roberts, R.J. (1977). An amazing sequence arrangement at the 5′ ends of adenovirus 2 messenger RNA. Cell 12, 1–8.
Costantino, L., and Koshland, D. (2015). The yin and yang of R-loop biology. Curr. Opin. Cell Biol. 34, 39–45.
Datta, A., and Jinks-Robertson, S. (1995). Association of increased spontaneous mutation rates with high levels of transcription in yeast. Science 268, 1616–1619.
de Almeida, S.F., and Carmo-Fonseca, M. (2014). Reciprocal regulatory links between cotranscriptional splicing and chromatin. Semin. Cell Dev. Biol. 32, 2–10.
Delaveau, T., Davoine, D., Jolly, A., Vallot, A., Rouviere, J.O., Gerber, A., Brochet, S., Plessis, M., Roquigny, R., Merhej, J., et al. (2016). Tma108, a putative M1 aminopeptidase, is a specific nascent chain-associated protein in Saccharomyces cerevisiae. Nucleic Acids Res. 44, 8826–8841.
Domínguez-Sánchez, M.S., Barroso, S., Gomez-Gonzalez, B., Luna, R., and Aguilera, A. (2011). Genome instability and transcription elongation impairment in human cells depleted of THO/TREX. PLoS Genet. 7, e1002386.
El Hage, A., Webb, S., Kerr, A., and Tollervey, D. (2014). Genome-wide distribution of RNA-DNA hybrids identifies RNase H targets in tRNA genes, retrotransposons and mitochondria. PLoS Genet. 10, e1004716.
ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
Gaillard, H., Herrera-Moyano, E., and Aguilera, A. (2013). Transcription-associated genome instability. Chem. Rev. 113, 8638–8661.
Garcia, J.F., and Parker, R. (2015). MS2 coat proteins bound to yeast mRNAs block 5′ to 3′ degradation and trap mRNA decay products: implications for the localization of mRNAs by MS2-MCP system. RNA 21, 1393–1395.
García-Rubio, M., Chavez, S., Huertas, P., Tous, C., Jimeno, S., Luna, R., and Aguilera, A. (2008). Different physiological relevance of yeast THO/TREX subunits in gene expression and genome integrity. Mol. Genet. Genomics 279, 123–132.
Ghaemmaghami, S., Huh, W.K., Bower, K., Howson, R.W., Belle, A., Dephoure, N., O'Shea, E.K., and Weissman, J.S. (2003). Global analysis of protein expression in yeast. Nature 425, 737–741.
Ginno, P.A., Lott, P.L., Christensen, H.C., Korf, I., and Chedin, F. (2012). R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol. Cell 45, 814–825.
Ginno, P.A., Lim, Y.W., Lott, P.L., Korf, I., and Chedin, F. (2013). GC skew at the 5′ and 3′ ends of human genes links R-loop formation to epigenetic regulation and transcription termination. Genome Res. 23, 1590–1600.
Goebels, C., Thonn, A., Gonzalez-Hilarion, S., Rolland, O., Moyrand, F., Beilharz, T.H., and Janbon, G. (2013). Introns regulate gene expression in Cryptococcus neoformans in a Pab2p dependent pathway. PLoS Genet. 9, e1003686.
Gonzalez-Aguilera, C., Tous, C., Gomez-Gonzalez, B., Huertas, P., Luna, R., and Aguilera, A. (2008). The THP1-SAC3-SUS1-CDC31 complex works in transcription elongation-mRNA export preventing RNA-mediated genome instability. Mol. Biol. Cell 19, 4310–4318.



Gottschalk, A., Tang, J., Puig, O., Salgado, J., Neubauer, G., Colot, H.V., Mann, M., Seraphin, B., Rosbash, M., Luhrmann, R., and Fabrizio, P. (1998). A comprehensive biochemical and genetic analysis of the yeast U1 snRNP reveals five novel proteins. RNA 4, 374–393.
Halasz, L., Karanyi, Z., Boros-Olah, B., Kuik-Rozsa, T., Sipos, E., Nagy, E., Mosolygo-L, A., Mazlo, A., Rajnavolgyi, E., Halmos, G., and Szekvolgyi, L. (2017). RNA-DNA hybrid (R-loop) immunoprecipitation mapping: an analytical workflow to evaluate inherent biases. Genome Res. 27, 1063–1073.
Harrow, J., Frankish, A., Gonzalez, J.M., Tapanari, E., Diekhans, M., Kokocinski, F., Aken, B.L., Barrell, D., Zadissa, A., Searle, S., et al. (2012). GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774.
Heyn, P., Kalinka, A.T., Tomancak, P., and Neugebauer, K.M. (2015). Introns and gene expression: cellular constraints, transcriptional regulation, and evolutionary consequences. BioEssays 37, 148–154.
Holstege, F.C., Jennings, E.G., Wyrick, J.J., Lee, T.I., Hengartner, C.J., Green, M.R., Golub, T.R., Lander, E.S., and Young, R.A. (1998). Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95, 717–728.
Huertas, P., and Aguilera, A. (2003). Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination. Mol. Cell 12, 711–721.
Jackson, B.R., Noerenberg, M., and Whitehouse, A. (2014). A novel mechanism inducing genome instability in Kaposi's sarcoma-associated herpesvirus infected cells. PLoS Pathog. 10, e1004098.
Janbon, G., Ormerod, K.L., Paulet, D., Byrnes, E.J., 3rd, Yadav, V., Chatterjee, G., Mullapudi, N., Hon, C.C., Billmyre, R.B., Brunel, F., et al. (2014). Analysis of the genome and transcriptome of Cryptococcus neoformans var. grubii reveals complex RNA expression and microevolution leading to virulence attenuation. PLoS Genet. 10, e1004261.
Jensen, T.H., Boulay, J., Olesen, J.R., Colin, J., Weyler, M., and Libri, D. (2004). Modulation of transcription affects mRNP quality. Mol. Cell 16, 235–244.
Jimeno, S., Rondon, A.G., Luna, R., and Aguilera, A. (2002). The yeast THO complex and mRNA export factors link RNA metabolism with transcription and genome instability. EMBO J. 21, 3526–3535.
Jimeno, S., García-Rubio, M., Luna, R., and Aguilera, A. (2008). A reduction in RNA polymerase II initiation rate suppresses hyper-recombination and transcription-elongation impairment of THO mutants. Mol. Genet. Genomics 280, 327–336.
Kim, D.U., Hayles, J., Kim, D., Wood, V., Park, H.O., Won, M., Yoo, H.S., Duhig, T., Nam, M., Palmer, G., et al. (2010). Analysis of a genome-wide set of gene deletions in the fission yeast Schizosaccharomyces pombe. Nat. Biotechnol. 28, 617–623.
Kitada, K., Yamaguchi, E., and Arisawa, M. (1995). Cloning of the Candida glabrata TRP1 and HIS3 genes, and construction of their disruptant strains by sequential integrative transformation. Gene 165, 203–206.
Lacadie, S.A., and Rosbash, M. (2005). Cotranscriptional spliceosome assembly dynamics and the role of U1 snRNA:5′ss base pairing in yeast. Mol. Cell 19, 65–75.
Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25.
Lardelli, R.M., Thompson, J.X., Yates, J.R., 3rd, and Stevens, S.W. (2010). Release of SF3 from the intron branchpoint activates the first step of pre-mRNA splicing. RNA 16, 516–528.
Le Hir, H., Sauliere, J., and Wang, Z. (2016). The exon junction complex as a node of post-transcriptional networks. Nat. Rev. Mol. Cell Biol. 17, 41–54.
Lee, S., and Stevens, S.W. (2016). Spliceosomal intronogenesis. Proc. Natl. Acad. Sci. USA 113, 6514–6519.
Li, X., and Manley, J.L. (2005). Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell 122, 365–378.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R.; 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079.
Li, Z., Vizeacoumar, F.J., Bahr, S., Li, J., Warringer, J., Vizeacoumar, F.S., Min, R., Vandersluis, B., Bellay, J., Devit, M., et al. (2011). Systematic exploration of essential yeast gene function with temperature-sensitive mutants. Nat. Biotechnol. 29, 361–367.
Lim, Y.W., Sanz, L.A., Xu, X., Hartono, S.R., and Chedin, F. (2015). Genome-wide DNA hypomethylation and RNA:DNA hybrid accumulation in Aicardi-Goutieres syndrome. eLife 4, e08007.
Lin, K., and Zhang, D.Y. (2005). The excess of 5′ introns in eukaryotic genomes. Nucleic Acids Res. 33, 6522–6527.
Linde, J., Duggan, S., Weber, M., Horn, F., Sieber, P., Hellwig, D., Riege, K., Marz, M., Martin, R., Guthke, R., and Kurzai, O. (2015). Defining the transcriptomic landscape of Candida glabrata by RNA-seq. Nucleic Acids Res. 43, 1392–1406.
Mischo, H.E., Gomez-Gonzalez, B., Grzechnik, P., Rondon, A.G., Wei, W., Steinmetz, L., Aguilera, A., and Proudfoot, N.J. (2011). Yeast Sen1 helicase protects the genome from transcription-associated instability. Mol. Cell 41, 21–32.
Nadel, J., Athanasiadou, R., Lemetre, C., Wijetunga, N.A., O Broin, P., Sato, H., Zhang, Z., Jeddeloh, J., Montagna, C., Golden, A., et al. (2015). RNA:DNA hybrids in the human genome have distinctive nucleotide characteristics, chromatin composition, and transcriptional relationships. Epigenetics Chromatin 8, 46.
Nilsen, T.W., and Graveley, B.R. (2010). Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457–463.
Parenteau, J., Durand, M., Veronneau, S., Lacombe, A.A., Morin, G., Guerin, V., Cecez, B., Gervais-Bird, J., Koh, C.S., Brunelle, D., et al. (2008). Deletion of many yeast introns reveals a minority of genes that require splicing for function. Mol. Biol. Cell 19, 1932–1941.
Parenteau, J., Durand, M., Morin, G., Gagnon, J., Lucier, J.F., Wellinger, R.J., Chabot, B., and Elela, S.A. (2011). Introns within ribosomal protein genes regulate the production and function of yeast ribosomes. Cell 147, 320–331.
Paulsen, R.D., Soni, D.V., Wollman, R., Hahn, A.T., Yee, M.C., Guan, A., Hesley, J.A., Miller, S.C., Cromwell, E.F., Solow-Cordero, D.E., et al. (2009). A genome-wide siRNA screen reveals diverse cellular processes and pathways that mediate genome stability. Mol. Cell 35, 228–239.
Prado, F., and Aguilera, A. (1995). Role of reciprocal exchange, one-ended invasion crossover and single-strand annealing on inverted and direct repeat recombination in yeast: different requirements for the RAD1, RAD10, and RAD52 genes. Genetics 139, 109–123.
Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.
Roy, D., and Lieber, M.R. (2009). G clustering is important for the initiation of transcription-induced R-loops in vitro, whereas high G density without clustering is sufficient thereafter. Mol. Cell. Biol. 29, 3124–3133.
Ryu, J.Y., Kim, H.U., and Lee, S.Y. (2015). Human genes with a greater number of transcript variants tend to show biological features of housekeeping and essential genes. Mol. Biosyst. 11, 2798–2807.
Santos-Pereira, J.M., and Aguilera, A. (2015). R loops: new modulators of genome dynamics and function. Nat. Rev. Genet. 16, 583–597.
Santos-Pereira, J.M., Herrero, A.B., García-Rubio, M.L., Marín, A., Moreno, S., and Aguilera, A. (2013). The Npl3 hnRNP prevents R-loop-mediated transcription-replication conflicts and genome instability. Genes Dev. 27, 2445–2458.
Sollier, J., and Cimprich, K.A. (2015). Breaking bad: R-loops and genome integrity. Trends Cell Biol. 25, 514–522.
Stirling, P.C., Chan, Y.A., Minaker, S.W., Aristizabal, M.J., Barrett, I., Sipahimalani, P., Kobor, M.S., and Hieter, P. (2012). R-loop-mediated genome instability in mRNA cleavage and polyadenylation mutants. Genes Dev. 26, 163–175.



Thomas, B.J., and Rothstein, R. (1989). Elevated recombination rates in transcriptionally active DNA. Cell 56, 619-630.
Wahba, L., Amon, J.D., Koshland, D., and Vuica-Ross, M. (2011). RNase H and multiple RNA biogenesis factors cooperate to prevent RNA:DNA hybrids from generating genome instability. Mol. Cell 44, 978-988.
Wahba, L., Costantino, L., Tan, F.J., Zimmer, A., and Koshland, D. (2016). S1-DRIP-seq identifies high expression and polyA tracts as major contributors to R-loop formation. Genes Dev. 30, 1327-1338.
Xu, Z., Wei, W., Gagneur, J., Perocchi, F., Clauder-Münster, S., Camblong, J., Guffanti, E., Stutz, F., Huber, W., and Steinmetz, L.M. (2009). Bidirectional promoters generate pervasive transcription in yeast. Nature 457, 1033-1037.
Yu, K., Roy, D., Huang, F.T., and Lieber, M.R. (2006). Detection and structural analysis of R-loops. Methods Enzymol. 409, 316-329.
Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., and Liu, X.S. (2008). Model-based analysis of ChIP-seq (MACS). Genome Biol. 9, R137.



STAR★METHODS

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
anti DNA:RNA hybrids (S9.6) | Kerafast | Cat# ENH001
anti-dsDNA (HYB331-01) | Santa Cruz | Cat# sc-58749; RRID: AB_783088
anti-Rpl5 | (Delaveau et al., 2016) | N/A
anti-Rpl3 | DSHB | ScRPL3; RRID: AB_1553774
anti-Rpb1 (8WG16) | BioLegend | Cat# 920102
Peroxidase AffiniPure Goat Anti-Mouse IgG+IgM (H+L) | Jackson ImmunoResearch | Cat# 115-035-068; RRID: AB_2338505
Chemicals, Peptides, and Recombinant Proteins
RNase H | Sigma | Cat# 10786357001
RQ1 RNase-free DNase | Promega | Cat# M6101
FastDigest restriction enzymes | Thermo Scientific | Cat# FD0274, FD0504, FD0684, FD0774, FD0933
Protein G Sepharose Fast Flow | GE Healthcare | Cat# 17061801
IgG Sepharose Fast Flow | GE Healthcare | Cat# 17096901
Critical Commercial Assays
LightCycler 480 SYBR Green I Master | Roche | Cat# 04887352001
Riboprobe System T7 | Promega | Cat# P1440
Deposited Data
DRIP-seq in rnh1 rnh201 yeast cells (Wahba et al., 2016) | SRA | SRA: SRP071346
DRIP-chip in wt yeast cells (Chan et al., 2014a) | Array Express | ArrayExpress: E-MTAB-2388
transcriptome in wt yeast cells (Xu et al., 2009) | Array Express | ArrayExpress: E-TABM-590
γ-H2A ChIP-chip in wt yeast cells (Stirling et al., 2012) | N/A | N/A
γ-H2A ChIP-chip in wt yeast cells (Capra et al., 2010) | N/A | N/A
DRIP-seq in human fibroblasts (Ginno et al., 2013) | SRA | SRA: SRA048940.1
DRIP-seq in NTERA2 human cells (Lim et al., 2015) | GEO | GEO: GSE57353
RDIP-seq in HEK293 human cells (Nadel et al., 2015) | GEO | GEO: GSE68953
RDIP-seq in IMR90 human cells (Nadel et al., 2015) | GEO | GEO: GSE68953
RNA-seq in human fibroblasts (Lim et al., 2015) | GEO | GEO: GSE57353
RNA-seq in NTERA2 human cells | SRA | SRA: SRX359337
GRO-seq in HEK293 human cells | GEO | GEO: GSM1249869 and GSM1249874
GRO-seq in IMR90 human cells | GEO | GEO: GSM1055806
γ-H2AX ChIP-seq from HEK293 cells (Bunch et al., 2015) | GEO | GEO: GSE75170
Raw images | This paper | http://dx.doi.org/10.17632/b3f8k56vjs.1
Experimental Models: Organisms/Strains
S. cerevisiae: wt (RPL7 RPL20) | (Parenteau et al., 2011) | JPY10H3
S. cerevisiae: rpl7AΔi rpl7BΔi | (Parenteau et al., 2011) | JPY172G2
S. cerevisiae: wt | (Bonnet et al., 2015) | BY4742
S. cerevisiae: mft1::KanMX | (Bonnet et al., 2015) | Y10508
S. cerevisiae: hpr1::KanMX | (Bonnet et al., 2015) | Y14072
S. cerevisiae: sus1::KanMX | (Bonnet et al., 2015) | YV1542
S. cerevisiae: sen1-1::KanMX | (Li et al., 2011) | Y12015
S. cerevisiae: YHC1-TAP::His3MX | (Ghaemmaghami et al., 2003) | U1-TAP
S. cerevisiae: LEA1-TAP::His3MX | (Ghaemmaghami et al., 2003) | U2-TAP
C. glabrata: wt | (Kitada et al., 1995) | ΔHTU
C. glabrata: hpr1::TRP1 | This study | ΔCghpr1
S. pombe: wt | A gift from M. Rougemaille | PR167
S. pombe: KanMX::P81nmt1-THOC1 (SpHPR1) | A gift from M. Rougemaille | PR327
C. neoformans: wt | (Goebels et al., 2013) | JEC32
C. neoformans: hpr1::NEO | This study | NE1205
Oligonucleotides
GATCTGCCTTGTATTGTCTTCG | SIGMA | RPL7A-F
TCAGCGGCCATTGTGATCTT | SIGMA | RPL7A-R
TTGTAGAGTAAGTAGACCCATA | SIGMA | RPL7B-F
TCAGTGGACATTATGACGTTGA | SIGMA | RPL7B-R
GATTGCCGGTGGTAAGAAGA | SIGMA | YEF3-F
CGTAAGCATCACCCAATTCC | SIGMA | YEF3-R
GAAACCACGAAAAGTTCACCA | SIGMA | intergenic-F
AGCTTCTGCAAACCTCATTTG | SIGMA | intergenic-R
CCTTATACATTAGGTCCTTTGTAGCAT | SIGMA | GAL1p-F
GATCCGGTCATTATTAATTTAG | SIGMA | GAL1p-R
ACTGCAGGACACGCTCAAC | SIGMA | YAT1-5′-F
GTTTTCTGCGGAGAGCACAG | SIGMA | YAT1-5′-R
CATTGAACGTGCAGACAGG | SIGMA | YAT1-mid-F
CGGTGTGTTCGAAGTTGATG | SIGMA | YAT1-mid-R
TCTGTGGTGGTGTCCTCAAG | SIGMA | YAT1-3′-F
CTTGCTGCCGTTTGAAGATG | SIGMA | YAT1-3′-R
ACCAAAGCTCCGGAACTAGA | SIGMA | LYS2-3′-F
AGACCAATCAACACCTGTCCA | SIGMA | LYS2-3′-R
ACGTTACCCAATTGAACACG | SIGMA | ACT1-F
AGAACAGGGTGTTCTTCTGG | SIGMA | ACT1-R
CTAAGTCTCATGTACTAACATCGATTGCTTC | SIGMA | ACT1intron-F
ACCGGCAAAACCGGCTTTACACATAC | SIGMA | ACT1intron-R
Recombinant DNA
pRS316-L CEN / URA3 / leu2Δ3′-leu2Δ5′ | (Prado and Aguilera, 1995) | N/A
p-GAL-YAT1 CEN / URA3 / GAL1promoter-YAT1 | (Bonnet et al., 2015) | N/A
p-GAL-[RPL51A*intron]-YAT1 CEN / URA3 / GAL1promoter-[RPL51A*intron]-YAT1 | (Bonnet et al., 2015) | N/A
p-GAL-[RPL35Aintron]-YAT1 CEN / URA3 / GAL1promoter-[RPL35Aintron]-YAT1 | this study | N/A
p-GAL-[SEC27intron]-YAT1 CEN / URA3 / GAL1promoter-[SEC27intron]-YAT1 | this study | N/A
p-GAL-[RPL4Aexon]-YAT1 CEN / URA3 / GAL1promoter-[RPL4Aexon]-YAT1 | this study | N/A
p-GAL-LYS2 CEN / URA3 / GAL1promoter-LYS2 | this study | N/A
p-GAL-[RPL51A*intron]-LYS2 CEN / URA3 / GAL1promoter-[RPL51A*intron]-LYS2 | this study | N/A
p-GAL-[RPL51A*intron(5II3I)]-YAT1 CEN / URA3 / GAL1promoter-[RPL51A*intron carrying mutations within the 5′ SS and the branchpoint]-YAT1 | this study | N/A
p-GAL-[RPL51A*intron(Δ5)]-YAT1 CEN / URA3 / GAL1promoter-[RPL51A*intron carrying a deletion of the 5′ SS]-YAT1 | this study | N/A
p-GAL-[RPL51A*intron(3′ss*)]-YAT1 CEN / URA3 / GAL1promoter-[RPL51A*intron carrying mutations within the 3′ SS]-YAT1 | this study | N/A
p-GAL-[tGpIintron]-YAT1 CEN / URA3 / GAL1promoter-[Tetrahymena GroupI intron]-YAT1 | this study | N/A
YCplac111-MS2-GFP CEN / LEU2 / GPDpromoter-NLS-HA-(MS2-CP)-GFP | a gift from E. Bertrand | N/A
p-GAL-[MS2-loop]x2-YAT1 CEN / URA3 / GAL1promoter-[MS2-loop]x2-YAT1 | this study | N/A
p-GAL-[MS2-loop]x6-YAT1 CEN / URA3 / GAL1promoter-[MS2-loop]x6-YAT1 | this study | N/A
p-L-GAL-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL51A*intron]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL51A*intron]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL35Aintron]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL35Aintron]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[SEC27intron]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[SEC27intron]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL4Aexon]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL4Aexon]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-LYS2 CEN / URA3 / leu2Δ3′-GAL1promoter-LYS2-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL51A*intron]-LYS2 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL51A*intron]-LYS2-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL51A*intron(5II3I)]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL51A*intron(5II3I)]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL51A*intron(Δ5)]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL51A*intron(Δ5)]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[RPL51A*intron(3′ss*)]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[RPL51A*intron(3′ss*)]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[tGpIintron]-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[Tetrahymena GroupI intron]-YAT1-leu2Δ5′ | this study | N/A
p-L-GAL-[MS2-loop]x6-YAT1 CEN / URA3 / leu2Δ3′-GAL1promoter-[MS2-loop]x6-YAT1-leu2Δ5′ | this study | N/A
pRS423-GPD-hsRNH1 2μ / HIS3 / GPDpromoter-hsRNH1 | this study | N/A
pCR4-T7-YAT1 T7promoter-YAT1 | this study | N/A
pCR4-T7-[RPL51A*intron]-YAT1 T7promoter-[RPL51A*intron]-YAT1 | this study | N/A
pCR4-T7-[RPL51A*intron(5II3I)]-YAT1 T7promoter-[RPL51A*intron(5II3I)]-YAT1 | this study | N/A
pCR4-T7-[RPL51A*intron(Δ5)]-YAT1 T7promoter-[RPL51A*intron(Δ5)]-YAT1 | this study | N/A
pCR4-T7-[RPL51A*intron(3′ss*)]-YAT1 T7promoter-[RPL51A*intron(3′ss*)]-YAT1 | this study | N/A
pCR4-T7-[tGpIintron]-YAT1 T7promoter-[Tetrahymena GroupI intron]-YAT1 | this study | N/A
Software and Algorithms
Bowtie | (Langmead et al., 2009) | N/A
MACS2 | (Zhang et al., 2008) | N/A
Kallisto | (Bray et al., 2016) | N/A
BEDtools | (Quinlan and Hall, 2010) | N/A
SAMtools | (Li et al., 2009) | N/A
Other
NucleoSpin RNA II | Macherey-Nagel | Cat# 740955



CONTACT FOR REAGENT AND RESOURCE SHARING

Further information and requests for reagents may be directed to and will be fulfilled by the Lead Contact, Benoit Palancade
(benoit.palancade@ijm.fr).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Yeast Strains and Growth


Yeast strains of the distinct species used in this study are listed in the Key Resources Table. Yeast cultures were grown according
to standard procedures. Growth assays were performed by spotting serial dilutions of exponentially growing cells on solid YPD
medium and incubating the plates at the indicated temperatures.
S. cerevisiae - All the S. cerevisiae strains are isogenic to S288c and were grown at 30°C in standard yeast extract peptone
dextrose (YPD). For GAL1 promoter induction or repression, 2% galactose or 2% glucose, respectively, was added for 5 hr to cells
grown in glycerol-lactate (GGL: 0.17% YNB, 0.5% ammonium sulfate, 0.05% glucose, 2% lactate and 2% glycerol) supplemented
with the required nutrients.
C. glabrata - The Cagl0b01617g ORF was unambiguously identified as the closest ScHPR1 relative in the Candida glabrata
genome and renamed CgHPR1. A gene replacement cassette was amplified from pFA6a-TRP1 using recombinogenic
primers harboring 80 bases of homology upstream and downstream of the CgHPR1 ORF and transformed into the previously described
C. glabrata ΔHTU strain (Kitada et al., 1995). Correct integration of the cassette was verified by PCR and cells were grown in YPD at
30°C for phenotypic analyses.
S. pombe - Because the SPCP25A2.03/tho1 gene, which encodes the closest Hpr1 ortholog in Schizosaccharomyces pombe, is
essential (Kim et al., 2010), SpHPR1 was inactivated by integrating a P81nmt1 thiamin-repressible promoter (Bähler
et al., 1998) upstream of its coding sequence. The corresponding Pnmt1-SpHPR1 and its parental wild-type strains were kindly provided
by M. Rougemaille (Institut Jacques Monod, Paris, France). For DNA:RNA hybrid analyses, cells were grown for 64 hr in standard
Edinburgh minimal medium supplemented with the required nutrients in the presence of 20 μg/mL thiamin.
C. neoformans - The AFR96461/CNAG_03235/THOC1 gene was annotated as the closest human Hpr1 relative in the Cryptococcus
neoformans genome and named CnHPR1 in this study. CnHPR1 inactivation was performed in the parental strain
JEC32 by biolistic transformation using a geneticin resistance marker and verified as previously described
(Goebels et al., 2013). Cells were then grown in YPD at 30°C for phenotypic analyses.

Plasmids
Plasmids used in this study are listed in the Key Resources Table and were constructed using PCR-based molecular cloning
techniques. Introns from RPL35A and SEC27, as well as an exonic sequence from RPL4A (1-70), were inserted downstream of the ATG
within p-GAL-YAT1. The complete LYS2 CDS was subcloned downstream of the GAL1 promoter to generate p-GAL-LYS2. Plasmids
of the p-L-GAL series (used for hyper-recombination assays) were obtained by subcloning the different GAL1-promoter(+ATG)-YAT1
cassettes from plasmids of the p-GAL series between the two leu2 repeats of pRS316-L, which share 600 bp of homology. Previously
described intron mutations (5II3I and Δ5; Lacadie and Rosbash, 2005) were introduced into p-GAL-[RPL51A*intron]-YAT1 by PCR-based
mutagenesis. The 3′ SS was mutated by introducing an AG→Ac substitution (Alexander et al., 2010) at the end of the
RPL51A*intron, together with a G→c substitution at a secondary 3′ SS mapped 9 nt downstream. The Tetrahymena GroupI
self-splicing intron sequence was amplified from ptGpI-CUP1 (Chalamcharla et al., 2010) and subcloned between the ATG and the
YAT1 CDS within p-GAL-YAT1. MS2-loop coding sequences were amplified from pIIIA/MS2-1 (a gift from M. Wickens) and inserted
between the ATG and the YAT1 CDS within p-GAL-YAT1. For in vitro transcription, the YAT1 CDS, flanked where applicable by wt or mutant
RPL51A* introns, was subcloned downstream of the T7 promoter within the pCR4-TOPO vector (Lifetech). For RNH1 overexpression, a
GPDpromoter-hsRNH1 cassette from p425-GPD-hsRNH1 (Wahba et al., 2011) was subcloned into pRS423.

METHOD DETAILS

DNA:RNA hybrid detection


DNA:RNA hybrid immunoprecipitation (DRIP) was performed using the S9.6 DNA:RNA hybrid-specific monoclonal antibody
according to a published procedure (Mischo et al., 2011; Wahba et al., 2016), with the following modifications. Briefly, genomic
DNA was phenol-extracted from exponentially growing yeast cells and isolated by ethanol precipitation. 50 μg of purified DNA
were digested by a cocktail of restriction enzymes (EcoRI, HindIII, XbaI, SspI, BsrGI; FastDigest enzymes; Thermo Scientific) for
30 min at 37°C in a total volume of 100 μL, conditions in which complete digestion was observed. Digested DNA samples were further
diluted 4-fold with FA1 buffer (0.1% SDS, 1% Triton, 10 mM HEPES pH 7.5, 0.1% sodium deoxycholate, 275 mM NaCl) and incubated
for 16 hr in the presence of 1.5 μg of S9.6 purified antibody (Kerafast). Immunoprecipitated DNA fragments were further
captured on protein G Sepharose beads (GE Healthcare), washed and purified according to standard ChIP procedures (Bonnet
et al., 2015). Input and immunoprecipitated DNA amounts were quantified by real-time PCR with a LightCycler 480 system (Roche)
using SYBR Green incorporation according to the manufacturer's instructions. The amount of DNA in the immunoprecipitated



fraction was divided by the amount detected in the input to evaluate the percentage of immunoprecipitation (% of IP). Specificity of
the DRIP signal was determined by including 10 units of RNase H (Sigma) in the digestion reaction. Free RNA removal did not enhance
the specificity of our DRIP assay, in agreement with previous reports (Wahba et al., 2016; Halasz et al., 2017).
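The % of IP calculation and the RNase H specificity control described above can be sketched as follows. This is a minimal illustration only: the function names and the example values are hypothetical, and it assumes that input and immunoprecipitated DNA amounts are expressed in the same units after correcting for any dilution.

```python
def percent_ip(ip_amount, input_amount):
    """Return the % of IP: immunoprecipitated DNA as a percentage of input DNA.

    Both amounts are assumed to be qPCR-derived quantities in the same units.
    """
    if input_amount <= 0:
        raise ValueError("input amount must be positive")
    return 100.0 * ip_amount / input_amount


def rnase_h_sensitive_fraction(untreated_pct, rnase_h_pct):
    """Fraction of the DRIP signal lost upon RNase H treatment.

    Hypothetical helper: a value close to 1 suggests a hybrid-specific signal.
    """
    if untreated_pct <= 0:
        raise ValueError("untreated % of IP must be positive")
    return 1.0 - rnase_h_pct / untreated_pct


# Made-up example values, for illustration only.
untreated = percent_ip(ip_amount=0.02, input_amount=1.0)
treated = percent_ip(ip_amount=0.002, input_amount=1.0)
print(untreated, treated, rnase_h_sensitive_fraction(untreated, treated))
```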
Dot-blot detection of DNA:RNA hybrids was performed as previously described (Wahba et al., 2016). Decreasing amounts of
genomic DNA (prepared as above) were adsorbed onto nylon membranes (Hybond-N+, GE Healthcare) which were incubated
with the following mouse antibodies: anti DNA:RNA hybrids (S9.6; 1:4,000 in TBS, 0.5% Tween-20, 5% skimmed milk); anti
double-stranded DNA (HYB331-01; 1:1,000 in TBS, 0.5% Tween-20, 5% BSA). Following incubation with anti-mouse
peroxidase-conjugated antibodies, membranes were imaged on a LAS4000 Imager (Fuji). The DNA:RNA hybrid and total DNA amounts
were determined using serial dilutions of a reference sample as a standard. For each species, specificity of the dot-blot signals
was systematically confirmed by treating the DNA extracts with RNase H (Roche) or RNase-free DNase (Promega) prior to blotting.
For in vitro assays, DNA:RNA hybrids were formed by transcription of the YAT1 sequence placed under the control of the T7
promoter using the Riboprobe T7 in vitro transcription system (Promega). Free RNA degradation, RNase H treatment and subsequent
plasmid isolation were performed as previously reported (Yu et al., 2006). In vitro transcribed plasmids were either directly used for
DNA:RNA hybrid immunoprecipitation as above, or resolved on 1% stain-free low-melting agarose gels, and further stained with
ethidium bromide. For dot-blotting of in vitro formed R-loops, in vitro transcribed plasmids were cut from the agarose gel following
electrophoresis, solubilized and adsorbed onto nylon membranes, as above.

Gene expression and splicing analyses


Total RNAs were directly extracted from yeast cells disrupted by bead beating. For nascent mRNA analysis, chromatin pellets were
isolated using a described procedure (Carrillo Oesterreich et al., 2010) and checked for their protein content by western blot analysis
using the following antibodies: anti-Rpl1 (Delaveau et al., 2016), anti-Rpl3 (DSHB) and anti-Rpb1 (BioLegend). Total or
chromatin-associated mRNAs were then purified using the NucleoSpin RNA II kit (Macherey-Nagel) and reverse-transcribed with
SuperScript II reverse transcriptase (Invitrogen). cDNA quantification was achieved by real-time PCR using primers as described in the
Key Resources Table. YAT1-specific primers detected both the unspliced and spliced versions of the reverse-transcribed mRNAs.
ACT1-specific primers were previously described (Alexander et al., 2010). The amounts of the mRNAs of interest were normalized
relative to ACT1 mRNA values (that do not vary between the different strains and growth conditions used in this study), and further
set to 1 for wt cells carrying the intronless GAL-YAT1 reporter. Unless indicated, values obtained for the YAT1-3′ amplicon were
displayed.
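The two-step normalization described above (first to ACT1, then to the wt intronless GAL-YAT1 reference) can be illustrated with a short sketch. The function name and the numbers are hypothetical, and the inputs are assumed to be qPCR-derived relative cDNA amounts rather than raw Ct values.

```python
def relative_expression(target, act1, ref_target, ref_act1):
    """Normalize a target mRNA amount to ACT1, then to a reference sample.

    target, act1: amounts measured in the sample of interest.
    ref_target, ref_act1: amounts measured in the reference sample
    (here assumed to be wt cells carrying the intronless GAL-YAT1 reporter).
    The reference sample itself therefore returns 1.0.
    """
    if act1 <= 0 or ref_act1 <= 0 or ref_target <= 0:
        raise ValueError("amounts used as denominators must be positive")
    sample_ratio = target / act1          # ACT1-normalized level in the sample
    reference_ratio = ref_target / ref_act1  # ACT1-normalized level in the reference
    return sample_ratio / reference_ratio


# Made-up amounts, for illustration only.
print(relative_expression(target=50.0, act1=100.0, ref_target=25.0, ref_act1=100.0))
```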
RNA polymerase II distribution along the genes of interest was determined by ChIP using anti-Pol II largest subunit antibodies (anti-
Rpb1, BioLegend) as previously reported (Bonnet et al., 2015). Input and immunoprecipitated DNA amounts were further quantified
by real-time PCR as above. Spliceosome recruitment was evaluated by ChIP as above except that IgG Sepharose beads (GE Health-
care) were used to capture TAP-tagged U1snRNP or U2snRNP subunits. Splicing efficiencies of the reporter constructs were deter-
mined by semiquantitative PCR using the following intron-flanking primers detecting both unspliced and spliced reverse-transcribed
mRNA variants: CCTTATACATTAGGTCCTTTGTAGCAT (forward) and GTTTTCTGCGGAGAGCACAG (reverse).

Bioinformatic analyses of yeast datasets


A dataset of R-loop-prone sites, obtained by DRIP-seq in the rnh1Δ rnh201Δ strain (Wahba et al., 2016), was used to compare hybrid
formation between intronless and intron-containing loci. DNA:RNA hybrid occurrence was determined by intersecting this dataset
with the list of wild-type sacCer3 transcripts (Xu et al., 2009), including intron-containing genes as defined in the Saccharomyces
Genome Database (http://www.yeastgenome.org), using the BEDtools package (Quinlan and Hall, 2010). Only the coding transcripts
corresponding to a single ORF were considered in the analysis. Hybrid-positive and hybrid-negative genes were further divided into
five categories according to their expression levels (transcriptional frequency, mRNAs/hour; Holstege et al., 1998), as previously
described (Chan et al., 2014a; Stirling et al., 2012). For each gene, the number of reads associated with each overlapping DNA:RNA
hybrid-enriched region was normalized to the total number of reads per run, further weighted by the length of the overlap divided by
the length of the full transcript, and summed to define the R-loop density. The mean R-loop density was averaged from 4 independent
sequencing runs. Alternatively, we used a dataset of R-loop-prone sites obtained by DRIP-chip in wt cells (Chan et al., 2014a), in
which hybrid-enriched regions are characterized by an occupancy score. For each transcript, the scores of each overlapping
hybrid-enriched region were further weighted by their length divided by the length of the full transcript, and summed to define
the hybrid density. These density values were further averaged from the 2 replicates. The dataset of R-loop-positive transcripts
was further intersected with two distinct γ-H2A ChIP-chip datasets (Capra et al., 2010; Stirling et al., 2012), in which γ-H2A-enriched
regions are characterized by an occupancy score of γ-H2A. Transcripts overlapping γ-H2A-enriched regions in both ChIP-chip
replicates were considered as γ-sites-positive. For each transcript, γ-H2A densities were calculated using the same rationale as used
above for hybrid densities. Genomic features were assessed across the entire length of each gene according to: percentage of G
and C; (G-C)/(G+C) for GC skew; (A-T)/(A+T) for AT skew.
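The R-loop density defined above for the DRIP-seq dataset can be sketched in a few lines of Python. This is an illustrative reading of the text, not the scripts used in the study: the function names, the tuple layouts and the example coordinates are hypothetical, and intervals are assumed to be half-open (BED-like).

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Length of the overlap between two half-open intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def rloop_density(transcript, peaks, total_reads):
    """R-loop density of one transcript in one sequencing run.

    transcript: (start, end) of the transcript.
    peaks: iterable of (start, end, reads) for hybrid-enriched regions.
    total_reads: total number of reads in the run.
    Each overlapping region contributes its read count normalized to the
    run total, weighted by overlap length / transcript length.
    """
    t_start, t_end = transcript
    t_len = t_end - t_start
    density = 0.0
    for p_start, p_end, reads in peaks:
        ov = overlap_length(t_start, t_end, p_start, p_end)
        if ov > 0:
            density += (reads / total_reads) * (ov / t_len)
    return density


def mean_rloop_density(transcript, runs):
    """Average the density over runs; runs is a list of (peaks, total_reads)."""
    values = [rloop_density(transcript, peaks, total) for peaks, total in runs]
    return sum(values) / len(values)


# Made-up coordinates and read counts, for illustration only.
gene = (100, 1100)
run1 = ([(0, 600, 200), (900, 1000, 50), (2000, 2100, 10)], 1000)
run2 = ([], 1000)
print(rloop_density(gene, *run1), mean_rloop_density(gene, [run1, run2]))
```

The same weighting could be applied to the DRIP-chip and γ-H2A datasets by substituting the occupancy score of each region for the normalized read count.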

Bioinformatic analyses of human datasets


Genome-wide R-loop, transcriptome and γ-H2AX profiling was assessed using publicly available data, including DRIP-seq data from
human fibroblasts and NTERA2 cells (Ginno et al., 2013; Lim et al., 2015); RDIP-seq data from HEK293 and IMR90 cells (Nadel et al.,



2015); RNA-seq data from human fibroblasts (Lim et al., 2015) and NTERA2 cells (SRA: SRX359337); GRO-seq (global run-on) data
from HEK293 (GEO: GSM1249869 and GSM1249874) and IMR90 cells (GEO: GSM1055806); γ-H2AX ChIP-seq from HEK293 cells
(Bunch et al., 2015). The quality of high-throughput sequencing data was assessed with FastQC
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc). DRIP-seq and RDIP-seq reads were aligned to the reference human genome (GRCh38/hg38
assembly) with Bowtie (Langmead et al., 2009) and filtered for uniquely aligned reads. Enriched regions were identified using
MACS2 (Zhang et al., 2008), with a false-discovery rate of 0.05 and an absolute change in density higher than 2-fold relative to
the control sample (digested input DNA). R-loops overlapping with signal artifact blacklisted regions were removed (ENCODE Project
Consortium, 2012). Finally, R-loops were assigned to annotated genes from GenCode (GRCh38, release 23) (Harrow et al., 2012)
previously merged into a single model transcript per gene. R-loop density was estimated using only read counts overlapping
significantly enriched regions and normalized as reads per kilobase per million mapped reads (RPKMs). RNA-seq and GRO-seq sequence
reads were pseudo-mapped to the human transcriptome (GRCh38/hg38) using Kallisto (Bray et al., 2016) and gene expression levels
were estimated as transcripts per million (TPMs). R-loop-positive, transcriptionally active genes (expression higher than the 25th
percentile) were split into four equally sized groups: low (L), medium (M), high (H) and very high (VH) expression. To obtain intronless
and intron-containing gene groups with equivalent expression levels, genes with TPMs higher than 32 were removed from the
analysis. For each R-loop-positive gene, the γ-H2AX enrichment was further determined as the fold-change in the γ-H2AX signal as
compared to the control input. GC features were assessed for each gene co-oriented with transcription using 50 bp sliding windows
of 1 bp step size according to: percentage of G and C; (G-C)/(G+C) for GC skew. A set of in-house scripts for data processing and
graphical visualization was written in Bash and in the R language and environment (http://www.R-project.org). SAMtools (Li et al., 2009)
and BEDtools (Quinlan and Hall, 2010) were used for alignment manipulation, filtering steps, file format conversion and comparison of
genomic features.
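The sliding-window GC features described above can be illustrated with a minimal Python sketch. The in-house scripts were written in Bash and R, so this is only a hypothetical re-expression of the stated formulas; the function names are assumptions, and the input sequence is assumed to be given in the orientation of transcription (coding strand, 5′ to 3′).

```python
def gc_features(seq):
    """Return (GC percentage, GC skew) for a DNA sequence.

    GC skew is (G - C) / (G + C); it is reported as 0.0 when the
    sequence contains neither G nor C.
    """
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    gc_pct = 100.0 * (g + c) / len(seq) if seq else 0.0
    skew = (g - c) / (g + c) if (g + c) else 0.0
    return gc_pct, skew


def sliding_gc(seq, window=50, step=1):
    """Compute gc_features over sliding windows (default 50 bp, 1 bp step).

    Sequences shorter than the window yield an empty list.
    """
    return [
        gc_features(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Made-up sequence, for illustration only.
profile = sliding_gc("G" * 50 + "C")
print(len(profile), profile[0], profile[1])
```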

QUANTIFICATION AND STATISTICAL ANALYSIS

Hyper-recombination rates were defined as the mean of 3 experiments, each one performed with six independent colonies: cells
were plated on SC medium lacking leucine to estimate the number of Leu+ recombinants, while cell survival was estimated following
plating on SC medium. For statistics, n values correspond to the number of biological replicates (e.g., independent yeast cultures).
The following statistical tests were used: Fisher exact test (to compare R-loop/γ-sites occurrences between intronless and
intron-containing gene populations; Figures 1 and S1); bootstrapping method with 100,000 replications of the sample mean (to compare
R-loop/γ-sites densities of intronless and intron-containing datasets; Figures 1 and 7); Mann-Whitney-Wilcoxon with FDR-adjusted
p values (to compare R-loop densities and genomic features of intronless and intron-containing datasets; Figures 1, S1, and S7);
two-sided Welch's t test, allowing unequal variance (other figures). *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001; ns, not significant.
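One way to read the bootstrapping comparison of sample means is sketched below. The text does not specify the resampling scheme, so the design here is an assumption: samples of the same size as the group of interest are drawn with replacement from a reference population, and a one-sided p value is estimated from how often a resampled mean is at most the observed mean. The function name, the add-one correction and the example values are all hypothetical.

```python
import random


def bootstrap_mean_pvalue(reference, group, n_rep=100_000, seed=0):
    """One-sided bootstrap p value for mean(group) being lower than expected.

    reference: values of the reference population (e.g., intronless genes).
    group: values of the group of interest (e.g., intron-containing genes).
    Draws n_rep samples of len(group) from reference with replacement and
    counts resampled means <= the observed group mean. An add-one
    correction keeps the estimate above zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = len(group)
    observed = sum(group) / k
    hits = 0
    for _ in range(n_rep):
        resampled = rng.choices(reference, k=k)
        if sum(resampled) / k <= observed:
            hits += 1
    return (hits + 1) / (n_rep + 1)


# Made-up densities, for illustration only.
p = bootstrap_mean_pvalue([10, 11, 12, 13], [1, 2, 3], n_rep=999)
print(p)
```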

DATA AND SOFTWARE AVAILABILITY

The raw images have been deposited in Mendeley Data and are available at http://dx.doi.org/10.17632/b3f8k56vjs.1.
