Professional Documents
Culture Documents
Whole Transcriptome RNAseq
Whole Transcriptome RNAseq
Received December 28, 2010; Revised June 10, 2011; Accepted June 15, 2011
*To whom correspondence should be addressed. Tel: +1 650 8122002; Fax: +1 650 8122745; Email: pourmand@soe.ucsc.edu
starts with total RNA that has been depleted of rRNA by MATERIALS AND METHODS
using a set of oligos that bind to rRNA (6), and the second Samples and tissues
method selects for transcript by isolating poly-A RNA as
the starting material for the construction of Balb/C male mice were purchased from Harlan
whole-transcriptome libraries (7). Laboratories Inc. (Livermore, USA). The mice were
The NuGEN OvationÕ RNA-seq system is an RNA- irradiated with charge particle radiation after proper
based single primer, isothermal amplification (SPIA) tech- resting of 2 days in the Loma Linda University
nology that is a highly sensitive RNA amplification for Radiation Facility (Loma Linda, CA, USA). Group 1
whole-transcriptome sequencing using minute amount of served as control (0 Gy). The mice in groups 2 were
total RNA, as described in detail in published literature exposed to 2.0 Gy from a proton source at a dose rate
(8). In this system, the mRNA is reverse transcribed to of 1Gev/45 s. The controls and irradiated mice were killed
synthesize the first-strand cDNA by using a combination by cervical decapitation and testis tissues were dissected
of random hexamers and poly-T chimeric primer. Then, out and immediately frozen in liquid nitrogen.
the RNA template is partially degraded in a heating step
and the second strand is synthesized along the first-strand
cDNA as template using DNA polymerase. The double- RNA isolation and purification
stranded DNA is purified and then amplified using SPIA. Total RNA was extracted with QIAzol Lysis Reagent
SPIA is a linear cDNA amplification process in which (Qiagen; Valencia, CA, USA) and then purified on
RNase H degrades RNA in DNA/RNA heteroduplex RNeasy spin columns (Qiagen) per manufacturer’s in-
cDNA synthesis (fragment size 100–500 bp). The shearing was done by son-
TM ication (Covaris model S1) with duty cycle 5, intensity 3
The fragmented mRNA samples (from both TruSeq
and cycle/burst 200 for 180 s according to manufacturer’s
poly-A and RiboMinusTM-based enrichment) were sub-
instructions. Following shearing, cDNA was
jected to cDNA synthesis using Illumina TruSeqTM
electrophoresed on 2.5% agarose gel and size-selected in
RNA sample preparation kit (Low-Throughput
the range of 100–200 bp for SOLiD and 250–350 bp for
protocol) according to manufacturer’s protocol. Briefly,
Illumina platforms. The cDNA fragments were then
cDNA was synthesized from enriched and fragmented
blunt-ended through an end-repair reaction and ligated
RNA using reverse transcriptase (Super-Script II) and
to platform-specific double-stranded bar-coded adapters
random primers. The cDNA was further converted into
using library preparation kits from New England
double stranded DNA using the reagents supplied in the
Biolabs (Ipswich, MA, USA). All the libraries were
kit, and the resulting dsDNA was used for library
prepared through a semi-automated procedure using
preparation.
NorDiag Magnatrix 8000 plus liquid-handling Robot
For comparison studies between mRNA enrichment
(NorDiag, Oslo, Norway). The end-repair, dA-tailing
methods and the OvationÕ RNA-Seq system, a single
(for Illumina-based libraries), ligation of platform-specific
100 ng total RNA sample (with technical replicate) was
adaptors and purification reactions required for library
processed for cDNA synthesis using the OvationÕ
preparation steps were done in an automated fashion,
RNA-Seq system (NuGEN Technologies, Inc.; San
while gel purification and library amplification (15
Carlos, CA, USA), and either DNase-treated using
cycles) were performed manually.
DNase mix from RecoverAllTM Total Nucleic Acid
randomly sampled from all samples. All subsequent mRNA enriched either by TruSeqTM poly-A selection
analysis was performed on this randomly sampled data. (128.2 million reads) or by RiboMinusTM (99.6 million
reads) as well as from total RNA using NuGEN-
Transcript abundance OvationÕ RNA-Seq System (17.1 million reads). From
Transcript abundance was determined from the TopHat these reads, 86.6 million (64.05%) TruSeqTM poly-A and
alignment using a custom perl script and annotated tran- 46.6 million (46.74%) RiboMinusTM reads and 10.59
scripts from RefSeq. RefSeq exons were considered to be million (60.59%) OvationÕ reads were mapped to the
detected if at least one read mapped within annotated mouse genome (Supplementary Table S1). In our hands,
exon boundaries. the TruSeqTM poly-A enriched sample generated the
highest mapping (68.94%) to known mouse transcripts
Differential expression and GC content bias (refSeq transcripts, 27582), far exceeding both
RiboMinusTM (36.64%) and OvationÕ RNA-Seq System
Differential expression was assessed using transcript abun- (30.22%), as shown by analysis of the exons category in
dances as inputs to DESeq (14). Differentially expressed Figure 1A. It is worth mentioning that more reads (5%)
transcripts were analyzed between SOLiD and Illumina were mapped to the exome from the total RNA sample,
sequenced data. 0 and 2 Gy mouse samples were compared which was treated with DNase before cDNA synthesis in
separately. The transcripts with an adjusted P > 0.05 were OvationÕ RNA-Seq System.
considered to be differentially expressed. Transcripts
called differentially expressed by DESeq (15) were sep- Basic sequencing data for eight mouse testis tissue samples
coefficient of replicates proved that this method per- due to priming bias during first-strand synthesis
formed equally as well as previously published protocols (Supplementary Figure S3).
(19). However, the correlation for replicates between the
two platforms was lower than the correlation between bio- Assessment of coverage at 50 - and 30 -ends of the
logical replicates using the same platform; this may be the transcripts
result of differences in the two sequencing technologies. We next assessed potential biases in transcript representa-
Spearman’s correlation heat map is shown in tion (coverage at 50 - and 30 -end) for the three methods.
Supplementary Figure S2. Poly-A tail cDNA synthesis methods are known to be
Evenness of transcript coverage prone to 30 bias of transcripts with respect to sequenc-
ing depth (10,23,24). Therefore, we determined the
We compared the evenness of transcript coverage to judge average coverage at each percentile of length from 50 - to
the efficiency of priming and cDNA synthesis between the 30 -end of the known transcripts to assess method bias
Illumina TruSeqTM RNA sample preparation kit and for transcript ends (25). The TruSeqTM poly-A-based
OvationÕ RNA-Seq System. In the Illumina TruSeqTM enrichment method showed significant 30 bias in compari-
RNA sample preparation kit, the first strand of cDNA son to both the RiboMinusTM and OvationÕ RNA-Seq
is synthesized using random primers and reverse tran- systems (Figure 4A). Our results also showed that there
scriptase (Super-Script II), whereas the OvationÕ is low 50 bias in the RiboMinusTM sample, in contrast
RNA-Seq System employs a combination of hexamers and to the data obtained for libraries prepared with the
poly-T chimeric primers with reverse transcriptase. To OvationÕ RNA-Seq system. The transcript coverage
Figure 3. Spearman’s correlation plots comparing mouse mRNA expression data of biological replicates (OvationÕ RNA-Seq system) using SOLiD
sequencer (number of reads in annotated transcripts). (A) Two biological replicates of 0Gy sample (mouse testis mRNA). (B) Two biological
replicates of 2Gy sample (mouse testis mRNA). (C) Combined abundances of two biological replicates (X-axis) and combined abundances of
two biological replicates (Y-axis) for 0 Gy mouse samples. (D) Combined reads two biological replicates (X-axis) and combined reads of two
biological replicates (Y-axis) for 2 Gy mouse samples.
Nucleic Acids Research, 2011 7
Figure 4. (A) Average percentile coverage across all transcripts showing significant 30 bias in the TruSeqTM poly-A selection method, slight 50 bias in
the RiboMinusTM rRNA depletion method, and slight 30 bias in the OvationÕ RNA-Seq system as assessed by single RNA samples of mouse testis
tissues. There was no noticeable difference for transcript coverage between libraries prepared from non-sheared and sheared cDNA for the OvationÕ
RNA-Seq system. (B) Transcripts with coverage at extreme ends (50 and 30 ) confirming 30 bias for TruSeqTM poly-A selection method, slight 50 bias
for RiboMinusTM and slight 30 bias for OvationÕ RNA-Seq system in single RNA sample of mouse testis tissues. Each line represents number of
highly expressed transcripts with coverage at extreme ends over the total number of highly expressed transcripts at a base-level resolution.
Figure 6. GC-based bias in next-generation sequencing: the densities of transcripts that were upregulated in SOLiD and Illumina are compared to
the density of all annotated transcripts. The dashed vertical line indicates the mean GC content for all transcripts. (A) GC content of the transcript is
plotted against the density of transcripts upregulated in both platforms for 0 Gy samples, (B) GC content of the transcript is plotted against the
density of transcripts upregulated in both platforms for 2 Gy samples.
15. Anders,S. and Huber,W. (2010) Differential expression analysis 28. Morse,A.M., Carballo,V., Baldwin,D.A., Taylor,C.G. and
for sequence count data. Genome Biol., 11, R106. McIntyre,L.M. (2010) Comparison between NuGEN’s
16. Cheng,J., Kapranov,P., Drenkow,J., Dike,S., Brubaker,S., WT-Ovation Pico and one-direct amplification systems.
Patel,S., Long,J., Stern,D., Tammana,H., Helt,G. et al. (2005) J. Biomol. Tech., 21, 141–147.
Transcriptional maps of 10 human chromosomes at 5-nucleotide 29. Rehrauer,H., Aquino,C., Gruissem,W., Henz,S.R., Hilson,P.,
resolution. Science, 308, 1149–1154. Laubinger,S., Naouar,N., Patrignani,A., Rombauts,S., Shu,H.
17. Mamanova,L., Andrews,R.M., James,K.D., Sheridan,E.M., et al. (2010) AGRONOMICS1: a new resource for Arabidopsis
Ellis,P.D., Langford,C.F., Ost,T.W.B., Collins,J.E. and transcriptome profiling. Plant Physiol., 152, 487–499.
Turner,D.J. (2010) FRT-seq: Amplification-free, strand-specific, 30. Head,S.R., Komori,H.K., Hart,G.T., Shimashita,J., Schaffer,L.,
transcriptome sequencing. Nat. Methods, 7, 130–132. Salomon,D.R. and Ordoukhanian,P.T. (2011) Method for
18. Croucher,N.J. (2009) A simple method for directional improved Illumina sequencing library preparation using NuGEN
transcriptome sequencing using Illumina technology. Ovation RNA-Seq System. Biotechniques, 50, 177–180.
Nucleic Acids Res., 37, e148. 31. Sengupta,S., Ruotti,V., Bolin,J., Elwell,A., Hernandez,A.,
19. Armour,C.D., Castle,J.C., Chen,R., Babak,T., Loerch,P., Thomson,J. and Stewart,R. (2010) Highly consistent, fully
Jackson,S., Shah,J.K., Dey,J., Rohl,C.A., Johnson,J.M. et al. representative mRNA-Seq libraries from ten nanograms of total
(2009) Digital transcriptome profiling using selective hexamer RNA. Biotechniques, 49, 898–904.
priming for cDNA synthesis. Nat. Methods, 6, 647–649. 32. Tian,G., Yin,X., Luo,H., Xu,X., Bolund,L. and Zhang,X. (2010)
20. Slomovic,S., Laufer,D., Geiger,D. and Schuster,G. (2006) Sequencing bias: comparison of different protocols of microRNA
Polyadenylation of ribosomal RNA in human cells. library construction. BMC Biotechnol., 10, 64.
Nucleic Acids Res., 34, 2966–2975. 33. Fehniger,T.A., Wylie,T., Germino,E., Leong,J.W., Magrini,V.J.,
21. Ozsolak,F., Platt,A.R., Jones,D.R., Reifenberger,J.G., Sass,L.E., Koul,S., Keppel,C.R., Schneider,S.E., Koboldt,D.C., Sullivan,R.P.
McInerney,P., Thompson,J.F., Bowers,J., Jarosz,M. and et al. (2010) Next-generation sequencing identifies the
Milos,P.M. (2009) Direct RNA sequencing. Nature, 461, 814–818. natural killer cell microRNA transcriptome. Genome Res., 20,