Professional Documents
Culture Documents
M E TH O D Open Access
Abstract
New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom
method estimates the mean-variance relationship of the log-counts, generates a precision weight for each
observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts
to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well or
better than count-based RNA-seq methods even when the data are generated according to the assumptions of the
earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.
© 2014 Law et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication
waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise
stated.
Law et al. Genome Biology 2014, 15:R29 Page 2 of 17
http://genomebiology.com/2014/15/2/R29
and this tends to limit both the performance and the larger standard deviations than small counts. While a log-
usefulness of the RNA-seq analysis methods. One prob- arithmic transformation counteracts this, it overdoes the
lem relates to error rate control with small sample sizes. adjustment somewhat so that large log-counts now have
Despite the use of probabilistic distributions, all the sta- smaller standard deviations than small log-counts. We
tistical methods developed for RNA-seq counts rely on explore the idea that it is more important to model the
approximations of various kinds. Many rely on the statis- mean-variance relationship correctly than it is to specify
tical tests that are only asymptotically valid or are theoret- the exact probabilistic distribution of the counts. There
ically accurate only when the dispersion is small. All the is a body of theory in the statistical literature showing
differential expression methods currently available based that correct modeling of the mean-variance relationship
on the NB distribution treat the estimated dispersions as inherent in a data generating process is the key to design-
if they were known parameters, without allowing for the ing statistically powerful methods of analysis [30]. Such
uncertainty of estimation, and this leads to statistical tests variance modeling may in fact take precedence over iden-
that are overly liberal in some situations [27,28]. This is tifying the exact probability law that the data values follow
true even of the NB exact test [18], which gives exact [31-33]. We therefore take the view that it is crucial to
type I error rate control when the dispersion is known but understand the way in which the variability of RNA-seq
which becomes liberal when an imprecise dispersion esti- read counts depends on the size of the counts. Our work
mator is inserted for the known value. Quasi-likelihood is in the spirit of pseudo-likelihoods [32] whereby statisti-
methods [23] account for uncertainty in the dispersion by cal methods based on the normal distribution are applied
using an F-test in place of the usual likelihood ratio test, after estimating a mean-variance function for the data at
but this relies on other approximations, in particular that hand.
the residual deviances are analogous to residual sums of Our approach is to estimate the mean-variance rela-
squares from a normal analysis of variance. tionship robustly and non-parametrically from the data.
A related issue is the ability to adapt to different types We work with log-counts normalized for sequence depth,
of data with high or low dispersion heterogeneity. None of specifically with log-counts per million (log-cpm). The
the empirical Bayes methods based on the NB distribution mean-variance is fitted to the gene-wise standard devia-
achieve the same adaptability, robustness or small sample tions of the log-cpm as a function of average log-count.
properties as the corresponding methods for microarrays, We explore two ways to incorporate the mean-variance
due to the mathematical intractability of count distribu- relationship into the differential expression analysis. The
tions compared to the normal distribution. first is to modify limma’s empirical Bayes procedure to
The most serious limitation though is the reduced range incorporate a mean-variance trend. The second method
of statistical tools associated with count distributions incorporates the mean-variance trend into a precision
compared to the normal distribution. This is more fun- weight for each individual normalized observation. The
damental than the other problems because it limits the normalized log-counts and associated precision weights
types of analyses that can be done. Much of the statisti- can then be entered into the limma analysis pipeline, or
cal methodology that has been developed for microarray indeed into any statistical pipeline for microarray data
data relies on use of the normal distribution. For exam- that is precision weight aware. We call the first method
ple, we often find it useful in our own microarray gene limma-trend and the second method voom, an acronym
expression studies to estimate empirical quality weights for ‘variance modeling at the observational level’. limma-
to downweight poor quality RNA samples [29], to use trend applies the mean-variance relationship at the gene
random effects to allow for repeated measures on the level whereas voom applies it at the level of individual
same experimental units [4,5] or to conduct gene set observations.
tests for expression signatures while allowing for inter- This article compares the performance of the limma-
gene correlations [7,8]. These techniques broaden the based pipelines to edgeR [20,34], DESeq [13], baySeq [21],
range of experimental designs that can be analyzed or TSPM [25], PoissonSeq [26] and DSS [22], all of which
offer improved interpretation for differential expression are based on NB or over-dispersed Poisson distributions.
results in terms of higher level molecular processes. None Simulation studies show that the limma pipelines perform
of these techniques are currently available for RNA-seq at least as well in terms of power and error rate control
analysis using count distributions. as the NB or Poisson methods even when the data is gen-
For these reasons, the purpose of this article is to revisit erated according to the probabilistic assumptions of the
the idea of applying normal-based microarray-like statis- earlier methods. A key advantage of the limma pipelines
tical methods to RNA-seq read counts. An obstacle to is that they provide accurate type I error rate control even
applying normal-based statistical methods to read counts when the number of RNA-seq samples is small. The NB
is that the counts have markedly unequal variabilities, and Poisson based methods either fail to control the error
even after log-transformation. Large counts have much correctly or are excessively conservative. limma-trend and
Law et al. Genome Biology 2014, 15:R29 Page 3 of 17
http://genomebiology.com/2014/15/2/R29
voom perform almost equally well when the sequencing Log-cpms have stabilized variances at high counts
depths are the same for each RNA sample. When the Probability distributions for counts are naturally het-
sequencing depths are different, voom is the clear best eroscedastic, with larger variances for larger counts. It
performer. has previously been argued that the mean-variance rela-
Either voom or limma-trend give RNA-seq analysts tionship for RNA-seq counts should be approximately
immediate access to many techniques developed for quadratic [34]. This leads to the conclusion that the coef-
microarrays that are not otherwise available for RNA- ficient of variation (CV) of RNA-seq counts should be
seq, including all the quality weighting, random effects a decreasing function of count size for small to moder-
and gene set testing techniques mentioned above. This ate counts but for larger counts should asymptote to a
article presents two case studies that demonstrate how value that depends on biological variability. Specifically,
voom can handle heterogeneous data and complex exper- the squared CV of the counts should be roughly
iments as well as facilitating pathway analysis and gene set
testing. 1/λ + φ
Figure 1 Mean-variance relationships. Gene-wise means and variances of RNA-seq data are represented by black points with a LOWESS trend.
Plots are ordered by increasing levels of biological variation in datasets. (a) voom trend for HBRR and UHRR genes for Samples A, B, C and D of the
SEQC project; technical variation only. (b) C57BL/6J and DBA mouse experiment; low-level biological variation. (c) Simulation study in the presence
of 100 upregulating genes and 100 downregulating genes; moderate-level biological variation. (d) Nigerian lymphoblastoid cell lines; high-level
biological variation. (e) Drosophila melanogaster embryonic developmental stages; very high biological variation due to systematic differences
between samples. (f) LOWESS voom trends for datasets (a)–(e). HBRR, Ambion’s Human Brain Reference RNA; LOWESS, locally weighted regression;
UHRR, Stratagene’s Universal Human Reference RNA.
This is similar in principle to the procedure proposed by A technical difficulty is that we want to predict the
Sartor et al. [36] for microarray data, except that we model variances of individual observations although there is, by
the trend using a regression spline and our implementa- definition, no replication at the observational level from
tion allows for the possibility of missing values or differing which variances could be estimated. We work around this
residual degrees of freedom between genes. We call this inconvenience by estimating the mean-variance trend at
strategy limma-trend, whereby the log-cpm values are the gene level, then interpolating this trend to predict the
analyzed as for microarray data but with a trended prior variances of individual observations (Figure 2).
variance. For comparison, the more naive approach with- The algorithm proceeds as follows. First, gene-wise lin-
out the mean-variance trend will be called limma-notrend. ear models are fitted to the normalized log-cpm values,
taking into account the experimental design, treatment
voom: variance modeling at the observation-level conditions, replicates and so on. This generates a resid-
The limma-trend pipeline models the variance at the gene ual standard deviation for each gene (Figure 2a). A robust
level. However, in RNA-seq applications, the count sizes trend is then fitted to the residual standard deviations
may vary considerably from sample to sample for the same as a function of the average log-count for each gene
gene. Different samples may be sequenced to different (Figure 2b).
depths, so different count sizes may be quite different even Also available from the linear models is a fitted value
if the cpm values are the same. For this reason, we wish for each log-cpm observation. Taking the library sizes
to model the mean-variance trend of the log-cpm values into account, the fitted log-cpm for each observation is
at the individual observation level, instead of applying a converted into a predicted count. The standard deviation
gene-level variability estimate to all observations from the trend is then interpolated to predict the standard devia-
same gene. tion of each individual observation based on its predicted
Our strategy is to estimate non-parametrically the count size (Figure 2c). Finally, the inverse squared pre-
mean-variance trend of the logged read counts and to dicted standard deviation for each observation becomes
use this mean-variance relationship to predict the vari- the weight for that observation.
ance of each log-cpm value. The predicted variance is then The log-cpm values and associated weights are then
encapsulated as an inverse weight for the log-cpm value. input into limma’s standard differential expression
When the weights are incorporated into a linear modeling pipeline. Most limma functions are designed to accept
procedure, the mean-variance relationship in the log-cpm quantitative weights, providing the ability to perform
values is effectively eliminated. microarray-like analyses while taking account of the
Law et al. Genome Biology 2014, 15:R29 Page 5 of 17
http://genomebiology.com/2014/15/2/R29
a b c
Sqrt( standard deviation )
1.0
0.8
0.6
0.4
0.2
0.0
4 6 8 10 12 14 4 6 8 10 12 14 4 6 8 10 12 14
Average log2(count size + 0.5) Average log2(count size + 0.5) Fitted log2(count size + 0.5)
Figure 2 voom mean-variance modeling. (a) Gene-wise square-root residual standard deviations are plotted against average log-count. (b) A
functional relation between gene-wise means and variances is given by a robust LOWESS fit to the points. (c) The mean-variance trend enables
each observation to map to a square-root standard deviation value using its fitted value for log-count. LOWESS, locally weighted regression.
mean-variance relationship of the log-cpm values at the data (Figure 1b) and the human data (Figure 1d) both in
observation level. terms of the absolute size of the dispersions and in terms
of heterogeneity of the dispersions between genes.
voom and limma-trend control the type I error rate We found that variation in sequencing depth between
correctly libraries had a noticeable impact on some RNA-seq anal-
We have found that voom and limma-trend, especially ysis methods. Hence all the simulations were repeated
voom, perform well and produce P values that control under two library size scenarios, one with the same
error rates correctly over a wide range of simulation sequencing depth for all six libraries and one with sub-
scenarios. For illustration, we present results from sim- stantial variation in sequencing depth. In the equal size
ulations in which read counts were generated under the scenario, all libraries were simulated to contain 11 million
same NB model as assumed by a number of existing RNA- reads. In the unequal size scenario, the odd-numbered
seq analysis methods. These simulations should represent libraries were simulated to have a sequence depth of 20
the ideal for the NB-based methods. If the normal-based million reads while the even-numbered libraries had a
methods can give performance comparable to or better sequence depth of 2 million reads. Hence the same total
than count-based methods in these simulations, then this number of reads was simulated in this scenario but dis-
is strong evidence that they will be competitive across a tributed unevenly between the libraries.
wide range of data types. In the first set of simulations, we examined the ability of
Six RNA-seq count libraries were simulated with counts voom and limma-trend to control the type I error rate cor-
for 10,000 genes. The first three libraries were treated as rectly in the absence of any genuine differential expression
group 1 and the others as group 2. The distribution of between the groups. When there are no truly differentially
cpm values for each library was simulated to match the expressed genes, the gene-wise P values should follow an
distribution that we observed for a real RNA-seq dataset approximate uniform distribution. If the type I error rate
from our own practice. The NB dispersion φ was set to is controlled correctly, then the expected proportion of
decrease on average with expected count size, asymp- P values below any cutoff should be less than or equal
toting to 0.2 squared for large counts. This degree of to the cutoff value. A number of popular RNA-seq anal-
biological variation is representative of what we observe ysis methods based on the NB or Poisson distributions
for real RNA-seq data, being larger than we typically were included for comparison. Figure 3 shows results for
observe between genetically identical laboratory mice but a P value cutoff of 0.01. Results for other cutoffs are
less than we typically see between unrelated human sub- qualitatively similar. None of the NB- or Poisson-based
jects (Figure 1). An individual dispersion φ was generated methods were found to control the type I error rate very
for each gene around the trend according to an inverse accurately. When the library sizes are equal, the NB and
chi-square distribution with 40 degrees of freedom. The Poisson methods were overly liberal, except for DESeq
voom mean-variance trend for one such simulated dataset which is very conservative. When the library sizes are
is shown in Figure 1c. It can be seen from Figure 1 that unequal, DSS and DESeq became extremely conservative.
the simulated dataset is intermediate between the mouse By contrast, all the normal-based methods were slightly
Law et al. Genome Biology 2014, 15:R29 Page 6 of 17
http://genomebiology.com/2014/15/2/R29
Prooportion of genes
with p−value < 0.01
0.04
0.03
0.02
0.01
0.00
voom
limma trend
limma notrend
t−test
edgeR classic
edgeR glm
DESeq
PoissonSeq
DSS
TSPM
voom
limma trend
limma notrend
t−test
edgeR classic
edgeR glm
DESeq
PoissonSeq
DSS
TSPM
Figure 3 Type I error rates in the absence of true differential expression. The bar plots show the proportion of genes with P < 0.01 for each
method (a) when the library sizes are equal and (b) when the library sizes are unequal. The red line shows the nominal type I error rate of 0.01.
Results are averaged over 100 simulations. Methods that control the type I error at or below the nominal level should lie below the red line.
conservative. voom produces results very close to the no differentially expressed (DE) genes at 5% FDR for six
nominal type I error rate for both library size scenarios. out of the seven possible partitions, indicating good error
limma-trend is similar to voom when the library sizes are rate control. The voom mean-variance trend for the SEQC
equal but somewhat conservative when the library sizes data, using all the libraries rather than the HBRR samples
are unequal. only, is shown in Figure 1a.
baySeq was not included in the type I error rate com-
parison because it does not return P values. However, the voom has the best power of methods that control the type
results presented in the next section show that it is rel- I error rate
atively conservative in terms of the false discovery rate Next we examined the power to detect true differential
(FDR) (Figure 4). expression. For the following simulations, 100 randomly
To check voom’s conservativeness on real data, we used selected genes were twofold upregulated in the first group
a set of four replicate libraries from the SEQC Project [37]. and another 100 were twofold upregulated in the second
All four libraries were Illumina HiSeq 2000 RNA-seq pro- group. This represents a typical scenario for a functional
files of samples of Ambion’s Human Brain Reference RNA genomics experiment in which the differential expression
(HBRR) [38]. We split the four libraries into two groups effects are large enough to be biologically important but
in all possible ways, and tested for differential expression nevertheless sufficiently subtle as to challenge many anal-
between the two groups for each partition. voom returned ysis methods. Figure 4 shows the number of true and
400
false positives false positives
true positives true positives
300
200
100
0
PoissonSeq
PoissonSeq
baySeq
baySeq
voom
voom
limma trend
limma notrend
t−test
DESeq
DSS
TSPM
DSS
TSPM
edgeR classic
edgeR glm
limma trend
limma notrend
t−test
edgeR classic
edgeR glm
DESeq
Figure 4 Power to detect true differential expression. Bars show the total number of genes that are detected as statistically significant (FDR <
0.1) (a) with equal library sizes and (b) with unequal library sizes. The blue segments show the number of true positives while the red segments
show false positives. 200 genes are genuinely differentially expressed. Results are averaged over 100 simulations. Height of the blue bars shows
empirical power. The ratio of the red to blue segments shows empirical FDR. FDR, false discovery rate.
Law et al. Genome Biology 2014, 15:R29 Page 7 of 17
http://genomebiology.com/2014/15/2/R29
false discoveries made by various analysis methods at not. To make the spike-ins more like a realistic dataset,
significance cutoff FDR < 0.1. When the library sizes are we replicated the counts for each of the 23 non-DE tran-
equal, voom and limma-trend have the next best power scripts three times. That is, we treated each non-DE
after edgeR and PoissonSeq. However, both edgeR and transcript as three different transcripts. This resulted in a
PoissonSeq give empirical FDRs greater than 0.1, confirm- dataset of 138 transcripts with half DE and half non-DE.
ing the results of the previous section that these methods Figure 6 is analogous to Figure 5 but using the spike-in
are somewhat liberal. limma-trend gives an empirical FDR data instead of simulated data. voom again achieved the
slightly greater than voom but still less than 0.1. With lowest FDR, with edgeR and the other limma methods
unequal library sizes, voom has the best power except for again being the closest competitors (Figure 6).
edgeR while still maintaining a low FDR. TSPM declares
by far the most DE genes, but these are mostly false dis- voom and limma-trend are faster than specialist RNA-seq
coveries. DSS also gives a worryingly high rate of false methods
discoveries when the library sizes are unequal. Figures 3 The different statistical methods compared varied con-
and 4 together show that voom has the best power of those siderably in computational time required, with DESeq,
methods that correctly control the type I and FDR error TSPM and baySeq being slow enough to limit the number
rates. of simulations that were done. voom is easily the fastest
of the methods compared, with edgeR-classic next fastest
voom has the lowest false discovery rate (Figure 7).
Next we compared methods from a gene ranking point
of view, comparing methods in terms of the number of RNA-seq profiles of male and female Nigerian individuals
false discoveries for any given number of genes selected So far we have demonstrated the performance of voom
as DE. Methods that perform well will rank the truly DE on RNA-seq datasets with small numbers of replicate
genes in the simulation ahead of non-DE genes. Genes libraries. To demonstrate the performance of voom on
were ranked by posterior likelihood for baySeq and by a heterogeneous dataset with a relatively large num-
P value for the other methods. The results show that ber of replicates and a high level of biological vari-
voom has the lowest FDR at any cutoff (Figure 5). When ability, we compared males to females using RNA-seq
the library sizes are equal, limma-trend and PoissonSeq profiles of lymphoblastoid cell lines from 29 male and
are very close competitors (Figure 5a). When the library 40 female unrelated Nigerian individuals [40]. Summa-
sizes are unequal, limma-trend and edgeR are the closest rized read counts and gene annotation are provided
competitors (Figure 5b). by the Bioconductor tweeDEseqCountData package [41].
Next we compared FDRs using spike-in control tran- Figure 1d shows the voom mean-variance trend of this
scripts from the SEQC project [39]. The data consists of dataset.
eight RNA-seq libraries, in two groups of four. A total of voom yielded 16 genes upregulated in males and 43
92 artificial control transcripts were spiked-in at different upregulated in females at 5% FDR. As expected, most of
concentrations in such a way that three quarters of the the top differentially expressed genes belonged to the X
transcripts were truly DE and the remaining quarter were or Y sex chromosomes (Table 1). The top gene is XIST,
voom voom
limma trend limma trend
PoissonSeq edgeR glm
edgeR classic edgeR classic
edgeR glm PoissonSeq
DESeq baySeq
False discoveries
DSS DESeq
baySeq limma notrend
40
0 40 80 120 0 40 80 120
Genes chosen Genes chosen
Figure 5 False discovery rates. The number of false discoveries is plotted for each method versus the number of genes selected as differentially
expressed. Results are averaged over 100 simulations (a) with equal library sizes and (b) with unequal library sizes. voom has the lowest FDR at any
cutoff in either scenario. FDR, false discovery rate.
Law et al. Genome Biology 2014, 15:R29 Page 8 of 17
http://genomebiology.com/2014/15/2/R29
limma notrend
t−test belong to the male-specific region of chromosome Y
50
False discoveries
edgeR glm [42,43]. A ROAST gene set test confirmed that these genes
edgeR classic collectively are significantly upregulated in males (P =
40
PoissonSeq
baySeq 0.0001). A CAMERA gene set test was even more con-
vincing, confirming that these genes are significantly more
30
DESeq
TSPM upregulated in males than are other genes in the genome
(P = 2 × 10−28 ).
20
PoissonSeq
DSS
DESeq
TSPM
baySeq
edgeR classic
edgeR glm
Table 1 Top 16 genes differentially expressed between males and females in the Nigerian data
Ensembl ID Symbol Chr logFC AveExpr t P value FDR B
ENSG00000229807 XIST X -9.815 3.8084 -36.4 7.03e-48 1.19e-43 74.8
ENSG00000099749 CYorf15A Y 4.251 0.3146 28.3 1.25e-40 1.05e-36 68.2
ENSG00000157828 RPS4Y2 Y 3.281 3.3081 26.5 9.38e-39 5.27e-35 72.6
ENSG00000233864 TTTY15 Y 4.897 -0.5538 25.9 4.31e-38 1.82e-34 64.0
ENSG00000131002 CYorf15B Y 5.440 -0.1710 23.2 4.81e-35 1.62e-31 60.0
ENSG00000198692 EIF1AY Y 2.398 2.6806 20.5 1.09e-31 3.07e-28 58.6
ENSG00000165246 NLGN4Y Y 5.330 -0.4916 19.7 1.26e-30 3.03e-27 52.4
ENSG00000213318 RP11-331F4.1 16 4.293 2.2654 19.3 4.44e-30 9.34e-27 54.1
ENSG00000129824 RPS4Y1 Y 2.781 4.7118 17.6 9.28e-28 1.74e-24 51.5
ENSG00000183878 UTY Y 1.878 2.7430 16.6 2.88e-26 4.85e-23 47.7
ENSG00000012817 KDM5D Y 1.470 4.7046 14.9 1.45e-23 2.22e-20 42.6
ENSG00000146938 NLGN4X X 4.472 -0.7801 14.8 2.09e-23 2.94e-20 38.9
ENSG00000243209 AC010889.1 Y 2.528 -0.0179 14.5 5.48e-23 7.11e-20 37.9
ENSG00000067048 DDX3Y Y 1.671 5.3077 13.4 3.05e-21 3.67e-18 37.5
ENSG00000006757 PNPLA4 X -0.988 2.5341 -10.4 4.78e-16 5.38e-13 25.7
ENSG00000232928 RP13-204A15.4 X 1.434 3.2506 10.3 1.02e-15 1.08e-12 25.2
The table shows output from the limma topTable function. As well as gene ID and symbol, columns give chromosome location (Chr), log2 -fold-change (logFC), average
log2 expression (AveExpr), moderated t, P value, false discovery rate and B-statistic for each gene. The B-statistic is the posterior log-odds of differential expression.
Figure 1e.
Out of 14,869 genes that were expressed during embry-
onic development, 8,366 showed a statistically significant
trend at 5% FDR using empirical Bayes F-tests. For each
differentially expressed gene, we identified the embry-
−5
Xescape
indicating smoothly decreasing or increasing trends over
0 5 10
Average log−cpm
time (Figure 11, panels 1 and 12). Genes peaking at
the first embryonic stage tended to be associated with
Figure 8 MA plot of male vs female comparison with male- and
female-specific genes highlighted. The MA plot was produced by
the cell cycle. Genes peaking at the last stage tended
the limma plotMA function, and is a scatterplot of log-fold-change to be associated with precursor metabolites and energy,
versus average log-cpm for each gene. Genes on the male-specific the oxidation-reduction process and with metabolic
region of the Y chromosome genes are highlighted blue and are pathways.
consistently upregulated in males, while genes on the X Genes peaking at intermediate stages have expression
chromosome reported to escape X inactivation are highlighted red
and are generally down in males. log-cpm, log-counts per million.
trends with an inverse-U shape (Figure 11, panels 2–
11). There was a substantial set of genes with peak
Law et al. Genome Biology 2014, 15:R29 Page 10 of 17
http://genomebiology.com/2014/15/2/R29
developmental stage.
2
2
12
11 Discussion
3
Dimension 2
10
log2( expression )
5
0
−5
1 2 3 4
10
log2( expression )
5
0
−5
5 6 7 8
10
log2( expression )
5
0
−5
9 10 11 12
2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12 2 4 6 8 10 12
Developmental stage Developmental stage Developmental stage Developmental stage
Figure 11 Expression trends for genes that peak at each Drosophila melanogaster embryonic stage. Panels (1) to (12) correspond to the 12
successive developmental stages. Each panel displays the fitted expression trends for the top ten genes that achieve their peak expression during
that stage. In particular, panel (1) shows genes that are most highly expressed at the first stage and panel (12) shows genes most highly expressed
at the last stage. Panels (7) and (8) are notable because they show genes with marked peaks at 12–14 hours and 14–16 hours respectively.
assumptions as made by a number of existing methods. The worst performer in our simulation was TSPM,
Other simulation scenarios would tend to increase voom’s presumably because we have simulated from NB distribu-
advantage. For example, it would be at least as scien- tions, which have quadratic mean-variance relationships,
tifically reasonable to assume that the true expression whereas TSPM assumes a linear mean-variance relation-
levels for each gene follow a log-normal distribution ship [25]. The second worst performer was the ordinary
between replicates instead of a gamma distribution, and t-test. This shows that traditional statistical methods can-
such an assumption would tend to improve the perfor- not be reliably applied to genomic data without borrowing
mance of voom relative to edgeR, DESeq, baySeq and strength between genes. The third worst performer was
DSS. In general, voom makes fewer distributional assump- limma-notrend, showing that the mean-variance trend in
tions than do competing methods and can therefore be the log-cpm values cannot be ignored.
expected to perform robustly across a range of scenar- To examine sensitivity of the results to the shape of the
ios. This study presented simulations with equal library dispersion distribution, we repeated all the simulations
sizes between replicates, and also explored the sensitivity using a log-normal distribution for the gene-wise disper-
of the methods to unequal library sizes. In our expe- sions instead of an inverse chi-square distribution. The
rience markedly unequal library sizes can arise in real two distributions were chosen to have the same mean
RNA-seq experiments for a variety of reasons. One sce- and variance on the log-scale. The results were virtually
nario is when an experiment is conducted in stages and unchanged from those shown in Figures 3, 4 and 5, show-
samples sequenced at a later time have a much higher ing that the shape of the dispersion distribution is not a
sequencing depth. Other possible scenarios occur when major determination of performance. This agrees with a
technical replicates are combined for a subset of samples similar conclusion in Wu et al. [22].
or when DNA samples are multiplexed onto a sequencing It may seem surprising at first that voom should per-
lane in unequal quantities. Some of the NB-based anal- form so well even though it ignores the discrete integer
ysis methods become very conservative or showed very nature of the counts. We think there are several possible
poor FDR control when the library sizes were unequal. reasons for this. First, the parametric advantages of the
In contrast, voom shows consistent performance in all Poisson or NB distributions are mitigated by the fact that
scenarios. the observed mean-variance relationship of RNA-seq data
Law et al. Genome Biology 2014, 15:R29 Page 12 of 17
http://genomebiology.com/2014/15/2/R29
effects or numerical covariates. As has been done pre- [8] is a competitive test that assesses differential expres-
viously [3,7,8,34], we use linear models to describe how sion of the gene set relative to all other genes on the array.
the treatment factors are assigned to the different RNA Both procedures offer considerable flexibility as they have
samples. We assume that: the ability to test the association of a genomic pathway or
gene set signature with quite general treatment compar-
E( ygi ) = μgi = xTi βg
isons or contrasts defined in the context of a microarray
where xi is a vector of covariates and βg is a vector linear model. We have adapted both methods to make use
of unknown coefficients representing log2 -fold-changes of quantitative weights as output by voom. The revised
between experimental conditions. In matrix terms: methods are implemented in the functions roast() and
E( yg ) = Xβg camera() of the limma software package.
In an alternative simulation, to investigate sensitivity to Genes were filtered out if they failed to achieve cpm > 1
the distribution of gene-wise dispersions, the δg were sim- in at least four libraries, and the remaining log-cpm values
ulated as log-normal with mean 0 and standard deviation were quantile normalized between libraries [61].
0.25 on the log-scale. This produces a distribution with a The comparison between technical replicates to check
similar CV as for the inverse chi-square simulation. the type I error rate control used only the four group B
For each simulated dataset, genes with less than ten libraries. Genes were filtered out if they failed to achieve
reads across all samples were filtered from the analysis. a cpm > 1 in at least two libraries and the log-cpm values
PoissonSeq resets the seed of the random number genera- for the 16,745 remaining genes were quantile normalized.
tor in R, so it was necessary to save and restore the state of Samples were separated into all possible two-versus-two
the random number generator before and after each call and three-versus-one combinations and a limma analysis
of the main PoissonSeq function. using voom weights was carried out for each partition.
Complete runnable code that reproduces all the simu- The false discovery rate analysis was conducted on the
lations is provided as Additional file 1. See also the voom spike-in transcripts only. ERCC Mixes 1 and 2 contain 92
website [56]. transcripts spiked in at different concentrations. For this
analysis, fragments were mapped to the known sequences
SEQC data of the spiked-in transcripts using Subread. The experi-
The SEQC project, also known as MAQC-III, aims ment is designed so that 23 transcripts have the same
to provide a comprehensive study of next-generation concentration in Mix 1 and Mix 2. The remaining tran-
sequencing technologies [37]. We analyze here a pilot scripts were spiked-in in such a way that 23 transcripts are
SEQC dataset consisting of 16 RNA-seq libraries in four fourfold more abundant in Mix 1, 23 are 1.5 higher in Mix
groups. The full SEQC data including the 16 libraries 2 and 23 are twofold higher in Mix 2. A majority of the
analyzed here will become available as GEO series spike-in transcripts data are DE. We replicated the counts
[GEO:GSE47792] when the main SEQC article is pub- for each of the 23 non-DE transcripts three times, so
lished in 2014. In the meantime, the aligned and summa- that each non-DE transcript was treated as three different
rized read counts for the pilot libraries needed to repeat transcripts. This resulted in a dataset of 138 transcripts,
the analyses in this article are available from the voom half DE and half non-DE. Our analysis used read counts
webpage [56]. for the spike-in transcripts only. TMM-scale normaliza-
The groups are labeled A–D and are closely analogous tion [54] was used for all the analysis methods, except
to the similarly labeled RNA samples used in the earlier for DESeq and PoissonSeq, which have their own built-
microarray quality control study [57]. Libraries in group in normalization methods. No transcripts were filtered,
A are profiles of Stratagene’s Universal Human Reference except by PoissonSeq as its standard analysis includes the
RNA (UHRR) with the addition of RNA from Ambion’s removal of probes with low counts. The genes that were
ERCC ExFold RNA spike-in mix 1 (Mix 1). Libraries in filtered out by PoissonSeq were re-introduced to the end
group B are profiles of Ambion’s Human Brain Refer- of the gene ranking, ordered from largest mean count to
ence RNA (HBRR) with added RNA from Ambion’s ERCC lowest mean count.
ExFold RNA spike-in mix 2 (Mix 2). RNA samples in
groups C and D are mixtures of A and B in the proportions Lymphoblastoid cell lines from Nigerian individuals
75:25 and 25:75, respectively. An Illumina HiSeq 2000 was As part of the International HapMap Project, RNA sam-
used to create a FastQ file of paired-end sequence reads ples were obtained from lymphoblastoid cell lines derived
for each sample. The library size for each sample varied from 69 unrelated Nigerian individuals including 29 males
from 5.4 to 8.0 million read pairs. Fragments were mapped and 40 females [40]. Sequencing was performed using
to the National Center for Biotechnology Information’s an Illumina Genome Analyzer II. Read counts, sum-
Build 37.2 of the human genome using the Subread aligner marized by Ensembl gene, and transcript annotations
[58]. Fragment counts were summarized by Entrez Gene were obtained from version 1.0.9 of the tweeDEseq-
ID using the featureCounts function [59] of version 1.8.2 CountData Bioconductor package [43], specifically from
of the Bioconductor package Rsubread [60]. Fragments the data objects pickrell1, annotEnsembl63 and
with both end reads mapped successfully contributed one genderGenes. Genes were filtered if they failed to
count if the fragment overlapped any annotated exon for achieve a cpm value of 1 in at least 20 libraries. Library
that gene. Fragments for which only one read mapped suc- sizes were scale-normalized by the TMM method [54]
cessfully contributed half a count if that read overlapped using edgeR software [12] prior to the voom analysis.
an exon. The summarized read count data is available
from the voom webpage [56]. Development stages of Drosophila melanogaster
The voom mean-variance trend shown in Figure 1a was RNA-seq was used to explore the developmental tran-
obtained from all 16 libraries, treated as four groups. scriptome of D. melanogaster [45]. Mapped read counts
Law et al. Genome Biology 2014, 15:R29 Page 15 of 17
http://genomebiology.com/2014/15/2/R29
are available from the ReCount project [62]. Specifically default prior degrees of freedom for squeezing the gene-
the pooled version of the modencodefly dataset from the wise dispersions is 10. Note that this is a change from
ReCount website [63] provides read counts summarized versions 3.0.X and earlier for which the default had
by Ensembl 61 gene IDs for 30 whole-animal biological been 20. For DSS the Wald test was used as rec-
samples. We discarded the larval, pupal and adult stages ommended in the documentation. The DESeq defaults
and kept only the 12 embryonic samples. Genes were have changed considerably since the original publication.
retained in the analysis if they achieved cpm > 1 for any We used the DESeq function estimateDispersions with
embryonic stage. Effective library sizes were estimated by sharingMode="maximum" and fitType="local"
TMM scale-normalization [54] using edgeR software [12] and conducted tests using nbinomTest.
prior to the voom analysis. The different count-based packages implement differ-
Gene ontology analysis used the GOstats software pack- ent methods of compositional normalization [54]. For
age [64] and version 2.9.0 of the org.Dm.eg.db annotation our simulations, there are no compositional differences
package [65]. All GO terms mentioned in the Results between the libraries so there should be no need to esti-
section had Fisher’s exact test P values less than 10−10 . mate compositional normalization factors. For this reason
we did not use the calcNormFactors function with edgeR
C57BL/6J and DBA/2J inbred mouse strains or estimateSizeFactors with DESeq or estNormFactors
An RNA-seq experiment was carried out to detect differ- with DSS. This should tend to improve the performance of
ential striatal gene expression between the C57BL/6J (B6) the packages and to make them more comparable, as any
and DBA/2J (D2) inbred mouse strains [66]. Profiles were differences between the packages can be attributed to the
made for 10 B6 and 11 D2 mice. Mapped read counts statistical procedures rather than to differences between
summarized by Ensembl 61 gene IDs were downloaded the normalization strategies.
as the bottomly dataset from the ReCount website [63].
Genes were filtered out if they failed to achieve cpm > 1 Additional file
in at least four libraries and the remaining log-cpm val-
ues were quantile normalized. The limma-voom analysis Additional file 1: Simulation code. R code to reproduce the simulations
compared the two strains and included a batch effect cor- presented in Figures 3, 4 and 5.
rection for the Illumina flow cell in which each sample was
Abbreviations
sequenced. The voom mean-variance trend is shown in CV: coefficient of variation; DE: differentially expressed; FDR: false discovery rate;
Figure 1b. HBRR: Ambion’s Human Brain Reference RNA; log-cpm: log-counts per million;
LOWESS: locally weighted regression; NB: negative binomial; rpkm: reads per
kilobase per million; UHRR: Stratagene’s Universal Human Reference RNA.
Software
The results presented in this article were obtained using Competing interests
R version 3.0.0 and the software packages limma 3.16.2, The authors declare that they do not have any competing interests.
edgeR 3.2.3, baySeq 1.14.1, DESeq 1.12.0, DSS 1.4.0, Pois- Authors’ contributions
sonSeq 1.1.2 and tweeDEseqCountData 1.0.8. All of these CL and GS developed the method and wrote the manuscript. GS
packages are part of the Bioconductor project [67,68], implemented the method and CL performed the analyses. YC contributed to
the comparative simulation studies and carried out the gene ontology
except for PoissonSeq, which is part of the Comprehen-
analysis. WS performed read alignment and summarized the SEQC and mouse
sive R Archive Network [69]. The TSPM function, dated RNA-seq datasets. All authors read and approved the final manuscript.
February 2011, was downloaded in March 2013 from the
Acknowledgements
author’s webpage [70].
The authors are grateful to Charles Wang and Leming Shi for the preliminary
The voom methodology proposed in the article is SEQC data and to Stephen Nutt for the RNA-seq mouse data used to motivate
implemented in the voom function of the limma pack- aspects of the simulation study.
age. The limma-trend method was implemented by
Author details
inputting the log-cpm values from voom into limma’s 1 Bioinformatics Division, The Walter and Eliza Hall Institute of Medical
standard pipeline, with trend=TRUE for the eBayes Research, 1G Royal Parade, Parkville, Victoria 3052, Australia. 2 Department of
function. Hence the limma-trend pipeline was the same Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia.
3 Department of Computing and Information Systems, The University of
as that for voom except that weights were not used Melbourne, Parkville, Victoria 3010, Australia. 4 Department of Mathematics
in the linear modeling step and the trend option was and Statistics, The University of Melbourne, Parkville, Victoria 3010, Australia.
turned on for the empirical Bayes step. The limma
Received: 25 September 2013 Accepted: 3 February 2014
package can be installed from the Bioconductor project Published: 3 February 2014
repository [71].
All the count-based packages were used with the References
1. Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays
default differential expression pipelines as recommended applied to the ionizing radiation response. Proc Natl Acad Sci 2001,
in the software for each package. For edgeR 3.2.3 the 98:5116–5121.
Law et al. Genome Biology 2014, 15:R29 Page 16 of 17
http://genomebiology.com/2014/15/2/R29
2. Wright GW, Simon RM: A random variance model for detection of 24. Srivastava S, Chen L: A two-parameter generalized Poisson model to
differential gene expression in small microarray experiments. improve the analysis of RNA-seq data. Nucleic Acids Res 2010, 38:e170.
Bioinformatics 2003, 19:2448–2455. 25. Auer PL, Doerge RW: A two-stage Poisson model for testing RNA-seq
3. Smyth G: Linear models and empirical Bayes methods for assessing data. Stat Appl Genet Mol Biol 2011, 10:Article 26.
differential expression in microarray experiments. Stat Appl Genet Mol 26. Li J, Witten D, Johnstone I, Tibshirani R: Normalization, testing, and
Biol 2004, 3:Article 3. false discovery rate estimation for RNA-sequencing data. Biostatistics
4. Cui X, Hwang JG, Qiu J, Blades NJ, Churchill GA: Improved statistical 2012, 13:523–538.
tests for differential gene expression by shrinking variance 27. Robles JA, Qureshi SE, Stephen SJ, Wilson SR, Burden CJ, Taylor JM:
components estimates. Biostatistics 2005, 6:59–75. Efficient experimental design and analysis strategies for the
5. Smyth G, Michaud J, Scott H: Use of within-array replicate spots for detection of differential expression using RNA-Sequencing.
assessing differential expression in microarray experiments. BMC Genomics 2012, 13:484.
Bioinformatics 2005, 21:2067–2075. 28. Soneson C, Delorenzi M: A comparison of methods for differential
6. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, expression analysis of RNA-seq data. BMC Bioinformatics 2013,
Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set 14:91.
enrichment analysis: a knowledge-based approach for interpreting 29. Ritchie M, Diyagama D, Neilson J, Van Laar R, Dobrovic A, Holloway A,
genome-wide expression profiles. Proc Natl Acad Sci USA 2005, Smyth G: Empirical array quality weights in the analysis of
102:15545–15550. microarray data. BMC Bioinformatics 2006, 7:261.
7. Wu D, Lim E, Vaillant F, Asselin-Labat M, Visvader J, Smyth G: ROAST 30. McCullagh P, Nelder JA: Generalized Linear Models. 2nd edition. Boca
rotation gene set tests for complex microarray experiments. Raton: Chapman & Hall/CRC; 1989.
Bioinformatics 2010, 26:2176–2182. 31. Wedderburn RWM: Quasi-likelihood functions, generalized linear
8. Wu D, Smyth G: Camera: a competitive gene set test accounting for models, and the Gauss-Newton method. Biometrika 1974,
inter-gene correlation. Nucleic Acids Res 2012, 40:e133–e133. 61:439–447.
9. Smyth G: Limma: linear models for microarray data. In Bioinformatics 32. Carroll RJ, Ruppert D: A comparison between maximum likelihood
and Computational Biology Solutions using R and Bioconductor. Edited by and generalized least squares in a heteroscedastic linear model.
Gentleman R, Carey V, Dudoit S, Irizarry R, Huber W. New York: Springer; J Am Stat Assoc 1982, 77:878–882.
2005:397–420. 33. Nelder JA, Pregibon D: An extended quasi-likelihood function.
10. Wang Z, Gerstein M, Snyder M: RNA-Seq: a revolutionary tool for Biometrika 1987, 74:221–232.
transcriptomics. Nat Rev Genet 2009, 10:57–63. 34. McCarthy DJ, Chen Y, Smyth GK: Differential expression analysis of
11. Cloonan N, Forrest ARR, Kolle G, Gardiner BBA, Faulkner GJ, Brown MK, multifactor RNA-Seq experiments with respect to biological
Taylor DF, Steptoe AL, Wani S, Bethel G, Robertson AJ, Perkins AC, Bruce variation. Nucleic Acids Res 2012, 40:4288–4297.
SJ, Lee CC, Ranade SS, Peckham HE, Manning JM, Mckernan KJ, 35. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and
Grimmond SM: Stem cell transcriptome profiling via massive-scale quantifying mammalian transcriptomes by RNA-Seq. Nature Methods
mRNA sequencing. Nature Methods 2008, 5:613–619. 2008, 5:621–628.
12. Robinson M, McCarthy D, Smyth G: edgeR: a Bioconductor package for 36. Sartor MA, Tomlinson CR, Wesselkamper SC, Sivaganesan S, Leikauf GD,
differential expression analysis of digital gene expression data. Medvedovic M: Intensity-based hierarchical Bayes method improves
Bioinformatics 2010, 26:139–140. testing for differentially expressed genes in microarray
13. Anders S, Huber W: Differential expression analysis for sequence experiments. BMC Bioinformatics 2006, 7:538.
count data. Genome Biol 2010, 11:R106. 37. Sequencing Quality Control (SEQC) Project. [http://www.fda.gov/
14. Oshlack A, Robinson MD, Young MD: From RNA-seq reads to MicroArrayQC]
differential expression results. Genome Biol 2010, 11:220. 38. Ambion FirstChoice Human Brain Reference RNA. [http://products.
15. Perkins TT, Kingsley RA, Fookes MC, Gardner PP, James KD, Yu L, Assefa SA, invitrogen.com/ivgn/product/AM6050]
He M, Croucher NJ, Pickard DJ, Maskell DJ, Parkhill J, Choudhary J, 39. Baker SC, Bauer SR, Beyer RP, Brenton JD, Bromley B, Burrill J, Causton H,
Thomson NR, Dougan G: A strand-specific RNA-seq analysis of the Conley MP, Elespuru R, Fero M, Foy C, Fuscoe J, Gao X, Gerhold DL, Gilles
transcriptome of the typhoid bacillus Salmonella Typhi. PLoS Genet P, Goodsaid F, Guo X, Hackett J, Hockett RD, Ikonomi P, Irizarry RA,
2009, 5:e1000569. Kawasaki ES, Kaysser-Kranich T, Kerr K, Kiser G, Koch WH, Lee KY, Liu C, Liu
16. Han X, Wu X, Chung WY, Li T, Nekrutenko A, Altman NS, Chen G, Ma H: ZL, Lucas A, et al: The external RNA controls consortium: a progress
Transcriptome of embryonic and neonatal mouse cortex by report. Nature Methods 2005, 2:731–734.
high-throughput RNA sequencing. Proc Natl Acad Sci 1274, 40. Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E,
106:1–12746. Veyrieras JB, Stephens M, Gilad Y, Pritchard JK: Understanding
17. Parikh A, Miranda ER, Katoh-Kurasawa M, Fuller D, Rot G, Zagar L, Curk T, mechanisms underlying human gene expression variation with RNA
Sucgang R, Chen R, Zupan B, Loomis WF, Kuspa A, Shaulsky G: Conserved sequencing. Nature 2010, 464:768–772.
developmental transcriptomes in evolutionarily divergent species. 41. Esnaola M, Puig P, Gonzalez D, Castelo R, Gonzalez JR: A flexible count
Genome Biol 2010, 11:R35. data model to fit the wide diversity of expression profiles arising
18. Robinson MD, Smyth GK: Small-sample estimation of negative from extensively replicated RNA-seq experiments. BMC Bioinformatics
binomial dispersion, with applications to SAGE data. Biostatistics 2013, 14:254.
2008, 9:321–332. 42. Skaletsky H, Kuroda-Kawaguchi T, Minx PJ, Cordum HS, Hillier L, Brown LG,
19. Zhou YH, Xia K, Wright FA: A powerful and flexible approach to the Repping S, Pyntikova T, Ali J, Bieri T, Chinwalla A, Delehaunty A,
analysis of RNA sequence count data. Bioinformatics 2011, Delehaunty K, Du H, Fewell G, Fulton L, Fulton R, Graves T, Hou S-F,
27:2672–2678. Latrielle P, Leonard S, Mardis E, Maupin R, McPherson J, Miner T, Nash W,
20. Robinson MD, Smyth GK: Moderated statistical tests for assessing Nguyen C, Ozersky P, Pepin K, Rock S, et al: The male-specific region of
differences in tag abundance. Bioinformatics 2007, 23:2881–2887. the human Y chromosome is a mosaic of discrete sequence classes.
21. Hardcastle TJ, Kelly KA: baySeq: empirical Bayesian methods for Nature 2003, 423:825–837.
identifying differential expression in sequence count data. 43. Gonzalez JR, Esnaola M: tweeDEseqCountData: RNA-seq count data
BMC Bioinformatics 2010, 11:422. employed in the vignette of the tweeDEseq package. [http://www.
22. Wu H, Wang C, Wu Z: A new shrinkage estimator for dispersion bioconductor.org]
improves differential expression detection in RNA-seq data. 44. Carrel L, Willard HF: X-inactivation profile reveals extensive variability
Biostatistics 2013, 14:232–243. in X-linked gene expression in females. Nature 2005, 434:400–404.
23. Lund S, Nettleton D, McCarthy D, Smyth G: Detecting differential expression 45. Graveley BR, Brooks N, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri G,
in RNA-sequence data using quasi-likelihood with shrunken van Baren MJ, Boley N, Booth BW, Brown JB, Cherbas L, Davis CA, Dobin A,
dispersion estimates. Stat Appl Genet Mol Biol 2012, 11:Article 8. Li R, Lin W, Malone JH, Mattiuzzo NR, Miller D, Sturgill D, Tuch BB, Zaleski
Law et al. Genome Biology 2014, 15:R29 Page 17 of 17
http://genomebiology.com/2014/15/2/R29
C, Zhang D, Blanchette M, Dudoit S: The developmental transcriptome 67. Gentleman R, Carey V, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B,
of Drosophila melanogaster. Nature 2011, 471:473–479. Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R,
46. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, Tanzer A, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth GK,
Lagarde J, Lin W, Schlesinger F, Xue C, Marinov GK, Khatun J, Williams BA, Tierney L, Yang JY, Zhang J: Bioconductor: open software
Zaleski C, Rozowsky J, Röder M, Kokocinski F, Abdelhamid RF, Alioto T, development for computational biology and bioinformatics.
Antoshechkin I, Baer MT, Bar NS, Batut P, Bell K, Bell I, Chakrabortty S, Chen Genome Biol 2004, 5:R80.
X, Chrast J, Curado J, et al: Landscape of transcription in human cells. 68. Bioconductor: open source software for bioinformatics. [http://www.
Nature 2012, 489:101–108. bioconductor.org]
47. Gonzalez-Porta M, Frankish A, Rung J, Harrow J, Brazma A: Transcriptome 69. Comprehensive R Archive Network. [http://www.r-project.org]
analysis of human tissues and cell lines reveals one dominant 70. Auer P, Doerge RW: TSPM.R: R code for a two-stage Poisson model for
transcript per gene. Genome Biol 2013, 14:R70. testing RNA-seq data. 2011. [http://www.stat.purdue.edu/~doerge/
48. Bera AK, Bilias Y: Rao’s score, Neyman’s Cα and Silvey’s LM tests: an software/TSPM.R]
essay on historical developments and some new results. J Stat Plann 71. Smyth GK: limma: Linear Models for Microarray Data. [http://www.
Inference 2001, 97:9–44. bioconductor.org/packages/release/bioc/html/limma.html]
49. Pregibon D: Score tests in GLIM with applications. In GLIM 82
Proceedings of the International Conference on Generalised Linear Models. doi:10.1186/gb-2014-15-2-r29
Edited by Gilchrist R. New York: Springer; 1982:87–97. Cite this article as: Law et al.: voom: precision weights unlock linear model
50. Phipson B, Lee S, Majewski IJ, Alexander WS, Smyth GK: Empirical Bayes analysis tools for RNA-seq read counts. Genome Biology 2014 15:R29.
in the presence of exceptional cases, with application to microarray
data. 2013. [http://www.statsci.org/smyth/pubs/RobustEBayesPreprint.
pdf]
51. Oehlert GW: A note on the delta method. Am Statistician 1992, 46:27–29.
52. Cleveland WS: Robust locally weighted regression and smoothing
scatterplots. J Am Stat Assoc 1979, 74:829–836.
53. Oshlack A, Emslie D, Corcoran L, Smyth G: Normalization of boutique
two-color microarrays with a high proportion of differentially
expressed probes. Genome Biol 2007, 8:R2.
54. Robinson MD, Oshlack A: A scaling normalization method for
differential expression analysis of RNA-seq data. Genome Biol 2010,
11:R25.
55. Gale WA, Sampson G: Good-Turing frequency estimation without
tears. J Quant Linguist 1995, 2:217–237.
56. Law CW, Chen Y, Shi W, Smyth GK: Supplementary information for
‘Voom: precision weights unlock linear model analysis tools for
RNA-seq read counts’. [http://bioinf.wehi.edu.au/voom]
57. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ,
de Longueville F, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist
RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM,
Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Scherf U,
Thierry-Mieg J, Wang C, Wilson M, Wolber PK, et al: The MicroArray
Quality Control (MAQC) project shows inter- and intraplatform
reproducibility of gene expression measurements. Nature Biotechnol
2006, 24:1151–1161.
58. Liao Y, Smyth GK, Shi W: The Subread aligner: fast, accurate and
scalable read mapping by seed-and-vote. Nucleic Acids Res 2013,
41:e108.
59. Liao Y, Smyth GK, Shi W: featureCounts: an efficient general-purpose
read summarization program. Bioinformatics 2013. [http://
bioinformatics.oxfordjournals.org/content/early/2013/11/30/
bioinformatics.btt656]
60. Shi W, Liao Y: Rsubread: a super fast, sensitive and accurate read
aligner for mapping next-generation sequencing reads. [http://
www.bioconductor.org]
61. Bolstad BM, Irizarry RA, Åstrand M, Speed TP: A comparison of
normalization methods for high density oligonucleotide array data
based on variance and bias. Bioinformatics 2003, 19:185–193.
62. Frazee AC, Langmead B, Leek JT: ReCount: a multi-experiment Submit your next manuscript to BioMed Central
resource of analysis-ready RNA-seq gene count datasets. BMC and take full advantage of:
Bioinformatics 2011, 12:449.
63. Frazee A, Langmead B, Leek J: ReCount: a multi-experiment resource
• Convenient online submission
of analysis-ready RNA-seq gene count datasets. [http://bowtie-bio.
sourceforge.net/recount] • Thorough peer review
64. Falcon S, Gentleman R: Using GOstats to test gene lists for GO term • No space constraints or color figure charges
association. Bioinformatics 2007, 23:257–258.
65. Carlson M: org.Dm.eg.db: Genome wide annotation for Fly. [http:// • Immediate publication on acceptance
www.bioconductor.org] • Inclusion in PubMed, CAS, Scopus and Google Scholar
66. Bottomly D, Walter NA, Hunter JE, Darakjian P, Kawane S, Buck KJ, Searles
• Research which is freely available for redistribution
RP, Mooney M, McWeeney S K Hitzemann, R: Evaluating gene
expression in C57BL/6J and DBA/2J mouse striatum using RNA-Seq
and microarrays. PLoS One 2011, 6:e17820. Submit your manuscript at
www.biomedcentral.com/submit