Original Title: nmeth.3665

Know when your numbers are significant

Experimental biologists, their reviewers and their publishers must grasp basic statistics, urges David L. Vaux, or sloppy science will continue to grow.

The incidence of papers in cell and molecular biology that have basic statistical mistakes is alarming. I see figures with error bars that do not say what they describe, and error bars and P values for single, ‘representative’ experiments. So, as an increasingly weary reviewer of many a biology publication, I’m going to spell out again¹ the basics that every experimental biologist should know.

Simply put, statistics and error bars should be used only for independent data, and not for identical replicates within a single experiment. Because science represents the knowledge gained from repeated observations or experiments, these have to be performed more than once (or must use multiple independent samples) for us to have confidence that the results are not just a fluke, a coincidence or a mistake. To show only the result of a single experiment, even if it is a representative one, and then misuse statistics to justify that decision, erodes the integrity of the scientific literature.

It is eight years since Nature adopted a policy of insisting that papers containing figures with error bars describe what the error bars represent². Nevertheless, it is still common to find papers in most biology journals (Nature included) that contain this and other basic statistical errors. In my opinion, the fact that these scientifically sloppy papers continue to be published means that the authors, reviewers and editors cannot comprehend the statistics, that they have not read the paper carefully, or both.

Why does this happen? Most cell and molecular biologists are taught some statistics during their high-school or undergraduate years, but the principles seem to be forgotten somewhere between graduation and starting in the lab. Often, the type of statistics they learnt is not relevant to the kinds of experiment they are now doing. And, once in the lab, people generally just do what everyone else does, without always understanding why.

Even if experimental biologists do not need to use statistical evidence for their own experiments, they should have an understanding of the basics so that they can interpret others’ work critically. They don’t all need to understand complex statistics, or to hire professional statisticians, but there would be fewer sloppy papers if every author, reviewer and editor understood statistical concepts such as standard deviation, standard error of the mean (s.e.m.), sampling error and the difference between replicate and independent data (see ‘Statistics glossary’).

BACK TO BASICS

In the life sciences there are typically two types of publication: those that use large data sets and rely mostly or wholly on statistical

180 | NATURE | VOL 492 | 13 DECEMBER 2012

© 2012 Macmillan Publishers Limited. All rights reserved

COMMENT

psychology, clinical trials and genome-wide association studies), and those that do not, such as much cell and molecular biology, biochemistry and classical genetics.

For papers with large data sets that rely purely on statistical evidence, recommendations exist for computing sample size, reporting on outlying results and other issues³,⁴. But these guidelines do not serve authors of the other category of papers. Cell and molecular biologists have the luxury of being able to probe their experimental systems in multiple, independent ways and can therefore often get by with Ns of three, without the need for sophisticated statistics. The first figure in a typical paper in cell or molecular biology, for example, might show the difference in phenotype between three wild-type and three gene-deleted mice. The second figure might compare the levels of proteins in cells derived from the mice, looking at both the deleted protein and one of its substrates, or the effects of treating wild-type cells with an inhibitor of the protein encoded by the deleted gene. If the evidence from these experiments is consistent, and gives support to a coherent model, it would be unnecessary to analyse 30 mice of each type, or to repeat the Western blots of protein levels 30 independent times. Watson and Crick’s paper on the structure of DNA⁵ does not contain statistics, graphs with error bars or large Ns.

Understanding the rudiments of statistics would stop experimental biologists from calculating a P value and a s.e.m. from triplicates from one representative experiment, and might stop the reviewers and editors from letting these pass unquestioned. If the results from one representative experiment are shown, then N = 1 and statistics do not apply. Besides, it is always better to include a full data set, rather than withholding results that are not representative. When N is only 2 or 3, it would be more transparent to just plot the independent data points, and let the readers interpret the data for themselves, rather than showing possibly misleading P values or error bars and drawing statistical inferences.

If the data in an experiment are equivocal, or the effect size is small, it is much better to come up with an extra, mechanistically different, experiment to test the hypothesis, than to repeat the same experiment until P is less than 0.05.

If statistics are shown, it should be for a good reason. Descriptive statistics, such as range or standard deviations, are only necessary when there are too many data points to visualize easily. Inferential statistics (an s.e.m., confidence interval or P value) should be shown only if they make it easier to interpret the results, and they should not detract from other key considerations such as the magnitude of the effects or their biological significance.

Figure legends should state the number of independent data points and, for experiments in which replicates were performed, only the mean of the replicates should be shown as a single independent data point. For replicates, no statistics should be shown, because they give only an indication of the fidelity with which the replicates were created: they might indicate how good the pipetting was, but they have no bearing on the hypothesis being tested⁶.

All experimental biologists and all those who review their papers should know what sort of sampling errors are to be expected in common experiments, such as determining the percentages of live and dead cells or counting the number of colonies on a plate or cells in a microscope field. Otherwise, they will not be able to judge their own data critically, or anyone else’s.

REPEAT AFTER ME

How can the understanding and use of elementary statistics be improved? Young researchers need to be taught the practicalities of using statistics at the point at which they obtain the results of their very first experiments.

To encourage established researchers to use statistics properly, journals should publish guidelines for authors, reviewers and editors on the use and presentation of data and statistics that are relevant to the fields they cover. All journals should follow the lead of the Journal of Cell Biology⁷ and make a final check of all figures in accepted papers before publication. They should refuse to publish papers that contain fundamental errors, and readily publish corrections for published papers that fall short. This requires engaging reviewers who are statistically literate and editors who can verify the process. Numerical data should be made available either as part of the paper or as linked, computer-interpretable files so that readers can perform or confirm statistical analyses themselves.

When William Strunk Jr, a professor of English, was faced with a flood of errors in spelling, grammar and English usage, he wrote a short, practical guide that became The Elements of Style (also known as Strunk and White)⁸. Perhaps experimental biologists need a similar booklet on statistics. ■

David L. Vaux is professor of cell biology at the Walter and Eliza Hall Institute of Medical Research and at the University of Melbourne, Parkville, Victoria 3052, Australia.
e-mail: vaux@wehi.edu.au

1. Cumming, G., Fidler, F. & Vaux, D. L. J. Cell Biol. 177, 7–11 (2007).
2. Vaux, D. L. Nature 428, 799 (2004).
3. Landis, S. C. et al. Nature 490, 187–191 (2012).
4. Nakagawa, S. & Cuthill, I. C. Biol. Rev. Camb. Philos. Soc. 82, 591–605 (2007).
5. Watson, J. D. & Crick, F. H. Nature 171, 737–738 (1953).
6. Vaux, D. L., Fidler, F. & Cumming, G. EMBO Rep. 13, 291–296 (2012).
7. Rossner, M. The Scientist 20, 24–25 (2006).
8. Strunk, W. Jr & White, E. B. The Elements of Style (5th edn) (Allyn & Bacon, 2009).

STATISTICS GLOSSARY

Some common statistical concepts and their uses in analysing experimental results.

Term | Meaning | Common uses
Standard deviation (s.d.) | The typical difference between each value and the mean value. | Describing how broadly the sample values are distributed. s.d. = √(∑(x − mean)² / (N − 1))
Standard error of the mean (s.e.m.) | An estimate of how variable the means will be if the experiment is repeated multiple times. | Inferring where the population mean is likely to lie, or whether sets of samples are likely to come from the same population. s.e.m. = s.d. / √N
Confidence interval (CI; 95%) | With 95% confidence, the population mean will lie in this interval. | To infer where the population mean lies, and to compare two populations. CI = mean ± s.e.m. × t_(N−1)
Independent data | Values from separate experiments of the same type that are not linked. | Testing hypotheses about the population.
Replicate data | Values from experiments where everything is linked as much as possible. | Serves as an internal check on performance of an experiment.
Sampling error | Variation caused by sampling part of a population rather than measuring the whole population. | Can reveal bias in the data (if it is too small) or problems with conduct of the experiment (if it is too big). In binomial distributions (such as live and dead cell counts) the expected s.d. is √(N × p × (1 − p)); in Poisson distributions (for example, cells per field) the expected s.d. is √mean.

N, number of independent samples; t, the t-statistic; p, probability.
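The glossary formulas, together with the advice that replicates be reduced to one mean per independent experiment, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the measurement values are hypothetical and do not come from the article, and the 95% t values are the usual two-sided table entries for small N.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t from standard tables,
# keyed by degrees of freedom (N - 1). Only small N is covered here.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def collapse_replicates(experiments):
    """Reduce each experiment's replicates to a single mean value."""
    return [statistics.mean(replicates) for replicates in experiments]


def summarize(independent_values):
    """Return mean, s.d., s.e.m. and 95% CI for independent data points."""
    n = len(independent_values)
    if n < 2:
        raise ValueError("N = 1: statistics do not apply")
    if n - 1 not in T_95:
        raise ValueError("extend T_95 to cover this N")
    mean = statistics.mean(independent_values)
    sd = statistics.stdev(independent_values)  # sample s.d., divides by N - 1
    sem = sd / math.sqrt(n)
    half_width = sem * T_95[n - 1]
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "sem": sem,
        "ci95": (mean - half_width, mean + half_width),
    }


def expected_sd_binomial(n_counted, p):
    """Expected s.d. of a count out of n_counted items, each with probability p."""
    return math.sqrt(n_counted * p * (1 - p))


def expected_sd_poisson(mean_count):
    """Expected s.d. of a Poisson count, for example cells per field."""
    return math.sqrt(mean_count)


# Hypothetical values: three independent experiments, each run in triplicate.
experiments = [
    [9.0, 10.0, 11.0],
    [11.0, 12.0, 13.0],
    [13.0, 14.0, 15.0],
]

points = collapse_replicates(experiments)  # three data points, so N = 3, not 9
summary = summarize(points)

print("independent data points:", points)
print("summary:", summary)
print("expected s.d., 100 cells with p = 0.2:", expected_sd_binomial(100, 0.2))
print("expected s.d., mean of 25 cells per field:", expected_sd_poisson(25))
```

Treating the nine replicate values as if N were 9 would shrink the s.e.m. and the confidence interval without adding any independent evidence, which is the misuse the article warns against. With N of only 2 or 3, plotting `points` directly may be more transparent than reporting `summary` at all.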


