
Speaking of Science

Many scientific studies can't be replicated. That's a problem.
The reproducibility crisis in science just got a little more intense.
By Joel Achenbach
1,163 words
27 August 2015
Washington Post.com
WPCOM
English
Copyright 2015, The Washington Post Co. All Rights Reserved.
This post has been updated.

Maverick researchers have long argued that much of what gets published in elite scientific journals is
fundamentally squishy — that the results tell a great story but can't be reproduced when the experiments are
run a second time.

Now a volunteer army of fact-checkers has published a new report that affirms that the skepticism was
warranted. Over the course of four years, 270 researchers attempted to reproduce the results of 100
experiments that had been published in three prestigious psychology journals.

It was awfully hard. They ultimately concluded that they'd succeeded just 39 times.

The failure rate surprised even the leaders of the project, who had guessed that perhaps half the results
wouldn't be reproduced.

The new paper, titled "Estimating the reproducibility of psychological science," was published Thursday in the
journal Science. The sweeping effort was led by the Center for Open Science, a nonprofit based in
Charlottesville. The center's director, Brian Nosek, a University of Virginia psychology professor, said the
review focused on the field of psychology because the leaders of the center are themselves psychologists.

Despite the rather gloomy results, the new paper pointed out that this kind of verification is precisely what
scientists are supposed to do: "Any temptation to interpret these results as a defeat for psychology, or
science more generally, must contend with the fact that this project demonstrates science behaving as it
should."

The phenomenon -- irreproducible results -- has been a nagging issue in the science world in recent years.
That's partly due to a few spectacular instances of fraud, such as when Dutch psychologist Diederik Stapel
admitted in 2011 that he'd been fabricating his data for years.

[Background: Reproducibility is the new scientific revolution]

A more fundamental problem, say Nosek and other reform-minded scientists, is that researchers seeking
tenure, grants or professional acclaim feel tremendous pressure to do experiments that have the kind of
snazzy results that can be published in prestigious journals.

They don't intentionally do anything wrong, but may succumb to motivated reasoning. That's a subtle form of
bias, like unconsciously putting your thumb on the scale. Researchers see what they want and hope to see,
or tweak experiments to get a more significant result.

Moreover, there's the phenomenon of "publication bias." Journals are naturally eager to publish significant
results rather than null results. The problem is that, by random chance, some experiments will produce
results that appear significant but are merely anomalies — spikes in the data that might mean nothing.
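To make the point about chance findings concrete (this illustration is not part of the original article), here is a minimal Python sketch, under assumed parameters, that simulates many experiments in which no real effect exists; roughly five percent of them still cross the conventional p < 0.05 threshold purely by chance, and those are exactly the results most likely to be written up.

    # A minimal sketch (not from the article): simulate experiments where the
    # true effect is zero and count how many look "significant" by chance.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_experiments = 1000   # hypothetical number of null experiments
    n_per_group = 30       # hypothetical sample size per group

    false_positives = 0
    for _ in range(n_experiments):
        # Two groups drawn from the same distribution: no real difference.
        a = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
        b = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            false_positives += 1

    # Roughly 5% of null experiments clear the threshold anyway.
    print(f"{false_positives} of {n_experiments} null experiments appeared significant")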

Reformers like Nosek want their colleagues to pre-register their experimental protocols and share their data
so that the rest of the community can see how the sausage is made. Meanwhile, editors at Science, Nature
and other top journals have crafted new standards that require more detailed explanations of how
experiments are conducted.
Gilbert Chin, senior editor of the journal Science, said in a teleconference this week, "This somewhat
disappointing outcome does not speak directly to the validity or the falsity of the theories. What it does say is
that we should be less confident about many of the experimental results that were provided as empirical
evidence in support of those theories."

['Fraudulent' peer review strikes another academic publisher -- 32 articles questioned]

John Ioannidis, a professor of medicine at Stanford, has argued for years that most scientific results are less
robust than researchers believe. He published a paper in 2005 with the instantly notorious title, "Why Most
Published Research Findings Are False."

In an interview this week, Ioannidis called the new paper "a landmark for psychological science" and said it
should have repercussions beyond the field of psychology. He said the paper validates his long-standing
argument, "and I feel sorry for that. I wish I had been proven wrong."

The 100 replication attempts, whether successful or unsuccessful, do not definitively prove or disprove the
results of the original experiments, noted Marcia McNutt, editor-in-chief of the Science family of journals.
There are many reasons that a replication might fail to yield the same kind of data.

Perhaps the replication was flawed in some key way — a strong possibility in experiments that have multiple
moving parts and many human factors.

And science is conducted on the edge of the knowable, often in search of small, marginal effects.

"The only finding that will replicate 100 percent of the time is one that's likely to be trite and boring and
probably already known," said Alan Kraut, executive director of the Association for Psychological Science. "I
mean, yes, dead people can never be taught to read."

[Two scientific journals accepted a study by Maggie Simpson and Edna Krabappel]

One experiment that underwent replication had originally showed that students who drank a sugary beverage
were better able to make a difficult decision about whether to live in a big apartment far from campus or a
smaller one closer to campus. But that first experiment was conducted at Florida State University. The
replication took place at the University of Virginia. The housing decisions around Charlottesville were much
simpler -- effectively blowing up the experiment even before the first sugary beverage had been consumed.

Another experiment had shown, the first time around, that students exposed to a text that undermined their
belief in free will were more likely to engage in cheating behavior. The replication, however, showed no such
effect.

The co-author of the original paper, Jonathan Schooler, a psychologist at the University of California at Santa
Barbara, said he still believes his original findings would hold up under specified conditions, but added,
"Those conditions may be more narrowly specified than we originally appreciated."

He has himself been an advocate for improving reproducibility, and said the new study shouldn't tarnish the
reputation of his field: "Psychology's really leading the charge here in investigating the science of science."

Nosek acknowledged that this new study is itself one that would be tricky to reproduce exactly, because there
were subjective decisions made along the way and judgment calls about what, exactly, "reproduced" means.
The very design of the review injected the possibility of bias, in that the volunteer scientists who conducted
the replications were allowed to pick which experiments they wanted to do.

"At every phase of this process, decisions were made that might not be exactly the same kind of decision that
another group would make," Nosek said.

There are about 1.5 million scientific studies published a year, he said. This review looked at only 100
studies.

That's a small sample size — another reason to be hesitant before declaring the discovery of a new truth.

Further Reading:

Sexism in science: Peer editor tells female researchers their study needs a male author

Hundreds of scientists ask Science to stop publishing a smorgasbord of stereotypes

Document WPCOM00020150829eb8r004jl

