Science’s Significant Stats Problem

In 2009, researchers working in Thailand made headlines with a small success in a trial of an HIV vaccine. It reduced the rate of infection by 31 percent, the scientists calculated. That may not sound impressive, but in the fight against HIV, it looked like an unprecedented success. The researchers published their results in the influential New England Journal of Medicine, reporting that the data had passed standard statistical tests: If the vaccine had actually been worthless, there was only a 1 in 25 chance that it would have appeared to have the beneficial effect seen in the study.

In medicine, as in most other realms of science, observing low-probability data like that in the HIV study is cause for celebration. Typically, scientists in fields like biology, psychology, and the social sciences rejoice when the chance of a fluke is less than 1 in 20. In some fields, however, such as particle physics, researchers are satisfied only with much lower probabilities, on the order of one chance in 3.5 million. But whatever the threshold, recording low-probability data—data unlikely to be seen if nothing is there to be discovered—is what entitles you to conclude that you’ve made a discovery. Observing low-probability events is at the heart of the scientific method for testing hypotheses.
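The particle-physics figure corresponds to the field's familiar "five sigma" convention. A minimal sketch, using only Python's standard library, shows how a sigma level translates into a tail probability under a no-effect (normal) distribution:

```python
import math

def one_sided_p(sigma):
    """Probability of seeing a result at least `sigma` standard
    deviations above the mean if there is no real effect."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

# The 1-in-20 convention common in biology and the social sciences:
p_social = 0.05

# The five-sigma convention used in particle physics:
p_physics = one_sided_p(5)
print(f"5 sigma corresponds to p = {p_physics:.2e}")
print(f"That is roughly 1 chance in {1 / p_physics:,.0f}")
```

Running this yields a probability of about one in 3.5 million, matching the threshold quoted above; the choice of a one-sided tail here is an assumption for illustration.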

Scientists use elaborate statistical significance tests to distinguish a fluke from real evidence. But the sad truth is that the standard methods for significance testing are often inadequate to the task. In the case of the HIV vaccine, for instance, further analysis showed the findings not to be as solid as the original statistics suggested. Chances were probably 20 percent or higher that the vaccine was not effective at all.
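One way to see how a result can pass a 1-in-25 significance test yet still have a 20 percent chance of being a fluke is a simple Bayesian calculation. The sketch below uses illustrative numbers that are my assumptions, not figures from the trial: a 15 percent prior chance that a candidate vaccine really works, and 80 percent statistical power.

```python
def false_positive_risk(prior_effective, alpha, power):
    """Probability that a 'statistically significant' result is a fluke,
    given the prior probability the effect is real, the significance
    threshold (alpha), and the study's power to detect a real effect."""
    p_sig_given_real = power * prior_effective
    p_sig_given_null = alpha * (1 - prior_effective)
    return p_sig_given_null / (p_sig_given_null + p_sig_given_real)

# Assumed, illustrative inputs: most candidate HIV vaccines fail, so
# give a real effect a 15% prior; assume 80% power; the trial's result
# cleared a 1-in-25 (0.04) threshold.
risk = false_positive_risk(prior_effective=0.15, alpha=0.04, power=0.80)
print(f"Chance the significant result is a fluke: {risk:.0%}")
```

With these assumed inputs the risk comes out slightly above 20 percent, in the same range as the reanalysis described above; different priors or power would shift the figure, which is precisely the point.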

Thoughtful experts have been pointing out serious flaws in standard statistical methods for decades. In recent years, the depth of the problem has become more apparent and better documented. One recent paper found an appallingly low chance that certain neuroscience studies could correctly identify an effect from statistical data. Reviews of genetics research show that the statistics linking diseases to genes are wrong far more often than they’re right. Pharmaceutical companies find that test results favoring new drugs typically disappear

