Science Is A Public Good in Peril - Here's How To Fix It - Aeon Essays
Science is broken
Perverse incentives and the misuse of quantitative
metrics have undermined the integrity of scientific
research
We argue that over the past half-century, the incentives and reward structure of science
have changed, creating a hypercompetition among academic researchers. Part-time and
adjunct faculty now make up 76 per cent of the academic labour force, allowing universities
to operate more like businesses, making tenure-track positions much more rare and
desirable. Increased reliance on emerging quantitative performance metrics that value
numbers of papers, citations and research dollars raised has decreased the emphasis on
socially relevant outcomes and quality. There is also concern that these pressures could
encourage unethical conduct by scientists and the next generation of STEM scholars who
persist in this hypercompetitive environment. We believe that reform is needed to bring
balance back to the academy and to the social contract between science and society, to
ensure the future role of science as a public good.
Table 1: Modified and with quotes from the blog Embedded in Academia by John Regehr, professor of
computer science at the University of Utah; used with permission.
The increased reliance on quantitative metrics might create inequities and outcomes worse
than those of the systems they replaced. Specifically, if rewards flow disproportionately to
individuals who manipulate the metrics, the well-known problems of the old subjective
paradigms (eg, old-boys' networks) start to look simple and solvable by comparison. Many
scientists believe the damage from metrics is already apparent. In fact, 71 per cent of
researchers believe that it is possible to 'game' or 'cheat' their way into better evaluations
at their institutions.
This manipulation of the evaluative metrics has been documented. Recent exposés have
revealed schemes by journals to manipulate impact factors, use of p-hacking by researchers
to mine for statistically significant and publishable results, rigging of the peer-review
process itself and over-citation practices. The computer scientist Cyril Labbé at the Joseph
Fourier University in Grenoble even created Ike Antkare, a fictional character, who, by
virtue of publishing 102 computer-generated fake papers, achieved a stellar h-index of 94
on Google Scholar, surpassing that of Albert Einstein. Blogs describing how to inflate your
h-index without committing outright fraud are, in fact, just a Google search away.
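The mechanics being gamed here are simple: the h-index is just the largest h such that h of an author's papers each have at least h citations. A few lines of code make plain how mechanical the metric is, and how mutual citation among fake papers inflates it (a minimal sketch; the citation counts below are hypothetical illustrations, not data from the Labbé case):

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # this paper still clears the threshold
        else:
            break
    return h

# Standard textbook example: papers with 6, 5, 3, 1 and 0 citations give h = 3.
print(h_index([3, 0, 6, 1, 5]))  # 3

# A crude version of the fake-paper scheme: if 102 papers each cite the
# other 101, every paper has 101 citations and h saturates immediately.
print(h_index([101] * 102))  # 101
```

The second example overstates the real scheme (which used sparser cross-citation), but it shows why a closed cluster of machine-generated papers can outrank a genuine career in one stroke.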
Since the Second World War, scientific output as measured by cited work has doubled
every nine years. How much of the growth in this knowledge industry is, in essence,
illusory and a natural consequence of Goodhart’s law? It is a real question.
Consider the role of quality versus quantity in maximising true scientific progress. If a process
is overcommitted to quality over quantity, accepted practices might require triple- or
quadruple-blinded studies, mandatory replication of results by independent parties, and
peer review of all data and statistics before publication. Such a system would produce very
few results due to over-caution, and would waste scarce research funding. At another
extreme, an overemphasis on quantity would produce numerous substandard papers with
lax experimental design, little or no replication, scant quality control and substandard peer-
review (see Figure 1 below). As measured by the quantitative metrics, apparent scientific
progress would explode, but too many results would be erroneous, and consumers of
research would be left unable to tell the valid from the invalid. Such a system merely
creates an illusion of scientific progress. Obviously, a balance between quantity and quality
is desirable.
Favouring output over outcomes, or quantity over quality, can also create a ‘perversion of
natural selection’. Such a system is more likely to weed out ethical and altruistic
researchers, while selecting for those who better respond to perverse incentives. The
average scholar can be pressured to engage in unethical practices in order to have or
maintain a career. Then, as per Mark Granovetter’s ‘Threshold Models of Collective
Behaviour’ (1978), unethical actions become ‘embedded in the structures and processes’ of
a professional culture. At this point, the conditioning to ‘view corruption as permissible’ or
even necessary is very strong. Compelling anecdotal testimony, in which accomplished and
public-minded professors write about why they are leaving a career they once loved, is
emerging. The Chronicle of Higher Education has even coined a name for this genre: Quit Lit.
In Quit Lit, even senior researchers provide perfectly rational explanations for leaving their
privileged and prized positions, rather than compromise their principles in a
hypercompetitive, perverse-incentive environment. One is left to wonder whether minority
students and women rationally decide to opt out of the system at disproportionately higher
rates than the groups that tend to persist.
Many scientific societies, research institutions, academic journals and individuals have
advanced arguments trying to correct some excesses of quantitative metrics. Some have
signed the San Francisco Declaration on Research Assessment (DORA). DORA recognises
the need for improving ‘ways in which output of scientific research are evaluated’, and calls
for challenging research-assessment practices, especially the currently operative ‘journal
impact factor’ parameters. As of 1 August this year, 871 organisations and 12,788
individuals have signed DORA, including the American Society for Cell Biology, the
American Association for the Advancement of Science, the Howard Hughes Medical
Institute, and the Proceedings of the National Academy of Sciences. The publishers of
Nature, Science and other journals have called for downplaying the impact-factor metric. The
American Society of Microbiology recently took a principled stand and eliminated impact-
factor information from all their journals ‘to avoid contributing further to the inappropriate
focus on journal [impact factors]’. The aim is to slow the ‘avalanche’ of unreliable
performance metrics dominating research assessment. Like others, we are not advocating
the abandonment of metrics, but a reduction of their weight in decision-making by
institutions and funding agencies until we have objective measures that better
represent the true value of scientific research.
For at least the past decade, however, US federal spending on R&D has been in decline. Its
'research intensity' (the federal R&D budget as a share of the country's gross domestic
product) fell to 0.78 per cent in 2014 from about 2 per cent in the 1960s. In tandem,
China is projected to outspend the US on R&D by 2020.
US colleges and universities have also historically served to shape the next generation of
researchers, who will provide education and knowledge for and to the public. But as
universities morph into ‘profit centres’ focused on generating new products and patents,
they are de-emphasising science as a public good.
Competition among researchers for funding has never been more intense, as science enters
the worst funding environment in half a century. Between 1997 and 2014, the funding
rate for US National Institutes of Health (NIH) grants fell from 30.5 per cent to 18 per
cent. US National Science Foundation (NSF) funding rates have remained stagnant at 23-
25 per cent over the past decade. Grateful for small favours, we can note that these funding
rates are still well above 6 per cent, the approximate breakeven point at which the net cost
of proposal-writing equals the net value a grant brings to the winner. Nonetheless, the
grant environment is hypercompetitive, susceptible to reviewer biases, skewed towards
funding agencies’ research agendas, and strongly dependent on prior success as measured
by quantitative metrics. Even before the financial crisis struck, the Nobel laureate Roger
Kornberg remarked: ‘If the work you propose to do isn’t virtually certain of success, then it
won’t be funded.’ These broad changes take valuable time and resources away from
scientific discovery and translation, compelling researchers to spend inordinate amounts of
time constantly chasing grant proposals and filling out ever-increasing paperwork for grant
compliance.
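The 6 per cent breakeven figure follows from simple expected-value arithmetic: submitting a proposal pays off only while the success probability exceeds the ratio of proposal-writing cost to the net value of a grant. A minimal sketch of that calculation (the dollar figures are hypothetical, chosen only to illustrate the ratio):

```python
# Expected net value of one proposal: p * grant_value - proposal_cost.
# It is zero at the breakeven rate p = proposal_cost / grant_value.
proposal_cost = 30_000   # hypothetical researcher time per proposal, in dollars
grant_value = 500_000    # hypothetical net value of a grant to the winner

breakeven_rate = proposal_cost / grant_value
print(f"breakeven success rate: {breakeven_rate:.0%}")  # 6%

def expected_net_value(p, cost=proposal_cost, value=grant_value):
    """Expected payoff of submitting one proposal with success probability p."""
    return p * value - cost

print(expected_net_value(0.18))  # positive at NIH's 18% funding rate
print(expected_net_value(0.05))  # negative below the breakeven rate
```

The point is not the particular dollar values but the shape of the game: as funding rates drift toward the cost-to-value ratio, proposal-writing approaches a zero-sum lottery that consumes the very research time it is meant to fund.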
The steady growth of perverse incentives, and their instrumental role in faculty research,
hiring and promotion practices, amounts to a systemic dysfunction endangering scientific
integrity. There is growing evidence that today’s research publications too frequently suffer
from lack of replicability, rely on biased data-sets, apply low or sub-standard statistical
methods, fail to guard against researcher biases, and overhype their findings. In other
words, they reflect an overemphasis on quantity over quality. It is therefore not surprising that
scrutiny has revealed a troubling level of unethical activity, outright faking of peer review,
and retractions. The Economist recently highlighted the prevalence of shoddy, non-
reproducible modern scientific research and its high financial cost to society, strongly
suggesting that modern science is untrustworthy and in need of reform. Given the high cost
of exposing, disclosing or acknowledging scientific misconduct, we can be fairly certain
that far more misconduct occurs than has been revealed. Warnings of systemic problems go back to
at least 1991, when the NSF director Walter E Massey noted that the size, complexity and
increasingly interdisciplinary nature of research, in the face of growing competition, were
making science and engineering 'more vulnerable to falsehoods'.
There are exceptional cases in which individuals have provided a reality check on
overhyped research press releases, especially in areas deemed potentially transformative
(for example, Jonathan Eisen's real-time commentary on some of the mania surrounding the
'microbiome'). Generally, however, the limitations of hot research sectors are downplayed or
ignored. Because every modern scientific mania creates a quantitative metric windfall for
participants, and because few consequences come to those responsible when a science
bubble bursts, the only effective check on pathological science and a misallocation of
resources is the unwritten honour system.
The US Environmental Protection Agency (EPA) also published scientific reports from
consultants based on non-existent data in industry journals. More recently, the EPA
silenced its own whistleblowers during the water crisis in the city of Flint in Michigan. As
agencies increasingly compete with each other for reduced discretionary funding and
maintaining existing cash flows (CDC’s desire to focus more on lead paint, as opposed to
lead in water, for example), they seem to be more inclined to publish ‘good news’ instead of
science. In an era of declining discretionary funding, federal agencies have financial
conflicts of interest and fears for their own survival, similar to those in private industry.
Because of the common misconception that federal funding agencies are free of such
conflicts, the dangers of institutional research misconduct might rival or even outweigh
those of industry-sponsored research: there is no system of checks and balances, and
consumers of such work might be overly trusting.
All scientists should aspire to leave the field in a better state than when we first entered it.
The very important matters of state and federal funding lie beyond our direct control.
However, when it comes to the health, integrity and public perception of science and its
value, we are the key actors. We can openly acknowledge and address problems with
perverse incentives and hypercompetition that are distorting science and imperilling
scientific research as a public good. A first, relatively simple step is to arrive at a better
understanding of the problem by systematically mining the experiences and perceptions of
academics in STEM fields, via a comprehensive survey of high-achieving graduate students
and researchers.
Second, the NSF should commission a panel of economists and social scientists with
expertise in perverse incentives to collect and review input from all levels of academia,
including retired National Academy members and distinguished STEM scholars. With a
long-term view to fostering science as a public good, the panel could also develop a list of
‘best practices’ to guide evaluation of candidates for hiring and promotion.
Third, we can no longer afford to pretend that the problem of research misconduct does not
exist. At both the undergraduate and graduate levels, science and engineering students
should receive realistic instruction on these subjects, so that they are prepared to act when,
not if, they encounter it. The curriculum should include review of real-world pressures,
incentives and stresses that can increase the likelihood of research misconduct.
Fourth, universities can take measures immediately to protect the integrity of scientific
research, and announce steps to reduce perverse incentives and uphold research
misconduct policies that discourage unethical behaviour. Finally, and perhaps most simply,
PhD programmes should, in addition to teaching technical skills, acknowledge the present
reality of perverse incentives, while also fostering character development, respect for
science as a public good, and an appreciation of the critical role of quality science in the
future of humankind.
This article is an abridged version of the journal paper 'Academic Research in the 21st Century:
Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition',
published in Environmental Engineering Science, and was written to reach a wider audience.
Original paper © Marc A Edwards and Siddhartha Roy, 2016.
7 November 2017