Chemistry's Reproducibility Crisis

3/7/2020 Chemistry’s reproducibility crisis that you’ve probably never heard of | News | Chemistry World
SOURCE: © ROYAL SOCIETY OF CHEMISTRY; ELEMENTS © SHUTTERSTOCK
Computational chemistry faces a coding crisis

BY JAMIE DURRANI | 1 JULY 2020
In October last year, a team of natural product chemists discovered a glitch in a widely used piece
of NMR software. Buried deep inside the code was a simple file sorting issue, which on certain
operating systems led to incorrect values being predicted for chemical shifts. The finding cast
uncertainty over results published in more than 150 scientific papers over a five-year period.
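The article does not reproduce the faulty code, so the following is only a minimal sketch of how this class of bug can arise, assuming a Python script that pairs output files with labels by position. The file names, the labels and the `pair_files_with_labels` helper are all hypothetical. Directory listings can come back in an order that depends on the operating system and file system, so the two lists below stand in for the same folder read on two different machines.

```python
def pair_files_with_labels(filenames, labels, sort=True):
    """Pair each label with a file name by position.

    With sort=False the pairing follows whatever order the listing
    arrived in, which can differ between operating systems.
    """
    names = sorted(filenames) if sort else list(filenames)
    return dict(zip(labels, names))


# Hypothetical listings of the same three files on two machines.
listing_machine_1 = ["conf_1.out", "conf_2.out", "conf_3.out"]
listing_machine_2 = ["conf_3.out", "conf_1.out", "conf_2.out"]
labels = ["A", "B", "C"]

unsorted_1 = pair_files_with_labels(listing_machine_1, labels, sort=False)
unsorted_2 = pair_files_with_labels(listing_machine_2, labels, sort=False)
sorted_1 = pair_files_with_labels(listing_machine_1, labels)
sorted_2 = pair_files_with_labels(listing_machine_2, labels)

print(unsorted_1 == unsorted_2)  # False: the pairing depends on listing order
print(sorted_1 == sorted_2)      # True: sorting makes the pairing deterministic
```

Plain lexicographic sorting places `conf_10.out` before `conf_2.out`, so a real script would also need zero-padded numbers or a numeric sort key.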
Ten years is a long time in this field in terms of architecture developments, compiler
developments, all sorts of developments
LYNN KAMERLIN, UPPSALA UNIVERSITY
https://www.chemistryworld.com/news/chemistrys-reproducibility-crisis-that-youve-probably-never-heard-of/4011693.article
This is not the first time that an error in a piece of software code has cast a shadow over
computational research. These sorts of issues are actually surprisingly common. In one famous
case, a coding error was at the heart of a seven-year dispute between some of the world’s top
theoretical chemists, who were trying to model the phases of supercooled water. And recently, an
algorithm used in older versions of the popular molecular dynamics software Gromacs was found
to introduce order-of-magnitude errors during simulations.
Ideally, code will be well documented and publicly available, allowing researchers to scrutinise
scripts and locate problems. But this isn’t always the case – traditional publishing practices, as
well as concerns around intellectual property, often mean that code is difficult or even impossible
to access.
Even when source code is open for all to see, other factors can complicate matters. Computer
programs tend to rely on an array of other pieces of software, which are continually being updated
and new versions introduced. This makes repeating the exact conditions that a computational
study was originally performed under surprisingly difficult. These problems have become so
widespread that a ‘reproducibility crisis’ is now a major concern among computational scientists.
The answer to what?
‘One of my old Python scripts depends on 200 software packages directly or indirectly, and all
these changed over time,’ says Konrad Hinsen, who develops molecular dynamics software at the
French National Center for Scientific Research in Orléans. ‘In the end it becomes very difficult to
run the stuff – and even if you can run it, it doesn’t mean you get the same numbers out.’ Hinsen
explains that even if older scientific programs still run and produce a result, it’s not always clear
what they have actually calculated. ‘It’s like the famous thing in the Hitchhiker’s Guide to the
Galaxy: the answer is 42 – but it’s the answer to which question?’ he says.
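One modest way to make that question answerable later is to record the software environment alongside the results. The sketch below is an illustration only, not Hinsen's own method: it uses Python's standard `importlib.metadata` module to list the installed packages and their versions, and the `environment_snapshot` function name and the suggested `environment.json` file are assumptions made for this example.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Collect interpreter, platform and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
            version = dist.version
        except Exception:  # broken install without readable metadata
            continue
        if name:
            packages[name] = version
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


snapshot = environment_snapshot()
# Store this text next to the results, for example in a file named environment.json.
snapshot_text = json.dumps(snapshot, indent=2)
print(len(snapshot["packages"]), "packages recorded")
```

A snapshot like this only documents what was installed. It does not guarantee that the same versions can be reinstalled years later, and it says nothing about compilers or system libraries.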
Hinsen is particularly interested in computational science methodology and has serious concerns
about reproducibility in the field. A few years ago, he co-founded the journal ReScience C to
create a space where people tasked with revisiting old code could share their results. ‘We wanted
to improve the problem where you have many computational prescriptions in papers which are
incomplete and code which is lacking and then at some point nobody reads up what is being
done,’ says Hinsen. ‘Even within the lab, one previous student does something and the following
one can’t pick up the work because nowhere is it properly documented.’
ReScience C recently launched a challenge asking authors to go back and see if they could recreate
the results of studies they published at least 10 years ago. Hinsen explains that the challenge is
all about making computational science ‘more understandable, more transparent and more durable’. To
take part himself, Hinsen re-examined two old programs of his own – one written 10 years ago and
another from the mid-nineties. Curiously, the 25-year-old code still runs perfectly, whereas the
more sophisticated 10-year-old code doesn’t. The failure of the modern code turned out to be due to
intentional changes made in 2014 to the Python libraries on which it relied.
Source: Courtesy of Nicolas P Rougier
Computational biochemist Lynn Kamerlin explains that legacy issues tend to be more problematic
when code has become dormant. With ‘living’ code – methods that are actively used by a research
community – bugs tend to be spotted and fixed quickly. When mothballed methods are revisited
the situation gets much harder. Kamerlin describes the problems she had when working with old
code that had then been forgotten. ‘This was something [a colleague] developed 15–20 years
earlier and never thought about, and they put it on tape gathering dust somewhere in their office,’
she recalls. The tape was eventually located, but in a form that was incompatible with modern
hardware. ‘We couldn’t find a way to actually read the tape – we ended up having to re-implement
it from scratch,’ says Kamerlin.
While this is an extreme example, it illustrates the problems that can arise in the fast-moving world
of computing. ‘Ten years is a long time in this field in terms of architecture developments, compiler
developments, all sorts of developments,’ says Kamerlin. ‘And so if you haven’t looked at [a
program] for 10 years, there’s no guarantee you can actually run the code – so you can get
basically the software equivalent of my tape reader problem.’
Black box problem
With the growing use of machine learning models to solve chemistry problems, the issue of
reproducibility in AI studies is particularly worrisome. ‘The obvious problem is that you need a
huge amount of training data and you should, in theory, keep a copy of that and make it publicly
available so people can redo these things later,’ says Hinsen. ‘And this is often difficult, simply
because of the size of the data – you may not be able to store it or publish it, it easily gets lost,
also it often gets updated quickly and then you don’t know which version you used.’
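The 'which version did I use' part of this problem can be narrowed by recording a checksum of the data file with each result, even when the data itself is too large to publish. The following is a minimal sketch, assuming the training data sits in a single file. The `sha256_of_file` helper and the small CSV file are hypothetical stand-ins, and only the standard `hashlib` module is used.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small stand-in file; real training data would be far larger.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "training_data.csv"
    data_file.write_text("smiles,label\nCCO,1\n")
    original = sha256_of_file(data_file)

    data_file.write_text("smiles,label\nCCO,0\n")  # a silent update to the data
    updated = sha256_of_file(data_file)

print(original != updated)  # True: any change to the file changes the digest
```

A checksum identifies a data version but does not preserve it, so it complements archiving the data and cannot replace it.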
Basically teamwork, openness, transparency – I think this is really the only way forward to ensure
the safety of code
LYNN KAMERLIN, UPPSALA UNIVERSITY
One issue is that many of the people who write computational chemistry programs are not formally
trained software developers – they tend to be chemists who are trying to solve a problem for which
no software is readily available. As a result, programming practices often lag behind what would be
considered best practice within the computer science community.
‘Today you can publish a paper on machine learning in chemistry where you test [a model] on one
or two benchmarks and only compare with selected baselines, which may not necessarily be
state-of-the-art,’ says Massachusetts Institute of Technology’s Regina Barzilay, who develops
deep learning methods for drug discovery. ‘This is a serious problem that makes it hard to see if a
new method is really an advance.’ Barzilay explains that such an approach would be unacceptable
in her core discipline of computer science, where new models must be assessed against as many
public datasets as possible, to ensure reproducibility. ‘Unfortunately, this level of testing is still not
a common practice in AI and chemistry. I hope it will change,’ she adds.
So what can be done to increase the lifetime of computational methods?
Hinsen recommends that all students starting out in computational work should have access to
basic training in good programming practice, to help ensure that they can keep track of projects
and avoid accidental losses of data. He has taught courses for the Software Carpentry network,
which offers workshops and training in essential lab skills for research computing, and has also
organised a massive open online course (MOOC) covering crucial techniques such as file
management, version control and backing up data. While it’s less comprehensive than in-person
training, Hinsen points out that a MOOC can reach thousands of people in a single session.
A prisoner of software
Kamerlin stresses the importance of documenting everything that goes into a computational study
– the need to publish not just scripts, but other details like the compiler used and the software’s overall
architecture. She points to free online repositories like GitHub and Zenodo where researchers can
store all of the code and additional data used. Kamerlin explains that opening up code to the
community can help ensure it is used and kept active, rather than being discarded as soon as it
has helped solve a problem. ‘Basically teamwork, openness, transparency – I think this is really
the only way forward to ensure the safety of code,’ she says.
The reproducibility issue – to address it demands more than just the mere moral imperative of
being open
ALEXANDRE HOCQUET, UNIVERSITY OF LORRAINE
Hinsen agrees that documenting every aspect of a computational method is essential, but believes
that more is needed than just making code and data files publicly available. Source code for
scientific software is a complicated mixture of calculations, approximations and technical
computing mechanisms required for memory management, processing data sets and optimising
performance. As a result, code can often be almost indecipherable to anyone other than the
original developer. According to Hinsen, this has led to a situation where the complex models
underpinning many computational studies have become ‘imprisoned’ by scientific software – often
the only place where these models actually exist.
To illustrate this point, Hinsen describes the biomolecular simulation of a protein, which would
typically be defined by a function comprising thousands of coordinates. ‘In theory you should put it
in the paper – but you can’t because no publisher wants to have a 50 page equation that
describes the function of 5000 variables and nobody wants to read it and nobody could re-compute
it anyway,’ he says. ‘So what happens is that somewhere in the software there is an
implementation to run it, but nobody knows exactly what it does.’
Hinsen has called for a new digital scientific notation to help scientists regain control of their
codes. This would create a formal language that would enable coding information to be published
and scrutinised in a way that is suitable for the digital age. Such a language would be readable for
both humans and machines, enabling peer review of the scientific models and software verification
of the computational methods. Hinsen hopes that this will prevent scientific software packages
being used in a ‘black box’ fashion, as is often the case at the moment.
Open source
The problems surrounding the transparency of code highlight a fundamental dilemma for
computational science – should all code be open for all to see? If not, then how can the methods
be reproduced and built upon by peers? But if so, do computational scientists risk giving up the
fruits of their labour for free? Would it even be practical – for developers or users? These
questions have been hotly debated among computational chemists for decades.
For Alexandre Hocquet, a former computational chemist who is now a science historian at the
University of Lorraine, France, the questions shine a light on the complex ways that business and
law are embedded into scientific behaviour. Hocquet points out that while the drive for open
software makes sense on many levels, the makers of commercial software would argue that their
model provides the necessary resources for software to be maintained. Without the revenue
generated by proprietary licences, how can you support a workforce that will update programs and
keep them alive? As evidence of this, Hocquet highlights the success of one of computational
chemistry’s most well-known software packages. Gaussian’s strict licensing terms have long
attracted the ire of many scientists, yet the program still leads the field today, more than 50 years
after it was first developed.
‘The reproducibility issue – to address it demands more than just the mere moral imperative of
being open,’ says Hocquet. ‘There are a lot of governance issues, a lot of licensing issues. Not
every free software licence is the same – they have politics embedded into them.’
Hocquet points out that many scientific instruments are developed and maintained by companies,
without regular users questioning their inner workings. ‘When you buy a Bruker NMR
spectrometer, you can’t expect to know exactly what is going on inside. You rely on the
standardisation of the scientific instrument of the corporate entity, which you trust has been
developed, maintained, standardised, calibrated,’ says Hocquet. ‘There’s a parallel here between
“should [software] be open or not?”, and “what is trust in a scientific instrument?” – the moral
imperative to be open is far less developed when we talk about NMR spectroscopy, for example.’
Perhaps then, one positive to come out of the reproducibility crisis is that it has opened up a
conversation where fundamental scientific philosophies can take centre stage. ‘The computational
chemistry domain is actually a scientific field where the kind of issues, the two visions of what is
trust in science are actually debated,’ says Hocquet. ‘In other fields, you don’t see those debates.’