
Academics say this is just the beginning.

The rising expectations of an emboldened labour movement were on full display on 23 December, when more than 35% of the members of two unions representing UC graduate students voted against accepting university officials' offer and ending the strike.

Missed opportunity?
One of the organizers of the vote-no campaign was Dylan Kupsh, a graduate researcher in computer science at UCLA. Kupsh was in close contact with union organizers at Columbia University, where student workers rejected an initial contract proposal and went on to secure further concessions after a ten-week strike that ended last January.

In the end, UC graduate students received a range of new benefits, including increased childcare subsidies; protections against bullying, discrimination and harassment; and a new schedule for salaries. Incoming graduate students, for example, will see their annual salary increase from around US$22,000 to $30,500. "We could have won a lot more, and it's sad that we didn't get there," Kupsh says. "We're going to have to repeat in another 2.5 years."

For Barry Eidlin, a sociologist at McGill University in Montreal, Canada, who studies the labour movement, the scale of the vote-no campaign is yet another sign of changing expectations in academia. "In the past, academic workers have felt like they should just keep their heads down and be grateful they have a job," he says. "The idea that people now expect more, and are willing to fight for more, seems to me a welcome shift in perspective."

ABSTRACTS WRITTEN BY CHATGPT FOOL SCIENTISTS

Researchers cannot always differentiate between AI-generated and original abstracts.

By Holly Else

An artificial-intelligence (AI) chatbot can write such convincing fake research-paper abstracts that scientists are often unable to spot them, according to a preprint posted on the bioRxiv server in late December¹. Researchers are divided over the implications for science.

"I am very worried," says Sandra Wachter, who studies technology and regulation at the University of Oxford, UK, and was not involved in the research. "If we're now in a situation where the experts are not able to determine what's true or not, we lose the middleman that we desperately need to guide us through complicated topics," she adds.

The chatbot, ChatGPT, creates realistic text in response to user prompts. It is a 'large language model', a system based on neural networks that learn to perform a task by digesting huge amounts of existing human-generated text. Software company OpenAI, based in San Francisco, California, released the tool on 30 November, and it is free to use.

Since its release, researchers have been grappling with the ethical issues surrounding its use, because much of the chatbot's output can be difficult to distinguish from human-written text. Scientists have published a preprint² and an editorial³ written by ChatGPT. Now, a group led by Catherine Gao at Northwestern University in Chicago, Illinois, has used ChatGPT to generate artificial research-paper abstracts to test whether scientists can spot them.

The researchers asked the chatbot to write 50 medical-research abstracts based on a selection published in JAMA, The New England Journal of Medicine, The BMJ, The Lancet and Nature Medicine. They then compared these with the original abstracts by running them through a plagiarism detector and an AI-output detector, and asked a group of medical researchers to spot the fabricated abstracts.

Under the radar
The ChatGPT-generated abstracts sailed through the plagiarism checker: the median originality score was 100%, which indicates that no plagiarism was detected. The AI-output detector spotted 66% of the generated abstracts. But the human reviewers didn't do much better: they correctly identified only 68% of the generated abstracts and 86% of the genuine abstracts. They incorrectly identified 32% of the generated abstracts as being real and 14% of the genuine abstracts as being generated.

Wachter says that, if scientists can't determine whether research is true, there could be "dire consequences". As well as being problematic for researchers, who could be pulled down flawed routes of investigation because the research they are reading has been fabricated, there are "implications for society at large because scientific research plays such a huge role in our society". For example, it could mean that research-informed policy decisions are incorrect, she adds.

But Arvind Narayanan, a computer scientist at Princeton University in New Jersey, says: "It is unlikely that any serious scientist will use ChatGPT to generate abstracts." He adds that whether generated abstracts can be detected is "irrelevant". "The question is whether the tool can generate an abstract that is accurate and compelling. It can't, and so the upside of using ChatGPT is minuscule, and the downside is significant," he says.

Irene Solaiman, who researches the social impact of AI at Hugging Face, an AI company with headquarters in New York and Paris, has fears about any reliance on large language models for scientific thinking. "These models are trained on past information, and social and scientific progress can often come from thinking, or being open to thinking, differently from the past," she adds.

The authors suggest that those evaluating scientific communications, such as research papers and conference proceedings, should put policies in place to stamp out the use of AI-generated texts. If institutions choose to allow use of the technology in certain cases, they should establish clear rules around disclosure. This month, the Fortieth International Conference on Machine Learning, which will be held in Honolulu, Hawaii, in July, announced that it has banned papers written by ChatGPT and other AI language tools.

Solaiman adds that in fields where fake information can endanger people's safety, such as medicine, journals might have to take a more rigorous approach to verifying information as accurate.

Narayanan says that the solutions to these issues should not focus on the chatbot itself, "but rather the perverse incentives that lead to this behaviour, such as universities conducting hiring and promotion reviews by counting papers with no regard to their quality or impact".

1. Gao, C. A. et al. Preprint at bioRxiv https://doi.org/10.1101/2022.12.23.521610 (2022).
2. Blanco-Gonzalez, A. et al. Preprint at https://arxiv.org/abs/2212.08104 (2022).
3. O'Connor, S. & ChatGPT. Nurse Educ. Pract. 66, 103537 (2023).
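The reviewer figures reported in the study are internally consistent, and that consistency can be checked with a few lines of arithmetic. The sketch below assumes, purely for illustration, an evaluation set with equal numbers of generated and genuine abstracts; the preprint's exact counts may differ, in which case the balanced accuracy shown would not equal overall accuracy.

```python
# Reviewer hit rates reported in the study (as fractions).
correct_on_generated = 0.68  # generated abstracts correctly flagged as AI-written
correct_on_genuine = 0.86    # genuine abstracts correctly identified as real

# The reported error rates are simply the complements of the hit rates.
missed_generated = 1 - correct_on_generated  # labelled "real" despite being generated
flagged_genuine = 1 - correct_on_genuine     # labelled "generated" despite being real

# Balanced accuracy: the mean of the two per-class hit rates. It equals
# overall accuracy only under the illustrative equal-class assumption above.
balanced_accuracy = (correct_on_generated + correct_on_genuine) / 2

print(f"missed generated:  {missed_generated:.0%}")   # 32%
print(f"flagged genuine:   {flagged_genuine:.0%}")    # 14%
print(f"balanced accuracy: {balanced_accuracy:.0%}")  # 77%
```

The 32% and 14% error rates in the article are exactly the complements of the 68% and 86% hit rates, and a balanced accuracy of 77% puts the human reviewers only modestly ahead of the AI-output detector's 66% detection rate on generated abstracts.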

Nature | Vol 613 | 19 January 2023 | 423


© 2023 Springer Nature Limited. All rights reserved.
