AI content is tainting preprints: how moderators are fighting back

Preprint servers are seeing a rise in submissions seemingly produced by paper mills or with help from AI tools

Preprint servers such as PsyArXiv are contending with suspicious content written by AI systems. Credit: Nature
The preprint’s title — ‘Self-Experimental Report: Emergence of Generative AI Interfaces in Dream States’ — roused psychologist Olivia Kirtley’s suspicions.
When she clicked, her wariness grew. The manuscript, posted in July to PsyArXiv, a site for non-peer-reviewed research in the psychological sciences, was only a few pages long and listed just one author, whose affiliation was not included. And the artificial intelligence (AI) experiment described in the text “was pretty out there”, says Kirtley, who is at the Catholic University of Leuven in Belgium.
So she flagged that preprint and similar ones to PsyArXiv’s managers, who removed them. The dream-states manuscript used AI in its methods but didn’t clearly declare how AI was used, or whether it was used in other elements of the work, meaning that it violated the site’s terms of use, says Dermot Lynott, the head of PsyArXiv’s scientific advisory board and a psychologist at Maynooth University in Ireland.
In response to questions from Nature, a message from the e-mail address of the listed author, Jiazheng Liu, said that AI played a limited part in the preprint’s generation.
PsyArXiv is just one of the many preprint servers — and journals — that are grappling with suspicious submissions. Some papers bear the fingerprints of paper mills, which are services that produce scientific papers on demand. Others show evidence of content written by AI systems, such as fake references, which can be a sign of an AI ‘hallucination’.
Such content poses a conundrum for preprint services. Many are non-profit organizations devoted to making it easier for scientists to publish their work, and screening for low-quality content demands resources and can slow processing of submissions. Such screening also raises questions about which manuscripts to allow. And this influx of dodgy content has its own risks.
“How do you do quality assurance while keeping things relatively light touch so the system doesn’t collapse on itself?” says Katie Corker, liaison from the executive committee of the Society for the Improvement of Psychological Sciences to PsyArXiv’s scientific advisory board. “No one wants a world where the individual reader has to figure out whether something is legitimate scholarship.”
AI growth spurt
The preprint services approached by Nature said that a relatively small proportion of their submissions bear signs of being generated by a large language model (LLM) such as the one that drives OpenAI’s ChatGPT. The operators of the preprint server arXiv, for example, estimate that roughly 2% of their submissions are rejected for being products of AI, paper mills or both.
Richard Sever, head of openRxiv, the New York City-based organization that operates the life-sciences preprint server bioRxiv and the biomedical server medRxiv, says that the two servers combined turn away more than ten manuscripts per day that seem formulaic and might have been AI-generated. The services receive roughly 7,000 submissions per month.
But some say that the situation seems to be getting worse. The moderators of arXiv noticed an uptick in AI-written content soon after the launch of ChatGPT in late 2022, but “we really started thinking there was a crisis sometime in the last three months”, says Steinn Sigurðsson, scientific director at arXiv and an astrophysicist at Penn State University in University Park.
In a statement posted on 25 July, the Center for Open Science, a non-profit organization in Washington DC that hosts PsyArXiv, said that it was “seeing a noticeable rise in submitted papers that appear to be generated or heavily assisted by AI tools”. Lynott confirms that there has been a “small rise” on the site and that the server is working to minimize such content.
The challenges of moderating preprints were illustrated by the dream-states manuscript flagged by Kirtley: shortly after the preprint was removed, a preprint with a nearly identical title and abstract was posted on the site. An e-mail from the address associated with the author said that “AI’s role was limited to mathematical derivations, symbolic computations, assembling and applying existing mathematical tools, formula verification” and eight more tasks. The e-mail’s writer described themselves as an “independent researcher based in China” who has no higher-education degree and whose “only tool is a secondhand smartphone”. The second version of the preprint has also been taken down.
Chatbot helpers
A study1 published last week in Nature Human Behaviour estimates that in September 2024, almost two years after the rollout of ChatGPT, LLMs produced 22% of the content of computer-science abstracts posted on arXiv and roughly 10% of the text in biology abstracts posted on bioRxiv (see ‘Rise of the chatbot’). By comparison, an analysis2 of biomedical abstracts published in journals in 2024 found that 14% contained LLM-generated text.

Chart: ‘Rise of the chatbot’, showing the estimated share of LLM-generated text in arXiv and bioRxiv abstracts. Source: Ref. 1.
Some of the AI text reported in the study could have been generated by scientists who would otherwise struggle to write a manuscript in English, says James Zou, a computer scientist at Stanford University in California and co-author of the Nature Human Behaviour paper.
doi: https://doi.org/10.1038/d41586-025-02469-y
This story originally appeared in Nature. Author: Traci Watson