The large language model does everything from reading the literature to writing and reviewing its own papers, but it has a limited range of applicability so far

Researchers built an ‘AI Scientist’ — what can it do?

Could science be fully automated? A team of machine-learning researchers has now tried.

‘AI Scientist’, created by a team at the Tokyo company Sakana AI and at academic labs in Canada and the United Kingdom, performs the full cycle of research, from reading the existing literature on a problem and formulating hypotheses for new developments to trying out solutions and writing a paper. AI Scientist even does some of the job of peer reviewers, evaluating its own results.

AI Scientist joins a slew of efforts to create AI agents that have automated at least parts of the scientific process. “To my knowledge, no one has yet done the total scientific community, all in one system,” says AI Scientist co-creator Cong Lu, a machine-learning researcher at the University of British Columbia in Vancouver, Canada. The results1 were posted on the arXiv preprint server this month.

“It’s impressive that they’ve done this end-to-end,” says Jevin West, a computational social scientist at the University of Washington in Seattle. “And I think we should be playing around with these ideas, because there could be potential for helping science.”

The output is not earth-shattering so far, and the system can only do research in the field of machine learning itself. In particular, AI Scientist is lacking what most scientists would consider the crucial part of doing science — the ability to do laboratory work. “There’s still a lot of work to go from AI that makes a hypothesis to implementing that in a robot scientist,” says Gerbrand Ceder, a materials scientist at Lawrence Berkeley National Laboratory and the University of California, Berkeley. Still, Ceder adds, “If you look into the future, I have zero doubt in mind that this is where much of science will go.”

Automated experiments

AI Scientist is based on a large language model (LLM). Using a paper that describes a machine-learning algorithm as a template, it starts by searching the literature for similar work. The team then employed a technique called evolutionary computation, which is inspired by the mutations and natural selection of Darwinian evolution. AI Scientist proceeds in steps, applying small, random changes to an algorithm and keeping those that improve its efficiency.

To do so, AI Scientist conducts its own ‘experiments’ by running the algorithms and measuring how they perform. At the end, it produces a paper, and evaluates it in a sort of automated peer review. After ‘augmenting the literature’ this way, the algorithm can then start the cycle again, building on its own results.
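The mutate-and-select loop described above can be sketched in a few lines of code. This is an illustrative toy, not the actual AI Scientist pipeline: in the real system an LLM proposes edits to an algorithm and the candidate is trained and benchmarked, whereas here `mutate` just perturbs numeric parameters and `score` is a hypothetical stand-in fitness function.

```python
import random

def mutate(params):
    """Apply a small random change to one parameter (a stand-in for an
    LLM proposing a tweak to an algorithm)."""
    key = random.choice(list(params))
    return {**params, key: params[key] + random.gauss(0, 0.1)}

def score(params):
    """Toy fitness function: higher is better. A real system would run
    the candidate algorithm and measure its performance here."""
    return -sum(v * v for v in params.values())

def evolve(params, generations=50):
    """Keep a mutation only when it improves on the best score so far."""
    best = score(params)
    for _ in range(generations):
        candidate = mutate(params)
        s = score(candidate)
        if s > best:
            params, best = candidate, s
    return params, best

random.seed(0)
final, fitness = evolve({"lr": 1.0, "momentum": -0.5})
```

Because a candidate is only accepted when its score improves, the final fitness can never be worse than the starting point, mirroring the incremental, selection-driven character of the approach.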

The authors admit that the papers AI Scientist produced contained only incremental developments. Some other researchers were scathing in their comments on social media. “As an editor of a journal, I would likely desk-reject them. As a reviewer, I would reject them,” said one commenter on the website Hacker News.

West also says that the authors took a reductive view of how researchers learn about the current state of their field. A lot of what they know comes from other forms of communication, such as going to conferences or chatting to colleagues at the water cooler. “Science is more than a pile of papers,” says West. “You can have a 5-minute conversation that will be better than a 5-hour study of the literature.”

West’s colleague Shahan Memon agrees, but both praise the authors for having made their code and results fully open. This has enabled them to analyse AI Scientist’s output. They’ve found, for example, that it has a “popularity bias” in its choice of the earlier papers it lists as references, skewing towards those with high citation counts. Memon and West say they are also looking into measuring whether AI Scientist’s choices were the most relevant ones.

Repetitive tasks

AI Scientist is, of course, not the first attempt to automate various parts of a researcher’s job: the dream of automating scientific discovery is as old as artificial intelligence itself, dating back to the 1950s, says Tom Hope, a computer scientist at the Allen Institute for AI who is based in Jerusalem. Already a decade ago, for example, the Automatic Statistician2 was able to analyse sets of data and write up its own papers. And Ceder and his colleagues have even automated some bench work: the ‘robot chemist’ they unveiled last year can synthesize new materials and experiment with them3.

Hope says that current LLMs “are not able to formulate novel and useful scientific directions beyond basic superficial combinations of buzzwords”. Still, Ceder says that even if AI won’t be able to do the more creative parts of the work any time soon, it could still automate many of the more repetitive aspects of research. “At the low level, you’re trying to analyse what something is, how something responds. That’s not the creative part of science, but it’s 90% of what we do.” Lu says he has heard similar feedback from many other researchers, too. “People will say, I have 100 ideas that I don’t have time for. Get the AI Scientist to do those.”

Lu says that to broaden AI Scientist’s capabilities, even to abstract fields beyond machine learning such as pure mathematics, it might need to include techniques beyond language models. Recent results on solving maths problems by Google DeepMind, for example, have shown the power of combining LLMs with techniques of ‘symbolic’ AI, which build logical rules into a system rather than relying merely on it learning from statistical patterns in data. But the current iteration is only a start, he says. “We really believe this is the GPT-1 of AI science,” he says, referring to an early large language model made by OpenAI in San Francisco, California.

The results feed into a debate that is at the top of many researchers’ concerns these days, says West. “All my colleagues in different sciences are trying to figure out, where does AI fit in to what we do? It does force us to think about what science is in the twenty-first century — what it could be, what it is, what it is not,” he says.

doi: https://doi.org/10.1038/d41586-024-02842-3

This story originally appeared in Nature. Author: Davide Castelvecchi