Google researchers make AI tech solve math puzzles “beyond human knowledge”

Artificial intelligence researchers claim to have made the world’s first genuine scientific discovery using a large language model (LLM), the technology behind ChatGPT and similar programs. If the claim holds, it signals a major breakthrough.

The discovery was made by Google DeepMind, an AI research laboratory where scientists are investigating whether LLMs can do more than just repackage information learned in training and actually generate new insights.

It turns out that they can, and the implications are potentially huge. DeepMind said in a blog post that its FunSearch, a method to search for new solutions in mathematics and computer science, made “the first discoveries in open problems in mathematical sciences using LLMs.”

This is important because LLMs are not known to generate new knowledge, even though they excel at learning the patterns of language, including computer code, from vast amounts of text and other data.

On the contrary, since the arrival of ChatGPT last year, LLMs have regularly suffered from hallucinations, in which the models produce plausible and fluent but incorrect statements.

“What if we could harness the creativity of LLMs by identifying and building upon only their very best ideas?” DeepMind asked.

In a paper published in Nature, the lab has now introduced FunSearch, which is short for “searching in the function space.” DeepMind’s idea was to make an LLM write solutions to problems in the form of computer programs.

The LLM was then paired with an “evaluator” that automatically ranked the programs by how well they performed. The best performers were combined and fed back to the LLM for improvement, and the cycle was repeated until the programs were good enough to yield new knowledge.
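The loop described above can be illustrated with a toy sketch. This is not DeepMind's code: the names `evaluate`, `propose`, and `search` are hypothetical, the "programs" are reduced to pairs of numbers, and random blending stands in for the LLM. It only shows the shape of the cycle: generate candidates, score them, keep the best, and build on them.

```python
import random

def evaluate(candidate):
    # Hypothetical evaluator: higher is better. Toy objective peaking at (3, -1).
    a, b = candidate
    return -((a - 3.0) ** 2 + (b + 1.0) ** 2)

def propose(parents, rng):
    # Stand-in for the LLM (an assumption): blend two strong candidates
    # and perturb the result slightly.
    p, q = rng.sample(parents, 2)
    return tuple((x + y) / 2 + rng.gauss(0, 0.3) for x, y in zip(p, q))

def search(rounds=200, pool_size=10, seed=0):
    rng = random.Random(seed)
    pool = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pool_size)]
    for _ in range(rounds):
        # Rank candidates and keep the top half unchanged.
        pool.sort(key=evaluate, reverse=True)
        parents = pool[: pool_size // 2]
        # Refill the pool with new candidates derived from the best ones.
        children = [propose(parents, rng) for _ in range(pool_size - len(parents))]
        pool = parents + children
    return max(pool, key=evaluate)

best = search()
print(best, evaluate(best))
```

Because the top-ranked candidates are carried over unchanged each round, the best score in the pool never gets worse. In the real system the candidates are computer programs and the proposals come from a language model, which this sketch does not attempt to reproduce.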

In a sense, DeepMind has tried to push the boundary of existing LLM-based approaches. And it worked, the lab says.

“Applying FunSearch to a central problem in extremal combinatorics – the cap set problem – we discovered new constructions of large cap sets going beyond the best-known ones, both in finite-dimensional and asymptotic cases. This represents the first discoveries made for established open problems using LLMs,” said DeepMind.

The longstanding cap set problem asks for the largest set of points in a high-dimensional grid such that no three points lie on a straight line. FunSearch created programs that generate new large cap sets that go beyond the best that human mathematicians have come up with.
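To make the "no three points on a line" condition concrete, here is a minimal brute-force checker. It assumes the standard formulation of the problem, which the article does not spell out: points are vectors whose coordinates are 0, 1, or 2, and three distinct points lie on a line exactly when their coordinates sum to zero modulo 3. The function name `is_cap_set` is hypothetical, and this is a way to verify a candidate set, not the way FunSearch constructs one.

```python
from itertools import combinations

def is_cap_set(points):
    """Return True if no three distinct points lie on a line (coordinates mod 3)."""
    unique = list(set(map(tuple, points)))
    for a, b, c in combinations(unique, 3):
        # Three distinct points are collinear iff every coordinate sums to 0 mod 3.
        if all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True

# Four points in two dimensions with no three on a line.
print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)]))  # True
# Three points in the same column form a line.
print(is_cap_set([(0, 0), (0, 1), (0, 2)]))  # False
```

Checking every triple is only practical for small sets. The difficulty of the problem lies in finding large valid sets as the number of dimensions grows.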

For the second puzzle, FunSearch was deployed to discover more effective algorithms for the bin packing problem. The problem covers tasks ranging from finding the most efficient way to arrange boxes in a shipping container to scheduling computing jobs in datacenters. In the online version, each item has to be packed as it arrives, without knowing what comes next.

Usually, the problem is tackled with simple rules of thumb: pack each item into the first bin that has space (first fit), or into the bin with the least available space in which the item will still fit (best fit). FunSearch found an approach that avoided leaving small gaps that were unlikely ever to be filled, according to the results.
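The two conventional rules can be sketched in a few lines. This is an illustrative version only, assuming integer item sizes that never exceed the bin capacity; the function names are hypothetical, and the heuristic FunSearch discovered is not reproduced here.

```python
def first_fit(items, capacity):
    """Place each item in the first bin with enough room; return bins used."""
    free_space = []  # remaining capacity of each open bin
    for size in items:
        for i, free in enumerate(free_space):
            if size <= free:
                free_space[i] -= size
                break
        else:
            # No open bin can take the item, so open a new one.
            free_space.append(capacity - size)
    return len(free_space)

def best_fit(items, capacity):
    """Place each item in the fullest bin that still fits it; return bins used."""
    free_space = []
    for size in items:
        fitting = [i for i, free in enumerate(free_space) if size <= free]
        if fitting:
            tightest = min(fitting, key=lambda i: free_space[i])
            free_space[tightest] -= size
        else:
            free_space.append(capacity - size)
    return len(free_space)

items = [5, 6, 4, 5]
print(first_fit(items, 10))  # 3
print(best_fit(items, 10))   # 2
```

In this small example, first fit puts the item of size 4 next to the 5, leaving a gap of 1 that nothing fills, while best fit pairs it with the 6 and saves a bin. Neither rule is always better; the outcome depends on the order and sizes of the items.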

“What makes FunSearch a particularly powerful scientific tool is that it outputs programs that reveal how its solutions are constructed, rather than just what the solutions are. We hope this can inspire further insights in the scientists who use FunSearch, driving a virtuous cycle of improvement and discovery,” said DeepMind.

This show-your-working approach is indeed how scientists generally operate, with new discoveries explained through the process used to produce them.

“The solutions generated by FunSearch are far conceptually richer than a mere list of numbers. When I study them, I learn something,” said Jordan Ellenberg, professor of mathematics at the University of Wisconsin-Madison and a co-author of the paper.
