DeepMind has used a large language model (LLM) to make new discoveries on one of mathematics’ notoriously tough open problems, in a breakthrough that could herald a new era in AI development.
The model, known as FunSearch, made new discoveries about the so-called “cap set problem.” The decades-old math conundrum essentially comes down to how many dots you can jot down on a page while drawing lines between them, without three of them ever forming a straight line. The real problem asks that question in high-dimensional grids rather than on a flat page.
If that gave you a migraine, don’t worry. What’s important to note is that the problem has never been solved, and researchers have only ever found solutions for small dimensions. Until now.
FunSearch successfully discovered new constructions for large cap sets that far exceeded the best-known ones. While the LLM didn’t solve the cap set problem once and for all (contrary to some of the news headlines swirling around), it did find facts new to science.
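For readers who want to see what “no three dots in a line” means in practice, here is a minimal Python sketch of a cap set check. It assumes the standard mathematical framing, in which each point is a tuple of 0s, 1s and 2s and three distinct points lie on a line exactly when they add up to zero (mod 3) in every coordinate. The function name `is_cap_set` is ours, for illustration only, and is not taken from DeepMind’s code.

```python
from itertools import combinations


def is_cap_set(points):
    """Return True if no three distinct points sum to zero (mod 3) coordinate-wise."""
    pts = set(points)
    for a, b in combinations(pts, 2):
        # Two distinct points determine the third point on their line:
        # c = -(a + b) mod 3. If c is also in the set, the set has a line.
        c = tuple((-(x + y)) % 3 for x, y in zip(a, b))
        if c in pts:
            return False
    return True


# Four points on a 3x3 grid with no line among them.
print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)]))
# The diagonal (0,0), (1,1), (2,2) is a line, so this is not a cap set.
print(is_cap_set([(0, 0), (1, 1), (2, 2)]))
```

Checking a proposed set is the easy part. The hard part, and the one FunSearch was aimed at, is finding large sets that pass the check as the number of dimensions grows.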
“To the best of our knowledge, this shows the first scientific discovery – a new piece of verifiable knowledge about a notorious scientific problem – using an LLM,” wrote the researchers in a paper published in Nature this week.
In previous experiments, researchers have used large language models to solve math problems with known solutions.
FunSearch works by combining a pre-trained LLM, in this case a version of Google’s PaLM 2, with an automated “evaluator.” This fact-checker automatically runs and scores what the LLM proposes, guarding against the production of false information.
LLMs have been shown to regularly produce so-called “hallucinations” — basically when they just make shit up and present it as fact. This has, naturally, limited their usefulness in making verifiable scientific discoveries. However, researchers at the London-based lab claim that the use of an in-built fact-checker makes FunSearch different.
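FunSearch engages in a continuous back-and-forth dance between the LLM and the evaluator. The LLM proposes candidate solutions in the form of computer programs, the evaluator scores them, and the best ones are fed back to the LLM to improve on. Over many rounds, this process transforms initial solutions into new knowledge.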
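To make that back-and-forth concrete, here is a heavily simplified Python sketch of a propose-and-evaluate loop on the cap set problem. It is not DeepMind’s implementation. The `propose` function is a hypothetical stand-in that randomly tweaks numbers where FunSearch would ask an LLM to rewrite a program, the loop keeps only a single best candidate, and all names and parameters are our own assumptions.

```python
import random
from itertools import combinations, product


def third_point(a, b):
    """The point that completes a line through a and b (coordinates mod 3)."""
    return tuple((-(x + y)) % 3 for x, y in zip(a, b))


def is_cap_set(points):
    """Independent check: no two points may have their third point in the set."""
    pts = set(points)
    return not any(third_point(a, b) in pts for a, b in combinations(pts, 2))


def build_cap_set(priority, n):
    """Greedy builder: visit points in priority order, keep those that form no line."""
    points = sorted(product(range(3), repeat=n), key=priority, reverse=True)
    chosen, blocked = [], set()
    for p in points:
        if p in blocked:
            continue
        # Keeping p rules out the third point on every line through p
        # and a point we already kept.
        for q in chosen:
            blocked.add(third_point(p, q))
        chosen.append(p)
    return chosen


def make_priority(weights):
    """Turn a table of weights into a priority function over points."""
    return lambda p: sum(weights[i][v] for i, v in enumerate(p))


def evaluate(weights, n):
    """The evaluator: run the candidate, verify the result, score it by size."""
    cap = build_cap_set(make_priority(weights), n)
    return len(cap) if is_cap_set(cap) else 0


def propose(weights, rng):
    """Hypothetical stand-in for the LLM: randomly nudge one weight."""
    new = [row[:] for row in weights]
    new[rng.randrange(len(new))][rng.randrange(3)] += rng.uniform(-1, 1)
    return new


def search(n=3, rounds=200, seed=0):
    """Simplified loop: keep a candidate only if the evaluator scores it at least as well."""
    rng = random.Random(seed)
    best = [[0.0] * 3 for _ in range(n)]
    best_score = evaluate(best, n)
    for _ in range(rounds):
        candidate = propose(best, rng)
        score = evaluate(candidate, n)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    _, score = search()
    print("largest cap set found in 3 dimensions:", score)
```

The point of the design is that nothing the proposer suggests is taken on trust. Every candidate is run and checked by the evaluator before it can influence the next round, which is what keeps made-up answers out of the results.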
What also makes the tool quite promising for scientists is that it outputs programs that reveal how its solutions are constructed, rather than just what the solutions are.
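The difference between a program and a bare answer is easy to illustrate. The short Python sketch below is a textbook construction, not one of FunSearch’s discoveries, and the function name `construct` is our own. Instead of listing points, it states a rule: use only the digits 0 and 1 in every coordinate. Three such points can only add up to zero (mod 3) in a coordinate if they all share the same digit there, so no three distinct points ever form a line.

```python
from itertools import product


def construct(n):
    """Every length-n point that uses only the digits 0 and 1 (never 2)."""
    return list(product((0, 1), repeat=n))


# The same rule works in any dimension and yields 2**n points.
print(len(construct(3)))
print(len(construct(8)))
```

A raw list of points would tell a mathematician that a cap set of a given size exists. A rule like this also says why, and it carries over to other dimensions without further work.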
“We hope this can inspire further insights in the scientists who use FunSearch, driving a virtuous cycle of improvement and discovery,” said the researchers.