2024 MathematicalDiscoveriesfromProg
- (Romera-Paredes et al., 2024) ⇒ Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M. Pawan Kumar, Emilien Dupont, Francisco J. R. Ruiz, Jordan S. Ellenberg, Pengming Wang, Omar Fawzi, Pushmeet Kohli, and Alhussein Fawzi. (2024). “Mathematical Discoveries from Program Search with Large Language Models.” In: Nature, 625(7995). doi:10.1038/s41586-023-06924-6
Subject Headings:
Notes
- The paper introduces FunSearch, an evolutionary procedure that pairs a pretrained LLM with a systematic evaluator to surpass the best-known results in important problems, pushing the boundary of existing LLM-based approaches.
- The paper applies FunSearch to the cap set problem in extremal combinatorics, discovering new constructions of large cap sets beyond the best-known ones in both finite dimensional and asymptotic cases.
- The paper demonstrates the generality of FunSearch by applying it to the online bin packing problem, finding new heuristics that improve on widely used baselines.
- The paper highlights that unlike most computer search approaches, FunSearch searches for programs that describe how to solve a problem, rather than just finding the solution.
- The paper shows that programs discovered by FunSearch tend to be more interpretable than raw solutions, enabling effective feedback loops between domain experts and FunSearch, and facilitating the deployment of these programs in real-world applications.
- The paper illustrates the potential of LLMs in making scientific discoveries by providing verifiable new knowledge about notorious scientific problems through the FunSearch methodology.
- The paper argues that the discovered programs' interpretability and conciseness make them more scalable and easier to deploy, especially compared to other methods like neural networks, which often require specialized hardware and complex verification processes.
Cited By
Quotes
Abstract
Large language models (LLMs) have demonstrated tremendous capabilities in solving complex tasks, from quantitative reasoning to understanding natural language. However, LLMs sometimes suffer from confabulations (or hallucinations), which can result in them making plausible but incorrect statements. This hinders the use of current large models in scientific discovery. Here we introduce FunSearch (short for searching in the function space), an evolutionary procedure based on pairing a pretrained LLM with a systematic evaluator. We demonstrate the effectiveness of this approach to surpass the best-known results in important problems, pushing the boundary of existing LLM-based approaches. Applying FunSearch to a central problem in extremal combinatorics — the cap set problem — we discover new constructions of large cap sets going beyond the best-known ones, both in finite dimensional and asymptotic cases. This shows that it is possible to make discoveries for established open problems using LLMs. We showcase the generality of FunSearch by applying it to an algorithmic problem, online bin packing, finding new heuristics that improve on widely used baselines. In contrast to most computer search approaches, FunSearch searches for programs that describe how to solve a problem, rather than what the solution is. Beyond being an effective and scaleable strategy, discovered programs tend to be more interpretable than raw solutions, enabling feedback loops between domain experts and FunSearch, and the deployment of such programs in real-world applications.
Introduction
- NOTE: Discusses the capabilities and limitations of large language models (LLMs), particularly their tendency to confabulate, and introduces FunSearch as a solution to improve LLM-based approaches in scientific discovery.
FunSearch
- NOTE: Explains the evolutionary procedure in detail, including its components and how it surpasses existing LLM-based methods.
FunSearch Algorithm: 1. Initialize with a pretrained LLM and an evaluator 2. Set the initial program 3. Repeat until an optimal program is found: a. Create a prompt using the current best program b. Use LLM to generate new programs c. Evaluate the new programs d. Update the current best program with the highest-scoring new program 4. Return the optimal program
Overview of FunSearch
- NOTE: Describes the general methodology and workflow of FunSearch, emphasizing its innovative approach to problem solving.
Specification
- NOTE: Outlines the problem specification process, including the use of an 'evaluate' function and initial program implementation.
Pretrained LLM
- NOTE: Discusses the role of the large language model in FunSearch, highlighting its creative function and performance considerations.
Evaluation
- NOTE: Explains how generated programs are assessed and scored based on their performance on specific tasks.
Programs database
- NOTE: Details the management of a diverse pool of correct programs, crucial for maintaining exploratory capability.
Prompt
- NOTE: Elaborates on the creation of new prompts by combining high-scoring programs from the database for further LLM refinement.
Best-shot Prompting Algorithm: 1. Select top-scoring programs from the database 2. Create a prompt by combining these programs 3. Feed the prompt to the LLM to generate a new program 4. Return the new program
Distributed approach
- NOTE: Describes the implementation of FunSearch as a distributed system, leveraging parallelism and scalability for efficient problem solving.
Extremal Combinatorics
- NOTE: Covers the application of FunSearch to problems in extremal combinatorics, such as the cap set problem.
Cap sets
- NOTE: Focuses on the discovery of new constructions for large cap sets, demonstrating FunSearch's capability in finding superior solutions.
Cap Set Problem Algorithm: 1. Initialize dimension and priority function 2. Use FunSearch to evolve the priority function 3. Construct cap set: a. For each vector in the space: i. If the vector satisfies cap set constraints, add it to the cap set 4. Return the cap set
Admissible sets
- NOTE: Discusses the evolution of admissible sets, leading to significant improvements in lower bounds on the cap set capacity.
Admissible Sets Algorithm: 1. Initialize dimension and priority function 2. Use FunSearch to evolve the priority function 3. Construct admissible set: a. For each vector in the space: i. If the vector satisfies admissible set constraints, add it to the set 4. Return the admissible set
Bin Packing
- NOTE: Showcases the application of FunSearch to the online bin packing problem, resulting in the discovery of new heuristics that outperform traditional methods.
Online Bin Packing Algorithm: 1. Initialize bin size and heuristic function 2. Use FunSearch to evolve the heuristic function 3. Pack bins: a. For each item: i. Determine the best bin using the heuristic function ii. Place the item in the best bin 4. Return the packed bins
Discussion
- NOTE: Reflects on the effectiveness and implications of FunSearch, highlighting its potential for scalability, interpretability, and real-world applications.
Methods
- NOTE: Provides detailed implementation specifics of FunSearch, covering distributed system design, prompt building, evolutionary methods, and robustness.
Implementation details of FunSearch
- NOTE: Explains the distributed system architecture and the roles of different components within FunSearch.
Prompt building
- NOTE: Describes the mechanism for constructing prompts that guide the LLM in generating improved programs.
Evolutionary method and program selection
- NOTE: Elaborates on the evolutionary algorithm used to evolve and select high-performing programs.
Island-based Evolutionary Method: 1. Initialize multiple islands with initial programs 2. Repeat evolution: a. Evolve population within each island b. Evaluate and select top-performing individuals c. Periodically replace worst-performing islands with best individuals from other islands 3. Return the best program from all islands
Robustness
- NOTE: Addresses the inherent variability in FunSearch's performance due to randomness and the steps taken to ensure consistent results.
Related work
- NOTE: Situates FunSearch within the broader context of LLMs, genetic programming, and program superoptimization.
LLMs
- NOTE: Discusses previous work and advancements in large language models related to FunSearch.
Genetic programming
- NOTE: Explores the parallels between FunSearch and traditional genetic programming methods.
Program superoptimization and software engineering
- NOTE: Compares FunSearch's approach to other methods aimed at optimizing and modifying source code.
References
;
Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year | |
---|---|---|---|---|---|---|---|---|---|---|
2024 MathematicalDiscoveriesfromProg | Mohammadamin Barekatain Pushmeet Kohli Bernardino Romera-Paredes Alexander Novikov Matej Balog M. Pawan Kumar Emilien Dupont Francisco J. R. Ruiz Jordan S. Ellenberg Pengming Wang Omar Fawzi Alhussein Fawzi | Mathematical Discoveries from Program Search with Large Language Models | 10.1038/s41586-023-06924-6 | 2024 |