2023 LargeLanguageModelsAsOptimizers
- (Yang, Wang et al., 2023) ⇒ Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, and Xinyun Chen. (2023). “Large Language Models As Optimizers.” doi:10.48550/arXiv.2309.03409
Subject Headings: Optimization by PROmpting (OPRO), LLM Prompt Optimization.
Notes
- It proposes using large language models as optimizers through iterative prompting: at each step, the LLM is given past solution-score pairs together with a natural-language task description and generates new candidate solutions.
- For prompt optimization, the goal is to find a prompt that maximizes the task accuracy, which serves as the objective function guiding the optimization. Only a small fraction of the full training set is used to compute the accuracy, e.g. 3.5% of the examples for GSM8K and 20% for Big-Bench Hard.
- It shows LLMs can optimize small-scale linear regression and traveling salesman problems when provided with the optimization history; among the models evaluated, GPT-4 performs best.
- It applies this to optimizing natural language processing prompts to maximize accuracy. LLMs can generate improved prompts through iterative optimization, outperforming human-designed ones.
- It shows optimized prompts transfer well, also improving performance on unseen datasets in the same domain.
- It highlights challenges such as sensitivity to the prompt format and instability of the optimization process.
- It analyzes meta-prompt design choices. The meta-prompt consists of two main parts:
- Optimization problem description: a high-level natural-language description of the task, including the objective function and constraints. For prompt optimization, this part includes task examples and instructions; for GSM8K, a high-level instruction such as “Write your new text that is different from the old ones and has a score as high as possible.”
- Optimization trajectory: the history of previously generated solutions with their objective values/scores, sorted from lowest to highest score. For prompt optimization, these are past instruction-accuracy pairs, e.g. for GSM8K: “text: Let's figure it out!, score: 61”.
- It demonstrates the potential of leveraging LLMs' natural language capabilities for optimization without additional training. Prompting LLMs to iteratively improve solutions is a promising direction.
- It provides some example prompts.
- For GSM8K math word problems:
  - “Let's break this down.”
  - “Let's do the math!”
  - “Take a deep breath and work on this problem step-by-step.”
- For Big Bench Hard movie recommendation:
  - “Based on your input, I have analyzed the given movies in terms of genre, plot, tone, audience rating, year of release, director, cast, and reviews. I have also taken into account the given options. The movie that is most similar to the given movies in terms of all these factors is:”
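The iterative loop described in these notes (build a meta-prompt from the score-sorted trajectory, sample new candidate instructions, score them on a small training subset) can be sketched in Python. This is a minimal illustration, not the paper's implementation: `call_llm` and `score_prompt` are hypothetical stand-ins for an LLM API call and for accuracy evaluation on a sampled fraction of the training set, and the meta-prompt wording is an assumption modeled on the GSM8K instruction quoted above.

```python
def build_meta_prompt(history, task_examples, top_k=20):
    """Assemble the meta-prompt from the optimization trajectory and task description.

    `history` is a list of (instruction_text, accuracy_score) pairs.
    """
    # Keep only the best prompts, sorted from lowest to highest score,
    # so the strongest solutions appear last, closest to the generation point.
    best = sorted(history, key=lambda pair: pair[1])[-top_k:]
    trajectory = "\n".join(f"text: {text}, score: {score}" for text, score in best)
    examples = "\n\n".join(task_examples)
    return (
        "I have some texts along with their corresponding scores.\n\n"
        f"{trajectory}\n\n"
        "Here are some example problems:\n\n"
        f"{examples}\n\n"
        "Write your new text that is different from the old ones "
        "and has a score as high as possible."
    )

def opro(call_llm, score_prompt, task_examples, seed_prompts,
         steps=10, samples_per_step=4):
    """Run the iterative optimization loop; return the best (prompt, score) pair.

    `call_llm(meta_prompt)` returns one candidate instruction string;
    `score_prompt(prompt)` returns its task accuracy on a small training sample.
    """
    history = [(p, score_prompt(p)) for p in seed_prompts]
    for _ in range(steps):
        meta_prompt = build_meta_prompt(history, task_examples)
        # Sample several candidates per step, evaluate, and extend the trajectory.
        for _ in range(samples_per_step):
            candidate = call_llm(meta_prompt)
            history.append((candidate, score_prompt(candidate)))
    return max(history, key=lambda pair: pair[1])
```

In the paper's setup the scorer would run the instruction against a fixed subset of training problems (e.g. 3.5% of GSM8K) and return the resulting accuracy; any callable with that shape slots in here.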
Cited By
Quotes
Abstract
Optimization is ubiquitous. While derivative-based algorithms have been powerful tools for various problems, the absence of gradient imposes challenges on many real-world applications. In this work, we propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers, where the optimization task is described in natural language. In each optimization step, the LLM generates new solutions from the prompt that contains previously generated solutions with their values, then the new solutions are evaluated and added to the prompt for the next optimization step. We first showcase OPRO on linear regression and traveling salesman problems, then move on to prompt optimization where the goal is to find instructions that maximize the task accuracy. With a variety of LLMs, we demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
References
Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year
---|---|---|---|---|---|---|---|---|---
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen | | 2023 | Large Language Models As Optimizers | | | | 10.48550/arXiv.2309.03409 | | 2023