Bayesian Optimization Algorithm (BOA)
A Bayesian Optimization Algorithm (BOA) is a sequential model-based global optimization algorithm for black-box functions that places a prior over the objective function and uses Bayesian inference to select the most informative points to evaluate next.
- Context:
- It can balance exploration and exploitation through an acquisition function, which uses the posterior distribution to determine where to sample next.
- It can use an Acquisition Function (such as expected improvement, upper confidence bound, or probability of improvement) to score the utility of candidate points and guide the exploration-exploitation trade-off (a minimal sketch of one such function appears below, after the See references).
- It can maintain a posterior distribution, obtained by Bayesian inference, that captures beliefs about the objective function, represents uncertainty, and drives the exploration-exploitation trade-off.
- It can be implemented by a Bayesian Optimization System (that solves a Bayesian optimization task).
- It can range from being a Bayesian Optimization of a Single Objective to being a Bayesian Optimization with Multiple Objectives (BOMO).
- …
- Example(s):
- Sequential Model-Based Optimization (SMBO): a general framework that can be used with a variety of surrogate models and acquisition functions.
- Gaussian Process Optimization (GPO): uses Gaussian processes as the surrogate model.
- Tree-structured Parzen Estimator (TPE): uses tree-structured Parzen density estimators as the surrogate model.
- Spearmint: a specific implementation of Bayesian optimization using Gaussian processes, developed for hyperparameter optimization in machine learning.
- Sequential Model-based Algorithm Configuration (SMAC): combines Bayesian optimization with random forests as the surrogate model, particularly effective for high-dimensional and categorical parameter spaces.
- Covariance Matrix Adaptation Evolution Strategy (CMA-ES): while not strictly a Bayesian method, it shares similarities in its approach to black-box optimization and can be used in conjunction with Bayesian techniques.
- Scikit-Optimize (skopt): a Python library that implements Bayesian optimization for hyperparameter tuning of scikit-learn models.
- Metric Optimization Engine (MOE): an open-source Bayesian optimization library developed by Yelp for A/B testing and hyperparameter optimization.
- …
- Counter-Example(s):
- a Local Optimization Algorithm, such as Gradient Descent, which requires gradient information.
- a Non-Bayesian Global Optimization Algorithm, such as Simulated Annealing, which uses probabilistic techniques for global optimization without employing Bayesian inference or surrogate models.
- a Population-based Optimization Algorithm, such as Particle Swarm Optimization, which optimizes a problem using a population of candidate solutions.
- See: Gaussian Process Regression, Upper Confidence Bound Algorithm.
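As a minimal illustration of the acquisition-function idea referenced in the Context above, the following Python sketch scores candidate points by Expected Improvement. It assumes a maximization problem and a surrogate model whose predict method returns a posterior mean and standard deviation (as scikit-learn's GaussianProcessRegressor does with return_std=True); the function name and the xi exploration parameter are illustrative choices, not part of any particular library.

# Hedged sketch: Expected Improvement (EI) acquisition for a maximization problem.
import numpy as np
from scipy.stats import norm

def expected_improvement(candidates, surrogate, best_observed, xi=0.01):
    # Posterior mean and standard deviation at the candidate points.
    mu, sigma = surrogate.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-12)  # guard against zero predictive variance
    z = (mu - best_observed - xi) / sigma
    # EI = E[max(f(x) - best_observed - xi, 0)] under the Gaussian posterior.
    return (mu - best_observed - xi) * norm.cdf(z) + sigma * norm.pdf(z)

The next query point is the candidate with the highest score; a larger xi biases the search toward exploration.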
References
2024
- LLM
ALGORITHM BayesianOptimization:
INPUT:
    - objective_function: black-box function to optimize
    - domain: search space for input parameters
    - max_iterations: maximum number of iterations
    - acquisition_function: function to determine next sampling point
    - initial_samples: number of initial random samples
OUTPUT:
    - best_parameters: parameters that yield the best observed value
    - best_value: best observed value of the objective function

// Initialize
1. Generate initial_samples random points in the domain
2. Evaluate objective_function at these points
3. Initialize surrogate_model with these points and their evaluations
4. best_value = best evaluation so far
5. best_parameters = parameters corresponding to best_value

// Main optimization loop
FOR iteration = 1 TO max_iterations:
    // Update surrogate model
    6. Update surrogate_model with all available data points
    // Determine next sampling point
    7. next_point = MAXIMIZE(acquisition_function(surrogate_model, domain))
    // Evaluate objective function at new point
    8. new_value = objective_function(next_point)
    // Update best solution if necessary
    9. IF new_value > best_value:
    10.     best_value = new_value
    11.     best_parameters = next_point
    // Add new data point to dataset
    12. Add (next_point, new_value) to dataset

RETURN best_parameters, best_value

FUNCTION acquisition_function(surrogate_model, domain):
    // This function balances exploration and exploitation
    // It can be implemented as Expected Improvement, Upper Confidence Bound, etc.
    // Returns a function that can be maximized to find the next sampling point
END ALGORITHM
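As a concrete, hedged mapping of this pseudocode onto running code, the sketch below makes several simplifying assumptions: a one-dimensional box domain, scikit-learn's GaussianProcessRegressor as the surrogate model, an upper-confidence-bound acquisition function, and random candidate search in place of the MAXIMIZE step. It is an illustration of the loop structure, not a reference implementation.

# Illustrative sketch of the pseudocode above (1-D domain, maximization, UCB acquisition).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def upper_confidence_bound(candidates, surrogate, kappa=2.0):
    mu, sigma = surrogate.predict(candidates, return_std=True)
    return mu + kappa * sigma  # exploitation (mean) plus exploration (uncertainty)

def bayesian_optimization(objective_function, domain, max_iterations=20,
                          initial_samples=5, seed=0):
    rng = np.random.default_rng(seed)
    low, high = domain
    # Steps 1-5: initial random design and surrogate initialization.
    X = rng.uniform(low, high, size=(initial_samples, 1))
    y = np.array([objective_function(x[0]) for x in X])
    surrogate = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(max_iterations):
        surrogate.fit(X, y)                                  # step 6: update surrogate
        candidates = rng.uniform(low, high, size=(1000, 1))  # step 7: maximize acquisition
        scores = upper_confidence_bound(candidates, surrogate)
        next_point = candidates[np.argmax(scores)]
        new_value = objective_function(next_point[0])        # step 8: evaluate objective
        X = np.vstack([X, next_point])                       # step 12: grow the dataset
        y = np.append(y, new_value)
    best_index = int(np.argmax(y))                           # steps 9-11: best observed so far
    return X[best_index], y[best_index]                      # best_parameters, best_value

# Usage on a toy black-box objective with maximum at x = 2.
best_parameters, best_value = bayesian_optimization(
    lambda x: -(x - 2.0) ** 2 + 1.0, domain=(0.0, 5.0))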
2023
- (Wikipedia, 2023) ⇒ https://en.wikipedia.org/wiki/Bayesian_optimization Retrieved:2023-10-18.
- Bayesian optimization is a sequential design strategy for global optimization of black-box functions that does not assume any functional forms. It is usually employed to optimize expensive-to-evaluate functions.
2019
- (Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/Bayesian_optimization#Strategy Retrieved:2019-9-12.
- Since the objective function is unknown, the Bayesian strategy is to treat it as a random function and place a prior over it.
The prior captures beliefs about the behaviour of the function. After gathering the function evaluations, which are treated as data, the prior is updated to form the posterior distribution over the objective function. The posterior distribution, in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines the next query point.
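As a concrete illustration of this prior-to-posterior update (added here, not part of the quoted article), a Gaussian process surrogate can be fit to the gathered evaluations; scikit-learn's GaussianProcessRegressor is an assumed, illustrative choice, and the toy data are made up.

# Sketch: prior over the objective, updated to a posterior after a few observations.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X_observed = np.array([[0.5], [1.5], [3.0]])   # function evaluations treated as data
y_observed = np.sin(X_observed).ravel()        # toy black-box responses

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
gp.fit(X_observed, y_observed)                 # prior + data -> posterior

X_query = np.linspace(0.0, 4.0, 5).reshape(-1, 1)
posterior_mean, posterior_std = gp.predict(X_query, return_std=True)
# posterior_std is small near observed points and large away from them,
# which is the uncertainty the acquisition function trades off against the mean.

The posterior mean and standard deviation at candidate points are exactly the quantities an acquisition function such as expected improvement consumes when choosing the next query point.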
2012
- (Snoek et al., 2012) ⇒ Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. (2012). “Practical Bayesian Optimization of Machine Learning Algorithms." In: Proceedings of NIPS 2012.
2010
- (Brochu et al., 2010) ⇒ Eric Brochu, Vlad M. Cora, and Nando De Freitas. (2010). “A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning." arXiv preprint arXiv:1012.2599
- ABSTRACT: We present a tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions. Bayesian optimization employs the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation). We also present two detailed extensions of Bayesian optimization, with experiments --- active user modelling with preferences, and hierarchical reinforcement learning --- and a discussion of the pros and cons of Bayesian optimization based on our experiences.
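As a practical, hedged usage sketch of the approach described in this abstract (not code from the paper), the scikit-optimize library's gp_minimize routine combines a Gaussian-process surrogate with an acquisition function such as expected improvement; because it minimizes, the expensive objective is negated to recover the maximization framing used above, and the toy objective is purely illustrative.

# Hedged usage sketch: Bayesian optimization of a toy "expensive" function with scikit-optimize.
import numpy as np
from skopt import gp_minimize

def expensive_objective(params):
    x, = params
    return -(np.sin(3 * x) + 0.5 * x)  # negated: gp_minimize minimizes, we want a maximum

result = gp_minimize(
    expensive_objective,
    dimensions=[(0.0, 4.0)],  # search domain for x
    acq_func="EI",            # expected improvement balances exploration and exploitation
    n_calls=25,               # total number of expensive objective evaluations
    random_state=0,
)
print("best x:", result.x[0], "best value:", -result.fun)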