Continuing from the last blog post on Bayesian Optimization, we now know how to update the function , when we make new observations.
Now, there should be a way to tell the optimization where to look next, in-order to make sure we explore the optimization space, while when it makes sense also exploit local minimas. Keeping this in mind, we define a function, called the acquisition function.
The way the acquisition works is that it tries to find those samples in the optimization space, where:
- Either we have no information of the optimization space (high-variance regions), see image below.
- Or when we try to dig-deep in a local minima (high-mean regions)
We have an oracle ‘the acquisition function’, which bases its decision on what has been seen so far to update its belief of the world. It then tells the optimization to observe a new point and update the GP again. And this repeats.
The decision where to search next is based of maximizing the posterior-distributions given the new observation. Interested readers are requested to look at: A Tutorial on Bayesian Optimization, Eric Brochu, Vlad M. Cora, Nando de Freitas
Why BO works?
Simple answer is that it never looks at any region whose upper-bound (we are maximizing) is lower than the current best lower-bound. For example, in the case below, the GP says, that anything below the green-line is not worth looking since we already have a observation lower-bound is the green-line.
Problems with BO
- It needs a smooth space for fitting the GPs, this a more of a problem of GPs.
- It needs to do a matrix-inversion at each-step to update the GP. This inversion is unavoidable since after each new observation we have to add it to the already observed sample-points
- The parameters of the GP (see previous blog), also called Hyperparamters need to be hand-tuned to fit the user’s problem. Although there are works like Practical Bayesian Optimization of Machine Learning Algorithms, Jasper Snoek, et. al. which would do this automatically.
Applications Of BO
BO can be applied where ever, we have a global-search problem and we cannot formulate a convex-optimization problem. It excels at doing a better job than grid-search to find new places to search. Keeping in mind the above limitations, here’s an incomplete list of applications:
- Robot-Parameter Tuning: For example, robot gait parameter tuning.
- Computer-Vision: Image-Segmentation
- Multi-armed bandit problem
- Computer-Graphics & Animation: See link.
- Lately, with Practical Bayesian Optimization of Machine Learning Algorithms, Jasper Snoek et. al.’s work, Bayesian Optimization has auto-paramter tuning, as a result more application avenues are opening up.