Overview

Unlike grid or random search, which are 'blind,' Bayesian optimization is 'informed.' It keeps track of past results and uses them to build a probabilistic model (often a Gaussian process) of the objective function, then uses that model to decide which hyperparameters to evaluate next.

How it Works

  1. Surrogate Model: A probabilistic model that approximates the relationship between hyperparameters and model performance. It is cheap to query, so it can stand in for the expensive real objective.
  2. Acquisition Function: A formula that decides which set of hyperparameters to try next by balancing 'exploration' (trying uncertain, unexplored regions) and 'exploitation' (focusing on regions that have performed well). Expected Improvement is a common choice.
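The two-step loop above can be sketched end to end on a toy problem. This is a minimal illustrative implementation, not a production library: the 1D objective, the RBF kernel with its length scale, the noise jitter, and the evaluation budget are all assumed values chosen for the example. The surrogate is a Gaussian-process posterior in plain NumPy, and the acquisition function is Expected Improvement (for minimization).

```python
import numpy as np
from math import erf

# Hypothetical "expensive" objective standing in for a real training run.
def objective(x):
    return np.sin(3 * x) + 0.1 * x ** 2

def rbf(a, b, length=0.5):
    # Squared-exponential (RBF) kernel between 1D point sets a and b.
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Surrogate model: GP posterior mean and std at query points Xs,
    # conditioned on the evaluations (X, y) seen so far.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kss = rbf(Xs, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.diag(Kss - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # Acquisition function: expected amount by which a point beats the
    # current best. High where mu is low (exploitation) or sigma is
    # high (exploration).
    z = (best - mu) / sigma
    cdf = 0.5 * (1 + np.vectorize(erf)(z / np.sqrt(2)))
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return (best - mu) * cdf + sigma * pdf

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, 3)          # a few initial random evaluations
y = objective(X)
grid = np.linspace(-2, 2, 400)     # candidate hyperparameter values

for _ in range(10):                # 10 "informed" evaluations
    mu, sigma = gp_posterior(X, y, grid)
    ei = expected_improvement(mu, sigma, y.min())
    x_next = grid[np.argmax(ei)]   # most promising candidate
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

print(f"best x = {X[np.argmin(y)]:.3f}, objective = {y.min():.3f}")
```

Each iteration refits the surrogate to all past results and picks the candidate maximizing Expected Improvement, so only 13 objective evaluations are spent in total; a grid search of comparable resolution would need hundreds.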

Benefits

  • Typically reaches good hyperparameters in far fewer evaluations than grid or random search, because each trial is chosen using all previous results.
  • Ideal for expensive-to-train models where you can only afford a few dozen evaluations.

Related Terms