Overview

The learning rate is arguably the most important hyperparameter. It scales the size of each parameter update, controlling how quickly or slowly a model "learns."
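To make the update rule concrete, here is a minimal sketch of gradient descent on the toy function f(w) = w², where the learning rate scales the gradient before it is subtracted from the parameter. Function and variable names are illustrative.

```python
def gd_step(w, grad, lr):
    """Apply one gradient-descent update: move against the gradient, scaled by lr."""
    return w - lr * grad

w = 10.0
for _ in range(50):
    grad = 2 * w              # derivative of f(w) = w**2
    w = gd_step(w, grad, lr=0.1)

# With lr=0.1, each step multiplies w by (1 - 0.2) = 0.8, so w decays toward the minimum at 0.
```

A larger learning rate would take bigger steps toward the minimum, at the risk of overshooting it.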

The Tradeoff

  • Too High: The model might overshoot the optimal solution and fail to converge (or even diverge).
  • Too Low: The model will take a very long time to train and might get stuck in a poor local minimum.
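
Both failure modes in the list above can be seen on the same toy function f(w) = w² (gradient 2w). The specific learning-rate values below are illustrative, chosen to exaggerate each regime.

```python
def run(lr, steps=20, w=10.0):
    """Run `steps` gradient-descent updates on f(w) = w**2, starting from w."""
    for _ in range(steps):
        w -= lr * 2 * w       # each step multiplies w by (1 - 2*lr)
    return w

too_high = run(lr=1.1)    # |1 - 2*lr| = 1.2 > 1: |w| grows every step (divergence)
too_low  = run(lr=0.001)  # factor 0.998: w barely moves from its start at 10
good     = run(lr=0.3)    # factor 0.4: w shrinks rapidly toward the minimum at 0
```

The sign of progress is the multiplier |1 − 2·lr|: below 1 the iterate converges, above 1 it diverges, which is the tradeoff in miniature.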

Learning Rate Schedulers

Schedulers are techniques that adjust the learning rate during training (e.g., starting high and gradually decreasing it) to improve convergence and stability.
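Two common decay schedules can be sketched in a few lines. The formulas (step decay and exponential decay) are standard, but the function names and constants here are illustrative defaults, not from any particular library.

```python
def step_decay(base_lr, epoch, drop=0.5, every=10):
    """Multiply the learning rate by `drop` once every `every` epochs."""
    return base_lr * (drop ** (epoch // every))

def exponential_decay(base_lr, epoch, gamma=0.95):
    """Multiply the learning rate by `gamma` each epoch."""
    return base_lr * (gamma ** epoch)

# Starting high and decaying over training:
lrs = [step_decay(0.1, e) for e in range(30)]
# epochs 0-9 use 0.1, epochs 10-19 use 0.05, epochs 20-29 use 0.025
```

Frameworks typically provide these as ready-made classes (e.g., step, exponential, and cosine schedules), but the underlying idea is just a function from epoch number to learning rate, as above.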

Related Terms