Overview
The learning rate is arguably the most important hyperparameter. It controls the size of the update applied to the model's weights at each training step, and so determines how quickly or slowly the model 'learns.'
The Tradeoff
- Too High: The update steps overshoot the optimal solution, so the loss oscillates or even diverges instead of converging.
- Too Low: Training takes a very long time, and the optimizer may get stuck in a poor local minimum or stall on a plateau.
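The tradeoff above can be seen on a toy problem. This is a minimal sketch (not any library's API): plain gradient descent on f(x) = x², whose gradient is 2x, run with a small and a large learning rate.

```python
def gradient_descent(lr, steps=50, x0=5.0):
    """Run gradient descent on f(x) = x^2 and return the final x."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # update rule: x <- x - lr * f'(x)
    return x

# A small learning rate converges toward the minimum at x = 0;
# a learning rate above 1.0 overshoots and |x| grows every step.
print(abs(gradient_descent(0.1)))  # near zero
print(abs(gradient_descent(1.1)))  # very large (diverged)
```

With lr = 0.1 each step multiplies x by 0.8, so it shrinks toward zero; with lr = 1.1 each step multiplies x by -1.2, so the iterates flip sign and blow up.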
Learning Rate Schedulers
Schedulers are techniques that change the learning rate during training (e.g., starting high and gradually decreasing it) to improve both performance and training stability.