Overview

Unlike Random Forest, where trees are built independently and in parallel, Gradient Boosting builds trees sequentially. Each new tree is trained to correct the 'residual error' (the difference between the actual and predicted values) of the existing ensemble, so the ensemble's mistakes shrink with every iteration. For squared-error loss, those residuals are exactly the negative gradient of the loss with respect to the current predictions, which is where the name comes from.

The Process

  1. Start with a simple baseline model (for regression, typically the mean of the target values).
  2. Calculate the errors (residuals).
  3. Train a new model to predict those residuals.
  4. Add the new model to the ensemble, scaled by a small 'learning rate' that limits each tree's contribution and helps prevent overfitting.
  5. Repeat for many iterations.
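The steps above can be sketched in a few dozen lines. This is a minimal illustration with squared-error loss, using one-split regression "stumps" as the weak learners instead of full decision trees; all function names here are invented for the example, and real implementations use far more careful tree construction.

```python
def fit_stump(x, residuals):
    """Step 3: fit a weak learner (a one-split stump) to the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda xi: lmean if xi <= t else rmean

def predict(base, stumps, learning_rate, xi):
    # Step 4: each stump's output is shrunk by the learning rate.
    return base + learning_rate * sum(s(xi) for s in stumps)

def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    base = sum(y) / len(y)  # Step 1: start from the mean of the targets.
    stumps = []
    for _ in range(n_rounds):  # Step 5: repeat for many iterations.
        preds = [predict(base, stumps, learning_rate, xi) for xi in x]
        residuals = [yi - pi for yi, pi in zip(y, preds)]  # Step 2
        stumps.append(fit_stump(x, residuals))
    return base, stumps

# Toy data: y = x^2. Training error shrinks as rounds accumulate.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 4.0, 9.0, 16.0]
base, stumps = gradient_boost(x, y)
```

Note how the learning rate and the round count trade off against each other: a smaller learning rate needs more rounds but usually generalizes better.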

Popular Implementations

  • XGBoost
  • LightGBM
  • CatBoost

Related Terms