Overview

Cross-validation provides a more reliable estimate of how a model will perform on new data than a single train-test split. It helps detect overfitting and guards against results that merely reflect a lucky (or unlucky) split rather than genuine model quality.

K-Fold Cross-Validation

  1. Split the data into K equal parts (folds).
  2. Train the model on K-1 folds and test it on the remaining fold.
  3. Repeat K times, using a different fold as the test set each time.
  4. Average the results from all K iterations.
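The four steps above can be sketched directly. This is a minimal illustration using only NumPy; the data, the "model" (a simple mean predictor), and the metric (mean squared error) are toy stand-ins chosen for brevity, not part of any particular library's API.

```python
import numpy as np

def k_fold_score(y, k=5, seed=0):
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))       # shuffle before splitting
    folds = np.array_split(indices, k)      # step 1: K (roughly) equal folds
    scores = []
    for i in range(k):
        test_idx = folds[i]                 # step 3: each fold is the test set once
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        prediction = y[train_idx].mean()    # step 2: "train" on the other K-1 folds
        scores.append(np.mean((y[test_idx] - prediction) ** 2))
    return float(np.mean(scores))           # step 4: average the K results

y = np.linspace(0.0, 1.0, 20)
print(k_fold_score(y, k=5))
```

In practice the mean predictor would be replaced by a real model's fit/predict cycle, but the fold bookkeeping is the same.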

Benefits

  • Uses the entire dataset for both training and testing.
  • Reduces the variance of the performance estimate.
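The variance-reduction claim can be checked numerically. The sketch below (an illustration added here, not from the text, again with a toy mean predictor) repeatedly computes a single 80/20 split score and a 5-fold averaged score, then compares how much each estimate fluctuates across random shuffles.

```python
import numpy as np

def split_score(y, test_idx):
    # MSE of a mean predictor trained on everything outside the test fold
    train = np.delete(y, test_idx)
    return np.mean((y[test_idx] - train.mean()) ** 2)

rng = np.random.default_rng(42)
y = rng.normal(size=100)

single, kfold = [], []
for _ in range(200):
    idx = rng.permutation(len(y))
    single.append(split_score(y, idx[:20]))                     # one 80/20 split
    folds = np.array_split(idx, 5)
    kfold.append(np.mean([split_score(y, f) for f in folds]))   # 5-fold average

print(np.std(single) > np.std(kfold))   # averaged estimate fluctuates less
```

Averaging K fold scores damps the noise introduced by any one particular split, which is exactly why the K-fold estimate is more stable.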

Related Terms