Overview
Cross-validation provides a more reliable estimate of how a model will perform on new data than a single train-test split. It helps detect overfitting and reduces the chance that results are an artifact of one lucky (or unlucky) split.
K-Fold Cross-Validation
- Split the data into K equal parts (folds).
- Train the model on K-1 folds and test it on the remaining fold.
- Repeat K times, using a different fold as the test set each time.
- Average the results from all K iterations.
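The steps above can be sketched in plain Python. This is a minimal, library-free illustration, not a production implementation; `evaluate` is a hypothetical callback that trains on the training indices and returns a score on the test indices.

```python
# Minimal k-fold cross-validation sketch (pure Python, no libraries).

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    # Distribute samples as evenly as possible: the first n_samples % k
    # folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]          # the held-out fold
        train = indices[:start] + indices[start + size:]  # the other k-1 folds
        yield train, test
        start += size

def cross_validate(evaluate, n_samples, k=5):
    """Average the scores from all k train/test iterations."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

In practice, each test set appears exactly once, so every sample contributes to exactly one test score. (Real datasets are usually shuffled before splitting; that step is omitted here for brevity.)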
Benefits
- Uses the entire dataset for both training and testing: every sample is tested exactly once and used for training K-1 times.
- Reduces the variance of the performance estimate compared with a single train-test split.
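The variance-reduction benefit can be illustrated with a small simulation. This is an illustrative toy, not from the source: it treats fold scores as independent draws from the same noisy distribution (a simplifying assumption, since real folds share training data) and compares the spread of a single-split estimate with the spread of a 5-fold average.

```python
import random

random.seed(0)

def single_split_estimate():
    # One noisy accuracy-like score (fake data for illustration).
    return random.gauss(0.80, 0.05)

def five_fold_estimate():
    # Average of five independent noisy scores.
    return sum(random.gauss(0.80, 0.05) for _ in range(5)) / 5

def spread(estimator, trials=2000):
    """Empirical standard deviation of the estimator across many trials."""
    xs = [estimator() for _ in range(trials)]
    mean = sum(xs) / len(xs)
    return (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

print(spread(single_split_estimate))  # ~0.05
print(spread(five_fold_estimate))     # ~0.05 / sqrt(5), noticeably smaller
```

Under these assumptions the averaged estimate is roughly sqrt(5) times less variable than a single split, which is the intuition behind the bullet above.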