Overview

Random Forest is one of the most popular and versatile machine learning algorithms. It is an ensemble method that improves on the accuracy and stability of a single decision tree by combining the predictions of many trees, the 'forest.'

How it Works

  1. Bagging (Bootstrap Aggregating): Each tree is trained on a bootstrap sample of the training data, i.e., rows drawn at random with replacement.
  2. Feature Randomness: At each split in a tree, only a random subset of features is considered, which decorrelates the trees.
  3. Voting/Averaging: For classification, the forest takes a majority vote; for regression, it takes the average.
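
The three steps above can be sketched directly. This is a minimal illustration, not a production implementation: it uses scikit-learn's DecisionTreeClassifier for the individual trees and a synthetic dataset, and the tree count and seeds are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

n_trees = 25
trees = []
for i in range(n_trees):
    # 1. Bagging: draw a bootstrap sample (n rows, with replacement)
    idx = rng.integers(0, len(X), size=len(X))
    # 2. Feature randomness: each split considers only sqrt(n_features) features
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
    trees.append(tree.fit(X[idx], y[idx]))

# 3. Voting: majority vote across the trees (binary labels, odd tree count)
votes = np.stack([t.predict(X) for t in trees])  # shape (n_trees, n_samples)
pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy:", (pred == y).mean())
```

In practice, scikit-learn's RandomForestClassifier bundles all three steps behind a single fit/predict interface; for regression, the final step averages the trees' numeric predictions instead of voting.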

Advantages

  • Highly accurate and robust to outliers.
  • Handles both categorical and numerical data.
  • Provides a measure of 'feature importance.'
  • Less prone to overfitting than individual decision trees, since averaging many decorrelated trees reduces variance.
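
The feature-importance advantage can be seen with scikit-learn's built-in RandomForestClassifier. This sketch assumes a synthetic dataset where only two features are informative; the impurity-based importances exposed by feature_importances_ should concentrate on those.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: only 2 of the 6 features carry signal
X, y = make_classification(n_samples=300, n_features=6, n_informative=2,
                           n_redundant=0, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances are normalized to sum to 1;
# higher values mean the feature was used more in good splits
for i, imp in enumerate(forest.feature_importances_):
    print(f"feature {i}: {imp:.3f}")
```

Note that impurity-based importances can be biased toward high-cardinality features; scikit-learn's permutation_importance is a common alternative check.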