Overview

Feature extraction derives new features from the original raw data by transforming or combining existing ones, often reducing the dimensionality of the data in the process. Unlike feature selection, which keeps a subset of the existing features unchanged, extraction produces new features that are functions of the originals.

Common Techniques

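  • Principal Component Analysis (PCA): Finds orthogonal linear combinations of the features, ordered by how much of the data's variance each one captures.
  • UMAP / t-SNE: Non-linear dimensionality reduction, used mainly for visualizing high-dimensional data in two or three dimensions.
  • Autoencoders: Neural networks trained to reconstruct their input through a narrow bottleneck layer, whose activations serve as a compressed representation of the data.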
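
As an illustration of the first technique, here is a minimal sketch of PCA that extracts a single new feature from two correlated ones. It is not a reference implementation: it uses only the Python standard library and power iteration to find the top principal component, whereas in practice one would normally reach for a numerical library. The function names (first_principal_component, project) and the toy data are hypothetical, chosen for this example only.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (mean, direction) of the top principal component.

    Assumes the data has non-zero variance; constant data would make
    the normalization step divide by zero.
    """
    n = len(rows)
    d = len(rows[0])
    # Center each feature on its mean.
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and
    # renormalize; the vector converges to the dominant eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Map each row to its coordinate along the given direction."""
    return [
        sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
        for r in rows
    ]


# Hypothetical toy data: two strongly correlated features.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 8.0], [5.0, 9.8]]

mean, pc1 = first_principal_component(data)
scores = project(data, mean, pc1)  # one extracted feature per row

# Fraction of the total variance kept by the single extracted feature.
n = len(data)
score_var = sum(s * s for s in scores) / (n - 1)
total_var = sum(
    sum((r[j] - mean[j]) ** 2 for r in data) / (n - 1)
    for j in range(len(mean))
)
explained = score_var / total_var
print(f"direction: {pc1}, explained variance ratio: {explained:.4f}")
```

Each value in scores is a new feature that mixes both original columns, which is the difference from feature selection: neither input column survives unchanged. The explained variance ratio gives a rough measure of how much information the reduced representation retains.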

Goal

To simplify the data while preserving as much of the relevant information as possible, making it easier for models to learn patterns.

Related Terms