Overview
Feature extraction creates new features from the original raw data by transforming or combining the existing variables, and it often reduces the dimensionality of the data in the process. Unlike feature selection, which keeps a subset of the existing features unchanged, extraction produces entirely new ones.
Common Techniques
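- Principal Component Analysis (PCA): Finds uncorrelated linear combinations of the original features, ordered by how much variance each one captures.
- UMAP / t-SNE: Non-linear dimensionality reduction methods, used mainly to visualize high-dimensional data in two or three dimensions.
- Autoencoders: Neural networks trained to reconstruct their input through a narrow bottleneck layer, whose activations serve as a compressed representation of the data.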
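As a rough illustration of the PCA item above, here is a minimal sketch that extracts one new feature from two correlated ones using only the Python standard library. The helper names (`center`, `covariance`, `first_component`, `project`) and the synthetic data are assumptions made for this example. Power iteration is used here as a simple way to approximate the leading principal component; in practice a library routine such as scikit-learn's `PCA` would normally be used instead.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (features x features) of centered data."""
    n = len(centered)
    d = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """Compute one new feature per row: the dot product with direction."""
    return [sum(x * w for x, w in zip(row, direction)) for row in centered]


# Hypothetical toy data: the second feature is roughly twice the first.
rng = random.Random(0)
data = []
for _ in range(200):
    t = rng.gauss(0.0, 3.0)
    data.append([t + rng.gauss(0.0, 0.3), 2.0 * t + rng.gauss(0.0, 0.3)])

centered = center(data)
cov = covariance(centered)
pc1 = first_component(cov)        # weights of the linear combination
scores = project(centered, pc1)   # the extracted one-dimensional feature

# Share of the total variance retained by the single extracted feature.
total_variance = cov[0][0] + cov[1][1]
pc1_variance = sum(s * s for s in scores) / (len(scores) - 1)
explained_ratio = pc1_variance / total_variance
```

Here `scores` replaces the two original columns with a single new one, and `explained_ratio` indicates how much of the original variance that one feature keeps. This is the sense in which extraction differs from selection: the new feature is a weighted combination of both inputs rather than either input on its own.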
Goal
To simplify the data while preserving as much of the important information as possible, making it easier for models to learn patterns.