Overview
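Research on large neural networks has repeatedly found that many weights are redundant or contribute very little to the final output. Pruning identifies these weights and removes them, either by setting them to zero or by deleting the structures that contain them, with the aim of losing as little accuracy as possible.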
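As an illustration of "identify and remove", the sketch below ranks weights by absolute value and zeroes the smallest ones. Magnitude is only one possible importance criterion and is an assumption here, since the text above does not name one. The function name `magnitude_prune` and the `sparsity` parameter are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of a flat list of weights.

    Magnitude is an assumed importance criterion; `sparsity` is the
    fraction of weights to remove (0.0 to 1.0).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from least to most important (smallest |w| first).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_prune = set(order[:n_prune])
    return [0.0 if i in to_prune else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

In practice a pruned network is usually fine-tuned afterwards to recover accuracy; that step is omitted from this sketch.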
Types
- Weight Pruning: Removing individual connections.
- Neuron/Filter Pruning: Removing entire neurons or convolutional filters.
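- Structured vs. Unstructured: Unstructured pruning removes individual weights anywhere in the network, leaving sparse weight matrices. Structured pruning removes whole groups such as neurons, channels, or filters, leaving smaller dense matrices, which makes it easier to accelerate on standard hardware.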
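Neuron-level (structured) pruning can be sketched as dropping whole rows of a layer's weight matrix. The example assumes each row holds the incoming weights of one neuron and uses the L1 norm of the row as the importance score. Both the scoring rule and the name `prune_neurons` are illustrative assumptions, not a prescribed method.

```python
def prune_neurons(weight_rows, keep):
    """Keep the `keep` rows (neurons) with the largest L1 norm.

    Each row holds the incoming weights of one neuron. The L1 norm is an
    assumed importance score. Returns the smaller dense matrix and the
    indices of the surviving neurons.
    """
    norms = [sum(abs(w) for w in row) for row in weight_rows]
    ranked = sorted(range(len(weight_rows)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original neuron order
    return [weight_rows[i] for i in kept], kept


layer = [
    [0.9, -0.7, 0.4],     # neuron 0: large weights
    [0.01, 0.02, -0.01],  # neuron 1: near-zero weights
    [-0.5, 0.6, 0.3],     # neuron 2: large weights
    [0.05, -0.03, 0.02],  # neuron 3: near-zero weights
]
smaller, kept = prune_neurons(layer, keep=2)
print(kept)          # [0, 2]
print(len(smaller))  # 2 rows remain, each still with 3 weights
```

The result is a smaller dense matrix, not a sparse one, which is why ordinary dense matrix routines can run it faster without special support. In a real network the matching input columns of the following layer would also have to be removed; this sketch leaves that out.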
Benefits
- Reduced memory footprint.
- Faster inference, provided the hardware or runtime can take advantage of the removed weights.
- Lower power consumption for mobile and edge devices.
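The memory benefit can be made concrete with a rough back-of-the-envelope estimate. The sketch assumes 4-byte weights and a simple sparse layout that stores one 4-byte index alongside every surviving weight. Real sparse formats differ, so the numbers are illustrative only, and the function names are hypothetical.

```python
def dense_bytes(n_weights, bytes_per_weight=4):
    """Storage for a dense layer: every weight is stored, zero or not."""
    return n_weights * bytes_per_weight


def sparse_bytes(n_weights, sparsity, bytes_per_weight=4, bytes_per_index=4):
    """Storage under an assumed layout: one value plus one index per nonzero."""
    nonzero = n_weights - round(n_weights * sparsity)
    return nonzero * (bytes_per_weight + bytes_per_index)


n = 1_000_000
print(dense_bytes(n))        # 4000000
print(sparse_bytes(n, 0.5))  # 4000000
print(sparse_bytes(n, 0.9))  # 800000
```

Under these assumptions, pruning half the weights saves nothing once index overhead is counted, while pruning 90% cuts storage to a fifth. Structured pruning avoids the index overhead because the remaining matrix stays dense.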