Overview
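The sigmoid (logistic) function maps any real input to a value strictly between 0 and 1, producing an S-shaped curve. It is defined as sigmoid(x) = 1 / (1 + exp(-x)), so sigmoid(0) = 0.5, and the output approaches 1 for large positive x and 0 for large negative x.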
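The definition can be sketched in a few lines of Python. This is a minimal illustration for a single float, not a reference implementation. The function name `sigmoid` and the branch on the sign of `x` are choices made here: the branch avoids overflow in `exp` for large negative inputs by using the algebraically equivalent form exp(x) / (1 + exp(x)).

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x)) for a single float."""
    if x >= 0:
        # exp(-x) is at most 1 here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(-x) may overflow; exp(x) is at most 1 instead.
    z = math.exp(x)
    return z / (1.0 + z)
```

For example, `sigmoid(0.0)` returns 0.5, and `sigmoid(x) + sigmoid(-x)` equals 1 up to floating-point rounding.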
Use Cases
- Binary Classification: Predicting the probability of a 'yes' or 'no' outcome.
- Gating Mechanisms: Used in LSTMs and GRUs to control the flow of information.
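Both use cases can be sketched with the same function. The helper names `predict_label` and `apply_gate` and the 0.5 threshold are illustrative assumptions. Real LSTM and GRU gates compute their inputs from learned weights, which this sketch leaves out.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_label(logit: float, threshold: float = 0.5):
    """Binary classification: turn a raw score into (probability, 0/1 label)."""
    p = sigmoid(logit)
    return p, int(p >= threshold)


def apply_gate(gate_logits, values):
    """Gating: scale each value by a factor in (0, 1) taken from its gate."""
    return [sigmoid(g) * v for g, v in zip(gate_logits, values)]
```

A gate logit of 0 lets half of the value through, a large positive logit passes it almost unchanged, and a large negative logit suppresses it almost entirely.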
Limitations
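- Vanishing Gradient: The derivative, sigmoid(x) * (1 - sigmoid(x)), never exceeds 0.25 and approaches 0 when the input is far from zero (saturation). Backpropagation multiplies one such factor per sigmoid layer, so in very deep networks the gradients reaching the early layers can become so small that those layers effectively stop learning.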
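- Output not zero-centered: Outputs are always positive, so the inputs to the next layer all share the same sign. This can force the weight updates of a downstream neuron to move in the same direction together, which can make optimization slower and less direct.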
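The vanishing-gradient limitation can be made concrete with a small sketch. It looks only at the activation-derivative factor and deliberately ignores the weight matrices, which also scale gradients in a real network. The names `sigmoid_grad` and `max_gradient_scale` are hypothetical helpers introduced here.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s), which peaks at 0.25 when x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def max_gradient_scale(depth: int) -> float:
    """Best-case product of sigmoid derivatives across `depth` layers.

    Assumes every layer sits at x = 0, where the derivative is largest,
    and ignores weights entirely.
    """
    return sigmoid_grad(0.0) ** depth
```

Even in this best case the factor shrinks geometrically: ten stacked sigmoid layers give 0.25 ** 10, which is below one millionth. Saturated units, where the input is far from zero, shrink it further.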