Overview

The Sigmoid function produces an S-shaped curve. It is mathematically defined as 1 / (1 + exp(-x)).

Use Cases

  • Binary Classification: Predicting the probability of a 'yes' or 'no' outcome.
  • Gating Mechanisms: Used in LSTMs and GRUs to control the flow of information.

Limitations

  • Vanishing Gradient: In very deep networks, the gradients can become so small that the model stops learning.
  • Output not zero-centered: Can make optimization more difficult.

Related Terms