Overview
ReLU (Rectified Linear Unit) is the default activation function for most deep learning models. Its mathematical form is f(x) = max(0, x): it passes positive inputs through unchanged and maps negative inputs to zero.
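The definition above is a single element-wise max, which is a minimal sketch in NumPy (the function name `relu` is just illustrative):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
relu(x)  # negatives become 0.0; positives pass through
```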
Advantages
- Computational Efficiency: Just a threshold at zero (a single comparison), with no exponentials to evaluate, so it is very cheap to compute compared to Sigmoid or Tanh.
- Reduces Vanishing Gradient: Its gradient is 1 for all positive inputs, so gradients do not shrink as they propagate through many layers, mitigating the vanishing-gradient problem that Sigmoid and Tanh suffer from in deep networks.
- Sparsity: Negative inputs are mapped to exactly zero, so a large fraction of activations in a layer are zero, which can yield sparser representations and cheaper downstream computation.
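The sparsity point can be seen directly: for roughly zero-mean pre-activations, about half the inputs are negative, so ReLU zeroes about half the outputs. A small illustrative check (the variable names are just for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
pre_activations = rng.standard_normal(10_000)  # zero-mean, so ~half are negative
activations = np.maximum(0, pre_activations)   # ReLU
sparsity = np.mean(activations == 0)           # fraction of exact zeros, ~0.5 here
```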
Variants
- Leaky ReLU: Outputs a small multiple of the input (e.g. 0.01x) instead of zero when the input is negative, keeping a non-zero gradient there. This prevents the "dying ReLU" problem, where neurons that only receive negative inputs stop updating entirely.
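Leaky ReLU can be sketched the same way; the slope 0.01 below is a commonly used default, not a fixed part of the definition:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # f(x) = x if x > 0, else alpha * x  (small non-zero slope for negatives)
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 3.0])
leaky_relu(x)  # the negative input maps to a small negative value, not 0
```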