Overview

The Horizontal Pod Autoscaler (HPA) automatically scales the number of Pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization (or, with custom metrics support, on other application-provided metrics).

How It Works

  1. Monitoring: The HPA controller periodically (every 15 seconds by default) queries the resource metrics API for the Pods of each target.
  2. Calculation: It calculates the desired number of replicas based on the current metric value and the target value.
  3. Scaling: It updates the replica count of the target resource.
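
The calculation step can be illustrated with a small sketch. It assumes the ratio-based rule described in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), with a 10% tolerance band around the target. The function name, parameter names, and the min/max bounds shown here are hypothetical and chosen only for illustration. The real controller also accounts for missing metrics, Pods that are not yet ready, and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA replica calculation (illustrative names and defaults)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    # Ratio of the observed metric to the target (e.g. 90% CPU vs. a 50% target).
    ratio = current_metric / target_metric

    # If the ratio is close enough to 1.0, skip scaling to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally and round up so capacity is never under-estimated.
    desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured bounds (assumed values, set per autoscaler).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 Pods averaging 90% CPU against a 50% target.
    print(desired_replicas(4, 90, 50))
    # 4 Pods averaging 20% CPU against a 50% target.
    print(desired_replicas(4, 20, 50))
```

Rounding up rather than to the nearest integer biases the result toward extra capacity, which is the safer direction when load is rising.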

Benefits

  • Efficiency: Ensures you have enough Pods to handle traffic spikes without over-provisioning.
  • Cost Savings: Reduces the number of Pods during low-traffic periods, freeing cluster resources that would otherwise sit idle.

Related Terms