Overview

Metrics are numeric measurements of a system's behavior recorded over time (e.g., CPU usage, request count, error rate). Collecting them lets teams monitor trends, set alerts, and plan capacity.

Key Concepts

  • Counters: Metrics that only increase, typically resetting to zero only when the process restarts (e.g., total requests).
  • Gauges: Metrics that can go up and down (e.g., memory usage).
  • Histograms/Summaries: Metrics that track the distribution of values (e.g., request latency).
  • Time-Series Database (TSDB): A database optimized for storing and querying time-stamped data (e.g., Prometheus, InfluxDB).
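
The three metric types above can be illustrated with a minimal sketch in plain Python. The class names (SimpleCounter, SimpleGauge, SimpleHistogram) and the bucket bounds are hypothetical and chosen only for illustration; real client libraries offer similar concepts but differ in API and detail.

```python
import bisect


class SimpleCounter:
    """Monotonically increasing value, e.g. total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        # Rejecting negative amounts is what keeps the counter monotonic.
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class SimpleGauge:
    """Value that can rise or fall, e.g. memory in use."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class SimpleHistogram:
    """Counts observations per bucket, e.g. request latency in seconds."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One extra slot for observations above the largest bound.
        self.bucket_counts = [0] * (len(self.upper_bounds) + 1)
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        # bisect_left finds the first bucket whose upper bound is >= value.
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.total += value
        self.count += 1


# Hypothetical metric names and sample data.
requests_total = SimpleCounter()
memory_bytes = SimpleGauge()
latency_seconds = SimpleHistogram([0.1, 0.5, 1.0])

for latency in (0.05, 0.3, 0.3, 2.0):
    requests_total.inc()
    latency_seconds.observe(latency)

memory_bytes.set(512 * 1024 * 1024)
memory_bytes.set(256 * 1024 * 1024)  # a gauge may go down again
```

In this sketch the histogram keeps only per-bucket counts plus a running sum and count, so the distribution can be approximated later without storing every observation.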

Benefits

  • Real-time Monitoring: See the current state of the system.
  • Trend Analysis: Identify patterns over days, weeks, or months.
  • Alerting: Automatically notify teams when metrics cross predefined thresholds.
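
Threshold-based alerting can be sketched as follows, assuming time-stamped counter samples of the kind a time-series database would return. The function names, the sample data, and the threshold value are all hypothetical; production alerting systems usually add evaluation windows, grouping, and notification routing on top of a check like this.

```python
def rate_per_second(samples):
    """Average per-second increase across (timestamp, counter_value) samples."""
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("need at least two samples with increasing timestamps")
    return (v_last - v_first) / (t_last - t_first)


def should_alert(samples, threshold):
    """Return True when the observed rate is above the threshold."""
    return rate_per_second(samples) > threshold


# Hypothetical error-counter samples: (seconds since start, total errors).
error_samples = [(0, 10), (30, 12), (60, 40)]
quiet_samples = [(0, 10), (60, 13)]

# Assumed threshold, in errors per second.
ERRORS_PER_SECOND_THRESHOLD = 0.25

alert = should_alert(error_samples, ERRORS_PER_SECOND_THRESHOLD)
no_alert = should_alert(quiet_samples, ERRORS_PER_SECOND_THRESHOLD)
```

The check compares a rate derived from a counter, not the raw counter value, because a counter's absolute value grows for as long as the process runs and would eventually pass any fixed threshold.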

Related Terms