Overview
Distributed tracing provides a bird's-eye view of a request's journey across the services that handle it. It helps identify where bottlenecks occur and which service is responsible for a failure in a complex, multi-step transaction.
Key Concepts
- Trace: The complete path of a request through the system.
- Span: A single operation within a trace (e.g., a database query or an API call).
- Trace ID: A unique identifier passed between services to correlate spans.
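- Context Propagation: The mechanism for passing the Trace ID, together with the ID of the calling span, across process and network boundaries so that downstream spans join the same trace.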
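The concepts above can be sketched in a few lines of Python. This is a minimal illustration, not a real tracing library: the names Span, start_trace, inject, and extract are hypothetical, and the header layout is an assumption borrowed from the W3C Trace Context traceparent format (version, trace ID, parent span ID, flags), which this overview does not itself prescribe.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed operation inside a trace (hypothetical model)."""
    name: str
    trace_id: str                     # shared by every span in the same trace
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    parent_id: Optional[str] = None   # None marks the root span
    start: float = field(default_factory=time.monotonic)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.monotonic()


def start_trace(name: str) -> Span:
    """Start a root span with a fresh 128-bit trace ID."""
    return Span(name=name, trace_id=secrets.token_hex(16))


def inject(span: Span, headers: dict) -> None:
    """Context propagation, sending side: write the IDs into request headers."""
    # Assumed layout: version-traceid-parentid-flags ("01" = sampled).
    headers["traceparent"] = f"00-{span.trace_id}-{span.span_id}-01"


def extract(headers: dict, name: str) -> Span:
    """Context propagation, receiving side: continue the caller's trace."""
    _version, trace_id, parent_id, _flags = headers["traceparent"].split("-")
    return Span(name=name, trace_id=trace_id, parent_id=parent_id)


# Service A starts the trace and calls service B.
root = start_trace("GET /checkout")
outgoing_headers = {}
inject(root, outgoing_headers)

# Service B reads the headers and records its own span in the same trace.
child = extract(outgoing_headers, "SELECT orders")
child.finish()
root.finish()
```

Because both spans carry the same trace_id and the child records the caller's span_id as its parent_id, a tracing backend can reassemble them into one trace even though they were recorded in different processes.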
Popular Tools
- Jaeger: An open-source, end-to-end distributed tracing system.
- Zipkin: Another widely used open-source tracing system.
- AWS X-Ray / Google Cloud Trace: Cloud-native tracing services.
- OpenTelemetry: A vendor-neutral standard and toolkit (APIs, SDKs, and a collector) for generating and collecting traces, metrics, and logs.