Problem Statement
What observability signals do you instrument on a messaging pipeline to detect incidents early?
Explanation
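Track end-to-end latency (produce → consume → side effect), per-partition lag, consumer group rebalances, and dead-letter queue (DLQ) rates. Label these metrics with bounded-cardinality dimensions such as topic, partition, and consumer group; keep unbounded values such as message keys or correlation IDs out of metric labels and put them in logs and traces instead.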
Propagate a correlation ID through event headers and log lines so that a single message can be followed across producers and consumers. Sample traces around latency spikes and retries to pinpoint slow handlers and hot partitions.
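One way to picture the correlation ID propagation is the minimal sketch below. It uses only the Python standard library and is not tied to any particular broker. The event shape (a dict with "headers" and "payload"), the header name "correlation_id", and the helper names produce and consume are assumptions made for illustration.

```python
import contextvars
import logging
import uuid

# Holds the correlation ID for the unit of work currently being processed.
# "-" means no ID is set.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def produce(payload, headers=None):
    """Build an event (hypothetical shape), reusing or minting a correlation ID."""
    headers = dict(headers or {})
    if "correlation_id" not in headers:
        current = correlation_id.get()
        # Reuse the ID of the work in progress, otherwise start a new chain.
        headers["correlation_id"] = current if current != "-" else uuid.uuid4().hex
    return {"headers": headers, "payload": payload}


def consume(event, handler, logger):
    """Run handler with the event's correlation ID visible to all log calls."""
    token = correlation_id.set(event["headers"]["correlation_id"])
    try:
        logger.info("handling event")
        return handler(event["payload"])
    finally:
        # Restore the previous value so IDs do not leak between events.
        correlation_id.reset(token)
```

To surface the ID in log output, add CorrelationFilter to a handler and include %(correlation_id)s in its format string. Events produced from inside a handler then inherit the same ID, which keeps a chain of messages searchable under one value.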
Code Solution
metrics: lag_seconds, processing_ms, dlq_count, rebalance_count
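As a rough illustration of how the four metrics above might be recorded, the sketch below keeps them in an in-memory stand-in for a metrics client. The class name PipelineMetrics and its methods are hypothetical, and it assumes lag_seconds means the age of a message when the consumer picks it up and processing_ms means handler duration. A real pipeline would forward these values to its metrics backend instead.

```python
from collections import defaultdict


class PipelineMetrics:
    """In-memory stand-in for a metrics client (hypothetical, for illustration)."""

    def __init__(self):
        self.gauges = {}                   # (name, labels) -> latest value
        self.counters = defaultdict(int)   # (name, labels) -> running total
        self.samples = defaultdict(list)   # (name, labels) -> observed values

    def record_processed(self, topic, partition, group,
                         produced_at, started_at, finished_at):
        """Record lag and handler time; timestamps are epoch seconds."""
        labels = (topic, partition, group)  # bounded-cardinality labels only
        # Assumed meaning: age of the message when consumption started.
        self.gauges[("lag_seconds", labels)] = started_at - produced_at
        # Assumed meaning: time spent in the handler, in milliseconds.
        self.samples[("processing_ms", labels)].append(
            (finished_at - started_at) * 1000.0
        )

    def record_dlq(self, topic, partition, group):
        """Count a message routed to the dead-letter queue."""
        self.counters[("dlq_count", (topic, partition, group))] += 1

    def record_rebalance(self, group):
        """Count a consumer group rebalance."""
        self.counters[("rebalance_count", (group,))] += 1
```

Alerting on the rate of change of dlq_count and rebalance_count, and on sustained growth in lag_seconds per partition, tends to surface incidents earlier than alerting on absolute values.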
