Problem Statement
Describe an adaptive rate limiter that reacts to downstream saturation.
Explanation
Start with configured limits per tenant. Continuously watch downstream p95 latency, queue depth, and error rate. When thresholds breach, reduce allowed tokens proportionally (AIMD or gradient) for all tenants, preserving relative weights.
Once metrics recover, gently increase limits. This keeps the system responsive under partial failures while maintaining fairness and honoring premium weights.
Code Solution
SolutionRead Only
if(p95>t || err>e){ limit *= 0.7 } else { limit += step }