Diagnose if rate limit timeouts are due to pipeline timeouts in 100 name orders #7846

beautifulentropy · 2024-11-25T19:02:18Z

By default, go-redis retries a failed request up to 3 times. Check whether retries are applied to individual requests in a pipeline or to the entire pipeline. The theory we want to test is that a timeout on 1 or 2 keys in a 100-name order causes the whole pipeline to be retried, putting considerably more load on those same shards.
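To make the stakes of that theory concrete, here is a rough worst-case sketch of the two retry models. It is arithmetic only and does not describe how go-redis actually behaves, which is the open question. The function name `commandsSent` and its parameters are hypothetical, and it assumes the failing keys time out on every attempt.

```go
package main

import "fmt"

// commandsSent estimates the worst-case number of commands sent to Redis for
// a single pipeline under two hypothetical retry models:
//   - wholePipeline: any failure causes every command in the pipeline to be resent
//   - perCommand:    only the commands that failed are resent
//
// Assumption: the failing commands fail on every attempt, so all retries are used.
func commandsSent(pipelineSize, failing, maxRetries int) (wholePipeline, perCommand int) {
	if failing == 0 {
		// Nothing failed, so nothing is retried under either model.
		return pipelineSize, pipelineSize
	}
	wholePipeline = pipelineSize * (1 + maxRetries)
	perCommand = pipelineSize + failing*maxRetries
	return wholePipeline, perCommand
}

func main() {
	// 100-name order, 2 keys timing out, 3 retries.
	whole, per := commandsSent(100, 2, 3)
	fmt.Printf("whole-pipeline retry: %d commands, per-command retry: %d commands\n", whole, per)
	// prints: whole-pipeline retry: 400 commands, per-command retry: 106 commands
}
```

Under these assumptions, whole-pipeline retries would send roughly four times as many commands as per-command retries for the same order.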

We can add a label to our metrics to bin transactions by count. With an upper limit of ~105 for new-order rate limit checks, including per-name checks, we could use bins like: 1-25, 26-50, 51-75, 76-105, 106+. With these deployed we should be able to correlate timeouts with the size of the query.

aarongable added this to the Sprint 2024-12-03 milestone Dec 3, 2024

aarongable assigned jsha Dec 3, 2024
