Diagnose if rate limit timeouts are due to pipeline timeouts in 100 name orders #7846

Open
beautifulentropy opened this issue Nov 25, 2024 · 0 comments
beautifulentropy commented Nov 25, 2024

By default, go-redis retries a request up to 3 times. We should check whether retries are applied to individual requests in a pipeline or to the pipeline as a whole. The theory to test is that a timeout on 1 or 2 keys in a 100-name order causes the whole pipeline to be retried, putting considerably more load on those same shards.
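
To make the load concern concrete, here is a rough arithmetic sketch. It is not go-redis code and does not settle which retry behavior go-redis actually uses. It assumes the failing commands time out on every attempt, and the function name `commandsSent` and its parameters are hypothetical.

```go
package main

import "fmt"

// commandsSent estimates how many commands reach Redis for one pipeline of
// `total` commands when `failing` of them time out on every attempt.
// Assumption: each of the maxRetries retries is fully used before giving up.
func commandsSent(total, failing, maxRetries int, wholePipeline bool) int {
	if failing == 0 {
		// Nothing timed out, so the pipeline is sent exactly once.
		return total
	}
	if wholePipeline {
		// Every retry resends all commands, including the ones that succeeded.
		return total * (1 + maxRetries)
	}
	// Only the failing commands are resent on each retry.
	return total + failing*maxRetries
}

func main() {
	fmt.Println(commandsSent(100, 2, 3, true))  // 400
	fmt.Println(commandsSent(100, 2, 3, false)) // 106
}
```

Under these assumptions, whole-pipeline retries would send roughly four times the traffic of a clean run, while per-command retries would add only a handful of commands.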

We can add a label to our metrics to bin transactions by count. With an upper limit of ~105 for new-order rate limit checks, including per-name checks, we could use bins like 1-25, 26-50, 51-75, 76-105, and 106+. With these deployed, we should be able to correlate timeouts with query counts.
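
A minimal sketch of the binning, assuming the count is available as an integer at the point where the metric is recorded. The function name `checkCountBin` is hypothetical, and the returned string is meant to be used as the metric label value.

```go
package main

import "fmt"

// checkCountBin maps the number of checks in a transaction to one of the
// proposed bins. Counts below 1 are not expected; in this sketch they would
// fall into the first bin.
func checkCountBin(n int) string {
	switch {
	case n <= 25:
		return "1-25"
	case n <= 50:
		return "26-50"
	case n <= 75:
		return "51-75"
	case n <= 105:
		return "76-105"
	default:
		return "106+"
	}
}

func main() {
	// Print the bin for values on each boundary.
	for _, n := range []int{1, 25, 26, 75, 76, 105, 106} {
		fmt.Printf("%d -> %s\n", n, checkCountBin(n))
	}
}
```

Using a small fixed set of bins, rather than the raw count, keeps the label cardinality low.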

aarongable added this to the Sprint 2024-12-03 milestone on Dec 3, 2024