Retry
Retrying failed work items has two faces:
- Retry a task and keep the scope within the task (in memory)
- Retry a task and keep the scope outside of the task (in a queue)
Note: this article was written in response to https://github.com/jondot/sneakers/issues/129#issuecomment-101525527
The usage scenario is simple: you called an external API, the network was glitchy, and the call failed. Most of the time, retrying the exact same call right away should work.
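In plain Ruby, that kind of immediate in-memory retry is just a `begin`/`rescue`/`retry` loop. A minimal sketch (the endpoint and the list of rescued exceptions are illustrative):

```ruby
require 'net/http'
require 'uri'

# Try the same call again, at most `attempts` times in total,
# before giving up and re-raising the last error.
def fetch_with_retry(url, attempts: 3)
  tries = 0
  begin
    Net::HTTP.get_response(URI(url))
  rescue Net::ReadTimeout, Errno::ECONNRESET, SocketError
    tries += 1
    retry if tries < attempts
    raise
  end
end

fetch_with_retry('https://api.example.com/ping')
```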
Ideally, the whole retry logic:
- When to retry
- How long to wait between retries
- How many times to retry
should not be any of your concern. This is a retry policy, which in the case of your HTTP client (the API example) should be handled transparently; a sketch with Faraday follows the links below.
See more here:
- Retry in Faraday
- A popular retry interval/policy called exponential backoff
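To sketch what a transparent policy looks like, here is Faraday's retry middleware with exponential backoff. The option values are illustrative, and on Faraday 2.x the middleware lives in the separate `faraday-retry` gem:

```ruby
require 'faraday'
require 'faraday/retry' # faraday-retry gem; built in on Faraday 1.x

conn = Faraday.new(url: 'https://api.example.com') do |f|
  # Retry up to 3 times, waiting roughly 0.5s, 1s, 2s between
  # attempts (exponential backoff plus jitter to avoid thundering herds).
  f.request :retry,
            max: 3,
            interval: 0.5,
            interval_randomness: 0.2,
            backoff_factor: 2
  f.adapter Faraday.default_adapter
end

conn.get('/flaky-endpoint') # transient failures are retried for you
```

All of the when/how-long/how-many decisions live in the middleware configuration; the calling code stays oblivious.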
Some exceptional cases are really exceptional. There are kinds of tasks that, once failed, will remain failed until a long time has passed (the external API just throttled you), require manual intervention (the database has failed in a spectacular way), and so on.
These cases cannot be solved with in-memory retries. More often than not, if you tried that, you would end up with a ton of processes and threads churning your system endlessly, hopelessly trying to make up for failed jobs.
This is where Sneakers' retry module comes in. It uses RabbitMQ's dead-letter exchange feature, which is built specifically for failed jobs. Historically, other queuing solutions have had to "invent" this time and time again.
See more:
- Sneakers max_retry
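As a sketch of wiring this up, here is a worker using the `Sneakers::Handlers::Maxretry` handler. The queue and helper names are illustrative, and the `retry_*` option names come from the handler's documented defaults; double-check them against your Sneakers version:

```ruby
require 'sneakers'
require 'sneakers/handlers/maxretry'

class ApiWorker
  include Sneakers::Worker
  # The worker queue dead-letters into '<queue>-retry', the exchange
  # the Maxretry handler sets up by default.
  from_queue 'api-calls',
             handler: Sneakers::Handlers::Maxretry,
             arguments: { 'x-dead-letter-exchange' => 'api-calls-retry' },
             retry_timeout: 60_000, # ms to wait between attempts
             retry_max_times: 5     # then park the message on the error queue

  def work(msg)
    call_external_api(msg) # hypothetical business logic
    ack!
  rescue StandardError
    # Rejecting dead-letters the message, which is what triggers a retry.
    reject!
  end
end
```

After `retry_max_times` rejections the message lands on an error queue instead of being requeued, where it can wait for manual intervention.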