Context
We are running a Python function with a runtime of 200-500 ms, served through FastAPI with Uvicorn behind Gunicorn. During load testing—where we push 5 minutes' worth of work to the server by sending 1200 requests in parallel—we are encountering [CRITICAL] WORKER TIMEOUT errors. This causes Gunicorn to replace the workers even though they are still actively processing requests.
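For concreteness, the deployment looks roughly like the sketch below. The module path, worker count, and timeout value are placeholders rather than our exact settings:

```python
# gunicorn_conf.py -- illustrative sketch, not our actual configuration
bind = "0.0.0.0:8000"
workers = 4                                      # placeholder worker count
worker_class = "uvicorn.workers.UvicornWorker"   # FastAPI/Uvicorn running inside Gunicorn
timeout = 30                                     # seconds before Gunicorn logs [CRITICAL] WORKER TIMEOUT
```

The server is started with something like `gunicorn -c gunicorn_conf.py app.main:app`, where `app.main:app` stands in for our FastAPI application.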
Working Hypothesis
Based on our observations, it seems that Gunicorn may be immediately passing all incoming requests to Uvicorn/FastAPI without managing an internal backlog. The worker timeout appears to be counting from the time Gunicorn hands over the request, rather than when the worker begins processing it. This leads to premature timeouts and unnecessary worker replacement.
Desired Behavior
We have control over the client sending the requests, which is configured to be patient (it times out after 30 seconds and then retries with a longer timeout). Ideally, we would like a mechanism that drops any requests that have been queued for longer than the client's timeout while avoiding triggering the worker timeout in Gunicorn.
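On the client side, the retry behaviour described above amounts to something like the following (an httpx-based sketch; the URL, payload, and exact timeout values are assumptions for illustration):

```python
import httpx

def call_with_retry(payload: dict) -> httpx.Response:
    # First attempt waits 30 s; the retry is more patient.
    last_exc = None
    for attempt_timeout in (30.0, 60.0):
        try:
            return httpx.post(
                "http://localhost:8000/predict",   # placeholder endpoint
                json=payload,
                timeout=attempt_timeout,
            )
        except httpx.TimeoutException as exc:
            last_exc = exc  # give up on this attempt, retry with the longer timeout
    raise last_exc
```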
Current Workaround
Our current workaround involves:
- Sending the timeout duration as a header with each request.
- Implementing a FastAPI middleware to drop requests that cannot be fulfilled within the specified timeout (a sketch follows this list).
- Setting the worker timeout in Gunicorn to a very high value.
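A minimal sketch of the middleware half of this workaround is shown below; the header name `x-client-timeout`, the 30 s default, and the 504 response are assumptions, not our exact implementation:

```python
import asyncio

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

@app.middleware("http")
async def drop_expired_requests(request: Request, call_next):
    # The client announces how long it is willing to wait (hypothetical header name).
    client_timeout = float(request.headers.get("x-client-timeout", 30))
    try:
        # Abandon the request once the client has stopped waiting for it,
        # instead of letting it run long enough to trip Gunicorn's worker timeout.
        return await asyncio.wait_for(call_next(request), timeout=client_timeout)
    except asyncio.TimeoutError:
        return JSONResponse(
            {"detail": "dropped: could not be served within the client timeout"},
            status_code=504,
        )
```

Two caveats: cancelling `call_next` does not interrupt blocking work that is already running in the thread pool, and the third item above corresponds to something like `timeout = 600` in the Gunicorn config (or `--timeout 600` on the command line).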
However, this feels like a suboptimal solution. We believe others may have faced similar challenges and come up with a more elegant approach.
Request for Help
We are looking for advice or suggestions on better handling this scenario in Gunicorn. Is there a more effective way to manage request backlogs and worker timeouts that we might have missed? Any insights or recommendations would be greatly appreciated!