Proxy letting through too many requests before additional replicas ready #1038
Comments
Hello @noyoshi, what would be the desired response from the interceptor in this case? Should it start responding with 429 when there are too many requests? Overall stability might be slightly harder to achieve because the interceptor would probably need to get involved with some form of load balancing too. There might be a situation, where
I guess we could introduce some form of request window per replica and instead of using
@wozniakjan hey! Sorry for the late reply. The interceptor shouldn't respond with 429; it should hold onto the requests as usual. I think it should just "release" requests so that at most N*M of them go to the service at a time, where N = the autoscaling request count and M = the number of running pods in the deployment.
This all assumes that the scale-up threshold is properly configured and equals the maximum number of concurrent requests a replica can handle at once. Otherwise, this would technically leave some throughput on the table in cases where a single replica could handle more than N requests. So it's probably something that should be up to the user to configure, IMO.
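To make the "hold and release" idea above concrete, here is a minimal Go sketch of a gate that admits at most N*M requests at a time and lets more through when the replica count grows. It is only an illustration under stated assumptions: `Gate`, `NewGate`, `Acquire`, `TryAcquire`, `Release`, and `SetReplicas` are hypothetical names, not the interceptor's actual code, and the values 10 and 150 in `main` are made up for the demo. It also leaves out request timeouts and any per-replica load balancing.

```go
package main

import (
	"fmt"
	"sync"
)

// Gate is a hypothetical admission gate: it lets at most
// perReplica*replicas requests be in flight at once (N*M above).
type Gate struct {
	mu         sync.Mutex
	cond       *sync.Cond
	perReplica int // N: assumed per-replica concurrency target
	replicas   int // M: assumed count of running replicas, fed by some watcher
	inFlight   int // requests currently released to the backend
}

// NewGate builds a gate for the given per-replica target and replica count.
func NewGate(perReplica, replicas int) *Gate {
	g := &Gate{perReplica: perReplica, replicas: replicas}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// capacity must be called with g.mu held.
func (g *Gate) capacity() int { return g.perReplica * g.replicas }

// Acquire holds the caller until a slot is free, then takes it.
func (g *Gate) Acquire() {
	g.mu.Lock()
	defer g.mu.Unlock()
	for g.inFlight >= g.capacity() {
		g.cond.Wait() // held, not rejected: no 429 is returned
	}
	g.inFlight++
}

// TryAcquire takes a slot if one is free and reports whether it did.
func (g *Gate) TryAcquire() bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.inFlight >= g.capacity() {
		return false
	}
	g.inFlight++
	return true
}

// Release frees a slot once the backend has answered a request.
func (g *Gate) Release() {
	g.mu.Lock()
	g.inFlight--
	g.mu.Unlock()
	g.cond.Broadcast()
}

// SetReplicas updates M and wakes held requests so they re-check capacity.
func (g *Gate) SetReplicas(n int) {
	g.mu.Lock()
	g.replicas = n
	g.mu.Unlock()
	g.cond.Broadcast()
}

func main() {
	g := NewGate(10, 1) // assumed target of 10 per replica, 1 replica up
	admitted := 0
	for i := 0; i < 150; i++ {
		if g.TryAcquire() {
			admitted++
		}
	}
	fmt.Println("admitted with 1 replica:", admitted) // 10

	g.SetReplicas(3) // two more replicas become ready
	for i := 0; i < 140; i++ {
		if g.TryAcquire() {
			admitted++
		}
	}
	fmt.Println("admitted with 3 replicas:", admitted) // 30
}
```

In a proxy, the request handler would call `Acquire` before forwarding and `Release` after the response, while whatever tracks the deployment would call `SetReplicas`. Whether N should be the scaling threshold or a separate user-configured limit is the open question raised above.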
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity.
Report
Expected Behavior
The KEDA proxy should know how many replicas of a given deployment are up, and only allow N*X requests through at a time, where N = the number of running replicas of the deployment and X = the targetPendingRequest value.
Actual Behavior
KEDA sends all the pending requests to the single replica. Our server cannot physically handle all 150 requests with 1 replica, so the requests fail.
Steps to Reproduce the Problem
Logs from KEDA HTTP operator
HTTP Add-on Version
0.7.0
Kubernetes Version
< 1.27
Platform
Other
Anything else?
No response