Trickster v2.0.0-beta2 : Prometheus query_range calls resulting in proxy-only sometimes instead of cache hit #594
Comments
We have found that in trickster-v2, the hit rate decreases and calls are served as proxy-only because all the pods don't use cache. It is also found that the request it tracks is: Due to incorrect request tracking, the handler for Can you please let us know whether this could be related to https://github.com/trickstercache/trickster/pull/672/files, where the paths are not being copied?
We found that the following is the cause of this issue:
We will raise a PR for this. Please reach out in case of any issues. Thanks!
Sort the paths in routes to ensure matcher picks nearest regex to request uri (#687)
* Test only
* Sort the paths in routes to ensure matcher picks nearest regex to request uri
When the default backend routes are registered, they currently follow the order of an unordered map. This means that the routes can have "" or "/api/v1" before "/api/v1/query_range". When the router matches the request against the route regular expressions (Refs: https://gecgithub01.walmart.com/Telemetry/trickster-v2/blob/main/pkg/router/route.go#L40 https://gecgithub01.walmart.com/Telemetry/trickster-v2/blob/main/pkg/router/regexp.go#L323), a partial route that appears first is treated as a match, which causes the request to be served as "Proxy-Only". Sorting the routes in descending order of path length ensures that "/api/v1/query_range" appears before "/api/v1" and "", so the most specific route matches.
Issue: #594
Signed-off-by: PKhivasra <[email protected]>
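The ordering problem described in the commit message can be illustrated with a minimal Go sketch. It is not the code from the PR: `sortPathsByLength` and `firstPrefixMatch` are hypothetical names, and the prefix check is only a stand-in for the router's regex matching. It assumes that the first registered route that matches wins.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortPathsByLength returns a copy of paths ordered longest first.
// Ties are broken alphabetically so the result is deterministic.
func sortPathsByLength(paths []string) []string {
	out := append([]string(nil), paths...)
	sort.SliceStable(out, func(i, j int) bool {
		if len(out[i]) != len(out[j]) {
			return len(out[i]) > len(out[j])
		}
		return out[i] < out[j]
	})
	return out
}

// firstPrefixMatch is a hypothetical stand-in for the router: it returns
// the first registered path that is a prefix of the request path.
func firstPrefixMatch(routes []string, requestPath string) (string, bool) {
	for _, r := range routes {
		if strings.HasPrefix(requestPath, r) {
			return r, true
		}
	}
	return "", false
}

func main() {
	// One possible registration order when paths come from map iteration.
	routes := []string{"", "/api/v1", "/api/v1/query_range"}

	unsorted, _ := firstPrefixMatch(routes, "/api/v1/query_range")
	sorted, _ := firstPrefixMatch(sortPathsByLength(routes), "/api/v1/query_range")

	fmt.Printf("unsorted match: %q\n", unsorted)
	fmt.Printf("sorted match:   %q\n", sorted)
}
```

With the unsorted slice the empty path matches first, which corresponds to the generic handler in the description above. After sorting, the longest path is tried first, so the query_range route is selected.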
We are trying to upgrade from Trickster v1.1 to v2.0.0-beta2 and are facing the following issue: the hit rate has decreased and some of the calls are now being served as proxy-only. More details are in the logs below. The same suite that gave a 100% hit rate in v1.1 gives a lower hit rate in the v2.0 beta version.
Test setup :
Trickster is deployed in our K8s cluster. We ran the test suite with different numbers of pod replicas, and here are our findings. We make 10 queries/sec, and the test runs continuously for ~10 mins. The same query (test_metrics), whose data is already present in the TSDB, is made every time.
The backend is Prometheus, and the config is such that the most recent 5 mins of data is not cached; anything older than that, up to 3 weeks, is cached.
Fixed time range means test_metrics is queried for the same fixed time range every time a REST call is made, i.e. the start and end query params are fixed for all calls.
Variable time range uses a dynamic time range, i.e. end is currentTime and start is currentTime - 900 secs (15 mins ago) for each API call. This should result in a partial hit (phit) for Trickster.
Panel Queries :
Hit Rate :
sum(rate(trickster_proxy_requests_total{path=~"/api/.*", cache_status=~"hit|phit|nchit|rhit", job="m3-query-pod-name"}[2m])) / sum(rate(trickster_proxy_requests_total{path=~"/api/.*", job="m3-query-pod-name"}[2m]))
Proxy Only Rate :
sum(rate(trickster_proxy_requests_total{path=~"/api/.*", cache_status=~".*proxy.*", job="m3-query-pod-name"}[2m])) / sum(rate(trickster_proxy_requests_total{path=~"/api/.*", job="m3-query-pod-name"}[2m]))
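Both panels compute the same kind of ratio: requests whose cache_status matches a pattern, divided by all requests. The Go sketch below shows only that ratio, using hypothetical per-status counts in place of the 2-minute rates. Because PromQL regex matchers are fully anchored, the patterns are wrapped accordingly.

```go
package main

import (
	"fmt"
	"regexp"
)

// PromQL anchors regex matchers at both ends, so the label patterns from
// the panel queries are wrapped in ^(?:...)$ here.
var (
	hitStatus   = regexp.MustCompile(`^(?:hit|phit|nchit|rhit)$`)
	proxyStatus = regexp.MustCompile(`^(?:.*proxy.*)$`)
)

// ratio returns the share of requests whose cache_status matches re.
// It returns 0 when there are no requests at all.
func ratio(counts map[string]float64, re *regexp.Regexp) float64 {
	var matched, total float64
	for status, n := range counts {
		total += n
		if re.MatchString(status) {
			matched += n
		}
	}
	if total == 0 {
		return 0
	}
	return matched / total
}

func main() {
	// Hypothetical request counts per cache_status, not data from the issue.
	counts := map[string]float64{"hit": 70, "phit": 10, "proxy-only": 20}

	fmt.Printf("hit rate:        %.2f\n", ratio(counts, hitStatus))
	fmt.Printf("proxy-only rate: %.2f\n", ratio(counts, proxyStatus))
}
```

A status such as "kmiss" would count toward the total in both panels but toward neither numerator, so the two ratios need not sum to 1.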
Trickster Configuration :
Test suite logs with trickster response headers :
Please let me know if there is something wrong or missing in the config that is leading to some queries being served as proxy-only instead of as cache hits.