Could not find stackdriver metric with query fetch pubsub_subscription - Google Cloud Platform‎ Pub/Sub #5855

Closed
rcng6514 opened this issue Jun 3, 2024 · 6 comments
Labels
bug Something isn't working stale All issues that are marked as stale due to inactivity

Comments

@rcng6514

rcng6514 commented Jun 3, 2024

Report

We seem to be hitting the same problem as #5452, we believe since upgrading to 2.14.0. KEDA intermittently reports that it cannot find a metric matching the filter. GCP audit logs show no failed authentication or permission issues.
In a 7-day window we observed 863 instances of this error.

Expected Behavior

Metric query returned consistently

Actual Behavior

The metric query intermittently fails with the following error: could not find stackdriver metric with query fetch pubsub_subscription

Steps to Reproduce the Problem

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  annotations:
    meta.helm.sh/release-name: <name>
    meta.helm.sh/release-namespace: <namespace>
  labels:
    app.kubernetes.io/managed-by: Helm
    helm.toolkit.fluxcd.io/name: <name>
    helm.toolkit.fluxcd.io/namespace: <namespace>
    scaledobject.keda.sh/name: <name>
  name: <name>
  namespace: <namespace>
spec:
  cooldownPeriod: 10
  maxReplicaCount: 3
  minReplicaCount: 1
  pollingInterval: 1
  scaleTargetRef:
    name: <target>
  triggers:
  - authenticationRef:
      name: <auth>
    metadata:
      mode: SubscriptionSize
      subscriptionName: projects/<project>/subscriptions/<sub>
      value: "50000"
    type: gcp-pubsub

Logs from KEDA operator

2024-05-30T18:04:43Z	ERROR	gcp_pub_sub_scaler	error getting metric	{"type": "ScaledObject", "namespace": "blah", "name": "blah", "metricType": "pubsub.googleapis.com/subscription/num_undelivered_messages", "error": "could not find stackdriver metric with query fetch pubsub_subscription | metric 'pubsub.googleapis.com/subscription/num_undelivered_messages' | filter (resource.project_id == 'blah' && resource.subscription_id == 'blah') | within 2m"}

KEDA Version

2.14.0

Kubernetes Version

1.28

Platform

Google Cloud

Scaler Details

Google Cloud Platform‎ Pub/Sub

Anything else?

Changing the polling interval to 5 seconds reduces the number of errors seen but does not eliminate them.

@rcng6514 rcng6514 added the bug Something isn't working label Jun 3, 2024
@Caislear
Contributor

Just noting here that these two issues may be related: this sounds similar to what I am experiencing in #5896.

Out of interest, do your subscriptions regularly have no messages in them?

@rcng6514
Author

Thanks @Caislear, we have two environments impacted by this. One has a constant stream of messages in the queue and rarely hits zero; the other hits zero messages more often. Both are affected by this bug, but interestingly the higher-volume environment sees more of these errors.

stale bot commented Aug 22, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Aug 22, 2024
@JorTurFer
Copy link
Member

There was a bug in v2.13 that unexpectedly reduced the aggregation period, but it was fixed in v2.14. In v2.15 the time window supports custom horizons -> #5429
I think you may have some periods without metrics that are treated as errors, and increasing the window will probably help.
We also added the option to set a custom value when the metric is not available, as part of #5897
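
As a rough sketch of the two suggestions above (a wider time window and a fallback value), the trigger from the report might be adjusted as follows. The parameter names `timeHorizon` and `valueIfNull` are assumptions taken from the v2.15 gcp-pubsub scaler documentation, and the values shown are purely illustrative; please verify both against the KEDA version you are running.

```yaml
# Sketch only: trigger section of the ScaledObject from the report.
# timeHorizon and valueIfNull are assumed to be available (KEDA v2.15+);
# the values below are illustrative, not recommendations.
triggers:
- authenticationRef:
    name: <auth>
  metadata:
    mode: SubscriptionSize
    subscriptionName: projects/<project>/subscriptions/<sub>
    value: "50000"
    # Assumption: a window wider than the 2m seen in the logged query,
    # so short gaps in the metric are less likely to produce an empty result
    timeHorizon: "5m"
    # Assumption: value used when the query returns no data points,
    # instead of surfacing an error
    valueIfNull: "0"
  type: gcp-pubsub
```

Note that a fallback value only masks the missing data points, so it is worth choosing one that does not cause unwanted scaling for your workload.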

stale bot commented Aug 31, 2024

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Aug 31, 2024
@dominykasn

We are still experiencing this issue while using

helm.sh/chart=keda-2.15.1

Are there any possible fixes we could try?
