Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TimerHeapMetrics #4187

Draft
wants to merge 1 commit into
base: series/3.x
Choose a base branch
from

Conversation

iRevive
Copy link
Contributor

@iRevive iRevive commented Nov 27, 2024

A follow-up for #3317 (comment).

I'm unfamiliar with the TimerHeap internals, so perhaps the direction needs to be corrected.

@armanbilge armanbilge added this to the v3.6.0 milestone Nov 27, 2024
Copy link
Member

@armanbilge armanbilge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks so much for starting this follow-up :)

/**
* Returns the next due to fire.
*/
def nextTimerDue(): Option[Long]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should figure out what this Long represents. Currently it is an absolute time in nano seconds relative to the (arbitrary) nanoTime offset. I am not sure how useful that is.

  • Would it make sense to convert to a unix epoch time? I think we'd have to give up some precision.
  • Or, we could return the number of nanoseconds until the next timer fires — so a relative time to the present, not absolute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the number of nanoseconds until the next timer fires — so a relative time to the present, not absolute.

This option looks more versatile and suitable.

Comment on lines +252 to +256
def nextTimerDue(): Option[Long] =
heap
.lift(1)
.filter(t => !t.isDeleted() && t.triggerTime != Long.MinValue)
.map(_.triggerTime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we decide on the API I can help with the implementation. I assume that this method may be called from non-owning threads, in which case we need to write the implementation defensively against race conditions.

I also think we should just return the time at the root of the heap, even if it's deleted/canceled. Returning None is confusing when there are more active timers besides the root. Besides, the timer was active at some point, deletion/cancelation is a race condition against metrics retrieval, and it's still a lower bound on the actual next timer due. So I think it's still relevant statistically speaking.

@iRevive
Copy link
Contributor Author

iRevive commented Nov 29, 2024

From the OpenTelemetry perspective, I can think of the following metrics:

  • timerheap.scheduled.count - total number of timers that have been scheduled (enqueued)
  • timerheap.executed.count - total number of executed timers
  • timerheap.cancelled.count - total number of canceled timers - canceledCounter.get() or removedCanceledCounter
  • timerheap.queue.size - current number of pending timers - size
  • timerheap.next.execution.time - the due of the next timer - nextTimerDue

We may need to introduce new counters to keep track of scheduled and executed timers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants