-
Notifications
You must be signed in to change notification settings - Fork 522
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TimerHeapMetrics
#4187
base: series/3.x
Are you sure you want to change the base?
Add TimerHeapMetrics
#4187
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks so much for starting this follow-up :)
/** | ||
* Returns the next due to fire. | ||
*/ | ||
def nextTimerDue(): Option[Long] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should figure out what this Long
represents. Currently it is an absolute time in nano seconds relative to the (arbitrary) nanoTime
offset. I am not sure how useful that is.
- Would it make sense to convert to a unix epoch time? I think we'd have to give up some precision.
- Or, we could return the number of nanoseconds until the next timer fires — so a relative time to the present, not absolute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the number of nanoseconds until the next timer fires — so a relative time to the present, not absolute.
This option looks more versatile and suitable.
def nextTimerDue(): Option[Long] = | ||
heap | ||
.lift(1) | ||
.filter(t => !t.isDeleted() && t.triggerTime != Long.MinValue) | ||
.map(_.triggerTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we decide on the API I can help with the implementation. I assume that this method may be called from non-owning threads, in which case we need to write the implementation defensively against race conditions.
I also think we should just return the time at the root of the heap, even if it's deleted/canceled. Returning None
is confusing when there are more active timers besides the root. Besides, the timer was active at some point, deletion/cancelation is a race condition against metrics retrieval, and it's still a lower bound on the actual next timer due. So I think it's still relevant statistically speaking.
From the OpenTelemetry perspective, I can think of the following metrics:
We may need to introduce new counters to keep track of |
A follow-up for #3317 (comment).
I'm unfamiliar with the
TimerHeap
internals, so perhaps the direction needs to be corrected.