-
Notifications
You must be signed in to change notification settings - Fork 635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opentelemetry-sdk: speed up exemplars a bit #4260
base: main
Are you sure you want to change the base?
Conversation
Use a sparse dict to allocate ExemplarsBucket on demand instead of preallocating all of them in FixedSizeExemplarReservoirABC. Make the following return around 2X more loops for both trace_based and always_off exemplars filter: .tox/benchmark-opentelemetry-sdk/bin/pytest opentelemetry-sdk/benchmarks/metrics/ -k 'test_histogram_record_1000[7]'
Does the dict have any downside vs list in the steady state (measuring the same attribute set repeatedly)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @xrmx
This looks good to me.
Without any proof, I would say accessing a list item by index is faster than a dict value by key. So if there are lots of buckets this may be less optimized; but the number of buckets should never be very high. |
With small data sets a list may be faster because it's simpler than the dict but with more items the O(n) list access time would lose against dict O(1) access time. |
If we don't already have a benchmark for taking a measurement with a pre-existing attribute set, it would be great to add. Maybe with the default buckets and one for an ExponentialHistogram, which defaults to maximum of 160 buckets |
One important note about the exponential histogram case is that the number of buckets for the exemplar has a lower upper bound for buckets than the histogram:
|
Is it that complexity for search rather than for accessing? If you know the index like in this case, I doubt the complexity of the list is higher. |
Got nerdsniped in this and yes the getting a list item looks O(1) too:
So I guess the difference in performance comes just from the lazyness of the ExemplarBucket() instantiations. |
@xrmx would you be open to adding a benchmark for this to make sure the dict doesn't make this much slower? That seems like the more common case to optimize for vs churning attributes. |
Need to sort out how to write such benchmark 😅 |
@aabmass is something like this what you had in mind?
Before
After
|
Description
Use a sparse dict to allocate ExemplarsBucket on demand instead of preallocating all of them in FixedSizeExemplarReservoirABC.
Make the following return around 2X more loops for both trace_based and always_off exemplars filter:
.tox/benchmark-opentelemetry-sdk/bin/pytest opentelemetry-sdk/benchmarks/metrics/ -k 'test_histogram_record_1000[7]'
I get 8% less rounds than with
a8aacb0c6f2f06bf19b501d98e62f7c0e667fa4c
that is the commit before the introductions of exemplars. Without this we are doing 60% less rounds.Refs #4243
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Contrib Repo Change?
Checklist: