Splitting benchmark for numbers and complex arrays. #40
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Issue #, if available: #25
Description of changes:
As part of #25 realized that numeric matchers are orders of magnitude slow not because of inherent issue within the matcher specific code, but instead because the benchmarks were stressed while checking for complex arrays (don't have a better term for this yet, think "json arrays within arrays within ..." )
After splitting the matchers into two and introducing a second PARTIAL_COMBO benchmark, I was able to identify a regression I would have introduced within the ByteMachine.java for numeric ranges.
As part of this change we're also changing the citylots2.json.gz file and adding a new
firstCoordinates
key for numeric matching only. I tried other existing properties first but as none of them have floating points or large numbers, the benchmarks results were not matching my expectations. No licensing concerns with the modification of the dataset as the original citylots data was under public domain and the citylots2 dataset was created as part of this library.Benchmark / Performance (for source code changes):
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.