Splitting benchmark for numbers and complex arrays.
As part of #25, I realized that the numeric matchers look orders of magnitude slower not because of an inherent issue in the matcher-specific code, but because the benchmark rules were also exercising complex arrays (I don't have a better term for this yet; think "JSON arrays within arrays within ...").

After splitting the rules into two sets and introducing a second PARTIAL_COMBO benchmark, I was able to identify a regression I would have introduced in ByteMachine.java for numeric ranges.

As part of this change, we're also updating the citylots2.json.gz file to add a new firstCoordinates key that is used for numeric matching only. I tried other existing properties first, but as none of them contain floating-point or large numbers, the benchmark results did not match my expectations.
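
To illustrate why a slow rule group can hide a regression elsewhere, here is a minimal sketch. It assumes a simplified model in which the per-event cost of each rule group simply adds up; that model is an assumption for illustration, not how Ruler actually accounts for time. The class and method names are hypothetical, and the throughput figures are the rounded numbers from the README change in this commit.

```java
// Simplified model (an assumption, not Ruler's real cost model): the time to
// process one event is the sum of the per-event times of each rule group.
final class MaskingSketch {

    /** Combined events/sec when the per-event costs of the groups add up. */
    static double combinedEps(double... groupEps) {
        double secondsPerEvent = 0.0;
        for (double eps : groupEps) {
            secondsPerEvent += 1.0 / eps;
        }
        return 1.0 / secondsPerEvent;
    }

    public static void main(String[] args) {
        // Numeric rules at ~120K events/sec, complex arrays at ~2.5K events/sec.
        double before = combinedEps(120_000, 2_500);
        // Hypothetical regression: numeric matching becomes twice as slow.
        double after = combinedEps(60_000, 2_500);

        System.out.printf("before: %.1f events/sec%n", before);
        System.out.printf("after:  %.1f events/sec%n", after);
        System.out.printf("drop:   %.1f%%%n", 100.0 * (before - after) / before);
    }
}
```

Under this model, halving numeric throughput moves the combined figure by only about 2%, which is easy to mistake for noise. Measured without the complex-array rules, the same regression would show up as a 50% drop.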
baldawar committed Sep 10, 2022
1 parent b22cecf commit 0816dfb
Showing 3 changed files with 65 additions and 5 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -688,10 +688,11 @@ prefix-match, suffix-match, equals-ignore-case-match, wildcard-match, numeric-ma
counts the matches, yields the following on a 2019 MacBook:

Events are processed at over 220K/second except for:
-- equals-ignore-case matches, which are processed at over 180K/second.
+- equals-ignore-case matches, which are processed at over 200K/second.
 - wildcard matches, which are processed at over 170K/second.
-- anything-but matches, which are processed at over 110K/second.
-- numeric matches, which are processed at over 2.5K/second.
+- anything-but matches, which are processed at over 150K/second.
+- numeric matches, which are processed at over 120K/second.
+- complex array matches, which are processed at over 2.5K/second.

### Suggestions for better performance

Binary file modified src/test/data/citylots2.json.gz
Binary file not shown.
63 changes: 61 additions & 2 deletions src/test/software/amazon/event/ruler/Benchmarks.java
@@ -184,7 +184,7 @@ public class Benchmarks {
};
private final int[] EQUALS_IGNORE_CASE_MATCHES = { 131, 211, 1758, 825, 116386 };

-private final String[] NUMERIC_RULES = {
+private final String[] COMPLEX_ARRAYS_RULES = {
"{\n" +
" \"geometry\": {\n" +
" \"type\": [ \"Polygon\" ],\n" +
@@ -223,7 +223,48 @@ public class Benchmarks {
" }\n" +
"}"
};
-private final int[] NUMERIC_MATCHES = { 227, 2, 149444, 64368, 127485 };
+private final int[] COMPLEX_ARRAYS_MATCHES = { 227, 2, 149444, 64368, 127485 };

private final String[] NUMERIC_RULES = {
"{\n" +
" \"geometry\": {\n" +
" \"type\": [ \"Polygon\" ],\n" +
" \"firstCoordinates\": {\n" +
" \"x\": [ { \"numeric\": [ \"=\", -122.42916360922355 ] } ]\n" +
" }\n" +
" }\n" +
"}",
"{\n" +
" \"geometry\": {\n" +
" \"type\": [ \"MultiPolygon\" ],\n" +
" \"firstCoordinates\": {\n" +
" \"z\": [ { \"numeric\": [ \"=\", 0 ] } ]\n" +
" }\n" +
" }\n" +
"}",
"{\n" +
" \"geometry\": {\n" +
" \"firstCoordinates\": {\n" +
" \"x\": [ { \"numeric\": [ \"<\", -122.41600944012424 ] } ]\n" +
" }\n" +
" }\n" +
"}",
"{\n" +
" \"geometry\": {\n" +
" \"firstCoordinates\": {\n" +
" \"x\": [ { \"numeric\": [ \">\", -122.41600944012424 ] } ]\n" +
" }\n" +
" }\n" +
"}",
"{\n" +
" \"geometry\": {\n" +
" \"firstCoordinates\": {\n" +
" \"x\": [ { \"numeric\": [ \">\", -122.46471267081272, \"<\", -122.4063085128395 ] } ]\n" +
" }\n" +
" }\n" +
"}"
};
private final int[] NUMERIC_MATCHES = { 8, 120, 148943, 64120, 127053 };

private final String[] ANYTHING_BUT_RULES = {
"{\n" +
@@ -476,10 +517,28 @@ public void CL2Benchmark() throws Exception {

bm = new Benchmarker();

bm.addRules(COMPLEX_ARRAYS_RULES, COMPLEX_ARRAYS_MATCHES);
bm.run(citylots2);
System.out.println("COMPLEX_ARRAYS events/sec: " + String.format("%.1f", bm.getEPS()));

// Skip the complex-array matchers here because their slowness can hide improvements
// and regressions in other matchers. Remove this once we find ways to make
// arrays about as fast as the other matchers.
bm = new Benchmarker();

bm.addRules(NUMERIC_RULES, NUMERIC_MATCHES);
bm.addRules(EXACT_RULES, EXACT_MATCHES);
bm.addRules(PREFIX_RULES, PREFIX_MATCHES);
bm.addRules(ANYTHING_BUT_RULES, ANYTHING_BUT_MATCHES);
bm.run(citylots2);
System.out.println("PARTIAL_COMBO events/sec: " + String.format("%.1f", bm.getEPS()));

bm = new Benchmarker();
bm.addRules(NUMERIC_RULES, NUMERIC_MATCHES);
bm.addRules(EXACT_RULES, EXACT_MATCHES);
bm.addRules(PREFIX_RULES, PREFIX_MATCHES);
bm.addRules(ANYTHING_BUT_RULES, ANYTHING_BUT_MATCHES);
bm.addRules(COMPLEX_ARRAYS_RULES, COMPLEX_ARRAYS_MATCHES);
bm.run(citylots2);
System.out.println("COMBO events/sec: " + String.format("%.1f", bm.getEPS()));
}
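
For readers unfamiliar with the rule syntax, the last entry in NUMERIC_RULES combines a lower and an upper bound on `firstCoordinates.x`. The sketch below shows how I read that rule: both bounds are exclusive and must hold at the same time. It uses plain `double` comparison as a simplification and a hypothetical helper name; it is not how ByteMachine.java evaluates numeric ranges.

```java
// Illustration only: a plain-double reading of
//   { "numeric": [ ">", lower, "<", upper ] }
// The real matcher works differently; this helper is hypothetical.
final class NumericRangeSketch {

    /** True when lower < value < upper (both bounds exclusive). */
    static boolean inOpenRange(double value, double lower, double upper) {
        return value > lower && value < upper;
    }

    public static void main(String[] args) {
        double lower = -122.46471267081272;
        double upper = -122.4063085128395;

        // The x value used by the first NUMERIC_RULES entry lies inside the range.
        System.out.println(inOpenRange(-122.42916360922355, lower, upper));
        // A value equal to the lower bound is excluded because ">" is strict.
        System.out.println(inOpenRange(lower, lower, upper));
    }
}
```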