avg calculation with Parquet file in vector runtime seems more "off" #5530

philrz · 2024-12-12T18:31:41Z

tl;dr

This is a follow-on from #5516. Repeating the avg calculation using the test data from that issue, consider the results in the table below. Treating the result 1058.5234017218875 from DuckDB as a baseline, with the different super formats and runtime options we see:

| Format | Runtime | Result | Delta from DuckDB |
|---|---|---|---|
| BSUP | sequential | `1058.523401720017` | `0.0000000018705` |
| CSUP | sequential | `1058.523401720017` | `0.0000000018705` |
| Parquet | sequential | `1058.523401720017` | `0.0000000018705` |
| CSUP | vector | `1058.5234017218877` | `0.0000000000002` |
| Parquet | vector | `1058.5297770606794` | `0.0063753387919` |
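
For anyone who wants to double-check the Delta column, here is a minimal sketch that recomputes it with Python's `decimal` module, so the subtraction itself adds no floating point error. The names `BASELINE`, `RESULTS`, and `deltas` are made up for this illustration and the values are copied from the table above.

```python
from decimal import Decimal

# DuckDB result used as the baseline in this issue.
BASELINE = Decimal("1058.5234017218875")

# Results copied from the table, keyed by (format, runtime).
RESULTS = {
    ("BSUP", "sequential"): Decimal("1058.523401720017"),
    ("CSUP", "sequential"): Decimal("1058.523401720017"),
    ("Parquet", "sequential"): Decimal("1058.523401720017"),
    ("CSUP", "vector"): Decimal("1058.5234017218877"),
    ("Parquet", "vector"): Decimal("1058.5297770606794"),
}


def deltas(baseline, results):
    """Return the absolute difference from the baseline for each result."""
    return {key: abs(value - baseline) for key, value in results.items()}


if __name__ == "__main__":
    for (fmt, runtime), delta in deltas(BASELINE, RESULTS).items():
        # The "f" format avoids scientific notation for the small deltas.
        print(f"{fmt:8} {runtime:11} {delta:f}")
```

The inputs are parsed from strings rather than float literals so that every digit shown in the table is preserved exactly.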

Details

Repro is with super commit 883ffd2.

The runs that generated the results in the table above are in #5516 (comment) and the test data is linked from that issue.

Users are accustomed to seeing small differences in precision with floating point math, so I expect this might all be explained by something about parallel operations. Indeed, we've seen some non-deterministic floating point calculations with other database systems, for instance. However, seeing the delta as early as the third decimal place with vector Parquet is surprising compared to the ninth decimal place or later for the others, so I figured I'd surface this in case it's worthy of closer scrutiny.
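
To give a rough sense of how much drift summation order alone tends to produce, here is a small sketch in Python. It uses synthetic data, not the test data from #5516, and `naive_sum` and `chunked_mean` are hypothetical helpers written for this illustration; nothing here is meant to reflect how super actually computes `avg`. The chunked version only imitates the general idea of combining partial sums, which is an assumption about what a parallel or vectorized runtime might do.

```python
import math
import random


def naive_sum(values):
    """Left-to-right float addition with no error compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_mean(values, chunks):
    """Average from per-chunk partial sums that are combined afterwards."""
    size = math.ceil(len(values) / chunks)
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials) / len(values)


# Float addition is not associative, so grouping changes the last bits.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Synthetic values of roughly the same magnitude as the averages in this issue.
random.seed(0)
values = [random.uniform(0.0, 2000.0) for _ in range(100_000)]

exact = math.fsum(values) / len(values)       # correctly rounded sum as reference
sequential = naive_sum(values) / len(values)  # single pass
chunked = chunked_mean(values, 8)             # 8 partial sums, then combined

if __name__ == "__main__":
    print(left_to_right == right_to_left)
    print(abs(sequential - exact), abs(chunked - exact))
```

For 100,000 values of this magnitude the standard worst-case rounding bound keeps both averages within about 1e-8 of the reference, which is in line with the small deltas in the table and far smaller than the 0.006 seen with vector Parquet.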

philrz mentioned this issue Dec 12, 2024

Incorrect avg calculation from vector runtime #5516

Closed
