chore: refactor `compare_dicts` #1224

EdAbati · 2024-10-18T21:55:40Z

What type of PR is this? (check all applicable)

Related issues

Related issue #
Closes #

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below.

I propose a couple of changes:

I'd like to have a more verbose assertion error message when a unit test fails. Taking inspiration from pandas, I'm printing all the values of the columns that differ and the index where the difference was first noticed. This is particularly helpful for dataframe libraries that don't keep row order (* cough cough * pyspark * cough cough *).

def test_select(constructor: Constructor) -> None:
    data = {"a": [1, 3, 2], "b": [4, 4, 6], "z": [7.0, 8, 9]}
    df = nw.from_native(constructor(data))
    result = df.select("a")
    expected = {"a": [1, 3, 3]}
    compare_dicts(result, expected)

Before:

After:

I think we should rename compare_dicts because we are not actually comparing two dictionaries. 😅 I propose something like assert_equal_data; any better name? This makes the diff of the PR a bit large. Happy to revert if you think we should keep the previous name.


FBruzzesi

Thanks @EdAbati, I like this a lot more! I left a comment, but it can also be treated as a follow-up.

For the naming, I think this is the closest we have to what pandas and polars call assert_frame_equal, but we are not quite there yet, so I am not sure 🙈

FBruzzesi · 2024-10-19T15:11:18Z

tests/utils.py

            else:
-                assert lhs == rhs, (lhs, rhs)
+                are_valid_values = lhs == rhs
+            assert are_valid_values, f"Mismatch at index {i}: {lhs} != {rhs}\nExpected: {expected}\nGot: {result}"


Should we record the full diff instead of stopping at the first mismatch?
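To make the suggestion concrete, here is a minimal sketch of what collecting every mismatch could look like. It is not the code in this PR: `collect_mismatches` and `assert_no_mismatches` are hypothetical names, the inputs are assumed to already be plain dicts of Python lists with equal-length columns, and the float handling (tolerant comparison, NaN equal to NaN) is an assumption rather than the project's actual rule.

```python
from __future__ import annotations

import math
from typing import Any


def collect_mismatches(
    result: dict[str, list[Any]], expected: dict[str, list[Any]]
) -> list[tuple[str, int, Any, Any]]:
    """Return every (column, index, got, expected) tuple that differs.

    Hypothetical helper: assumes both inputs are dicts of equal-length lists.
    """
    mismatches: list[tuple[str, int, Any, Any]] = []
    for key, expected_values in expected.items():
        for i, (lhs, rhs) in enumerate(zip(result[key], expected_values)):
            if isinstance(lhs, float) and isinstance(rhs, float):
                # Assumed rule: tolerant float comparison, NaN matches NaN.
                both_nan = math.isnan(lhs) and math.isnan(rhs)
                equal = both_nan or math.isclose(lhs, rhs)
            else:
                equal = lhs == rhs
            if not equal:
                # Keep going instead of failing on the first difference.
                mismatches.append((key, i, lhs, rhs))
    return mismatches


def assert_no_mismatches(
    result: dict[str, list[Any]], expected: dict[str, list[Any]]
) -> None:
    """Raise a single AssertionError that lists all differing positions."""
    mismatches = collect_mismatches(result, expected)
    report = "\n".join(
        f"column {key!r}, index {i}: {lhs} != {rhs}"
        for key, i, lhs, rhs in mismatches
    )
    assert not mismatches, (
        f"{len(mismatches)} mismatch(es):\n{report}\n"
        f"Expected: {expected}\nGot: {result}"
    )
```

With the data from the PR description, `collect_mismatches({"a": [1, 3, 2]}, {"a": [1, 3, 3]})` returns `[("a", 2, 2, 3)]`. The trade-off is a longer message on large frames, which may be why stopping at the first mismatch is acceptable for now.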

EdAbati added 4 commits October 17, 2024 18:08

rename compare_dicts

5488c1b

Merge remote-tracking branch 'upstream/main' into assert-equal-data

006da9d

missing rename

a9db656

refactor assert_equal_data

85c3d45

github-actions bot added the internal label Oct 18, 2024

EdAbati marked this pull request as ready for review October 18, 2024 21:59

EdAbati commented Oct 18, 2024


EdAbati added 2 commits October 19, 2024 11:45

Merge remote-tracking branch 'upstream/main' into assert-equal-data

468a371

use to_py_scalar

c219b90

FBruzzesi reviewed Oct 19, 2024
