feat(polars): add ArrayDistinct operation #10334

Merged 1 commit on Oct 19, 2024


IndexSeek
Contributor

Description of changes

Implements ArrayDistinct with pl.Series.list.unique.

The test marker

Similar to what I saw when implementing ArrayUnion, I'm getting floats and nan from the polars backend when there is a None in the array<int64> column, even though Polars itself keeps the column as list[i64]. Here is the code I used to test this behavior:

In [1]: import polars as pl
   ...: from ibis.interactive import *

In [2]: d_con = ibis.connect("duckdb://")
   ...: p_con = ibis.connect("polars://")

In [3]: df = pl.DataFrame({"a": [[1, 3, 3], [], [42, 42], [], [None], None]})
   ...: t = ibis.memtable(df)

In [4]: expr = t.select("a", uniqued=_.a.unique())

In [5]: d_con.execute(expr)
Out[5]: 
           a uniqued
0  [1, 3, 3]  [3, 1]
1         []      []
2   [42, 42]    [42]
3         []      []
4     [None]  [None]
5       None    None

In [6]: p_con.execute(expr)
Out[6]: 
                 a     uniqued
0  [1.0, 3.0, 3.0]  [1.0, 3.0]
1               []          []
2     [42.0, 42.0]      [42.0]
3               []          []
4            [nan]       [nan]
5             None        None

In [7]: df.select(pl.col("a"), pl.col("a").list.unique().alias("uniqued"))
Out[7]: 
shape: (6, 2)
┌───────────┬───────────┐
│ a         ┆ uniqued   │
│ ---       ┆ ---       │
│ list[i64] ┆ list[i64] │
╞═══════════╪═══════════╡
│ [1, 3, 3] ┆ [1, 3]    │
│ []        ┆ []        │
│ [42, 42]  ┆ [42]      │
│ []        ┆ []        │
│ [null]    ┆ [null]    │
│ null      ┆ null      │
└───────────┴───────────┘
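
For reference, here is a minimal pure-Python sketch of the per-row semantics the outputs above suggest: duplicates are dropped, a null array stays null, and a null element is kept. The helpers `array_distinct` and `same_elements` are hypothetical names used only for illustration; they are not part of ibis or Polars. Because DuckDB returned `[3, 1]` and Polars returned `[1, 3]`, the comparison assumes element order is not guaranteed.

```python
from typing import Optional


def array_distinct(values: Optional[list]) -> Optional[list]:
    """Hypothetical reference helper: drop duplicate elements from one array value."""
    if values is None:
        # A null array stays null (row 5 in the outputs above).
        return None
    seen = set()
    out = []
    for v in values:
        if v not in seen:  # None is hashable, so [None] stays [None]
            seen.add(v)
            out.append(v)
    return out


def same_elements(left: Optional[list], right: Optional[list]) -> bool:
    """Order-insensitive comparison of two already-deduplicated arrays."""
    if left is None or right is None:
        return left is right
    return len(left) == len(right) and set(left) == set(right)


rows = [[1, 3, 3], [], [42, 42], [], [None], None]
result = [array_distinct(r) for r in rows]
print(result)  # [[1, 3], [], [42], [], [None], None]
```

With this comparison, `[3, 1]` and `[1, 3]` count as equal, while `[None]` and `[nan]` do not, which is the mismatch seen in the `p_con.execute(expr)` output.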

@github-actions github-actions bot added tests Issues or PRs related to tests polars The polars backend labels Oct 18, 2024
@cpcloud cpcloud added this to the 10.0 milestone Oct 19, 2024
@cpcloud cpcloud merged commit 5657d21 into ibis-project:main Oct 19, 2024
77 checks passed