-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Children proxy interface #9477
base: main
Are you sure you want to change the base?
Children proxy interface #9477
Changes from 2 commits
320a8a2
1383211
e61112f
2c1b13f
03c56ea
0ca7f67
860764d
51c198e
e0358aa
08f83f8
7f87d08
1dace08
d6de4a1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,19 @@ | ||
from __future__ import annotations | ||
|
||
from collections.abc import Iterator | ||
from textwrap import dedent | ||
from typing import cast | ||
|
||
import pytest | ||
|
||
from xarray.core.iterators import LevelOrderIter | ||
from xarray.core.treenode import InvalidTreeError, NamedNode, NodePath, TreeNode | ||
from xarray.core.treenode import ( | ||
Children, | ||
InvalidTreeError, | ||
NamedNode, | ||
NodePath, | ||
TreeNode, | ||
) | ||
|
||
|
||
class TestFamilyTree: | ||
|
@@ -224,6 +231,52 @@ def test_overwrite_child(self): | |
assert marys_evil_twin.parent is john | ||
|
||
|
||
class TestChildren: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's a bit awkward to know whether to put tests for this class in On the one hand the concept of access to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a lot less awkward after #9482. |
||
def test_properties(self): | ||
sue: TreeNode = TreeNode() | ||
kate: TreeNode = TreeNode() | ||
mary = TreeNode(children={"Sue": sue, "Kate": kate}) | ||
|
||
children = mary.children | ||
assert isinstance(children, Children) | ||
|
||
# len | ||
assert len(children) == 2 | ||
|
||
# iter | ||
assert list(children) == ["Sue", "Kate"] | ||
|
||
assert mary.children["Sue"] is sue | ||
assert mary.children["Kate"] is kate | ||
|
||
assert "Sue" in mary.children | ||
assert "Kate" in mary.children | ||
assert 0 not in mary.children | ||
assert "foo" not in mary.children | ||
|
||
with pytest.raises(KeyError): | ||
children["foo"] | ||
with pytest.raises(KeyError): | ||
children[0] | ||
|
||
# TODO not sure what this should look like... | ||
# repr | ||
expected = dedent( | ||
"""\ | ||
Children: | ||
* x (x) int64 16B -1 -2 | ||
* y (y) int64 24B 0 1 2 | ||
a (x) int64 16B 4 5 | ||
b int64 8B -10""" | ||
) | ||
# actual = repr(coords) | ||
# assert expected == actual | ||
|
||
def test_modify(self): | ||
# TODO | ||
... | ||
|
||
|
||
class TestPruning: | ||
def test_del_child(self): | ||
john: TreeNode = TreeNode() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether this class should allow path-like access to grandchildren or only allow access to immediate children.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question also for setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thoughts I think this class should only allow getting/setting immediate children (because it's basically a glorified
dict
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relates to #9485 though - why exactly do we want to support full paths for
.coords
but only local access for.children
? We could do whatever, but what's the rationale for them being different?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're basically thinking of
.coords
/.children
as meaning "access to all coords/children defined anywhere on the subtree". I originally thought of them as "access to just coords/children defined on this node".FYI path-like access like this means that
list(dt.children)
will return a subset of what can be accessed via.__getitem__
, (and__contains__
could go either way, see #9354)).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One compromise option of sorts (similar to what is done for Dataset for derived variables like
time.year
, or for coordinates inDataset.__getitem__
) is to only list immediate children in__iter__
and__contains__
but to support accessing them in__getitem__
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a detailed discussion about this on the old repo (xarray-contrib/datatree#240 (comment)). There the conclusion was that
__contains__
should support paths if__getitem__
does, but it's okay for.keys()
/.values()
/.items()
to be local only.To play devil's advocate a bit: If
.coords
can access the whole tree then why not also.variables
? I think they should all be consistent.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that seems to be how Dataset works for coordinates.
Sure, we could do that. Hopefully it's all very straightforwardly reusing the same machinery?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean. You're talking about virtual vs non-virtual coordinates on
Dataset
? That they are all in__contains__
and__getitem__
, but only a subset (the non-virtual ones) are listed by.keys()
?I actually meant that as a example of counter-intuitive behaviour 😅 . But maybe it's fine too? That would mean we have really done a 180 degree turn in this issue... It would imply changing the current behaviour of
dt.data_vars
too.It's not right now (local vs path-supporting have different codepaths), but it could be made to: everything could go through some version of using
TreeNode._set_item
xarray/xarray/core/treenode.py
Line 550 in 8bd456c
or
._walk_to
and we could add a parameter liketraverse_paths
to generalize it for both cases. Generalizing.update
to support paths might be the approach that is easiest to integrate with theCoordinates
base class.Also one final thing: We are talking here about supporting paths to descendant nodes only? Or upwards via
dt['../../path/to/node/']
too?