Depend on geoarrow-rust? #669

Open
kylebarron opened this issue Oct 3, 2024 · 0 comments
kylebarron commented Oct 3, 2024

There are probably a few places where geoarrow-rust could be useful to Lonboard.

One is getting the total bounds of the input:

def total_bounds(field: Field, column: ChunkedArray) -> Bbox:
    """Compute the total bounds of a geometry column"""
    extension_type_name = field.metadata[b"ARROW:extension:name"]

    if extension_type_name == EXTENSION_NAME.POINT:
        return _total_bounds_nest_0(column)

    if extension_type_name in [EXTENSION_NAME.LINESTRING, EXTENSION_NAME.MULTIPOINT]:
        return _total_bounds_nest_1(column)

    if extension_type_name in [EXTENSION_NAME.POLYGON, EXTENSION_NAME.MULTILINESTRING]:
        return _total_bounds_nest_2(column)

    if extension_type_name == EXTENSION_NAME.MULTIPOLYGON:
        return _total_bounds_nest_3(column)

    assert False


def _coords_bbox(arr: Array) -> Bbox:
    assert DataType.is_fixed_size_list(arr.type)
    list_size = arr.type.list_size
    assert list_size is not None

    np_arr = list_flatten(arr).to_numpy().reshape(-1, list_size)
    min_vals = np.min(np_arr, axis=0)
    max_vals = np.max(np_arr, axis=0)
    return Bbox(minx=min_vals[0], miny=min_vals[1], maxx=max_vals[0], maxy=max_vals[1])


def _total_bounds_nest_0(column: ChunkedArray) -> Bbox:
    bbox = Bbox()
    for coords in column.chunks:
        bbox.update(_coords_bbox(coords))

    return bbox


def _total_bounds_nest_1(column: ChunkedArray) -> Bbox:
    bbox = Bbox()
    flat_array = list_flatten(column)
    for coords in flat_array:
        bbox.update(_coords_bbox(coords))

    return bbox


def _total_bounds_nest_2(column: ChunkedArray) -> Bbox:
    bbox = Bbox()
    flat_array = list_flatten(list_flatten(column))
    for coords in flat_array:
        bbox.update(_coords_bbox(coords))

    return bbox


def _total_bounds_nest_3(column: ChunkedArray) -> Bbox:
    bbox = Bbox()
    flat_array = list_flatten(list_flatten(list_flatten(column)))
    for coords in flat_array:
        bbox.update(_coords_bbox(coords))

    return bbox
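
For readers without the rest of the module at hand: the snippet leans on a `Bbox` accumulator that is not shown, and the four `_total_bounds_nest_N` helpers differ only in how many times they flatten. Below is a minimal, standard-library-only sketch of that pattern. It is not Lonboard's implementation: the `Bbox` dataclass, the `NESTING_DEPTH` table, and the `flatten` helper are hypothetical stand-ins, and plain Python lists take the place of Arrow arrays. Only the `geoarrow.*` extension names come from the GeoArrow specification.

```python
import math
from dataclasses import dataclass

# Standard GeoArrow extension names mapped to the number of list levels that
# wrap the coordinates. The table is illustrative, not Lonboard's code.
NESTING_DEPTH = {
    "geoarrow.point": 0,
    "geoarrow.linestring": 1,
    "geoarrow.multipoint": 1,
    "geoarrow.polygon": 2,
    "geoarrow.multilinestring": 2,
    "geoarrow.multipolygon": 3,
}


@dataclass
class Bbox:
    """Hypothetical stand-in for the Bbox used above; starts out empty."""

    minx: float = math.inf
    miny: float = math.inf
    maxx: float = -math.inf
    maxy: float = -math.inf

    def update(self, other: "Bbox") -> None:
        """Grow this box in place so that it also covers `other`."""
        self.minx = min(self.minx, other.minx)
        self.miny = min(self.miny, other.miny)
        self.maxx = max(self.maxx, other.maxx)
        self.maxy = max(self.maxy, other.maxy)


def flatten(values: list, depth: int) -> list:
    """Remove `depth` levels of nesting (plain-list analogue of list_flatten)."""
    for _ in range(depth):
        values = [item for sub in values for item in sub]
    return values


def total_bounds(extension_name: str, chunks: list) -> Bbox:
    """Accumulate one Bbox over all chunks of a geometry column."""
    depth = NESTING_DEPTH[extension_name]  # KeyError for unknown types
    bbox = Bbox()
    for chunk in chunks:
        coords = flatten(chunk, depth)
        if not coords:
            continue  # an empty chunk contributes nothing
        xs = [c[0] for c in coords]
        ys = [c[1] for c in coords]
        bbox.update(Bbox(min(xs), min(ys), max(xs), max(ys)))
    return bbox


# Two chunks of linestrings, each linestring a list of (x, y) tuples.
chunks = [
    [[(0.0, 1.0), (2.0, 3.0)]],
    [[(-1.0, 5.0)], [(4.0, -2.0)]],
]
print(total_bounds("geoarrow.linestring", chunks))
# Bbox(minx=-1.0, miny=-2.0, maxx=4.0, maxy=5.0)
```

Starting the box at plus and minus infinity means an empty column yields an "empty" box rather than an error, which is an assumption about the desired behavior and may not match what the real `Bbox` does.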

That apparently takes two seconds with 12.5M points, which seems awfully slow: xarray-contrib/xdggs#67 (comment)

The main blocker is making sure that geoarrow-rust is stable enough for Lonboard to depend on.
