Loading CICE data is very expensive #287
Comments
Hi Martin. I'm not sure of your specific case, but when loading datasets with xarray.open_mfdataset there are options that make the loading much quicker (see the sketch below). These make some extra assumptions about concat variables etc. It's described in more detail in the "Note" at https://xarray.pydata.org/en/stable/user-guide/io.html#reading-multi-file-datasets. I would have to defer to @angus-g or @aidanheerdegen as to whether these options are/should be implemented in the cookbook.
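For reference, a minimal sketch of the kind of call described in that Note (the file pattern is hypothetical, and the exact options from the original comment are not preserved in this thread):

```python
import xarray as xr

# Sketch of the open_mfdataset options recommended in the xarray "Note" linked above.
ds = xr.open_mfdataset(
    "output*/ice/OUTPUT/iceh.????-??.nc",  # hypothetical CICE file pattern
    concat_dim="time",
    combine="nested",
    data_vars="minimal",   # only concatenate variables that have the concat dimension
    coords="minimal",      # don't compare coordinate variables from every file
    compat="override",     # take coords/vars from the first file instead of checking equality
)
```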
Thanks Adele, decode_coords is what I'd been looking for.
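For anyone else landing here, a hedged sketch of what that might look like (file pattern hypothetical; decode_coords is a standard xarray option, not something specific to the cookbook):

```python
import xarray as xr

# With decode_coords=False, TLON/TLAT stay ordinary data variables rather than
# being promoted to coordinates, so open_mfdataset doesn't eagerly read and
# compare them across every file.
ds = xr.open_mfdataset(
    "output*/ice/OUTPUT/iceh.????-??.nc",  # hypothetical CICE file pattern
    decode_coords=False,
)
```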
This issue has been mentioned on ACCESS Hive Community Forum. There might be relevant details there: https://forum.access-hive.org.au/t/issues-loading-access-om2-01-data-from-cycle-4/418/3
Loading a CICE variable takes much more time and memory than a MOM variable. E.g. loading one CICE variable takes 90 s and several GB of memory (from a notebook on OOD), compared to ~15 s for the equivalent MOM variable. Trying to load the full run for a CICE variable takes a crazy amount of memory.

I think the issue is that the CICE variables have coordinates that include TLON and TLAT, which are 2D variables stored in the CICE files, whereas the corresponding MOM coordinates geolon_t and geolat_t are not in the files. I think this means that xarray.open_mfdataset is reading TLON and TLAT from each file to check whether it has to concatenate on those coordinates. I couldn't see a way of persuading xarray that it should only try to concatenate on the time dimension.