Model crash at a specific forecast time #2494
Comments
How often are you writing ice history/restart files?
24h. Rahul thinks that I potentially hit a limit on CICE outputs - it sounds like you are thinking in the same direction.
At 1 file/day for CICE, even the full 6 months would only be 180 files. That is well below the approximate "600 file" limit.
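A quick back-of-the-envelope version of that count, assuming one 24-hour average file per forecast day and the ~600-file limit quoted above (both numbers are taken from this thread, not from any configuration file):

```python
# Rough sanity check of the expected CICE history file count.
files_per_day = 1          # one 24-hour average file per forecast day (assumed)
forecast_days = 6 * 30     # ~6-month forecast
approx_file_limit = 600    # approximate limit mentioned in this thread

expected_files = files_per_day * forecast_days
print(f"Expected CICE history files: {expected_files}")            # 180
print(f"Under the ~{approx_file_limit}-file limit: {expected_files < approx_file_limit}")
```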
Yeah, I agree with Denise's assessment. Can you tell which process it's failing on?
I don't seem to have any more debug information than what is in the original message (no PET files). The biggest clue I have so far is that in all four runs that crashed, the crash occurred partway through writing gefs.ocean.t00z.24hr_avg.f2064.nc. It also looks like they all generated gefs.ice.t00z.24hr_avg.f2088.nc, but that file has zero size. Edit: Given that the error is coming from PIOc_createfile, it seems like the error may be coming from attempting to create that gefs.ice.t00z.24hr_avg.f2088.nc file?
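As a side note, here is a minimal sketch for flagging zero-size history files like that one; the run directory path and filename glob are placeholders based on the file names quoted above, not workflow paths:

```python
# Minimal sketch: flag zero-size history files in a run directory.
# Path and filename pattern are placeholders inferred from this thread.
import glob
import os

run_dir = "."  # placeholder: path to the forecast output directory
pattern = os.path.join(run_dir, "gefs.*.t00z.24hr_avg.f*.nc")

for path in sorted(glob.glob(pattern)):
    if os.path.getsize(path) == 0:
        print(f"zero-size history file: {path}")
```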
Update - I am running a new case with breakpoints set every 1472 hours in the global workflow, and the run has now progressed past this point (currently at about hour 2500), so it certainly appears that my earlier runs encountered an internal limit of some kind.
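A hedged sketch of why segmenting the run would sidestep the apparent limit: the number of files written per segment scales with segment length, so shorter segments stay below whatever per-run cap is being hit. The segment length and output interval are from this thread; treating ocean and ice as two 24hr_avg streams is an assumption.

```python
# Files written per forecast segment, under the assumptions stated above.
hours_per_segment = 1472
output_interval_h = 24
streams = 2  # assumed: ocean and ice 24hr_avg output streams

files_per_segment = streams * (hours_per_segment // output_interval_h)
print(f"~{files_per_segment} 24hr_avg files per {hours_per_segment}-hour segment")
```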
@benjamin-cash Do you continue to see this zap_snow_temperature error in the log files (even if it doesn't crash)?
@NickSzapiro-NOAA - Yes, those errors appear steadily throughout the simulation.
I am running a 6-month, 10-member C192mx025 ensemble, and so far 4 of the 10 members have reached a specific n_atmsteps value and crashed with the error below (with slight variations between runs).
Looking at the history files, in each case the model was partway through writing out gefs.ocean.t00z.24hr_avg.f2064.nc when it crashed. All of the crashes happened at different times, so it wasn't something simple like a disk issue causing a temporary error in writing files. Is there some kind of internal limit that I've encountered?
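For context on how an n_atmsteps value maps onto a forecast hour (and hence onto a particular history file), a trivial conversion sketch; the timestep and step count below are placeholders for illustration only, not values taken from this configuration or its logs:

```python
# Convert an atmosphere step counter to a forecast hour.
dt_atmos_seconds = 600   # placeholder atmosphere timestep
n_atmsteps = 12384       # placeholder step count

forecast_hour = n_atmsteps * dt_atmos_seconds / 3600.0
print(f"n_atmsteps={n_atmsteps} corresponds to forecast hour {forecast_hour:.0f}")
```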