Will cmdstanr error out if model compilation/initialization hangs? #1044
Comments
That’s a good question. I haven’t seen that particular error before.
I suppose it’s possible, but I haven’t seen it before. I think it would be determined by however processx is handling this internally, because I don’t think there’s any sort of limit imposed by cmdstanr itself. Unfortunately I don’t know enough about the internals of processx to speculate further.
Before I reply with a wall of text, I have some more concrete questions, which I would really appreciate any comments on:
Looking at https://github.com/r-lib/processx/blob/3cbce8443f58e59a9447bf1191b9e0b8c581bf96/R/run.R#L27-L36 and https://github.com/r-lib/processx?tab=readme-ov-file#errors-1 I think it's true that some combination of cmdstanr's use of
So for future readers, I am thinking that my particular set of issues is being caused by a combination of the following:
To address these hypothetical issues, I will try this combination of changes:
EDIT: I tried these changes and I still very rarely and randomly get the
Regarding what this has to do with cmdstanr, perhaps the way processx is handled could be made more robust or user-managed? For example, if it's true that these issues are due to high I/O latency, then perhaps there is an argument that could be passed to processx to make it wait longer. I think implementing this would require knowledge of both how cmdstanr uses processx and how processx itself works.
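For reference, here is a minimal sketch of the timeout handling that processx itself exposes through `processx::run()`. It only illustrates the `timeout` and `error_on_status` arguments of that function. Whether cmdstanr calls `run()` in this way, or passes any such argument through, is an assumption I have not verified, and the `sleep` command is just a stand-in for a long-running compilation.

```r
library(processx)

# Stand-in for a slow external command (assumes a Unix-like system with `sleep`).
# `timeout` is in seconds; the default is Inf, i.e. processx::run() waits indefinitely.
# With error_on_status = FALSE, a timeout or non-zero exit is reported in the
# returned list instead of being thrown as an R error.
res <- run("sleep", "1", timeout = 10, error_on_status = FALSE)

res$timeout  # TRUE only if the process was killed because the timeout elapsed
res$status   # exit status of the process
```

With the default `error_on_status = TRUE`, hitting the timeout raises an R error instead, so a caller would need `tryCatch()` to handle it.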
I've been running cmdstanr a few tens of thousands of times via brms on an HPC node. Something I've noticed is that very rarely, I'll get an error message like
Or
Because this does not happen every time I compile a model through brms, and because it doesn't happen very frequently, I do not think it is an issue with the data I'm feeding the model or with any other aspect of the code. However, I noticed that these errors seem to happen when many thousands of models have been fitted within one session and when the system load is very high. I suspect that the compilation of the model is hanging when the system load is too high. Or perhaps it's not even compilation, but the initialization of the model that is taking too long.
So my question is: How does cmdstanr react when it has waited too long for the model to compile? Is it possible that the processx library/function will give the errors I got above when the system has hung for too long?
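Since the failures are rare and appear only under load, one possible workaround (not a fix for the underlying cause) is to retry the failing call a few times. Below is a minimal base-R sketch. `retry()` is a hypothetical helper written for this post, not part of cmdstanr, brms, or processx, and the commented-out `brm()` call uses placeholder `formula` and `dat` objects.

```r
# Hypothetical helper: call `fun` up to `max_attempts` times, pausing longer
# after each failure, and rethrow the last error if every attempt fails.
retry <- function(fun, max_attempts = 3, wait_seconds = 1) {
  last_error <- NULL
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(
      list(ok = TRUE, value = fun()),
      error = function(e) list(ok = FALSE, error = e)
    )
    if (result$ok) {
      return(result$value)
    }
    last_error <- result$error
    message(sprintf("Attempt %d failed: %s", attempt, conditionMessage(last_error)))
    if (attempt < max_attempts) {
      Sys.sleep(wait_seconds * attempt)  # simple linear back-off
    }
  }
  stop(last_error)
}

# Usage sketch (placeholders, not run here):
# fit <- retry(function() brms::brm(formula, data = dat, backend = "cmdstanr"))
```

This hides transient failures rather than explaining them, so it is worth logging the error messages to see whether they cluster around periods of high load.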
I found some associated processx code used in cmdstanr here: cmdstanr/R/run.R, lines 755 to 760 in c681d32.
The HPC node is a Linux system, so I don't think it's related to how cmdstanr uses processx for mac_os or wsl systems.