Wrapper: Monitor status of child task process even if parent exits #3307
base: master
On Windows, CreateProcess() is used to launch tasks, but this on its own does not handle child processes; if the parent task process exits, the workunit will be terminated. If <wait_for_children> is set in the job file, attach the task process to a job object instead, which can then be monitored to determine when all child processes are finished.
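The approach described above can be sketched roughly as follows. This is not the code from the PR: the `TaskJob` struct and `launch_in_job()` are hypothetical names, and starting the process with CREATE_SUSPENDED so that it joins the job before it can spawn anything is an assumption of this sketch. The completion port is what later lets the wrapper learn when the job has no active processes left.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical container for a task running inside a job object. The handle
// members exist only on Windows, so code outside _WIN32 blocks cannot use them.
struct TaskJob {
#ifdef _WIN32
    HANDLE job_handle = NULL;
    HANDLE completion_port = NULL;
    HANDLE process_handle = NULL;
    HANDLE thread_handle = NULL;
#endif
    bool started = false;
};

#ifdef _WIN32
static void close_if_open(HANDLE& h) {
    if (h) { CloseHandle(h); h = NULL; }
}
#endif

// Launches cmdline inside a new job object. Returns false on any failure,
// and always returns false on platforms without job objects.
bool launch_in_job(const std::string& cmdline, TaskJob& task) {
#ifdef _WIN32
    task.job_handle = CreateJobObjectA(NULL, NULL);
    task.completion_port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
    if (!task.job_handle || !task.completion_port) {
        close_if_open(task.job_handle);
        close_if_open(task.completion_port);
        return false;
    }

    // Ask the job to post its notifications to the completion port.
    JOBOBJECT_ASSOCIATE_COMPLETION_PORT assoc;
    assoc.CompletionKey = task.job_handle;
    assoc.CompletionPort = task.completion_port;
    BOOL ok = SetInformationJobObject(task.job_handle,
        JobObjectAssociateCompletionPortInformation, &assoc, sizeof(assoc));

    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    ZeroMemory(&si, sizeof(si));
    ZeroMemory(&pi, sizeof(pi));
    si.cb = sizeof(si);
    std::string buf = cmdline;  // CreateProcess wants a writable buffer

    // Assumption of this sketch: start suspended, attach, then resume, so
    // every child the task spawns is created inside the job.
    ok = ok && CreateProcessA(NULL, &buf[0], NULL, NULL, FALSE,
                              CREATE_SUSPENDED, NULL, NULL, &si, &pi);
    if (ok && !AssignProcessToJobObject(task.job_handle, pi.hProcess)) {
        TerminateProcess(pi.hProcess, 1);
        CloseHandle(pi.hProcess);
        CloseHandle(pi.hThread);
        ok = FALSE;
    }
    if (!ok) {
        close_if_open(task.job_handle);
        close_if_open(task.completion_port);
        return false;
    }
    ResumeThread(pi.hThread);
    task.process_handle = pi.hProcess;
    task.thread_handle = pi.hThread;
    task.started = true;
    return true;
#else
    (void)cmdline;
    (void)task;
    return false;
#endif
}
```

A caller would fall back to the plain CreateProcess() path when `launch_in_job()` returns false or when <wait_for_children> is not set.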
Closing and reopening to rerun CI builds.
When using a job object to handle child processes, the status should still return 0 on success. If a child exits abnormally, the completion code is set to JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS, so just return that for now.
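One way to picture the status handling described in this commit: each message read from the job's completion port updates a small state record. `JobStatus` and `on_job_message()` are hypothetical names, not taken from the PR. The two constants are the real values from winnt.h, redefined here only so the sketch also compiles off Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Values copied from winnt.h so the sketch compiles on other platforms.
#define JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO 4
#define JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS 8
#endif

// Hypothetical state for a task that uses <wait_for_children>.
struct JobStatus {
    bool finished = false;  // true once no process is left in the job
    int status = 0;         // stays 0 unless a process exited abnormally
};

// Applies one completion-port message to the task state.
void on_job_message(unsigned long msg, JobStatus& js) {
    if (msg == JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS) {
        // A process in the job exited abnormally: report the message code.
        js.status = JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS;
    } else if (msg == JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO) {
        // The parent and all of its children have exited.
        js.finished = true;
    }
    // Other messages (new process, normal exit, ...) are ignored here.
}
```

With this shape, a run in which every process exits cleanly ends with `finished == true` and `status == 0`, while an abnormal exit anywhere in the job leaves the status set to the message code.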
Add routines for handling kill(), stop(), and resume() calls on tasks that use the <wait_for_children> option
The job_handle for job objects is only relevant on Windows, so it should not be referenced outside of _WIN32 blocks
Closing and reopening again to get CI builds to run.
Why is this needed? Normally programs that create children wait for them to finish.
Normally they do, but we are trying to wrap an application that does not follow the normal pattern. It spawns child processes and exits, which in turn makes the wrapper, and thus BOINC, think the app has finished its computations.
The get_job_object_processes() function was not providing a complete list of PIDs, as the cbJobObjectInformationLength parameter passed to QueryInformationJobObject() needed to be larger. It should now accommodate up to 32 processes in the job object. Also related to job control, having no timeout set in the GetQueuedCompletionStatus() call was causing task polling to hang indefinitely when a child process launched another child process. Set the timeout to 3000ms to prevent this.
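As an illustration of the two fixes in this commit, the sketch below sizes the PID-list buffer for 32 entries and polls the completion port with a 3000 ms timeout. The names `pid_list_bytes()`, `query_job_pids()`, and `poll_job_message()` are hypothetical, and the actual get_job_object_processes() in the PR may be structured differently. The size formula relies on JOBOBJECT_BASIC_PROCESS_ID_LIST already containing room for one PID.

```cpp
#include <cstddef>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

const std::size_t MAX_JOB_PIDS = 32;         // limit mentioned in the commit
const unsigned long POLL_TIMEOUT_MS = 3000;  // timeout mentioned in the commit

// Bytes needed for a PID list with max_pids entries (max_pids >= 1).
// struct_bytes is the size of the list struct, which already holds one entry.
std::size_t pid_list_bytes(std::size_t struct_bytes, std::size_t entry_bytes,
                           std::size_t max_pids) {
    return struct_bytes + (max_pids - 1) * entry_bytes;
}

#ifdef _WIN32
// Returns the PIDs currently in the job, or an empty list if the query fails.
// Handling of a truncated result (more than MAX_JOB_PIDS) is omitted.
std::vector<unsigned long> query_job_pids(HANDLE job_handle) {
    const std::size_t bytes = pid_list_bytes(
        sizeof(JOBOBJECT_BASIC_PROCESS_ID_LIST), sizeof(ULONG_PTR), MAX_JOB_PIDS);
    std::vector<unsigned char> buf(bytes);
    JOBOBJECT_BASIC_PROCESS_ID_LIST* list =
        reinterpret_cast<JOBOBJECT_BASIC_PROCESS_ID_LIST*>(buf.data());

    std::vector<unsigned long> pids;
    if (QueryInformationJobObject(job_handle, JobObjectBasicProcessIdList,
                                  list, static_cast<DWORD>(bytes), NULL)) {
        for (DWORD i = 0; i < list->NumberOfProcessIdsInList; i++) {
            pids.push_back(static_cast<unsigned long>(list->ProcessIdList[i]));
        }
    }
    return pids;
}

// Waits up to POLL_TIMEOUT_MS for one job message. Returns false on timeout
// or error, so the caller's polling loop regains control instead of blocking.
bool poll_job_message(HANDLE completion_port, DWORD& msg) {
    ULONG_PTR key = 0;
    LPOVERLAPPED overlapped = NULL;
    return GetQueuedCompletionStatus(completion_port, &msg, &key, &overlapped,
                                     POLL_TIMEOUT_MS) != 0;
}
#endif
```

A fixed cap keeps the query to a single call. A job with more than 32 processes would need either a larger constant or a retry with a bigger buffer.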
@tristanolive @Rytiss This PR has a conflict with current master. Could you please check if the changes are still needed and adjust the code accordingly?
I believe the changes are still needed. @tristanolive - can you mod the code so that it does not conflict?
Description of the Change
On Windows, CreateProcess() is used to launch tasks, but this on its own does not handle child processes; if the parent task process exits, the workunit will be terminated. If <wait_for_children> is set in the job file, attach the task process to a job object instead, which can then be monitored to determine when all child processes are finished.
Alternate Designs
Release Notes
Add <wait_for_children> option for tasks in job.xml