Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[@rushstack/package-deps-hash] Wait for stream close when async executing git commands #4711

Merged
merged 1 commit into from
May 16, 2024

Conversation

akres
Copy link
Contributor

@akres akres commented May 15, 2024

Fixing the race condition mentioned in microsoft/rushstack-websites#231

Summary

As mentioned in the issue, this code contains a race condition risk. The function sets up data event handlers for stdout and stderr streams. Then it waits for the exit event on the process. However, it is possible for the streams' data and close events to arrive after the exit event. In that case, the function returns an empty string, instead of the actual output.

Details

The fix is that we need to wait for the streams to close to get the full output. The node docs state that "The 'close' event is emitted after a process has ended and the stdio streams of a child process have been closed.", but when I tried changin the awaited event from exit to close, it didn't help. So the solution is to wait for all three events.

They also need to be awaited in parallel, because if we first wait for the close event, but the exit event comes first, the following wait for exit hangs the process entirely, because the exit event had already come before we started waiting for it.

How it was tested

The manifestation of this issue is randomly misbehaving incremental build when there are any uncommited changes in the repo.
Suppose I run rush build 5x in a row:
1st run builds everything
2nd run skips everything
3rd run builds packages that contain uncommited changes, even though there was no change done to the files between runs 2 and 3
4th run does the same asi 3rd
5th run skips everything

The race condition occurs on the 3rd run. It causes the git status call to return an empty string, so the deps hash returns hashes of files as they are in HEAD. But the previous run saved the hashes of the changed files, so there is a change detected and the build is triggered. In the 4th run the git status call returns the correct data, but the 3rd saved the incorrect hashes, to the build is triggered again.

This would happen randomly, 5-20% of executions.

I've tested this by running rush build 1000x in sequence and I haven't received a single rebuild or hang, so it should be solid now.

@iclanton
Copy link
Member

@akres - Can you sign the CLA?

@akres
Copy link
Contributor Author

akres commented May 15, 2024

@akres - Can you sign the CLA?

Hi, @iclanton, I need to double check with my company that it is OK. I expect that I will do it tomorrow (my tomorrow, so if all goes well should be signed when you folks show up)

@akres
Copy link
Contributor Author

akres commented May 16, 2024

@microsoft-github-policy-service agree

@akres akres force-pushed the fix-race-condition-in-wsl branch from 0c69bbe to 9f40a15 Compare May 16, 2024 12:12
@D4N14L D4N14L merged commit 73e4e90 into microsoft:main May 16, 2024
5 checks passed
@akres akres deleted the fix-race-condition-in-wsl branch May 20, 2024 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants