[@rushstack/package-deps-hash] Wait for stream close when async executing git commands #4711
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixing the race condition mentioned in microsoft/rushstack-websites#231
Summary
As mentioned in the issue, this code contains a race condition risk. The function sets up
data
event handlers forstdout
andstderr
streams. Then it waits for theexit
event on the process. However, it is possible for the streams'data
andclose
events to arrive after theexit
event. In that case, the function returns an empty string, instead of the actual output.Details
The fix is that we need to wait for the streams to close to get the full output. The node docs state that
"The 'close' event is emitted after a process has ended and the stdio streams of a child process have been closed."
, but when I tried changin the awaited event fromexit
toclose
, it didn't help. So the solution is to wait for all three events.They also need to be awaited in parallel, because if we first wait for the
close
event, but theexit
event comes first, the following wait forexit
hangs the process entirely, because theexit
event had already come before we started waiting for it.How it was tested
The manifestation of this issue is randomly misbehaving incremental build when there are any uncommited changes in the repo.
Suppose I run
rush build
5x in a row:1st run builds everything
2nd run skips everything
3rd run builds packages that contain uncommited changes, even though there was no change done to the files between runs 2 and 3
4th run does the same asi 3rd
5th run skips everything
The race condition occurs on the 3rd run. It causes the
git status
call to return an empty string, so the deps hash returns hashes of files as they are in HEAD. But the previous run saved the hashes of the changed files, so there is a change detected and the build is triggered. In the 4th run thegit status
call returns the correct data, but the 3rd saved the incorrect hashes, to the build is triggered again.This would happen randomly, 5-20% of executions.
I've tested this by running
rush build
1000x in sequence and I haven't received a single rebuild or hang, so it should be solid now.