-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RayCluster][Fix] leave .Status.State untouched when there is a reconcile error #2622
[RayCluster][Fix] leave .Status.State untouched when there is a reconcile error #2622
Conversation
36d4cb3
to
90d7377
Compare
do we still need this in case #2357 (comment)? |
I think we still need this. Currently, we mark the RayCluster CR as This PR will make |
maybe we need to check the number of Pods before passing to the function |
Yeap, checking |
does it include in |
Yes, it is included in the And I think it is not very good to call |
This is a bug. We should not call |
Discussed offline:
|
57b4fbf
to
456eb43
Compare
…cile error Signed-off-by: Rueian <[email protected]>
…esiredWorkers+Head Signed-off-by: Rueian <[email protected]>
456eb43
to
605dd33
Compare
Updated. Test cases related to |
@@ -1784,6 +1801,147 @@ func TestCalculateStatus(t *testing.T) { | |||
assert.True(t, meta.IsStatusConditionPresentAndEqual(newInstance.Status.Conditions, string(rayv1.RayClusterReplicaFailure), metav1.ConditionTrue)) | |||
} | |||
|
|||
func TestCalculateStatusWithoutDesiredReplicas(t *testing.T) { | |||
setupTest(t) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Add comments to explain this test.
- Add assertions to ensure that
testRayCluster
fulfills the assumptions made by this test (i.e., the worker group's replicas should not be 0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
func TestCalculateStatusWithReconcileErrorBackAndForth(t *testing.T) { | ||
setupTest(t) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
…esiredWorkers+Head Signed-off-by: Rueian <[email protected]>
Why are these changes needed?
We accidentally set the
.State
of RayCluster CR torayv1.Ready
even if there was a reconcile error in PR #2258 where we removedrayv1.Failed
. This made users unable to deal withfalse ready
RayClusters with no pod actually created, see #2357.But since #2258 has been released and the
rayv1.Failed
has gone, I think we can't bring backrayv1.Failed
either now. This PR fixes the issue by leaving the.State
field untouched when there is a reconcile error but setting the error message in the.Reason
field. That means the initial.State
for case #2357 will beblank
and the pod creation error will be in the.Reason
field. The new unit test case in this PR also tests this behavior.Example result:
Related issue number
Closes #2357
Checks