fix/optimize the logic of finding cluster service/pod ip range #3285

unilinu · 2024-11-28T07:58:18Z

What type of PR is this?

Fix or Optimize

What this PR does / why we need it:

According to the original logic, but fix some issues and optimize:

Remove find pod ip range from pod's spec. The node.spec.podCIDR field is not the cluster pod ip range, but the node-assigned.

kubectl explain node.spec.podCIDR shows:
PodCIDR represents the pod IP range assigned to the node

Correct the label selector of kube-proxy from component=kube-proxy to k8s-app=kube-proxy
Similar to findPodIPRange function, add the redundant finding from kube-controller-manager.

submariner-bot · 2024-11-28T07:58:23Z

🤖 Created branch: z_pr3285/unilinu/fix/optimize-finding-cluster-iprange
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

tpantelis · 2024-11-28T12:42:43Z

pkg/discovery/network/generic.go

@@ -78,22 +77,31 @@ func discoverNetwork(ctx context.Context, client controllerClient.Client) (*Clus

 func findClusterIPRange(ctx context.Context, client controllerClient.Client) (string, error) {
 	clusterIPRange, err := findClusterIPRangeFromApiserver(ctx, client)
-	if err != nil || clusterIPRange != "" {
-		return clusterIPRange, err
+	if clusterIPRange != "" {


Why did you remove the error checking here and other places below? This hides the error which isn't best practice. An error here means it could not determine if the IP range could be obtained, most likely because the API server wasn't available, in which case we want to abort and report the error.

Well, It seems I understand wrong the purpose of these error checking. I thought this function is to detect the available ClusterIP range, but it seems to detect the availability of component/service related to ClusterIP. I will cancel the related modifications later.

In addition, I think the point 1&2 should be retained, but for point 3, I wonder if add the redundant finding from kube-controller-manager.

@tpantelis Another question, the original if conditions err != nil || clusterIPRange != "" is almost true, which will cover/skip the following judging!!
In my opinion, this function is used to find the service cluster ip range. Even if occurs error, which should be detected and handled somewhere, adding a waring/error log but continuing finding is a good idea here.

If you agree, I will modify it as above. If not, looking forward to your reply.

In my opinion, this function is used to find the service cluster ip range. Even if occurs error, which should be detected and handled somewhere, adding a waring/error log but continuing finding is a good idea here.

If you agree, I will modify it as above. If not, looking forward to your reply.

I don't agree. Consider this scenario:

Let's say in a particular environment the first call, findClusterIPRangeFromApiserver, under normal circumstances with no error finds the cluster IP. But on a particular run a transient error occurs, eg the API server is busy and times out, in which case we would just log a warning and continue. For the rest of the code below, the API server responds and no error occurs but the cluster IP isn't found by any of the following methods. This function would return and report to the caller that no cluster IP was found and no error occurred. But this is incorrect as an error did occur but it was essentially ignored by just logging it. We must propagate errors to the caller to indicate that the result was indeterminate and to let the caller handle the error as it sees fit.

I can understand it. So, each error should be checked and not be covered by assignments. However, there is no doubt that err != nil || clusterIPRange != "" condition will cover/skip the following statements. This point must be corrected. I'll modify the PR correspondingly.

No - the error checking was fine as is. Given your recent changes and the scenario I outlined above, it would return a valid cluster IP and an error which is non-sensical and non-idiomatic.

Please just focus on the purpose of this PR as outlined in your summary and not modify the existing error handling.

Thanks for your patience. I know what I misunderstood. err != nil || clusterIPRange != "" is not always true. There is a situation not finding cluster ip range from the args of apiserver/controller-manager.

I have modified my patch, removing the changes of error checking.

tpantelis · 2024-12-04T02:29:18Z

pkg/discovery/network/generic.go

-	// the cluster is a single node deployment, we should rely on the node.Spec.PodCIDR as podCIDR of the cluster.
-	if len(nodes) == 1 {
-		return nodes[0].Spec.PodCIDR, nil
-	}


From the PR summary:

The node.spec.podCIDR field is not the cluster pod ip range, but the node-assigned.

But this is only considered if there's only one node as explained in the comments above and if the pod CIDR wasn't found from any prior methods. It seems valid to keep this.

tpantelis · 2024-12-04T02:39:43Z

@unilinu The DCO job failed - you need to add Signed-off-by your commit.

unilinu requested review from Oats87, skitt, sridhargaddam, tpantelis and vthapar as code owners November 28, 2024 07:58

tpantelis requested changes Nov 28, 2024

View reviewed changes

unilinu force-pushed the fix/optimize-finding-cluster-iprange branch from e588a15 to 128bf17 Compare December 3, 2024 02:45

fix/optimize the logic of finding cluster service/pod ip range

66c3f51

unilinu force-pushed the fix/optimize-finding-cluster-iprange branch from 128bf17 to 66c3f51 Compare December 4, 2024 01:41

tpantelis requested changes Dec 4, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix/optimize the logic of finding cluster service/pod ip range #3285

fix/optimize the logic of finding cluster service/pod ip range #3285

unilinu commented Nov 28, 2024

submariner-bot commented Nov 28, 2024

tpantelis Nov 28, 2024

unilinu Nov 29, 2024

unilinu Dec 2, 2024

tpantelis Dec 2, 2024 •

edited

Loading

unilinu Dec 3, 2024

tpantelis Dec 3, 2024 •

edited

Loading

unilinu Dec 4, 2024

tpantelis Dec 4, 2024

tpantelis commented Dec 4, 2024 •

edited

Loading

fix/optimize the logic of finding cluster service/pod ip range #3285

Are you sure you want to change the base?

fix/optimize the logic of finding cluster service/pod ip range #3285

Conversation

unilinu commented Nov 28, 2024

What type of PR is this?

What this PR does / why we need it:

submariner-bot commented Nov 28, 2024

tpantelis Nov 28, 2024

Choose a reason for hiding this comment

unilinu Nov 29, 2024

Choose a reason for hiding this comment

unilinu Dec 2, 2024

Choose a reason for hiding this comment

tpantelis Dec 2, 2024 • edited Loading

Choose a reason for hiding this comment

unilinu Dec 3, 2024

Choose a reason for hiding this comment

tpantelis Dec 3, 2024 • edited Loading

Choose a reason for hiding this comment

unilinu Dec 4, 2024

Choose a reason for hiding this comment

tpantelis Dec 4, 2024

Choose a reason for hiding this comment

tpantelis commented Dec 4, 2024 • edited Loading

tpantelis Dec 2, 2024 •

edited

Loading

tpantelis Dec 3, 2024 •

edited

Loading

tpantelis commented Dec 4, 2024 •

edited

Loading