api: update enableRdma/rdmaIsolation field for spiderMultusConfig #4301

Open
wants to merge 1 commit into
base: main
@@ -163,6 +163,7 @@ spec:
properties:
ibKubernetesEnabled:
default: false
description: Enforces ib-sriov-cni to work with ib-kubernetes.
type: boolean
ippools:
description: SpiderpoolPools could specify the IPAM spiderpool
@@ -179,17 +180,30 @@ spec:
type: object
linkState:
default: enable
description: 'Enforces link state for the VF. Allowed values:
auto, enable, disable.'
enum:
- auto
- enable
- disable
type: string
pkey:
description: InfiniBand pkey for VF, this field is used by ib-kubernetes
to add pkey with guid to InfiniBand subnet manager client e.g.
Mellanox UFM, OpenSM
type: string
rdmaIsolation:
default: true
description: rdmaIsolation enable RDMA CNI plugin is intended
to be run as a chained CNI plugin. it ensures isolation of RDMA
traffic from other workloads in the system by moving the associated
RDMA interfaces of the provided network interface to the container's
network namespace path.
type: boolean
resourceName:
description: The SR-IOV RDMA resource name of the SpiderMultusConfig.
the SR-IOV RDMA resource is often reported to kubelet by the
sriov-device-plugin.
type: string
required:
- resourceName
@@ -210,11 +224,14 @@ spec:
type: array
type: object
master:
description: name of the host interface to create the link from.
type: string
type: object
ipvlan:
properties:
bond:
description: Optional bond configuration for the CNI. It must
not be nil if the multiple master interfaces are specified.
properties:
mode:
format: int32
@@ -229,10 +246,6 @@ spec:
- mode
- name
type: object
enableRdma:
Collaborator

This is a breaking version change. Shouldn't the field only be marked as deprecated?

Collaborator Author

I thought about that too. Nothing references this code at the moment, so it seems we could remove it outright?

Collaborator @weizhoublue, Nov 25, 2024

It is not a question of code references.
Could this cause problems in the upgrade process, and how would existing instances be converted?

Collaborator Author

By upgrade I mean running the new image against the old CRD. The new image never uses the enableRdma field, so the upgrade is unaffected. I tested this manually and everything worked. Since nothing references the field, I wanted to remove it in one step.

Collaborator @weizhoublue, Nov 26, 2024

That is not realistic. Swapping only the image is not an upgrade, because it skips upgrading and converting the CRD. Existing environments would then never benefit from any future feature or new CRD definition, and the only point of upgrading would be to fix bugs.

Collaborator Author

What I mean is that tests pass even when only the image is upgraded and the CRD is not. The normal upgrade flow upgrades both the CRD and the image, so that is even less of a concern. After the CRD is upgraded, the field is removed from the CRs automatically.

Collaborator Author @cyclinder, Nov 26, 2024

Whether to remove it in one step is your call; I am fine either way. @weizhoublue

Collaborator

"Tests pass even when only the image is upgraded and the CRD is not": that is not a real service upgrade flow, so it is out of scope.

"The normal upgrade flow upgrades both the CRD and the image": once the CRD is upgraded and the field is deleted, do the affected instances still work, and has that been verified?

Collaborator Author

This has been verified, and it works.

default: false
description: enable share rdma for ipvlan
type: boolean
ippools:
description: SpiderpoolPools could specify the IPAM spiderpool
CNI configuration default IPv4&IPv6 pools.
@@ -247,15 +260,23 @@ spec:
type: array
type: object
master:
description: The master interface(s) for the CNI configuration.
At least one master interface must be specified. If multiple
master interfaces are specified, the spiderpool will create
a bond device with the bondConfig by the ifacer plugin.
items:
type: string
type: array
rdmaResourceName:
description: Resource name of the rdma device-plugin, If it's
empty and enableRdma is true, the value will be auto set by
operator. and the user can also set this value manually.
description: The RDMA resource name of the nic. the RDMA resource
is often reported to kubelet by the k8s-rdma-shared-dev-plugin.
when it is not empty and spiderpool podResourceInject feature
is enabled, spiderpool can automatically inject it into the
container's resources via webhook.
type: string
vlanID:
description: 'The VLAN ID for the CNI configuration, optional
and must be within the specified range: [0, 4094].'
format: int32
maximum: 4094
minimum: 0
@@ -266,6 +287,8 @@ spec:
macvlan:
properties:
bond:
description: Optional bond configuration for the CNI. It must
not be nil if the multiple master interfaces are specified.
properties:
mode:
format: int32
@@ -280,10 +303,6 @@ spec:
- mode
- name
type: object
enableRdma:
Collaborator

Same here.

default: false
description: enable share rdma for macvlan
type: boolean
ippools:
description: SpiderpoolPools could specify the IPAM spiderpool
CNI configuration default IPv4&IPv6 pools.
@@ -298,15 +317,23 @@ spec:
type: array
type: object
master:
description: The master interface(s) for the CNI configuration.
At least one master interface must be specified. If multiple
master interfaces are specified, the spiderpool will create
a bond device with the bondConfig by the ifacer plugin.
items:
type: string
type: array
rdmaResourceName:
description: Resource name of the rdma device-plugin, If it's
empty and enableRdma is true, the value will be auto set by
operator. and the user can also set this value manually.
description: The RDMA resource name of the nic. the RDMA resource
is often reported to kubelet by the k8s-rdma-shared-dev-plugin.
when it is not empty and spiderpool podResourceInject feature
is enabled, spiderpool can automatically inject it into the
container's resources via webhook.
type: string
vlanID:
description: 'The VLAN ID for the CNI configuration, optional
and must be within the specified range: [0, 4094].'
format: int32
maximum: 4094
minimum: 0
@@ -361,6 +388,7 @@ spec:
properties:
enableRdma:
default: false
description: DEPRECATED, use RdmaIsolation field instead.
type: boolean
ippools:
description: SpiderpoolPools could specify the IPAM spiderpool
@@ -376,14 +404,28 @@ spec:
type: array
type: object
maxTxRateMbps:
description: Mbps, 0 = disable rate limiting
minimum: 0
type: integer
minTxRateMbps:
minimum: 0
type: integer
rdmaIsolation:
default: false
description: rdmaIsolation enable RDMA CNI plugin is intended
to be run as a chained CNI plugin. it ensures isolation of RDMA
traffic from other workloads in the system by moving the associated
RDMA interfaces of the provided network interface to the container's
network namespace path.
type: boolean
resourceName:
description: The SR-IOV RDMA resource name of the SpiderMultusConfig.
the SR-IOV RDMA resource is often reported to kubelet by the
sriov-device-plugin.
type: string
vlanID:
description: 'The VLAN ID for the CNI configuration, optional
and must be within the specified range: [0, 4094].'
format: int32
maximum: 4094
minimum: 0
9 changes: 6 additions & 3 deletions docs/reference/crd-spidermultusconfig.md
@@ -71,6 +71,7 @@ This is the SpiderReservedIP spec for users to configure.
| master | the Interfaces on your master, you could specify a single one Interface<br/> or multiple Interfaces to generate one bond Interface | list of strings | required | |
| vlanID | vlan ID | int | optional | [0,4094] |
| bond | expected bond Interface configurations | [BondConfig](./crd-spidermultusconfig.md#bondconfig) | optional | |
| rdmaResourceName | RDMA resource name of the spiderMultusConfig; it is often reported to kubelet by the k8s-rdma-shared-dev-plugin. When it is not empty and the spiderpool podResourceInject feature is enabled, spiderpool can automatically inject it into the container's resources via webhook | string | optional | |
| ippools | the default IPPools in your CNI configurations | [SpiderpoolPools](./crd-spidermultusconfig.md#spiderpoolpools) | optional | |

#### SpiderIPvlanCniConfig
@@ -80,24 +81,26 @@ This is the SpiderReservedIP spec for users to configure.
| master | the Interfaces on your master, you could specify a single one Interface<br/> or multiple Interfaces to generate one bond Interface | list of strings | required | |
| vlanID | vlan ID | int | optional | [0,4094] |
| bond | expected bond Interface configurations | [BondConfig](./crd-spidermultusconfig.md#bondconfig) | optional | |
| rdmaResourceName | RDMA resource name of the spiderMultusConfig; it is often reported to kubelet by the k8s-rdma-shared-dev-plugin. When it is not empty and the spiderpool podResourceInject feature is enabled, spiderpool can automatically inject it into the container's resources via webhook | string | optional | |
| ippools | the default IPPools in your CNI configurations | [SpiderpoolPools](./crd-spidermultusconfig.md#spiderpoolpools) | optional | |
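
As an illustration of the `rdmaResourceName` row above, here is a minimal sketch of an ipvlan SpiderMultusConfig that sets it. The `apiVersion`, the object name, the master interface `eth0`, and the pool name are assumptions for illustration only; the resource name follows the `spidernet.io/shared_cx5_gpu*` pattern used in the macvlan guide touched by this PR.

```yaml
# Sketch only: apiVersion, metadata.name, master and the pool name are assumed values.
apiVersion: spiderpool.spidernet.io/v2beta1   # assumed API version
kind: SpiderMultusConfig
metadata:
  name: gpu6-ipvlan            # hypothetical name
  namespace: spiderpool
spec:
  cniType: ipvlan
  ipvlan:
    master: ["eth0"]           # hypothetical host interface; at least one is required
    # Shared RDMA resource, typically reported by k8s-rdma-shared-dev-plugin.
    # With podResourceInject enabled, the webhook can inject it into Pod resources.
    rdmaResourceName: spidernet.io/shared_cx5_gpu6
    ippools:
      ipv4: ["gpu6-net"]       # hypothetical IPPool name
```

Note that this PR removes `enableRdma` from the ipvlan and macvlan schemas, so the sketch deliberately leaves it out.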

#### SpiderSRIOVCniConfig

| Field | Description | Schema | Validation |
|---------------|-------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
| resourceName | this property will create an annotation for Multus net-attach-def to cooperate with SRIOV | string | required |
| resourceName | this property will create an annotation for Multus net-attach-def to cooperate with SRIOV, if spiderpool podResourceInject feature is enabled, spiderpool can automatically inject it into the container's resources via webhook | string | required |
| vlanID | vlan ID | int | optional |
| minTxRateMbps | change the allowed minimum transmit bandwidth, in Mbps, for the VF. Setting this to 0 disables rate limiting. The min_tx_rate value should be <= max_tx_rate. Support of this feature depends on NICs and drivers | int | optional |
| maxTxRateMbps | change the allowed maximum transmit bandwidth, in Mbps, for the VF. Setting this to 0 disables rate limiting | int | optional |
| enableRdma | enable rdma chain cni to isolate the rdma device | bool | optional |
| enableRdma(deprecated) | It will be removed in the future; use rdmaIsolation instead. | bool | optional |
| rdmaIsolation | Enables the RDMA CNI plugin, which is intended to run as a chained CNI plugin. It isolates RDMA traffic from other workloads in the system by moving the RDMA interfaces associated with the provided network interface into the container's network namespace. | bool | optional |
| ippools | the default IPPools in your CNI configurations | [SpiderpoolPools](./crd-spidermultusconfig.md#spiderpoolpools) | optional |
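
For reference, a minimal sketch of an SR-IOV SpiderMultusConfig that uses `rdmaIsolation` in place of the deprecated `enableRdma`. The `spec` mirrors the snippet changed in `get-started-sriov-zh_CN.md` in this PR; the `apiVersion` and the `metadata` values are assumptions.

```yaml
# Sketch only: apiVersion and metadata are assumed; spec values come from the PR's sriov guide.
apiVersion: spiderpool.spidernet.io/v2beta1   # assumed API version
kind: SpiderMultusConfig
metadata:
  name: gpu1-sriov             # hypothetical name
  namespace: spiderpool
spec:
  cniType: sriov
  sriov:
    resourceName: spidernet.io/gpu1sriov
    # Replaces the deprecated enableRdma field; defaults to false for sriov.
    rdmaIsolation: true
    ippools:
      ipv4: ["gpu1-net11"]
```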

#### SpiderIBSRIOVCniConfig

| Field | Description | Schema | Validation |
|----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|------------|
| resourceName | this property will create an annotation for Multus net-attach-def to cooperate with ib-sriov | string | required |
| resourceName | this property will create an annotation for Multus net-attach-def to cooperate with SRIOV, if spiderpool podResourceInject feature is enabled, spiderpool can automatically inject it into the container's resources via webhook | string | required |
| pkey | InfiniBand pkey for VF, this field is used by [ib-kubernetes](https://github.com/Mellanox/ib-kubernetes) to add pkey with guid to InfiniBand subnet manager client e.g. [Mellanox UFM](https://www.nvidia.com/en-us/networking/infiniband/ufm/) | string | optional |
| linkState | Enforces link state for the VF. Allowed values: auto, enable [default], disable | string | optional |
| rdmaIsolation | Enable RDMA network namespace isolation for RDMA workloads, default to true | bool | optional |
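
The IB-SRIOV fields above can be combined as in the following sketch. The `apiVersion` and the `resourceName` value are hypothetical; the object name, namespace, and `pkey` value are taken from the ib-sriov example touched by this PR. Because the CRD schema declares `pkey` as a string, the value is quoted here, which is an editorial assumption.

```yaml
# Sketch only: apiVersion and resourceName are assumed values.
apiVersion: spiderpool.spidernet.io/v2beta1   # assumed API version
kind: SpiderMultusConfig
metadata:
  name: ib-sriov
  namespace: spiderpool
spec:
  cniType: ib-sriov
  ibsriov:
    resourceName: spidernet.io/gpu1ibsriov   # hypothetical resource name (required field)
    pkey: "1000"               # schema type is string, so quoted here
    linkState: enable          # allowed values: auto, enable (default), disable
    rdmaIsolation: true        # defaults to true for ib-sriov
    ibKubernetesEnabled: false # default; set to true to work with ib-kubernetes
```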
1 change: 0 additions & 1 deletion docs/usage/install/ai/get-started-macvlan-zh_CN.md
@@ -300,7 +300,6 @@
spidernet.io/shared_cx5_gpu6: 1
spidernet.io/shared_cx5_gpu7: 1
spidernet.io/shared_cx5_gpu8: 1
#nvidia.com/gpu: 1
EOF

$ helm install rdma-tools spiderchart/rdma-tools -f ./values.yaml
1 change: 0 additions & 1 deletion docs/usage/install/ai/get-started-macvlan.md
@@ -299,7 +299,6 @@ The network planning for the cluster is as follows:
spidernet.io/shared_cx5_gpu6: 1
spidernet.io/shared_cx5_gpu7: 1
spidernet.io/shared_cx5_gpu8: 1
#nvidia.com/gpu: 1
```

During the creation of the network namespace for the container, Spiderpool will perform connectivity tests on the gateway of the macvlan interface.
10 changes: 5 additions & 5 deletions docs/usage/install/ai/get-started-sriov-zh_CN.md
@@ -360,7 +360,7 @@ Spiderpool 使用了 [sriov-network-operator](https://github.com/k8snetworkplumb
cniType: sriov
sriov:
resourceName: spidernet.io/gpu1sriov
enableRdma: true
rdmaIsolation: true
ippools:
ipv4: ["gpu1-net11"]
EOF
@@ -598,10 +598,10 @@ Spiderpool 使用了 [sriov-network-operator](https://github.com/k8snetworkplumb
name: ib-sriov
namespace: spiderpool
spec:
cniType: ib-sriov
ibsriov:
pkey: 1000
...
cniType: ib-sriov
ibsriov:
pkey: 1000
...
EOF
```
