Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for TLS #692

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Add support for TLS #692

wants to merge 1 commit into from

Conversation

bipuladh
Copy link

@bipuladh bipuladh commented Oct 28, 2024

Changes

  • Adds a requirement for TLS cert for Sidecar Pods
  • Adds a auth header check for bearer tokens on the Sidecar Pods
  • Enables TLS by default on Controller Manager client pool
  • Adds a bearer token to the manager client request headers(Taken from corresponding SA)

Requirements for the above features to work correctly:

  • TokenReview create access for all sidecar containers
  • TLS certs are to be created and mounted via either Service(using annotations) or other mechanism on each of the server pods.

Images for testing:

  • Sidecar: quay.io/badhikar/k8s-sidecar:v1
  • Controller: quay.io/badhikar/k8s-controller:v1

Test logs

2024-11-11T18:12:00.948Z INFO Successfully connected to sidecar {"controller": "csiaddonsnode", "controllerGroup": "csiaddons.openshift.io", "controllerKind": "CSIAddonsNode", "CSIAddonsNode": {"name":"csi-rbdplugin-gbr85","namespace":"openshift-storage"}, "namespace": "openshift-storage", "name": "csi-rbdplugin-gbr85", "reconcileID": "964eea1e-324b-4957-984d-2611972645ed", "NodeID": "ip-10-0-50-177.us-east-2.compute.internal", "DriverName": "openshift-storage.rbd.csi.ceph.com", "EndPoint": "10.0.50.177:9070"}

Copy link
Collaborator

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few minor comments for improving, looks pretty good to me already. Thanks!

internal/kubernetes/namespace.go Outdated Show resolved Hide resolved
tools/go.mod Outdated Show resolved Hide resolved
internal/kubernetes/token/grpc.go Outdated Show resolved Hide resolved
internal/kubernetes/token/grpc.go Outdated Show resolved Hide resolved
@bipuladh
Copy link
Author

@nixpanic should we disable it by default for easy opt-in for users?

sidecar/main.go Outdated Show resolved Hide resolved
@nixpanic
Copy link
Collaborator

@nixpanic should we disable it by default for easy opt-in for users?

If it works on current Kubernetes installations by default, it can be enabled by default IMHO. You may want to add a note in https://github.com/csi-addons/kubernetes-csi-addons/blob/main/docs/deploy-controller.md on the requirements of Kubernetes services, and how to disable it in case those services are unavailable.

@bipuladh
Copy link
Author

@Madhu-1
Copy link
Member

Madhu-1 commented Nov 12, 2024

@Madhu-1 my PR fails until I revert code introduced at https://github.com/csi-addons/kubernetes-csi-addons/pull/695/files#diff-03e0410a78ce106bc8ee10f594661c87995994955bfb6b07972d8fde2ec8a555R125-R146 Could you PTAL?

@bipuladh please use latest yamls from master branch in Rook

Copy link
Member

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a documentation about this feature and as its enabled by default i assume upgraded works as it is (in kubernetes)

}
tlsConfig := &tls.Config{
RootCAs: caCertPool, // The CA certificates to verify the server
InsecureSkipVerify: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont we need to set it to true?

Copy link
Author

@bipuladh bipuladh Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default setting it to false but making it configurable by users from args

internal/connection/connection.go Outdated Show resolved Hide resolved
internal/connection/connection.go Outdated Show resolved Hide resolved
Comment on lines 78 to 89
return status.Errorf(codes.Unauthenticated, "missing metadata")
}

authHeader, ok := md["authorization"]
if !ok || len(authHeader) == 0 {
return status.Errorf(codes.Unauthenticated, "missing authorization token")
}

token := authHeader[0]
isValidated, err := validateBearerToken(ctx, token, kubeclient)
if !isValidated || (err != nil) {
return status.Errorf(codes.Unauthenticated, "invalid token")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you can use status.Error instead of status.Errorf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return the err message as well for better debugging

Comment on lines 52 to 62
file, err := os.Open(tokenPath)
if err != nil {
return "", err
}
defer file.Close()

data, err := io.ReadAll(file)
if err != nil {
return "", err
}
return string(data), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make use of readfile function which is doing the same thing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ioutil readFile is deprecated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant readFile we have below in this file

ss.server = grpc.NewServer(grpc.UnaryInterceptor(token.AuthorizationInterceptor(ss.client)), grpc.Creds(creds))
}
if !ss.enableTLS {
ss.server = grpc.NewServer(grpc.UnaryInterceptor(token.AuthorizationInterceptor(ss.client)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if tls is not enabled why we need to have add UnaryInterceptor here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is something we need to decide. Should we disable auth token verification if TLS is disabled?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing Interceptor when TLS not used.

cmd/manager/main.go Outdated Show resolved Hide resolved
}
tlsConfig := &tls.Config{
RootCAs: caCertPool, // The CA certificates to verify the server
InsecureSkipVerify: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be set to false?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default setting it to false but making it configurable by users from args

internal/kubernetes/token/grpc.go Outdated Show resolved Hide resolved
return nil
}

func removeBearer(authHeader string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func removeBearer(authHeader string) string {
func extractToken(authHeader string) string {

since the function just extracts the token, this naming is more suitable?

sidecar/internal/server/server.go Outdated Show resolved Hide resolved
sidecar/main.go Outdated Show resolved Hide resolved
@bipuladh
Copy link
Author

Please add a documentation about this feature and as its enabled by default i assume upgraded works as it is (in kubernetes)

@Madhu-1 @nixpanic for a smoother upgrade experience, should we disable by default? As mentioned in the PR description the sidecar deployer will have to create a certificate and mount it if not done the sidecar would go to CLBO. I can add this setup procedure on the readme files for reference.
I was thinking in the absence of certs we could automatically disable TLS but this could end up becoming a debugging nightmare. Perhaps logging could help, not sure.

@nixpanic
Copy link
Collaborator

@Madhu-1 @nixpanic for a smoother upgrade experience, should we disable by default? As mentioned in the PR description the sidecar deployer will have to create a certificate and mount it if not done the sidecar would go to CLBO. I can add this setup procedure on the readme files for reference. I was thinking in the absence of certs we could automatically disable TLS but this could end up becoming a debugging nightmare. Perhaps logging could help, not sure.

Yes, users should not need to do anything extra for upgrading. If there are manual steps to configure certificates, disable the feature by default.

Copy link
Collaborator

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Please squash all the commits together so that the Mergify automation can merge it when all approvals are in.

}

token := authHeader[0]
isValidated, err := validateBearerToken(ctx, token, kubeclient)
if !isValidated || (err != nil) {
return status.Errorf(codes.Unauthenticated, "invalid token")
return status.Error(codes.Unauthenticated, fmt.Sprint("invalid token: %w", err))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is where status.Errorf() can be used so that fmt.Sprintf() is not needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please address this as well?

@@ -92,6 +94,8 @@ func main() {
flag.BoolVar(&enableAdmissionWebhooks, "enable-admission-webhooks", false, "[DEPRECATED] Enable the admission webhooks")
flag.BoolVar(&showVersion, "version", false, "Print Version details")
flag.StringVar(&cfg.SchedulePrecedence, "schedule-precedence", "", "The order of precedence in which schedule of reclaimspace and keyrotation is considered. Possible values are sc-only")
flag.BoolVar(&enableTLS, "enable-tls", false, "Enable TLS(enabled by default)")
flag.BoolVar(&skipInsecureVerify, "insecure-skip-tls-verify", false, "skip the certificate check")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
flag.BoolVar(&skipInsecureVerify, "insecure-skip-tls-verify", false, "skip the certificate check")
flag.BoolVar(&skipInsecureVerify, "insecure-skip-tls-verify", false, "skip server certificate verification")

Copy link
Member

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add/update the documentation on how to enable this functionality

@@ -92,6 +94,8 @@ func main() {
flag.BoolVar(&enableAdmissionWebhooks, "enable-admission-webhooks", false, "[DEPRECATED] Enable the admission webhooks")
flag.BoolVar(&showVersion, "version", false, "Print Version details")
flag.StringVar(&cfg.SchedulePrecedence, "schedule-precedence", "", "The order of precedence in which schedule of reclaimspace and keyrotation is considered. Possible values are sc-only")
flag.BoolVar(&enableTLS, "enable-tls", false, "Enable TLS(enabled by default)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to update the comment as its disabled by default

internal/connection/connection.go Show resolved Hide resolved
grpc.WithIdleTimeout(time.Duration(0)),
}

opts = append(opts, token.WithServiceAccountToken())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to move this inside the if check when TLS is enabled?


caFile, caError := token.GetCACert()
if caError != nil {
return nil, (fmt.Errorf("failed to get server cert %w", caError))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, (fmt.Errorf("failed to get server cert %w", caError))
return nil, fmt.Errorf("failed to get server cert %w", caError)

Comment on lines 52 to 62
file, err := os.Open(tokenPath)
if err != nil {
return "", err
}
defer file.Close()

data, err := io.ReadAll(file)
if err != nil {
return "", err
}
return string(data), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant readFile we have below in this file

Comment on lines 96 to 85
func parseToken(authHeader string) string {
if strings.HasPrefix(authHeader, bearerPrefix) {
return strings.TrimPrefix(authHeader, bearerPrefix)
}
return authHeader
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to just to parse, we can remove the if check and trim it

func parseToken(authHeader string) string {
	return strings.TrimPrefix(authHeader, bearerPrefix)
}

Comment on lines +109 to +96
result, err := kubeclient.AuthenticationV1().TokenReviews().Create(ctx, tokenReview, metav1.CreateOptions{})
if err != nil {
return false, fmt.Errorf("failed to review token %w", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont we need RBAC for this one? am not sure we have this RBAC already

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we require. Whoever deploys the sidecar should be responsible to have this.

// create the gRPC server and register services
ss.server = grpc.NewServer()
if ss.enableTLS {
creds, err := credentials.NewServerTLSFromFile("/etc/tls/tls.crt", "/etc/tls/tls.key")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"/etc/tls/tls.crt", "/etc/tls/tls.key" Are we going to have these two files in all the pods in same location when TLS is enabled?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally yes. But I guess we can make it configurable later on if that is preferred.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding these as parameters

if ss.enableTLS {
creds, err := credentials.NewServerTLSFromFile("/etc/tls/tls.crt", "/etc/tls/tls.key")
if err != nil {
klog.Fatalf("Could not find TLS file: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
klog.Fatalf("Could not find TLS file: %v", err)
klog.Fatalf("failed to load TLS certificate and key: %v", err)

@mergify mergify bot added the api Change to the API, requires extra care label Nov 13, 2024
nixpanic
nixpanic previously approved these changes Nov 13, 2024
Copy link
Collaborator

@nixpanic nixpanic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty complete to me, thanks for the contribution!

nixpanic
nixpanic previously approved these changes Nov 13, 2024
Copy link
Member

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits, overall LGTM

docs/deploy-controller.md Outdated Show resolved Hide resolved
opts := []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont we need to set this incase enableTLS is set to false?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. thanks!

@@ -211,6 +213,10 @@ func (r *CSIAddonsNodeReconciler) resolveEndpoint(ctx context.Context, rawURL st
if err != nil {
return "", "", err
}
if r.EnableTLS {
// We need to use this name to accept certificates signed for pods
return podname, podname + "." + namespace + ".pod" + ":" + port, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to work when we are using host networking?

internal/kubernetes/token/grpc.go Outdated Show resolved Hide resolved
}

token := authHeader[0]
isValidated, err := validateBearerToken(ctx, token, kubeclient)
if !isValidated || (err != nil) {
return status.Errorf(codes.Unauthenticated, "invalid token")
return status.Error(codes.Unauthenticated, fmt.Sprint("invalid token: %w", err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please address this as well?

internal/kubernetes/token/grpc.go Outdated Show resolved Hide resolved
"net"

"github.com/csi-addons/kubernetes-csi-addons/internal/kubernetes/token"
"google.golang.org/grpc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a new line for separation of imports

sidecar/internal/server/server.go Outdated Show resolved Hide resolved
@Madhu-1
Copy link
Member

Madhu-1 commented Nov 13, 2024

@bipuladh i tested this PR on the existing cluster without enabling TLS. its not working, can you please ensure that the functionality works with/without TLS?

@bipuladh
Copy link
Author

@bipuladh i tested this PR on the existing cluster without enabling TLS. its not working, can you please ensure that the functionality works with/without TLS?

Tested with the newly updated code. Working fine now.
Regarding host networking I think instead of relying on Pod IP we might have to create a service on front of it (the operators that deploy sidecar) so that we can have a DNS mapping. These service names can be put by the operator that deploys on the Pods in their annotations. We can then get the Service DNS key by querying it. This way we could update the code for TLS part to look into that and document it. WDYT?

@bipuladh
Copy link
Author

/hold
ceph/ceph-csi-operator#173
Waiting on the decision of using Service over IP.
And also how the name of the service is to be communicated to the Server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Change to the API, requires extra care
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants