Race Condition with ArgoCD deleting PVC #168
It probably has to do with the ownerReference that ties the PVC to the Terraform resource in Kubernetes. When the Terraform resource is deleted, the PVC is deleted. Aside from that, the PVC should only be created if the Terraform resource still exists. I've done manual deletions of the Terraform resource to clean up the PVC, but I have not observed the PVC getting recreated. I wonder what ArgoCD is doing differently. Let me try to understand Argo a little better. If ArgoCD deletes a resource:
For troubleshooting, can you confirm that deleting a Terraform resource manually does not "recreate" the PVC?
Yes, ArgoCD issues a delete to every resource that is owned by the Terraform CR. Because the CR has a finalizer, it is not fully deleted until the terraform-operator removes the finalizer. In the meantime the operator sometimes recreates the PVC, and sometimes throws an error that it can't find the PVC because Argo deleted it.
Yes, and all resources associated with the CR: PVC, CM, Secret, Pods, etc.
Yes, but I tried adding a finalizer to the PVC via the taskOptions script section. With that, the PVC stays but enters the Terminating state. This does not solve the problem, because a terminating PVC can't be bound to a pod, so the delete pods get stuck in Pending waiting for the PVC.
No
Yes, deleting the Terraform CR via kubectl works just as you'd expect. My hacky workaround for now is removing the ownerReferences in the setup script for the PVC (and the CM and Secret, as I have observed the same race condition there) and then deleting the resources via kubectl as the last action in the apply-delete step. All in all, this is definitely a bug caused solely by ArgoCD and not by your work, as ArgoCD does not even respect its own annotations, which I tried to use to prevent deletion of these resources. So I'm not sure if you want to tackle this issue; I could fully understand if you wouldn't want to. But I noticed a bug in your work that would help me out if you could fix it: #169 Thanks for your work!
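The ownerReferences part of that workaround can be sketched in Python. This is only an illustration of the idea, not the setup script used in the thread: the function name `strip_owner_references`, the owner kind `Terraform`, and every value in the sample manifest are assumptions made for the example.

```python
import copy


def strip_owner_references(manifest: dict, owner_kind: str = "Terraform") -> dict:
    """Return a copy of a manifest without ownerReferences of the given kind."""
    result = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    metadata = result.get("metadata", {})
    refs = metadata.get("ownerReferences", [])
    kept = [ref for ref in refs if ref.get("kind") != owner_kind]
    if kept:
        metadata["ownerReferences"] = kept
    else:
        # Drop the key entirely when no owner is left.
        metadata.pop("ownerReferences", None)
    return result


# Hypothetical PVC manifest; names, apiVersion and uid are placeholders.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "name": "example-pvc",
        "ownerReferences": [
            {
                "apiVersion": "example.invalid/v1",
                "kind": "Terraform",
                "name": "example",
                "uid": "uid-1",
            }
        ],
    },
}

cleaned = strip_owner_references(pvc)
print("ownerReferences" in cleaned["metadata"])
```

Without the ownerReference, nothing cascades a delete to the PVC, which is why the workaround also has to delete the resources explicitly at the end of the apply-delete step.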
ArgoCD should only delete resources that are marked with this label:
metadata:
  labels:
    argocd.argoproj.io/instance: xxx
The sub-resources of the terraform-operator resource are not marked with that. I.e., it should delete the terraform-operator resource, which will trigger the finalizers on it, which in turn will run the delete. Then, once the finalizer code allows it, the resource becomes deletable and the sub-resources are marked as deletable.
I totally agree that it SHOULD be like that, but it isn't. The sub-resources do not have the instance label (nor the tracking ID annotation), yet they still get deleted by ArgoCD because it knows the sub-resources through the ownerReferences.
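To check which objects fall into that category, one could list everything that points at the CR through an ownerReference while lacking the instance label. The sketch below works on plain dictionaries and does not reproduce ArgoCD's own logic; the function name `owned_without_instance_label` and the sample objects are hypothetical.

```python
INSTANCE_LABEL = "argocd.argoproj.io/instance"


def owned_without_instance_label(owner_uid, resources):
    """List kind/name of resources owned by owner_uid that lack the instance label."""
    found = []
    for res in resources:
        meta = res.get("metadata", {})
        owners = meta.get("ownerReferences", [])
        labels = meta.get("labels", {})
        is_owned = any(owner.get("uid") == owner_uid for owner in owners)
        if is_owned and INSTANCE_LABEL not in labels:
            found.append(f"{res.get('kind')}/{meta.get('name')}")
    return found


# Hypothetical objects; the uid and names are placeholders.
resources = [
    {
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": "example-pvc",
            "ownerReferences": [{"kind": "Terraform", "uid": "uid-1"}],
        },
    },
    {
        "kind": "ConfigMap",
        "metadata": {"name": "unrelated", "labels": {INSTANCE_LABEL: "xxx"}},
    },
]

print(owned_without_instance_label("uid-1", resources))
```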
Did you forget to set up ArgoCD to actually respect the tf run status?
No, I added the Lua script and that works like a charm. Did you try to reproduce the problem?
The PVC is not still alive; it got recreated. Compare the deletion timestamp of the Terraform CR with the creation timestamp of the PVC: the moment the Terraform CR got deleted, the original PVC got deleted too, but the terraform-operator recreated the PVC instantly, hence the creation timestamp equalling the deletion timestamp. In your picture the creation timestamp of the Terraform CR is one month earlier, and the creation timestamp of the PVC should be too.
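That timestamp comparison can be written down as a small check. This is a sketch under assumptions: the function name `looks_recreated`, the five-second tolerance, and the sample timestamps are made up for illustration, and the values are expected in the `YYYY-MM-DDTHH:MM:SSZ` form that Kubernetes uses for `creationTimestamp` and `deletionTimestamp`.

```python
from datetime import datetime, timedelta

K8S_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def looks_recreated(cr_deletion: str, pvc_creation: str, tolerance_seconds: int = 5) -> bool:
    """True if the PVC was created at or after the moment the CR was marked for deletion."""
    cr_deleted = datetime.strptime(cr_deletion, K8S_TIME_FORMAT)
    pvc_created = datetime.strptime(pvc_creation, K8S_TIME_FORMAT)
    # Allow a small tolerance because both events happen almost simultaneously.
    return pvc_created >= cr_deleted - timedelta(seconds=tolerance_seconds)


# Hypothetical timestamps: a PVC created at the moment of deletion was recreated,
# one created a month earlier is still the original.
print(looks_recreated("2024-05-01T12:00:00Z", "2024-05-01T12:00:00Z"))
print(looks_recreated("2024-05-01T12:00:00Z", "2024-04-01T09:30:00Z"))
```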
Hey,
I love your project, but I'm currently facing an issue when using it in combination with ArgoCD. When I delete a Terraform CR via ArgoCD, ArgoCD also instantly deletes the PVC that was created by the terraform-operator for the CR. Sometimes this leads to the terraform-operator instantly recreating the PVC; sometimes it leads to the terraform-operator getting stuck because the PVC is missing. I tried setting multiple ArgoCD annotations on the created PVC, but to no avail. This seems like an ArgoCD bug, as somebody else is having a similar issue with StatefulSets: argoproj/argo-cd#13503. Do you have a workaround for this problem, or is this not happening for you? I tried with v0.17.0 of the terraform-operator and ArgoCD v2.10.9.