Replace deprecated pykube(-ng) library with official Kubernetes python library #3093

AndrewFarley · 2021-08-03T07:48:56Z

Description

The pykube and pykube-ng libraries that Luigi depends on are flawed and don't seem to support more recent versions of Kubernetes. Both libraries appear to be abandonware, so we should pivot to the official python library for Kubernetes. This pivot will simplify some of the handlings of this library, as certain configuration values are already supported and configurable via environment variables, as is the widely accepted standard for Kubectl and its supported tools and toolchain. This also adds the following...

Better/more debug logging about what its doing to/in Kubernetes
Event logging before the pod starts up (so you can see what its doing, eg Pulling Image, etc)
Cleaned up code a bit
Able to detect cluster scaling up (contribution from @dav009)
Able to detect cluster scaling up won't work and failing
WIP Adding more robust tests to ensure we're handling all success and failure conditions properly

Motivation and Context

Luigi wasn't working for me on a number of my clusters, and after further investigation this was largely due to the outdated and deprecated libraries in-use. I've found numerous other projects which have pivoted over to the official library with a lot of success and used those largely as examples for how to do this.

I've added extra comments and documentation in all places that make sense for others to be able to more-easily understand how/what to do to use and adopt Kubernetes support for Luigi.

Have you tested this? If so, how?

I've updated the unit tests and run a few test pipelines successfully with all features.

Items to complete

Implement core feature-set and pivot to the new library
Add documentation
WIP Add/update the test coverage
WIP Add ability to watch logs during a run, not just after a run
Merge and publish a new version for all to enjoy

AndrewFarley · 2021-08-03T08:57:30Z

luigi/contrib/kubernetes.py

-    auth_method = luigi.Parameter(
-        default="kubeconfig",
-        description="Authorization method to access the cluster")
-    kubeconfig_path = luigi.Parameter(
-        default="~/.kube/config",
-        description="Path to kubeconfig file for cluster authentication")
-    max_retrials = luigi.IntParameter(
-        default=0,
-        description="Max retrials in event of job failure")


A note for reviewers. These things are safe to remove, and no longer need to live in code, as the upstream Kubernetes library supports handling them automatically via Environment Variables. The max_retrials also (unless I misunderstand) is completely unnecessary as it duplicates functionality with backoff_limit. Arguably, one config option here I could keep is auth_method so an end-user can "force" which auth method to use. But, the Kubernetes library should be able to automatically detect and cascade into the "mode" it needs to function.

AndrewFarley · 2021-08-03T08:58:25Z

luigi/contrib/kubernetes.py

+        try:
+           kubernetes_api.config.load_incluster_config()
+        except Exception as e:
+           try:
+              kubernetes_api.config.load_kube_config()
+           except Exception as ex:
+              raise ex


Note for reviewers: This is the "new" cascading automated configuration handling. If we're in Kubernetes, it will automatically detect this and use in-cluster configuration and API endpoints, if that fails, it will attempt to use the kubeconfig file and env vars that help support it. This logic could (arguably) be allowed to be forced one way or the other via a config variable.

luigi/contrib/kubernetes.py

dlstadther

left a few comments; once they're addressed, i'll approve

luigi/contrib/kubernetes.py

setup.py

test/contrib/kubernetes_test.py

AndrewFarley · 2021-08-29T15:08:37Z

Note: I am merging and fixing up the MR #3089 in this MR. Picture of proof of work, added tons of debug logging as well. MR will be updated shortly I believe addressing all issues...

AndrewFarley · 2021-08-29T15:12:44Z

@dav009 I merged(ish) your branch into mine logicall, but added code to detect if a scale up failed (aka, it won't fit even with scale up) which the previous code would infinite loop because it wasn't looking into the Kubernetes event stream. Pic of proof...

AndrewFarley · 2021-08-30T05:59:42Z

@dlstadther Please have another look and review code, but please don't merge yet, I've decided I need to add some further tests because it's really abysmal to not really have any decent ones here to ensure things work as they should, there's so many different kinds of failure conditions which are trying to be detected here and I think we should test them. I'm not fantastic with Pytest so it might be a bit slow, but I'm working on adding tests which I think are critical before merging this. Only able to work on this in free time, so if anyone else can/wants to help me with tests, please do. I've left some of my hacking at tests commented out.

AndrewFarley · 2021-08-30T06:02:23Z

And @dav009 please review the code here and make sure I encapsulated what you were trying to do with your MR. Overall, this MR will have a lot better QoL support for Kubernetes. There's still tons more things I think could/should be done in this library but mostly pedantic or improvements and could be in future merges instead. I'm still working on redoing some of the tests and will sorta include your tests once I fix them. Things I'd love to see in the future are for example on API failure or timeout or general internet jitter/brief unavailability some automated retry and/or exponential backoff mechanism.

dav009 · 2021-08-31T07:42:54Z

@AndrewFarley amazing job, thank you. I was also quite frustrated with how outdated this is. I will try to review this soonish.

😅 I made an internal library for my organisation that addresses these (old pykube) and also take into account other issues (logging). I am testing it at the moment. We were planning on releasing it at some point.

dav009 · 2021-11-12T05:23:14Z

@AndrewFarley In my team we ended up implementing this kubernetes contrib update as an independent package:
https://github.com/optimizely/kubeluigi

AndrewFarley · 2022-02-14T06:37:43Z

Ping. Folks, I'm not using Luigi, but this MR still really needs to happen. If anyone has time to rebase and take over this PR, please let me know I'm happy to transfer my fork to you or add you permissions. I know a few folks are using my fork already because it's necessary if you want Kubernetes support, so perhaps one of you can take the reins?

nlahmi-cymotive · 2023-01-24T14:19:59Z

Hi, why was this not reviewed yet? I tested it locally and it seems to work great..

AndrewFarley · 2023-01-26T09:39:06Z

Hah, I wrote this 2 years ago and still no one merged it. Is this project dead? I'm really glad this client of mine I recommended to not choose Luigi. Eesh.

AndrewFarley · 2023-01-26T09:43:14Z

I just rebased and re-pushed and validated this works on Kubernetes still. Hopefully, in the hopes and dreams of fellow engineers, someone can see it in their heart to merge this. Or, just don't use Kubernetes. :P Up to you

dlstadther · 2023-01-28T18:01:29Z

Apologies for this not being reviewed.

As far "is this project dead?" - there is not consistent maintainer involvement (myself included). The project is still owned by Spotify, but not actively developed (to my knowledge, mostly due to their articles related to their intent to replace Luigi with Flyte (note, this is just my personal assumption)). While I have permissions to merge PRs and push to PyPi, I have not used Luigi for development since changing employers a few years ago. I wish to see this project continue, but without an active community and additional maintainer participation, there's only so much I can do. I sympathize with your disappointment and I'm here to help as I reasonably can in my spare time.

dlstadther

In terms of review of this PR, I'm open to merging this code. However, 1 item prevents me and 1 item deters me:

flake8 tests are failing
a bunch of the unittests are commented out with TODO notes

I am willing to let point # 2 to slide so that master has a functional k8s module. But # 1 will need to be fixed before I can click the merge button.

AndrewFarley · 2023-01-28T23:46:24Z

@dlstadther I've just fixed the code styling to satisfy that need. I indeed also don't use Luigi and more, and I don't really have tons of time to write/update/etc the tests to validate that this works. I can confirm it did work for me once upon a time, and others above have recently checked its (basic) functionality. I can/should leave the TODOs in the test code commented out, in hopes that someone else can take up the reigns and add the tests.

At least the flake tests pass now - https://github.com/spotify/luigi/actions/runs/4034116089/jobs/6935056885

But it appears there are some other errors now? I don't have much time this weekend, but I will try to review this further sometime soon.

dlstadther · 2023-01-29T01:31:01Z

@AndrewFarley , thank you for the quick turnaround to address the failing flake8 tests. I'm good with getting this included once test results allow me to merge. I hope whomever will come use this updated k8s support within luigi will contribute back any improvements and test coverage items.

I just checked the current failures and they're all 1 failure:

the test class tries to load a kubeconfig and cannot find one - kubernetes_api.config.load_kube_config()

So looks to me like once a kube config is mocked (or however it is expected) for tests, that we'd be good to go for now.

AndrewFarley · 2023-01-29T20:32:13Z

Well, in order for the Kubernetes stuff to work, it needs to initialize the library and config somehow. I will need to have a think and a google around of some creative way to workaround this for testing only which won't have K8s support.

AndrewFarley requested review from dlstadther, Tarrasch and a team as code owners August 3, 2021 07:48

AndrewFarley commented Aug 3, 2021

View reviewed changes

tnsetting reviewed Aug 16, 2021

View reviewed changes

luigi/contrib/kubernetes.py Show resolved Hide resolved

dlstadther reviewed Aug 17, 2021

View reviewed changes

luigi/contrib/kubernetes.py Outdated Show resolved Hide resolved

luigi/contrib/kubernetes.py Outdated Show resolved Hide resolved

setup.py Outdated Show resolved Hide resolved

test/contrib/kubernetes_test.py Outdated Show resolved Hide resolved

tnsetting mentioned this pull request Aug 18, 2021

Kubernetes contrib : waiting for cluster scale up #3089

Open

AndrewFarley force-pushed the replace-deprecated-pykube-library branch from e991674 to d340099 Compare August 30, 2021 09:30

abjorkl5 mentioned this pull request Aug 30, 2021

Fix the kubernetes test #3101

Open

AndrewFarley added 6 commits January 26, 2023 22:40

adding notes and more variables into kubernetes example

dd3f863

initial poc with official kubernetes library

1aa77be

adding note about library

c0e31b7

not using farleys name

6f91aab

cleaning up MR, still need to fix and add tests

7155d17

Adding better readme to example

a1ef345

AndrewFarley force-pushed the replace-deprecated-pykube-library branch from d340099 to a1ef345 Compare January 26, 2023 09:42

dlstadther reviewed Jan 28, 2023

View reviewed changes

fixing code styling

af5959e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace deprecated pykube(-ng) library with official Kubernetes python library #3093

Replace deprecated pykube(-ng) library with official Kubernetes python library #3093

AndrewFarley commented Aug 3, 2021 •

edited

Loading

AndrewFarley Aug 3, 2021

AndrewFarley Aug 3, 2021 •

edited

Loading

dlstadther left a comment

AndrewFarley commented Aug 29, 2021

AndrewFarley commented Aug 29, 2021

AndrewFarley commented Aug 30, 2021

AndrewFarley commented Aug 30, 2021 •

edited

Loading

dav009 commented Aug 31, 2021

dav009 commented Nov 12, 2021

AndrewFarley commented Feb 14, 2022

nlahmi-cymotive commented Jan 24, 2023

AndrewFarley commented Jan 26, 2023 •

edited

Loading

AndrewFarley commented Jan 26, 2023 •

edited

Loading

dlstadther commented Jan 28, 2023

dlstadther left a comment

AndrewFarley commented Jan 28, 2023 •

edited

Loading

dlstadther commented Jan 29, 2023

AndrewFarley commented Jan 29, 2023

Replace deprecated pykube(-ng) library with official Kubernetes python library #3093

Are you sure you want to change the base?

Replace deprecated pykube(-ng) library with official Kubernetes python library #3093

Conversation

AndrewFarley commented Aug 3, 2021 • edited Loading

Description

Motivation and Context

Have you tested this? If so, how?

Items to complete

AndrewFarley Aug 3, 2021

Choose a reason for hiding this comment

AndrewFarley Aug 3, 2021 • edited Loading

Choose a reason for hiding this comment

dlstadther left a comment

Choose a reason for hiding this comment

AndrewFarley commented Aug 29, 2021

AndrewFarley commented Aug 29, 2021

AndrewFarley commented Aug 30, 2021

AndrewFarley commented Aug 30, 2021 • edited Loading

dav009 commented Aug 31, 2021

dav009 commented Nov 12, 2021

AndrewFarley commented Feb 14, 2022

nlahmi-cymotive commented Jan 24, 2023

AndrewFarley commented Jan 26, 2023 • edited Loading

AndrewFarley commented Jan 26, 2023 • edited Loading

dlstadther commented Jan 28, 2023

dlstadther left a comment

Choose a reason for hiding this comment

AndrewFarley commented Jan 28, 2023 • edited Loading

dlstadther commented Jan 29, 2023

AndrewFarley commented Jan 29, 2023

AndrewFarley commented Aug 3, 2021 •

edited

Loading

AndrewFarley Aug 3, 2021 •

edited

Loading

AndrewFarley commented Aug 30, 2021 •

edited

Loading

AndrewFarley commented Jan 26, 2023 •

edited

Loading

AndrewFarley commented Jan 26, 2023 •

edited

Loading

AndrewFarley commented Jan 28, 2023 •

edited

Loading