New Python Backend wishlist #20897

benjyw · 2024-05-09T14:04:18Z

benjyw
May 9, 2024
Maintainer Sponsor

The Python backend is the oldest part of Pants v2. It was initially designed back in 2019-2020, based on the experiences of its implementers, with some external feedback but also a lot of guesswork. So, while very useful, it has become quite bloated, and doesn't have great support for some real-world use cases.

Now that we have the benefit of several years of intensive usage, we're looking at creating a new, more streamlined Python backend. We want this proposed new backend to support all the common (and 90% of the uncommon) use cases of the existing backend, but to also support new use cases that the current backend doesn't handle well. All this while, ideally, being easier to set up and maintain.

It's important to note that the new backend would be opt-in, and we would not get rid of the old one until the new one achieves feature parity, and then only after a long transition period.

So this discussion is a place for you to throw down your wish lists for Pants' Python support: What are your use cases? What is not currently well-supported that you'd like to see improved? What ideas do you have? Feel free to post your thoughts, and also to comment on or "thumbs up" other posts to express your support for an idea.

(In your posts, please be kind and respectful of the many thousands of hours of work by dozens of people that went into the existing backend...)

benjyw · 2024-05-09T14:07:01Z

benjyw
May 9, 2024
Maintainer Author Sponsor

I'll start: a common request we get on Slack is for different config to apply in different parts of the codebase. So one "big idea" for a new backend is to allow (some) options to be set in BUILD file targets, to apply to the files in that target.
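
To make the idea concrete, here is a minimal sketch of how per-target option overrides might be resolved. Everything in it is hypothetical: the `resolve_option` helper, the override table keyed by directory, and the option names and values are illustrative assumptions, not an existing Pants API.

```python
from pathlib import PurePosixPath

# Hypothetical data: global defaults (as if read from pants.toml) and
# per-directory overrides (as if declared on targets in BUILD files).
GLOBAL_OPTIONS = {"black.version": "24.4.2", "pytest.args": "-q"}
BUILD_OVERRIDES = {
    "src/legacy": {"black.version": "22.12.0"},
    "src/legacy/vendored": {"pytest.args": "-q -x"},
}


def resolve_option(file_path: str, name: str) -> str:
    """Return the value of `name` for `file_path`.

    The nearest enclosing directory that overrides the option wins;
    otherwise the global default applies.
    """
    directory = PurePosixPath(file_path).parent
    # Walk from the file's own directory up towards the repo root.
    for candidate in (directory, *directory.parents):
        override = BUILD_OVERRIDES.get(str(candidate), {})
        if name in override:
            return override[name]
    return GLOBAL_OPTIONS[name]
```

With this table, a file under `src/legacy/vendored` picks up the pytest args set in its own directory and the black version set one level up, while files elsewhere fall back to the global values.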

4 replies

tdyas May 9, 2024
Collaborator

This request is not even Python-specific. For example, one can imagine being able to set tailor-related options in different parts of the source tree.

grihabor May 9, 2024

I've created an issue about it #20898

huonw May 10, 2024
Collaborator

Is this line of thinking stuff like "run different versions of black"?

enriquemaffezzini May 22, 2024

I have a clear example where having different resolves in a monorepo, but only one pytest version setting, is causing trouble. It would be ideal to pass a pydantic resolve through the BUILD file, for example.

benjyw · 2024-05-09T14:12:29Z

benjyw
May 9, 2024
Maintainer Author Sponsor

Another thing I'd like to implement: Getting rid of dummy tailored BUILD files, and just doing the right thing in the 95% of cases where that is obvious. In other words, moving from a target-centric user view to a file/directory-centric one.
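
As a rough sketch of what "doing the right thing" could look like, the function below derives, from file paths alone, the targets that a tailored BUILD file would otherwise have declared. The `infer_targets` helper is hypothetical, and the filename rules (`test_*.py` and `*_test.py` are tests, every other `.py` file is a source) are a simplifying assumption rather than the full set of rules Pants applies.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def infer_targets(files: list[str]) -> dict[str, set[str]]:
    """Map each directory to the target types implied by the files in it."""
    inferred: dict[str, set[str]] = defaultdict(set)
    for file in files:
        path = PurePosixPath(file)
        if path.suffix != ".py":
            continue  # this sketch only looks at Python files
        is_test = path.name.startswith("test_") or path.name.endswith("_test.py")
        target_type = "python_tests" if is_test else "python_sources"
        inferred[str(path.parent)].add(target_type)
    return dict(inferred)
```

A directory would then only need a real BUILD file in the minority of cases where the inferred result has to be overridden.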

5 replies

tdyas May 9, 2024
Collaborator

This idea seems distinct from a new Python backend, although a new Python backend could be written to create synthetic top-level targets based on detecting Python code.

One concern is how users will be able to understand what the new backend has inferred. Debugging magic inference is not fun ...

benjyw May 9, 2024
Maintainer Author Sponsor

Well, I want to at least experiment with not modeling targets as expansively as we do today, even internally.

isra17 May 10, 2024

> One concern is how users will be able to understand what the new backend has inferred.

Isn't that something that could be done simply with the existing previewing goals such as pants list or pants peek?

Just so I understand the idea: are we talking about omitting BUILD files in folders and just inferring what would have been the tailored BUILD file instead? In some ways, this is no different from generator targets such as python_sources, which infer multiple python_source targets?

thejcannon May 13, 2024
Maintainer

I think the nudge I gave @kaos did a lot of the groundwork here. Allow for a root to say "I am a Python (or whatever) root." Then at the various directory levels you can override the default values. Targets are referred to as files.

kaos May 14, 2024
Collaborator

Yeah, I have some half-done work and ideas on this topic. But it doesn't get rid of targets as a concept; it merely adds support for not having to present all the scaffolding in actual BUILD files.

sureshjoshi · 2024-05-09T14:28:19Z

sureshjoshi
May 9, 2024
Collaborator

De-coupling pytest from python test. Especially in this crazy world of Rust-based tooling.

e.g. pants test means pytest - but what happens when Astral eventually introduces a super fast ruff-test or something.

So, maybe having test follow the same opt-in backend style that we use for check/fix/fmt/lint/etc

Ditto for pex, in the sense of not getting rid of pex - but if there was some way to interface it away to reduce some coupling against the specific implementation.

6 replies

benjyw May 9, 2024
Maintainer Author Sponsor

That seems wise.

tdyas May 16, 2024
Collaborator

Side note: This change could be made in the existing backend in a straightforward manner (i.e., rename python_tests to python_pytest_tests etc.) We could even then introduce a unittest test runner (e.g., python_unittest_tests).

sureshjoshi May 17, 2024
Collaborator

As a nitpick, per implementation targets feels gross. I get why we need them, but blah from the API side. I also know we do it in a lot of places already, but... again... blah.

Something we have in the JS backend (I think it's still there, anyways) is a field that lets you pick your implementation/runner for package management via a field (npm, pnpm, yarn). That may be at the NodeJS subsystem level though, I'd need to look at it again.

Would be nice at least for the generators, to reduce the number of concepts new (and honestly, existing) users would need to learn.

python_tests(
  name="",
  runner="nose",
)

The per-implementation field details would be where things fall off the rails. I suppose we could LSP away a lot of the complexity, and then generators could generate a per-implementation variant - copying the associated field data (which would need to have a dict or kwargs or something).

python_tests(
  name="",
  runner="some_rust_runner",
  dependencies=[...],
  rust_version="1.78",
  ...
)

# generates

python_some_rust_runner_test(
  ...
)

I'm entirely handwaving all technical complexity here :)
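
In the same handwaving spirit, the expansion step could be sketched in plain Python as below. The `expand_tests_target` function, the `KNOWN_RUNNERS` set and the `python_<runner>_test` naming scheme are all assumptions that mirror the example above; none of this is an existing Pants API.

```python
# Hypothetical set of runners that have a per-implementation target type.
KNOWN_RUNNERS = {"pytest", "nose", "some_rust_runner"}


def expand_tests_target(name: str, runner: str, **fields) -> dict:
    """Turn a generic python_tests(runner=...) declaration into a
    runner-specific target description, copying the remaining fields over."""
    if runner not in KNOWN_RUNNERS:
        raise ValueError(f"unknown test runner: {runner!r}")
    return {"type": f"python_{runner}_test", "name": name, **fields}
```

The open question from above remains: runner-specific fields such as `rust_version` pass straight through here, so validating them would have to happen in the per-implementation target.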

tdyas May 17, 2024
Collaborator

> As a nitpick, per implementation targets feels gross. I get why we need them, but blah from the API side. I also know we do it in a lot of places already, but... again... blah.

I agree. The per-framework target types almost seem like something which should belong in an "intermediate" build graph. Pants can infer there are tests and build an internal representation of the technologies used. Even python_tests feels too intermediate. Which begs the question: What is the truly high-level config language for builds?

jasondamour Jul 25, 2024

+1 for abstracting away pex; in some cases it might be beneficial to use a venv as the execution context instead

Nishikoh · 2024-05-09T14:32:34Z

Nishikoh
May 9, 2024

The number of 3rd party packages increases with larger repositories. As this number grows, lock generation takes time and affects the developer experience. Just as uv speeds up the dependency lock process, it would be nice if Pants could speed up the process as well.

5 replies

thejcannon May 9, 2024
Maintainer

I think this could probably just be implemented in Pex using pip's fast-deps feature (--use-feature=fast-deps), perhaps?

tdyas May 9, 2024
Collaborator

Can the slow-down be attributed to a lack of built wheels for certain dependencies?

benjyw May 9, 2024
Maintainer Author Sponsor

I was going to play with integrating uv. The main reason we haven't done this in the existing backend is that it can't generate multiplatform lockfiles, but maybe we switch to having separate lockfiles per platform. See pex-tool/pex#2371 for more info.

kaos May 10, 2024
Collaborator

On a related note: it would help to not invalidate the entire world on every change to the lockfile/requirements. This PR is a first step to support tracking affected targets in this case.
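
To illustrate the general idea (this is not a description of what the PR does), a lockfile change can be reduced to the set of requirements whose pins changed, and only targets that depend on those need invalidating. Both helpers are hypothetical, and the sketch assumes each target's third-party dependencies are already known, including transitive ones.

```python
def changed_requirements(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Names whose pinned version was added, removed or changed."""
    return {name for name in old.keys() | new.keys() if old.get(name) != new.get(name)}


def affected_targets(deps_by_target: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Targets that depend on at least one changed requirement."""
    return {target for target, deps in deps_by_target.items() if deps & changed}
```

Bumping a single pin would then only touch the targets that actually use it, rather than everything sharing the resolve.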

Taytay May 17, 2024

Slow lockfile generation will likely cause us not to adopt Pants unfortunately, so using something like uv would be incredible. I realize that this is a bit of apples and oranges, but: It takes 4 minutes on a decent cloud-based machine to generate a lockfile, even with everything fully cached. It takes another 2 minutes to export the environment.
Paying a 6 minute penalty to bump a module version is going to be hard for our team to swallow.

In my experiments with slow generate-lockfiles, even on a fully-cached run (where I had JUST run generate-lockfiles and made no changes), and even on the 2.21.0a branch, 3 minutes were simply spent in pip checking PyPI, realizing that the versions it already had were current, and then another 10s copying some wheels. There are likely some settings I'm missing to speed this up.

I've got a gist of the pip download log here: https://gist.github.com/Taytay/492c12eaedce6c7999e0028fb6a9a50a
Slack post regarding this behavior here: https://pantsbuild.slack.com/archives/C046T6T9U/p1715862601075849

sureshjoshi · 2024-05-09T14:34:45Z

sureshjoshi
May 9, 2024
Collaborator

Unsure if this is Python backend vs Backend in Python vs Pants - but it would be nice if we had some sort of "seamless interop", or at least, first-class citizenship for pyo3/cython/pypy/python-make-go-faster-tools.

I haven't given this any thought past the above statement, I'm looking into a cython plugin/backend again soon, and it's at the top of my mind right now :)

4 replies

tdyas May 9, 2024
Collaborator

Do you mean seamless interop between Rust and Python rule code? Idea: an out-of-tree plugin contains both Python rule code and Rust intrinsics and/or support code callable from Python

sureshjoshi May 10, 2024
Collaborator

> Idea: an out-of-tree plugin contains both Python rule code and Rust intrinsics and/or support code callable from Python

[sharpens pencil, opens notepad]
...go on... 😄

Specifically on the out-of-tree part? Why would it be handled any differently from today?

Don't get me wrong, I've been kinda pitching pushing backends out-of-mainline for a while (with no expectation that can happen until we have API stability)

tdyas May 10, 2024
Collaborator

The thought is a very nascent one: using this new Python backend as an example, imagine if it could be developed out of tree with both its Python rule code and the Rust-based dependency inference code living together in the same repository. And they would be packaged together and consumable by users by just referencing the name of this published plugin in pants.toml.

Note: I am intentionally glossing over how it would be distributed (whether on PyPI, as pexes, or whatever) and how it would be packaged.

What matters is that:

We have an API which allows both Python and Rust code to plug into the Pants core.
There is a packaging mechanism which allows Python and Rust to live together separately from the Pants core.
Some way (other than PyPI?) for discovering and distributing these plugins.

sureshjoshi May 10, 2024
Collaborator

Hmm, that's interesting... I think I need to fiddle around with pyo3/maturin a bit more with in-repo plugins - might help at least refine the proposition.

Nishikoh · 2024-05-09T14:38:09Z

Nishikoh
May 9, 2024

I am running the GPU version of PyTorch with pants run example.py. When executing programs that include torch, the waiting time until execution is long, so it would be great if it could be executed immediately!

2 replies

tdyas May 9, 2024
Collaborator

Do you know if the slow-down is due to some amount of overhead in initializing PyTorch? Are there other causes?

Taytay May 17, 2024

@tdyas : Since this behavior isn't seen outside of pants (that I'm aware of), this is likely due to the sheer size of the pytorch dependency, and likely some pex-related dependency movement/detection, right? I thought some unpacking needed to happen in a case of pants run

yjabri · 2024-05-09T16:51:42Z

yjabri
May 9, 2024

It would be great if updating the lockfile didn't require us to rebuild all of our PEXs. Also being able to update specific 3rd party dependencies in the lockfile while not updating everything.

3 replies

Jackevansevo May 9, 2024

For the latter, there's ongoing work for this here #20364

jyggen May 9, 2024
Collaborator

For the former (which is something I was going to "wishlist" myself), I think #20531 could solve it.

danny-todd-oxb Dec 2, 2024

+1 for this. Currently, we have to regenerate the entire lockfile for a specific singular dependency that changes a lot more frequently than the rest. This also has the side effect of invalidating "irrelevant" cached processes as other dependencies are also updated.

grihabor · 2024-05-09T17:21:20Z

grihabor
May 9, 2024

It would be cool if 400 resolves didn't eat up all the memory 😄 #20568

0 replies

huonw · 2024-05-10T05:07:48Z

huonw
May 10, 2024
Collaborator

Some observations (across several comments for better threading)

I think Pants works especially well when it's as thin a wrapper around the underlying tools as reasonable, and, in particular, ensures that as much of the functionality of those tools is available as possible.

To make this more concrete, this might mean a target like pex_binary wouldn't add its own interpretation of functionality. For instance, "execution mode" doesn't seem to be a pex term, so maybe something like venv: bool | "prepend" | "append" would better reflect the underlying --venv [{prepend,append}] argument.
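
A minimal sketch of how thin that mapping could be, assuming a hypothetical `venv` field with exactly the type suggested above (the `venv_args` helper is illustrative, not existing Pants code):

```python
from typing import Literal, Union

VenvField = Union[bool, Literal["prepend", "append"]]


def venv_args(venv: VenvField) -> list[str]:
    """Translate the field value directly into pex's --venv argument."""
    if venv is False:
        return []
    if venv is True:
        return ["--venv"]
    if venv in ("prepend", "append"):
        return ["--venv", venv]
    raise ValueError(f"invalid venv value: {venv!r}")
```

The field carries no semantics of its own, so anything pex documents about `--venv` applies unchanged.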

I think this includes both "more naive" target fields but also generic "pass these args too" like added in #20737 (the discussion there has some nuance too). For the pass-through args, maybe this would even include things like python_pytest_test(..., extra_pytest_args=[""]) or python_source(..., extra_black_args=["..."]) (this might help with #20897 (comment) above?).

0 replies

huonw · 2024-05-10T05:33:49Z

huonw
May 10, 2024
Collaborator

Some observations (across several comments for better threading)

The resources/files/source files/packaged artifact distinction seems to be a consistent source of confusion on Slack too.

I think the resources/files part is well documented. The latter is things like putting a packaged artefact into another one (e.g. to have a PEX that invokes some binaries):

python_sources(name="src")
pex_binary(name="a", entry_point="a.py")
pex_binary(name="b", entry_point="b.py", dependencies=[":a"])
# (silently) doesn't include the `a.pex` and not sure there's any way to get it there

This isn't a problem with the Python backend alone of course.

1 reply

thejcannon May 13, 2024
Maintainer

I still think the solution to this is "edge metadata" (as @stuhood called it). E.g. allow the user to say how a dependency ought to exist.

sureshjoshi · 2024-05-10T13:37:34Z

sureshjoshi May 10, 2024
Collaborator

Are you referring to dependency-less standalone scripts? Or do you mean under the hood, if we need to create a pex, that should be transparent to the user?

isra17 · 2024-05-10T13:40:49Z

isra17 May 10, 2024

I would say the latter, something that would work the same way as python my/script.py, but let pants build the sandbox.

sureshjoshi · 2024-05-10T13:42:40Z

sureshjoshi May 10, 2024
Collaborator

I thought that already worked?

# BUILD
python_sources(name="libhellofib", sources=["**/*.py"])

pants run examples/python/hellofib/hellofib/main.py
Launching HelloFib from __main__
Calculating fibs took 1.0172736644744873 seconds

# BUILD
python_sources(
    name="libhelloworld",
    sources=["**/*.py"],
    dependencies=[
        "examples/python/core:libcore",
    ],
)

pants run examples/python/helloworld/helloworld/main.py
Launching HelloWorld from __main__
Hello, World!
Goodbye, world!
Greetings, world!

I'm pretty sure it's creating intermediary pexes - just not showing up on the logs

isra17 · 2024-05-10T13:47:13Z

isra17 May 10, 2024

Oh we missed that, you're right!

krishnan-chandra · 2024-05-10T23:43:05Z

krishnan-chandra
May 10, 2024
Collaborator

One thing that would be really nice is default partitioning Python tools by config: #17739

1 reply

krishnan-chandra May 10, 2024
Collaborator

I guess this is basically the same as #20897 (comment), except applied to external tools instead of Pants itself.

sureshjoshi · 2024-05-13T17:05:13Z

sureshjoshi
May 13, 2024
Collaborator

From the monthly meeting:

Is there a way for the new Python backend to define an API (or pseudo API, or API + implementation) that we could backfill into the other backends to share as much code as possible, and reduce the per-backend maintenance burden (especially for those backends where we don't have dedicated users at the moment)?

For example: Python, JVM, and Go backends don't have a lot of shared code and each solve similar problems in different ways (with good historical reasoning for that).

From my personal experience:

The C/C++ backend is a surprisingly small amount of novel code for what it is and leverages a lot of the APIs that we currently use.

Ditto for Swift, where it packages a whole "module's" worth of code together at once, but there really isn't much "swift-centric" Pants code required to get 90% of Pants functionality out of it.

1 reply

jasondamour Jul 25, 2024

I like the sound of this, but it seems to inherently conflict with #20897 (comment) which I also like. There are two philosophies, and I'm not sure which I subscribe to:

Pants should be a super thin wrapper which transparently delegates to language-specific tools and preserves as much of their functionality as possible
Pants should have a single engine which gives a consistent cross-language experience, and only use language-specific tools to map each language onto the interface

I can see each approach being beneficial in different scenarios:

Right now, I'm adding pants to a python-only repository. It's frustrating that pants has so much "bespoke" functionality, so I can't interchange between pants and other tools. I almost want pants to just be a plugin for faster tests, with a config section in pyproject.toml
On the flip side, if I was adding pants to a repo which had multiple languages (i.e. java and python), I might find it really frustrating that dependency resolution logic or syntax is different (i.e. maven vs poetry), and it would be nicer if there was a consistent Pants abstraction across both

aviau · 2024-05-15T22:09:53Z

aviau
May 15, 2024

Would it make sense to provide a way to run tests sequentially without setting PANTS_PROCESS_EXECUTION_LOCAL_PARALLELISM=1?

We are migrating to pants currently and our codebase does not support running tests in parallel. However I see no reason why we wouldn't be able to build sandboxes in parallel.
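
Roughly what we have in mind, as a plain Python sketch rather than anything Pants does today: sandbox setup runs on a thread pool, while a lock makes sure only one test executes at a time. `run_all`, `build_sandbox` and `run_test` are hypothetical names used only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_all(
    tests: list[str],
    build_sandbox: Callable[[str], str],
    run_test: Callable[[str], bool],
    workers: int = 4,
) -> dict[str, bool]:
    """Build sandboxes concurrently, but execute the tests one at a time."""
    run_lock = threading.Lock()

    def one(test: str) -> tuple[str, bool]:
        sandbox = build_sandbox(test)  # may overlap with other tests' setup
        with run_lock:  # at most one test body runs at any moment
            return test, run_test(sandbox)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(one, tests))
```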

3 replies

kaos May 16, 2024
Collaborator

(I'm assuming a Python code base..)
Have you tried using https://www.pantsbuild.org/2.20/reference/targets/python_tests#batch_compatibility_tag and perhaps also bumping https://www.pantsbuild.org/2.20/reference/goals/test#batch_size?

I imagine that you would end up with a single batch (i.e. one pytest invocation with all files) if all python_test targets share the same compatibility tag, and the batch size is large enough to hold them all. Hint: use __defaults__ to set the compat tag on all targets in a subtree using:

__defaults__({python_tests: dict(batch_compatibility_tag="our-compat-tag")})

https://www.pantsbuild.org/2.20/docs/using-pants/key-concepts/targets-and-build-files#field-default-values
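
As a rough illustration of why a shared tag plus a large enough batch size collapses everything into one pytest invocation, here is a sketch of the grouping idea. It is not Pants' actual partitioning code: `partition_into_batches` is a hypothetical function, and it assumes that untagged tests always run on their own and that the tag is the only thing deciding compatibility.

```python
from collections import defaultdict
from typing import Optional


def partition_into_batches(
    tests: dict[str, Optional[str]], batch_size: int
) -> list[list[str]]:
    """Group test files by compatibility tag, then split each group into
    chunks of at most `batch_size` files."""
    by_tag: dict[str, list[str]] = defaultdict(list)
    batches: list[list[str]] = []
    for path, tag in sorted(tests.items()):
        if tag is None:
            batches.append([path])  # no tag: one process per file
        else:
            by_tag[tag].append(path)
    for tag in sorted(by_tag):
        paths = by_tag[tag]
        for start in range(0, len(paths), batch_size):
            batches.append(paths[start : start + batch_size])
    return batches
```

With one tag everywhere, the number of pytest processes is the number of files divided by the batch size, rounded up, so raising the batch size above the file count yields a single batch.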

aviau May 16, 2024

If we did a single batch, wouldn't we lose all caching?

kaos May 16, 2024
Collaborator

yea, it would trade the parallelism for caching.. so not ideal from that perspective.

grihabor · 2024-05-16T08:23:37Z

grihabor
May 16, 2024

It would be nice if the Python backend had better integration with Docker. Right now you have to do a lot of steps to create an optimized Docker image with 2 layers of pexes: third-party dependencies and first-party dependencies. It's even harder to split the third-party dependencies into more layers.

0 replies

isra17 · 2024-05-16T16:10:52Z

isra17
May 16, 2024

Would be nice to be able to add explicit dependencies in python code. I mean something like:

# pants: infer[//my/package/foo.txt]
importlib.resources.files("my.package") / "foo.txt"

Right now this can be done through BUILD files, but having it in the Python code has the advantage of being more obvious while you read the code, and the dependency can be explained by the code in close proximity. Dependencies in BUILD files also tend to not get cleaned up when the related code is. Perhaps something similar could be done for most formats. For example, we have some dynamic code loading from config files in YAML, so something similar could be done:

# pants: infer[//my/package]
package_path: my.package
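
Since both Python and YAML use `#` for comments, one parser could cover both cases. The sketch below is only an illustration of the proposed syntax: the `# pants: infer[...]` directive does not exist in Pants today, and allowing several comma-separated addresses in one directive is an extra assumption on my part.

```python
import re

# Matches the hypothetical directive and captures what is inside the brackets.
INFER_COMMENT = re.compile(r"#\s*pants:\s*infer\[([^\]]+)\]")


def explicit_dependencies(source: str) -> list[str]:
    """Collect the addresses named in `# pants: infer[...]` comments."""
    deps: list[str] = []
    for match in INFER_COMMENT.finditer(source):
        deps.extend(address.strip() for address in match.group(1).split(","))
    return deps
```

A real implementation would want to tokenize the file properly, because this regex also matches directive-looking text inside string literals.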

0 replies

benjyw · 2024-05-16T21:48:05Z

benjyw
May 16, 2024
Maintainer Author Sponsor

We'll want to properly model transitive 3rdparty deps, so that, e.g., dependencies and paths and other graph-querying goals can act on them, and they can be excluded with !! and so on.
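
A toy version of what that buys us: once third-party requirements carry their own edges in the graph, a transitive exclude can prune them like any other node. The graph, the addresses and the `transitive_closure` helper are made up for illustration, and the sketch only honours `!!` excludes declared on the root.

```python
def transitive_closure(graph: dict[str, list[str]], root: str) -> set[str]:
    """Return all dependencies reachable from `root`, skipping anything the
    root excludes with a `!!` prefix."""
    excluded = {dep[2:] for dep in graph.get(root, []) if dep.startswith("!!")}
    seen: set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, []):
            if dep.startswith("!"):
                continue  # exclusion entries are not edges
            if dep in excluded or dep in seen:
                continue
            seen.add(dep)
            stack.append(dep)
    return seen
```

Today the edge from one requirement to another lives only in the lockfile, which is why graph-querying goals cannot see or prune it.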

0 replies

benjyw · 2024-05-16T21:49:46Z

benjyw
May 16, 2024
Maintainer Author Sponsor

We should look into caching certain "facts" such as "this test passes" or "this file passes lint" independently of caching the process that asserted that fact. This allows us to disconnect caching/invalidation granularity from process granularity: We can run processes in batch for efficiency but still cache at a fine-grained level.
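
A minimal sketch of the idea, with made-up names: facts are cached per file under a key of check name plus content hash, while the tool still runs once over all the cache misses. It assumes the fact depends only on the file's own contents, which will not hold for checks that also read config or dependencies.

```python
import hashlib
from typing import Callable

FactKey = tuple[str, str]  # (check name, sha256 of the file contents)


def check_files(
    check: str,
    files: dict[str, bytes],
    run_batch: Callable[[dict[str, bytes]], dict[str, bool]],
    cache: dict[FactKey, bool],
) -> dict[str, bool]:
    """Answer from the per-file fact cache where possible and run a single
    batch process for everything else."""
    results: dict[str, bool] = {}
    misses: dict[str, FactKey] = {}
    for path, content in files.items():
        key = (check, hashlib.sha256(content).hexdigest())
        if key in cache:
            results[path] = cache[key]
        else:
            misses[path] = key
    if misses:
        batch_results = run_batch({path: files[path] for path in misses})
        for path, key in misses.items():
            cache[key] = results[path] = batch_results[path]
    return results
```

Editing one file then re-runs the tool on that file alone, even though the first run covered everything in one process.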

1 reply

tdyas May 16, 2024
Collaborator

This may require some amount of rethinking how Pants caches build results then. [For the benefit of other readers:] Right now, the REAPI model used by Pants enforces using process invocations as the basis of caching and caches using all fields of the process invocation as the cache key (since the Command and Action REAPI structs are hashed). Moreover, those process invocations (if not in the cache) are expected to produce the expected output. If we are going to relax that constraint, then we are no longer in REAPI's data model (which may be fine but it will need thought).

An idea of mine from a while ago: We might work around the REAPI model's constraints by constructing "synthetic" process invocations for which we just upload results which were otherwise computed in a batch via a single process invocation. The synthetic process should still conceivably invoke a process which could recompute just that file's result. This could conceivably work for linting and formatting process invocations.

Taytay · 2024-05-17T11:40:52Z

Taytay
May 17, 2024

First disclaimer: Pants is awesome, and the community is awesome. The work and thoughtfulness of the team is to be commended. I have some criticisms, but I am only being exposed to Python and its tooling in the last year or so, and my needs are simpler than many of Pants' users. Also, I'm using this as a place to document my experience as a new user of Pants and a semi-new user of Python in general. So some of this feedback is Python-specific, and some of it is Pants-specific. Before I have the "curse of knowledge" and know too much about Pants, I wanted to write down my experience in hopes that it helps with the new direction Pants/Python might go!

TL;DR: I'd love for the next Python backend to be faster and simpler for the common case :)

==============

This post comes at a great time for us. I am SO excited about the potential of Pants for Python. I am a believer in monorepos. I am in the middle of rearranging my "organically" grown repo at work that started as a single ML/AI project, and has now morphed into something I am not proud of. :)

"All" I'm trying to do is this: Follow a good convention for organizing my Python-only repo for multiple, potentially related python projects, with a standard way of building, doing CI in Github Actions, and deploying on Docker. I want my daily workflow to be fast, and I want it to be fast for my team. I love opinionated code formatters, and opinionated convention-based build systems. :)

In an effort to learn how Pants worked, I cobbled together a monorepo example based on other examples I found:
https://github.com/taytay/pants-python-template-repo/
The readme in particular is unfinished, but the structure works, the Github actions work.

Pants has some fantastic ideas and implementation, and I love how involved the community is! (People were responding to my Slack post within no time, and there are some great testimonials from teams much larger than mine.)

Here are the other things attracting me to Pants for a Python monorepo.

I like opinionated frameworks and repo-organizations. I frankly don't know how to do a lot of Python-related stuff, so I'm looking to stand on the shoulders of giants here. Pants is built for much more complicated things than my setup.
I love the dependency inference from the large universe of repo-wide dependencies. I like that I can only bundle modules that are actually used by a given project.
Built-in build targets for Pex, Docker, and wheels. I am so new to Python that I don't know how to do any of those things, so someone telling me how to do it is awesome!
Built on rust, (as all of the cool kids are doing these days). Emphasis on speed and caching.
The team seems smart. If they can handle these other complicated use-cases, surely they can handle my simple use case.
Good documentation.
Formatting, checking, and linting are first class citizens! Nice!
Integration with ruff. Yay - it's keeping up with modern tooling offerings!
CI/CD is a first class concept. Examples of Github actions with Pants are out there. There is a built-in Pants installer.
Setting a source root up allows me to easily reference other python modules! This is surprisingly nice because I find Python's module resolution stuff to be surprisingly complicated/finicky compared to other languages.
Lockfiles! Yes! This is SORELY missing from the Python ecosystem, so the Pants developers understand reproducibility!

Here are the things that have concerned me as I learned about it. To be clear, these are not necessarily failings of Pants or the team. Just documenting my experience as a newbie.

Steep learning curve. After about a week, I feel like I have my head wrapped around it. I'm concerned about introducing something to the team that feels so "foreign". I realize that this is likely a "me" problem, and I have no experience with any of the build systems it is built on, which doesn't help. On that note, I kept looking for, "Are you coming from a traditional pyenv background? Here's what you need to know, and here is daily workflow stuff. Here's how you update a module. Here's how you edit and run two related projects in the same repo. Here's when you need to do pants run vs just running the code in your exported environment. etc" One of the most practical guides I eventually found was this one: https://docs.backend.ai/en/latest/dev/daily-workflows.html
I liked it because it told me: "Just do X when you want to do common dev task Y."
I know Pants is doing a lot, so I expect it to be complicated. It would have been nicer if it was one of those things that was extremely deep, but that I only needed to know 10% of to use 10% of the features. Instead, I feel like I needed to know 40% to use the first 5% of the features.

1.5) It took me a long time to just learn how to update the version of a build tool I was using. The tools are built in, which is great, but what if I want to use a new version of ruff? If I hadn't already followed the lockfile-per-tool convention, I don't think I could? I'm still not sure how I'd add another tool that wasn't built-in.

No official VSCode extension. My first thought was: "Uh oh. I might be the only VSCode user of this!" There are some docs around exporting an environment for VSCode, but I'm still not positive it's the happy path. I saw suspenders, and the more I've read, the better and more official it seems, but it does make me think this is not popular in companies/environments like mine.
No frequent mentions of it in Python-Twitter, at least in the AI/ML space. The Pantsbuild Twitter itself is not active. This might seem small, but it made me think I was swimming upstream again or trying to innovate too much by adopting Pants. I'm not looking to innovate here. I think that's likely due to the fact that most people in this world are making single-purpose repos, and Pants is more of an "Enterprise" use-case.
Discoverability of goals seemed harder than something like a Makefile or a package.json that exposes, "These are the 5 things you'd want to do in this repo. Here are the commands to do them." Instead, the list of goals includes other things that my team would not want to do. I created my own scripts for this.
Concerns about adaptability when I want to run something that isn't built-in. I know there is an "ad-hoc" something, but even after reading the docs, I wasn't convinced that I'd be able to easily integrate a new formatter for example. That scares me. What if I want to adopt something before the community/maintainers do? What if a team-member wants to add a new build-script? Do they have to learn pants plugins? 😬
Necessity of a BUILD file in each sub-directory of my modules. This REALLY surprised me when I ran tailor and it suggested a new BUILD file in every folder of what I thought of as a single module. I thought it was a bug. I am new to Python in general, so I was thinking, "Now I need an __init__.py AND a BUILD file in every folder? And they are identical every time? Surely there is a better way..."
No built-in templating for creating a new module/project in my monorepo. People have written scripts to copy a template folder to a new folder from what I've seen, but I think I expected something opinionated like Rails' that knows how to generate a new item of a given type in my repo. It puts it in the right place and fills in the right bits for me.
Some mentions online of needing to "reset" pants when it misbehaves. 😬 That scares me - makes me think I might be burdening my team, and change it from "It just works" to "It works on my machine when everything is okay..."
Speed: I thought that it would install Python for me, and with the appropriate pyenv, it does! Cool! That will make a new dev's life much easier! Unfortunately, it uses pyenv, which requires the compilation of Python. Oh, and the Python is unoptimized. There is no option to download pre-built Pythons. I was surprised I hadn't realized this before. I've been installing Python with Conda, which is faster.
Speed: And then, when I finally got it configured for my existing repo, I hit a wall. It was so slow to generate my lockfile based on my requirements. I thought something had gone wrong. I posted about this up here: New Python Backend wishlist #20897 (reply in thread)
It took about 4 minutes to do the initial generation, and another 4 minutes to export that environment, even when fully cached. Compared to uv, which can finish the "equivalent" in a few ms, I was disappointed.
I saw others with similar concerns. That's the sort of lag that would seriously slow me down. Even if we only update dependencies once a week, that's an 8 minute hit for the person adding the new dependency (IF the new dependency works well the first time!), and a 4 minute hit for everyone else exporting that environment!
A ramdisk helps with environment export, but that's a lot of work to work around that.
Seeing what Pex or Pip was doing during lockfile generation isn't possible/easy. Even with the verbose flags for Pants and Pex, the logs are not easily tailable, and nothing is written to stdout as it churns for 4 minutes. It's impossible to tell if it's progressing during that time. There are mentions in Slack of figuring out where the pip logfile is and then tailing it, and I did that, but it seems like a very frequently asked question (how do I debug a slow or broken pex/pip), with no built-in support for digging in. Even if I learn to do it, I'd hate for others on my team to run into an issue here. This is supposed to reduce per-machine issues, but my fear is that it would exacerbate them. I would have expected there to be an option to stream the backend stdout as it works. If I am running into "difficult-to-investigate" issue when doing the equivalent of a pip install, I'm really nervous.
Knowing if I'm up to date (lockfile and environment export) before working isn't easy. With pip, yarn, (or especially uv) just re-run the command to make sure you've got everything installed. If I re-run these commands in Pants, it takes just as long the 3rd time, whether it's up to date or not.
The "best" speed-related settings for getting my env configured don't seem to be documented in one place. That backend.ai post mentions a ramdisk, but there are some other settings that Pex (and maybe Pants?) seem to have that I might need to enable, and I'm not sure what they are.

==========================

What I'm considering instead:
A) There are options that just use "typical" tools and conventions, like Tweag documented here: https://www.tweag.io/blog/2023-04-04-python-monorepo-1/
https://www.tweag.io/blog/2023-07-13-python-monorepo-2/
and here: https://github.com/tweag/python-monorepo-example
I know they went on to adopt Pants later, which I considered a pretty big vote for Pants.

B) Rye: https://rye-up.com/guide/
It's nascent, but it does some stuff I really like. First, its goals are to make it easy for a newish Python developer to get started with a new or existing repo. Installs pre-built python for me. Easy ramp-up when bringing new devs/machines onboard. Uses uv by default. Allows for monorepo setup. Has lockfiles (not cross platform, but I don't need that I don't think...) I KNOW this is NOT an Apples-to-Apples replacement! Pants does 100x more than Rye.

I don't think Rye is long for this world now that it's been adopted by Astral (of ruff/uv fame), but its successor will presumably be developed/released by them, and I expect its abilities to grow. Also, it's based on standards like requirements.txt, pyproject.toml, venv, etc, so if it goes away, ripping it out will be quick/easy. Frankly, seeing how popular uv and ruff became in a short period of time, I expect their next build-based tool to be similarly popular. It will do ONLY python, and will do 25% of what Pants does, but it will likely be the 25% I need for my Python monorepo.

So if Pants had a similar, "Just use this backend, and we'll bootstrap Python for you, handle linting, dependency installation/subset detection, deployment to common targets, etc" and we're about as fast as using pip or uv by itself, it would be my preferred solution! This might be too big of a departure conceptually, but I actually feel like the seeds are already there (including a rust backend for goodness sakes :) )

Thanks for coming to my talk. ;)

6 replies

sureshjoshi May 17, 2024
Collaborator

Holy crap, this is an incredible write up. It should absolutely be a discussion all of its own - as a lot of these concerns aren't Python specific, but Pants and/or new-user specific.

Just going to touch on a couple of points.

Learning curve

I think better (read: less thorough, but easier to digest) documentation would help. We've talked about a "recipes" kinda page - but yeah, something more like what I put up here https://tldr.inbrowser.app/pages/common/pants would be useful for the first 10 minutes of a user's Pants workflow.

I would also like it if pants itself guided users somehow with trivial suggestions (no idea how this could work or what it would look like).

A split between "starting a new project" and "using an existing project" might also be a good approach.

Documentation suggestions always welcome!!!

No official VSCode extension

Correct, not yet. I wrote Suspenders - and it was on hold while I was pushing through some pants changes that were required to make Suspenders useful. I'm slowly going through the process of adding code lenses, test runners, and LSPs (and unit testing!) - in the next 1-2 months I'm hoping for a decently featured, but still "preview" release.

The BUILD LSP is already a huge win for me locally, but I need to make it tailor itself to each project, rather than just my projects :)

Discoverability of goals seemed harder than something like a makefile or a package.json that exposes

Compared to package.json, yeah - less sure about the makefile comparison 😆

This could be an interesting usage of alias, where each team can carve out its set of scripts, but that would be at a monorepo level currently - rather than a per project level (I think, I've never tried it otherwise).

I personally suggest that to teams, but it's up to them to use it. It's cool though, because it stretches out how long a team can go before looking at the docs (which is great for new users).

For instance, one of the companies I set it up for - my readme (and aliases) are all they need for the most part - the fact that they call pants is almost an afterthought, could just as well be pnpm or cargo to them.

Necessity of a BUILD file in each sub-directory of my modules

You don't need to and I never do it. Gives me flashbacks of my older CMake days.

I personally have a single BUILD file at the top-level of each of my sub-projects (e.g. https://github.com/sureshjoshi/pants-plugins/tree/main/examples/python/hellofastapi has sources="**/*.py")

Tailor automatically generates one per location of sources, by default I think.

No built-in templating for creating a new module/project in my monorepo

Yeah, this is a tricky one because I haven't seen two companies with the same ideas of what a monorepo should look like. Pants prefers to be indifferent to that aspect, so long as BUILD files are present.

Not sure what the solution would be here - other than some cookie cutters that you could init against (ala yeoman)

9/10: Speed

Yeah, there is never a "too fast" :)

Each level of tooling on top of no tooling adds overhead. Personally, speed is my biggest issue with Pants too - but in my case, it's daemon init while working on Pants itself. There's a lot of work going into this side, but I'd agree that we want to optimize the most often called workflows.

Just use this backend, and we'll bootstrap Python for you, handle linting, dependency installation/subset detection, deployment to common targets, etc

I wonder if an interactive tailor might be the partial-solution to this kinda thing. Instead of doing something and having new users read the docs on how to configure it, if it inferred a lot of information and asked clarification questions instead.

There's a lot of work to get to what you described, but making life easier for new users is always a good thing.

TL;DR: I'd love for the next Python backend to be faster and simpler for the common case :)

Yes, yes, a thousand times yes.

Taytay May 18, 2024

Thanks @sureshjoshi ! And thanks for Suspenders!
Yes, even that short documentation helps a LOT! : https://tldr.inbrowser.app/pages/common/pants

Fair point about makefiles being obtuse! I just meant that they expose named targets. :)

You don't need to and I never do it. Gives me flashbacks of my older CMake days.
I personally have a single BUILD file at the top-level of each of my sub-projects (e.g. https://github.com/sureshjoshi/pants-plugins/tree/main/examples/python/hellofastapi has sources="**/*.py")

Oh thank goodness! I think tailor is just eager. :)

Yeah, this is a tricky one because I haven't seen two companies with the same ideas of what a monorepo should look like. Pants prefers to be indifferent to that aspect, so long as BUILD files are present.
Not sure what the solution would be here - other than some cookie cutters that you could init against (ala yeoman)

Oh! That makes sense! I don't know enough to have opinions yet. :) I think I was looking for something a bit more opinionated and hand-holding in that regard. I think the flexibility is awesome in the long run though.

I wonder if an interactive tailor might be the partial-solution to this kinda thing. Instead of doing something and having new users read the docs on how to configure it, if it inferred a lot of information and asked clarification questions instead.

I really like that idea. Of course it would be difficult to make it flexible enough for "ALL" existing repo scenarios, but questions like the following would have helped me tremendously in adopting it for my repo, even if it didn't move anything for me. Of course, they would help with a new repo too...Even if my repo isn't set up correctly at first, knowing that I need to move my code into the 'src' folder after I'm done is super helpful!
"Q: What version of Python do you want to be your default? If you are unsure, just leave it blank. [3.12] : "
"Q: Where will each of your python project folders be located? This is called a "source root". (We recommend that they go into 'src' or 'src/python'.) Don't worry - if there are more locations, you can specify those in a moment. [src/python]: "
"Q: "
"Q: Would you like for me to create two sample projects in 'src/python' that show you some basic concepts? Y/n: "
"Q: Would you like me to create sample Github Action showing how to build and package those sample projects? Y/n: "
"Q: Would you like me to add a section to your readme with common pants commands?"
"Q: What IDE do you use? [VSCode, PyCharm, Other...]"
"Q: Which formatter would you like to use?..."
Summary: [Here's what we did. Here are the common commands you will need to know... Next Steps: ...]
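
To show how small the core of such a prompt flow could be, here is a rough sketch in plain Python. It is not an existing Pants feature: `ask` and `run_wizard` are made-up names, and it only covers the first two questions above (the ones that already have defaults).

```python
def ask(question, default, read=input):
    """Show the default in brackets; a blank reply accepts it."""
    reply = read(f"Q: {question} [{default}]: ").strip()
    return reply or default


def run_wizard(read=input):
    """Ask the first two questions from the list above and collect the answers."""
    return {
        "python_version": ask(
            "What version of Python do you want to be your default?", "3.12", read
        ),
        "source_root": ask(
            "Where will each of your python project folders be located?",
            "src/python",
            read,
        ),
    }


# Scripted replies stand in for a real terminal: blank first answer, "src" second.
scripted = iter(["", "src"])
config = run_wizard(read=lambda prompt: next(scripted))
print(config)
```

Passing the reader in as a parameter keeps the flow testable without a terminal; in real use you would just call `run_wizard()` and let it read from `input`.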

jasondamour Jul 25, 2024

I personally have a single BUILD file at the top-level of each of my sub-projects (e.g. https://github.com/sureshjoshi/pants-plugins/tree/main/examples/python/hellofastapi has sources="**/*.py")

Oh wow I was explicitly avoiding this because I thought pants did "something" at the target-level, like dependency graphing or cache invalidation. Are you saying there's currently no downside to declaring top-level targets?

benjyw Jul 28, 2024
Maintainer Author Sponsor

It's nuanced, but those python_sources() "targets" are actually "target generators" - they are macros that expand to a python_source() target for each underlying file. So there is no downside to having a single top-level target. It used to be that if you wanted to override some field value for a specific file (say a timeout for a test) you had to create a new target for just that file, which meant you had to exclude the file from the sources of the main target, which was a hassle. But now that you can use overrides on that main target, even that reason is moot.
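
For anyone who finds the "target generator" idea abstract, here is a toy model in plain Python. It is not how Pants implements this; `expand_generator`, the `name#file` address format, and the field names are only illustrative. It shows one top-level declaration expanding into one target per file, with per-file overrides applied on top.

```python
from fnmatch import fnmatch


def expand_generator(name, files, sources=("*.py",), overrides=None, **fields):
    """Toy model: produce one generated target per file matching `sources`."""
    overrides = overrides or {}
    targets = {}
    for path in files:
        if not any(fnmatch(path, pattern) for pattern in sources):
            continue
        # Generator-level fields apply to every file; per-file overrides win.
        targets[f"{name}#{path}"] = {**fields, **overrides.get(path, {})}
    return targets


generated = expand_generator(
    name="tests",
    files=["test_fast.py", "test_slow.py", "README.md"],
    sources=("test_*.py",),
    timeout=60,
    overrides={"test_slow.py": {"timeout": 600}},
)
for address, target_fields in generated.items():
    print(address, target_fields)
```

The slow test gets its own timeout without being carved out into a separate target, which is the reason the old "one target per special file" workaround is no longer needed.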

sureshjoshi Jul 28, 2024
Collaborator

@jasondamour Benjy just sniped the answer from me, but higher level, still feeling the sting from years of CMake and even if there were small downsides - I would pay that willingly to not have hundreds of identically named files 😆

Jackevansevo · 2024-05-17T15:35:00Z

Jackevansevo
May 17, 2024

Couple of ideas

Perhaps an opt-in mode that avoids BUILD files. I appreciate BUILD files allow you to progressively adopt pants for subsets of your codebase, but for someone who's 100% adopted it in their mono-repo, perhaps things can be inferred implicitly? Having to run pants tailor :: just to add python_sources() or python_tests() seems unnecessary; surely that could be determined at runtime (using the same heuristics).
Automatically fetch a python interpreter (similar to how rye automatically manages your toolchain from https://github.com/indygreg/python-build-standalone)
I see the utility of pex packages bundling deps into the binary, but I've run into quite a few pex related cross platform issues that would be solved just by shipping a requirements.txt and having the deps installed remotely instead (i.e. inside docker/google-app-engine etc). Perhaps the ability to bypass the pex/packaging step and just utilise dependency inference / tree shaking to output a basic dist/<target> with a requirements.txt?

5 replies

sureshjoshi May 17, 2024
Collaborator

Automatically fetch a python interpreter (similar to how rye automatically manages your toolchain from https://github.com/indygreg/python-build-standalone)

This is funny, because I thought we DID do this (circa this post https://www.pantsbuild.org/blog/2023/03/31/two-hermetic-pythons). scie-pants (aka pants fetches the appropriate python-build-standalone to run itself) and I thought there was a feature which allowed picking a PBS, or grabbing one based on interpreter constraints. I might have conflated it with pyenv though (or just Docker builds: https://www.pantsbuild.org/2.20/reference/subsystems/python-bootstrap#internal_python_build_standalone_info)

I'm a +1 for this being automatic, if it isn't.

Oh, also, I don't think this requires a new Python backend - this should be possible (again, if we're not already able to do it) in the current backend.

benjyw May 18, 2024
Maintainer Author Sponsor

Thanks for this "talk"! It's very very helpful. Particularly since it dovetails with much of my own thinking (getting rid of BUILD files unless needed, leaning in to uv even if it means needing per-platform lockfiles, simplifying the UI, etc etc). The current Python support was strongly influenced by how some specific companies managed their monorepos, so getting a more diverse set of opinions is super helpful.

Taytay May 18, 2024

Honestly avoiding BUILD files scares me a bit. I thought that it was a nice way to structure it: "BUILD files mean to BUILD this folder"...
But that's for someone who is adopting it progressively, so yeah, maybe I'll feel better later. There's a tension between easy/hidden "magic" and explicit repetitive determinism. I thought the BUILD files with very few lines of "code" were a nice compromise, but maybe with the other python-related files (project.toml?) they are repetitive.

isra17 May 18, 2024

The BUILD files get created everywhere when you run pants tailor :: which you probably should run in your CI to make sure you didn't forget it anyway, so there is no special semantic such as "If there's no BUILD file in this folder don't build it" other than "I forgot to run pants tailor". On top of that the BUILD files are 99% of the time just a quick magic spell like python_sources(). It doesn't make anything explicit, I can already see there are .py files in this folder. If anything they are annoying when you clean up some folder, such as removing a conftest.py, and now pants complains about a target with no source.

Taytay May 18, 2024

But YES! Pulling down a python was something I was 1) surprised didn't happen by default, and 2) was slower than I expected after I enabled pyenv. :) I'd love it if a new dev just needed to run pants and would be off to the races...

isra17 · 2024-05-19T22:38:44Z

isra17
May 19, 2024

My current experience of migrating our monorepo to pants has one major pain point: multi-resolves.

So we have a few AI projects or projects that depend on application dependencies with strict constraints, so we have to keep them in a different resolve (so we have something like python-default, python-ai, python-mitmproxy, etc).

I thought I would then just set the resolve on the proper PEX and make sure all the required third-party are in the lockfile.

Turns out these projects depend on other internal modules a bit everywhere. Obviously, starting from scratch, things would have been better organized, but we are migrating to pants for a reason after all.

My main issue so far is that there's no automatic resolve inference. So if a module depends on another module that is not in the same resolve, I get an error. Throw in transitive dependencies and you have me trying to carve out the resolve over our thousands of modules for the last few weeks. Add too much and I get an error about missing third-party dependencies. Add too little and I get an error about a file not in the resolve. Now I have a pile of BUILD files with overrides and many __defaults__ overrides. This is getting ugly and I'm not through it yet.

I feel that pants should be able to infer these and error when some module depends on a third-party not available in the resolve. After all, isn't the whole point of resolves 3rd-party dependencies? Why bother us with 1st-party?
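
To make the request concrete, here is a rough sketch of the kind of inference I mean, in plain Python. None of this is Pants API: the function names, the module names, and the dict-based dependency graph are all made up for illustration. The idea is that a first-party module is compatible with every resolve that provides all of its transitive third-party requirements.

```python
def third_party_closure(module, first_party_deps, third_party_deps, _seen=None):
    """Collect every third-party requirement reachable from `module`."""
    seen = set() if _seen is None else _seen
    if module in seen:
        return set()  # already visited; also guards against import cycles
    seen.add(module)
    needed = set(third_party_deps.get(module, ()))
    for dep in first_party_deps.get(module, ()):
        needed |= third_party_closure(dep, first_party_deps, third_party_deps, seen)
    return needed


def compatible_resolves(module, first_party_deps, third_party_deps, resolves):
    """Return the resolves whose lockfile covers everything `module` needs."""
    needed = third_party_closure(module, first_party_deps, third_party_deps)
    return sorted(name for name, provided in resolves.items() if needed <= provided)


# Hypothetical repo: two apps sharing one internal library.
first_party_deps = {"app.ai": ["lib.utils"], "app.web": ["lib.utils"]}
third_party_deps = {
    "app.ai": ["torch"],
    "app.web": ["django"],
    "lib.utils": ["requests"],
}
resolves = {
    "python-default": {"requests", "django"},
    "python-ai": {"requests", "torch"},
}

for module in ("lib.utils", "app.ai", "app.web"):
    print(module, compatible_resolves(module, first_party_deps, third_party_deps, resolves))
```

Here the shared library would be usable from both resolves without any BUILD file annotation, and an empty result would be the one case that deserves an error: a module that needs a third-party package no resolve provides.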

1 reply

jasondamour Jul 25, 2024

Exactly this!! #21194

sureshjoshi · 2024-06-01T14:05:45Z

sureshjoshi
Jun 1, 2024
Collaborator

Just ran into another one which is specific to Python/Pex.

In some form, a lot of the other backends rely on Pex for tooling - which makes Pants itself harder to hack on, to try out experiments, or test out random ideas. I ran into this with the cc backend a bit, and now with docker.

I've lost count of the cyclic dependencies I've run into just trying to throw some Getters into random places in the code to see what happens if we change the order of calls or something, only to get hit with a cyclic dependency that, 8-10 levels down, comes back to util_rules.pex.

Example: Last night + this morning, I'm at 2 hours of fighting cyclic dependencies trying to test out a new docker feature by using a pex_binary as my example code, but as a result of using that - I basically have to do everything the perfectly correct way, without yet actually knowing what that will look like.

So, I guess the weirdness comes about by requiring the Python backend to be a dependency of other backends (by virtue of needing Pex/PythonToolBase) which is fine - but then the dependency tree explodes as a result.

This might fall into the camp of a well defined API, or using visibility rules, or something from the get-go - I'm not really sure.

1 reply

tdyas Jun 23, 2024
Collaborator

Should the new backend ditch pex as the underlying venv setup technology?

Would it be easier to just have Pants build venvs directly?

cognifloyd · 2024-06-06T23:39:19Z

cognifloyd
Jun 6, 2024
Collaborator

I've been trying to address #15481, but it's proving difficult. I need to make some python_test targets depend on an entry point or a group of entry points defined on a python_distribution target. Once I can record that dependency, I can easily add codegen rules to inject the appropriate entry_points.txt file into the python test sandbox.

I looked at turning python_distribution into a target generator, but hit several issues because the generated entry point targets represent only a subset of the python_distribution's metadata, and parametrize is only supported on the generated targets (ie on the moved fields, but not on the copied fields).

1 reply

cognifloyd Jun 21, 2024
Collaborator

I ended up adding yet another special dependencies field in #21062 to add this to the existing python backend.

The new python backend should have a richer dependencies field that allows edge tagging/annotation to describe how a dependency is required (as a wheel file? As an editable install? As a fully installed wheel? As a runtime package dependency? As a file? As a resource?) or in this case, which entry points are required.
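
As a rough illustration of what I mean by edge tagging, here is a sketch in plain Python. This is not an existing or proposed Pants API: `DependencyKind`, `TaggedDependency`, the address, and the entry point group name are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class DependencyKind(Enum):
    """How the dependency should be made available to the dependent target."""

    WHEEL_FILE = "wheel_file"
    EDITABLE_INSTALL = "editable_install"
    INSTALLED_WHEEL = "installed_wheel"
    RUNTIME_PACKAGE = "runtime_package"
    FILE = "file"
    RESOURCE = "resource"


@dataclass(frozen=True)
class TaggedDependency:
    """A dependency edge plus annotations describing how it is required."""

    address: str
    kind: DependencyKind = DependencyKind.RUNTIME_PACKAGE
    # Entry point groups the dependent target needs from this dependency.
    entry_points: tuple[str, ...] = ()


dep = TaggedDependency(
    address="src/python/myplugin:dist",
    kind=DependencyKind.EDITABLE_INSTALL,
    entry_points=("myapp.plugins",),
)
print(dep)
```

With the annotation living on the edge itself, a case like the entry points one would not need yet another special-purpose dependencies field.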

tdyas · 2024-06-25T18:03:25Z

tdyas
Jun 25, 2024
Collaborator

Contra-viewpoint (for discussion purposes): Should we just evolve the existing Python backend instead of rewriting it but do so in a way which minimizes impact on existing users of the Python backend?

We need not do the evolution in its current place in the Pants repository. Rather, what if we adopt the idea of "channels" (e.g., stable, nightly) for the Python backend like how Rust is distributed? That is, fork the Python backend into its own repository but keep the existing "stable" version in the main repository. Then start evolving the Python backend in its own repository as an "edge" or "nightly" channel. When the dev channel of the backend becomes stable enough to cut a release, we would then merge the changes into the main Pants repository.

For the sake of making the point, I am glossing over things like not losing git history in moving from dev/edge to stable, what constitutes a good point to stabilize a set of changes in the new channel, etc.

Basically, instead of rewriting the software, let's focus more on how we develop the software including where development takes place and how we package and distribute the software.

3 replies

benjyw Jun 26, 2024
Maintainer Author Sponsor

The purpose here is to reimagine Python support without the conceptual burden of all the legacy cruft, based on the many lessons we've learned in the last 4-5 years. That seems very hard to achieve via evolution.

tdyas Jun 26, 2024
Collaborator

Maybe evolution is the wrong way to describe what I am suggesting. The thought is to do breaking changes in a separate repository ("channel"), get it tested, stabilize it, and then merge it into main for release as the next iteration of the Python backend. Then repeat that process again and again.

Let's find a way to develop breaking changes in a way that does not break main nor regular Pants releases. I believe sequencing a number of smaller stabilized breaking changes over time will end up better for users and for Pants developers. Users will always have a working Pants release with Python support. As developers, we won't have to spend inordinate amounts of time building out a fully featured rewritten backend before we consider having users port to it. (And we would need to do that because as a user I am less likely to use a new backend which does not support everything I need out of it.)

We have already seen this issue with Pants v2 over Pants v1. While Pants v2 probably required being a rewrite, even to this day Pants v2 still does not support the full breadth of the user-facing features that Pants v1 did (outside of Python). I would set a high bar for treating this as a full rewrite versus a sequence of breaking changes developed to the side.

kuza55 Jul 25, 2024

As a user who is excited about improvements to the Python backend, but is unaware of any tradeoffs, I would personally prefer an incremental approach where improvements can be shipped sooner, rather than waiting for a full rewrite before seeing improvements.

gcbirzan · 2024-07-02T20:44:33Z

gcbirzan
Jul 2, 2024
Sponsor

For a relatively large monorepo, where we were disciplined enough to create libraries and declare our own dependencies on them, we find the fact that every file is a target to be a hindrance. We have a huge number of targets, which causes both slowness and memory usage issues. Even just passing a list of targets to pants and waiting for it to start running the tests takes up to 3-4 minutes at times.

I'm not sure if anything can be done without changing how pants works fundamentally, but it'd be nice to have an option.

3 replies

sureshjoshi Jul 28, 2024
Collaborator

Can you give some stats on how many targets/files/etc you have? For a cold run, sure, but even for warm/hot runs it takes that long?

njgrisafi Aug 23, 2024
Collaborator

I also experience this and it's extremely painful

Here's the SCC result

$ scc .
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                   61481  11128150   816441    321013  9990696     418941
HTML                     21527   4603162   858670    788851  2955641          0
JSON                     10798   8408038      482         0  8407556          0
Plain Text                4385    210323    22270         0   188053          0
Markdown                   885     66991    16496         0    50495          0
CSV                        567   1832170      149         0  1832021          0
XML                        513    104085       19         4   104062          0
XML Schema                 362     77045     2929         0    74116          0
YAML                       291     35289      475       356    34458          0
Shell                      289     14743     2044      1492    11207       1630
Smarty Template             62      2702      353         0     2349        387
SQL                         45     38095      319       121    37655        131
INI                         37       999      176       167      656          0
TOML                        35      5544      112        60     5372          3
ReStructuredText            30      2854      912         0     1942          0
JavaScript                  21     13040      119        93    12828        160
gitignore                   16       339       59        57      223          0
BASH                        10      1060      124       112      824        148
CSS                          8      1469      259         2     1208          0
Expect                       7        85        2         0       83         11
Properties File              7        67        6         6       55          0
CloudFormation (YAM…         6       178        7        24      147          0
Dockerfile                   6        86       17         4       65         10
Mako                         3        72       21         0       51          0
Stata                        3        45        0         0       45          0
Makefile                     2        43       11         7       25          6
SVG                          2        41        1         1       39          0
Web Services Descri…         2      1600        0        27     1573          0
Docker ignore                1        23        0         0       23          0
Jinja                        1        50        0         0       50         16
TypeScript                   1       264       41        59      164         30
───────────────────────────────────────────────────────────────────────────────
Total                   101403  26548652  1722514   1112456 23713682     421473
───────────────────────────────────────────────────────────────────────────────

Pants is so slow and resource-intensive that it's becoming unusable. If there's a pants command I can run to provide the number of targets, lmk.

njgrisafi Aug 23, 2024
Collaborator

To be clear, both cold run and warm runs are slow, similar to what's described here #18911

kuza55 · 2024-07-25T17:11:32Z

kuza55
Jul 25, 2024

The main complaint I have gotten about pants is that it is slow, in two particular areas: generate-lockfiles and building requirements pexes (which still happens very often).

It also doesn't play very nicely with ML code, due to the very heavy deps that ML packages have and the performance slowdown that comes from that in pants' view of the world.

I would also be interested in improvements to remote cache hit rates for dev machines.

But in general, as a tool that is in the hot loop of development, better user experienced performance is highly desirable.

5 replies

isra17 Jul 25, 2024

We've been running pants for a little bit and this is one of the main pieces of feedback from the developers.

We thought about using pants run as our driver, but even with warm shared cache between containers, this has been a very bad idea. Booting up the pants scheduler alone takes between 10-20 seconds, and you can't use the daemon for more than one command at a time.

For now we are backtracking on using pants run and instead using the venv from pants export.

sureshjoshi Jul 28, 2024
Collaborator

@kuza55 How often do you find yourself generating lockfiles? Is it a frequent part of the workflow - or an occasional pain?

I've said it elsewhere in this discussion, but I'd take improved performance over basically anything else - as I'm a very impatient person (apparently).

There are a few changes coming down which should hopefully help a little, but any stats or backfilling of what packages you use that cause the most grief would also be helpful (maybe we should have a per-step tracking ticket/discussion where people could flood their requirements.txt and benchmarks).

We thought about using pants run as our driver, but even with warm shared cache between containers, this has been a very bad idea. Booting up the pants scheduler alone takes between 10-20 seconds, and you can't use the daemon for more than one command at a time.

@isra17 Two things you said feel concerning already:

The scheduler taking 10-20 seconds is something we run into on pants mainline, but I have a ticket pointing out that a couple of backends are doing some things they shouldn't, and we can bring that down to 6 seconds on my machine (from 20-30). I've never experienced anything remotely close to that as a consumer of pants though. I'll create a tracking ticket for this when I'm back at my computer later today
"Can't use the daemon for more than one command at a time" - this feels like something janky in the config or setup, could you create a ticket for this (if you already haven't)? This is not a thing that should happen.

benjyw Jul 28, 2024
Maintainer Author Sponsor

I assume "more than one command at a time" means "more than one run of Pants at a time"? You can run multiple goals in a single Pants invocation, but they will run serially.

kuza55 Jul 28, 2024

Generate-lockfiles slowness impacts are not distributed evenly. If you're trying to do something new and need a new dep, or you're trying several different libraries, etc, it can be really painful; I think it regularly takes like 10 minutes for a single invocation.

This also hurts pants beginners more because they don't understand when they need to run this, and that they should basically avoid this if they can, so they get stuck running this command more than they need to.

I didn't mention it here since I thought it could get resolved separately from this, but multiple pants invocations at once is a big concern for me too. Mostly because of long running test & run targets, i.e. #20642

I also have a feeling that some other performance issues I experience are downstream of using PANTS_CONCURRENT to work around that issue.

sureshjoshi Jul 28, 2024
Collaborator

Tracking ticket for generate-lockfiles slowness. There are a large number of possible solutions/improvements floating around, would be good to have some canonical repos to test against:

#21223

AdamHess · 2024-07-29T16:41:12Z

AdamHess
Jul 29, 2024

Would love to see support for Pyink; it's more flexible than black and is fully compatible with it.

1 reply

sureshjoshi Jul 29, 2024
Collaborator

That's not really related to this backend wishlist.

pyink can be added as another formatter backend already - in the same way we have ruff and black: https://www.pantsbuild.org/2.21/docs/writing-plugins/common-plugin-tasks/add-a-formatter
