Automatic PyPi package rating and removal #16923

Open
LucaCappelletti94 opened this issue Oct 19, 2024 · 0 comments
Labels
feature request requires triaging maintainers need to do initial inspection of issue

Comments

@LucaCappelletti94

What's the problem this feature will solve?
At this time, there are lots of dead packages hosted on PyPI.

These packages are characterized by no link to the source code, no README, and sometimes a single, almost empty release made long ago by a user who has never logged in since. This impacts name availability, which seems to have a rather large backlog at the time of writing, and generally makes PyPI suffer from package rot: more and more of the results I get when I search for a package name are just dead things.

Describe the solution you'd like
This may be professional bias (déformation professionnelle), but I believe it should be possible to create a ranking of sorts for package quality, based on an open-source algorithm that people may contribute to. This algorithm would take a package directory as input, alongside its structure and metadata, and output a score depending on how many desirable traits the package has (or lacks). Packages that fall outside a certain percentile of the score distribution would get flagged, and an email would be sent to their owner advising them that the package has entered a grace period. If the owner takes no action by the end of that period, the package would be deleted, or its ownership transferred to another user who may take over its development.
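To make the flagging step more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `flag_low_scoring`, the 5th-percentile cutoff, the 90-day grace period, and the reading of "outside a certain percentile" as "at or below a low percentile of all scores". None of this is an existing PyPI or pip API.

```python
import math
from datetime import date, timedelta


def flag_low_scoring(scores, percentile=5.0, grace_days=90, today=None):
    """Map each flagged package name to the date its grace period ends.

    `scores` is a hypothetical {package_name: score} mapping; the default
    percentile and grace period are placeholder values, not a proposal.
    """
    if not scores:
        return {}
    today = today or date.today()
    ranked = sorted(scores.values())
    # Nearest-rank percentile: index of the score at the requested percentile.
    k = math.ceil(percentile / 100 * len(ranked)) - 1
    k = max(0, min(len(ranked) - 1, k))
    cutoff = ranked[k]
    deadline = today + timedelta(days=grace_days)
    # Every package scoring at or below the cutoff enters the grace period.
    return {name: deadline for name, s in scores.items() if s <= cutoff}
```

Sending the email and deciding between deletion and ownership transfer are left out on purpose, since those are policy questions rather than something a score can settle.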

Examples of such rules
Some example rules follow. I stress that I believe these should be defined in an open-source library and gradually improved upon by the community:

  • Whether the package has a README, and whether that README contains human-readable content (not just gibberish) of some minimum length, longer than, for instance, the default README of a GitHub repository. Otherwise it does not function as documentation.
  • How long ago the package was last updated
  • How many releases of the package have been made
  • How long ago the owner last logged in
  • How many downloads the package has had in the last year
  • Whether the package has a link to the source code, and whether that link works (is not a 404)
  • How large the code base is. In some cases, it may be almost empty.

I believe such simple rules could already help eliminate a large number of dead packages.
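As a rough sketch of how the rules above could combine into a single score, consider the Python below. The `PackageInfo` fields, the `score` function, and every threshold are hypothetical placeholders that I would expect the community to tune; the "not gibberish" README check is approximated by a word count only, and the link check is assumed to have been done beforehand and stored as a boolean.

```python
from dataclasses import dataclass


@dataclass
class PackageInfo:
    """Hypothetical per-package facts, gathered elsewhere before scoring."""
    readme_text: str
    days_since_last_release: int
    release_count: int
    days_since_owner_login: int
    downloads_last_year: int
    source_url_ok: bool  # link present and not returning a 404
    code_bytes: int      # total size of the package's source files


def score(pkg: PackageInfo) -> float:
    """Return the fraction of desirable traits the package has (0.0 to 1.0)."""
    checks = [
        len(pkg.readme_text.split()) >= 50,   # README has a minimum length
        pkg.days_since_last_release <= 730,   # updated in the last two years
        pkg.release_count >= 2,               # more than a single release
        pkg.days_since_owner_login <= 365,    # owner still active
        pkg.downloads_last_year >= 100,       # somebody is using it
        pkg.source_url_ok,                    # working source link
        pkg.code_bytes >= 1024,               # not an almost empty code base
    ]
    return sum(checks) / len(checks)
```

Weighting every rule equally is the simplest choice for a sketch; a real implementation would likely weight them differently and be careful with stable, finished packages that simply have no need for new releases.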

I would most definitely be open to contributing to such a project.

User Interface
This score could be integrated into the PyPI website itself, allowing users who are comparing packages to see their ratings.

pip warning
A pip install operation could warn the user that the package being installed has a very low score and may get deleted.

@LucaCappelletti94 LucaCappelletti94 added feature request requires triaging maintainers need to do initial inspection of issue labels Oct 19, 2024