Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a separate action for removing old wheels #95

Open
wants to merge 33 commits into
base: main
Choose a base branch
from

Conversation

agriyakhetarpal
Copy link

@agriyakhetarpal agriyakhetarpal commented Sep 27, 2024

Description

This PR closes #62; by converting the .github/workflows/remove-wheels.yml workflow to a composite action in remove-wheels/action.yml that users can trigger in their workflows separately to delete binaries uploaded to the https://anaconda.org/scientific-python-nightly-wheels/ index or other Anaconda.org indices.

This way, users should be able to add, for example:

- name: Remove old wheels
  uses: scientific-python/upload-nightly-action/[email protected] # or, better, via the commit hash
  with:
    anaconda_token: ${{ secrets.ANACONDA_NIGHTLY_UPLOAD_TOKEN }} # same as that for uploads (convenience?)
    n_latest_uploads: '5'
    anaconda_user: 'my-custom-org'

or similar variations, to delete any but the last five wheel uploads, which means that this repository does not need to do so and can delegate this responsibility to the workflow files for the twenty-six packages listed under the https://anaconda.org/scientific-python-nightly-wheels/repo page.

Changes made

  • A new action in ./remove-wheels/action.yml that accepts an organisation name and the number of latest uploads to keep
  • The older wheel removal script moved to its dedicated file in remove_wheels.sh
  • Documentation updates in the README

Checklist

  • Resolve the to-do items (some of them have been marked as inline comments in the code for now)
  • Add relevant documentation, and adjust headings in the README
  • If .github/workflows/remove-wheels.yml stays, figure out how to not unintentionally delete wheels without the existence of packages-ignore-from-cleanup.txt

I've been meaning to do this for some time now, and I didn't receive a response to #62 (comment) back then. However, since it was mentioned that PRs would be welcome for reviews, and the repository has recently received quite a few updates (and a new release! 🎉), so I finally spent thirty minutes of my time writing this out—mostly just adding the script to a separate file—wherein I've added the necessary scaffolding around its creation as a separate, distributable action.

@agriyakhetarpal
Copy link
Author

Okay, I've just read some discussion(s) on keeping the removal of old wheels "centralised" in #4 (comment) and in the comments after it. Those made sense to me, so, this PR could be considered in conjunction with that policy – to let SPNW clean up old wheels itself, and let these changes be a "refactor" of the wheel removal logic in a sense, i.e., from being embedded in a workflow to a separate action that gets used in it. This would also ease the removal process around those who want to take up the responsibility of removing their wheels that have been uploaded outside SPNW-managed Anaconda.org indices. It is to be noted that that discussion predates the ones linked in #62, so it is likely the case that requests for easier removals have arisen in more contexts by now.

@bsipocz
Copy link
Member

bsipocz commented Sep 28, 2024

Yes, having the cleanup parts moved to a new action and then using it sounds to be the right way to go, that way we expose the functionality for easy reuse while we also keep using it.

@matthewfeickert
Copy link
Member

I am in transit and won't have time to review or give feedback on this for a few days. My original intent in Issue #62 was to make a new repository and have that action be totally independent of this repository. Though, we can certainly get this in here and then just move it elsewhere later if people agree that's a good idea.

@agriyakhetarpal
Copy link
Author

agriyakhetarpal commented Sep 28, 2024

That sounds great! Thanks for triggering the workflow for me – I shall revisit this either later in the day or early next week. That said, I don't think one can run the workflow from forks, though... I have an account on Anaconda.org already since I'm helping manage the PyWavelets wheels (#75), should I add a dummy package for testing (in my account, that is)?

@agriyakhetarpal
Copy link
Author

agriyakhetarpal commented Sep 28, 2024

to make a new repository and have that action be totally independent of this repository. Though, we can certainly get this in here and then just move it elsewhere later if people agree that's a good idea.

Personally, I would say that related actions like these ones could be grouped together just fine, since one repository is easier to manage, rather than two – easier permissions to deal with, unified documentation in the README, unified dependency management (using pixi), etc., so this could fit in the same repository very well. For example, https://github.com/github/codeql-action contains six actions, with each managed in its own folder.

Of course, this is just my opinion, so I'm happy to implement whatever works best within the community.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@agriyakhetarpal
Copy link
Author

Thanks for the documentation suggestions, addressed them in 2f62d85 – that commit should resolve the conversations from the review above.

remove-wheels/action.yml Outdated Show resolved Hide resolved
remove-wheels/action.yml Outdated Show resolved Hide resolved
Comment on lines +17 to +20
ANACONDA_USER="${INPUT_ANACONDA_USER}"
ANACONDA_TOKEN="${INPUT_ANACONDA_TOKEN}"
N_LATEST_UPLOADS="${INPUT_N_LATEST_UPLOADS}"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if the Anaconda token that is authorised to maintainers of, say, a package X is restricted to only remove wheels for said package X, or whether it can remove all packages in the index. This is because while users of SPNW won't be affected since we handle the deletions, users with their own organisation with wheels for packages X, Y, and Z being uploaded to it would either:

  • run this action somewhere centrally to remove all (but the latest N) uploads of X, Y, and Z; or
  • they would want to run the deletions from separate repositories for X, Y, and Z and want to delete just X, Y, and Z at a time on each package's own accord and deletion schedule.

Which situation would be more plausible? If it is the latter, we'd have to provide another input in the action for a comma-separated string of packages to delete uploads for (and maybe * for deleting all packages?). To replace the existence of packages-ignore-from-cleanup.txt (which would exist only in this repository – see below), it might also make sense to include a whitelist input, too. If it's the former, it gets easier to implement, but users won't have fine-grained control on automations for what packages are being deleted.

P.S. This is all valid only if I'm not missing something about how the index and its permissions are structured :) Happy to receive others' thoughts!

Copy link
Author

@agriyakhetarpal agriyakhetarpal Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on @tupui's suggestion in #95 (comment), users will need to run the action multiple times to remove old uploads for multiple packages. Hence, the second option of allowing per-package deletions is better, and an input for a list of packages to delete wheels for isn't required.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if someone has an appropriately scoped token for their organisation, they can very well remove multiple wheels from the index at a time, and some users might want to do that. So, it might be necessary to mention in the documentation how the tokens work and pre-emptively warn about possible deletions in an admonition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, users have full control over their project. So they can add or remove any wheel they want on that project. People can just not add members to their project and they need to ask admins.

Comment on lines +41 to +48
# TODO: should be possible to alter this, since separating the workflow
# into two steps, one for uploading and one for cleanup, should make it
# possible for users to manually trigger the cleanup step before/after the
# upload step has completed in their own repos instead of us having to do it.
#
# TODO: raises questions on how to moderate cleanups among multiple users
# operating on the same channel, but that might be a different issue.
curl https://raw.githubusercontent.com/scientific-python/upload-nightly-action/main/packages-ignore-from-cleanup.txt --output packages-ignore-from-cleanup.txt
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see above for context and my question on this. :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put some thought into it, and this can be included as an input in the action. There would be two cases:

  • An API token for package X that can delete older wheels for X
  • An API token for the organisation that can perform index-wide deletions of older wheels for packages

While the former would be recommended to limit deletion scopes, when the latter is used by the action users (as it is being done here for this repo), a whitelist_packages: input can be added to include a comma-separated list of packages ("openblas-libs" in our case).

Another reasonable message could be to add a warning: "Wheels for package X requested for deletion, but X is whitelisted. Please either remove it from the whitelist or try to delete a different package as needed.".

Comment on lines +60 to +66
# Remember can't quote subshell as need to split on (space separated) token
for package_name in $(cat package-names.txt); do
# TODO: this outer loop can be removed when ready since there will be
# just one package to remove when the action is triggered manually from
# a user's (different) repo.

echo -e "\n# package: ${package_name}"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This TODO item would also be resolved if we have a way forward with the questions above, since, if multiple packages are removed at once, this loop can stay; if we were to restrict the action to removing just one wheel at a time (which I don't think we should), then we could remove this. I think an action of the form:

- name: Remove old wheels
  uses: scientific-python/upload-nightly-action/remove-wheels@cantknowhashyet # 0.6.0
  with:
    n_latest_uploads: ${{ env.N_LATEST_UPLOADS }}
    anaconda_nightly_upload_organization: "your-organization"
    anaconda_nightly_token: ${{secrets.ANACONDA_TOKEN}}
    packages_to_remove: "mypackage1,mypackage2,mypackage3" # or just "*"
    # I could have suggested "all" here, but that breaks in the
    # case where "all" is also the name of a package that has
    # been uploaded (possible, albeit quite unlikely)

is better than specifying the step and authenticating multiple times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packages: ["a", "b", "c"] hopefully!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a list of strings sounds elegant! :)

@agriyakhetarpal
Copy link
Author

Thanks for the reviews so far! I can officially mark this ready for review now. I've added additional context about most of my questions in my review above.

@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review October 1, 2024 12:47
@agriyakhetarpal agriyakhetarpal changed the title [WIP]: Add a separate action for removing old wheels Add a separate action for removing old wheels Oct 1, 2024
@tupui
Copy link
Member

tupui commented Oct 1, 2024

I am not sure if the Anaconda token that is authorised to maintainers of, say, a package X is restricted to only remove wheels for said package X

There is a restriction. Users are only allowed to do actions on the packages they are given access to 😃

There is just a handful of folks who have higher access right to administrate the Anaconda account. Here the token setup is very light but we use all the tools Anaconda gives us.

@agriyakhetarpal
Copy link
Author

Thanks for the clarification! Should we make a mention of this in the documentation?

@tupui
Copy link
Member

tupui commented Oct 1, 2024

Feel free to make a suggestion 👍

Copy link
Member

@matthewfeickert matthewfeickert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in transit and so giving a quick but incomplete review.

If the things I mention aren't clear please ask for clarification or more detail from me.

pixi.toml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Copy link
Member

@stefanv stefanv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall; I made two more suggestions for clarifying the documentation.

README.md Outdated Show resolved Hide resolved
uses: scientific-python/upload-nightly-action/remove-wheels@cantknowhashyet # 0.6.0
with:
n_latest_uploads: ${{ env.N_LATEST_UPLOADS }}
anaconda_nightly_upload_organization: "your-organization"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called organization, not channel; are these interchangeable terms? If so, use channel; if not, explain the difference.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure, since they look like they have been used interchangeably. For example, the README outside of this PR:

## Using a different channel
This Github Action can upload your nightly builds to a different channel. To do so,
define the `anaconda_nightly_upload_organization` variable. Furthermore,
you can add labels for organizing your artifacts using `anaconda_nightly_upload_labels`
optional parameter. See below:
```yml
jobs:
steps:
...
- name: Upload wheel
uses: scientific-python/upload-nightly-action@82396a2ed4269ba06c6b2988bb4fd568ef3c3d6b # 0.6.1
with:
artifacts_path: dist
anaconda_nightly_upload_organization: my-alternative-organization
anaconda_nightly_upload_token: ${{secrets.UPLOAD_TOKEN}}
anaconda_nightly_upload_labels: dev
```

mentions how one may upload to a different channel, but it uses "organization" as an input.

README.md Outdated Show resolved Hide resolved
Comment on lines +60 to +66
# Remember can't quote subshell as need to split on (space separated) token
for package_name in $(cat package-names.txt); do
# TODO: this outer loop can be removed when ready since there will be
# just one package to remove when the action is triggered manually from
# a user's (different) repo.

echo -e "\n# package: ${package_name}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packages: ["a", "b", "c"] hopefully!

@agriyakhetarpal
Copy link
Author

CI is going to fail here because I don't have an ANACONDA_TOKEN set up in my fork's secrets. I could set an Anaconda organisation up (instructions would be welcome) for testing (or look into failures in a new PR, though I don't recommend that).

Or, the best way I can think of is that I could try this with a project in a PR downstream with the action set to my fork – since I help co-maintain PyWavelets's nightly wheels as I mentioned earlier up somewhere :) I would be unavailable next week, please let me know if this is reasonable and I'll try it out as early as the week after.

MridulS added a commit to scipp/scipp that referenced this pull request Oct 8, 2024
The current action has a bug as it will delete the nightly wheels if there wasn't a new commit to the main branch since the last nightly release.

Let's just turn it off for now. We can use the action upstream scientific-python/upload-nightly-action#95 to remove the wheels once it's merged in.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make Remove Old Wheels workflow seperate GitHub Action
5 participants