allow specifying HTTP Basic auth parameters in config file (or netrc) for dvc import #10623

Open
vvuk opened this issue Nov 15, 2024 · 3 comments
Labels: awaiting response (we are waiting for your reply, please respond! :)), feature request (Requesting a new feature)


vvuk commented Nov 15, 2024

I'd like to place dvc references in my git repo to data hosted on e.g. huggingface. But the models are private. If I do:

dvc import https://huggingface.co/spaces/foo/bar mymodel.bin

I get prompted for a username/password. I'd like to avoid this prompt by specifying the username/password in a local config file. I cannot figure out how to do this with dvc, and I suspect it's not possible (but please tell me if it is!). What I'd like to work is an entry in .dvc/config (ideally .dvc/config.local, but any of them would do):

['url "https://huggingface.co/"']
    user = foo
    password = bar

alternatively, look for any remote that has the same URL configured as what's passed to dvc import; so:

['remote "hf"']
    url = https://huggingface.co/

and in config.local:

['remote "hf"']
    user = foo
    password = bar
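
The remote-matching idea above could be sketched roughly as follows in Python. This illustrates the proposal only, not existing DVC behavior: `match_remote` and the layout of the `remotes` dict are hypothetical names chosen for this example, and the rule used here (same scheme and host, longest path prefix wins) is an assumption about how such matching might reasonably work.

```python
from urllib.parse import urlsplit


def match_remote(import_url, remotes):
    """Return the name of the remote whose URL is the longest prefix of import_url.

    `remotes` is a hypothetical mapping like {"hf": {"url": "https://huggingface.co/"}}.
    Returns None when no configured remote covers the URL.
    """
    target = urlsplit(import_url)
    best_name, best_len = None, -1
    for name, conf in remotes.items():
        base = urlsplit(conf["url"])
        # Scheme, host and port must match exactly before paths are compared.
        if (base.scheme, base.hostname, base.port) != (
            target.scheme,
            target.hostname,
            target.port,
        ):
            continue
        base_path = base.path.rstrip("/")
        # Compare whole path segments so "/goo" does not match "/google/...".
        if target.path == base_path or target.path.startswith(base_path + "/"):
            if len(base_path) > best_len:
                best_name, best_len = name, len(base_path)
    return best_name
```

With a match in hand, the user and password for that remote could then be read from config.local, which keeps credentials out of the committed files.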

Another alternate option -- use a ~/.netrc file like requests or curl does. aiohttp doesn't have native support for netrc, but the file format is trivial.
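
For what it's worth, Python's standard library already ships a parser for this format, so a lookup could be quite small. The following is a sketch under assumptions, not DVC code: `netrc_credentials` is a hypothetical helper name, and how the result would be handed to the HTTP client is left out.

```python
import netrc
from urllib.parse import urlsplit


def netrc_credentials(url, netrc_path=None):
    """Look up (login, password) for the URL's host in a netrc file.

    With netrc_path=None the stdlib falls back to ~/.netrc.
    Returns None if the file is missing, unparsable, or has no entry for the host.
    """
    host = urlsplit(url).hostname
    if host is None:
        return None
    try:
        entries = netrc.netrc(netrc_path)
    except (FileNotFoundError, netrc.NetrcParseError):
        return None
    auth = entries.authenticators(host)
    if auth is None:
        return None
    login, _account, password = auth
    return login, password
```

A matching ~/.netrc entry would look like `machine huggingface.co login foo password bar`.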

@shcheklein (Member)

Hey @vvuk , I think there are a few options:

  1. Use global or system config. E.g. for the dvc remote modify call here
  2. You can also pass config directly to the dvc import command - https://dvc.org/doc/command-reference/import#--remote-config

Let me know if any of this works for you.

shcheklein added the awaiting response and feature request labels Nov 16, 2024

vvuk commented Nov 16, 2024

Heya, thanks for the quick reply!

> Use global or system config. E.g. for the dvc remote modify call here

Hmm, can you give me an example here? I've got a remote configured -- it doesn't seem to matter whether it's global or project or local.

> You can also pass config directly to the dvc import command - https://dvc.org/doc/command-reference/import#--remote-config

Hmm -- this one still requires command line work, but I'm actually not sure how remotes come into play at all right now.

Here's a more specific example/set of commands with a public gated model. I created a new git/dvc repo (git init ; dvc init):

dvc import https://huggingface.co/google/gemma-2b README.md

this prompts for username/password.

I can create a remote for huggingface (note: doesn't matter if I use global/system/project/local, same result):

dvc remote add --global hf https://huggingface.co/
dvc remote modify --global hf user ...
dvc remote modify --global hf password hf_abc....

but doing the same dvc import above still prompts for a username/password. Same result with dvc import --remote hf https://huggingface.co/google/gemma-2b README.md (which I'd expect, since I believe --remote just sets where the data would be pushed to with a dvc push).

I don't think the remote is being considered at all... the only way it could be is if it was matched by URL prefix, and I don't think that's happening?

The concrete use case is I've got a bunch of private models on huggingface that I'd like to reference in my repo. But I'd like each of my developers to use their own huggingface credentials to pull them down when doing dvc pull. The only thing that works (as expected) is explicitly providing the user/pass in the URL:

dvc import https://user:pass@huggingface.co/google/gemma-2b README.md

but then my username/password is stored in the .dvc file.
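
One way a tool could avoid persisting credentials like this is to split them off the URL before writing it anywhere. A rough sketch in Python using only `urllib.parse`; `split_credentials` is a hypothetical helper for illustration, not a DVC function, and this says nothing about what DVC actually writes into the .dvc file.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_userinfo, user, password) for a URL.

    user and password are None when the URL carries no userinfo part.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    user = unquote(parts.username) if parts.username else None
    password = unquote(parts.password) if parts.password else None
    return clean, user, password
```

The cleaned URL is what would be safe to commit; the user and password would have to live somewhere local to each developer.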

@shcheklein (Member)

Ah, I see. I think I misunderstood the question. I see now that it's the user/password for HF itself.

Since this is pretty much about the Git protocol, you should probably configure Git so that git clone https://huggingface.co/spaces/foo/ works. I think Git supports quite a few ways to manage credentials.
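
As an illustration of what Git's credential machinery offers, the sketch below asks Git for stored credentials through `git credential fill`, which reads `key=value` attributes on stdin and prints the filled-in set on stdout. The helper names are hypothetical, and the assumption is that a credential helper (for example one set via `git config credential.helper`) already holds an entry for the host. `GIT_TERMINAL_PROMPT=0` is set so that Git fails instead of prompting when nothing is stored.

```python
import os
import subprocess


def credential_request(protocol, host):
    """Build the stdin payload for `git credential fill` (ends with a blank line)."""
    return f"protocol={protocol}\nhost={host}\n\n"


def parse_credential_output(text):
    """Extract (username, password) from `git credential fill` output."""
    fields = dict(line.split("=", 1) for line in text.splitlines() if "=" in line)
    return fields.get("username"), fields.get("password")


def git_credentials(protocol, host):
    """Ask Git's configured credential helpers for credentials for a host.

    Raises subprocess.CalledProcessError if Git cannot supply them.
    """
    result = subprocess.run(
        ["git", "credential", "fill"],
        input=credential_request(protocol, host),
        capture_output=True,
        text=True,
        check=True,
        env={**os.environ, "GIT_TERMINAL_PROMPT": "0"},
    )
    return parse_credential_output(result.stdout)
```

Calling `git_credentials("https", "huggingface.co")` would then return each developer's own stored credentials without anything being written to the repo.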
