
Add Dropbox Syncer #13

Open

marclennox opened this issue Sep 30, 2014 · 9 comments

@marclennox
It would be great to have Dropbox (and Google Drive) cloud syncers in addition to S3 and CloudFiles. I think the Dropbox API would support this.

@tombruijn tombruijn added Storage and removed Storage labels Sep 30, 2014
@tombruijn
Member

It's certainly an option to add this. However, the core team is not currently expanding the features of v4; we're working on v5 instead. So we're not adding this in v4, but as v5 rolls around we'll take a look at this repo and see what can be done. In the meantime, other volunteers are free to pick this up.

@tombruijn
Member

@marclennox it might also be an idea to split this issue into two issues: one for Dropbox and one for Google Drive.

@marclennox marclennox changed the title Dropbox and Google Drive syncers Add Dropbox Syncer Oct 2, 2014
@marclennox
Author

Done

@marclennox
Author

I've started working on this along with abstracting a lot of the content in Storage::Dropbox into a shared CloudIO::Dropbox class.

For some reason the chunked uploader just hangs indefinitely trying to upload a file from my dev environment to Dropbox, but I'll keep playing with it.

@marclennox
Author

Hey @tombruijn, I have this mostly working, but it's intrusive enough that I'd like to get some comments on my approach.

At a high level, I took a lot of the code out of Storage::Dropbox and put it in a new CloudIO::Dropbox object (similar to others). This new object is used by Storage::Dropbox and by a new Syncer::Cloud::Dropbox.

One of the challenges with doing sync with Dropbox is that they do not return a hash of the file contents, instead returning their own revision ID for each unique revision of a file. So in order to determine if a file needs to be synced, I compare the remote file's revision ID with a locally cached revision ID, which I store when I first upload the file (Dropbox returns it to you on upload).

I created a new cache file for this purpose (in the same cache directory as the session cache), which maps file paths to their metadata in YAML. I also augment the Dropbox metadata with the local file's MD5 hash, which I use to check for local file changes.
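To make the idea concrete, the cache described above might look roughly like this. This is a sketch, not the actual implementation: the class name, file layout, and the `rev`/`local_md5` keys are my own illustration of "Dropbox metadata augmented with the local MD5".

```ruby
require "yaml"
require "digest"

# Hypothetical sketch of the revision cache described above: a YAML
# file mapping remote file paths to the metadata Dropbox returned on
# upload, augmented with the MD5 of the local copy that was sent.
class DropboxSyncCache
  def initialize(path)
    @path = path
    @entries = File.exist?(path) ? YAML.load_file(path) : {}
  end

  # Record the metadata Dropbox returned for an uploaded file,
  # plus the MD5 of the local file we just uploaded.
  def store(remote_path, dropbox_metadata, local_file)
    @entries[remote_path] = dropbox_metadata.merge(
      "local_md5" => Digest::MD5.file(local_file).hexdigest
    )
  end

  def [](remote_path)
    @entries[remote_path]
  end

  def save
    File.write(@path, @entries.to_yaml)
  end
end
```

Keeping the cache next to the session cache means it gets the same lifecycle as the rest of the syncer's state.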

For now I've overloaded the sync_file method in the Dropbox CloudIO because, as I mentioned, I am checking for a local file MD5 hash change as well as a remote file revision difference to determine whether to upload. I'm sure this can be abstracted in a more generic fashion later.
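The upload decision described above could be sketched as follows. The method and metadata key names here are assumptions for illustration, not Backup's actual API:

```ruby
require "digest"

# Sketch of the sync decision described above: a file is re-uploaded
# when it has never been uploaded, when the local contents changed
# (MD5 mismatch against the cached value), or when the remote revision
# no longer matches the one cached at upload time. All names here are
# illustrative, not the real sync_file implementation.
def needs_upload?(local_file, cached_entry, remote_rev)
  return true if cached_entry.nil?                       # never uploaded
  local_md5 = Digest::MD5.file(local_file).hexdigest
  return true if local_md5 != cached_entry["local_md5"]  # local change
  remote_rev != cached_entry["rev"]                      # remote change
end
```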

Lastly, I noticed when mirroring, local file deletes work correctly, however if a local directory is deleted, only the file contents are deleted remotely and the directory remains. I'm not sure whether this is a problem in my Dropbox implementation or if it's a problem with the Cloud Sync base class. Does this work properly with S3 and CloudFiles?

Thanks.

@marclennox
Author

@tombruijn once I've got your stamp of approval on my approach, I think I will break this up into two PRs. The first would abstract the Dropbox functionality into CloudIO, ensuring that all existing tests still pass. The second would add the sync functionality.

@tombruijn
Member

@marclennox I haven't been maintaining Backup for that long, so I'm still in the dark about a lot of classes. I don't know exactly how the CloudIO S3 and CloudFiles classes work.

I would suggest you try and make the best possible solution that fits in the current codebase without changing too much of the existing setup (limiting what could break). And about splitting the PR into two PRs, I'm in favor of that. It will allow me to review them more easily.

@marclennox
Author

OK, I will break it into two. I think that breaking the Dropbox core functionality out into a CloudIO class makes very good sense, given my understanding of the code base.



@marclennox
Author

Probably won’t get this done for another week. Dropbox has some idiosyncrasies that make this a little more challenging, but I’m confident it’s going to work well.


