Properly describe hashed bin delegation in the specification #210
Comments
Maybe an inspirational resource for the repository side:
Feature request:
I think the optimal "load factor" is what we are looking for here.
This is nonsense: the load factor relates to the cost of retrieving a value from a hash table, which is a completely different optimisation problem from the one we are interested in.
I spent some more thought on how to find the optimal number of bins for a given number of target files, and came up with the following simple problem definition:
For snapshot and bin metadata sizes we only need to consider those parts that change for different numbers of targets and bins:
Someone with mad math skills can maybe solve this using a fancy (discrete?) optimisation technique. Until then we can use this script or this Google sheet.
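For readers without access to the script or the sheet, here is a rough brute-force sketch of the kind of search described above. It is not the linked script. The cost model (one snapshot entry per bin plus one target entry per file in a single bin) and the byte sizes `snapshot_entry_bytes` and `target_entry_bytes` are assumptions made up for illustration, as are the function names.

```python
def estimated_download_size(targets, bins,
                            snapshot_entry_bytes=100,
                            target_entry_bytes=150):
    """Estimate the variable part of what a client downloads for one target.

    Assumed model (hypothetical, not from the thread): snapshot metadata
    grows by one entry per bin, and a client fetches exactly one bin, which
    holds roughly targets / bins entries.
    """
    snapshot_part = bins * snapshot_entry_bytes
    bin_part = (targets / bins) * target_entry_bytes
    return snapshot_part + bin_part


def best_number_of_bins(targets, max_exponent=20, **sizes):
    """Try every power-of-two bin count up to 2**max_exponent, keep the cheapest."""
    candidates = [2 ** exponent for exponent in range(max_exponent + 1)]
    return min(candidates,
               key=lambda bins: estimated_download_size(targets, bins, **sizes))


if __name__ == "__main__":
    for targets in (1_000, 100_000, 1_000_000):
        print(targets, best_number_of_bins(targets))
```

Only powers of two are tried because that keeps the hash prefix ranges evenly sized. With real metadata the per-entry sizes would have to be measured rather than guessed.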
Good Q, Lukas. I think a simple objective to optimise is simply
Hashed bin delegation is not well documented in the specification. One of the better/more frequently referenced descriptions is in PEP 458.
We might add this to the ~new repository operations section of the specification.
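To make the idea concrete, here is a minimal sketch of how a target path could be mapped to a bin, loosely following the description in PEP 458: hash the path with SHA-256 and pick the bin whose hex-prefix range contains the start of the digest. The function name `bin_for_target` and the `low-high` naming of bins are assumptions for illustration, not wording from the specification.

```python
import hashlib


def bin_for_target(target_path: str, number_of_bins: int) -> str:
    """Return the (hypothetical) name of the bin responsible for target_path.

    Assumes number_of_bins is a power of two, so that the hex prefix space
    divides evenly across bins.
    """
    if number_of_bins < 1 or number_of_bins & (number_of_bins - 1):
        raise ValueError("number_of_bins must be a power of two")

    # Number of hex digits needed to tell all bins apart.
    prefix_len = len(f"{number_of_bins - 1:x}")
    total_prefixes = 16 ** prefix_len
    prefixes_per_bin = total_prefixes // number_of_bins

    digest = hashlib.sha256(target_path.encode("utf-8")).hexdigest()
    prefix = int(digest[:prefix_len], 16)

    # Round down to the first prefix covered by this bin.
    low = prefix - prefix % prefixes_per_bin
    high = low + prefixes_per_bin - 1
    if low == high:
        return f"{low:0{prefix_len}x}"
    return f"{low:0{prefix_len}x}-{high:0{prefix_len}x}"


if __name__ == "__main__":
    print(bin_for_target("foo/bar-1.0.tar.gz", 32))
```

A client would then only need to download the one bin metadata file that covers the prefix, instead of a single targets file listing every target.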
Originally posted by @joshuagl in theupdateframework/taps#148 (review)