Investigate HDT for bundle RDF storage #33

Open
mwatts15 opened this issue Sep 19, 2021 · 0 comments

Comments

@mwatts15
Contributor

owmeta bundle archives store graph data in N-Triples files. This format works well for line-oriented diffs, but it is highly redundant: every triple spells out its full IRIs, which inflates archives and makes archive retrieval and storage more expensive. As an alternative, we could use HDT, which provides one way to reduce the redundancy. Although it would be possible to do something similar without HDT, HDT has probably already worked out many of the questions about what does and doesn't work. Any deficiency that requires making our own format (e.g., removing or altering features that are only useful for querying) should become clear from this investigation. The query functionality should also be investigated as an alternative to pow_store_zodb for the bundle indexed store (see HDT-FoQ and https://rdflib.dev/rdflib-hdt/hdtstore.html).
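
To make "something similar without HDT" concrete, here is a minimal sketch of the basic idea HDT builds on: store each distinct term once in a dictionary and store triples as integer IDs. This is not HDT's actual on-disk format, and it only handles simple one-triple-per-line N-Triples input. The names `split_triple`, `dictionary_encode`, and `decode` are hypothetical and not part of owmeta or any HDT library.

```python
def split_triple(line):
    """Split one simple N-Triples line into (subject, predicate, object)."""
    body = line.strip()
    # Drop the statement-terminating "." (a "." inside a literal is kept).
    if body.endswith("."):
        body = body[:-1].rstrip()
    # Subject and predicate never contain whitespace; the object may
    # (inside a literal), so split at most twice.
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


def dictionary_encode(ntriples_text):
    """Return (terms, triples): each distinct term is stored once,
    and each triple becomes a tuple of integer IDs into `terms`."""
    terms = []   # ID -> term
    ids = {}     # term -> ID
    triples = []
    for line in ntriples_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        encoded = []
        for term in split_triple(line):
            if term not in ids:
                ids[term] = len(terms)
                terms.append(term)
            encoded.append(ids[term])
        triples.append(tuple(encoded))
    return terms, triples


def decode(terms, triples):
    """Rebuild N-Triples text from the dictionary and ID triples."""
    return "".join(
        f"{terms[s]} {terms[p]} {terms[o]} .\n" for s, p, o in triples
    )


SAMPLE = (
    '<http://example.org/a> <http://example.org/knows> <http://example.org/b> .\n'
    '<http://example.org/a> <http://example.org/name> "A. Person" .\n'
    '<http://example.org/b> <http://example.org/knows> <http://example.org/a> .\n'
)

terms, triples = dictionary_encode(SAMPLE)
```

The trade-off is the one noted above: the encoded form is no longer line-diffable, so part of the investigation is whether the size savings are worth losing that property.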
