Takes Reddit comments and Reddit submissions from files.pushshift.io and creates a nested collection out of it.

gabefair/reddit_mongodb_reconstructor

reddit_mongodb_reconstructor

Takes Reddit comments and submissions from files.pushshift.io and builds a single nested collection in which each submission contains its comments.

Assumes the following

  1. You have downloaded the RC_ (comments) and RS_ (submissions) Reddit archives from http://files.pushshift.io
  2. You have imported the JSON into a MongoDB database, with one collection for the submissions and one collection for the comments.
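
If the import step still needs to be done, the following is a minimal sketch rather than part of this repository. It assumes the archive has already been decompressed to newline-delimited JSON (one object per line). The helper name `iter_batches` and the batch size are hypothetical.

```python
import json
from itertools import islice
from typing import Iterable, Iterator, List


def iter_batches(lines: Iterable[str], batch_size: int = 1000) -> Iterator[List[dict]]:
    """Parse newline-delimited JSON and yield lists of at most batch_size documents.

    Hypothetical helper: blank lines are skipped, every other line must be valid JSON.
    """
    docs = (json.loads(line) for line in lines if line.strip())
    while True:
        batch = list(islice(docs, batch_size))
        if not batch:
            return
        yield batch
```

Each batch can then be passed to pymongo's `insert_many`. The database name, collection name, and file name below are assumptions for illustration only:

```python
# from pymongo import MongoClient
# comments = MongoClient()["reddit"]["comments"]   # assumed names
# with open("RC_2015-01", encoding="utf-8") as fh:  # assumed, already decompressed
#     for batch in iter_batches(fh):
#         comments.insert_many(batch)
```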

This IPython code will reconcile the two, storing the comments as nested children of their parent submission.
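
The idea can be sketched in plain Python, independent of MongoDB. This is not the notebook's actual code. It assumes the usual Reddit fields: each comment has an `id` and a `parent_id` that is either `t3_<submission id>` (top-level comment) or `t1_<comment id>` (reply). The function name `nest_comments` and the output keys `comments` and `children` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def nest_comments(submissions: Iterable[dict], comments: Iterable[dict]) -> List[dict]:
    """Return copies of the submissions with their comment trees attached.

    Assumed output layout: top-level comments go under "comments",
    replies go under each comment's "children".
    """
    # Index every comment by the fullname of its parent (t3_... or t1_...).
    by_parent: Dict[str, List[dict]] = defaultdict(list)
    for comment in comments:
        by_parent[comment["parent_id"]].append(comment)

    def attach(parent_fullname: str) -> List[dict]:
        # Recursively copy each child and attach its own replies.
        nested = []
        for comment in by_parent.get(parent_fullname, []):
            node = dict(comment)
            node["children"] = attach("t1_" + comment["id"])
            nested.append(node)
        return nested

    result = []
    for submission in submissions:
        doc = dict(submission)  # leave the input untouched
        doc["comments"] = attach("t3_" + submission["id"])
        result.append(doc)
    return result
```

The sketch holds all comments in memory and recurses once per reply level, so for a full monthly archive it would need to run per submission or in batches, and very deep threads could hit Python's recursion limit.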

[Figure: collection_relationship]
