Download preprocessed datasets

With data as your current working directory, download and extract the two archives below. This requires gdown; if it is not installed, run pip install gdown.

For QM9:

gdown '1jmc2JBoXJxat_Aq74E3ffCIQGKH9JuG-'
tar -xf qm9_processed.tar.gz

For GEOM:

gdown '1UXDaJak686jtEyyfJrTxkiOkYT1SsKyK'
tar -xf geom_processed.tar.gz
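
If you prefer to do the extraction step from Python, the following is a minimal sketch using only the standard library. It assumes the two archives have already been downloaded with gdown into the current directory; the helper name extract_archive is hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Extract a tar archive into dest and return the names of its members."""
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()
        tar.extractall(dest, **kwargs)
    return names


if __name__ == "__main__":
    # Archive names are the ones used by the tar commands above.
    for name in ("qm9_processed.tar.gz", "geom_processed.tar.gz"):
        archive = Path(name)
        if archive.exists():
            members = extract_archive(archive, Path("."))
            print(f"{name}: extracted {len(members)} entries")
        else:
            print(f"{name}: not found, download it with gdown first")
```

This does the same job as tar -xf and additionally reports how many entries each archive contained, which is a quick sanity check that the download was not truncated.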

Setup instructions from scratch (not necessary)

conda env

Download and extract qm9_crude.msgpack from https://dataverse.harvard.edu/file.xhtml?fileId=4327190&version=4.0
Place in data/qm9/raw
Run python data/qm9/preprocess.py
Download and extract drugs_crude.msgpack from https://dataverse.harvard.edu/file.xhtml?fileId=4360331&version=4.0
Place in data/geom/raw
Run python data/geom/preprocess.py

Alternatively, if the raw files are already stored elsewhere, symlink them into place instead of copying them:

mkdir -p data/qm9/raw
mkdir -p data/geom/raw
ln -s /path/to/qm9_crude.msgpack data/qm9/raw/qm9_crude.msgpack
ln -s /path/to/drugs_crude.msgpack data/geom/raw/drugs_crude.msgpack
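
The same linking step can be sketched in Python, which makes it easy to re-run without errors when the directory or link already exists. This is only an illustration: the helper name link_raw_file is hypothetical, and the source path is a placeholder exactly as in the commands above.

```python
from pathlib import Path


def link_raw_file(source: Path, raw_dir: Path) -> Path:
    """Symlink source into raw_dir under the same file name and return the link."""
    if not source.is_file():
        raise FileNotFoundError(f"raw file not found: {source}")
    raw_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    link = raw_dir / source.name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier run
    elif link.exists():
        raise FileExistsError(f"refusing to overwrite a real file: {link}")
    link.symlink_to(source.resolve())
    return link


# Example (placeholder path, adjust to where the file actually lives):
# link_raw_file(Path("/path/to/qm9_crude.msgpack"), Path("data/qm9/raw"))
```

The function refuses to replace a regular file, so an existing copy of the raw data is never deleted by accident.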

The preprocessing step caches the entire dataset in a .npy file, which is much faster to load than the .msgpack files.
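
The caching idea can be illustrated with a small build-once, load-afterwards helper. This is a sketch of the general pattern, not the code in preprocess.py: the names load_cached and build are hypothetical, and the real scripts may store the data differently.

```python
from pathlib import Path
from typing import Callable

import numpy as np


def load_cached(cache_path: Path, build: Callable[[], np.ndarray]) -> np.ndarray:
    """Return the array stored at cache_path, building and saving it on first use.

    cache_path should end in .npy, because np.save appends that suffix otherwise.
    """
    if cache_path.exists():
        return np.load(cache_path)  # fast path: read the binary cache
    array = build()  # slow path: e.g. parse the raw file once
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    np.save(cache_path, array)
    return array
```

The first call pays the full parsing cost through build; later calls only read the .npy file and never invoke build again.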
