Codebase for analyzing sub-morphemic systematicity using CELEX and word2vec.
Before running, download the Google News pretrained word2vec embeddings; see http://mccormickml.com/2016/04/12/googles-pretrained-word2vec-model-in-python/
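As a quick check that the embeddings are in place, the sketch below loads them with gensim. This is an illustration only, not necessarily how main.py loads them: it assumes gensim is installed, the helper name load_google_news_vectors is hypothetical, and the default file name is the one the download is commonly distributed under, so adjust the path to wherever you saved it.

```python
import os


def load_google_news_vectors(path="GoogleNews-vectors-negative300.bin", limit=None):
    """Load pretrained word2vec vectors from a binary file (hypothetical helper).

    `path` is an assumed location; `limit` optionally loads only the first
    N vectors, which reduces memory use and load time.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"Pretrained embeddings not found at {path!r}")
    # Imported lazily so the path check above works even without gensim installed.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(path, binary=True, limit=limit)
```

Passing a limit (for example limit=200000) is one way to keep experiments fast while developing.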
To run:
python main.py
To do:
- Make code more modular
- Save the dataset with its embeddings so the model doesn't have to be reloaded on each run
- Validate against other word embedding models?
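
For the caching item in the to-do list above, one possible approach is to pickle the prepared dataset after the first run and reuse it afterwards. This is a minimal sketch, not part of the current code: the function name load_or_build, the cache path, and the build_fn callback are all hypothetical.

```python
import os
import pickle


def load_or_build(cache_path, build_fn):
    """Return cached data if `cache_path` exists; otherwise build and cache it.

    `build_fn` is a hypothetical zero-argument callable that does the
    expensive work (e.g. loading the model and looking up each word's vector).
    """
    if os.path.exists(cache_path):
        # Cache hit: skip the expensive build step entirely.
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    data = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
    return data
```

With this in place, only the first run would pay the cost of loading the full embedding model. Delete the cache file whenever the dataset or the embedding model changes.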