It's a Generative Pretrained Transformer (GPT) that takes user input as context and generates the words that follow it, using the attention mechanism from the paper "Attention Is All You Need".
It is a basic GPT-1 implementation.
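For intuition, here is a minimal sketch of the causal self-attention operation at the heart of a GPT block, in plain PyTorch. The function name and dimensions are illustrative only and are not taken from this repo; a real model would also use learned projections and multiple heads.

```python
import math
import torch
import torch.nn.functional as F

def causal_self_attention(x: torch.Tensor) -> torch.Tensor:
    """Single-head causal self-attention over a (batch, time, channels) tensor."""
    _, T, C = x.shape
    # A trained model derives q, k, v from learned linear projections;
    # reusing x directly keeps this sketch self-contained.
    q, k, v = x, x, x
    att = (q @ k.transpose(-2, -1)) / math.sqrt(C)          # (B, T, T) similarity scores
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool))   # lower-triangular causal mask
    att = att.masked_fill(~mask, float("-inf"))             # hide future positions
    att = F.softmax(att, dim=-1)                            # each row becomes a probability distribution
    return att @ v                                          # weighted average of value vectors

out = causal_self_attention(torch.randn(1, 8, 16))
print(out.shape)  # torch.Size([1, 8, 16])
```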
- Create a virtual environment.
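For example, assuming Python 3 on a Unix-like shell:

python3 -m venv .venv
source .venv/bin/activate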
pip install -r requirements.txt
- Deploy the project
Make sure Docker is set up and running first. This will pull a Docker image of 4~5 GB for pytorch-cuda.
docker compose build && docker compose up
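Once the containers are up, these standard Docker Compose commands (not specific to this repo) can confirm the service started and stream its logs:

docker compose ps
docker compose logs -f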
- Set up the front-end
cd frontend/
yarn
yarn dev
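Here, yarn installs the front-end dependencies from package.json and yarn dev starts a local development server; open the URL the dev server prints on startup (the exact port depends on the tooling this repo uses).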
Model training is explained in detail by Andrej Karpathy in this video.