This project by Vincent Koc implements a Retrieval-Augmented Generation (RAG) question-answering system for documents. It uses Llama 3, Mistral, and Gemini models for local inference via llama.cpp, LangChain for orchestration, ChromaDB for vector storage, and Streamlit for the user interface.
- Ensure Python 3.9 is installed. You can use pyenv:

  pyenv install 3.9.16
  pyenv local 3.9.16
  pyenv rehash
- Create a virtual environment and install dependencies:

  make setup
- Download models: Download the Llama 3 (8B) and Mistral (7B) models in GGUF format and place them in the models/ directory. TheBloke on Hugging Face has shared the models here: The models from unsloth have also been tested and can be found here:
- Qdrant Sentence Transformer model: This is downloaded automatically on the first run. If you plan to run the RAG air-gapped, run the codebase once with internet access to download the model.
Run the app locally:

  make run

Or build and run it with Docker:

  make docker-build
  make docker-run
- Upload PDF documents using the file uploader.
- Select the model you want to use (e.g., Mistral).
- Enter your question in the text input.
- Click "Generate Answer" to get a response based on the document content.
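Under the hood, answering follows a retrieve-then-generate pattern: the question is embedded, the most similar document chunks are fetched from the vector store, and those chunks are handed to the LLM as context. A toy sketch of the retrieval step (the bag-of-words `embed` here is only a stand-in for the real sentence-transformer model, and all names are illustrative, not the project's actual code):

```python
# Toy retrieval sketch: bag-of-words "embeddings" + cosine similarity.
# The real app uses a sentence-transformer model and a vector store.
import math
from collections import Counter

def embed(text):
    """Stand-in for a sentence-transformer embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "The invoice total is 420 dollars.",
    "The meeting is scheduled for Tuesday.",
]
context = retrieve("What is the invoice total?", chunks)
print(context[0])  # the best-matching chunk becomes context for the LLM
```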
Adjust settings in config.yaml to modify model paths, chunk sizes, and other parameters.
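For instance, chunk size and overlap control how documents are split before indexing. A minimal sketch of that effect (the project likely uses a LangChain text splitter in practice; `split_text` and its parameter names here are hypothetical):

```python
# Illustrative splitter: fixed-size character chunks with overlap,
# mimicking what chunk_size / chunk_overlap settings would control.
def split_text(text, chunk_size=20, chunk_overlap=5):
    step = chunk_size - chunk_overlap  # advance less than a full chunk
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "A" * 50
chunks = split_text(doc, chunk_size=20, chunk_overlap=5)
print(len(chunks))  # smaller chunk_size -> more, finer-grained chunks
```

Larger chunks keep more context per retrieved passage; smaller chunks make retrieval more precise but may lose surrounding context.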
- Local inference with llama.cpp
- Model support for Llama 3 (3 and 3.1)
- Model support for Mistral
- Model support for Gemini
- Support for quantized models
- Document upload and processing
- Question-Answering system
- Sentence transformer model for vector storage
- Integration with Streamlit for UI
- Support for streaming responses
- Integration with additional models (coming soon)
- Support non-PDF documents (coming soon)
- Support for image documents (coming soon)
- Support for visualizing embeddings (coming soon)
- Support for chat history (coming soon)
- Enhanced user interface (coming soon)
- Support for multi-modal documents (coming soon)
- Exporting and importing RAG configurations (coming soon)
- Exposing a REST API (coming soon)
- Observability (coming soon)
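The streaming-responses feature above amounts to yielding tokens as they are produced instead of returning the finished answer in one piece. A minimal illustrative sketch (not the project's actual code):

```python
# Illustrative streaming sketch: a generator yields tokens one at a time,
# so the UI can render partial output while generation continues.
def stream_answer(tokens):
    for tok in tokens:
        yield tok  # recent Streamlit versions can render this via st.write_stream

pieces = list(stream_answer(["The ", "total ", "is ", "$420."]))
print("".join(pieces))
```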
Contributions are welcome! Please fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the GNU General Public License v3.0 (GPLv3). See the LICENSE file for details.
This means:
- You can freely use, modify, and distribute this software.
- If you distribute modified or extended versions of this software, you must release your changes under the GPL.
- You must include the original copyright notice and the full text of the GPL license.
- There's no warranty for this free software.
For more information, visit GNU GPL v3.
- Thanks to TheBloke and unsloth for sharing the quantized models.
- This project uses various open-source libraries. See requirements.txt for details.