Is it possible to use Ollama or any other local LLM for indexing instead of OpenAI? #432
Comments
There are a few threads discussing this: https://github.com/microsoft/graphrag/issues?q=ollama
Yes, I think you can use any OpenAI-compliant API with GraphRAG! I use:

# Chat completion server on port 8080
llama.cpp/llama-server --host 0.0.0.0 --port 8080 \
  --threads 8 \
  --parallel 1 \
  --gpu-layers 999 \
  --ctx-size 0 \
  --n-predict -1 \
  --defrag-thold 1 \
  --model ./models/qwen2-7b-instruct-fp16.gguf

# Embedding server on port 8081
llama.cpp/llama-server --host 0.0.0.0 --port 8081 \
  --threads 8 \
  --parallel 1 \
  --gpu-layers 999 \
  --ctx-size 0 \
  --n-predict -1 \
  --defrag-thold 1 \
  --embeddings \
  --pooling mean \
  --batch-size 8192 \
  --ubatch-size 4096 \
  --model ./models/qwen2-7b-instruct-fp16.gguf
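As a quick sanity check before pointing GraphRAG at these servers, a minimal sketch assuming llama-server's OpenAI-compatible /v1 routes (llama-server serves whichever model it was started with, so the model name in the request body is nominal):

# Verify the chat server on port 8080
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2-7b-instruct", "messages": [{"role": "user", "content": "Say hello"}]}'

# Verify the embedding server on port 8081
curl http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen2-7b-instruct", "input": "hello world"}'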
And then in settings.yaml:

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: qwen2-7b-instruct
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 512
  # request_timeout: 180.0
  api_base: http://localhost:8080
  api_version: v1

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: qwen2-7b-instruct
    api_base: http://localhost:8081
    api_version: v1
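With both servers up and this settings.yaml in place, indexing runs through the standard GraphRAG CLI. A minimal sketch, assuming a ./ragtest project root (substitute your own):

# Build the index against the two local llama.cpp servers
python -m graphrag.index --root ./ragtest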
I got global search working, but not local search.
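For context on the two modes (a hedged sketch, assuming the GraphRAG 0.x CLI and the ./ragtest root from above): global search only needs the chat model, while local search also exercises the embeddings endpoint, so a local-search failure usually points at the embedding server or the embeddings section of settings.yaml.

# Global search: driven by community summaries, chat model only
python -m graphrag.query --root ./ragtest --method global "What are the main themes?"

# Local search: entity-centric, requires working embeddings
python -m graphrag.query --root ./ragtest --method local "What does the main entity do?"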
Actually, my end application is long-context, multi-file coding development using this repo. We want to index every edit a developer makes, so that when the next AI editing inference runs, it takes input from GraphRAG as context and then makes the code edits. If anyone has done something similar, please let me know.
That's a good plan. I ran the offline model with this solution and it solved my problem.
Yes, you can. To run llama3.1, for example, try the following command:

ollama run llama3.1

And then change your LLM configuration in the settings.yaml file to the following:

llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: llama3.1
  model_supports_json: true # recommended if this is available for your model.
  api_base: http://127.0.0.1:11434/v1
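One caveat worth adding here (my note, not the commenter's): that snippet only covers the chat model, while indexing also needs an embeddings model. A minimal sketch, assuming your Ollama version exposes the OpenAI-compatible /v1/embeddings route and that you have pulled an embedding model such as nomic-embed-text (ollama pull nomic-embed-text):

embeddings:
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic-embed-text
    api_base: http://127.0.0.1:11434/v1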