
[Local Embeddings] Community Support thread #370

Closed
zzk2021 opened this issue Jul 5, 2024 · 16 comments
Labels
community_support Issue handled by community members

Comments

@zzk2021

zzk2021 commented Jul 5, 2024

[image attached]

@vv5d

vv5d commented Jul 5, 2024

I don't think so at the moment.

@av

av commented Jul 5, 2024

You can change the embeddings API base URL to a local one via

GRAPHRAG_EMBEDDING_API_BASE

It must still be compatible with the OpenAI API schema.
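
For example, a quick way to check that a local endpoint actually speaks the OpenAI embeddings schema before pointing GraphRAG at it (a minimal sketch, assuming an Ollama-style server exposing an OpenAI-compatible /v1 route on localhost:11434 and a locally pulled nomic-embed-text model; adjust the URL and model name to your setup):

from openai import OpenAI

# Point the client at the local OpenAI-compatible server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused-locally")

resp = client.embeddings.create(model="nomic-embed-text", input=["hello world"])
print(len(resp.data[0].embedding))  # prints the embedding dimension if the schema matches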

@SpaceLearner

It works with Ollama embeddings by replacing the file /opt/anaconda3/envs/graphrag/lib/python3.11/site-packages/graphrag/llm/openai/openai_embeddings_llm.py with:

from typing_extensions import Unpack

from graphrag.llm.base import BaseLLM
from graphrag.llm.types import (
    EmbeddingInput,
    EmbeddingOutput,
    LLMInput,
)

from .openai_configuration import OpenAIConfiguration
from .types import OpenAIClientTypes

import ollama


class OpenAIEmbeddingsLLM(BaseLLM[EmbeddingInput, EmbeddingOutput]):
    _client: OpenAIClientTypes
    _configuration: OpenAIConfiguration

    def __init__(self, client: OpenAIClientTypes, configuration: OpenAIConfiguration):
        self.client = client
        self.configuration = configuration

    async def _execute_llm(
        self, input: EmbeddingInput, **kwargs: Unpack[LLMInput]
    ) -> EmbeddingOutput | None:
        # Kept from the original implementation; unused once Ollama is called directly.
        args = {
            "model": self.configuration.model,
            **(kwargs.get("model_parameters") or {}),
        }
        # Instead of self.client.embeddings.create(input=input, **args),
        # embed each input text with Ollama one at a time.
        embedding_list = []
        for inp in input:
            embedding = ollama.embeddings(model="nomic-embed-text", prompt=inp)
            embedding_list.append(embedding["embedding"])
        return embedding_list
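
Note: this assumes the ollama Python package is installed in the same environment and that the embedding model has already been pulled locally (e.g. ollama pull nomic-embed-text).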

@bmaltais

bmaltais commented Jul 6, 2024

Thank you for sharing. It's a pretty brutal fix, but if it works it is at least a stopgap until the Microsoft team implements a more elegant solution.

Maybe checking for the presence of an ollama = true flag in the embedding parameters could keep the default behaviour and only use the hack when the flag is set.
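
For example, a minimal sketch of that idea (untested, reusing the imports and class from the snippet above; the "ollama" key in model_parameters is a hypothetical flag, and the fallback path is the original OpenAI call):

    async def _execute_llm(
        self, input: EmbeddingInput, **kwargs: Unpack[LLMInput]
    ) -> EmbeddingOutput | None:
        args = {
            "model": self.configuration.model,
            **(kwargs.get("model_parameters") or {}),
        }
        # Hypothetical opt-in flag: only divert to Ollama when explicitly requested.
        if args.pop("ollama", False):
            return [
                ollama.embeddings(model=args["model"], prompt=inp)["embedding"]
                for inp in input
            ]
        # Default behaviour: the original OpenAI embeddings call.
        embedding = await self.client.embeddings.create(input=input, **args)
        return [d.embedding for d in embedding.data]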

@zeyunie-vecml

My solution was to write a local server with Flask that basically serves as a decoder of "cl100k_base" and a caller of Ollama, then change the api_base for embeddings to the localhost address. It works pretty well as far as I can tell.
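
Roughly along these lines (a minimal sketch of the approach described above, untested; the route, port, and model name are assumptions, and it assumes the "input" field may arrive as cl100k_base token ids rather than raw strings):

from flask import Flask, jsonify, request
import tiktoken
import ollama

app = Flask(__name__)
enc = tiktoken.get_encoding("cl100k_base")

@app.post("/v1/embeddings")
def embeddings():
    body = request.get_json()
    inputs = body["input"]
    if isinstance(inputs, str):
        inputs = [inputs]
    # The OpenAI client may send cl100k_base token ids instead of raw text;
    # decode those back to strings before embedding.
    elif inputs and isinstance(inputs[0], list):
        inputs = [enc.decode(tokens) for tokens in inputs]
    data = []
    for i, text in enumerate(inputs):
        result = ollama.embeddings(model="nomic-embed-text", prompt=text)
        data.append({"object": "embedding", "index": i, "embedding": result["embedding"]})
    return jsonify({"object": "list", "data": data, "model": body.get("model", "nomic-embed-text")})

if __name__ == "__main__":
    app.run(port=8686)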

@av

av commented Jul 9, 2024

@zeyunie-vecml, yes, that's precisely what the provided server does, in addition to the OAI <-> Ollama translation.

UPD: I mixed up GitHub threads; the above was in relation to this server.

@AlonsoGuevara
Contributor

I'm making this thread our official discussion place for local embeddings setup and troubleshooting.
Thanks for the curiosity and proactivity!

@chenli90s

chenli90s commented Jul 10, 2024

I think you can add a middleware API in front of the OAI-compatible endpoint, for example:

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: mxbai-embed-large
    api_base: http://localhost:8686/api

import fastapi
from langchain_community.embeddings.ollama import OllamaEmbeddings

app = fastapi.FastAPI()

@app.post('/api/embeddings')
def embeddings(body: dict):
    # Embed the incoming texts with the requested Ollama model and return them
    # in an OpenAI-style "data" envelope.
    ollama = OllamaEmbeddings(model=body['model'])
    res = ollama.embed_documents(body['input'])
    return {
        'data': [
            {'embedding': rs} for rs in res
        ]
    }
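
If this is saved as, say, app.py (the filename here is just an example), it can be served with uvicorn app:app --port 8686 so that the api_base above resolves to it.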

@silviachen46

silviachen46 commented Jul 10, 2024

I used the Ollama embedding with the above modification to the embedding function and was able to generate the graph, but I can't query the graph with a similar modification to another embed function.
This is what I was trying to do (modifying the function _embed_with_retry in embedding.py in the query/llm/oai folder):

for attempt in retryer:
    with attempt:
        embedding = ollama.embeddings(model="nomic-embed-text", prompt=text)
        return (embedding["embedding"], len(text))

Kind of wondering what's going wrong here :)
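
One guess (an assumption on my side, not verified): in the query path the text handed to _embed_with_retry may already be a cl100k_base token chunk rather than a plain string, in which case it would need decoding before being passed to Ollama, something like:

import tiktoken

encoder = tiktoken.get_encoding("cl100k_base")

for attempt in retryer:
    with attempt:
        # If `text` arrives as a token chunk rather than a string, decode it first.
        prompt = encoder.decode(text) if not isinstance(text, str) else text
        embedding = ollama.embeddings(model="nomic-embed-text", prompt=prompt)
        return (embedding["embedding"], len(prompt))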

@goodmaney

#451.

@automateyournetwork

Instead of ollama I am trying llama.cpp for embeddings but I get this error:

11:17:36,84 graphrag.llm.openai.create_openai_client INFO Creating OpenAI client base_url=http://localhost:8080
11:17:36,99 graphrag.index.llm.load_llm INFO create TPM/RPM limiter for nomic-embed-text-v1.5.Q5_K_M.gguf: TPM=0, RPM=0
11:17:36,99 graphrag.index.llm.load_llm INFO create concurrency limiter for nomic-embed-text-v1.5.Q5_K_M.gguf: 1
11:17:36,107 graphrag.index.verbs.text.embed.strategies.openai INFO embedding 177 inputs via 177 snippets using 12 batches. max_batch_size=16, max_tokens=8191
11:17:36,126 datashaper.workflow.workflow ERROR Error executing verb "text_embed" in create_final_entities: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.
Traceback (most recent call last):
File "/home/fragb0x/packet_graphRAG/GRAPH/lib/python3.10/site-packages/datashaper/workflow/workflow.py", line 415, in _execute_verb
result = await result
File "/home/fragb0x/packet_graphRAG/GRAPH/lib/python3.10/site-packages/graphrag/index/verbs/text/embed/text_embed.py", line 105, in text_embed
return await _text_embed_in_memory(
File "/home/fragb0x/packet_graphRAG/GRAPH/lib/python3.10/site-packages/graphrag/index/verbs/text/embed/text_embed.py", line 130, in _text_embed_in_memory
result = await strategy_exec(texts, callbacks, cache, strategy_args)
File "/home/fragb0x/packet_graphRAG/GRAPH/lib/python3.10/site-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 61, in run
embeddings = await _execute(llm, text_batches, ticker, semaphore)
File "/home/fragb0x/packet_graphRAG/GRAPH/lib/python3.10/site-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 105, in _execute
results = await asyncio.gather(*futures)
File "/home/fragb0x/packet_graphRAG/GRAPH/lib/python3.10/site-packages/graphrag/index/verbs/text/embed/strategies/openai.py", line 100, in embed
result = np.array(chunk_embeddings.output)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (16,) + inhomogeneous part.

@wanglufei1

Instead of ollama I am trying llama.cpp for embeddings but I get this error:

[quoted error log from the previous comment]

I'm having the same problem. Has this been solved, or is there a way to avoid it?

@silviachen46

@wanglufei1 check out #345. Embedding with Ollama works with the modification made by user SpaceLearner.

@wanglufei1

@silviachen46
Thank you very much. As I understand it, Ollama uses llama.cpp under the hood. I am working in a Chinese-language environment, so I switched models: I gave up on nomic-embed-text and used the qwen model instead. Now it works properly.

@karthik-codex

The local search with embeddings from Ollama now works.
You can read the full guide here:
https://medium.com/@karthik.codex/microsofts-graphrag-autogen-ollama-chainlit-fully-local-free-multi-agent-rag-superbot-61ad3759f06f
Here is the link to the repo:
https://github.com/karthik-codex/autogen_graphRAG

@natoverse
Collaborator

Consolidating Ollama-related issues: #657

@natoverse closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 22, 2024
@natoverse unpinned this issue on Jul 22, 2024