Describe the bug
I'm trying to use Haystack's API to build a RAG pipeline with FAISSDocumentStore and EmbeddingRetriever.
The code looks like this:
import os

from haystack import Document

# create_document_store, RetrieverFactory, store_type, store_config, and args
# are defined elsewhere (not shown).

# Create the document store using the factory
document_store = create_document_store(store_type, **store_config)

# Read every regular file in the documents directory into a Document
documents = []
documents_dir = args.docs_path
for filename in os.listdir(documents_dir):
    file_path = os.path.join(documents_dir, filename)
    if os.path.isfile(file_path):
        with open(file_path, 'r', encoding='utf-8') as file:
            content = file.read()
        document = Document(content=content)
        documents.append(document)
document_store.write_documents(documents)

# Ensure the retriever is initialized before updating embeddings
retriever = RetrieverFactory.get_retriever(
    retriever_type=args.retriever_type,
    document_store=document_store,
    query_embedding_model=args.query_embedding_model,
    passage_embedding_model=args.passage_embedding_model,
)

# Update embeddings right after writing documents, but only if the
# document store actually provides an update_embeddings method
if hasattr(document_store, 'update_embeddings'):
    document_store.update_embeddings(retriever=retriever, batch_size=10)
Error message
haystack/modeling/model/language_model.py", line 222, in _pool_tokens
ignore_mask_3d[:, :, :] = ignore_mask_2d[:, :, np.newaxis]
~~~~~~~~~~~~~~^^^^^^^^^
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed
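For anyone triaging this: the markers in the traceback underline `ignore_mask_3d[:, :, :]`, so the array being assigned into appears to have only two dimensions. The following is a minimal NumPy-only sketch of that situation, not Haystack's actual code. The helper `try_assign` and all shapes are hypothetical and chosen purely for illustration.

```python
import numpy as np


def try_assign(target_shape, mask_shape):
    """Repeat the failing assignment with arrays of the given (hypothetical) shapes."""
    ignore_mask_2d = np.zeros(mask_shape, dtype=bool)
    ignore_mask_3d = np.zeros(target_shape, dtype=bool)
    try:
        # Same statement as in the traceback; np.newaxis appends a length-1 axis
        ignore_mask_3d[:, :, :] = ignore_mask_2d[:, :, np.newaxis]
    except IndexError as exc:
        return str(exc)
    return "ok"


# Target is (batch, tokens, hidden): the 2D mask broadcasts across the hidden axis
three_d = try_assign((2, 5, 8), (2, 5))

# Target is only (batch, hidden): three slices on a 2D array raise IndexError
two_d = try_assign((2, 8), (2, 5))

print(three_d)
print(two_d)
```

If this matches what happens inside `_pool_tokens`, the model output being pooled may already be two-dimensional (for example, one vector per passage rather than one per token), which could point at the embedding model or retriever configuration. That is a guess from the traceback alone.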
System:
OS: Ubuntu 18.04