[DOCS] Add BYO vectors ingestion tutorial #115112
base: main
Nice work!
<titleabbrev>Bring your own vector embeddings</titleabbrev>
++++

This tutorial demonstrates how to index documents that already have dense vector embeddings into {es}.
Is it worth adding an example for `sparse_vector` embeddings here as well?
"properties": {
"review_vector": {
"type": "dense_vector",
"dims": 8, <1>
We could technically omit some of these, as `dims` can be dynamically calculated.
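To make the trade-off concrete, here is a minimal Python sketch of the mapping fragment from the diff, built with and without an explicit `dims`. The helper name `review_mapping` is hypothetical, and the sketch assumes that when `dims` is omitted {es} infers it from the first vector indexed into the field.

```python
import json


def review_mapping(dims=None):
    """Build the "properties" fragment for the review_vector field.

    dims is optional: when it is left out, this sketch assumes the
    server infers it from the first vector indexed into the field.
    """
    field = {"type": "dense_vector"}
    if dims is not None:
        field["dims"] = dims
    return {"properties": {"review_vector": field}}


# Explicit dims, as in the tutorial under review.
print(json.dumps(review_mapping(dims=8)))
# dims omitted, as suggested in the comment above.
print(json.dumps(review_mapping()))
```

Keeping `dims` explicit in a tutorial may still be worthwhile, since it documents the expected vector length for readers.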
PUT /amazon-reviews/_doc/1
{
"review_text": "This product is lifechanging! I'm telling all my friends about it.",
"review_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
Maybe add a note here emphasizing that the size of the `review_vector` array is 8, matching the `dims` count?
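The length constraint could also be illustrated with a client-side check before indexing. The following is a rough Python sketch only: `build_review_doc` and `DIMS` are hypothetical names, and the function just builds the document body shown in the diff rather than calling {es}.

```python
import json

DIMS = 8  # assumed to mirror "dims": 8 in the mapping from the diff


def build_review_doc(text, vector, dims=DIMS):
    """Return the document body, or raise if the vector length differs from dims."""
    if len(vector) != dims:
        raise ValueError(
            f"review_vector has {len(vector)} values, expected {dims}"
        )
    return {"review_text": text, "review_vector": list(vector)}


doc = build_review_doc(
    "This product is lifechanging! I'm telling all my friends about it.",
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
)
print(json.dumps(doc))
```

Failing early on the client gives a clearer error message than waiting for the indexing request to be rejected.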
[[bring-your-own-vectors-search-documents]]
=== Step 3: Search documents with embeddings

Now you can query these document vectors using a <<knn-retriever,`knn` retriever>>.
Nice to see retriever examples! 🎉
}
}
----
// TEST[skip:flakeyknnerror]
Why is this test flakey?
}
----
// TEST[skip:flakeyknnerror]
<1> In this toy example, we're sending a raw vector as the query text. In a real-world scenario, you'll need to generate vectors for queries using an embedding model.
- <1> In this toy example, we're sending a raw vector as the query text. In a real-world scenario, you'll need to generate vectors for queries using an embedding model.
+ <1> In this simple example, we're sending a raw vector as the query text. In a real-world scenario, you'll need to generate vectors for queries using an embedding model.

This was a simple example to help you understand the syntax for indexing a set of existing embeddings into {es}.

In this toy example, we're sending a raw vector for the query text.
- In this toy example, we're sending a raw vector for the query text.
+ In this simple example, we're sending a raw vector for the query text.
In a real-world scenario you won't know the query text ahead of time.
You'll need to generate vectors for queries, on the fly, using an embedding model.

For this you'll need to deploy a text embedding model in {es} and use the <<knn-query-top-level-parameters,`query_vector_builder` parameter>>.
It's also legitimate to do this client side and just send the vectors in with the request.
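A minimal Python sketch of the client-side route might look like the following. Everything here is illustrative: `embed` is a hypothetical stand-in that does not produce real embeddings, `knn_search_body` is a made-up helper, and the `k` and `num_candidates` values are assumptions rather than values taken from the tutorial. The resulting body is what would be sent to `GET /amazon-reviews/_search`.

```python
import json


def embed(text, dims=8):
    """Hypothetical stand-in for an embedding model.

    NOT a real embedding: it folds character codes into a fixed-length
    vector so the example runs without any model or network access.
    """
    values = [0.0] * dims
    for i, ch in enumerate(text):
        values[i % dims] += ord(ch)
    total = sum(values) or 1.0
    return [v / total for v in values]


def knn_search_body(query_text, k=2, num_candidates=5):
    """Build a knn retriever request body with a client-generated query vector."""
    return {
        "retriever": {
            "knn": {
                "field": "review_vector",
                "query_vector": embed(query_text),
                "k": k,  # assumed value for illustration
                "num_candidates": num_candidates,  # assumed value
            }
        }
    }


body = knn_search_body("best product ever")
print(json.dumps(body, indent=2))
```

With this approach the model used at query time has to be the same one that produced the indexed vectors, which is the main thing a note in the tutorial would need to call out.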