Discussion on AutoMergingRetriever and HierarchicalDocumentSplitter #78
TuanaCelik started this conversation in General
OK y'all, I have been looking at the draft cookbook we are about to share for this component, so I decided: why don't I add the first comment already? IMO, for the HierarchicalDocumentSplitter to be usable in pipelines, we should also have an output called |
This is the discussion board for the experimental AutoMergingRetriever and HierarchicalDocumentSplitter components.
These components split documents into 'child' documents that keep a reference to their 'parent' document and then, based on a threshold setting, return the parent document instead when a certain number of its 'child' documents are retrieved.
The rationale: if a paragraph is split into multiple chunks represented as leaf documents, and multiple chunks match a given query, the whole paragraph might be more informative than the individual chunks alone.
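To make the idea concrete, here is a minimal pure-Python sketch of the split-then-merge logic. It is not the Haystack implementation or its API: the `Node` class, `split_hierarchically`, `auto_merge`, the word-based chunking, and the "fraction of matched children strictly above the threshold" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a document in the hierarchy (not Haystack's Document)."""
    id: str
    text: str
    parent_id: Optional[str] = None
    children_ids: List[str] = field(default_factory=list)


def split_hierarchically(doc_id: str, text: str, chunk_words: int) -> Dict[str, Node]:
    """Split one parent text into word-based leaf chunks that point back to the parent."""
    words = text.split()
    parent = Node(id=doc_id, text=text)
    store = {doc_id: parent}
    for n, start in enumerate(range(0, len(words), chunk_words)):
        leaf = Node(
            id=f"{doc_id}-{n}",
            text=" ".join(words[start:start + chunk_words]),
            parent_id=doc_id,
        )
        parent.children_ids.append(leaf.id)
        store[leaf.id] = leaf
    return store


def auto_merge(matched: List[Node], store: Dict[str, Node], threshold: float) -> List[Node]:
    """Swap matched leaves for their parent when enough of its children matched.

    Assumption: "enough" means the matched fraction of the parent's children
    is strictly greater than `threshold`.
    """
    result: List[Node] = []
    by_parent: Dict[str, List[Node]] = {}
    for leaf in matched:
        if leaf.parent_id is None:
            result.append(leaf)  # no parent to merge into, keep as-is
        else:
            by_parent.setdefault(leaf.parent_id, []).append(leaf)
    for parent_id, leaves in by_parent.items():
        parent = store[parent_id]
        if len(leaves) / len(parent.children_ids) > threshold:
            result.append(parent)  # the whole paragraph replaces its chunks
        else:
            result.extend(leaves)  # too few matches, keep the individual chunks
    return result


store = split_hierarchically("p", "one two three four five six seven eight", chunk_words=2)
merged = auto_merge([store["p-0"], store["p-1"], store["p-2"]], store, threshold=0.5)
kept = auto_merge([store["p-0"]], store, threshold=0.5)
```

In this toy run the parent has four leaves. Matching three of them (0.75 of the children) returns the parent `p`, while matching only one (0.25) returns just that leaf.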
📚 Full Documentation of the AutoMergingRetriever
📚 Full Documentation of the HierarchicalDocumentSplitter
🧑‍🍳 Try the Cookbook here
PS: The first version of this experiment was implemented by @davidsbatista 🚀