Research: Investigate Feasibility of Using LangSmith to Automate Test & Evaluation of RAG #257

Open
davidgxue opened this issue Jan 11, 2024 · 0 comments
Research Issue: Investigate Feasibility of Using LangSmith for Test & Evaluation of RAG

Context

This is an action item as a result of this research issue: #195.

We are exploring whether the test & evaluation process for the RAG application can be automated with LangSmith. The objective is to compare the responses RAG generates before and after a change, assess how well each aligns with a reference answer, and determine whether outcomes improve overall.

Proposed Approach

Use LangSmith's Dataset and Test feature (LangSmith Documentation) to run head-to-head comparisons of responses. This means checking whether the RAG application's outputs align with a predefined reference answer and whether a change improves that alignment.
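
To make the head-to-head idea concrete, here is a minimal local sketch of the comparison. It does not call LangSmith at all: the lexical similarity from Python's `difflib` is only a stand-in for whatever evaluator LangSmith would provide, and the function names, dictionary keys, and sample data are hypothetical and made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(answer: str, reference: str) -> float:
    """Return a rough 0..1 lexical similarity between an answer and the reference.

    Assumption: this is a placeholder metric, not a LangSmith evaluator.
    """
    return SequenceMatcher(None, answer.lower(), reference.lower()).ratio()


def compare(example: dict) -> dict:
    """Score the pre- and post-change answers for one example against its reference."""
    before = similarity(example["before"], example["reference"])
    after = similarity(example["after"], example["reference"])
    return {
        "question": example["question"],
        "before": before,
        "after": after,
        "improved": after > before,
    }


# Hypothetical example data, invented for illustration only.
examples = [
    {
        "question": "What does RAG stand for?",
        "reference": "Retrieval-augmented generation",
        "before": "I am not sure.",
        "after": "Retrieval-augmented generation",
    },
]

results = [compare(e) for e in examples]
for r in results:
    print(
        f"{r['question']}: before={r['before']:.2f} "
        f"after={r['after']:.2f} improved={r['improved']}"
    )
```

A lexical metric like this will miss answers that are correct but worded differently, which is one reason to evaluate whether LangSmith's own evaluators do better.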

Implementation Details

No immediate code changes are required. The investigation will likely involve running local scripts to assess the feasibility of integrating LangSmith into the Test & Evaluation workflow. Once the investigation is complete, the script may be added to the repository if it is safe to publish publicly.

Action Items

  1. Investigate the feasibility of integrating LangSmith with the current test & evaluation process for RAG.
  2. Assess the effectiveness of LangSmith in comparing responses and identifying improvements.
  3. Identify costs associated with using LangSmith's test and evaluation features.
  4. Document findings and considerations regarding the integration of LangSmith.
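
For the effectiveness and documentation items, one possible way to summarize a run is to count how often the post-change answer scores higher, lower, or the same as the pre-change answer. The sketch below assumes each example has already been given a pair of numeric scores (before, after) by some evaluator; the `summarize` function and its output keys are hypothetical, and it says nothing about LangSmith costs.

```python
from statistics import mean


def summarize(scores: list[tuple[float, float]]) -> dict:
    """Summarize (before, after) score pairs as wins, losses, ties, and mean change."""
    wins = sum(1 for before, after in scores if after > before)
    losses = sum(1 for before, after in scores if after < before)
    ties = len(scores) - wins - losses
    # Mean change across all examples; 0.0 when there is nothing to compare.
    mean_delta = mean(after - before for before, after in scores) if scores else 0.0
    return {"wins": wins, "losses": losses, "ties": ties, "mean_delta": mean_delta}


# Hypothetical scores, invented for illustration only.
summary = summarize([(0.5, 0.75), (0.5, 0.25), (0.5, 0.5)])
print(summary)
```

Reporting wins, losses, and ties alongside the mean change helps show whether an average improvement hides regressions on individual questions.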

This research initiative is not expected to result in immediate code changes and aims to explore the potential benefits of leveraging LangSmith for enhanced Test & Evaluation of RAG.

@davidgxue davidgxue self-assigned this Jan 11, 2024
@davidgxue davidgxue modified the milestone: 0.3.0 Jan 17, 2024