feat: include tokens usage for streamed output #4282

Merged (3 commits) on Nov 28, 2024

Commits on Nov 27, 2024

  1. Use pb.Reply instead of []byte with Reply.GetMessage() in llama grpc to get the proper usage data in reply streaming mode at the last [DONE] frame
    mintyleaf committed Nov 27, 2024
    2931ea4
  2. 8e8e05d

Commits on Nov 28, 2024

  1. e459118
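
The first commit message describes passing a structured reply through the stream callback instead of raw bytes, so that token counts are still available when the final frame is emitted. The idea can be sketched roughly as below. This is not the code from the PR: the `Reply` struct is a hypothetical stand-in for the generated `pb.Reply` type, and its field names, the `Usage` struct, and `CollectStream` are all assumptions made for illustration.

```go
package main

import "strings"

// Reply is a hypothetical stand-in for the generated pb.Reply message.
// The real type comes from the project's protobuf schema and may differ.
type Reply struct {
	Message      []byte
	Tokens       int32 // assumed: completion tokens produced so far
	PromptTokens int32 // assumed: tokens in the prompt
}

// GetMessage mirrors the nil-safe getter style of generated protobuf code.
func (r *Reply) GetMessage() []byte {
	if r == nil {
		return nil
	}
	return r.Message
}

// Usage is a hypothetical summary that would accompany the final frame.
type Usage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// CollectStream drains a channel of replies, concatenates the text chunks,
// and remembers the counts from the most recent reply. Passing only []byte
// through the channel would discard those counts.
func CollectStream(replies <-chan *Reply) (string, Usage) {
	var text strings.Builder
	var usage Usage
	for r := range replies {
		if r == nil {
			continue
		}
		text.Write(r.GetMessage())
		// Assumption: counts are cumulative, so the latest reply wins.
		usage.PromptTokens = int(r.PromptTokens)
		usage.CompletionTokens = int(r.Tokens)
		usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
	}
	return text.String(), usage
}
```

A caller would feed replies into the channel as they arrive, close it when the backend finishes, and attach the returned `Usage` to whatever it sends as the last frame of the stream.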