Stream api #392

Open
alex-dixon opened this issue Nov 26, 2024 · 1 comment

Comments

@alex-dixon
Contributor

A couple of users have asked about a streaming API, specifically a way to receive chunked output for text blocks and structured output.

How do we make this work with tracking?

@alex-dixon
Contributor Author

Looked into the code a bit.

One approach is to have streaming versions of the provider calls. This makes it clear that we should be streaming all the way through. Separate code paths for streaming in complex and/or track may help too. There's a clear spot to branch on track vs. track_streaming, for example, if that helps us.
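A rough sketch of where that branch could live. All names here (`invoke`, `call_provider`, `stream_provider`, `track`, `track_streaming`) are made up to show the shape of the fork, not anything in ell's codebase:

```python
from typing import Iterator, List

def call_provider(prompt: str) -> List[str]:
    # stand-in for a non-streaming provider call returning complete messages
    return ["hello world"]

def stream_provider(prompt: str) -> Iterator[str]:
    # stand-in for a streaming provider call yielding chunks
    yield "hello "
    yield "world"

def track(messages: List[str]) -> List[str]:
    # record the finished invocation, then hand the messages back
    print(f"tracked {len(messages)} message(s)")
    return messages

def track_streaming(chunks: Iterator[str]) -> Iterator[str]:
    # wrap the chunk iterator so tracking sees every chunk as it goes by
    collected = []
    for chunk in chunks:
        collected.append(chunk)
        yield chunk
    print(f"tracked streamed output: {''.join(collected)!r}")

def invoke(prompt: str, stream: bool = False):
    if stream:
        # streaming all the way through: chunks flow to the caller,
        # tracking observes them in passing
        return track_streaming(stream_provider(prompt))
    return track(call_provider(prompt))
```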

How should the API surface at the user level? E.g.:

1. A stream as the value returned by an LMP when `stream` is true:

   `for c in lmp(a, b, stream=True):`

2. As a property of whatever result type (non-streaming is `list[Message]`); a rough sketch of this shape follows below:

   `result, _ = lmp(a, b, stream=True)`

   `for c in result.stream():`
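Minimal sketch of that result-object surface. `LMPResult`, `.stream()`, and `.messages()` are hypothetical names, not ell's actual API:

```python
from typing import Iterator, List

class LMPResult:
    def __init__(self, chunks: Iterator[str]):
        # single-use chunk iterator coming from the provider layer
        self._chunks = chunks

    def stream(self) -> Iterator[str]:
        # hand the caller the underlying chunks as they arrive
        yield from self._chunks

    def messages(self) -> List[str]:
        # collect the whole stream when the caller wants final messages
        return ["".join(self._chunks)]


def fake_lmp(a: str, b: str, stream: bool = False):
    # pretend provider output, already chunked
    chunks = iter([a, " ", b])
    return LMPResult(chunks), None


result, _ = fake_lmp("hello", "world", stream=True)
for c in result.stream():
    print(c, end="")
```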

Other approaches:
Normalize on Python's "lazy sequence" abstraction by forcing a stream API in the provider layer and calling `list` on the result (collecting it) when `stream` is false. Unfortunately, not all provider API calls support streaming, so we'd be artificially forcing this in some cases. Provider code is already stream-aware by default for text; it just collects the stream. So the extra work would be a faux streaming API in ell (iterate over messages and yield parts of them). Could be something here.

Need to review the OpenAI stream chunk data structures and see how they map to ell types. If providers can return a normalized chunk, then ell should be able to (1) yield the chunks when `stream` is true, and (2) collect them into messages otherwise.
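A sketch of that normalization, assuming chunks are plain strings for illustration (the real mapping from OpenAI stream chunk objects onto ell's Message types is the part that still needs review). `faux_stream` and `run` are hypothetical helpers, not existing code:

```python
from typing import Iterable, Iterator, List

def faux_stream(messages: Iterable[str], chunk_size: int = 8) -> Iterator[str]:
    # artificially chunk already-complete messages so providers that cannot
    # stream still expose the same chunk-iterator interface
    for message in messages:
        for i in range(0, len(message), chunk_size):
            yield message[i:i + chunk_size]

def run(chunks: Iterator[str], stream: bool):
    if stream:
        return chunks                 # 1. yield the chunks when stream is true
    return ["".join(chunks)]          # 2. collect them into messages otherwise


# non-streaming provider output, forced through the same chunk interface
chunks = faux_stream(["a complete provider message"])
print(run(chunks, stream=False))
```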
