Tool support in the streaming response #988

Open
zhfeng opened this issue Oct 16, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

zhfeng (Contributor) commented Oct 16, 2024

Currently, calling a function is not supported in the streaming response; it throws an exception like:

2024-10-15 16:55:38,809 ERROR [io.qua.web.nex.run.WebSocketEndpointBase] (executor-thread-1) Unable to send text message from Multi: WebSocket connection [endpointId=io.quarkiverse.langchain4j.sample.chatbot.ChatBotWebSocket, path=/chatbot, id=362059f0-2b79-4691-9a42-7cae9c730be3] : java.lang.IllegalArgumentException: Tools are currently not supported by this model
	at dev.langchain4j.model.chat.StreamingChatLanguageModel.generate(StreamingChatLanguageModel.java:61)
	at dev.langchain4j.model.chat.StreamingChatLanguageModel_FwyQP9Of9oZwwZhlQz1A4k1Ak7I_Synthetic_ClientProxy.generate(Unknown Source)
	at dev.langchain4j.service.AiServiceTokenStream.start(AiServiceTokenStream.java:116)
	at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport$MultiEmitterConsumer.accept(AiServiceMethodImplementationSupport.java:713)
	at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport$MultiEmitterConsumer.accept(AiServiceMethodImplementationSupport.java:680)
	at io.smallrye.context.impl.wrappers.SlowContextualConsumer.accept(SlowContextualConsumer.java:21)
	at io.smallrye.mutiny.operators.multi.builders.EmitterBasedMulti.subscribe(EmitterBasedMulti.java:67)
	at io.smallrye.mutiny.operators.AbstractMulti.subscribe(AbstractMulti.java:60)
	at io.smallrye.mutiny.operators.multi.MultiFlatMapOp$FlatMapMainSubscriber.onItem(MultiFlatMapOp.java:182)
	at io.smallrye.mutiny.operators.multi.builders.EmitterBasedMulti$DropLatestOnOverflowMultiEmitter.drain(EmitterBasedMulti.java:220)
	at io.smallrye.mutiny.operators.multi.builders.EmitterBasedMulti$DropLatestOnOverflowMultiEmitter.emit(EmitterBasedMulti.java:153)
	at io.smallrye.mutiny.operators.multi.builders.SerializedMultiEmitter.onItem(SerializedMultiEmitter.java:50)
	at io.smallrye.mutiny.operators.multi.builders.SerializedMultiEmitter.emit(SerializedMultiEmitter.java:140)
	at io.smallrye.mutiny.groups.MultiCreate.lambda$completionStage$2(MultiCreate.java:128)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1773)
	at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:635)
	at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2516)
	at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2495)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1521)
	at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
	at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:840)
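
For reference, here is a minimal sketch of the kind of setup that triggers this. The service and tool names are made up for illustration; only @RegisterAiService, @Tool and Multi are the actual Quarkus LangChain4j / Mutiny types:

```java
import dev.langchain4j.agent.tool.Tool;
import io.quarkiverse.langchain4j.RegisterAiService;
import io.smallrye.mutiny.Multi;
import jakarta.enterprise.context.ApplicationScoped;

// Illustrative tool bean; any @Tool method has the same effect.
@ApplicationScoped
class WeatherTools {

    @Tool("Returns the current weather in the given city")
    String currentWeather(String city) {
        return "sunny"; // stub
    }
}

@RegisterAiService(tools = WeatherTools.class)
interface StreamingChatBot {

    // Returning Multi<String> selects the streaming model; combined with
    // the tools above, subscribing to the stream fails with
    // "Tools are currently not supported by this model".
    Multi<String> chat(String userMessage);
}
```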

Quote from Bruno Meseguer's comment:

I guess there are no rules on when the stream back starts... I mean, the LLM might first use the tools it needs, and then start streaming its response back

It would be great if we could have tool support in the streaming response as well.

geoand added the enhancement (New feature or request) label on Oct 17, 2024
geoand (Collaborator) commented Oct 17, 2024

This is definitely interesting, but it's actually harder than it looks.

Specifically about:

I mean, the LLM might first use the tools it needs, and then start streaming its response back

this can't be done because until you get the entire response from the LLM, you don't know whether a tool execution is necessary or not (remember that a single query from the user can result in the invocation of multiple tools).
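
To make that concrete: with OpenAI-style streaming, tool invocations arrive as partial deltas whose names and JSON arguments are only complete once the stream finishes, so a streaming handler is forced into accumulation along these lines (a sketch with made-up types, not the langchain4j API):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Delta shape mirrors OpenAI-style chunks, but these records are
// illustrative only.
record ToolCallDelta(int index, String namePart, String argumentsPart) {}

record ToolCall(String name, String argumentsJson) {}

class ToolCallAccumulator {

    private final Map<Integer, StringBuilder> names = new LinkedHashMap<>();
    private final Map<Integer, StringBuilder> arguments = new LinkedHashMap<>();

    // Called for every chunk: each delta may carry only a fragment of the
    // function name and/or a fragment of the arguments JSON.
    void accept(ToolCallDelta delta) {
        if (delta.namePart() != null) {
            names.computeIfAbsent(delta.index(), i -> new StringBuilder())
                 .append(delta.namePart());
        }
        if (delta.argumentsPart() != null) {
            arguments.computeIfAbsent(delta.index(), i -> new StringBuilder())
                     .append(delta.argumentsPart());
        }
    }

    // Only meaningful once the stream has completed: before that, the
    // arguments are partial JSON and more tool calls may still arrive.
    List<ToolCall> onStreamCompleted() {
        List<ToolCall> calls = new ArrayList<>();
        names.forEach((index, name) -> calls.add(new ToolCall(
                name.toString(),
                arguments.getOrDefault(index, new StringBuilder()).toString())));
        return calls;
    }
}
```

So the earliest point at which any tool can safely execute is stream completion, which is exactly why "use the tools first, then stream" isn't possible in general.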
