When I use the gpt-4o-audio-preview model and send audio through a fetch method I implemented myself, asking OpenAI to return both the audio and its transcription, the audio in the returned data is discarded, which makes the entire conversation unusable.
Code example
No response
AI provider
@ai-sdk/openai v1.0.4
Additional context
I found that the zodSchema.safeParse method dropped the audio field.
The original response from OpenAI contains an audio field:
{
  "index": 0,
  "message": {
    "role": "assistant",
    "content": null,
    "refusal": null,
    "audio": {
      "id": "audio_abc123",
      "expires_at": 1729018505,
      "data": "<bytes omitted>",
      "transcript": "Yes, golden retrievers are known to be ..."
    }
  },
  "finish_reason": "stop"
}
However, the SDK filters the response internally and discards the audio field.
I need the original return result, not the trimmed data. The response data I get in the onStepFinish middleware is also filtered and doesn't include the audio field.
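Until the SDK exposes the field, one possible workaround is to capture the raw body in the custom fetch before the SDK validates it. The sketch below is only an illustration: `withRawCapture` and `extractAudio` are hypothetical helper names, the `choices[].message.audio` path is assumed from the response shown above, and it only suits non-streaming JSON responses, because it reads the cloned body to the end before returning.

```typescript
// Minimal structural types so the sketch does not depend on DOM or Node typings.
interface ResponseLike {
  clone(): { json(): Promise<unknown> };
}
type FetchLike<R extends ResponseLike> = (input: any, init?: any) => Promise<R>;

// Shape of the audio object, taken from the response shown above.
interface AudioPayload {
  id: string;
  expires_at: number;
  data: string;
  transcript: string;
}

// Hypothetical helper: wraps a fetch implementation and reports each raw JSON
// body before any downstream schema validation can strip fields from it.
function withRawCapture<R extends ResponseLike>(
  baseFetch: FetchLike<R>,
  onRawBody: (body: unknown) => void,
): FetchLike<R> {
  return async (input, init) => {
    const response = await baseFetch(input, init);
    try {
      // Read a clone so the original body stays unread for the caller.
      onRawBody(await response.clone().json());
    } catch {
      // Body was not JSON: skip capture and pass the response through.
    }
    return response;
  };
}

// Hypothetical helper: collects the audio objects from a chat completion body.
// Assumes the body has a `choices` array whose items look like the one above.
function extractAudio(body: unknown): AudioPayload[] {
  const choices = (body as { choices?: unknown } | null)?.choices;
  if (!Array.isArray(choices)) return [];
  const found: AudioPayload[] = [];
  for (const choice of choices) {
    const audio = choice?.message?.audio;
    if (audio && typeof audio.data === "string") {
      found.push(audio as AudioPayload);
    }
  }
  return found;
}
```

If the provider accepts a custom `fetch` option, as the custom fetch in this report suggests, the wrapped function could be passed there, with `onRawBody` storing `extractAudio(body)` somewhere the calling code can read it after the request completes.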
Ideally, the AI SDK should formally support audio, as it already supports images and PDFs. Other AI providers may well implement audio I/O soon enough; for example, there are rumours of an Anthropic and Hume AI collaboration.