All Audio file rejected process #635

Open
Rbkyada opened this issue Oct 2, 2024 · 1 comment


Rbkyada commented Oct 2, 2024

Issue with Recorder Audio Failing for Speech-to-Text Models
Version of react-native-audio-recorder-player: 3.6.11

Version of React Native: ^0.73.5

Platforms affected: Both iOS and Android

Expected Behavior
I expected to record audio using the react-native-audio-recorder-player package and use the generated .wav file as input for speech-to-text models without issues.

Actual Behavior
The audio files generated by the react-native-audio-recorder-player consistently fail when uploaded to speech-to-text AI models. I've tested multiple services, and they all reject the .wav files created by this package.

Code options:
AudioSourceAndroid: AudioSourceAndroidType.MIC,
OutputFormatAndroid: OutputFormatAndroidType.DEFAULT,
AudioEncoderAndroid: AudioEncoderAndroidType.AMR_NB,
AudioSamplingRateAndroid: 44100,
AudioChannelsAndroid: 2,
AudioEncodingBitRateAndroid: 128000,
AVSampleRateKeyIOS: 44100,
AVFormatIDKeyIOS: AVEncodingOption.wav,

Steps to Reproduce

  1. Record an audio file using react-native-audio-recorder-player.
  2. Attempt to upload the resulting .wav file to any speech-to-text model.
  3. Observe that the upload or processing fails.
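Before step 3 is blamed on the speech-to-text service, it can help to check what container the recorder actually wrote, since a `.wav` extension does not guarantee WAV contents. The following is a minimal sketch, assuming the first bytes of the file have already been read into a `Uint8Array` (how that is done inside the app is left out). `asciiAt` and `sniffContainer` are hypothetical helper names, not part of react-native-audio-recorder-player.

```typescript
// Hypothetical helper: decode `length` bytes starting at `start` as ASCII.
function asciiAt(bytes: Uint8Array, start: number, length: number): string {
  let out = "";
  for (let i = start; i < start + length && i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

type Container = "wav" | "mp4-family" | "amr" | "unknown";

// Hypothetical helper: guess the container from well-known magic bytes.
function sniffContainer(bytes: Uint8Array): Container {
  // WAV: "RIFF" at offset 0 and "WAVE" at offset 8.
  if (asciiAt(bytes, 0, 4) === "RIFF" && asciiAt(bytes, 8, 4) === "WAVE") {
    return "wav";
  }
  // ISO base media files (MP4, M4A, 3GP): "ftyp" at offset 4.
  if (asciiAt(bytes, 4, 4) === "ftyp") {
    return "mp4-family";
  }
  // Raw AMR-NB storage format: "#!AMR\n" at offset 0.
  if (asciiAt(bytes, 0, 6) === "#!AMR\n") {
    return "amr";
  }
  return "unknown";
}
```

If this reports anything other than `"wav"` for a file named `.wav`, the mismatch between extension and contents is a plausible reason for the rejections.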
@tjlondon-npauctions

Try it without passing any settings at all, and don't pass an audioPath either: let it generate its own file and file extension, see what comes out, and upload that. Then work backwards from there, adding your settings back one at a time.
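The "start from nothing and add settings back" approach can be sketched as a small helper that builds the option sets to try, in order. This is only an illustration: `cumulativeOptionSets` is a hypothetical name, and plain strings stand in for the library's enum values so the snippet runs on its own.

```typescript
type Options = Record<string, unknown>;

// Returns cumulative option sets: {}, then {first}, then {first, second}, ...
function cumulativeOptionSets(full: Options): Options[] {
  const sets: Options[] = [{}];
  const current: Options = {};
  for (const key of Object.keys(full)) {
    current[key] = full[key];
    sets.push({ ...current }); // copy, so later additions don't mutate earlier sets
  }
  return sets;
}

// A subset of the settings from the report; string values are placeholders.
const reported: Options = {
  AudioEncoderAndroid: "AMR_NB",
  AudioSamplingRateAndroid: 44100,
  AudioChannelsAndroid: 2,
};

for (const set of cumulativeOptionSets(reported)) {
  console.log(Object.keys(set).join(", ") || "(library defaults)");
}
```

Each set would then be used for one test recording and upload. The first set whose recording gets rejected points at the setting that was just added.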

I think this is more likely to do with the bitrate and encoding settings rather than the package.

Sorry, I'm no expert on audio codecs but I use this just fine with AssemblyAI, Gladia and Deepgram. I'm using AAC, MP3 and M4A on iOS. Just my advice, I find WAVs way too big. You generally don't need an uncompressed lossless format for transcribing speech.
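To put a rough number on "way too big", here is a back-of-the-envelope sketch comparing one minute of uncompressed PCM WAV with one minute of audio at a constant 128 kbps (the bitrate from the original post). The 16-bit sample size and the 44-byte minimal WAV header are assumptions, container overhead for the compressed file is ignored, and both function names are hypothetical.

```typescript
// Uncompressed PCM: sampleRate * channels * bytesPerSample per second,
// plus an assumed 44-byte minimal WAV header.
function pcmWavBytes(
  seconds: number,
  sampleRate: number,
  channels: number,
  bitsPerSample: number
): number {
  return 44 + seconds * sampleRate * channels * (bitsPerSample / 8);
}

// Constant-bitrate compressed audio, ignoring container overhead.
function constantBitrateBytes(seconds: number, bitsPerSecond: number): number {
  return (seconds * bitsPerSecond) / 8;
}

const wavSize = pcmWavBytes(60, 44100, 2, 16); // 10584044 bytes
const compressedSize = constantBitrateBytes(60, 128000); // 960000 bytes

console.log(wavSize, compressedSize);
```

Under these assumptions the WAV is roughly eleven times larger for the same minute of speech.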
