Training waveglow model for 16kHz #215
Comments
Someone please answer this question. I trained the model after loading the pretrained weights, but after 14K steps the audio is full of noise. |
I got the same issue.
Before training, I used the … After training with the pre-trained model, the loss dropped quickly to … Of course, when I trained without the pre-trained model, the loss dropped very slowly, and the inference results were also full of noise. |
Maybe we could modify the code as in #88, then try again. |
So after training the pre-trained model for 25k steps, you are still getting noisy output? |
#88 may work |
After #88, training at 16kHz with the pre-trained model is no longer possible, because the |
Yes, I also faced the same issue, so I trained the model from scratch. After 100K steps, the audio quality is not improving much. |
Have you tried #99? Can we train at 16kHz with the pre-trained model using this code? |
Hi, I currently have a problem with 16kHz WaveGlow training.
I have trained for 236k steps and every output audio is silent. I hope you guys can shed some light on this :( |
Did anyone manage to solve this issue? I'm also training on a 16000 Hz dataset. To check the model, I trained it on just 12 samples (1 batch) with different parameters, using the pretrained model. The first one:
after 500 epochs the loss starts to increase, all the inferences (500, 1000, ... 5000) give only noise in the output.
Gives audible speech after 500, but there's a lot of noise and it's too fast. The question is: why does the loss increase? Why does the quality remain the same on the training set and does not improve even though the sample has been seen many times? And how to remove the noise and normalize the audio speed? |
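One thing that may be worth ruling out for the "too fast" symptom is a sampling-rate mismatch: if audio generated at one rate is written or played back at a higher rate, it sounds sped up. This is only a guess about the cause here, not a confirmed diagnosis. The sketch below uses the standard-library `wave` module to list files whose header rate differs from the rate you expect; the function names `find_rate_mismatches` and `playback_speed_factor` are made up for illustration and are not part of the WaveGlow repository.

```python
import wave


def find_rate_mismatches(wav_paths, expected_rate):
    """Return (path, actual_rate) pairs for WAV files whose header
    sampling rate differs from expected_rate."""
    mismatches = []
    for path in wav_paths:
        with wave.open(str(path), "rb") as wav_file:
            actual_rate = wav_file.getframerate()
        if actual_rate != expected_rate:
            mismatches.append((path, actual_rate))
    return mismatches


def playback_speed_factor(generated_rate, written_rate):
    """How much faster audio plays when samples generated at
    generated_rate are stored with a written_rate header."""
    return written_rate / generated_rate


if __name__ == "__main__":
    # Hypothetical case: 16 kHz samples saved with a 22.05 kHz header.
    print(playback_speed_factor(16000, 22050))
```

If every training file and the inference output already agree with the `sampling_rate` in the config, this check rules the mismatch out and the cause lies elsewhere.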
Was anyone able to figure this out? I also tried training 16k from scratch and had the same experience as @mychiux413 |
You can find a model trained from scratch on 21 hours of multispeaker 16kHz data (544000 training steps) here: http://adrianastan.com/models/ . Not as good as the NVIDIA release, but it does the job. The config is as follows:
Perhaps you can warmstart your model from it. |
Trained one for 377.5k steps, unsure of how good/bad it is because for my use case it was okay-ish - https://drive.google.com/file/d/1dP4eMDPrZyqRo_gMz1VUDr2Bd_eRXoIa/view?usp=sharing |
Can you also share your config, please? |
I get the following exception when loading the model: |
Hi,
I'm trying to train 16kHz models for both waveglow and tacotron2.
For the 16k Tacotron I used win_length=800 and hop_length=200. It has produced good results with the 22k pretrained WaveGlow model. To get better results, I want to train a 16kHz WaveGlow model.
I guess that the same parameter values of 800 and 200 should be used for waveglow training.
When I use these new parameters instead of 1024 and 256, can I still use the pretrained 22k WaveGlow model for a warm start? I have some reservations because the pretrained 22k WaveGlow model was trained with win_length: 1024 and hop_length: 256.
Thanks.
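To make the comparison in the question concrete, here is a small sketch of the arithmetic. It assumes the pretrained model was trained at 22050 Hz with win_length 1024 and hop_length 256, and compares that with the proposed 16000 Hz / 800 / 200 setup. The helper names are made up for illustration, and this does not by itself answer whether a warm start will work.

```python
def frame_durations_ms(sampling_rate, win_length, hop_length):
    """Return (window_ms, hop_ms) for the given STFT settings."""
    window_ms = 1000.0 * win_length / sampling_rate
    hop_ms = 1000.0 * hop_length / sampling_rate
    return window_ms, hop_ms


def rescale_to_rate(length_in_samples, src_rate, dst_rate):
    """Scale a length in samples so it covers the same duration at dst_rate."""
    return round(length_in_samples * dst_rate / src_rate)


if __name__ == "__main__":
    # Assumed settings of the pretrained 22 kHz model.
    print("pretrained:", frame_durations_ms(22050, 1024, 256))
    # Settings proposed in the question for 16 kHz.
    print("proposed:  ", frame_durations_ms(16000, 800, 200))
    # Sample counts covering the same durations as 1024 / 256 at 22050 Hz.
    print("rescaled:  ", rescale_to_rate(1024, 22050, 16000),
          rescale_to_rate(256, 22050, 16000))
```

Under these assumptions the two setups differ both in samples per frame (200 vs 256) and in frame duration (12.5 ms vs about 11.6 ms), so the mel frames fed to the vocoder do not line up with what the pretrained weights saw.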