Example data processing scripts are listed below. You may also tailor your own script along with a Dataset class in src/f5_tts/model/dataset.py.
Download the corresponding dataset first, and fill in its path in the script.
# Prepare the Emilia dataset
python src/f5_tts/train/datasets/prepare_emilia.py
# Prepare the Wenetspeech4TTS dataset
python src/f5_tts/train/datasets/prepare_wenetspeech4tts.py
# Prepare the LibriTTS dataset
python src/f5_tts/train/datasets/prepare_libritts.py
# Prepare the LJSpeech dataset
python src/f5_tts/train/datasets/prepare_ljspeech.py
For usage guidance on the following script, see #57.
python src/f5_tts/train/datasets/prepare_csv_wavs.py
Once your datasets are prepared, you can start the training process.
# Set up the accelerate config, e.g. multi-GPU DDP, fp16
# It will be saved to: ~/.cache/huggingface/accelerate/default_config.yaml
accelerate config
# .yaml files are under the src/f5_tts/configs directory
accelerate launch src/f5_tts/train/train.py --config-name F5TTS_Base_train.yaml
Discussion board for finetuning: see #57.
For Gradio UI training/finetuning with src/f5_tts/train/finetune_gradio.py, see #143.
Setting use_ema = True can be harmful for early-stage finetuned checkpoints (which have gone through only a few updates, so the EMA weights are still dominated by the pretrained ones). Try turning it off and see if it gives better results.
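To get a feel for why the EMA weights lag behind early in finetuning, here is a minimal sketch of a plain exponential moving average with a constant decay. The decay value of 0.9999 and the helper names `ema_after` and `pretrained_share` are assumptions made for illustration; the EMA actually used by the training script may differ (for example, it may apply a warmup schedule to the decay). A single scalar stands in for the model weights.

```python
def ema_after(num_updates, decay, pretrained=0.0, finetuned=1.0):
    """Run a constant-decay EMA for num_updates steps.

    `pretrained` stands in for the starting (pretrained) weights and
    `finetuned` for the weights the optimizer has moved to, which are
    held fixed here to keep the sketch simple.
    """
    ema = pretrained
    for _ in range(num_updates):
        ema = decay * ema + (1.0 - decay) * finetuned
    return ema


def pretrained_share(num_updates, decay):
    """Fraction of the EMA that still comes from the starting weights."""
    return decay ** num_updates


if __name__ == "__main__":
    decay = 0.9999  # assumed value, for illustration only
    for steps in (100, 1_000, 10_000, 100_000):
        print(steps, round(pretrained_share(steps, decay), 4))
```

Under these assumptions, roughly 90% of the EMA still comes from the pretrained weights after 1,000 updates, which is why a checkpoint saved that early may sound closer to the pretrained model than to the finetuned one.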
The wandb/ directory will be created under the path from which you run the training/finetuning scripts.
By default, the training script does NOT use logging (assuming you didn't manually log in using wandb login).
To turn on wandb logging, you can either:
- Log in manually with wandb login (see the wandb documentation to learn more)
- Log in automatically by setting an environment variable: get an API key at https://wandb.ai/site/ and set the environment variable as follows:
On Mac & Linux:
export WANDB_API_KEY=<YOUR WANDB API KEY>
On Windows:
set WANDB_API_KEY=<YOUR WANDB API KEY>
Moreover, if you can't access wandb and want to log metrics offline, you can set the environment variable as follows:
export WANDB_MODE=offline
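If you prefer to launch training from a Python wrapper instead of exporting variables in the shell, the same settings can be passed through the child process environment. This is only a sketch: `build_wandb_env` and `launch_training` are hypothetical helper names, not part of the repository, and the launch command simply mirrors the accelerate command shown earlier.

```python
import os
import subprocess


def build_wandb_env(api_key=None, offline=False, base_env=None):
    """Return a copy of the environment with the wandb variables applied.

    Hypothetical helper: it only sets WANDB_API_KEY and WANDB_MODE,
    the two variables described above, and leaves everything else as is.
    """
    env = dict(os.environ if base_env is None else base_env)
    if api_key:
        env["WANDB_API_KEY"] = api_key
    if offline:
        env["WANDB_MODE"] = "offline"
    return env


def launch_training(env):
    """Start training with the given environment (same command as above)."""
    cmd = [
        "accelerate", "launch", "src/f5_tts/train/train.py",
        "--config-name", "F5TTS_Base_train.yaml",
    ]
    return subprocess.run(cmd, env=env, check=True)
```

For example, `launch_training(build_wandb_env(offline=True))` would start a run that logs metrics offline, without changing the environment of the calling shell.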