ComfyUI-audio

generative audio tools for ComfyUI. highly experimental—expect things to break and/or change frequently or not at all.

NOTE: for the foreseeable future, i will be unable to continue working on this extension. please consider forking this repository!

features

tacotron2 text-to-speech
- uses justinjohn0306's forks of tacotron2 and hifi-gan
musicgen text-to-music + audiogen text-to-sound
- audiocraft and transformers implementations
- supports audio continuation, unconditional generation
tortoise text-to-speech
vall-e x text-to-speech
- uses korakoe's fork
voicefixer
audio utility nodes
- save audio, convert audio

installation

# TORCH_CUDA_INDEX_URL=https://download.pytorch.org/whl/cu118  # for cuda 11.8
TORCH_CUDA_INDEX_URL=https://download.pytorch.org/whl/cu121  # for cuda 12.1

cd ComfyUI/custom_nodes
git clone https://github.com/eigenpunk/ComfyUI-audio
cd ComfyUI-audio

# for linux
pip install -r requirements.txt --extra-index-url $TORCH_CUDA_INDEX_URL

# for windows (run from a bash-compatible shell such as Git Bash, or paste the URL in place of the variable)
pip install -r requirements_windows.txt --extra-index-url $TORCH_CUDA_INDEX_URL

this extension is developed and tested on a Linux-based OS. i've not yet been able to get the extension fully working on Windows, so expect some difficulty if that is your platform. i've not tested the extension on macOS at all.
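
a minimal sketch of how the index URL in the commands above relates to the CUDA version, in case you'd rather derive it than edit it by hand. `torch_cuda_index_url` is a hypothetical helper, not part of this repository; only the cu118 and cu121 URLs appear in this README, so treat any other version as untested.

```shell
# hypothetical helper (not part of this repo): build the pytorch wheel
# index url from a cuda version string such as "11.8" or "12.1"
torch_cuda_index_url() {
    # drop the dot so "12.1" becomes "121"
    tag=$(printf '%s' "$1" | tr -d '.')
    printf 'https://download.pytorch.org/whl/cu%s\n' "$tag"
}

# pick the url for cuda 12.1, matching the default used above
TORCH_CUDA_INDEX_URL=$(torch_cuda_index_url 12.1)
echo "$TORCH_CUDA_INDEX_URL"
```

the resulting variable can then be passed to `pip install ... --extra-index-url $TORCH_CUDA_INDEX_URL` exactly as in the install commands.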

would be nice to have maybe

audio uploads
audio previews
prompt weights for text-to-music/audio
stereo musicgen
multi-band diffusion
more/faster tts model support
- vits?
- ~~tacotron2~~
- ~~vall-e x~~
- ???
split generator nodes by model stages
- e.g. tortoise:
  - autoregressor
  - clvp/cvvp
  - spectrogram diffusion
- e.g. musicgen:
  - t5 text encode
  - encodec audio encode
  - generate with decoder
more audio generation models
- magnet, etc
demucs
~~audiogen~~

NOTE: this work is solely a personal project; its development is not supported/sponsored by any past/present employer or any other external organization.

