- BREAKING(pipeline): remove `segmentation_duration` parameter from `SpeakerDiarization` pipeline (defaults to duration of segmentation model)
- BREAKING(task): remove support for variable chunk duration for segmentation tasks
- BREAKING(pipeline): remove support for `FINCHClustering` and `HiddenMarkovModelClustering`
- BREAKING(setup): drop support for Python 3.7
- BREAKING(io): channels are now 0-indexed (used to be 1-indexed)
- BREAKING(io): multi-channel audio is no longer downmixed to mono by default.
  You should update how `pyannote.audio.core.io.Audio` is instantiated:
  - replace `Audio()` by `Audio(mono="downmix")`;
  - replace `Audio(mono=True)` by `Audio(mono="downmix")`;
  - replace `Audio(mono=False)` by `Audio()`.
- BREAKING(model): get rid of (flaky) `Model.introspection`.
  If, for some weird reason, you wrote some custom code based on that, you should instead rely on `Model.example_output`.
- BREAKING(interactive): remove support for Prodigy recipes
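The `io` changes above can be sketched in plain Python. This is an illustration of the intended semantics only (channel-first layout, 0-based channel indexing, explicit downmix), not pyannote's actual implementation:

```python
# Illustration only -- not pyannote's implementation.
# Audio is laid out channel-first; channels are 0-indexed in 3.0,
# and downmixing to mono must now be requested explicitly.

left = [1.0, 1.0, 1.0, 1.0]   # first channel: index 0 (used to be index 1)
right = [0.0, 0.0, 0.0, 0.0]  # second channel: index 1
waveform = [left, right]      # (channel, time)

# What mono="downmix" does conceptually: average across channels.
downmix = [sum(samples) / len(waveform) for samples in zip(*waveform)]

print(downmix)  # [0.5, 0.5, 0.5, 0.5]
```

With the library itself, the equivalent of the old implicit behaviour is `Audio(mono="downmix")`, while plain `Audio()` now keeps all channels.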
## Fixes and improvements

- fix(pipeline): fix reproducibility issue with Ampere CUDA devices
- fix(pipeline): fix support for `IOBase` audio
- fix(pipeline): fix corner case with no speaker
- fix(train): prevent metadata preparation from happening twice
- fix(task): fix support for "balance" option
- improve(task): shorten and improve structure of Tensorboard tags
## TL;DR

Better pretrained pipeline and model.

Major breaking changes:

- Use `pipeline.to(torch.device('cuda'))` to use GPU
- `SpeakerSegmentation` pipeline removed: use `SpeakerDiarization` pipeline instead
- `prodi.gy` recipes removed

## Full changelog

### Features and improvements

- Send pipeline to device with `pipeline.to(device)`
- Add `return_embeddings` option to `SpeakerDiarization` pipeline
- Make `segmentation_batch_size` and `embedding_batch_size` mutable in `SpeakerDiarization` pipeline (they now default to 1)
- … `SpeakerDiarization` task

### Breaking changes

- Rename `Segmentation` task to `SpeakerDiarization`
- Pipelines now run on CPU by default (use `pipeline.to(device)` to send them to GPU)
- Remove `SpeakerSegmentation` pipeline (use `SpeakerDiarization` pipeline instead)
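A minimal sketch of the new explicit device placement. With the library itself you would call `Pipeline.from_pretrained(...)` and then `pipeline.to(torch.device("cuda"))` as described above; the stand-in class below is hypothetical and only mirrors the `.to()` contract, so the pattern runs without pyannote or a GPU:

```python
# Hypothetical stand-in for a pyannote.audio 3.0 pipeline, for illustration.
# Real code (assuming pyannote.audio >= 3.0 and access to the checkpoint):
#   pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.0")
#   pipeline.to(torch.device("cuda"))
class FakePipeline:
    def __init__(self):
        # 3.0 behaviour: pipelines start on CPU instead of grabbing a GPU.
        self.device = "cpu"

    def to(self, device):
        # Explicit opt-in, mirroring torch's Module.to / pipeline.to(device).
        self.device = device
        return self

pipeline = FakePipeline().to("cuda")
print(pipeline.device)  # cuda
```

The point of the change is that device placement is now an explicit, chainable call rather than an implicit side effect of pipeline construction.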
### Dependencies update
This discussion was created from the release Version 3.0.0.