v0.33.0: MUSA backend support and bugfixes
A small release this month, focused on added backend support and bugfixes:
- Support MUSA (Moore Threads GPU) backend in accelerate by @fmo-mt in #2917
- Allow multiple processes per device by @cifkao in #2916
- Add `torch.float8_e4m3fn` format to `dtype_byte_size` by @SunMarc in #2945
- Properly handle `Params4bit` in `set_module_tensor_to_device` by @matthewdouglas in #2934
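
As a quick illustration, here is a minimal training step using Accelerate's device-agnostic API, which should now also run on Moore Threads GPUs once the MUSA backend is detected. This is a sketch only: it assumes a working MUSA-enabled PyTorch install and uses a toy model for brevity.

```python
import torch
from accelerate import Accelerator

# The backend (CUDA, XPU, MPS, and now MUSA) is detected automatically;
# no backend-specific code is needed in the training loop.
accelerator = Accelerator()

model = torch.nn.Linear(8, 2)  # toy model for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
model, optimizer = accelerator.prepare(model, optimizer)

x = torch.randn(4, 8, device=accelerator.device)
loss = model(x).sum()
accelerator.backward(loss)
optimizer.step()
```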
What's Changed
- [tests] fix bug in torch_device by @faaany in #2909
- Fix slowdown on init with `device_map="auto"` by @muellerzr in #2914
- fix: bug where `multi_gpu` was being set and warning being printed even with `num_processes=1` by @HarikrishnanBalagopal in #2921
- Better error when a bad directory is given for weight merging by @muellerzr in #2852
- add xpu device check before moving tensor directly to xpu device by @faaany in #2928
- Add huggingface_hub version to setup.py by @nullquant in #2932
- Correct loading of models with shared tensors when using accelerator.load_state() by @jkuntzer in #2875 (see the sketch after this list)
- Hotfix PyTorch Version Installation in CI Workflow for Minimum Version Matrix by @yhna940 in #2889
- Fix import test by @muellerzr in #2931
- Consider pynvml available when installed through the nvidia-ml-py distribution by @matthewdouglas in #2936
- Improve test reliability for Accelerator.free_memory() by @matthewdouglas in #2935
- delete CCL env var setting by @Liangliang-Ma in #2927
- feat(ci): add `pip` caching in CI by @SauravMaheshkar in #2952
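
For the `accelerator.load_state()` fix above (#2875), a minimal checkpoint round trip looks roughly like this. It is a sketch only: the checkpoint directory name and toy model are illustrative, not taken from the PR.

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = accelerator.prepare(torch.nn.Linear(4, 4))  # toy model for illustration

# save_state writes the prepared objects' state and RNG state to the directory;
# load_state restores it, now handling models with shared (tied) tensors correctly.
accelerator.save_state("my_checkpoint")
accelerator.load_state("my_checkpoint")
```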
New Contributors
- @HarikrishnanBalagopal made their first contribution in #2921
- @fmo-mt made their first contribution in #2917
- @nullquant made their first contribution in #2932
- @cifkao made their first contribution in #2916
- @jkuntzer made their first contribution in #2875
- @matthewdouglas made their first contribution in #2936
- @Liangliang-Ma made their first contribution in #2927
- @SauravMaheshkar made their first contribution in #2952
Full Changelog: v0.32.1...v0.33.0