DeepSpeed error with LoRA #2

g-batalhao-a · 2024-04-01T12:57:39Z

Hi,

I tried to run the fine-tune script with a quantized Mistral model and got the following error:
ValueError: DeepSpeed Zero-3 is not compatible with "low_cpu_mem_usage=True" or with passing a "device_map".

After looking at this issue, I removed this line, but then a different error appears:
RuntimeError: Only Tensors of floating point and complex dtype can require gradients

Could I get help solving this issue?
Thanks
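
For context, the second error can be reproduced in plain PyTorch whenever requires_grad is enabled on a tensor with an integer dtype, which is how quantized weights are commonly stored. The snippet below is a minimal sketch, assuming PyTorch is installed. The helper name try_enable_grad and the uint8 stand-in tensor are hypothetical, and this is not necessarily what the fine-tune script does internally.

```python
import torch

# Stand-in for a quantized weight: an integer-typed tensor (uint8 is an assumption).
quantized_weight = torch.zeros(4, 4, dtype=torch.uint8)

# Stand-in for a regular floating-point weight, e.g. a LoRA adapter matrix.
float_weight = torch.zeros(4, 4, dtype=torch.float32)


def try_enable_grad(tensor):
    """Try to mark a tensor as trainable.

    Returns None on success, or the error message if autograd refuses.
    """
    try:
        tensor.requires_grad_(True)
        return None
    except RuntimeError as err:
        return str(err)


int_error = try_enable_grad(quantized_weight)
float_error = try_enable_grad(float_weight)

print("integer tensor:", int_error)
print("float tensor:", float_error)
```

If this is the cause, only floating-point parameters (such as the adapter weights) can be marked trainable, while the integer-typed quantized base weights have to stay frozen.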
