Commit
do not use distributed when not available, instead use CPU or MPS

This entails a few changes:

- `--device` is now a valid flag to the library, since `ilab` can pass CPU, MPS, or default to CUDA
- when using CPU or MPS, do not initialize DS (DeepSpeed); instead, put the model on the device and initialize the `Adafactor` optimizer, which is more efficient than an Adam-based one
- inside of `train`, add logic handling `torch.cuda.is_available()` and `torch.distributed.is_initialized()`; we don't use distributed torch on consumer systems
- the train loop needs some custom step and loss logic for a `LlamaForCausalLM` model; add that in
- when using CPU or MPS we are always `world_size == 1` and `local_rank == 0`

Signed-off-by: Charlie Doern <[email protected]>
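The core of the change is deciding up front whether distributed training is even possible. A minimal sketch of that selection logic, assuming illustrative names (`select_device` is hypothetical, not the actual `ilab` function):

```python
# Hypothetical sketch of the device selection described in the commit:
# fall back to single-process CPU/MPS training when CUDA (and therefore
# torch.distributed) is not in play. Names are illustrative only.
from typing import Optional

import torch


def select_device(requested: Optional[str] = None) -> torch.device:
    """Resolve a --device flag, defaulting to CUDA when present."""
    if requested:
        return torch.device(requested)
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = select_device()
use_distributed = torch.cuda.is_available() and torch.distributed.is_initialized()

if use_distributed:
    world_size = torch.distributed.get_world_size()
    local_rank = torch.distributed.get_rank()
else:
    # On consumer systems (CPU/MPS) distributed torch is never used,
    # so we always behave as a single-process "cluster".
    world_size = 1
    local_rank = 0
```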
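With the device chosen, the non-distributed path skips the DeepSpeed engine entirely. The following sketch shows how a model might be placed on the device with the `Adafactor` optimizer from `transformers`; the model path and hyperparameters are placeholders, not values from the commit:

```python
# A minimal sketch of the non-DeepSpeed path: put the model directly on
# the chosen device and use Adafactor, which keeps far less optimizer
# state than Adam-style optimizers.
import torch
from transformers import AutoModelForCausalLM
from transformers.optimization import Adafactor

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = AutoModelForCausalLM.from_pretrained("path/to/model")  # placeholder path
model.to(device)  # no DeepSpeed engine; the model lives on CPU/MPS directly

optimizer = Adafactor(
    model.parameters(),
    lr=1e-5,                # assumed learning rate, for illustration
    scale_parameter=False,  # use the fixed lr above
    relative_step=False,    # required when an explicit lr is given
    warmup_init=False,
)
```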
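Finally, a hedged sketch of what the custom step and loss logic for `LlamaForCausalLM` could look like in a single-device loop; `dataloader` and `num_epochs` are assumed to exist. Passing `labels` makes the model compute its own causal-LM loss:

```python
# Sketch of a single-device train loop for a causal-LM model.
# LlamaForCausalLM returns outputs.loss when labels are supplied.
model.train()
for epoch in range(num_epochs):          # num_epochs: assumed hyperparameter
    for batch in dataloader:             # dataloader: assumed to yield token dicts
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["labels"],
        )
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```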