Hamid Shojanazeri
|
442c1ccf7c
adding barrier to end of trainer loop
|
1 year ago |
Hamid Shojanazeri
|
f74d57dc08
printing scores based on fsdp usage or single gpu
|
1 year ago |
Hamid Shojanazeri
|
3d887ea483
update with active memory and removing rank0 for eval score
|
1 year ago |
Hamid Shojanazeri
|
bedb96b78a
fixing the full state path in checkpoint handler
|
1 year ago |
Hamid Shojanazeri
|
bd01f64cbd
Merge branch 'main' into fix-cuda_id
|
1 year ago |
Andrew Gu
|
71fdc4920a
Save memory and fix typos
|
1 year ago |
Hamid Shojanazeri
|
a7156dfb5d
fixing the cuda id
|
1 year ago |
Hamid Shojanazeri
|
707af7ea24
adding cuda:0 for non-fsdp situations
|
1 year ago |
Hamid Shojanazeri
|
6678be75ad
fixing indentation
|
1 year ago |
Hamid Shojanazeri
|
6a84e9e4d5
fixing scaler for both fsdp and non fsdp
|
1 year ago |
Hamid Shojanazeri
|
065ddaa77b
fixing the condition for moving to cuda
|
1 year ago |
Hamid Shojanazeri
|
20b061e01c
modify to stepping the lr scheduler each epoch
|
1 year ago |
chauhang
|
4767f09ecd
Initial commit
|
1 year ago |