Hamid Shojanazeri
|
442c1ccf7c
adding barrier to end of trainer loop
|
1 year ago |
Hamid Shojanazeri
|
f74d57dc08
printing scores based on fsdp usage or single gpu
|
1 year ago |
Hamid Shojanazeri
|
3d887ea483
update with active memory and removing rank0 for eval score
|
1 year ago |
Hamid Shojanazeri
|
bedb96b78a
fixing the full state path in checkpoint handler
|
1 year ago |
Hamid Shojanazeri
|
bd01f64cbd
Merge branch 'main' into fix-cuda_id
|
1 year ago |
Andrew Gu
|
71fdc4920a
Save memory and fix typos
|
1 year ago |
Hamid Shojanazeri
|
a7156dfb5d
fixing the cuda id
|
1 year ago |
Hamid Shojanazeri
|
707af7ea24
adding cuda:0 for non-fsdp situations
|
1 year ago |
Hamid Shojanazeri
|
6678be75ad
fixing indentation
|
1 year ago |
Hamid Shojanazeri
|
6a84e9e4d5
fixing scaler for both fsdp and non fsdp
|
1 year ago |
Hamid Shojanazeri
|
065ddaa77b
fixing the condition for moving to cuda
|
1 year ago |
Hamid Shojanazeri
|
20b061e01c
modify to stepping the lr scheduler each epoch
|
1 year ago |
chauhang
|
4767f09ecd
Initial commit
|
1 year ago |