Abhilash Majumder
|
977d68a7ce
fix memory bug
|
1 year ago |
Abhilash Majumder
|
a2ca4a7710
fix bug on indent
|
1 year ago |
Abhilash Majumder
|
d5f39914e8
Merge branch 'main' into ipex_feature
|
1 year ago |
abhilash1910
|
82d3ca6e06
Fix bugs in data loading
|
1 year ago |
abhilash1910
|
ed7ba999a9
enable xpu finetuning and inference
|
1 year ago |
lchu
|
feaa344af3
resolve conflicts
|
1 year ago |
Hamid Shojanazeri
|
75f291fe1c
resolved conflicts
|
1 year ago |
Hamid Shojanazeri
|
44ef280d31
adding flash attention and xformer memory efficient through PT SDPA
|
1 year ago |
luoyifan
|
79d0d4fc4e
fix some typos.
|
1 year ago |
lchu
|
80a4c36707
further fix #90
|
1 year ago |
Hamid Shojanazeri
|
88d3e1febc
fix the save_train_param condition
|
1 year ago |
Hamid Shojanazeri
|
62be60355a
resolving conflicts
|
1 year ago |
Hamid Shojanazeri
|
017cadd04b
Merge branch 'checkpoint_handler_path_fix' of https://github.com/facebookresearch/llama-recipes into checkpoint_handler_path_fix
|
1 year ago |
Hamid Shojanazeri
|
4f70348b94
remove the redundant lr step
|
1 year ago |
Hamid Shojanazeri
|
5b916114eb
merge main branch
|
1 year ago |
Hamid Shojanazeri
|
668c364f6b
add rank to save_train_params
|
1 year ago |
Hamid Shojanazeri
|
231c9e7da9
adding train_param.yaml saving for fsdp checkpoint loading for inference
|
1 year ago |
Hamid Shojanazeri
|
41dd7ff1cb
Merge branch 'main' into checkpoint_handler_path_fix
|
1 year ago |
Hamid Shojanazeri
|
a955ed1999
added checks for dist barrier and commented cuda expandable segments and dist_dbug
|
1 year ago |
Hamid Shojanazeri
|
a2403c7c1a
clean up
|
1 year ago |
Hamid Shojanazeri
|
e9559d2669
fixing the train/eval_loss calculation
|
1 year ago |
Hamid Shojanazeri
|
4ba4400a75
adding dist barrier before and after checkpointing
|
1 year ago |
Hamid Shojanazeri
|
a49a2c2804
adding PT cuda allocation expand flag
|
1 year ago |
Hamid Shojanazeri
|
442c1ccf7c
adding barrier to end of trainer loop
|
1 year ago |
Hamid Shojanazeri
|
f74d57dc08
printing scores based on fsdp usage or single gpu
|
1 year ago |
Hamid Shojanazeri
|
3d887ea483
update with active memory and removing rank0 for eval score
|
1 year ago |
Hamid Shojanazeri
|
bedb96b78a
fixing the full state path in checkpoint handler
|
1 year ago |
Hamid Shojanazeri
|
563e572f7c
adding active mem stat
|
1 year ago |
Hamid Shojanazeri
|
bd01f64cbd
Merge branch 'main' into fix-cuda_id
|
1 year ago |
Andrew Gu
|
71fdc4920a
Save memory and fix typos
|
1 year ago |