Commit History

Author | SHA1 | Message | Date
sekyonda | 59b59afe88 | Update spellcheck.sh | 1 year ago
lchu | 895dfcea30 | add nightly check for using low_cpu_fsdp mode | 1 year ago
Hamid Shojanazeri | 9b0eae4056 | add max_pad length as an arg | 1 year ago
Hamid Shojanazeri | 6beadcd8dd | adding padding to inference script with embedding size changing | 1 year ago
lchu | 1e64fc98d9 | switch to simpler param_init_fn and meta device init | 1 year ago
lchu | 101391f46a | Revert "replace init_empty_weights with torch.device(meta)" | 1 year ago
lchu | c8d4f38d23 | replace init_empty_weights with torch.device(meta) | 1 year ago
lchu | d8a81bb531 | save cpu mem by leveraging FSDP rank0 broadcasting | 1 year ago
Geeta Chauhan | 1387b76e11 | fixing the full state path in checkpoint handler+loss report calculation (#51) | 1 year ago
Hamid Shojanazeri | 88d3e1febc | fix the save_train_param condition | 1 year ago
Hamid Shojanazeri | b56028c98d | fixing the word list/spell check | 1 year ago
Hamid Shojanazeri | 62be60355a | resolving conflicts | 1 year ago
Geeta Chauhan | 174b856591 | update README: python 3.9 rec + fix formatting (#63) | 1 year ago
Geeta Chauhan | 0cd5694a14 | Fsdp inference checkpoints (#39) | 1 year ago
Hamid Shojanazeri | c4e96af6ee | clean up | 1 year ago
Christian Miller | 7c1884c690 | recommend python 3.9 | 1 year ago
Hamid Shojanazeri | 7d2e06821e | fixing the path to script | 1 year ago
Hamid Shojanazeri | 5f97db8f0c | fix spell check word list | 1 year ago
Hamid Shojanazeri | 017cadd04b | Merge branch 'checkpoint_handler_path_fix' of https://github.com/facebookresearch/llama-recipes into checkpoint_handler_path_fix | 1 year ago
Hamid Shojanazeri | 4f70348b94 | remove the redundant lr step | 1 year ago
Hamid Shojanazeri | 9c95ed4bbe | clean up | 1 year ago
Hamid Shojanazeri | 311a5c1eec | add notes for train_param.yaml | 1 year ago
Hamid Shojanazeri | 5b916114eb | merge main branch | 1 year ago
Hamid Shojanazeri | 668c364f6b | add rank to save_train_params | 1 year ago
Hamid Shojanazeri | 231c9e7da9 | adding train_param.yaml saving for fsdp checkpoint loading for inference | 1 year ago
Hamid Shojanazeri | 475e67b4ec | clean up | 1 year ago
Hamid Shojanazeri | 50e9d17045 | add the default option for find the HF model_name/path from train_param.yaml | 1 year ago
Hamid Shojanazeri | 41dd7ff1cb | Merge branch 'main' into checkpoint_handler_path_fix | 1 year ago
Hamid Shojanazeri | 31d6ce8bf6 | adding expnadable sgement and dist debug flag info | 1 year ago
Hamid Shojanazeri | a955ed1999 | added checks for dist barrier and commented cuda exapnadable segements and dist_dbug | 1 year ago