1 year ago · 4b18e49f44
--- a/docs/inference.md
+++ b/docs/inference.md
@@ -38,12 +38,27 @@ python chat_completion.py --model_name "PATH/TO/MODEL/7B/" --prompt_file chats.j
 
				 
			
 
				 If you fine-tuned your model with pure FSDP and saved the checkpoints with "SHARDED_STATE_DICT", as shown [here](../configs/fsdp.py), you can use this converter script to convert the FSDP sharded checkpoints into HuggingFace checkpoints. You can then run the inference script as described above.
			
 
				 **To convert the checkpoint use the following command**:
			
 
				+
			
 
				+This is helpful if you have fine-tuned your model using FSDP only, as follows:
			
 
				+
			
 
				+```bash
			
 
				+torchrun --nnodes 1 --nproc_per_node 8 llama_finetuning.py --enable_fsdp --model_name /path_of_model_folder/7B --dist_checkpoint_root_folder model_checkpoints --dist_checkpoint_folder fine-tuned --pure_bf16
			
 
				+```
			
 
				+Then convert your FSDP checkpoint to HuggingFace checkpoints using:
			
 
				 ```bash
			
 
				  python checkpoint_converter_fsdp_hf.py --model_name  PATH/to/FSDP/Checkpoints --save_dir PATH/to/save/checkpoints --model_path PATH/or/HF/model_name
			
 
				 
			
 
				  # --model_path is the HF Llama model name, or a local path that contains config.json and tokenizer.json
			
 
				  ```
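
When `--model_path` points at a local directory rather than a HF model name, it may be worth checking that the directory holds the two files named in the comment above before you start the conversion. This is a minimal sketch and not part of the repository: `check_hf_dir` is a hypothetical helper, and it assumes the files sit at the top level of the directory.

```bash
# Hypothetical helper (not part of this repo): succeed only if the given
# local directory contains both config.json and tokenizer.json.
check_hf_dir() {
  dir="$1"
  missing=0
  for f in config.json tokenizer.json; do
    if [ ! -f "$dir/$f" ]; then
      # Report each missing file on stderr and remember the failure.
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One possible use is to gate the converter on the check, for example `check_hf_dir PATH/or/HF/model_name && python checkpoint_converter_fsdp_hf.py ...`, with the remaining arguments as in the command above.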
			
 
				 
			
 
				+Then run inference using:
			
 
				+
			
 
				+```bash
			
 
				+python inference/inference.py --model_name <training_config.output_dir> --prompt_file <test_prompt_file> 
			
 
				+
			
 
				+```
			
 
				+
			
 
				+
			
 
				 ## Other Inference Options
			
 
				 
			
 
				 Alternate inference options include: