1 year ago · 64bea6e60b
--- a/recipes/responsible_ai/llama_guard/README.md
+++ b/recipes/responsible_ai/llama_guard/README.md
@@ -1,13 +1,13 @@
 
				-# Llama Guard demo
			
 
				+# Meta Llama Guard 1 & 2 demo
			
 
				 <!-- markdown-link-check-disable -->
			
 
				-Llama Guard is a language model that provides input and output guardrails for LLM deployments. For more details, please visit the main [repository](https://github.com/facebookresearch/PurpleLlama/tree/main/Llama-Guard).
			
 
				+Meta Llama Guard is a language model that provides input and output guardrails for LLM inference. For more details and model cards, please visit the main repository for each model: [Meta Llama Guard 1](https://github.com/meta-llama/PurpleLlama/tree/main/Llama-Guard) and [Meta Llama Guard 2](https://github.com/meta-llama/PurpleLlama/tree/main/Llama-Guard2).
			
 
				 
			
 
				-This folder contains an example file to run Llama Guard inference directly. 
			
 
				+This folder contains an example file to run inference with a locally hosted model, either using the Hugging Face Hub or a local path. 
			
 
				 
			
 
				 ## Requirements
			
 
				 1. Access to Llama Guard model weights on Hugging Face. To get access, follow the steps described [here](https://github.com/facebookresearch/PurpleLlama/tree/main/Llama-Guard#download)
			
 
				-2. Llama recipes package and it's dependencies [installed](https://github.com/albertodepaola/llama-recipes/blob/llama-guard-data-formatter-example/README.md#installation)
			
 
				-3. A GPU with at least 21 GB of free RAM to load both 7B models quantized.
			
 
				+2. Llama recipes package and its dependencies [installed](https://github.com/meta-llama/llama-recipes?tab=readme-ov-file#installing)
			
 
				+
			
 
				 
			
 
				 ## Llama Guard inference script
			
 
				 For testing, you can add User or User/Agent interactions to the prompts list and then run the script to verify the results. When the conversation has one or more Agent responses, it is considered to be of type agent.
			
@@ -27,12 +27,12 @@ For testing, you can add User or User/Agent interactions into the prompts list a
 
				 
			
 
				     ]
			
 
				 ```
			
 
				-The complete prompt is built with the `build_prompt` function, defined in [prompt_format.py](../../src/llama_recipes/inference/prompt_format.py). The file contains the default Llama Guard  categories. These categories can adjusted and new ones can be added, as described in the [research paper](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/), on section 4.5 Studying the adaptability of the model.
			
 
				+The complete prompt is built with the `build_custom_prompt` function, defined in [prompt_format_utils.py](../../../src/llama_recipes/inference/prompt_format_utils.py). The file contains the default Meta Llama Guard categories. These categories can be adjusted and new ones can be added, as described in the [research paper](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/), in section 4.5, Studying the adaptability of the model.
			
 
				 <!-- markdown-link-check-enable -->
			
 
				 
			
 
				 To run the samples, with all the dependencies installed, execute this command:
			
 
				 
			
 
				-`python examples/llama_guard/inference.py`
			
 
				+`python recipes/responsible_ai/llama_guard/inference.py`
			
 
				 
			
 
				 This is the output:
			
 
				 
			
@@ -53,8 +53,14 @@ This is the output:
 
				 ==================================
			
 
				 ```
			
 
				 
			
 
				+To run it with a local model, you can use the `model_id` param in the inference script:
			
 
				+
			
 
				+`python recipes/responsible_ai/llama_guard/inference.py --model_id=/home/ubuntu/models/llama3/llama_guard_2-hf/ --llama_guard_version=LLAMA_GUARD_2`
			
 
				+
			
 
				+Note: Make sure to also pass `llama_guard_version` when it does not match the default. The script allows you to run the prompt format from Meta Llama Guard 1 on Meta Llama Guard 2.
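To illustrate why the version flag is worth validating, here is a minimal sketch that resolves a version string such as `LLAMA_GUARD_2` to an enum member by name and fails early on an unknown value. `GuardVersion` and `resolve_version` are hypothetical names for this example and are not taken from the inference script; the `LLAMA_GUARD_1` member name is also an assumption, made by analogy with the `LLAMA_GUARD_2` value shown in the command above.

```python
from enum import Enum


class GuardVersion(Enum):
    """Hypothetical stand-in for the versions the script accepts."""

    LLAMA_GUARD_1 = "Llama Guard 1"  # assumed name, by analogy with LLAMA_GUARD_2
    LLAMA_GUARD_2 = "Llama Guard 2"


def resolve_version(name: str) -> GuardVersion:
    """Look up a version by its member name, e.g. 'LLAMA_GUARD_2'."""
    try:
        return GuardVersion[name]
    except KeyError:
        # List the accepted names so a typo on the command line is easy to fix.
        valid = ", ".join(member.name for member in GuardVersion)
        raise ValueError(
            f"Invalid version '{name}'. Valid values are: {valid}"
        ) from None
```

Looking the value up by member name means a misspelled flag is rejected before any model weights are loaded, but it does not detect a version that is valid yet mismatched with the model at `model_id`.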
			
 
				+
			
 
				 ## Inference Safety Checker
			
 
				-When running the regular inference script with prompts, Llama Guard will be used as a safety checker on the user prompt and the model output. If both are safe, the result will be shown, else a message with the error will be shown, with the word unsafe and a comma separated list of categories infringed. Llama Guard is always loaded quantized using Hugging Face Transformers library.
			
 
				+When running the regular inference script with prompts, Meta Llama Guard is used as a safety checker on both the user prompt and the model output. If both are safe, the result is shown. Otherwise, an error message is shown containing the word `unsafe` and a comma-separated list of the categories infringed. Meta Llama Guard is always loaded quantized, using the Hugging Face Transformers library with bitsandbytes.
			
 
				 
			
 
				 In this case, the default categories are applied by the tokenizer, using the `apply_chat_template` method.
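The verdict described in this section (the word `unsafe` plus a comma-separated list of categories) can be turned into structured data with a small parser. The sketch below is not part of the recipe. It assumes the generated text has `safe` or `unsafe` on its first line and, when unsafe, the category codes on the next line. `parse_guard_output` is a hypothetical helper name, and category codes such as `O1` are used only as sample values.

```python
from typing import List, Tuple


def parse_guard_output(text: str) -> Tuple[bool, List[str]]:
    """Return (is_safe, categories) for a generated verdict.

    Assumes the first non-empty line is 'safe' or 'unsafe' and that, when
    unsafe, the following line lists the categories separated by commas.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty verdict")

    verdict = lines[0].lower()
    if verdict == "safe":
        return True, []
    if verdict != "unsafe":
        raise ValueError(f"unexpected verdict: {lines[0]!r}")

    categories: List[str] = []
    if len(lines) > 1:
        # Split the category line on commas and drop stray whitespace.
        categories = [c.strip() for c in lines[1].split(",") if c.strip()]
    return False, categories
```

Raising on anything other than `safe` or `unsafe` treats an unparseable verdict as an error instead of silently counting it as safe.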