@@ -6,7 +6,7 @@ This folder contains an example file to run Llama Guard inference directly.
## Requirements
1. Access to Llama Guard model weights on Hugging Face. To get access, follow the steps described [here](https://github.com/facebookresearch/PurpleLlama/tree/main/Llama-Guard#download)
-2. Llama recipes dependencies installed
+2. Llama recipes package and its dependencies [installed](https://github.com/albertodepaola/llama-recipes/blob/llama-guard-data-formatter-example/README.md#installation)
3. A GPU with at least 21 GB of free RAM to load both 7B models quantized.
## Llama Guard inference script
@@ -34,8 +34,27 @@ To run the samples, with all the dependencies installed, execute this command:
`python examples/llama_guard/inference.py`
+This is the output:
+
+```
+['<Sample user prompt>']
+> safe
+
+==================================
+
+['<Sample user prompt>', '<Sample agent response>']
+> safe
+
+==================================
+
+['<Sample user prompt>', '<Sample agent response>', '<Sample user reply>', '<Sample agent response>']
+> safe
+
+==================================
+```
+
## Inference Safety Checker
-When running the regular inference script with prompts, Llama Guard will be used as a safety checker on the user prompt and the model output. If both are safe, the result will be show, else a message with the error will be show, with the word unsafe and a comma separated list of categories infringed. Llama Guard is always loaded quantized using Hugging Face Transformers library.
+When running the regular inference script with prompts, Llama Guard will be used as a safety checker on the user prompt and the model output. If both are safe, the result will be shown; otherwise, an error message will be shown containing the word unsafe and a comma-separated list of the categories infringed. Llama Guard is always loaded quantized using the Hugging Face Transformers library.
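As a rough sketch of what "loaded quantized" means here (the model id and the 8-bit arguments are assumptions for illustration, not the script's actual configuration):

```python
# Minimal sketch, not the repository's inference code: load Llama Guard with
# 8-bit weights so it fits on a single GPU next to the chat model.
# The model id and quantization arguments below are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"  # assumed Hugging Face model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,  # quantized load to reduce GPU memory vs. fp16
    device_map="auto",  # let Transformers place the weights on the available GPU
)
```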
In this case, the default categories are applied by the tokenizer, using the `apply_chat_template` method.
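As an illustrative sketch (the model id is an assumption, and the conversation reuses the sample placeholders from above), the template-built prompt can be inspected like this:

```python
# Minimal sketch: the tokenizer's chat template wraps the conversation in the
# default Llama Guard prompt, so the built-in unsafe-content categories do not
# have to be passed in manually. The model id is an assumption.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/LlamaGuard-7b")

chat = [
    {"role": "user", "content": "<Sample user prompt>"},
    {"role": "assistant", "content": "<Sample agent response>"},
]

input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt")
print(tokenizer.decode(input_ids[0]))  # prints the full prompt, including the default categories
```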