Browse source

update README to refer to llama on prem

Jeff Tang 1 year ago
parent
commit
add9623940
2 changed files with 4 additions and 0 deletions
  1. README.md (+2 −0)
  2. docs/inference.md (+2 −0)

+ 2 - 0
README.md

@@ -193,6 +193,8 @@ This folder contains a series of Llama2-powered apps:
   2. Ask Llama questions about structured data in a DB
   3. Ask Llama questions about live data on the web
 
+**[New] A tutorial on how to deploy [Llama 2 on-prem](./demo_apps/llama-on-prem.md) with vLLM- and TGI-based API services, along with client code in Python.**
+
 # Repository Organization
 This repository is organized in the following way:
 

+ 2 - 0
docs/inference.md

@@ -144,3 +144,5 @@ python examples/vllm/inference.py --model_name <PATH/TO/MODEL/7B>
 ```
 
 [**TGI**](https://github.com/huggingface/text-generation-inference): Text Generation Inference (TGI) is another inference option available to you. For more information on how to set up and use TGI see [here](../examples/hf_text_generation_inference/README.md).
+
+[Here](../demo_apps/llama-on-prem.md) is a complete tutorial on how to use vLLM and TGI to deploy Llama 2 on-prem and interact with the Llama API services.
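The tutorial added above pairs the vLLM/TGI API services with Python client code. A minimal sketch of such a client, assuming vLLM's OpenAI-compatible server is running locally on port 8000 (the URL, port, and model name here are assumptions, not taken from the tutorial itself):

```python
import json
import urllib.request

# Assumed local endpoint: vLLM's OpenAI-compatible server, e.g. started with
#   python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-chat-hf
VLLM_URL = "http://localhost:8000/v1/completions"

def build_request(prompt: str,
                  model: str = "meta-llama/Llama-2-7b-chat-hf",
                  max_tokens: int = 128) -> dict:
    """Build the JSON payload for a /v1/completions request."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}

def complete(prompt: str) -> str:
    """POST the prompt to the vLLM server and return the generated text."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        VLLM_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The OpenAI-compatible schema returns generations under "choices".
    return body["choices"][0]["text"]

if __name__ == "__main__":
    print(complete("Why is the sky blue?"))
```

A TGI deployment exposes a different schema (`/generate` with an `inputs` field), so the payload builder would need to change accordingly; the tutorial linked above covers both services.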