Browse source

update README to refer to llama on prem

Jeff Tang 1 year ago
parent
commit
add9623940
2 changed files with 4 additions and 0 deletions
  1. README.md (+2 −0)
  2. docs/inference.md (+2 −0)

+ 2 - 0
README.md

@@ -193,6 +193,8 @@ This folder contains a series of Llama2-powered apps:
   2. Ask Llama questions about structured data in a DB
   3. Ask Llama questions about live data on the web
 
+**[New] A tutorial on how to deploy [Llama 2 on-prem](./demo_apps/llama-on-prem.md) with vLLM- and TGI-based API services, along with client code in Python.**
+
 # Repository Organization
 This repository is organized in the following way:
 

+ 2 - 0
docs/inference.md

@@ -144,3 +144,5 @@ python examples/vllm/inference.py --model_name <PATH/TO/MODEL/7B>
 ```
 
 [**TGI**](https://github.com/huggingface/text-generation-inference): Text Generation Inference (TGI) is another inference option available to you. For more information on how to set up and use TGI see [here](../examples/hf_text_generation_inference/README.md).
+
+[Here](../demo_apps/llama-on-prem.md) is a complete tutorial on how to use vLLM and TGI to deploy Llama 2 on-prem and interact with the Llama API services.
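The tutorial added above pairs the vLLM/TGI API services with Python client code. A minimal sketch of such a client, assuming vLLM's OpenAI-compatible server is running locally on port 8000 (the URL, port, and model name here are assumptions, not taken from the tutorial itself):

```python
import json
import urllib.request

# Assumed local endpoint: vLLM's OpenAI-compatible server, e.g. started with
#   python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-chat-hf
VLLM_URL = "http://localhost:8000/v1/completions"

def build_request(prompt: str,
                  model: str = "meta-llama/Llama-2-7b-chat-hf",
                  max_tokens: int = 128) -> dict:
    """Build the JSON payload for a /v1/completions request."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}

def complete(prompt: str) -> str:
    """POST the prompt to the vLLM server and return the generated text."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        VLLM_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The OpenAI-compatible schema returns generations under "choices".
    return body["choices"][0]["text"]

if __name__ == "__main__":
    print(complete("Why is the sky blue?"))
```

A TGI deployment exposes a different schema (`/generate` with an `inputs` field), so the payload builder would need to change accordingly; the tutorial linked above covers both services.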