
update based on PR feedback

Jeff Tang, 1 year ago
parent commit c97d28c65c
1 changed file with 3 additions and 2 deletions

+ 3 - 2
demo_apps/llama-on-prem.md

@@ -12,7 +12,8 @@ You'll also need your Hugging Face access token which you can get at your Settin
 
 ## Setting up vLLM with Llama 2
 
-On a terminal, run the following commands:
+On a terminal, run the following commands (note that `pip install vllm` installs vLLM along with all its dependencies, while cloning the vllm repo makes the vLLM API scripts available):
+
 ```
 conda create -n vllm python=3.8
 conda activate vllm
@@ -24,7 +25,7 @@ cd vllm/vllm/entrypoints/
 
 Then run `huggingface-cli login` and copy and paste your Hugging Face access token to complete the login.
 
-There are two ways to deploy Llama 2 via vLLM, as a general API server or an OpenAI-compatible server.
+There are two ways to deploy Llama 2 via vLLM: as a general API server or as an OpenAI-compatible server (see [here](https://platform.openai.com/docs/api-reference/authentication) for how the OpenAI API handles authentication; you won't need a real OpenAI API key when running Llama 2 via vLLM in OpenAI-compatible mode).
 
 ### Deploying Llama 2 as an API Server
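
The OpenAI-compatible mode described in the added line above can be exercised with a short client sketch. This is a minimal illustration, not part of the diff: it assumes a vLLM OpenAI-compatible server running locally on its default port 8000, and the model id `meta-llama/Llama-2-7b-chat-hf` is an assumed example. The request is only built here; the actual send is commented out since it requires a running server.

```python
import json

# Assumed local endpoint; vLLM's OpenAI-compatible server listens on port 8000 by default.
API_URL = "http://localhost:8000/v1/completions"

headers = {
    "Content-Type": "application/json",
    # A dummy bearer token: vLLM does not validate the key, unlike the real OpenAI API.
    "Authorization": "Bearer EMPTY",
}

payload = {
    "model": "meta-llama/Llama-2-7b-chat-hf",  # assumed example model id
    "prompt": "Who wrote the book Innovator's Dilemma?",
    "max_tokens": 128,
}

body = json.dumps(payload)

# To actually send the request (requires a running vLLM server):
# import requests
# response = requests.post(API_URL, headers=headers, data=body)
# print(response.json()["choices"][0]["text"])
```

The same request shape works against the real OpenAI API by swapping the URL and supplying a genuine key, which is the point of the OpenAI-compatible mode.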