@@ -12,7 +12,8 @@ You'll also need your Hugging Face access token which you can get at your Settin
## Setting up vLLM with Llama 2
-On a terminal, run the following commands:
+On a terminal, run the following commands (note that `pip install vllm` installs vLLM and all its dependencies, while cloning the vllm repo makes the vLLM API scripts available):
+
```
conda create -n vllm python=3.8
conda activate vllm
@@ -24,7 +25,7 @@ cd vllm/vllm/entrypoints/
Then run `huggingface-cli login` and copy and paste your Hugging Face access token to complete the login.
-There are two ways to deploy Llama 2 via vLLM, as a general API server or an OpenAI-compatible server.
+There are two ways to deploy Llama 2 via vLLM: as a general API server or as an OpenAI-compatible server (see [here](https://platform.openai.com/docs/api-reference/authentication) for how the OpenAI API authenticates; you won't need a real OpenAI API key when running Llama 2 via vLLM in OpenAI-compatible mode).
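
As a sketch of the OpenAI-compatible mode: the snippet below builds an OpenAI-style `/v1/completions` request with a placeholder API key. The server URL (`localhost:8000`) and model ID are assumptions here; adjust them to match your own deployment.

```python
import json

# Assumptions (adjust to your deployment): the vLLM OpenAI-compatible server
# listens on localhost:8000 and serves this model ID.
API_URL = "http://localhost:8000/v1/completions"
MODEL = "meta-llama/Llama-2-7b-chat-hf"

def build_completion_request(prompt, max_tokens=128):
    """Build headers and JSON body for an OpenAI-style /v1/completions call.

    vLLM's OpenAI-compatible server follows the OpenAI request schema, but no
    real OpenAI key is checked, so a placeholder bearer token is enough.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer EMPTY",  # placeholder, not a real OpenAI key
    }
    body = json.dumps({"model": MODEL, "prompt": prompt, "max_tokens": max_tokens})
    return headers, body

headers, body = build_completion_request("What is vLLM?")
print(body)
```

Once the server is running, POST this body to the URL above (for example with `curl` or `requests`); the `EMPTY` token simply stands in for the OpenAI key the client library would otherwise require.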
### Deploying Llama 2 as an API Server