
Add complete Chatbot example with RAG capability for demo apps (#291)

Chester Hu 1 year ago
parent commit 43771602c9

+ 2 - 1
README.md

@@ -1,6 +1,6 @@
 # Llama 2 Fine-tuning / Inference Recipes, Examples and Demo Apps
 
-**[Update Nov. 14, 2023] We recently released a series of Llama 2 demo apps [here](./demo_apps). These apps show how to run Llama 2 locally, in the cloud, on-prem or with WhatsApp, and how to ask Llama 2 questions in general and about custom data (PDF, DB, or live).**
+**[Update Nov. 16, 2023] We recently released a series of Llama 2 demo apps [here](./demo_apps). These apps show how to run Llama (locally, in the cloud, or on-prem), how to ask Llama questions in general or about custom data (PDF, DB, or live), how to integrate Llama with WhatsApp, and how to implement an end-to-end chatbot with RAG (Retrieval Augmented Generation).**
 
 The 'llama-recipes' repository is a companion to the [Llama 2 model](https://github.com/facebookresearch/llama). The goal of this repository is to provide examples to quickly get started with fine-tuning for domain adaptation and how to run inference for the fine-tuned models. For ease of use, the examples use Hugging Face converted versions of the models. See steps for conversion of the model [here](#model-conversion-to-hugging-face).
 
@@ -184,6 +184,7 @@ This folder contains a series of Llama2-powered apps:
 2. Llama on Google Colab
 3. Llama on Cloud and ask Llama questions about unstructured data in a PDF
 4. Llama on-prem with vLLM and TGI
+5. Llama chatbot with RAG (Retrieval Augmented Generation)
 
 * Specialized Llama use cases:
 1. Ask Llama to summarize video content

File diff suppressed because it is too large
+ 717 - 0
demo_apps/RAG_Chatbot_example/RAG_Chatbot_Example.ipynb


BIN
demo_apps/RAG_Chatbot_example/data/Llama Getting Started Guide.pdf


+ 6 - 0
demo_apps/RAG_Chatbot_example/requirements.txt

@@ -0,0 +1,6 @@
+gradio
+pypdf
+langchain
+sentence-transformers
+faiss-cpu
+text-generation

BIN
demo_apps/RAG_Chatbot_example/vectorstore/db_faiss/index.faiss


BIN
demo_apps/RAG_Chatbot_example/vectorstore/db_faiss/index.pkl
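The requirements above map onto a standard LangChain indexing pipeline, and the `index.faiss`/`index.pkl` files committed above are the artifacts such a pipeline produces. Below is a minimal sketch of how the dependencies fit together, not the notebook's exact code; the embedding model and chunk sizes are illustrative assumptions:

```python
# Sketch: build a FAISS index like the one committed above (index.faiss / index.pkl).
# Assumes the dependencies from requirements.txt; the embedding model and
# chunk sizes are illustrative choices, not necessarily the notebook's.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Load and chunk the bundled PDF (pypdf backs PyPDFLoader)
docs = PyPDFLoader("data/Llama Getting Started Guide.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Embed with sentence-transformers and persist a FAISS index on disk
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
FAISS.from_documents(chunks, embeddings).save_local("vectorstore/db_faiss")
```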


+ 5 - 1
demo_apps/README.md

@@ -6,6 +6,7 @@ This folder contains a series of Llama 2-powered apps:
 2. Llama on Google Colab
 3. Llama on Cloud and ask Llama questions about unstructured data in a PDF
 4. Llama on-prem with vLLM and TGI
+5. Llama chatbot with RAG (Retrieval Augmented Generation)
 
 * Specialized Llama use cases:
 1. Ask Llama to summarize video content
@@ -102,4 +103,7 @@ To see how to query Llama2 and get answers with the Gradio UI both from the note
 
 Then enter your question and click Submit. You'll see the following UI in the notebook or in a browser at http://127.0.0.1:7860:
 
-![](llama2-gradio.png)
+![](llama2-gradio.png)
+
+### [RAG Chatbot Example](RAG_Chatbot_example/RAG_Chatbot_Example.ipynb)
+A complete example of how to build a Llama 2 chatbot, hosted in your browser, that can answer questions based on your own data.
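Since the notebook diff itself is suppressed above, a rough sketch of the query path may help: load the committed FAISS index, retrieve the chunks most relevant to a question, and answer through a Gradio UI backed by a TGI `text-generation` client. The endpoint URL, retrieval depth, and prompt template here are assumptions, not necessarily what the notebook uses:

```python
# Sketch of the chatbot's query path, under the same assumptions as the
# indexing sketch above (embedding model, index path).
import gradio as gr
from text_generation import Client
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
db = FAISS.load_local("vectorstore/db_faiss", embeddings)
client = Client("http://127.0.0.1:8080")  # assumed TGI endpoint serving Llama 2

def answer(question):
    # Retrieve the most relevant chunks and ground the prompt in them
    context = "\n".join(
        d.page_content for d in db.similarity_search(question, k=2)
    )
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return client.generate(prompt, max_new_tokens=256).generated_text

gr.Interface(fn=answer, inputs="text", outputs="text").launch()
```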

+ 2 - 0
demo_apps/llama-on-prem.md

@@ -22,7 +22,9 @@ pip install vllm
 
 Then run `huggingface-cli login` and copy and paste your Hugging Face access token to complete the login.
 
+<!-- markdown-link-check-disable -->
 There are two ways to deploy Llama 2 via vLLM, as a general API server or an OpenAI-compatible server (see [here](https://platform.openai.com/docs/api-reference/authentication) on how the OpenAI API authenticates, but you won't need to provide a real OpenAI API key when running Llama 2 via vLLM in the OpenAI-compatible mode).
+<!-- markdown-link-check-enable -->
 
 ### Deploying Llama 2 as an API Server
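To illustrate the OpenAI-compatible mode described above, here is a hedged sketch using the pre-1.0 `openai` Python client; the model name and default port 8000 are assumptions for a server started along the lines of `python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-chat-hf`:

```python
# Query a local vLLM OpenAI-compatible server with the pre-1.0 openai client.
# Assumes the server is running on localhost:8000 (vLLM's default port).
import openai

openai.api_key = "EMPTY"  # any placeholder works; vLLM does not validate the key
openai.api_base = "http://localhost:8000/v1"

completion = openai.Completion.create(
    model="meta-llama/Llama-2-7b-chat-hf",
    prompt="San Francisco is a",
    max_tokens=64,
)
print(completion.choices[0].text)
```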