
Add complete Chatbot example with RAG capability for demo apps (#291)

Chester Hu 1 year ago
parent commit 43771602c9

+ 2 - 1
README.md

@@ -1,6 +1,6 @@
 # Llama 2 Fine-tuning / Inference Recipes, Examples and Demo Apps
 
-**[Update Nov. 14, 2023] We recently released a series of Llama 2 demo apps [here](./demo_apps). These apps show how to run Llama 2 locally, in the cloud, on-prem or with WhatsApp, and how to ask Llama 2 questions in general and about custom data (PDF, DB, or live).**
+**[Update Nov. 16, 2023] We recently released a series of Llama 2 demo apps [here](./demo_apps). These apps show how to run Llama (locally, in the cloud, or on-prem), how to ask Llama questions in general or about custom data (PDF, DB, or live), how to integrate Llama with WhatsApp, and how to implement an end-to-end chatbot with RAG (Retrieval Augmented Generation).**
 
 The 'llama-recipes' repository is a companion to the [Llama 2 model](https://github.com/facebookresearch/llama). The goal of this repository is to provide examples to quickly get started with fine-tuning for domain adaptation and how to run inference for the fine-tuned models. For ease of use, the examples use Hugging Face converted versions of the models. See steps for conversion of the model [here](#model-conversion-to-hugging-face).
 
@@ -184,6 +184,7 @@ This folder contains a series of Llama2-powered apps:
 2. Llama on Google Colab
 3. Llama on Cloud and ask Llama questions about unstructured data in a PDF
 4. Llama on-prem with vLLM and TGI
+5. Llama chatbot with RAG (Retrieval Augmented Generation)
 
 * Specialized Llama use cases:
 1. Ask Llama to summarize video content

File diff suppressed because it is too large
+ 717 - 0
demo_apps/RAG_Chatbot_example/RAG_Chatbot_Example.ipynb


BIN
demo_apps/RAG_Chatbot_example/data/Llama Getting Started Guide.pdf


+ 6 - 0
demo_apps/RAG_Chatbot_example/requirements.txt

@@ -0,0 +1,6 @@
+gradio
+pypdf
+langchain
+sentence-transformers
+faiss-cpu
+text-generation

BIN
demo_apps/RAG_Chatbot_example/vectorstore/db_faiss/index.faiss


BIN
demo_apps/RAG_Chatbot_example/vectorstore/db_faiss/index.pkl
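The requirements above map onto a standard LangChain indexing pipeline, and the `index.faiss`/`index.pkl` files committed above are the artifacts such a pipeline produces. Below is a minimal sketch of how the dependencies fit together, not the notebook's exact code; the embedding model and chunk sizes are illustrative assumptions:

```python
# Sketch: build a FAISS index like the one committed above (index.faiss / index.pkl).
# Assumes the dependencies from requirements.txt; the embedding model and
# chunk sizes are illustrative choices, not necessarily the notebook's.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Load and chunk the bundled PDF (pypdf backs PyPDFLoader)
docs = PyPDFLoader("data/Llama Getting Started Guide.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

# Embed with sentence-transformers and persist a FAISS index on disk
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
FAISS.from_documents(chunks, embeddings).save_local("vectorstore/db_faiss")
```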


+ 5 - 1
demo_apps/README.md

@@ -6,6 +6,7 @@ This folder contains a series of Llama 2-powered apps:
 2. Llama on Google Colab
 3. Llama on Cloud and ask Llama questions about unstructured data in a PDF
 4. Llama on-prem with vLLM and TGI
+5. Llama chatbot with RAG (Retrieval Augmented Generation)
 
 * Specialized Llama use cases:
 1. Ask Llama to summarize video content
@@ -102,4 +103,7 @@ To see how to query Llama2 and get answers with the Gradio UI both from the note
 
 Then enter your question and click Submit. You'll see the following UI in the notebook or in a browser at http://127.0.0.1:7860:
 
-![](llama2-gradio.png)
+![](llama2-gradio.png)
+
+### [RAG Chatbot Example](RAG_Chatbot_example/RAG_Chatbot_Example.ipynb)
+A complete example of how to build a Llama 2 chatbot, hosted in your browser, that can answer questions based on your own data.
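Since the notebook diff itself is suppressed above, a rough sketch of the query path may help: load the committed FAISS index, retrieve the chunks most relevant to a question, and answer through a Gradio UI backed by a TGI `text-generation` client. The endpoint URL, retrieval depth, and prompt template here are assumptions, not necessarily what the notebook uses:

```python
# Sketch of the chatbot's query path, under the same assumptions as the
# indexing sketch above (embedding model, index path).
import gradio as gr
from text_generation import Client
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
db = FAISS.load_local("vectorstore/db_faiss", embeddings)
client = Client("http://127.0.0.1:8080")  # assumed TGI endpoint serving Llama 2

def answer(question):
    # Retrieve the most relevant chunks and ground the prompt in them
    context = "\n".join(
        d.page_content for d in db.similarity_search(question, k=2)
    )
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return client.generate(prompt, max_new_tokens=256).generated_text

gr.Interface(fn=answer, inputs="text", outputs="text").launch()
```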

+ 2 - 0
demo_apps/llama-on-prem.md

@@ -22,7 +22,9 @@ pip install vllm
 
 Then run `huggingface-cli login` and copy and paste your Hugging Face access token to complete the login.
 
+<!-- markdown-link-check-disable -->
 There are two ways to deploy Llama 2 via vLLM, as a general API server or an OpenAI-compatible server (see [here](https://platform.openai.com/docs/api-reference/authentication) on how the OpenAI API authenticates, but you won't need to provide a real OpenAI API key when running Llama 2 via vLLM in the OpenAI-compatible mode).
+<!-- markdown-link-check-enable -->
 
 ### Deploying Llama 2 as an API Server
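To illustrate the OpenAI-compatible mode described above, here is a hedged sketch using the pre-1.0 `openai` Python client; the model name and default port 8000 are assumptions for a server started along the lines of `python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-7b-chat-hf`:

```python
# Query a local vLLM OpenAI-compatible server with the pre-1.0 openai client.
# Assumes the server is running on localhost:8000 (vLLM's default port).
import openai

openai.api_key = "EMPTY"  # any placeholder works; vLLM does not validate the key
openai.api_base = "http://localhost:8000/v1"

completion = openai.Completion.create(
    model="meta-llama/Llama-2-7b-chat-hf",
    prompt="San Francisco is a",
    max_tokens=64,
)
print(completion.choices[0].text)
```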