{ "cells": [ { "cell_type": "markdown", "id": "30eb1704-8d76-4bc9-9308-93243aeb69cb", "metadata": {}, "source": [ "## This demo app shows:\n", "* How to use LlamaIndex, an open-source library for building custom data-augmented LLM applications\n", "* How to ask Llama 3 questions about recent live data via the [Tavily](https://tavily.com) live search API" ] }, { "cell_type": "code", "execution_count": null, "id": "1d0005d6-e928-4d1a-981b-534a40e19e56", "metadata": {}, "outputs": [], "source": [ "!pip install llama-index\n", "!pip install llama-index-core\n", "!pip install llama-index-llms-replicate\n", "!pip install llama-index-embeddings-huggingface\n", "!pip install tavily-python" ] }, { "cell_type": "markdown", "id": "83639e83-2baa-4156-93a2-b9b6d4baf7d6", "metadata": {}, "source": [ "You will use [Replicate](https://replicate.com/meta/meta-llama-3-8b-instruct) to run the examples here. First sign in to Replicate with your GitHub account, then create a free API token [here](https://replicate.com/account/api-tokens); the free token can be used for a limited time. You can also use other Llama 3 cloud providers such as [Groq](https://console.groq.com/), [Together](https://api.together.xyz/playground/language/meta-llama/Llama-3-8b-hf), or [Anyscale](https://app.endpoints.anyscale.com/playground); see Section 2 of the Getting to Know Llama [notebook](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Getting_to_know_Llama.ipynb) for more information.\n", "\n", "If you'd like to run Llama 3 locally for privacy, zero cost, and no rate limits (some Llama 3 hosting providers cap queries or tokens per second or minute on their free plans), see [Running Llama Locally](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb)."
] }, { "cell_type": "code", "execution_count": null, "id": "e6affd70-c909-4340-924f-f282912765d5", "metadata": {}, "outputs": [], "source": [ "from getpass import getpass\n", "import os\n", "\n", "# Prompt for the token so it is never stored in the notebook\n", "REPLICATE_API_TOKEN = getpass(\"Replicate API token: \")\n", "os.environ[\"REPLICATE_API_TOKEN\"] = REPLICATE_API_TOKEN" ] }, { "cell_type": "markdown", "id": "18582e1f-30b1-4dc5-918a-de2995eb5b46", "metadata": {}, "source": [ "You'll set up the Llama 3 8B chat model on Replicate. To use the Llama 3 70B model instead, replace the `model` name with \"meta/meta-llama-3-70b-instruct\"." ] }, { "cell_type": "code", "execution_count": null, "id": "21fe3849", "metadata": {}, "outputs": [], "source": [ "from llama_index.core import Settings, VectorStoreIndex\n", "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n", "from llama_index.llms.replicate import Replicate\n", "\n", "# LLM used by the query engine to generate answers\n", "Settings.llm = Replicate(\n", "    model=\"meta/meta-llama-3-8b-instruct\",\n", "    temperature=0.0,\n", "    additional_kwargs={\"top_p\": 1, \"max_new_tokens\": 500},\n", ")\n", "\n", "# Embedding model used to index and retrieve the documents\n", "Settings.embed_model = HuggingFaceEmbedding(\n", "    model_name=\"BAAI/bge-small-en-v1.5\"\n", ")" ] }, { "cell_type": "markdown", "id": "f8ff812b", "metadata": {}, "source": [ "Next you will use the [Tavily](https://tavily.com/) search engine to augment Llama 3's responses. To get a free trial Tavily Search API key, sign in with your Google or GitHub account [here](https://app.tavily.com/sign-in)." ] }, { "cell_type": "code", "execution_count": null, "id": "75275628-5235-4b55-8033-601c76107528", "metadata": {}, "outputs": [], "source": [ "from tavily import TavilyClient\n", "\n", "TAVILY_API_KEY = getpass(\"Tavily API key: \")\n", "tavily = TavilyClient(api_key=TAVILY_API_KEY)" ] }, { "cell_type": "markdown", "id": "476d72da", "metadata": {}, "source": [ "Do a live web search on \"Llama 3 fine-tuning\"."
] }, { "cell_type": "code", "execution_count": null, "id": "effc9656-b18d-4d24-a80b-6066564a838b", "metadata": {}, "outputs": [], "source": [ "response = tavily.search(query=\"Llama 3 fine-tuning\")\n", "context = [{\"url\": obj[\"url\"], \"content\": obj[\"content\"]} for obj in response['results']]" ] }, { "cell_type": "code", "execution_count": null, "id": "8bed3baf-742e-473c-ada1-4459012a8a2c", "metadata": {}, "outputs": [], "source": [ "context" ] }, { "cell_type": "markdown", "id": "8e5e3b4e", "metadata": {}, "source": [ "Create documents based on the search results, index and save them to a vector store, then create a query engine." ] }, { "cell_type": "code", "execution_count": null, "id": "a5de3080-2c4b-479c-baba-793b3bee36ed", "metadata": {}, "outputs": [], "source": [ "from llama_index.core import Document\n", "\n", "documents = [Document(text=ct['content']) for ct in context]\n", "index = VectorStoreIndex.from_documents(documents)\n", "\n", "query_engine = index.as_query_engine(streaming=True)" ] }, { "cell_type": "markdown", "id": "2c4ea012", "metadata": {}, "source": [ "You are now ready to ask Llama 3 questions about the live data using the query engine." 
] }, { "cell_type": "code", "execution_count": null, "id": "de91a191-d0f2-498e-88dc-b2b43423e0e5", "metadata": {}, "outputs": [], "source": [ "response = query_engine.query(\"give me a summary\")\n", "response.print_response_stream()" ] }, { "cell_type": "code", "execution_count": null, "id": "72814b20-06aa-4da8-b4dd-f0b0d74a2ea0", "metadata": {}, "outputs": [], "source": [ "query_engine.query(\"what's the latest about Llama 3 fine-tuning?\").print_response_stream()" ] }, { "cell_type": "code", "execution_count": null, "id": "a65bc037-a689-476d-b529-0059a27bc949", "metadata": {}, "outputs": [], "source": [ "query_engine.query(\"tell me more about Llama 3 fine-tuning\").print_response_stream()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 5 }