@@ -0,0 +1,306 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "30eb1704-8d76-4bc9-9308-93243aeb69cb",
+ "metadata": {},
+ "source": [
+ "## This demo app shows:\n",
|
|
|
+ "* How to use LlamaIndex, an open source library to help you build custom data augmented LLM applications\n",
|
|
|
+ "* How to ask Llama questions about recent live data via the You.com live search API and LlamaIndex\n",
|
|
|
+ "\n",
|
|
|
+ "The LangChain package is used to facilitate the call to Llama2 hosted on Replicate\n",
|
|
|
+ "\n",
|
|
|
+ "**Note** We will be using Replicate to run the examples here. You will need to first sign in with Replicate with your github account, then create a free API token [here](https://replicate.com/account/api-tokens) that you can use for a while. \n",
|
|
|
+ "After the free trial ends, you will need to enter billing info to continue to use Llama2 hosted on Replicate."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "68cf076e",
+ "metadata": {},
+ "source": [
+ "We start by installing the necessary packages:\n",
|
|
|
+ "- [langchain](https://python.langchain.com/docs/get_started/introduction) which provides RAG capabilities\n",
|
|
|
+ "- [llama-index](https://docs.llamaindex.ai/en/stable/) for data augmentation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1d0005d6-e928-4d1a-981b-534a40e19e56",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!pip install llama-index langchain"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "21fe3849",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# use ServiceContext to configure the LLM used and the custom embeddings \n",
|
|
|
+ "from llama_index import ServiceContext\n",
|
|
|
+ "\n",
|
|
|
+ "# VectorStoreIndex is used to index custom data \n",
|
|
|
+ "from llama_index import VectorStoreIndex\n",
|
|
|
+ "\n",
|
|
|
+ "from langchain.llms import Replicate"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "73e8e661",
+ "metadata": {},
+ "source": [
+ "Next we set up the Replicate token."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "d9d76e33",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from getpass import getpass\n",
|
|
|
+ "import os\n",
|
|
|
+ "\n",
|
|
|
+ "REPLICATE_API_TOKEN = getpass()\n",
|
|
|
+ "os.environ[\"REPLICATE_API_TOKEN\"] = REPLICATE_API_TOKEN"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f8ff812b",
+ "metadata": {},
+ "source": [
+ "In this example we will use the [YOU.com](https://you.com/) search engine to augment the LLM's responses.\n",
|
|
|
+ "To use the You.com Search API, you can email api@you.com to request an API key. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "75275628-5235-4b55-8033-601c76107528",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "\n",
|
|
|
+ "YOUCOM_API_KEY = getpass()\n",
|
|
|
+ "os.environ[\"YOUCOM_API_KEY\"] = YOUCOM_API_KEY"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cb210c7c",
+ "metadata": {},
+ "source": [
+ "We then call the Llama 2 model from replicate. \n",
|
|
|
+ "\n",
|
|
|
+ "We will use the llama 2 13b chat model. You can find more Llama 2 models by searching for them on the [Replicate model explore page](https://replicate.com/explore?query=llama).\n",
|
|
|
+ "You can add them here in the format: model_name/version"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c12fc2cb",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# set llm to be using Llama2 hosted on Replicate\n",
|
|
|
+ "llama2_13b_chat = \"meta/llama-2-13b-chat:f4e2de70d66816a838a89eeeb621910adffb0dd0baba3976c96980970978018d\"\n",
|
|
|
+ "\n",
|
|
|
+ "llm = Replicate(\n",
|
|
|
+ " model=llama2_13b_chat,\n",
|
|
|
+ " model_kwargs={\"temperature\": 0.01, \"top_p\": 1, \"max_new_tokens\":500}\n",
|
|
|
+ ")"
|
|
|
+ ]
|
|
|
+ },
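+ {
+ "cell_type": "markdown",
+ "id": "3f2a9c1e-0b7d-4c5a-9e8f-5a1b2c3d4e01",
+ "metadata": {},
+ "source": [
+ "Before wiring the model into an index, we can sanity-check the Replicate endpoint with a direct call. This is a minimal sketch; the prompt is just an example."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4a3b0d2f-1c8e-4d6b-8f9a-6b2c3d4e5f02",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# quick sanity check that the Replicate-hosted model responds\n",
+ "print(llm(\"In one sentence, what is Meta Connect?\"))"
+ ]
+ },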
+ {
+ "cell_type": "markdown",
+ "id": "476d72da",
+ "metadata": {},
+ "source": [
+ "Using our api key we set up earlier, we make a request from YOU.com for live data on a particular topic."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "effc9656-b18d-4d24-a80b-6066564a838b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "\n",
|
|
|
+ "import requests\n",
|
|
|
+ "\n",
|
|
|
+ "query = \"Meta Connect\" # you can try other live data query about sports score, stock market and weather info \n",
|
|
|
+ "headers = {\"X-API-Key\": os.environ[\"YOUCOM_API_KEY\"]}\n",
|
|
|
+ "data = requests.get(\n",
|
|
|
+ " f\"https://api.ydc-index.io/search?query={query}\",\n",
|
|
|
+ " headers=headers,\n",
|
|
|
+ ").json()"
|
|
|
+ ]
|
|
|
+ },
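+ {
+ "cell_type": "markdown",
+ "id": "5b4c1e3a-2d9f-4e7c-9a0b-7c3d4e5f6a03",
+ "metadata": {},
+ "source": [
+ "The call above assumes a successful response. A slightly more defensive version of the same request (a sketch) surfaces HTTP errors, such as a 401 from a bad key, before parsing the JSON."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6c5d2f4b-3e0a-4f8d-8b1c-8d4e5f6a7b04",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "resp = requests.get(\n",
+ "    f\"https://api.ydc-index.io/search?query={query}\",\n",
+ "    headers=headers,\n",
+ ")\n",
+ "# raise early on HTTP errors (e.g. 401 for a bad or missing API key)\n",
+ "resp.raise_for_status()\n",
+ "data = resp.json()"
+ ]
+ },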
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8bed3baf-742e-473c-ada1-4459012a8a2c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# check the query result in JSON\n",
|
|
|
+ "import json\n",
|
|
|
+ "\n",
|
|
|
+ "print(json.dumps(data, indent=2))"
|
|
|
+ ]
|
|
|
+ },
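+ {
+ "cell_type": "markdown",
+ "id": "7d6e3a5c-4f1b-4a9e-9c2d-9e5f6a7b8c05",
+ "metadata": {},
+ "source": [
+ "The fields we care about are the `snippets` lists inside each item of `data[\"hits\"]`. A quick peek at the shape, assuming the query returned at least one hit with at least one snippet:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8e7f4b6d-5a2c-4b0f-8d3e-0f6a7b8c9d06",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# how many hits came back, and a sample snippet from the first one\n",
+ "print(len(data[\"hits\"]))\n",
+ "print(data[\"hits\"][0][\"snippets\"][0])"
+ ]
+ },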
+ {
+ "cell_type": "markdown",
+ "id": "b196e697",
+ "metadata": {},
+ "source": [
+ "We then use the [`JSONLoader`](https://llamahub.ai/l/file-json) to extract the text from the returned data. The `JSONLoader` gives us the ability to load the data into LamaIndex.\n",
|
|
|
+ "In the next cell we show how to load the JSON result with key info stored as \"snippets\".\n",
|
|
|
+ "\n",
|
|
|
+ "However, you can also add the snippets in the query result to documents like below:\n",
|
|
|
+ "```python \n",
|
|
|
+ "from llama_index import Document\n",
|
|
|
+ "snippets = [snippet for hit in data[\"hits\"] for snippet in hit[\"snippets\"]]\n",
|
|
|
+ "documents = [Document(text=s) for s in snippets]\n",
|
|
|
+ "```\n",
|
|
|
+ "This can be handy if you just need to add a list of text strings to doc"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7c40e73f-ca13-4f4a-a753-e613df3d389e",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# one way to load the JSON result with key info stored as \"snippets\"\n",
|
|
|
+ "from llama_index import download_loader\n",
|
|
|
+ "\n",
|
|
|
+ "JsonDataReader = download_loader(\"JsonDataReader\")\n",
|
|
|
+ "loader = JsonDataReader()\n",
|
|
|
+ "documents = loader.load_data([hit[\"snippets\"] for hit in data[\"hits\"]])\n"
|
|
|
+ ]
|
|
|
+ },
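+ {
+ "cell_type": "markdown",
+ "id": "9f8a5c7e-6b3d-4c1a-9e4f-1a7b8c9d0e07",
+ "metadata": {},
+ "source": [
+ "A quick look at what the loader produced, assuming at least one document was created:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "0a9b6d8f-7c4e-4d2b-8f5a-2b8c9d0e1f08",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# inspect the documents created from the search snippets\n",
+ "print(f\"number of documents: {len(documents)}\")\n",
+ "print(documents[0].text[:200])"
+ ]
+ },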
+ {
+ "cell_type": "markdown",
+ "id": "8e5e3b4e",
+ "metadata": {},
+ "source": [
+ "With the data set up, we create a vector store for the data and a query engine for it.\n",
|
|
|
+ "\n",
|
|
|
+ "For our embeddings we will use `HuggingFaceEmbeddings` whose default embedding model is sentence-transformers/all-mpnet-base-v2. This model provides a good balance between speed and performance.\n",
|
|
|
+ "To change the default model, call `HuggingFaceEmbeddings(model_name=<another_embedding_model>)`. \n",
|
|
|
+ "\n",
|
|
|
+ "For more info see https://huggingface.co/blog/mteb. "
+ ]
+ },
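+ {
+ "cell_type": "markdown",
+ "id": "1b0c7e9a-8d5f-4e3c-9a6b-3c9d0e1f2a09",
+ "metadata": {},
+ "source": [
+ "For example, to swap in a different embedding model before building the index (a sketch; `all-MiniLM-L6-v2` is just one illustrative smaller and faster choice):"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2c1d8f0b-9e6a-4f4d-8b7c-4d0e1f2a3b10",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
+ "from llama_index import LangchainEmbedding\n",
+ "\n",
+ "# illustrative alternative: a smaller, faster sentence-transformers model\n",
+ "alt_embeddings = LangchainEmbedding(\n",
+ "    HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-MiniLM-L6-v2\")\n",
+ ")"
+ ]
+ },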
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a5de3080-2c4b-479c-baba-793b3bee36ed",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# use HuggingFace embeddings \n",
|
|
|
+ "from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
|
|
|
+ "from llama_index import LangchainEmbedding\n",
|
|
|
+ "\n",
|
|
|
+ "\n",
|
|
|
+ "embeddings = LangchainEmbedding(HuggingFaceEmbeddings())\n",
|
|
|
+ "print(embeddings)\n",
|
|
|
+ "\n",
|
|
|
+ "# create a ServiceContext instance to use Llama2 and custom embeddings\n",
|
|
|
+ "service_context = ServiceContext.from_defaults(llm=llm, chunk_size=800, chunk_overlap=20, embed_model=embeddings)\n",
|
|
|
+ "\n",
|
|
|
+ "# create vector store index from the documents created above\n",
|
|
|
+ "index = VectorStoreIndex.from_documents(documents, service_context=service_context)\n",
|
|
|
+ "\n",
|
|
|
+ "# create query engine from the index\n",
|
|
|
+ "query_engine = index.as_query_engine(streaming=True)"
|
|
|
+ ]
|
|
|
+ },
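+ {
+ "cell_type": "markdown",
+ "id": "3d2e9a1c-0f7b-4a5e-9c8d-5e1f2a3b4c11",
+ "metadata": {},
+ "source": [
+ "Optionally, you can persist the index to disk so the snippets don't have to be re-embedded next time. A sketch using LlamaIndex's storage context; the `./storage` path is an arbitrary choice:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4e3f0b2d-1a8c-4b6f-8d9e-6f2a3b4c5d12",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from llama_index import StorageContext, load_index_from_storage\n",
+ "\n",
+ "# save the index to disk\n",
+ "index.storage_context.persist(persist_dir=\"./storage\")\n",
+ "\n",
+ "# ...and reload it later with the same LLM and embedding configuration\n",
+ "storage_context = StorageContext.from_defaults(persist_dir=\"./storage\")\n",
+ "index = load_index_from_storage(storage_context, service_context=service_context)"
+ ]
+ },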
+ {
+ "cell_type": "markdown",
+ "id": "2c4ea012",
+ "metadata": {},
+ "source": [
+ "We are now ready to ask Llama 2 a question about the live data using our query engine."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "de91a191-d0f2-498e-88dc-b2b43423e0e5",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# ask Llama2 a summary question about the search result\n",
|
|
|
+ "response = query_engine.query(\"give me a summary\")\n",
|
|
|
+ "response.print_response_stream()"
|
|
|
+ ]
|
|
|
+ },
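+ {
+ "cell_type": "markdown",
+ "id": "5f4a1c3e-2b9d-4c7a-9e0f-7a3b4c5d6e13",
+ "metadata": {},
+ "source": [
+ "To see which retrieved snippets grounded the answer, you can inspect the response's source nodes. A sketch using the pre-0.10 LlamaIndex node API:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "6a5b2d4f-3c0e-4d8b-8f1a-8b4c5d6e7f14",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# show the retrieval score and a preview of each source snippet\n",
+ "for node_with_score in response.source_nodes:\n",
+ "    print(node_with_score.score, \"-\", node_with_score.node.get_text()[:150])"
+ ]
+ },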
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "72814b20-06aa-4da8-b4dd-f0b0d74a2ea0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# more questions\n",
+ "query_engine.query(\"what products were announced\").print_response_stream()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a65bc037-a689-476d-b529-0059a27bc949",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "query_engine.query(\"tell me more about Meta AI assistant\").print_response_stream()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "16a56542",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "query_engine.query(\"what are Generative AI stickers\").print_response_stream()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.18"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}