## This demo app shows:
* How to use LlamaIndex, an open-source library that helps you build custom data-augmented LLM applications.
* How to ask Llama questions about recent live data via the You.com live search API and LlamaIndex.

The LangChain package is used to facilitate the call to Llama2 hosted on Replicate.

**Note:** We will use Replicate to run the examples here. First sign in to Replicate with your GitHub account, then create a free API token [here](https://replicate.com/account/api-tokens) to use during the free trial.
After the free trial ends, you will need to enter billing info to continue using Llama2 hosted on Replicate.

We start by installing the necessary packages:
- [langchain](https://python.langchain.com/docs/get_started/introduction), which provides the Replicate LLM wrapper and the Hugging Face embeddings used below.
- [llama-index](https://docs.llamaindex.ai/en/stable/), which indexes the custom data and answers queries over it.

In [None]:
!pip install llama-index langchain

In [1]:
# use ServiceContext to configure the LLM used and the custom embeddings 
from llama_index import ServiceContext

# VectorStoreIndex is used to index custom data 
from llama_index import VectorStoreIndex

from langchain.llms import Replicate

Next we set up the Replicate token.

In [2]:
from getpass import getpass
import os

REPLICATE_API_TOKEN = getpass()
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN

 ········


In this example we will use the [You.com](https://you.com/) search engine to augment the LLM's responses.
To use the You.com Search API, you can email api@you.com to request an API key.

In [3]:

YOUCOM_API_KEY = getpass()
os.environ["YOUCOM_API_KEY"] = YOUCOM_API_KEY

 ········
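
If the keys are already exported in your shell, prompting for them again is unnecessary. The sketch below shows one way to prompt only when a variable is missing. `get_secret` is a hypothetical helper introduced for illustration; it is not part of LangChain, LlamaIndex, or Replicate.

```python
import os
from getpass import getpass


def get_secret(name: str) -> str:
    """Return the value of environment variable `name`, prompting only if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        # Fall back to an interactive prompt that does not echo the input
        value = getpass(f"{name}: ")
        os.environ[name] = value
    return value
```

For example, `get_secret("REPLICATE_API_TOKEN")` returns the exported token without prompting, and asks for it otherwise.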


We then call the Llama 2 model hosted on Replicate. In this example we will use the Llama 2 13B chat model. You can find more Llama 2 models by searching for them on the [Replicate model explore page](https://replicate.com/explore?query=llama).
To use one of them here, specify it in the format `owner/model_name:version`, as in the cell below.
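
The identifier assigned in the next cell has the shape `owner/name:version`. As a minimal sketch, a malformed identifier can be caught before it is sent to Replicate by splitting it into its parts. `split_model_id` is a hypothetical helper for illustration, not a Replicate or LangChain function, and the version string in the usage note is a made-up placeholder.

```python
from typing import Tuple


def split_model_id(model_id: str) -> Tuple[str, str, str]:
    """Split 'owner/name:version' into (owner, name, version)."""
    path, colon, version = model_id.partition(":")
    owner, slash, name = path.partition("/")
    # Every part must be present, otherwise the identifier is malformed
    if not (colon and slash and owner and name and version):
        raise ValueError(f"expected 'owner/name:version', got {model_id!r}")
    return owner, name, version
```

Calling `split_model_id("meta/llama-2-13b-chat:abc123")` returns `("meta", "llama-2-13b-chat", "abc123")`, while an identifier without a version raises `ValueError`.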

In [4]:
# set llm to be using Llama2 hosted on Replicate
llama2_13b_chat = "meta/llama-2-13b-chat:f4e2de70d66816a838a89eeeb621910adffb0dd0baba3976c96980970978018d"

llm = Replicate(
    model=llama2_13b_chat,
    model_kwargs={"temperature": 0.01, "top_p": 1, "max_new_tokens":500}
)

Using the API key we set up earlier, we request live data on a particular topic from You.com.

In [5]:

import requests

query = "Meta Connect"  # try other live-data queries, e.g. sports scores, stock market or weather info
headers = {"X-API-Key": os.environ["YOUCOM_API_KEY"]}
data = requests.get(
    "https://api.ydc-index.io/search",
    params={"query": query},  # let requests URL-encode the query text
    headers=headers,
).json()
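
The query text travels as a URL query parameter, so spaces and characters such as `&` have to be encoded. The sketch below uses only the standard library to show what the encoded URL looks like. `build_search_url` is a hypothetical helper for illustration; the endpoint is the one used in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.ydc-index.io/search"  # endpoint used in this notebook


def build_search_url(query: str) -> str:
    """Return the search URL with the query text safely encoded."""
    return f"{BASE_URL}?{urlencode({'query': query})}"


print(build_search_url("Meta Connect"))
# https://api.ydc-index.io/search?query=Meta+Connect
```

Without encoding, a query such as `S&P 500` would be split at the `&` and only `S` would reach the search API as the query.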

In [6]:
# check the query result in JSON
import json

print(json.dumps(data, indent=2))

{
  "hits": [
    {
      "description": "A two-day virtual event focused on AI and virtual, mixed and augmented realities.",
      "snippets": [
        "About Meta Connect\nWho should attend?\n<p>Everyone is invited to join us virtually for Connect, where you will: </p><ul><li>Get an in-depth look at new Meta products and hear how the metaverse is coming alive today</li><li>Experience the latest in AI innovation</li><li>Learn how to be the first to get your hands on Meta's new products and technologies</li><li>Hear from Meta\u2019s community of developers, builders and creators</li></ul>\nIs there a cost to attend?\n<p>Connect is free! You can catch all the content on Facebook. Participating in Meta Horizon Worlds will require a Quest device.</p>\nWill sessions be recorded?\n<p>Yes, select sessions will be recorded and available for on-demand viewing after the event.</p>",
        "Expanding reality, today and tomorrow\nJoin us virtually September 27 - 28, 2023\nWays to watch\nJoin u

We then use the [JSONLoader](https://llamahub.ai/l/file-json) to extract the text from the returned data. The JSONLoader lets us load the data into LlamaIndex.
In this example we show how to load the JSON result with key info stored as "snippets".

Alternatively, you can build the documents directly from the snippets in the query result, for example:
```python 
from llama_index import Document
snippets = [snippet for hit in data["hits"] for snippet in hit["snippets"]]
documents = [Document(text=s) for s in snippets]
```
This can be handy if you just need to turn a list of text strings into documents.
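
The nested comprehension above assumes that every hit carries a `"snippets"` list. Below is a slightly more defensive sketch that runs on made-up sample data shaped like the search result. `collect_snippets` and `sample` are illustrative names, not part of the You.com API or LlamaIndex.

```python
# Made-up data that mimics the shape of the search result
sample = {
    "hits": [
        {"description": "first hit", "snippets": ["a", "b"]},
        {"description": "hit without snippets"},
        {"description": "third hit", "snippets": ["c"]},
    ]
}


def collect_snippets(result: dict) -> list:
    """Flatten the snippets of all hits, skipping hits that have none."""
    return [s for hit in result.get("hits", []) for s in hit.get("snippets", [])]


sample_snippets = collect_snippets(sample)
print(sample_snippets)  # ['a', 'b', 'c']
```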

In [7]:
# one way to load the JSON result with key info stored as "snippets"
from llama_index import download_loader

JsonDataReader = download_loader("JsonDataReader")
loader = JsonDataReader()
documents = loader.load_data([hit["snippets"] for hit in data["hits"]])


With the data set up, we create a vector store for the data and a query engine for it.

For our embeddings we will use `HuggingFaceEmbeddings`, whose default embedding model is `sentence-transformers/all-mpnet-base-v2`. This model provides a good balance between speed and embedding quality.
To use a different model, call `HuggingFaceEmbeddings(model_name=<another_embedding_model>)`.
For more info see https://huggingface.co/blog/mteb.
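
To give an intuition for what a vector store does at query time, here is a toy sketch of similarity search: each text is represented by a vector, and the texts whose vectors point in the direction closest to the query vector are retrieved. The three-dimensional vectors are made up for illustration (the real model produces much longer ones), and `cosine_similarity` and `top_k` are hypothetical helpers, not LlamaIndex functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings; a real embedding model would compute these from the text
toy_vectors = {
    "Quest 3 headset": [0.9, 0.1, 0.0],
    "Ray-Ban smart glasses": [0.2, 0.9, 0.1],
    "AI stickers": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]


def top_k(query, vectors, k=2):
    """Return the k keys whose vectors are most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


print(top_k(query_vector, toy_vectors))
# ['Quest 3 headset', 'Ray-Ban smart glasses']
```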

In [8]:
# use HuggingFace embeddings 
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding


embeddings = LangchainEmbedding(HuggingFaceEmbeddings())
print(embeddings)

# create a ServiceContext instance to use Llama2 and custom embeddings
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=800, chunk_overlap=20, embed_model=embeddings)

# create vector store index from the documents created above
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# create query engine from the index
query_engine = index.as_query_engine(streaming=True)

model_name='sentence-transformers/all-mpnet-base-v2' embed_batch_size=10 callback_manager=<llama_index.callbacks.base.CallbackManager object at 0x106ff3760>
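
The `chunk_size` and `chunk_overlap` settings above control how documents are split before they are embedded. The sketch below illustrates the idea on a list of words. It is a simplification under stated assumptions: LlamaIndex's own splitter counts tokens rather than words and handles text boundaries differently, and `chunk_words` is a hypothetical helper, not a library function.

```python
def chunk_words(words, chunk_size, chunk_overlap):
    """Split a list of words into chunks that share `chunk_overlap` words with their neighbor."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


words = [f"w{i}" for i in range(10)]
for chunk in chunk_words(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
# ['w0', 'w1', 'w2', 'w3']
# ['w3', 'w4', 'w5', 'w6']
# ['w6', 'w7', 'w8', 'w9']
```

The overlap means the last word of one chunk reappears at the start of the next, so a sentence that straddles a boundary is less likely to lose its context.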


We are now ready to ask Llama 2 a question about the live data using our query engine.

In [9]:
# ask Llama2 a summary question about the search result
response = query_engine.query("give me a summary")
response.print_response_stream()

 Sure! Here's a summary of the provided text:

Meta Connect is an annual conference hosted by Meta where they showcase new hardware and technology. This year, they announced the Meta Quest 3, a standalone VR headset with improved passthrough tech, higher resolution displays, and better graphics. They also introduced the Rayban Meta Smartglasses, which are stylish streaming glasses with built-in AI models. Additionally, Meta launched AI Studio, a platform for businesses to create AI chatbots for their messaging services. The event also featured keynote speakers and developer sessions, and attendees could explore a virtual world inspired by Meta's Men

In [10]:
# more questions
query_engine.query("what products were announced").print_response_stream()

 Based on the context information provided, the following products were announced:

1. New Meta AI assistant
2. Facebook-streaming glasses
3. Next generation of Meta Quest software
4. Ray-Ban smart glasses
5. Meta AI bots for consumers
6. Virtual screen that can float in either a virtual or mixed-reality space (coming in December)
7. Generative AI stickers for Meta's messaging apps
8. Xbox Cloud Gaming.

In [11]:
query_engine.query("tell me more about Meta AI assistant").print_response_stream()

 Sure! Based on the provided context information, here's what I found out about the Meta AI assistant:

Meta has announced a new AI assistant called "Meta AI" that will soon come to its newly announced Quest 3 VR headset. This assistant can help plan trips with friends in a group chat, answer general-knowledge questions, and search the internet to provide real-time web results. It's powered by a custom model that leverages technology from Llama 2 and the company's latest large language model (LLM) research.

In text-based chats, Meta AI has access to real-time information through the company's search partnership with Bing. The assistant is designed to be interactive like a person and will be available on WhatsApp, Messenger, Instagram, and Ray-Ban smart glasses.

Additionally, there are 28 more AIs that users can message on these platforms, each with unique backstories. These AIs are part of a new universe of characters that Meta is introducing, which aim to bring new forms of creativi

In [12]:
query_engine.query("what are Generative AI stickers").print_response_stream()

 Based on the provided context information, generative AI stickers refer to a new feature announced by Meta that enables users to generate customized stickers for their chats and stories using artificial intelligence technology from Llama 2 and the foundational model for image generation called Emu. These stickers use text prompts to create multiple unique, high-quality stickers in seconds, providing an infinitely more option to convey how you're feeling at any moment. The feature is currently rolling out to select English language users over the next month in WhatsApp, Messenger, Instagram, and Facebook Stories.