## This demo app shows:
* How to use LlamaIndex, an open source library to help you build custom data augmented LLM applications
* How to ask Llama questions about recent live data via the You.com live search API and LlamaIndex

The LangChain package is used to call Llama 2 hosted on Replicate.
**Note:** We use Replicate to run the examples here. First sign in to Replicate with your GitHub account, then create a free API token [here](https://replicate.com/account/api-tokens), which you can use during the free trial.
After the free trial ends, you will need to enter billing info to continue using Llama 2 hosted on Replicate.
We start by installing the necessary packages:
- [langchain](https://python.langchain.com/docs/get_started/introduction), which provides RAG capabilities
- [llama-index](https://docs.llamaindex.ai/en/stable/), for data augmentation
Next we set up the Replicate token.
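One way to set the token without hard-coding it in the notebook is to read it from the environment and prompt only when it is missing. The sketch below assumes the `REPLICATE_API_TOKEN` environment variable, which the Replicate client reads; the helper name `ensure_env_token` is hypothetical and not part of any library.

```python
import os
from getpass import getpass


def ensure_env_token(name, prompt=getpass):
    """Return os.environ[name], prompting for a value only if it is missing.

    `ensure_env_token` is a hypothetical helper for this example. The
    `prompt` argument defaults to getpass so the token is not echoed.
    """
    token = os.environ.get(name)
    if not token:
        token = prompt(f"Enter {name}: ")
        os.environ[name] = token  # make it visible to libraries that read the env
    return token
```

In the notebook this would be called as `ensure_env_token("REPLICATE_API_TOKEN")` before any request to Replicate is made.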
In this example we will use the [You.com](https://you.com/) search engine to augment the LLM's responses.
To use the You.com Search API, you can email api@you.com to request an API key.
We then call the Llama 2 model from Replicate.
We will use the Llama 2 13B chat model. You can find more Llama 2 models by searching for them on the [Replicate model explore page](https://replicate.com/explore?query=llama).
You can add them here in the format `model_name/version`.
Using the API key we set up earlier, we send a request to You.com for live data on a particular topic.
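A minimal sketch of such a request, using only the standard library, might look like the following. The endpoint URL and the `X-API-Key` header name are assumptions here and should be checked against the You.com API documentation; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm it against the You.com API documentation.
YOU_SEARCH_URL = "https://api.ydc-index.io/search"


def build_search_request(query, api_key):
    """Build (but do not send) a GET request for a You.com search."""
    url = f"{YOU_SEARCH_URL}?{urllib.parse.urlencode({'query': query})}"
    # Assumed header name for passing the API key.
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def fetch_search_results(query, api_key, timeout=30):
    """Send the request and decode the JSON response body into a dict."""
    request = build_search_request(query, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending makes it easy to inspect the URL and headers before spending an API call.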
We then use the [`JSONLoader`](https://llamahub.ai/l/file-json) to extract the text from the returned data. The `JSONLoader` lets us load the data into LlamaIndex.
In the next cell we show how to load the JSON result, whose key information is stored under "snippets".
Alternatively, you can add the snippets from the query result to documents directly, as shown below:
```python
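from llama_index import Document

# `data` is the parsed JSON result of the search request made earlier
snippets = [snippet for hit in data["hits"] for snippet in hit["snippets"]]
# Wrap each snippet string in a LlamaIndex Document
documents = [Document(text=s) for s in snippets]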
```
This can be handy if you just need to turn a list of text strings into documents.
With the data set up, we create a vector store for the data and a query engine for it.
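To make the idea concrete, here is a toy sketch of what a vector store and its query step do conceptually: each text is turned into a vector, and a query returns the texts whose vectors are most similar to the query's vector. It is not how LlamaIndex implements this; the bag-of-words `embed` function is a deliberately crude stand-in for a real embedding model, and all names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Stores (text, vector) pairs and retrieves the most similar texts."""

    def __init__(self, texts):
        self.entries = [(t, embed(t)) for t in texts]

    def top_k(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

In a real pipeline the retrieved texts are passed to the LLM as context for answering the question, and the embedding model captures meaning rather than exact word overlap.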
For our embeddings we will use `HuggingFaceEmbeddings`, whose default embedding model is `sentence-transformers/all-mpnet-base-v2`. This model provides a good balance between speed and performance.
To change the default model, pass a different `model_name` to `HuggingFaceEmbeddings`.
For more info see https://huggingface.co/blog/mteb.

We are now ready to ask Llama 2 a question about the live data using our query engine.