{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "RJSnI0Xy-kCm"
},
"source": [
"![Meta---Logo@1x.jpg]()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LERqQn5v8-ak"
},
"source": [
"# **Getting to know Llama 3: Everything you need to start building**\n",
"Our goal in this session is to provide a guided tour of Llama 3 with comparison with Llama 2, including understanding different Llama 3 models, how and where to access them, Generative AI and Chatbot architectures, prompt engineering, RAG (Retrieval Augmented Generation), Fine-tuning and more. All this is implemented with a starter code for you to take it and use it in your Llama 3 projects."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ioVMNcTesSEk"
},
"source": [
"### **0 - Prerequisites**\n",
"* Basic understanding of Large Language Models\n",
"* Basic understanding of Python"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"executionInfo": {
"elapsed": 248,
"status": "ok",
"timestamp": 1695832228254,
"user": {
"displayName": "Amit Sangani",
"userId": "11552178012079240149"
},
"user_tz": 420
},
"id": "ktEA7qXmwdUM"
},
"outputs": [],
"source": [
"# presentation layer code\n",
"\n",
"import base64\n",
"from IPython.display import Image, display\n",
"import matplotlib.pyplot as plt\n",
"\n",
"def mm(graph):\n",
" graphbytes = graph.encode(\"ascii\")\n",
" base64_bytes = base64.b64encode(graphbytes)\n",
" base64_string = base64_bytes.decode(\"ascii\")\n",
" display(Image(url=\"https://mermaid.ink/img/\" + base64_string))\n",
"\n",
"def genai_app_arch():\n",
" mm(\"\"\"\n",
" flowchart TD\n",
" A[Users] --> B(Applications e.g. mobile, web)\n",
" B --> |Hosted API|C(Platforms e.g. Custom, HuggingFace, Replicate)\n",
" B -- optional --> E(Frameworks e.g. LangChain)\n",
" C-->|User Input|D[Llama 3]\n",
" D-->|Model Output|C\n",
" E --> C\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def rag_arch():\n",
" mm(\"\"\"\n",
" flowchart TD\n",
" A[User Prompts] --> B(Frameworks e.g. LangChain)\n",
" B <--> |Database, Docs, XLS|C[fa:fa-database External Data]\n",
" B -->|API|D[Llama 3]\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def llama2_family():\n",
" mm(\"\"\"\n",
" graph LR;\n",
" llama-2 --> llama-2-7b\n",
" llama-2 --> llama-2-13b\n",
" llama-2 --> llama-2-70b\n",
" llama-2-7b --> llama-2-7b-chat\n",
" llama-2-13b --> llama-2-13b-chat\n",
" llama-2-70b --> llama-2-70b-chat\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def llama3_family():\n",
" mm(\"\"\"\n",
" graph LR;\n",
" llama-3 --> llama-3-8b\n",
" llama-3 --> llama-3-70b\n",
" llama-3-8b --> llama-3-8b-base\n",
" llama-3-8b --> llama-3-8b-instruct\n",
" llama-3-70b --> llama-3-70b-base\n",
" llama-3-70b --> llama-3-70b-instruct\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def apps_and_llms():\n",
" mm(\"\"\"\n",
" graph LR;\n",
" users --> apps\n",
" apps --> frameworks\n",
" frameworks --> platforms\n",
" platforms --> Llama 2\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"import ipywidgets as widgets\n",
"from IPython.display import display, Markdown\n",
"\n",
"# Create a text widget\n",
"API_KEY = widgets.Password(\n",
" value='',\n",
" placeholder='',\n",
" description='API_KEY:',\n",
" disabled=False\n",
")\n",
"\n",
"def md(t):\n",
" display(Markdown(t))\n",
"\n",
"def bot_arch():\n",
" mm(\"\"\"\n",
" graph LR;\n",
" user --> prompt\n",
" prompt --> i_safety\n",
" i_safety --> context\n",
" context --> Llama_3\n",
" Llama_3 --> output\n",
" output --> o_safety\n",
" i_safety --> memory\n",
" o_safety --> memory\n",
" memory --> context\n",
" o_safety --> user\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def fine_tuned_arch():\n",
" mm(\"\"\"\n",
" graph LR;\n",
" Custom_Dataset --> Pre-trained_Llama\n",
" Pre-trained_Llama --> Fine-tuned_Llama\n",
" Fine-tuned_Llama --> RLHF\n",
" RLHF --> |Loss:Cross-Entropy|Fine-tuned_Llama\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def load_data_faiss_arch():\n",
" mm(\"\"\"\n",
" graph LR;\n",
" documents --> textsplitter\n",
" textsplitter --> embeddings\n",
" embeddings --> vectorstore\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n",
"\n",
"def mem_context():\n",
" mm(\"\"\"\n",
" graph LR\n",
" context(text)\n",
" user_prompt --> context\n",
" instruction --> context\n",
" examples --> context\n",
" memory --> context\n",
" context --> tokenizer\n",
" tokenizer --> embeddings\n",
" embeddings --> LLM\n",
" classDef default fill:#CCE6FF,stroke:#84BCF5,textColor:#1C2B33,fontFamily:trebuchet ms;\n",
" \"\"\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "i4Np_l_KtIno"
},
"source": [
"### **1 - Understanding Llama 3**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PGPSI3M5PGTi"
},
"source": [
"### **1.1 - What is Llama 3?**\n",
"\n",
"* State of the art (SOTA), Open Source LLM\n",
"* 8B, 70B\n",
"* Choosing model: Size, Quality, Cost, Speed\n",
"* Pretrained + Chat\n",
"* [Meta Llama 3 Blog](https://ai.meta.com/blog/meta-llama-3/)\n",
"* [Getting Started with Meta Llama](https://llama.meta.com/docs/get-started)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 240
},
"executionInfo": {
"elapsed": 248,
"status": "ok",
"timestamp": 1695832233087,
"user": {
"displayName": "Amit Sangani",
"userId": "11552178012079240149"
},
"user_tz": 420
},
"id": "OXRCC7wexZXd",
"outputId": "1feb1918-df4b-4cec-d09e-ffe55c12090b"
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"llama2_family()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"llama3_family()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aYeHVVh45bdT"
},
"source": [
"### **1.2 - Accessing Llama 3**\n",
"* Download + Self Host (i.e. [download Llama](https://ai.meta.com/resources/models-and-libraries/llama-downloads))\n",
"* Hosted API Platform (e.g. [Groq](https://console.groq.com/), [Replicate](https://replicate.com/meta/meta-llama-3-8b-instruct), [Together](https://api.together.xyz/playground/language/meta-llama/Llama-3-8b-hf), [Anyscale](https://app.endpoints.anyscale.com/playground))\n",
"\n",
"* Hosted Container Platform (e.g. [Azure](https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/introducing-llama-2-on-azure/ba-p/3881233), [AWS](https://aws.amazon.com/blogs/machine-learning/llama-2-foundation-models-from-meta-are-now-available-in-amazon-sagemaker-jumpstart/), [GCP](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/139))\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kBuSay8vtzL4"
},
"source": [
"### **1.3 - Use Cases of Llama 3**\n",
"* Content Generation\n",
"* Summarization\n",
"* General Chatbots\n",
"* RAG (Retrieval Augmented Generation): Chat about Your Own Data\n",
"* Fine-tuning\n",
"* Agents"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sd54g0OHuqBY"
},
"source": [
"## **2 - Using and Comparing Llama 3 and Llama 2**\n",
"\n",
"In this notebook, we will use the Llama 2 70b chat and Llama 3 8b and 70b instruct models hosted on [Groq](https://console.groq.com/). You'll need to first [sign in](https://console.groq.com/) with your github or gmail account, then get an [API token](https://console.groq.com/keys) to try Groq out for free. (Groq runs Llama models very fast and they only support one Llama 2 model: the Llama 2 70b chat).\n",
"\n",
"**Note: You can also use other Llama hosting providers such as [Replicate](https://replicate.com/blog/run-llama-3-with-an-api?input=python), [Togther](https://docs.together.ai/docs/quickstart). Simply click the links here to see how to run `pip install` and use their freel trial API key with example code to modify the following three cells in 2.1 and 2.2.**\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "h3YGMDJidHtH"
},
"source": [
"### **2.1 - Install dependencies**"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "VhN6hXwx7FCp"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting groq\n",
" Downloading groq-0.5.0-py3-none-any.whl.metadata (12 kB)\n",
"Requirement already satisfied: anyio<5,>=3.5.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq) (4.3.0)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq) (1.9.0)\n",
"Requirement already satisfied: httpx<1,>=0.23.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq) (0.27.0)\n",
"Requirement already satisfied: pydantic<3,>=1.9.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq) (2.7.0)\n",
"Requirement already satisfied: sniffio in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq) (1.3.1)\n",
"Requirement already satisfied: typing-extensions<5,>=4.7 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq) (4.9.0)\n",
"Requirement already satisfied: idna>=2.8 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from anyio<5,>=3.5.0->groq) (3.6)\n",
"Requirement already satisfied: certifi in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from httpx<1,>=0.23.0->groq) (2024.2.2)\n",
"Requirement already satisfied: httpcore==1.* in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from httpx<1,>=0.23.0->groq) (1.0.5)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->groq) (0.14.0)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1.9.0->groq) (0.6.0)\n",
"Requirement already satisfied: pydantic-core==2.18.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1.9.0->groq) (2.18.1)\n",
"Downloading groq-0.5.0-py3-none-any.whl (75 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.0/75.0 kB\u001b[0m \u001b[31m3.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hInstalling collected packages: groq\n",
"Successfully installed groq-0.5.0\n"
]
}
],
"source": [
"!pip install groq"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### **2.2 - Create helpers for Llama 2 and Llama 3**\n",
"First, set your Groq API token as environment variables.\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "8hkWpqWD28ho"
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"GROQ_API_TOKEN = getpass()\n",
"\n",
"os.environ[\"GROQ_API_KEY\"] = GROQ_API_TOKEN"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create Llama 2 and Llama 3 helper functions - for chatbot type of apps, we'll use Llama 3 8b/70b instruct models, not the base models."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"id": "bVCHZmETk36v"
},
"outputs": [],
"source": [
"from groq import Groq\n",
"\n",
"client = Groq(\n",
" api_key=os.environ.get(\"GROQ_API_KEY\"),\n",
")\n",
"\n",
"def llama2(prompt, temperature=0.0, input_print=True):\n",
" chat_completion = client.chat.completions.create(\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": prompt,\n",
" }\n",
" ],\n",
" model=\"llama2-70b-4096\",\n",
" temperature=temperature,\n",
" )\n",
"\n",
" return (chat_completion.choices[0].message.content)\n",
"\n",
"def llama3_8b(prompt, temperature=0.0, input_print=True):\n",
" chat_completion = client.chat.completions.create(\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": prompt,\n",
" }\n",
" ],\n",
" model=\"llama3-8b-8192\",\n",
" temperature=temperature,\n",
" )\n",
"\n",
" return (chat_completion.choices[0].message.content)\n",
"\n",
"def llama3_70b(prompt, temperature=0.0, input_print=True):\n",
" chat_completion = client.chat.completions.create(\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": prompt,\n",
" }\n",
" ],\n",
" model=\"llama3-70b-8192\",\n",
" temperature=temperature,\n",
" )\n",
"\n",
" return (chat_completion.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5Jxq0pmf6L73"
},
"source": [
"### **2.3 - Basic QA with Llama 2 and 3**"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"id": "H93zZBIk6tNU"
},
"outputs": [
{
"data": {
"text/markdown": [
"The typical color of a llama is a light brown or beige color, often with a darker brown or black patches on their ears, neck, and legs. Some llamas may also have a white or pale colored patch on their forehead. However, it's worth noting that llamas can come in a wide range of colors, including white, black, gray, and various shades of brown and red. Some breeds, such as the Suri alpaca, can have a more diverse range of colors, including shades of red, orange, and purple."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"prompt = \"The typical color of a llama is: \"\n",
"output = llama2(prompt)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"The typical color of a llama is white! However, llamas can also come in a variety of other colors, including:\n",
"\n",
"* Suri: a soft, fluffy coat that can be white, cream, or light brown\n",
"* Huacaya: a dense, soft coat that can be white, cream, or various shades of brown, gray, or black\n",
"* Rose-gray: a light grayish-pink color\n",
"* Dark brown: a rich, dark brown color\n",
"* Black: a glossy black coat\n",
"* Red: a reddish-brown color\n",
"* Cream: a light cream or beige color\n",
"* Fawn: a light reddish-brown color\n",
"\n",
"It's worth noting that llamas can also have various patterns and markings on their coats, such as white markings on the face, legs, or belly."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Brown."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(\"The typical color of a llama is what? Answer in one word.\")\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cWs_s9y-avIT"
},
"source": [
"## **3 - Chat conversation**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "r4DyTLD5ys6t"
},
"source": [
"### **3.1 - Single-turn chat**"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"id": "EMM_egWMys6u"
},
"outputs": [
{
"data": {
"text/markdown": [
"Sure, here's a short answer:\n",
"\n",
"The average lifespan of a llama is 15-25 years."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"prompt_chat = \"What is the average lifespan of a Llama? Answer the question in few words.\"\n",
"output = llama2(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"id": "sZ7uVKDYucgi"
},
"outputs": [
{
"data": {
"text/markdown": [
"15-20 years."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"id": "WQl3wmfbyBQ1"
},
"outputs": [
{
"data": {
"text/markdown": [
"The lion, tiger, leopard, and jaguar are all members of the Felidae family."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# example without previous context. LLM's are stateless and cannot understand \"they\" without previous context\n",
"prompt_chat = \"What animal family are they? Answer the question in few words.\"\n",
"output = llama2(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Canidae."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"I'm happy to help! However, I don't see a specific animal mentioned in your question. Could you please clarify or provide more context about which animal you're referring to?"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_70b(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: Llama 3 70b doesn't hallucinate.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### **3.2 - Multi-turn chat**\n",
"Chat app requires us to send in previous context to LLM to get in valid responses. Below is an example of Multi-turn chat."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"id": "t7SZe5fT3HG3"
},
"outputs": [
{
"data": {
"text/markdown": [
"Assistant: Llamas are part of the Camelidae family, which includes camels and alpacas."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# example of multi-turn chat, with storing previous context\n",
"prompt_chat = \"\"\"\n",
"User: What is the average lifespan of a Llama?\n",
"Assistant: 15-20 years.\n",
"User: What animal family are they?\n",
"\"\"\"\n",
"output = llama2(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Llamas belong to the camelid family (Camelidae)."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: Llama 2 and 3 both behave well for using the chat history for follow up questions.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### **3.3 - Multi-turn chat with more instruction**\n",
"Adding the instructon \"Answer the question with one word\" to see the difference of Llama 2 and 3."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Camelids"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# example of multi-turn chat, with storing previous context\n",
"prompt_chat = \"\"\"\n",
"User: What is the average lifespan of a Llama?\n",
"Assistant: Sure! The average lifespan of a llama is around 20-30 years.\n",
"User: What animal family are they?\n",
"\n",
"Answer the question with one word.\n",
"\"\"\"\n",
"output = llama2(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Camelid."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt_chat)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Both Llama 3 8b and Llama 2 70b follows instructions (e.g. \"Answer the question with one word\") better than Llama 2 7b.**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "moXnmJ_xyD10"
},
"source": [
"### **4.2 - Prompt Engineering**\n",
"* Prompt engineering refers to the science of designing effective prompts to get desired responses\n",
"\n",
"* Helps reduce hallucination\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "t-v-FeZ4ztTB"
},
"source": [
"#### **4.2.1 - In-Context Learning (e.g. Zero-shot, Few-shot)**\n",
" * In-context learning - specific method of prompt engineering where demonstration of task are provided as part of prompt.\n",
" 1. Zero-shot learning - model is performing tasks without any\n",
"input examples.\n",
" 2. Few or “N-Shot” Learning - model is performing and behaving based on input examples in user's prompt."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"id": "6W71MFNZyRkQ"
},
"outputs": [
{
"data": {
"text/markdown": [
"Curious"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Zero-shot example. To get positive/negative/neutral sentiment, we need to give examples in the prompt\n",
"prompt = '''\n",
"Classify: I saw a Gecko.\n",
"Sentiment: ?\n",
"\n",
"Give one word response.\n",
"'''\n",
"output = llama2(prompt)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"id": "MCQRjf1Y1RYJ"
},
"outputs": [
{
"data": {
"text/markdown": [
"Neutral"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: Llama 3 has different opinions than Llama 2.**"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"id": "8UmdlTmpDZxA"
},
"outputs": [
{
"data": {
"text/markdown": [
"Neutral"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# By giving examples to Llama, it understands the expected output format.\n",
"\n",
"prompt = '''\n",
"Classify: I love Llamas!\n",
"Sentiment: Positive\n",
"Classify: I dont like Snakes.\n",
"Sentiment: Negative\n",
"Classify: I saw a Gecko.\n",
"Sentiment:\n",
"\n",
"Give one word response.\n",
"'''\n",
"\n",
"output = llama2(prompt)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"id": "M_EcsUo1zqFD"
},
"outputs": [
{
"data": {
"text/markdown": [
"Neutral"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_8b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: Llama 2, with few shots, has the same output \"Neutral\" as Llama 3.**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mbr124Y197xl"
},
"source": [
"#### **4.2.2 - Chain of Thought**\n",
"\"Chain of thought\" enables complex reasoning through logical step by step thinking and generates meaningful and contextually relevant responses."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"id": "Xn8zmLBQzpgj"
},
"outputs": [
{
"data": {
"text/markdown": [
"Seven."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Standard prompting\n",
"prompt = '''\n",
"Llama started with 5 tennis balls. It buys 2 more cans of tennis balls. Each can has 3 tennis balls.\n",
"How many tennis balls does Llama have?\n",
"\n",
"Answer in one word.\n",
"'''\n",
"\n",
"output = llama3_8b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"id": "lKNOj79o1Kwu"
},
"outputs": [
{
"data": {
"text/markdown": [
"Eleven."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_70b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: Llama 3-8b did not get the right answer because it was asked to answer in one word.**"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Let's break it down step by step!\n",
"\n",
"Llama started with 5 tennis balls.\n",
"\n",
"It buys 2 more cans of tennis balls. Each can has 3 tennis balls, so that's a total of 2 x 3 = 6 new tennis balls.\n",
"\n",
"Adding the new tennis balls to the original 5, Llama now has:\n",
"5 (initial tennis balls) + 6 (new tennis balls) = 11 tennis balls\n",
"\n",
"So, Llama now has 11 tennis balls!"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# By default, Llama 3 models follow \"Chain-Of-Thought\" prompting\n",
"prompt = '''\n",
"Llama started with 5 tennis balls. It buys 2 more cans of tennis balls. Each can has 3 tennis balls.\n",
"How many tennis balls does Llama have?\n",
"'''\n",
"\n",
"output = llama3_8b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Llama started with 5 tennis balls. Then it bought 2 cans of tennis balls. Each can has 3 tennis balls. So that is 2 x 3 = 6 tennis balls. 5 + 6 = 11.\n",
"#### 11"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"output = llama3_70b(prompt)\n",
"md(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: By default, Llama 3 models identify word problems and solves it step by step!**"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"**Yes**\n",
"\n",
"Here's the step-by-step breakdown:\n",
"\n",
"1. We have 15 people who want to go to the restaurant.\n",
"2. Two people have cars that can seat 5 people each. This means we can accommodate 10 people in cars (2 cars x 5 seats per car).\n",
"3. We still have 5 people left who can't fit in the cars. We'll consider the motorcycles now.\n",
"4. Two people have motorcycles that can fit 2 people each. This means we can accommodate 4 people in motorcycles (2 motorcycles x 2 seats per motorcycle).\n",
"5. We still have 1 person left who can't fit in the cars or motorcycles. Unfortunately, we can't fit all 15 people in cars or motorcycles.\n",
"6. However, we can fit 10 people in cars (10 seats available) and 4 people in motorcycles (4 seats available), which adds up to 14 people. We still have 1 person left over.\n",
"7. Since we can't fit all 15 people in cars or motorcycles, we can't take everyone to the restaurant.\n",
"\n",
"However, we can take 14 people to the restaurant, which is the maximum number of people we can accommodate using the available cars and motorcycles.\n"
]
}
],
"source": [
"prompt = \"\"\"\n",
"15 of us want to go to a restaurant.\n",
"Two of them have cars\n",
"Each car can seat 5 people.\n",
"Two of us have motorcycles.\n",
"Each motorcycle can fit 2 people.\n",
"Can we all get to the restaurant by car or motorcycle?\n",
"Think step by step.\n",
"Provide the answer as a single yes/no answer first.\n",
"Then explain each intermediate step.\n",
"\"\"\"\n",
"output = llama3_8b(prompt)\n",
"print(output)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"**Answer:** NO\n",
"\n",
"Here's the step-by-step explanation:\n",
"\n",
"1. We have 15 people who want to go to the restaurant.\n",
"2. We have 2 cars, each of which can seat 5 people. So, the cars can accommodate a total of 2 x 5 = 10 people.\n",
"3. This leaves 15 - 10 = 5 people who still need transportation.\n",
"4. We have 2 motorcycles, each of which can fit 2 people. So, the motorcycles can accommodate a total of 2 x 2 = 4 people.\n",
"5. This still leaves 5 - 4 = 1 person who doesn't have a ride.\n",
"6. Since we can't fit all 15 people in the available cars and motorcycles, the answer is NO, we cannot all get to the restaurant by car or motorcycle.\n"
]
}
],
"source": [
"output = llama3_70b(prompt)\n",
"print(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note: Llama 3 70b model works correctly in this example.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Summary: Llama 2 often needs encourgement for step by step thinking to correctly reasoning. Llama 3 understands, reasons and explains better, making chain of thought unnecessary in the cases above.**"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "C7tDW-AH770Y"
},
"source": [
"### **4.3 - Retrieval Augmented Generation (RAG)**\n",
"* Prompt Eng Limitations - Knowledge cutoff & lack of specialized data\n",
"\n",
"* Retrieval Augmented Generation(RAG) allows us to retrieve snippets of information from external data sources and augment it to the user's prompt to get tailored responses from Llama 2.\n",
"\n",
"For our demo, we are going to download an external PDF file from a URL and query against the content in the pdf file to get contextually relevant information back with the help of Llama!\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 259
},
"executionInfo": {
"elapsed": 329,
"status": "ok",
"timestamp": 1695832267093,
"user": {
"displayName": "Amit Sangani",
"userId": "11552178012079240149"
},
"user_tz": 420
},
"id": "Fl1LPltpRQD9",
"outputId": "4410c9bf-3559-4a05-cebb-a5731bb094c1"
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rag_arch()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JJaGMLl_4vYm"
},
"source": [
"#### **4.3.1 - LangChain**\n",
"LangChain is a framework that helps make it easier to implement RAG."
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: langchain in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (0.1.16)\n",
"Requirement already satisfied: PyYAML>=5.3 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (6.0.1)\n",
"Requirement already satisfied: SQLAlchemy<3,>=1.4 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (2.0.29)\n",
"Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (3.9.4)\n",
"Requirement already satisfied: dataclasses-json<0.7,>=0.5.7 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (0.6.4)\n",
"Requirement already satisfied: jsonpatch<2.0,>=1.33 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (1.33)\n",
"Requirement already satisfied: langchain-community<0.1,>=0.0.32 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (0.0.34)\n",
"Requirement already satisfied: langchain-core<0.2.0,>=0.1.42 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (0.1.45)\n",
"Requirement already satisfied: langchain-text-splitters<0.1,>=0.0.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (0.0.1)\n",
"Requirement already satisfied: langsmith<0.2.0,>=0.1.17 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (0.1.47)\n",
"Requirement already satisfied: numpy<2,>=1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (1.26.4)\n",
"Requirement already satisfied: pydantic<3,>=1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (2.7.0)\n",
"Requirement already satisfied: requests<3,>=2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (2.31.0)\n",
"Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain) (8.2.3)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.3.1)\n",
"Requirement already satisfied: attrs>=17.3.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (23.2.0)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.4.1)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (6.0.5)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.9.4)\n",
"Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from dataclasses-json<0.7,>=0.5.7->langchain) (3.21.1)\n",
"Requirement already satisfied: typing-inspect<1,>=0.4.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from dataclasses-json<0.7,>=0.5.7->langchain) (0.9.0)\n",
"Requirement already satisfied: jsonpointer>=1.9 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from jsonpatch<2.0,>=1.33->langchain) (2.4)\n",
"Requirement already satisfied: packaging<24.0,>=23.2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain) (23.2)\n",
"Requirement already satisfied: orjson<4.0.0,>=3.9.14 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langsmith<0.2.0,>=0.1.17->langchain) (3.10.0)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1->langchain) (0.6.0)\n",
"Requirement already satisfied: pydantic-core==2.18.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1->langchain) (2.18.1)\n",
"Requirement already satisfied: typing-extensions>=4.6.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1->langchain) (4.9.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests<3,>=2->langchain) (3.3.2)\n",
"Requirement already satisfied: idna<4,>=2.5 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests<3,>=2->langchain) (3.6)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests<3,>=2->langchain) (2.2.0)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests<3,>=2->langchain) (2024.2.2)\n",
"Requirement already satisfied: mypy-extensions>=0.3.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain) (1.0.0)\n",
"Requirement already satisfied: sentence-transformers in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (2.6.1)\n",
"Requirement already satisfied: transformers<5.0.0,>=4.32.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (4.40.0)\n",
"Requirement already satisfied: tqdm in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (4.66.2)\n",
"Requirement already satisfied: torch>=1.11.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (2.3.0.dev20240205)\n",
"Requirement already satisfied: numpy in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (1.26.4)\n",
"Requirement already satisfied: scikit-learn in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (1.4.2)\n",
"Requirement already satisfied: scipy in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (1.13.0)\n",
"Requirement already satisfied: huggingface-hub>=0.15.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (0.22.2)\n",
"Requirement already satisfied: Pillow in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sentence-transformers) (10.3.0)\n",
"Requirement already satisfied: filelock in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (3.13.1)\n",
"Requirement already satisfied: fsspec>=2023.5.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (2024.2.0)\n",
"Requirement already satisfied: packaging>=20.9 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (23.2)\n",
"Requirement already satisfied: pyyaml>=5.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (6.0.1)\n",
"Requirement already satisfied: requests in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (2.31.0)\n",
"Requirement already satisfied: typing-extensions>=3.7.4.3 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from huggingface-hub>=0.15.1->sentence-transformers) (4.9.0)\n",
"Requirement already satisfied: sympy in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from torch>=1.11.0->sentence-transformers) (1.11.1)\n",
"Requirement already satisfied: networkx in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from torch>=1.11.0->sentence-transformers) (3.0rc1)\n",
"Requirement already satisfied: jinja2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from torch>=1.11.0->sentence-transformers) (3.1.2)\n",
"Requirement already satisfied: regex!=2019.12.17 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers) (2023.12.25)\n",
"Requirement already satisfied: tokenizers<0.20,>=0.19 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers) (0.19.1)\n",
"Requirement already satisfied: safetensors>=0.4.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from transformers<5.0.0,>=4.32.0->sentence-transformers) (0.4.2)\n",
"Requirement already satisfied: joblib>=1.2.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from scikit-learn->sentence-transformers) (1.4.0)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from scikit-learn->sentence-transformers) (3.4.0)\n",
"Requirement already satisfied: MarkupSafe>=2.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from jinja2->torch>=1.11.0->sentence-transformers) (2.1.3)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (3.3.2)\n",
"Requirement already satisfied: idna<4,>=2.5 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (3.6)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (2.2.0)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests->huggingface-hub>=0.15.1->sentence-transformers) (2024.2.2)\n",
"Requirement already satisfied: mpmath>=0.19 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from sympy->torch>=1.11.0->sentence-transformers) (1.2.1)\n",
"Requirement already satisfied: faiss-cpu in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (1.8.0)\n",
"Requirement already satisfied: numpy in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from faiss-cpu) (1.26.4)\n",
"Requirement already satisfied: bs4 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (0.0.2)\n",
"Requirement already satisfied: beautifulsoup4 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from bs4) (4.12.3)\n",
"Requirement already satisfied: soupsieve>1.2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from beautifulsoup4->bs4) (2.5)\n",
"Collecting langchain-groq\n",
" Downloading langchain_groq-0.1.2-py3-none-any.whl.metadata (2.8 kB)\n",
"Requirement already satisfied: groq<1,>=0.4.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-groq) (0.5.0)\n",
"Requirement already satisfied: langchain-core<0.2.0,>=0.1.42 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-groq) (0.1.45)\n",
"Requirement already satisfied: anyio<5,>=3.5.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq<1,>=0.4.1->langchain-groq) (4.3.0)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq<1,>=0.4.1->langchain-groq) (1.9.0)\n",
"Requirement already satisfied: httpx<1,>=0.23.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq<1,>=0.4.1->langchain-groq) (0.27.0)\n",
"Requirement already satisfied: pydantic<3,>=1.9.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq<1,>=0.4.1->langchain-groq) (2.7.0)\n",
"Requirement already satisfied: sniffio in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq<1,>=0.4.1->langchain-groq) (1.3.1)\n",
"Requirement already satisfied: typing-extensions<5,>=4.7 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from groq<1,>=0.4.1->langchain-groq) (4.9.0)\n",
"Requirement already satisfied: PyYAML>=5.3 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain-groq) (6.0.1)\n",
"Requirement already satisfied: jsonpatch<2.0,>=1.33 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain-groq) (1.33)\n",
"Requirement already satisfied: langsmith<0.2.0,>=0.1.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain-groq) (0.1.47)\n",
"Requirement already satisfied: packaging<24.0,>=23.2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain-groq) (23.2)\n",
"Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langchain-core<0.2.0,>=0.1.42->langchain-groq) (8.2.3)\n",
"Requirement already satisfied: idna>=2.8 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from anyio<5,>=3.5.0->groq<1,>=0.4.1->langchain-groq) (3.6)\n",
"Requirement already satisfied: certifi in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from httpx<1,>=0.23.0->groq<1,>=0.4.1->langchain-groq) (2024.2.2)\n",
"Requirement already satisfied: httpcore==1.* in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from httpx<1,>=0.23.0->groq<1,>=0.4.1->langchain-groq) (1.0.5)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->groq<1,>=0.4.1->langchain-groq) (0.14.0)\n",
"Requirement already satisfied: jsonpointer>=1.9 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from jsonpatch<2.0,>=1.33->langchain-core<0.2.0,>=0.1.42->langchain-groq) (2.4)\n",
"Requirement already satisfied: orjson<4.0.0,>=3.9.14 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langsmith<0.2.0,>=0.1.0->langchain-core<0.2.0,>=0.1.42->langchain-groq) (3.10.0)\n",
"Requirement already satisfied: requests<3,>=2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from langsmith<0.2.0,>=0.1.0->langchain-core<0.2.0,>=0.1.42->langchain-groq) (2.31.0)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1.9.0->groq<1,>=0.4.1->langchain-groq) (0.6.0)\n",
"Requirement already satisfied: pydantic-core==2.18.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from pydantic<3,>=1.9.0->groq<1,>=0.4.1->langchain-groq) (2.18.1)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests<3,>=2->langsmith<0.2.0,>=0.1.0->langchain-core<0.2.0,>=0.1.42->langchain-groq) (3.3.2)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/jeffxtang/anaconda3/envs/python3.11/lib/python3.11/site-packages (from requests<3,>=2->langsmith<0.2.0,>=0.1.0->langchain-core<0.2.0,>=0.1.42->langchain-groq) (2.2.0)\n",
"Downloading langchain_groq-0.1.2-py3-none-any.whl (11 kB)\n",
"Installing collected packages: langchain-groq\n",
"Successfully installed langchain-groq-0.1.2\n"
]
}
],
"source": [
"!pip install langchain\n",
"!pip install sentence-transformers\n",
"!pip install faiss-cpu\n",
"!pip install bs4\n",
"!pip install langchain-groq"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### **4.3.2 - LangChain Q&A Retriever**\n",
"* ConversationalRetrievalChain\n",
"* Query the source documents\n"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"id": "gAV2EkZqcruF"
},
"outputs": [],
"source": [
"from langchain_community.embeddings import HuggingFaceEmbeddings\n",
"from langchain_community.vectorstores import FAISS\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"import bs4\n",
"\n",
"# Step 1: Load the document from a web url\n",
"loader = WebBaseLoader([\"https://huggingface.co/blog/llama3\"])\n",
"documents = loader.load()\n",
"\n",
"# Step 2: Split the document into chunks with a specified chunk size\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)\n",
"all_splits = text_splitter.split_documents(documents)\n",
"\n",
"# Step 3: Store the document into a vector store with a specific embedding model\n",
"vectorstore = FAISS.from_documents(all_splits, HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-mpnet-base-v2\"))"
]
},
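{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the `chunk_size=500, chunk_overlap=50` settings above more concrete, here is a minimal sketch of overlapping chunking. It is an illustration only: `split_with_overlap` is a hypothetical helper that cuts fixed-size character windows, whereas `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences.\n",
"\n",
"```python\n",
"def split_with_overlap(text, chunk_size=500, chunk_overlap=50):\n",
"    # Hypothetical helper: fixed-size character windows, not LangChain's actual algorithm.\n",
"    if not 0 <= chunk_overlap < chunk_size:\n",
"        raise ValueError('chunk_overlap must be >= 0 and smaller than chunk_size')\n",
"    step = chunk_size - chunk_overlap  # each new chunk starts this many characters later\n",
"    chunks = []\n",
"    for start in range(0, len(text), step):\n",
"        chunks.append(text[start:start + chunk_size])\n",
"        if start + chunk_size >= len(text):\n",
"            break  # the last window already reached the end of the text\n",
"    return chunks\n",
"\n",
"sample = ''.join(str(i % 10) for i in range(1200))\n",
"chunks = split_with_overlap(sample)\n",
"print([len(c) for c in chunks])  # [500, 500, 300]\n",
"```\n",
"\n",
"The last 50 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still likely to appear whole in at least one chunk."
]
},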
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"According to the provided context, the main changes in Llama 3 compared to Llama 2 are:\n",
"\n",
"1. A new tokenizer that expands the vocabulary size to 128,256 (from 32K tokens in the previous version), which can encode text more efficiently and potentially yield stronger multilingualism.\n",
"2. The introduction of two sizes: 8B for efficient deployment and development on consumer-size GPU, and 70B for large-scale AI native applications.\n",
"3. The availability of base and instruction-tuned variants for each model size.\n",
"4. The release of Llama Guard 2, a new version of Llama Guard that was fine-tuned on Llama 3 8B."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
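],
"source": [
"from langchain.chains import ConversationalRetrievalChain\n",
"from langchain_groq import ChatGroq\n",
"\n",
"# ChatGroq reads the API key from the GROQ_API_KEY environment variable\n",
"llm = ChatGroq(temperature=0, model_name=\"llama3-8b-8192\")\n",
"\n",
"# Build a chain that retrieves relevant chunks and passes them to the LLM as context\n",
"chain = ConversationalRetrievalChain.from_llm(llm,\n",
"                                              vectorstore.as_retriever(),\n",
"                                              return_source_documents=True)\n",
"\n",
"# Chain.__call__ is deprecated since langchain 0.1.0; use invoke instead\n",
"result = chain.invoke({\"question\": \"What’s new with Llama 3?\", \"chat_history\": []})\n",
"md(result['answer'])\n"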
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"id": "NmEhBe3Kiyre"
},
"outputs": [
{
"data": {
"text/markdown": [
"According to the provided context, the main changes in Llama 3 compared to Llama 2 are:\n",
"\n",
"1. A new tokenizer that expands the vocabulary size to 128,256 (from 32K tokens in the previous version), which can encode text more efficiently and potentially yield stronger multilingualism.\n",
"2. The introduction of two sizes: 8B for efficient deployment and development on consumer-size GPU, and 70B for large-scale AI native applications.\n",
"3. The availability of base and instruction-tuned variants for each model size.\n",
"4. The release of Llama Guard 2, a new version of Llama Guard that was fine-tuned on Llama 3 8B."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Query against your own data\n",
"from langchain.chains import ConversationalRetrievalChain\n",
"chain = ConversationalRetrievalChain.from_llm(llm, vectorstore.as_retriever(), return_source_documents=True)\n",
"\n",
"# Start with an empty chat history; use invoke because Chain.__call__ is deprecated\n",
"chat_history = []\n",
"query = \"What’s new with Llama 3?\"\n",
"result = chain.invoke({\"question\": query, \"chat_history\": chat_history})\n",
"md(result['answer'])"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"id": "CelLHIvoy2Ke"
},
"outputs": [
{
"data": {
"text/markdown": [
"According to the text, the two sizes of Llama 3 are 8B and 70B parameters."
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Passing the previous question and answer as chat history lets the chain\n",
"# resolve follow-up questions such as the one below.\n",
"chat_history = [(query, result[\"answer\"])]\n",
"query = \"What two sizes?\"\n",
"result = chain.invoke({\"question\": query, \"chat_history\": chat_history})\n",
"md(result['answer'])"
]
},
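{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why does the follow-up \"What two sizes?\" work? Roughly, `ConversationalRetrievalChain` first asks the LLM to rewrite the follow-up, together with the chat history, into a standalone question, and then retrieves documents for that rewritten question. The sketch below only illustrates how such a prompt could be assembled; `format_chat_history`, `build_condense_prompt` and the template wording are hypothetical and are not LangChain's actual implementation.\n",
"\n",
"```python\n",
"# Hypothetical template; LangChain ships its own wording for this step.\n",
"CONDENSE_TEMPLATE = '''Given the conversation below, rephrase the follow-up question\n",
"as a standalone question.\n",
"\n",
"Chat history:\n",
"{chat_history}\n",
"\n",
"Follow-up question: {question}\n",
"Standalone question:'''\n",
"\n",
"def format_chat_history(chat_history):\n",
"    # chat_history is a list of (question, answer) tuples, as in the cells above.\n",
"    lines = []\n",
"    for question, answer in chat_history:\n",
"        lines.append('Human: ' + question)\n",
"        lines.append('Assistant: ' + answer)\n",
"    return '\\n'.join(lines)\n",
"\n",
"def build_condense_prompt(question, chat_history):\n",
"    # Fill the template with the flattened history and the new question.\n",
"    return CONDENSE_TEMPLATE.format(\n",
"        chat_history=format_chat_history(chat_history),\n",
"        question=question,\n",
"    )\n",
"\n",
"history = [('What is new with Llama 3?', 'Llama 3 comes in two sizes, 8B and 70B.')]\n",
"print(build_condense_prompt('What two sizes?', history))\n",
"```\n",
"\n",
"With an empty history there is nothing to condense, which is why the first query above passes `chat_history` as an empty list."
]
},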
{
"cell_type": "markdown",
"metadata": {
"id": "TEvefAWIJONx"
},
"source": [
"## **5 - Fine-Tuning Models**\n",
"\n",
"* Limitations of Prompt Engineering and RAG\n",
"* Fine-Tuning Architecture\n",
"* Types (PEFT, LoRA, QLoRA)\n",
"* Using PyTorch for Pre-Training & Fine-Tuning\n",
"* Evals + Quality\n",
"\n",
"Examples of Fine-Tuning:\n",
"* [Meta Llama Recipes](https://github.com/meta-llama/llama-recipes/tree/main/recipes/finetuning)\n",
"* [Hugging Face fine-tuning with Llama 3](https://huggingface.co/blog/llama3#fine-tuning-with-%F0%9F%A4%97-trl)\n"
]
},
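{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the LoRA idea listed above, the sketch below uses NumPy to add a trainable low-rank update `(alpha / r) * B @ A` to a frozen weight matrix `W`. The dimensions, `r`, `alpha` and the helper `lora_forward` are assumptions chosen for the example; real fine-tuning would use a library such as the ones linked above rather than this toy code.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"d_out, d_in, r, alpha = 64, 128, 4, 8  # illustrative sizes, not Llama 3 values\n",
"\n",
"W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight\n",
"A = rng.normal(scale=0.01, size=(r, d_in))  # trainable, small random init\n",
"B = np.zeros((d_out, r))                    # trainable, zero init\n",
"\n",
"def lora_forward(x):\n",
"    # Base projection plus the scaled low-rank update (alpha / r) * B @ A.\n",
"    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T\n",
"\n",
"x = rng.normal(size=(2, d_in))\n",
"full_params = W.size           # parameters updated by full fine-tuning\n",
"lora_params = A.size + B.size  # parameters updated by LoRA\n",
"print(full_params, lora_params)  # 8192 768\n",
"```\n",
"\n",
"Because `B` starts at zero, the adapted layer initially produces exactly the same output as the frozen layer, and training only has to update the much smaller `A` and `B` matrices."
]
},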
{
"cell_type": "markdown",
"metadata": {
"id": "_8lcgdZa8onC"
},
"source": [
"## **6 - Responsible AI**\n",
"\n",
"* Power + Responsibility\n",
"* Hallucinations\n",
"* Input & Output Safety\n",
"* Red-teaming (simulating real-world cyber attackers)\n",
"* [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/)\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pbqb006R-T_k"
},
"source": [
"## **7 - Conclusion**\n",
"* Active research on LLMs and Llama\n",
"* Leverage the power of Llama and its open community\n",
"* Safety and responsible use are paramount!\n",
"\n",
"* Call-To-Action\n",
"  * [Replicate Free Credits](https://replicate.fyi/connect2023) for Connect attendees!\n",
"  * This notebook is available through Llama GitHub recipes\n",
"  * Use Llama in your projects and give us feedback\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gSz5dTMxp7xo"
},
"source": [
"#### **Resources**\n",
"- [Meta Llama 3 Blog](https://ai.meta.com/blog/meta-llama-3/)\n",
"- [Getting Started with Meta Llama](https://llama.meta.com/docs/get-started)\n",
"- [Llama 3 repo](https://github.com/meta-llama/llama3)\n",
"- [Llama 3 model card](https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md)\n",
"- [Llama 3 Recipes repo](https://github.com/meta-llama/llama-recipes)\n",
"- [Responsible Use Guide](https://ai.meta.com/llama/responsible-use-guide/)\n",
"- [Acceptable Use Policy](https://ai.meta.com/llama/use-policy/)\n",
"\n"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"ioVMNcTesSEk"
],
"machine_shape": "hm",
"provenance": [],
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}