Since we are using Replicate in this example, you will need to replace `<your replicate api token>` with your API token.
To get a Replicate API token, sign up at replicate.com and copy the token from your account's API tokens page.
Note: after the free trial ends, you will need to enter billing info to continue to use Llama 2 hosted on Replicate.
To run this example, install the required packages and replace `<your replicate api token>` in the code below with your token.
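A minimal setup sketch; the package names are inferred from the imports in the code below:

```shell
# Install the libraries used in this example
pip install langchain replicate gradio

# As an alternative to editing the token into the code,
# export it in your shell before launching the notebook
export REPLICATE_API_TOKEN=<your replicate api token>
```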
In the notebook, or in a browser at http://127.0.0.1:7860, you should see a chat UI where you can enter your question and see the model's answer.
import os

import gradio as gr
from langchain.llms import Replicate

os.environ["REPLICATE_API_TOKEN"] = "<your replicate api token>"

# Llama 2 13B chat model hosted on Replicate, pinned to a specific version
llama2_13b_chat = "meta/llama-2-13b-chat:f4e2de70d66816a838a89eeeb621910adffb0dd0baba3976c96980970978018d"

llm = Replicate(
    model=llama2_13b_chat,
    model_kwargs={"temperature": 0.01, "top_p": 1, "max_new_tokens": 500},
)

def predict(message, history):
    # The Replicate LLM wrapper takes a plain string prompt and returns a
    # string, so we send only the latest message; history is not used here.
    return llm(message)

gr.ChatInterface(predict).launch()
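Gradio passes the prior turns to `predict` as `history`, but the code above drops them, so the model sees each question in isolation. One way to keep multi-turn context with an LLM wrapper that only accepts strings is to fold the history into the prompt; a minimal sketch (the `format_prompt` helper and its User/Assistant layout are our own, not part of LangChain):

```python
def format_prompt(history, message):
    # Fold prior (human, ai) turns into one plain-text prompt, since the
    # Replicate LLM wrapper accepts a string rather than chat messages.
    lines = []
    for human, ai in history:
        lines.append(f"User: {human}")
        lines.append(f"Assistant: {ai}")
    lines.append(f"User: {message}")
    lines.append("Assistant:")
    return "\n".join(lines)

print(format_prompt([("Hi", "Hello! How can I help?")], "What is Llama 2?"))
```

In `predict`, you would then call `llm(format_prompt(history, message))` instead of `llm(message)`.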
When you launch the app, Gradio prints "Running on local URL: http://127.0.0.1:7860". To create a public link, set `share=True` in `launch()`.