LangChain Tutorial in Python - Crash Course
In this LangChain Crash Course you will learn how to build applications powered by large language models.
LangChain is a framework for developing applications powered by language models. In this LangChain Crash Course you will learn how to build applications powered by large language models. We go over all important features of this framework.
Overview:¶
- Installation
- LLMs
- Prompt Templates
- Chains
- Agents and Tools
- Memory
- Document Loaders
- Indexes
#more
Try out all the code in this Google Colab.
Installation¶
pip install langchain
LLMs¶
LangChain provides a generic interface for many different LLMs. Most of them work via their API but you can also run local models.
See all LLM providers.
pip install openai
import os
os.environ["OPENAI_API_KEY"] ="YOUR_OPENAI_TOKEN"
from langchain.llms import OpenAI
llm = OpenAI(temperature=0.9) # model_name="text-davinci-003"
text = "What would be a good company name for a company that makes colorful socks?"
print(llm(text))
pip install huggingface_hub
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN"
from langchain import HuggingFaceHub
# https://huggingface.co/google/flan-t5-xl
llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature":0, "max_length":64})
llm("translate English to German: How old are you?")
Prompt Templates¶
LangChain faciliates prompt management and optimization.
Normally, when you use an LLM in an application, you are not sending user input directly to the LLM. Instead, you need to take the user input and construct a prompt, and only then send that to the LLM.
llm("Can Barack Obama have a conversation with George Washington?")
A better prompt is this:
prompt = """Question: Can Barack Obama have a conversation with George Washington?
Let's think step by step.
Answer: """
llm(prompt)
This can be achieved with PromptTemplates
:
from langchain import PromptTemplate
template = """Question: {question}
Let's think step by step.
Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])
prompt.format(question="Can Barack Obama have a conversation with George Washington?")
Chains¶
Combine LLMs and Prompts in multi-step workflows.
from langchain import LLMChain
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "Can Barack Obama have a conversation with George Washington?"
print(llm_chain.run(question))
Agents and Tools¶
Agents involve an LLM making decisions about which cctions to take, taking that cction, seeing an observation, and repeating that until done.
When used correctly agents can be extremely powerful. In order to load agents, you should understand the following concepts:
- Tool: A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. See available Tools.
- LLM: The language model powering the agent.
- Agent: The agent to use. See also Agent Types.
from langchain.agents import load_tools
from langchain.agents import initialize_agent
pip install wikipedia
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("In what year was the film Departed with Leopnardo Dicaprio released? What is this year raised to the 0.43 power?")
Memory¶
Add state to Chains and Agents.
Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
from langchain import OpenAI, ConversationChain
llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)
conversation.predict(input="Hi there!")
conversation.predict(input="Can we talk about AI?")
conversation.predict(input="I'm interested in Reinforcement Learning.")
Document Loaders¶
Combining language models with your own text data is a powerful way to differentiate them. The first step in doing this is to load the data into documents (i.e., some pieces of text). This module is aimed at making this easy.
See all available Document Loaders.
from langchain.document_loaders import NotionDirectoryLoader
loader = NotionDirectoryLoader("Notion_DB")
docs = loader.load()
Indexes¶
Indexes refer to ways to structure documents so that LLMs can best interact with them. This module contains utility functions for working with documents
- Embeddings: An embedding is a numerical representation of a piece of information, for example, text, documents, images, audio, etc.
- Text Splitters: When you want to deal with long pieces of text, it is necessary to split up that text into chunks.
- Vectorstores: Vector databases store and index vector embeddings from NLP models to understand the meaning and context of strings of text, sentences, and whole documents for more accurate and relevant search results. See available vectorstores.
import requests
url = "https://raw.githubusercontent.com/hwchase17/langchain/master/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
f.write(res.text)
# Document Loader
from langchain.document_loaders import TextLoader
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
# Text Splitter
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
pip install sentence_transformers
# Embeddings
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()
#text = "This is a test document."
#query_result = embeddings.embed_query(text)
#doc_result = embeddings.embed_documents([text])
pip install faiss-cpu
from langchain.vectorstores import FAISS
db = FAISS.from_documents(docs, embeddings)
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].page_content)
# Save and load:
db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
print(docs[0].page_content)
End-to-end example¶
Check out the https://github.com/hwchase17/chat-langchain repo.
FREE VS Code / PyCharm Extensions I Use
✅ Write cleaner code with Sourcery, instant refactoring suggestions: Link*
Python Problem-Solving Bootcamp
🚀 Solve 42 programming puzzles over the course of 21 days: Link*