LangChain Tutorial in Python - Crash Course

In this LangChain Crash Course you will learn how to build applications powered by large language models.

LangChain is a framework for developing applications powered by language models. This crash course covers the framework's most important features.

Overview:¶

Installation
LLMs
Prompt Templates
Chains
Agents and Tools
Memory
Document Loaders
Indexes

Try out all the code in this Google Colab.

Installation¶

pip install langchain

LLMs¶

LangChain provides a generic interface for many different LLMs. Most of them work via their API but you can also run local models.

See all LLM providers.

pip install openai

import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_TOKEN"

from langchain.llms import OpenAI

llm = OpenAI(temperature=0.9)  # model_name="text-davinci-003"
text = "What would be a good company name for a company that makes colorful socks?"
print(llm(text))

pip install huggingface_hub

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_HF_TOKEN"

from langchain import HuggingFaceHub

# https://huggingface.co/google/flan-t5-xl
llm = HuggingFaceHub(repo_id="google/flan-t5-xl", model_kwargs={"temperature":0, "max_length":64})

llm("translate English to German: How old are you?")

Prompt Templates¶

LangChain facilitates prompt management and optimization.

Normally, when you use an LLM in an application, you are not sending user input directly to the LLM. Instead, you need to take the user input and construct a prompt, and only then send that to the LLM.

llm("Can Barack Obama have a conversation with George Washington?")

A better prompt is this:

prompt = """Question: Can Barack Obama have a conversation with George Washington?

Let's think step by step.

Answer: """
llm(prompt)

This can be achieved with PromptTemplates:

from langchain import PromptTemplate

template = """Question: {question}

Let's think step by step.

Answer: """

prompt = PromptTemplate(template=template, input_variables=["question"])

prompt.format(question="Can Barack Obama have a conversation with George Washington?")
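
To get a feel for what the template does, here is a minimal sketch that fills the same placeholder with Python's built-in `str.format`. It is not LangChain's implementation, and `render_prompt` is a hypothetical helper name used only for illustration.

```python
# Stand-in for a prompt template, using plain string formatting.
TEMPLATE = """Question: {question}

Let's think step by step.

Answer: """


def render_prompt(template: str, **variables: str) -> str:
    """Fill the named placeholders in the template (hypothetical helper)."""
    return template.format(**variables)


filled = render_prompt(
    TEMPLATE,
    question="Can Barack Obama have a conversation with George Washington?",
)
print(filled)
```

The benefit of a template is that the fixed instructions live in one place and only the user input changes between calls.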

Chains¶

Combine LLMs and Prompts in multi-step workflows.

from langchain import LLMChain

llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "Can Barack Obama have a conversation with George Washington?"

print(llm_chain.run(question))
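
Conceptually, a simple chain just formats the prompt and then passes it to the model. The sketch below shows that idea with plain functions. It assumes a stand-in model (`fake_llm`) so that it runs without an API key, and `make_chain` is a hypothetical name, not a LangChain function.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: reports the prompt length."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[str], str]:
    """Return a function that formats the prompt, then calls the model."""

    def run(question: str) -> str:
        prompt = template.format(question=question)  # step 1: build the prompt
        return llm(prompt)                           # step 2: call the model

    return run


template = "Question: {question}\n\nLet's think step by step.\n\nAnswer: "
chain = make_chain(template, fake_llm)
reply = chain("Can Barack Obama have a conversation with George Washington?")
print(reply)
```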

Agents and Tools¶

Agents involve an LLM making decisions about which actions to take, taking that action, seeing an observation, and repeating that until done.

When used correctly agents can be extremely powerful. In order to load agents, you should understand the following concepts:

Tool: A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. See available Tools.
LLM: The language model powering the agent.
Agent: The agent to use. See also Agent Types.

from langchain.agents import load_tools
from langchain.agents import initialize_agent

pip install wikipedia

from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

agent.run("In what year was the film The Departed with Leonardo DiCaprio released? What is this year raised to the 0.43 power?")
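
The decide, act, observe loop described above can be sketched without any LLM at all. In the toy version below, a scripted function (`scripted_policy`) plays the role of the model, and the two tools are hard-coded stand-ins for Wikipedia and the math tool. All names and the lookup data are assumptions made for illustration, not LangChain's agent implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hard-coded toy data standing in for a real search tool.
FACTS = {"release year of The Departed": "2006"}


def lookup(query: str) -> str:
    """Toy search tool: returns a stored fact or 'unknown'."""
    return FACTS.get(query, "unknown")


def power(expression: str) -> str:
    """Toy calculator: evaluates 'base^exponent', rounded to two decimals."""
    base, exponent = expression.split("^")
    return str(round(float(base) ** float(exponent), 2))


TOOLS: Dict[str, Callable[[str], str]] = {"lookup": lookup, "power": power}
Step = Tuple[str, str, str]  # (action, action input, observation)


def scripted_policy(history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the LLM: picks the next action from past observations."""
    if not history:
        return "lookup", "release year of The Departed"
    if len(history) == 1:
        year = history[0][2]
        return "power", f"{year}^0.43"
    return "finish", history[-1][2]


def run_agent(max_steps: int = 5) -> Tuple[str, List[Step]]:
    """Repeat decide -> act -> observe until the policy says 'finish'."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, action_input = scripted_policy(history)
        if action == "finish":
            return action_input, history
        observation = TOOLS[action](action_input)
        history.append((action, action_input, observation))
    return "stopped: step limit reached", history


answer, trace = run_agent()
for step in trace:
    print(step)
print("Final answer:", answer)
```

A real agent replaces `scripted_policy` with an LLM call that reads the tool descriptions and the history, which is why clear tool descriptions matter.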

Memory¶

Add state to Chains and Agents.

Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.

from langchain import OpenAI, ConversationChain

llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)

conversation.predict(input="Hi there!")

conversation.predict(input="Can we talk about AI?")

conversation.predict(input="I'm interested in Reinforcement Learning.")
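
One simple way to persist state between calls is to keep a buffer of earlier turns and prepend it to every new prompt. The sketch below illustrates that idea only. `BufferedConversation` and `counting_llm` are hypothetical names, and the stand-in model merely counts the human turns it can see.

```python
from typing import Callable, List


class BufferedConversation:
    """Toy conversation wrapper that remembers every turn (hypothetical class)."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[str] = []

    def predict(self, user_input: str) -> str:
        # Prepend all earlier turns so the model sees the whole conversation.
        prompt = "\n".join(self.history + [f"Human: {user_input}", "AI:"])
        reply = self.llm(prompt)
        # Store both sides of the exchange for the next call.
        self.history.append(f"Human: {user_input}")
        self.history.append(f"AI: {reply}")
        return reply


def counting_llm(prompt: str) -> str:
    """Stand-in model: reports how many human turns appear in the prompt."""
    return f"I can see {prompt.count('Human:')} human turn(s)."


chat = BufferedConversation(counting_llm)
first = chat.predict("Hi there!")
second = chat.predict("Can we talk about AI?")
print(first)
print(second)
```

The second reply sees both human turns because the first exchange was stored and replayed in the prompt.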

Document Loaders¶

Combining language models with your own text data is a powerful way to differentiate them. The first step in doing this is to load the data into documents (i.e., some pieces of text). This module is aimed at making this easy.

See all available Document Loaders.

from langchain.document_loaders import NotionDirectoryLoader

loader = NotionDirectoryLoader("Notion_DB")

docs = loader.load()
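
If you are curious what a loader roughly does, here is a small sketch that reads every Markdown file in a folder into a list of document objects. It assumes a toy `Document` dataclass with `page_content` and `metadata` fields and a hypothetical `load_directory` helper. It is not the NotionDirectoryLoader implementation.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class Document:
    """Toy document: a piece of text plus metadata about where it came from."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def load_directory(path: str, pattern: str = "*.md") -> List[Document]:
    """Read every matching file in the directory into a Document."""
    docs = []
    for file in sorted(Path(path).glob(pattern)):
        text = file.read_text(encoding="utf-8")
        docs.append(Document(text, {"source": str(file)}))
    return docs


# Create two throwaway files so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.md").write_text("# Notes\nFirst page.", encoding="utf-8")
    (Path(tmp) / "b.md").write_text("# Todo\nSecond page.", encoding="utf-8")
    loaded = load_directory(tmp)

print(len(loaded), "documents loaded")
```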

Indexes¶

Indexes refer to ways to structure documents so that LLMs can best interact with them. This module contains utility functions for working with documents:

Embeddings: An embedding is a numerical representation of a piece of information, for example, text, documents, images, audio, etc.
Text Splitters: When you want to deal with long pieces of text, it is necessary to split up that text into chunks.
Vectorstores: Vector databases store and index vector embeddings from NLP models to understand the meaning and context of strings of text, sentences, and whole documents for more accurate and relevant search results. See available vectorstores.

import requests

url = "https://raw.githubusercontent.com/hwchase17/langchain/master/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)

# Document Loader
from langchain.document_loaders import TextLoader
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()

# Text Splitter
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
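
To make `chunk_size` and `chunk_overlap` more concrete, here is a simplified sliding-window splitter. This is only a sketch of the idea. The splitter used above may behave differently (for example by splitting on separators first), and `split_text` is a hypothetical helper name.

```python
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Cut text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Overlap helps when a sentence would otherwise be cut in half at a chunk boundary, at the cost of storing some text twice.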

pip install sentence_transformers

# Embeddings
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()

#text = "This is a test document."
#query_result = embeddings.embed_query(text)
#doc_result = embeddings.embed_documents([text])

pip install faiss-cpu

from langchain.vectorstores import FAISS

db = FAISS.from_documents(docs, embeddings)

query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)

print(docs[0].page_content)
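
Under the hood, a similarity search compares the query embedding with every stored embedding and returns the closest texts. The sketch below uses word counts as a toy embedding and cosine similarity as the score. The corpus sentences are made up for illustration, and real embeddings are dense vectors produced by a model, so treat this as an illustration of the ranking step only.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: str, texts: List[str], k: int = 1) -> List[str]:
    """Return the k texts most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(query_vec, embed(t)), reverse=True)
    return ranked[:k]


# Made-up example sentences, not taken from the speech.
corpus = [
    "the president nominated a judge to the supreme court",
    "the economy added jobs last year",
    "we will lower the cost of prescription drugs",
]
best = similarity_search("who was nominated to the supreme court", corpus)
print(best[0])
```

A vector store such as FAISS does the same ranking, but with index structures that avoid comparing the query against every single vector.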

# Save and load:
db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
print(docs[0].page_content)

End-to-end example¶

Check out the https://github.com/hwchase17/chat-langchain repo.
