---
how-tos/graph-api.md
---
# How to use the graph API
This guide demonstrates the basics of LangGraph's Graph API. It walks through [state](#define-and-update-state), as well as composing common graph structures such as [sequences](#create-a-sequence-of-steps), [branches](#create-branches), and [loops](#create-and-control-loops). It also covers LangGraph's control features, including the [Send API](#map-reduce-and-the-send-api) for map-reduce workflows and the [Command API](#combine-control-flow-and-state-updates-with-command) for combining state updates with "hops" across nodes.
## Setup
Install `langgraph`:
```bash
pip install -U langgraph
```
!!! tip "Set up LangSmith for better debugging"
Sign up for [LangSmith](https://smith.langchain.com) to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started in the [docs](https://docs.smith.langchain.com).
## Define and update state
Here we show how to define and update [state](../concepts/low_level.md#state) in LangGraph. We will demonstrate:
1. How to use state to define a graph's [schema](../concepts/low_level.md#schema)
2. How to use [reducers](../concepts/low_level.md#reducers) to control how state updates are processed.
### Define state
[State](../concepts/low_level.md#state) in LangGraph can be a `TypedDict`, `Pydantic` model, or dataclass. Below we will use `TypedDict`. See [this section](#use-pydantic-models-for-graph-state) for detail on using Pydantic.
By default, graphs will have the same input and output schema, and the state determines that schema. See [this section](#define-input-and-output-schemas) for how to define distinct input and output schemas.
Let's consider a simple example using [messages](../concepts/low_level.md#messagesstate). This represents a versatile formulation of state for many LLM applications. See our [concepts page](../concepts/low_level.md#working-with-messages-in-graph-state) for more detail.
```python
from langchain_core.messages import AnyMessage
from typing_extensions import TypedDict
class State(TypedDict):
messages: list[AnyMessage]
extra_field: int
```
This state tracks a list of [message](https://python.langchain.com/docs/concepts/messages/) objects, as well as an extra integer field.
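Since state can also be a dataclass, here is a minimal sketch of an equivalent schema; note that with a dataclass (as with Pydantic models), nodes typically access fields as attributes (e.g., `state.messages`) rather than dictionary keys:
```python
from dataclasses import dataclass, field

from langchain_core.messages import AnyMessage


@dataclass
class DataclassState:
    # Illustrative dataclass equivalent of the TypedDict state above
    messages: list[AnyMessage] = field(default_factory=list)
    extra_field: int = 0
```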
### Update state
Let's build an example graph with a single node. Our [node](../concepts/low_level.md#nodes) is just a Python function that reads our graph's state and makes updates to it. The first argument to this function will always be the state:
```python
from langchain_core.messages import AIMessage
def node(state: State):
messages = state["messages"]
new_message = AIMessage("Hello!")
return {"messages": messages + [new_message], "extra_field": 10}
```
This node simply appends a message to our message list, and populates an extra field.
!!! important
Nodes should return updates to the state directly, instead of mutating the state.
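To make the distinction concrete, here is a minimal sketch contrasting in-place mutation with returning an update, using the `State` and `AIMessage` defined above (`bad_node` and `good_node` are illustrative names):
```python
def bad_node(state: State):
    # Anti-pattern: mutates the incoming state in place.
    state["messages"].append(AIMessage("Hello!"))
    return state


def good_node(state: State):
    # Preferred: return only the updates; LangGraph applies them to the state.
    return {"messages": state["messages"] + [AIMessage("Hello!")], "extra_field": 10}
```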
Let's next define a simple graph containing this node. We use [StateGraph](../concepts/low_level.md#stategraph) to define a graph that operates on this state. We then use [add_node](../concepts/low_level.md#nodes) to populate our graph.
```python
from langgraph.graph import StateGraph
builder = StateGraph(State)
builder.add_node(node)
builder.set_entry_point("node")
graph = builder.compile()
```
LangGraph provides built-in utilities for visualizing your graph. Let's inspect our graph. See [this section](#visualize-your-graph) for detail on visualization.
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

In this case, our graph just executes a single node. Let's proceed with a simple invocation:
```python
from langchain_core.messages import HumanMessage
result = graph.invoke({"messages": [HumanMessage("Hi")]})
result
```
```
{'messages': [HumanMessage(content='Hi'), AIMessage(content='Hello!')], 'extra_field': 10}
```
Note that:
- We kicked off invocation by updating a single key of the state.
- We receive the entire state in the invocation result.
For convenience, we frequently inspect the content of [message objects](https://python.langchain.com/docs/concepts/messages/) via pretty-print:
```python
for message in result["messages"]:
message.pretty_print()
```
```
================================ Human Message ================================
Hi
================================== Ai Message ==================================
Hello!
```
### Process state updates with reducers
Each key in the state can have its own independent [reducer](../concepts/low_level.md#reducers) function, which controls how updates from nodes are applied. If no reducer function is explicitly specified then it is assumed that all updates to the key should override it.
For `TypedDict` state schemas, we can define reducers by annotating the corresponding field of the state with a reducer function.
In the earlier example, our node updated the `"messages"` key in the state by appending a message to it. Below, we add a reducer to this key, such that updates are automatically appended:
```python hl_lines="8"
from typing_extensions import Annotated
def add(left, right):
"""Can also import `add` from the `operator` built-in."""
return left + right
class State(TypedDict):
messages: Annotated[list[AnyMessage], add]
extra_field: int
```
Now our node can be simplified:
```python hl_lines="3"
def node(state: State):
new_message = AIMessage("Hello!")
return {"messages": [new_message], "extra_field": 10}
```
```python
from langgraph.graph import START
graph = StateGraph(State).add_node(node).add_edge(START, "node").compile()
result = graph.invoke({"messages": [HumanMessage("Hi")]})
for message in result["messages"]:
message.pretty_print()
```
```
================================ Human Message ================================
Hi
================================== Ai Message ==================================
Hello!
```
#### MessagesState
In practice, there are additional considerations for updating lists of messages:
- We may wish to update an existing message in the state.
- We may want to accept short-hands for [message formats](../concepts/low_level.md#using-messages-in-your-graph), such as [OpenAI format](https://python.langchain.com/docs/concepts/messages.md#openai-format).
LangGraph includes a built-in reducer `add_messages` that handles these considerations:
```python hl_lines="4"
from langgraph.graph.message import add_messages
class State(TypedDict):
messages: Annotated[list[AnyMessage], add_messages]
extra_field: int
def node(state: State):
new_message = AIMessage("Hello!")
return {"messages": [new_message], "extra_field": 10}
graph = StateGraph(State).add_node(node).set_entry_point("node").compile()
```
```python hl_lines="1"
input_message = {"role": "user", "content": "Hi"}
result = graph.invoke({"messages": [input_message]})
for message in result["messages"]:
message.pretty_print()
```
```
================================ Human Message ================================
Hi
================================== Ai Message ==================================
Hello!
```
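Because `add_messages` matches messages by ID, an update that reuses an existing ID replaces that message instead of appending a duplicate. A minimal sketch calling the reducer directly:
```python
from langchain_core.messages import AIMessage
from langgraph.graph.message import add_messages

existing = [AIMessage(content="Draft reply", id="msg-1")]
update = [AIMessage(content="Final reply", id="msg-1")]

# The message with id "msg-1" is replaced rather than duplicated.
print(add_messages(existing, update))
```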
This is a versatile representation of state for applications involving [chat models](https://python.langchain.com/docs/concepts/chat_models/). LangGraph includes a pre-built `MessagesState` for convenience, so that we can have:
```python
from langgraph.graph import MessagesState
class State(MessagesState):
extra_field: int
```
### Define input and output schemas
By default, `StateGraph` operates with a single schema, and all nodes are expected to communicate using that schema. However, it's also possible to define distinct input and output schemas for a graph.
When distinct schemas are specified, an internal schema will still be used for communication between nodes. The input schema ensures that the provided input matches the expected structure, while the output schema filters the internal data to return only the relevant information according to the defined output schema.
Below, we'll see how to define distinct input and output schemas.
```python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict
# Define the schema for the input
class InputState(TypedDict):
question: str
# Define the schema for the output
class OutputState(TypedDict):
answer: str
# Define the overall schema, combining both input and output
class OverallState(InputState, OutputState):
pass
# Define the node that processes the input and generates an answer
def answer_node(state: InputState):
# Example answer and an extra key
return {"answer": "bye", "question": state["question"]}
# Build the graph with input and output schemas specified
builder = StateGraph(OverallState, input_schema=InputState, output_schema=OutputState)
builder.add_node(answer_node) # Add the answer node
builder.add_edge(START, "answer_node") # Define the starting edge
builder.add_edge("answer_node", END) # Define the ending edge
graph = builder.compile() # Compile the graph
# Invoke the graph with an input and print the result
print(graph.invoke({"question": "hi"}))
```
```
{'answer': 'bye'}
```
Notice that the output of invoke only includes the output schema.
### Pass private state between nodes
In some cases, you may want nodes to exchange information that is crucial for intermediate logic but doesn't need to be part of the main schema of the graph. This private data is not relevant to the overall input/output of the graph and should only be shared between certain nodes.
Below, we'll create an example sequential graph consisting of three nodes (node_1, node_2 and node_3), where private data is passed between the first two steps (node_1 and node_2), while the third step (node_3) only has access to the public overall state.
```python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict
# The overall state of the graph (this is the public state shared across nodes)
class OverallState(TypedDict):
a: str
# Output from node_1 contains private data that is not part of the overall state
class Node1Output(TypedDict):
private_data: str
# The private data is only shared between node_1 and node_2
def node_1(state: OverallState) -> Node1Output:
output = {"private_data": "set by node_1"}
print(f"Entered node `node_1`:\n\tInput: {state}.\n\tReturned: {output}")
return output
# Node 2 input only requests the private data available after node_1
class Node2Input(TypedDict):
private_data: str
def node_2(state: Node2Input) -> OverallState:
output = {"a": "set by node_2"}
print(f"Entered node `node_2`:\n\tInput: {state}.\n\tReturned: {output}")
return output
# Node 3 only has access to the overall state (no access to private data from node_1)
def node_3(state: OverallState) -> OverallState:
output = {"a": "set by node_3"}
print(f"Entered node `node_3`:\n\tInput: {state}.\n\tReturned: {output}")
return output
# Connect nodes in a sequence
# node_2 accepts private data from node_1, whereas
# node_3 does not see the private data.
builder = StateGraph(OverallState).add_sequence([node_1, node_2, node_3])
builder.add_edge(START, "node_1")
graph = builder.compile()
# Invoke the graph with the initial state
response = graph.invoke(
{
"a": "set at start",
}
)
print()
print(f"Output of graph invocation: {response}")
```
```
Entered node `node_1`:
Input: {'a': 'set at start'}.
Returned: {'private_data': 'set by node_1'}
Entered node `node_2`:
Input: {'private_data': 'set by node_1'}.
Returned: {'a': 'set by node_2'}
Entered node `node_3`:
Input: {'a': 'set by node_2'}.
Returned: {'a': 'set by node_3'}
Output of graph invocation: {'a': 'set by node_3'}
```
### Use Pydantic models for graph state
A [StateGraph](https://langchain-ai.github.io/langgraph/reference/graphs.md#langgraph.graph.StateGraph) accepts a `state_schema` argument on initialization that specifies the "shape" of the state that the nodes in the graph can access and update.
In our examples, we typically use a python-native `TypedDict` for `state_schema`, but `state_schema` can be any [type](https://docs.python.org/3/library/stdtypes.html#type-objects).
Here, we'll see how a [Pydantic BaseModel](https://docs.pydantic.dev/latest/api/base_model/) can be used for `state_schema` to add run-time validation on **inputs**.
!!! note "Known Limitations"
- Currently, the output of the graph will **NOT** be an instance of a pydantic model.
- Run-time validation only occurs on inputs into nodes, not on the outputs.
- The validation error trace from pydantic does not show which node the error arises in.
```python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict
from pydantic import BaseModel
# The overall state of the graph (this is the public state shared across nodes)
class OverallState(BaseModel):
a: str
def node(state: OverallState):
return {"a": "goodbye"}
# Build the state graph
builder = StateGraph(OverallState)
builder.add_node(node)  # Add the node to the graph
builder.add_edge(START, "node")  # Start the graph with the node
builder.add_edge("node", END)  # End the graph after the node
graph = builder.compile()
# Test the graph with a valid input
graph.invoke({"a": "hello"})
```
Invoke the graph with an **invalid** input
```python
try:
graph.invoke({"a": 123}) # Should be a string
except Exception as e:
print("An exception was raised because `a` is an integer rather than a string.")
print(e)
```
```
An exception was raised because `a` is an integer rather than a string.
1 validation error for OverallState
a
Input should be a valid string [type=string_type, input_value=123, input_type=int]
For further information visit https://errors.pydantic.dev/2.9/v/string_type
```
See below for additional features of Pydantic model state:
??? example "Serialization Behavior"
When using Pydantic models as state schemas, it's important to understand how serialization works, especially when:
- Passing Pydantic objects as inputs
- Receiving outputs from the graph
- Working with nested Pydantic models
Let's see these behaviors in action.
```python
from langgraph.graph import StateGraph, START, END
from pydantic import BaseModel
class NestedModel(BaseModel):
value: str
class ComplexState(BaseModel):
text: str
count: int
nested: NestedModel
def process_node(state: ComplexState):
# Node receives a validated Pydantic object
print(f"Input state type: {type(state)}")
print(f"Nested type: {type(state.nested)}")
# Return a dictionary update
return {"text": state.text + " processed", "count": state.count + 1}
# Build the graph
builder = StateGraph(ComplexState)
builder.add_node("process", process_node)
builder.add_edge(START, "process")
builder.add_edge("process", END)
graph = builder.compile()
# Create a Pydantic instance for input
input_state = ComplexState(text="hello", count=0, nested=NestedModel(value="test"))
print(f"Input object type: {type(input_state)}")
# Invoke graph with a Pydantic instance
result = graph.invoke(input_state)
print(f"Output type: {type(result)}")
print(f"Output content: {result}")
# Convert back to Pydantic model if needed
output_model = ComplexState(**result)
print(f"Converted back to Pydantic: {type(output_model)}")
```
??? example "Runtime Type Coercion"
Pydantic performs runtime type coercion for certain data types. This can be helpful but also lead to unexpected behavior if you're not aware of it.
```python
from langgraph.graph import StateGraph, START, END
from pydantic import BaseModel
class CoercionExample(BaseModel):
# Pydantic will coerce string numbers to integers
number: int
# Pydantic will parse string booleans to bool
flag: bool
def inspect_node(state: CoercionExample):
print(f"number: {state.number} (type: {type(state.number)})")
print(f"flag: {state.flag} (type: {type(state.flag)})")
return {}
builder = StateGraph(CoercionExample)
builder.add_node("inspect", inspect_node)
builder.add_edge(START, "inspect")
builder.add_edge("inspect", END)
graph = builder.compile()
# Demonstrate coercion with string inputs that will be converted
result = graph.invoke({"number": "42", "flag": "true"})
# This would fail with a validation error
try:
graph.invoke({"number": "not-a-number", "flag": "true"})
except Exception as e:
print(f"\nExpected validation error: {e}")
```
??? example "Working with Message Models"
When working with LangChain message types in your state schema, there are important considerations for serialization. You should use `AnyMessage` (rather than `BaseMessage`) for proper serialization/deserialization when using message objects over the wire.
```python
from langgraph.graph import StateGraph, START, END
from pydantic import BaseModel
from langchain_core.messages import HumanMessage, AIMessage, AnyMessage
from typing import List
class ChatState(BaseModel):
messages: List[AnyMessage]
context: str
def add_message(state: ChatState):
return {"messages": state.messages + [AIMessage(content="Hello there!")]}
builder = StateGraph(ChatState)
builder.add_node("add_message", add_message)
builder.add_edge(START, "add_message")
builder.add_edge("add_message", END)
graph = builder.compile()
# Create input with a message
initial_state = ChatState(
messages=[HumanMessage(content="Hi")], context="Customer support chat"
)
result = graph.invoke(initial_state)
print(f"Output: {result}")
# Convert back to Pydantic model to see message types
output_model = ChatState(**result)
for i, msg in enumerate(output_model.messages):
print(f"Message {i}: {type(msg).__name__} - {msg.content}")
```
## Add runtime configuration
Sometimes you want to be able to configure your graph when calling it. For example, you might want to be able to specify what LLM or system prompt to use at runtime, *without polluting the graph state with these parameters*.
To add runtime configuration:
1. Specify a schema for your configuration
2. Add the configuration to the function signature for nodes or conditional edges
3. Pass the configuration into the graph.
See below for a simple example:
```python hl_lines="13 14 16 21 29 30"
from langchain_core.runnables import RunnableConfig
from langgraph.graph import END, StateGraph, START
from typing_extensions import TypedDict
# 1. Specify config schema
class ConfigSchema(TypedDict):
my_runtime_value: str
# 2. Define a graph that accesses the config in a node
class State(TypedDict):
    my_state_value: int
def node(state: State, config: RunnableConfig):
if config["configurable"]["my_runtime_value"] == "a":
return {"my_state_value": 1}
elif config["configurable"]["my_runtime_value"] == "b":
return {"my_state_value": 2}
else:
raise ValueError("Unknown values.")
builder = StateGraph(State, config_schema=ConfigSchema)
builder.add_node(node)
builder.add_edge(START, "node")
builder.add_edge("node", END)
graph = builder.compile()
# 3. Pass in configuration at runtime:
print(graph.invoke({}, {"configurable": {"my_runtime_value": "a"}}))
print(graph.invoke({}, {"configurable": {"my_runtime_value": "b"}}))
```
```
{'my_state_value': 1}
{'my_state_value': 2}
```
??? example "Extended example: specifying LLM at runtime"
Below we demonstrate a practical example in which we configure what LLM to use at runtime. We will use both OpenAI and Anthropic models.
```python
from langchain.chat_models import init_chat_model
from langchain_core.runnables import RunnableConfig
from langgraph.graph import MessagesState
from langgraph.graph import END, StateGraph, START
from typing_extensions import TypedDict
class ConfigSchema(TypedDict):
model: str
MODELS = {
"anthropic": init_chat_model("anthropic:claude-3-5-haiku-latest"),
"openai": init_chat_model("openai:gpt-4.1-mini"),
}
def call_model(state: MessagesState, config: RunnableConfig):
model = config["configurable"].get("model", "anthropic")
model = MODELS[model]
response = model.invoke(state["messages"])
return {"messages": [response]}
builder = StateGraph(MessagesState, config_schema=ConfigSchema)
builder.add_node("model", call_model)
builder.add_edge(START, "model")
builder.add_edge("model", END)
graph = builder.compile()
# Usage
input_message = {"role": "user", "content": "hi"}
# With no configuration, uses default (Anthropic)
response_1 = graph.invoke({"messages": [input_message]})["messages"][-1]
# Or, can set OpenAI
config = {"configurable": {"model": "openai"}}
response_2 = graph.invoke({"messages": [input_message]}, config=config)["messages"][-1]
print(response_1.response_metadata["model_name"])
print(response_2.response_metadata["model_name"])
```
```
claude-3-5-haiku-20241022
gpt-4.1-mini-2025-04-14
```
??? example "Extended example: specifying model and system message at runtime"
Below we demonstrate a practical example in which we configure two parameters: the LLM and system message to use at runtime.
```python
from typing import Optional
from langchain.chat_models import init_chat_model
from langchain_core.messages import SystemMessage
from langchain_core.runnables import RunnableConfig
from langgraph.graph import END, MessagesState, StateGraph, START
from typing_extensions import TypedDict
class ConfigSchema(TypedDict):
model: Optional[str]
system_message: Optional[str]
MODELS = {
"anthropic": init_chat_model("anthropic:claude-3-5-haiku-latest"),
"openai": init_chat_model("openai:gpt-4.1-mini"),
}
def call_model(state: MessagesState, config: RunnableConfig):
model = config["configurable"].get("model", "anthropic")
model = MODELS[model]
messages = state["messages"]
if system_message := config["configurable"].get("system_message"):
messages = [SystemMessage(system_message)] + messages
response = model.invoke(messages)
return {"messages": [response]}
builder = StateGraph(MessagesState, config_schema=ConfigSchema)
builder.add_node("model", call_model)
builder.add_edge(START, "model")
builder.add_edge("model", END)
graph = builder.compile()
# Usage
input_message = {"role": "user", "content": "hi"}
config = {"configurable": {"model": "openai", "system_message": "Respond in Italian."}}
response = graph.invoke({"messages": [input_message]}, config)
for message in response["messages"]:
message.pretty_print()
```
```
================================ Human Message ================================
hi
================================== Ai Message ==================================
Ciao! Come posso aiutarti oggi?
```
## Add retry policies
There are many use cases where you may wish for your node to have a custom retry policy, for example if you are calling an API, querying a database, or calling an LLM, etc. LangGraph lets you add retry policies to nodes.
To configure a retry policy, pass the `retry_policy` parameter to the [add_node](https://langchain-ai.github.io/langgraph/reference/graphs.md#langgraph.graph.state.StateGraph.add_node) function. The `retry_policy` parameter takes in a `RetryPolicy` named tuple object. Below we instantiate a `RetryPolicy` object with the default parameters and associate it with a node:
```python
from langgraph.pregel import RetryPolicy
builder.add_node(
"node_name",
node_function,
retry_policy=RetryPolicy(),
)
```
By default, the `retry_on` parameter uses the `default_retry_on` function, which retries on any exception except for the following:
* `ValueError`
* `TypeError`
* `ArithmeticError`
* `ImportError`
* `LookupError`
* `NameError`
* `SyntaxError`
* `RuntimeError`
* `ReferenceError`
* `StopIteration`
* `StopAsyncIteration`
* `OSError`
In addition, for exceptions from popular HTTP request libraries such as `requests` and `httpx`, it only retries on 5xx status codes.
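You can also tune the backoff behavior or supply your own predicate for `retry_on`. A minimal sketch (the `is_transient` predicate and the specific parameter values are illustrative; see the `RetryPolicy` reference for the exact fields and defaults):
```python
from langgraph.pregel import RetryPolicy


def is_transient(error: Exception) -> bool:
    # Illustrative predicate: only retry errors that look temporary.
    return "temporarily unavailable" in str(error).lower()


builder.add_node(
    "node_name",
    node_function,
    retry_policy=RetryPolicy(
        retry_on=is_transient,  # exception class(es) or a callable returning bool
        max_attempts=4,
        initial_interval=0.5,  # seconds to wait before the first retry
        backoff_factor=2.0,  # multiplier applied to the wait between attempts
    ),
)
```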
??? example "Extended example: customizing retry policies"
Consider an example in which we are reading from a SQL database. Below we pass two different retry policies to nodes:
```python
import sqlite3
from typing_extensions import TypedDict
from langchain.chat_models import init_chat_model
from langgraph.graph import END, MessagesState, StateGraph, START
from langgraph.pregel import RetryPolicy
from langchain_community.utilities import SQLDatabase
from langchain_core.messages import AIMessage
db = SQLDatabase.from_uri("sqlite:///:memory:")
model = init_chat_model("anthropic:claude-3-5-haiku-latest")
def query_database(state: MessagesState):
query_result = db.run("SELECT * FROM Artist LIMIT 10;")
return {"messages": [AIMessage(content=query_result)]}
def call_model(state: MessagesState):
response = model.invoke(state["messages"])
return {"messages": [response]}
# Define a new graph
builder = StateGraph(MessagesState)
builder.add_node(
"query_database",
query_database,
retry_policy=RetryPolicy(retry_on=sqlite3.OperationalError),
)
builder.add_node("model", call_model, retry_policy=RetryPolicy(max_attempts=5))
builder.add_edge(START, "model")
builder.add_edge("model", "query_database")
builder.add_edge("query_database", END)
graph = builder.compile()
```
## Add node caching
Node caching is useful in cases where you want to avoid repeating operations, like when doing something expensive (either in terms of time or cost). LangGraph lets you add individualized caching policies to nodes in a graph.
To configure a cache policy, pass the `cache_policy` parameter to the [add_node](https://langchain-ai.github.io/langgraph/reference/graphs.md#langgraph.graph.state.StateGraph.add_node) function. In the following example, a [`CachePolicy`](https://langchain-ai.github.io/langgraph/reference/types/?h=cachepolicy#langgraph.types.CachePolicy) object is instantiated with a time to live of 120 seconds and the default `key_func` generator. Then it is associated with a node:
```python
from langgraph.types import CachePolicy
builder.add_node(
"node_name",
node_function,
cache_policy=CachePolicy(ttl=120),
)
```
Then, to enable node-level caching for a graph, set the `cache` argument when compiling the graph. The example below uses `InMemoryCache` to set up a graph with in-memory cache, but `SqliteCache` is also available.
```python
from langgraph.cache.memory import InMemoryCache
graph = builder.compile(cache=InMemoryCache())
```
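As a minimal end-to-end sketch (the `expensive_node` below is illustrative), a second invocation with the same input within the TTL can be served from the cache instead of re-running the node:
```python
import time

from typing_extensions import TypedDict

from langgraph.cache.memory import InMemoryCache
from langgraph.graph import StateGraph, START, END
from langgraph.types import CachePolicy


class CacheState(TypedDict):
    x: int
    result: int


def expensive_node(state: CacheState):
    time.sleep(1)  # stand-in for a slow or costly computation
    return {"result": state["x"] * 2}


builder = StateGraph(CacheState)
builder.add_node("expensive_node", expensive_node, cache_policy=CachePolicy(ttl=120))
builder.add_edge(START, "expensive_node")
builder.add_edge("expensive_node", END)
graph = builder.compile(cache=InMemoryCache())

print(graph.invoke({"x": 5}))  # runs the node
print(graph.invoke({"x": 5}))  # same input within the TTL, so the cached result can be reused
```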
## Create a sequence of steps
!!! info "Prerequisites"
This guide assumes familiarity with the above section on [state](#define-and-update-state).
Here we demonstrate how to construct a simple sequence of steps. We will show:
1. How to build a sequential graph
2. Built-in short-hand for constructing similar graphs.
To add a sequence of nodes, we use the `.add_node` and `.add_edge` methods of our [graph](../concepts/low_level.md#stategraph):
```python
from langgraph.graph import START, StateGraph
builder = StateGraph(State)
# Add nodes
builder.add_node(step_1)
builder.add_node(step_2)
builder.add_node(step_3)
# Add edges
builder.add_edge(START, "step_1")
builder.add_edge("step_1", "step_2")
builder.add_edge("step_2", "step_3")
```
We can also use the built-in shorthand `.add_sequence`:
```python
builder = StateGraph(State).add_sequence([step_1, step_2, step_3])
builder.add_edge(START, "step_1")
```
??? info "Why split application steps into a sequence with LangGraph?"
LangGraph makes it easy to add an underlying persistence layer to your application.
This allows state to be checkpointed in between the execution of nodes, so your LangGraph nodes govern:
- How state updates are [checkpointed](../concepts/persistence.md)
- How interruptions are resumed in [human-in-the-loop](../concepts/human_in_the_loop.md) workflows
- How we can "rewind" and branch-off executions using LangGraph's [time travel](../concepts/time-travel.md) features
They also determine how execution steps are [streamed](../concepts/streaming.md), and how your application is visualized
and debugged using [LangGraph Studio](../concepts/langgraph_studio.md).
Let's demonstrate an end-to-end example. We will create a sequence of three steps:
1. Populate a value in a key of the state
2. Update the same value
3. Populate a different value
Let's first define our [state](../concepts/low_level.md#state). This governs the [schema of the graph](../concepts/low_level.md#schema), and can also specify how to apply updates. See [this section](#process-state-updates-with-reducers) for more detail.
In our case, we will just keep track of two values:
```python
from typing_extensions import TypedDict
class State(TypedDict):
value_1: str
value_2: int
```
Our [nodes](../concepts/low_level.md#nodes) are just Python functions that read our graph's state and make updates to it. The first argument to this function will always be the state:
```python
def step_1(state: State):
return {"value_1": "a"}
def step_2(state: State):
current_value_1 = state["value_1"]
return {"value_1": f"{current_value_1} b"}
def step_3(state: State):
return {"value_2": 10}
```
!!! note
Note that when issuing updates to the state, each node can just specify the value of the key it wishes to update.
    By default, this will **overwrite** the value of the corresponding key. You can also use [reducers](../concepts/low_level.md#reducers) to control how updates are processed. For example, you can append successive updates to a key instead. See [this section](#process-state-updates-with-reducers) for more detail.
Finally, we define the graph. We use [StateGraph](../concepts/low_level.md#stategraph) to define a graph that operates on this state.
We will then use [add_node](../concepts/low_level.md#messagesstate) and [add_edge](../concepts/low_level.md#edges) to populate our graph and define its control flow.
```python
from langgraph.graph import START, StateGraph
builder = StateGraph(State)
# Add nodes
builder.add_node(step_1)
builder.add_node(step_2)
builder.add_node(step_3)
# Add edges
builder.add_edge(START, "step_1")
builder.add_edge("step_1", "step_2")
builder.add_edge("step_2", "step_3")
```
!!! tip "Specifying custom names"
You can specify custom names for nodes using `.add_node`:
```python
builder.add_node("my_node", step_1)
```
Note that:
- `.add_edge` takes the names of nodes, which for functions defaults to `node.__name__`.
- We must specify the entry point of the graph. For this we add an edge with the [START node](../concepts/low_level.md#start-node).
- The graph halts when there are no more nodes to execute.
We next [compile](../concepts/low_level.md#compiling-your-graph) our graph. This provides a few basic checks on the structure of the graph (e.g., identifying orphaned nodes). If we were adding persistence to our application via a [checkpointer](../concepts/persistence.md), it would also be passed in here.
```python
graph = builder.compile()
```
LangGraph provides built-in utilities for visualizing your graph. Let's inspect our sequence. See [this guide](#visualize-your-graph) for detail on visualization.
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

Let's proceed with a simple invocation:
```python
graph.invoke({"value_1": "c"})
```
```
{'value_1': 'a b', 'value_2': 10}
```
Note that:
- We kicked off invocation by providing a value for a single state key. We must always provide a value for at least one key.
- The value we passed in was overwritten by the first node.
- The second node updated the value.
- The third node populated a different value.
!!! tip "Built-in shorthand"
`langgraph>=0.2.46` includes a built-in short-hand `add_sequence` for adding node sequences. You can compile the same graph as follows:
```python hl_lines="1"
builder = StateGraph(State).add_sequence([step_1, step_2, step_3])
builder.add_edge(START, "step_1")
graph = builder.compile()
graph.invoke({"value_1": "c"})
```
## Create branches
Parallel execution of nodes is essential to speed up overall graph operation. LangGraph offers native support for parallel execution of nodes, which can significantly enhance the performance of graph-based workflows. This parallelization is achieved through fan-out and fan-in mechanisms, utilizing both standard edges and [conditional_edges](https://langchain-ai.github.io/langgraph/reference/graphs.md#langgraph.graph.MessageGraph.add_conditional_edges). Below are some examples showing how to create branching dataflows that work for you.
### Run graph nodes in parallel
In this example, we fan out from `Node A` to `B and C` and then fan in to `D`. With our state, [we specify the reducer add operation](https://langchain-ai.github.io/langgraph/concepts/low_level.md#reducers). This will combine or accumulate values for the specific key in the State, rather than simply overwriting the existing value. For lists, this means concatenating the new list with the existing list. See the above section on [state reducers](#process-state-updates-with-reducers) for more detail on updating state with reducers.
```python
import operator
from typing import Annotated, Any
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
# The operator.add reducer fn makes this append-only
aggregate: Annotated[list, operator.add]
def a(state: State):
print(f'Adding "A" to {state["aggregate"]}')
return {"aggregate": ["A"]}
def b(state: State):
print(f'Adding "B" to {state["aggregate"]}')
return {"aggregate": ["B"]}
def c(state: State):
print(f'Adding "C" to {state["aggregate"]}')
return {"aggregate": ["C"]}
def d(state: State):
print(f'Adding "D" to {state["aggregate"]}')
return {"aggregate": ["D"]}
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
builder.add_node(c)
builder.add_node(d)
builder.add_edge(START, "a")
builder.add_edge("a", "b")
builder.add_edge("a", "c")
builder.add_edge("b", "d")
builder.add_edge("c", "d")
builder.add_edge("d", END)
graph = builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

With the reducer, you can see that the values added in each node are accumulated.
```python
graph.invoke({"aggregate": []}, {"configurable": {"thread_id": "foo"}})
```
```
Adding "A" to []
Adding "B" to ['A']
Adding "C" to ['A']
Adding "D" to ['A', 'B', 'C']
```
!!! note
In the above example, nodes `"b"` and `"c"` are executed concurrently in the same [superstep](../concepts/low_level.md#graphs). Because they are in the same step, node `"d"` executes after both `"b"` and `"c"` are finished.
Importantly, updates from a parallel superstep may not be ordered consistently. If you need a consistent, predetermined ordering of updates from a parallel superstep, you should write the outputs to a separate field in the state together with a value with which to order them.
??? note "Exception handling?"
LangGraph executes nodes within [supersteps](../concepts/low_level.md#graphs), meaning that while parallel branches are executed in parallel, the entire superstep is **transactional**. If any of these branches raises an exception, **none** of the updates are applied to the state (the entire superstep errors).
Importantly, when using a [checkpointer](../concepts/persistence.md), results from successful nodes within a superstep are saved, and don't repeat when resumed.
    If you have error-prone nodes (perhaps you want to handle flaky API calls), LangGraph provides two ways to address this:
1. You can write regular python code within your node to catch and handle exceptions.
2. You can set a **[retry_policy](../reference/types.md#langgraph.types.RetryPolicy)** to direct the graph to retry nodes that raise certain types of exceptions. Only failing branches are retried, so you needn't worry about performing redundant work.
Together, these let you perform parallel execution and fully control exception handling.
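A common pattern for the first approach is to catch the exception inside the node and fold the failure into the state update. A minimal sketch, where `fetch_data` is a hypothetical flaky call and `State` is the branching state above:
```python
def fetch_node(state: State):
    try:
        value = fetch_data()  # hypothetical flaky call
    except Exception as exc:
        # Handle the failure inside the node so the superstep still succeeds.
        value = f"fallback ({exc})"
    return {"aggregate": [value]}
```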
### Defer node execution
Deferring node execution is useful when you want to delay the execution of a node until all other pending tasks are completed. This is particularly relevant when branches have different lengths, which is common in workflows like map-reduce flows.
The above example showed how to fan-out and fan-in when each path was only one step. But what if one branch had more than one step? Let's add a node `"b_2"` in the `"b"` branch:
```python hl_lines="35"
import operator
from typing import Annotated, Any
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
# The operator.add reducer fn makes this append-only
aggregate: Annotated[list, operator.add]
def a(state: State):
print(f'Adding "A" to {state["aggregate"]}')
return {"aggregate": ["A"]}
def b(state: State):
print(f'Adding "B" to {state["aggregate"]}')
return {"aggregate": ["B"]}
def b_2(state: State):
print(f'Adding "B_2" to {state["aggregate"]}')
return {"aggregate": ["B_2"]}
def c(state: State):
print(f'Adding "C" to {state["aggregate"]}')
return {"aggregate": ["C"]}
def d(state: State):
print(f'Adding "D" to {state["aggregate"]}')
return {"aggregate": ["D"]}
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
builder.add_node(b_2)
builder.add_node(c)
builder.add_node(d, defer=True)
builder.add_edge(START, "a")
builder.add_edge("a", "b")
builder.add_edge("a", "c")
builder.add_edge("b", "b_2")
builder.add_edge("b_2", "d")
builder.add_edge("c", "d")
builder.add_edge("d", END)
graph = builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

```python
graph.invoke({"aggregate": []})
```
```
Adding "A" to []
Adding "B" to ['A']
Adding "C" to ['A']
Adding "B_2" to ['A', 'B', 'C']
Adding "D" to ['A', 'B', 'C', 'B_2']
```
In the above example, nodes `"b"` and `"c"` are executed concurrently in the same superstep. We set `defer=True` on node `d` so it will not execute until all pending tasks are finished. In this case, this means that `"d"` waits to execute until the entire `"b"` branch is finished.
### Conditional branching
If your fan-out should vary at runtime based on the state, you can use [add_conditional_edges](https://langchain-ai.github.io/langgraph/reference/graphs.md#langgraph.graph.StateGraph.add_conditional_edges) to select one or more paths using the graph state. See example below, where node `a` generates a state update that determines the following node.
```python hl_lines="14 37"
import operator
from typing import Annotated, Literal, Sequence
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
aggregate: Annotated[list, operator.add]
# Add a key to the state. We will set this key to determine
# how we branch.
which: str
def a(state: State):
print(f'Adding "A" to {state["aggregate"]}')
return {"aggregate": ["A"], "which": "c"}
def b(state: State):
print(f'Adding "B" to {state["aggregate"]}')
return {"aggregate": ["B"]}
def c(state: State):
print(f'Adding "C" to {state["aggregate"]}')
return {"aggregate": ["C"]}
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
builder.add_node(c)
builder.add_edge(START, "a")
builder.add_edge("b", END)
builder.add_edge("c", END)
def conditional_edge(state: State) -> Literal["b", "c"]:
# Fill in arbitrary logic here that uses the state
# to determine the next node
return state["which"]
builder.add_conditional_edges("a", conditional_edge)
graph = builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

```python
result = graph.invoke({"aggregate": []})
print(result)
```
```
Adding "A" to []
Adding "C" to ['A']
{'aggregate': ['A', 'C'], 'which': 'c'}
```
!!! tip
Your conditional edges can route to multiple destination nodes. For example:
```python
def route_bc_or_cd(state: State) -> Sequence[str]:
if state["which"] == "cd":
return ["c", "d"]
return ["b", "c"]
```
## Map-Reduce and the Send API
LangGraph supports map-reduce and other advanced branching patterns using the Send API. Here is an example of how to use it:
```python
import operator
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send


class OverallState(TypedDict):
    topic: str
    subjects: list[str]
    # Use a reducer so jokes generated in parallel are accumulated rather than overwritten
    jokes: Annotated[list[str], operator.add]
    best_selected_joke: str


# The payload sent to each generate_joke invocation via Send
class JokeState(TypedDict):
    subject: str


def generate_topics(state: OverallState):
    return {"subjects": ["lions", "elephants", "penguins"]}


def generate_joke(state: JokeState):
    joke_map = {
        "lions": "Why don't lions like fast food? Because they can't catch it!",
        "elephants": "Why don't elephants use computers? They're afraid of the mouse!",
        "penguins": "Why don't penguins like talking to strangers at parties? Because they find it hard to break the ice.",
    }
    return {"jokes": [joke_map[state["subject"]]]}


def continue_to_jokes(state: OverallState):
    return [Send("generate_joke", {"subject": s}) for s in state["subjects"]]


def best_joke(state: OverallState):
    return {"best_selected_joke": "penguins"}


builder = StateGraph(OverallState)
builder.add_node("generate_topics", generate_topics)
builder.add_node("generate_joke", generate_joke)
builder.add_node("best_joke", best_joke)
builder.add_edge(START, "generate_topics")
builder.add_conditional_edges("generate_topics", continue_to_jokes, ["generate_joke"])
builder.add_edge("generate_joke", "best_joke")
builder.add_edge("best_joke", END)
graph = builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

```python
# Call the graph: here we call it to generate a list of jokes
for step in graph.stream({"topic": "animals"}):
print(step)
```
```
{'generate_topics': {'subjects': ['lions', 'elephants', 'penguins']}}
{'generate_joke': {'jokes': ["Why don't lions like fast food? Because they can't catch it!"]}}
{'generate_joke': {'jokes': ["Why don't elephants use computers? They're afraid of the mouse!"]}}
{'generate_joke': {'jokes': ['Why don't penguins like talking to strangers at parties? Because they find it hard to break the ice.']}}
{'best_joke': {'best_selected_joke': 'penguins'}}
```
## Create and control loops
When creating a graph with a loop, we require a mechanism for terminating execution. This is most commonly done by adding a [conditional edge](../concepts/low_level.md#conditional-edges) that routes to the [END](../concepts/low_level.md#end-node) node once we reach some termination condition.
You can also set the graph recursion limit when invoking or streaming the graph. The recursion limit sets the number of [supersteps](../concepts/low_level.md#graphs) that the graph is allowed to execute before it raises an error. Read more about the concept of recursion limits [here](../concepts/low_level.md#recursion-limit).
Let's consider a simple graph with a loop to better understand how these mechanisms work.
!!! tip
To return the last value of your state instead of receiving a recursion limit error, see the [next section](#impose-a-recursion-limit).
When creating a loop, you can include a conditional edge that specifies a termination condition:
```python
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
def route(state: State) -> Literal["b", END]:
if termination_condition(state):
return END
else:
return "b"
builder.add_edge(START, "a")
builder.add_conditional_edges("a", route)
builder.add_edge("b", "a")
graph = builder.compile()
```
To control the recursion limit, specify `"recursion_limit"` in the config. This will raise a `GraphRecursionError`, which you can catch and handle:
```python
from langgraph.errors import GraphRecursionError
try:
graph.invoke(inputs, {"recursion_limit": 3})
except GraphRecursionError:
print("Recursion Error")
```
Let's define a graph with a simple loop. Note that we use a conditional edge to implement a termination condition.
```python
import operator
from typing import Annotated, Literal
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
# The operator.add reducer fn makes this append-only
aggregate: Annotated[list, operator.add]
def a(state: State):
print(f'Node A sees {state["aggregate"]}')
return {"aggregate": ["A"]}
def b(state: State):
print(f'Node B sees {state["aggregate"]}')
return {"aggregate": ["B"]}
# Define nodes
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
# Define edges
def route(state: State) -> Literal["b", END]:
if len(state["aggregate"]) < 7:
return "b"
else:
return END
builder.add_edge(START, "a")
builder.add_conditional_edges("a", route)
builder.add_edge("b", "a")
graph = builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

This architecture is similar to a [ReAct agent](../agents/overview.md) in which node `"a"` is a tool-calling model, and node `"b"` represents the tools.
In our `route` conditional edge, we specify that we should end after the `"aggregate"` list in the state passes a threshold length.
Invoking the graph, we see that we alternate between nodes `"a"` and `"b"` before terminating once we reach the termination condition.
```python
graph.invoke({"aggregate": []})
```
```
Node A sees []
Node B sees ['A']
Node A sees ['A', 'B']
Node B sees ['A', 'B', 'A']
Node A sees ['A', 'B', 'A', 'B']
Node B sees ['A', 'B', 'A', 'B', 'A']
Node A sees ['A', 'B', 'A', 'B', 'A', 'B']
```
### Impose a recursion limit
In some applications, we may not have a guarantee that we will reach a given termination condition. In these cases, we can set the graph's [recursion limit](../concepts/low_level.md#recursion-limit). This will raise a `GraphRecursionError` after a given number of [supersteps](../concepts/low_level.md#graphs). We can then catch and handle this exception:
```python
from langgraph.errors import GraphRecursionError
try:
graph.invoke({"aggregate": []}, {"recursion_limit": 4})
except GraphRecursionError:
print("Recursion Error")
```
```
Node A sees []
Node B sees ['A']
Node A sees ['A', 'B']
Node B sees ['A', 'B', 'A']
Recursion Error
```
??? example "Extended example: return state on hitting recursion limit"
Instead of raising `GraphRecursionError`, we can introduce a new key to the state that keeps track of the number of steps remaining until reaching the recursion limit. We can then use this key to determine if we should end the run.
LangGraph implements a special `RemainingSteps` annotation. Under the hood, it creates a `ManagedValue` channel -- a state channel that will exist for the duration of our graph run and no longer.
```python
import operator
from typing import Annotated, Literal
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.managed.is_last_step import RemainingSteps
class State(TypedDict):
aggregate: Annotated[list, operator.add]
remaining_steps: RemainingSteps
def a(state: State):
print(f'Node A sees {state["aggregate"]}')
return {"aggregate": ["A"]}
def b(state: State):
print(f'Node B sees {state["aggregate"]}')
return {"aggregate": ["B"]}
# Define nodes
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
# Define edges
def route(state: State) -> Literal["b", END]:
if state["remaining_steps"] <= 2:
return END
else:
return "b"
builder.add_edge(START, "a")
builder.add_conditional_edges("a", route)
builder.add_edge("b", "a")
graph = builder.compile()
# Test it out
result = graph.invoke({"aggregate": []}, {"recursion_limit": 4})
print(result)
```
```
Node A sees []
Node B sees ['A']
Node A sees ['A', 'B']
{'aggregate': ['A', 'B', 'A']}
```
??? example "Extended example: loops with branches"
To better understand how the recursion limit works, let's consider a more complex example. Below we implement a loop, but one step fans out into two nodes:
```python
import operator
from typing import Annotated, Literal
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
aggregate: Annotated[list, operator.add]
def a(state: State):
print(f'Node A sees {state["aggregate"]}')
return {"aggregate": ["A"]}
def b(state: State):
print(f'Node B sees {state["aggregate"]}')
return {"aggregate": ["B"]}
def c(state: State):
print(f'Node C sees {state["aggregate"]}')
return {"aggregate": ["C"]}
def d(state: State):
print(f'Node D sees {state["aggregate"]}')
return {"aggregate": ["D"]}
# Define nodes
builder = StateGraph(State)
builder.add_node(a)
builder.add_node(b)
builder.add_node(c)
builder.add_node(d)
# Define edges
def route(state: State) -> Literal["b", END]:
if len(state["aggregate"]) < 7:
return "b"
else:
return END
builder.add_edge(START, "a")
builder.add_conditional_edges("a", route)
builder.add_edge("b", "c")
builder.add_edge("b", "d")
builder.add_edge(["c", "d"], "a")
graph = builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```

    This graph looks complex, but can be conceptualized as a loop of [supersteps](../concepts/low_level.md#graphs):
1. Node A
2. Node B
3. Nodes C and D
4. Node A
5. ...
We have a loop of four supersteps, where nodes C and D are executed concurrently.
Invoking the graph as before, we see that we complete two full "laps" before hitting the termination condition:
```python
result = graph.invoke({"aggregate": []})
```
```
Node A sees []
Node B sees ['A']
Node D sees ['A', 'B']
Node C sees ['A', 'B']
Node A sees ['A', 'B', 'C', 'D']
Node B sees ['A', 'B', 'C', 'D', 'A']
Node D sees ['A', 'B', 'C', 'D', 'A', 'B']
Node C sees ['A', 'B', 'C', 'D', 'A', 'B']
Node A sees ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D']
```
However, if we set the recursion limit to four, we only complete one lap because each lap is four supersteps:
```python
from langgraph.errors import GraphRecursionError
try:
result = graph.invoke({"aggregate": []}, {"recursion_limit": 4})
except GraphRecursionError:
print("Recursion Error")
```
```
Node A sees []
Node B sees ['A']
Node C sees ['A', 'B']
Node D sees ['A', 'B']
Node A sees ['A', 'B', 'C', 'D']
Recursion Error
```
## Async
Using the [async](https://docs.python.org/3/library/asyncio.html) programming paradigm can produce significant performance improvements when running [IO-bound](https://en.wikipedia.org/wiki/I/O_bound) code concurrently (e.g., making concurrent API requests to a chat model provider).
To convert a `sync` implementation of the graph to an `async` implementation, you will need to:
1. Update `nodes` to use `async def` instead of `def`.
2. Update the code inside to use `await` appropriately.
3. Invoke the graph with `.ainvoke` or `.astream` as desired.
Because many LangChain objects implement the [Runnable Protocol](https://python.langchain.com/docs/expression_language/interface/), which has `async` variants of all the `sync` methods, it's typically fairly quick to upgrade a `sync` graph to an `async` graph.
See example below. To demonstrate async invocations of underlying LLMs, we will include a chat model:
{!snippets/chat_model_tabs.md!}
```python hl_lines="4 5 12"
from langchain.chat_models import init_chat_model
from langgraph.graph import MessagesState, StateGraph
async def node(state: MessagesState): # (1)!
new_message = await llm.ainvoke(state["messages"]) # (2)!
return {"messages": [new_message]}
builder = StateGraph(MessagesState).add_node(node).set_entry_point("node")
graph = builder.compile()
input_message = {"role": "user", "content": "Hello"}
result = await graph.ainvoke({"messages": [input_message]}) # (3)!
```
1. Declare nodes to be async functions.
2. Use async invocations when available within the node.
3. Use async invocations on the graph object itself.
!!! tip "Async streaming"
See the [streaming guide](./streaming.md) for examples of streaming with async.
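As a quick sketch, streaming the async graph above with `.astream` might look like this (`stream_mode="updates"` emits each node's state updates):
```python
async for chunk in graph.astream(
    {"messages": [{"role": "user", "content": "Hello"}]},
    stream_mode="updates",
):
    print(chunk)
```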
## Combine control flow and state updates with `Command`
It can be useful to combine control flow (edges) and state updates (nodes). For example, you might want to BOTH perform state updates AND decide which node to go to next in the SAME node. LangGraph provides a way to do so by returning a [Command](../reference/types.md#langgraph.types.Command) object from node functions:
```python
def my_node(state: State) -> Command[Literal["my_other_node"]]:
return Command(
# state update
update={"foo": "bar"},
# control flow
goto="my_other_node"
)
```
We show an end-to-end example below. Let's create a simple graph with 3 nodes: A, B and C. We will first execute node A, and then decide whether to go to Node B or Node C next based on the output of node A.
```python
import random
from typing_extensions import TypedDict, Literal
from langgraph.graph import StateGraph, START
from langgraph.types import Command
# Define graph state
class State(TypedDict):
foo: str
# Define the nodes
def node_a(state: State) -> Command[Literal["node_b", "node_c"]]:
print("Called A")
value = random.choice(["a", "b"])
# this is a replacement for a conditional edge function
if value == "a":
goto = "node_b"
else:
goto = "node_c"
# note how Command allows you to BOTH update the graph state AND route to the next node
return Command(
# this is the state update
update={"foo": value},
# this is a replacement for an edge
goto=goto,
)
def node_b(state: State):
print("Called B")
return {"foo": state["foo"] + "b"}
def node_c(state: State):
print("Called C")
return {"foo": state["foo"] + "c"}
```
We can now create the `StateGraph` with the above nodes. Notice that the graph doesn't have [conditional edges](../concepts/low_level.md#conditional-edges) for routing! This is because control flow is defined with `Command` inside `node_a`.
```python
builder = StateGraph(State)
builder.add_edge(START, "node_a")
builder.add_node(node_a)
builder.add_node(node_b)
builder.add_node(node_c)
# NOTE: there are no edges between nodes A, B and C!
graph = builder.compile()
```
!!! important
You might have noticed that we used `Command` as a return type annotation, e.g. `Command[Literal["node_b", "node_c"]]`. This is necessary for the graph rendering and tells LangGraph that `node_a` can navigate to `node_b` and `node_c`.
```python
from IPython.display import display, Image
display(Image(graph.get_graph().draw_mermaid_png()))
```

If we run the graph multiple times, we'd see it take different paths (A -> B or A -> C) based on the random choice in node A.
```python
graph.invoke({"foo": ""})
```
```
Called A
Called C
```
### Navigate to a node in a parent graph
If you are using [subgraphs](../concepts/subgraphs.md), you might want to navigate from a node within a subgraph to a different subgraph (i.e. a different node in the parent graph). To do so, you can specify `graph=Command.PARENT` in `Command`:
```python
def my_node(state: State) -> Command[Literal["my_other_node"]]:
return Command(
update={"foo": "bar"},
goto="other_subgraph", # where `other_subgraph` is a node in the parent graph
graph=Command.PARENT
)
```
Let's demonstrate this using the above example. We'll do so by changing `node_a` in the above example into a single-node graph that we'll add as a subgraph to our parent graph.
!!! important "State updates with `Command.PARENT`"
When you send updates from a subgraph node to a parent graph node for a key that's shared by both parent and subgraph [state schemas](../concepts/low_level.md#schema), you **must** define a [reducer](../concepts/low_level.md#reducers) for the key you're updating in the parent graph state. See the example below.
```python hl_lines="6 23 33 37"
import operator
from typing_extensions import Annotated
class State(TypedDict):
# NOTE: we define a reducer here
foo: Annotated[str, operator.add]
def node_a(state: State):
print("Called A")
value = random.choice(["a", "b"])
# this is a replacement for a conditional edge function
if value == "a":
goto = "node_b"
else:
goto = "node_c"
# note how Command allows you to BOTH update the graph state AND route to the next node
return Command(
update={"foo": value},
goto=goto,
# this tells LangGraph to navigate to node_b or node_c in the parent graph
# NOTE: this will navigate to the closest parent graph relative to the subgraph
graph=Command.PARENT,
)
subgraph = StateGraph(State).add_node(node_a).add_edge(START, "node_a").compile()
def node_b(state: State):
print("Called B")
# NOTE: since we've defined a reducer, we don't need to manually append
# new characters to existing 'foo' value. instead, reducer will append these
# automatically (via operator.add)
return {"foo": "b"}
def node_c(state: State):
print("Called C")
return {"foo": "c"}
builder = StateGraph(State)
builder.add_edge(START, "subgraph")
builder.add_node("subgraph", subgraph)
builder.add_node(node_b)
builder.add_node(node_c)
graph = builder.compile()
```
```python
graph.invoke({"foo": ""})
```
```
Called A
Called C
```
### Use inside tools
A common use case is updating graph state from inside a tool. For example, in a customer support application you might want to look up customer information based on their account number or ID in the beginning of the conversation. To update the graph state from the tool, you can return `Command(update={"my_custom_key": "foo", "messages": [...]})` from the tool:
```python
from typing import Annotated

from langchain_core.messages import ToolMessage
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import InjectedToolCallId, tool
from langgraph.types import Command


@tool
def lookup_user_info(tool_call_id: Annotated[str, InjectedToolCallId], config: RunnableConfig):
    """Use this to look up user information to better assist them with their questions."""
    user_info = get_user_info(config.get("configurable", {}).get("user_id"))  # your own lookup logic
    return Command(
        update={
            # update the state keys
            "user_info": user_info,
            # update the message history
            "messages": [ToolMessage("Successfully looked up user information", tool_call_id=tool_call_id)],
        }
    )
```
!!! important
You MUST include `messages` (or any state key used for the message history) in `Command.update` when returning `Command` from a tool, and the list of messages in `messages` MUST contain a `ToolMessage`. This is necessary for the resulting message history to be valid (LLM providers require AI messages with tool calls to be followed by the tool result messages).
If you are using tools that update state via `Command`, we recommend using prebuilt [`ToolNode`](../reference/agents.md#langgraph.prebuilt.tool_node.ToolNode) which automatically handles tools returning `Command` objects and propagates them to the graph state. If you're writing a custom node that calls tools, you would need to manually propagate `Command` objects returned by the tools as the update from the node.
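For instance, a minimal sketch of wiring the tool above into a prebuilt `ToolNode` (assuming the `lookup_user_info` tool defined earlier) could look like this:
```python
from langgraph.prebuilt import ToolNode

# ToolNode executes the tool calls found in the latest AI message. Because
# `lookup_user_info` returns a `Command`, ToolNode propagates its `update`
# (the `user_info` key and the `ToolMessage`) to the graph state for us.
tool_node = ToolNode([lookup_user_info])
```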
## Visualize your graph
Here we demonstrate how to visualize the graphs you create.
You can visualize any arbitrary [Graph](https://langchain-ai.github.io/langgraph/reference/graphs/), including [StateGraph](https://langchain-ai.github.io/langgraph/reference/graphs/#langgraph.graph.state.StateGraph). Let's have some fun by drawing fractals :).
```python
import random
from typing import Annotated, Literal
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
class State(TypedDict):
messages: Annotated[list, add_messages]
class MyNode:
def __init__(self, name: str):
self.name = name
def __call__(self, state: State):
return {"messages": [("assistant", f"Called node {self.name}")]}
def route(state) -> Literal["entry_node", "__end__"]:
if len(state["messages"]) > 10:
return "__end__"
return "entry_node"
def add_fractal_nodes(builder, current_node, level, max_level):
if level > max_level:
return
# Number of nodes to create at this level
num_nodes = random.randint(1, 3) # Adjust randomness as needed
for i in range(num_nodes):
nm = ["A", "B", "C"][i]
node_name = f"node_{current_node}_{nm}"
builder.add_node(node_name, MyNode(node_name))
builder.add_edge(current_node, node_name)
# Recursively add more nodes
r = random.random()
if r > 0.2 and level + 1 < max_level:
add_fractal_nodes(builder, node_name, level + 1, max_level)
elif r > 0.05:
builder.add_conditional_edges(node_name, route, node_name)
else:
# End
builder.add_edge(node_name, "__end__")
def build_fractal_graph(max_level: int):
builder = StateGraph(State)
entry_point = "entry_node"
builder.add_node(entry_point, MyNode(entry_point))
builder.add_edge(START, entry_point)
add_fractal_nodes(builder, entry_point, 1, max_level)
# Optional: set a finish point if required
builder.add_edge(entry_point, END) # or any specific node
return builder.compile()
app = build_fractal_graph(3)
```
### Mermaid
We can also convert a graph class into Mermaid syntax.
```python
print(app.get_graph().draw_mermaid())
```
```
%%{init: {'flowchart': {'curve': 'linear'}}}%%
graph TD;
__start__([<p>__start__</p>]):::first
__end__([<p>__end__</p>]):::last
__start__ --> entry_node;
entry_node --> __end__;
entry_node --> node_entry_node_A;
entry_node --> node_entry_node_B;
node_entry_node_B --> node_node_entry_node_B_A;
node_entry_node_B --> node_node_entry_node_B_B;
node_entry_node_B --> node_node_entry_node_B_C;
node_entry_node_A -.-> entry_node;
node_entry_node_A -.-> __end__;
node_node_entry_node_B_A -.-> entry_node;
node_node_entry_node_B_A -.-> __end__;
node_node_entry_node_B_B -.-> entry_node;
node_node_entry_node_B_B -.-> __end__;
node_node_entry_node_B_C -.-> entry_node;
node_node_entry_node_B_C -.-> __end__;
classDef default fill:#f2f0ff,line-height:1.2
classDef first fill-opacity:0
classDef last fill:#bfb6fc
```
### PNG
If preferred, we can render the graph as a `.png`. There are three options:
- Using the Mermaid.ink API (does not require additional packages)
- Using Mermaid + Pyppeteer (requires `pip install pyppeteer`)
- Using Graphviz (requires `pip install pygraphviz` and the Graphviz system libraries)
**Using Mermaid.Ink**
By default, `draw_mermaid_png()` uses Mermaid.Ink's API to generate the diagram.
```python
from IPython.display import Image, display
from langchain_core.runnables.graph import CurveStyle, MermaidDrawMethod, NodeStyles
display(Image(app.get_graph().draw_mermaid_png()))
```

**Using Mermaid + Pyppeteer**
```python
import nest_asyncio
nest_asyncio.apply() # Required for Jupyter Notebook to run async functions
display(
Image(
app.get_graph().draw_mermaid_png(
curve_style=CurveStyle.LINEAR,
node_colors=NodeStyles(first="#ffdfba", last="#baffc9", default="#fad7de"),
wrap_label_n_words=9,
output_file_path=None,
draw_method=MermaidDrawMethod.PYPPETEER,
background_color="white",
padding=10,
)
)
)
```
**Using Graphviz**
```python
try:
display(Image(app.get_graph().draw_png()))
except ImportError:
print(
"You likely need to install dependencies for pygraphviz, see more here https://github.com/pygraphviz/pygraphviz/blob/main/INSTALL.txt"
)
```
---
how-tos/use-functional-api.md
---
# Use the functional API
The [**Functional API**](../concepts/functional_api.md) allows you to add LangGraph's key features — [persistence](../concepts/persistence.md), [memory](../how-tos/memory/add-memory.md), [human-in-the-loop](../concepts/human_in_the_loop.md), and [streaming](../concepts/streaming.md) — to your applications with minimal changes to your existing code.
!!! tip
For conceptual information on the functional API, see [Functional API](../concepts/functional_api.md).
## Creating a simple workflow
When defining an `entrypoint`, input is restricted to the first argument of the function. To pass multiple inputs, you can use a dictionary.
```python
@entrypoint(checkpointer=checkpointer)
def my_workflow(inputs: dict) -> int:
value = inputs["value"]
another_value = inputs["another_value"]
...
my_workflow.invoke({"value": 1, "another_value": 2})
```
??? example "Extended example: simple workflow"
```python
import uuid
from langgraph.func import entrypoint, task
from langgraph.checkpoint.memory import MemorySaver
# Task that checks if a number is even
@task
def is_even(number: int) -> bool:
return number % 2 == 0
# Task that formats a message
@task
def format_message(is_even: bool) -> str:
return "The number is even." if is_even else "The number is odd."
# Create a checkpointer for persistence
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(inputs: dict) -> str:
"""Simple workflow to classify a number."""
even = is_even(inputs["number"]).result()
return format_message(even).result()
# Run the workflow with a unique thread ID
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
result = workflow.invoke({"number": 7}, config=config)
print(result)
```
??? example "Extended example: Compose an essay with an LLM"
This example demonstrates how to use the `@task` and `@entrypoint` decorators. Because a checkpointer is provided, the workflow results are persisted by the checkpointer.
```python
import uuid
from langchain.chat_models import init_chat_model
from langgraph.func import entrypoint, task
from langgraph.checkpoint.memory import MemorySaver
llm = init_chat_model('openai:gpt-3.5-turbo')
# Task: generate essay using an LLM
@task
def compose_essay(topic: str) -> str:
"""Generate an essay about the given topic."""
return llm.invoke([
{"role": "system", "content": "You are a helpful assistant that writes essays."},
{"role": "user", "content": f"Write an essay about {topic}."}
]).content
# Create a checkpointer for persistence
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(topic: str) -> str:
"""Simple workflow that generates an essay with an LLM."""
return compose_essay(topic).result()
# Execute the workflow
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
result = workflow.invoke("the history of flight", config=config)
print(result)
```
## Parallel execution
Tasks can be executed in parallel by invoking them concurrently and waiting for the results. This is useful for improving performance in IO bound tasks (e.g., calling APIs for LLMs).
```python
@task
def add_one(number: int) -> int:
return number + 1
@entrypoint(checkpointer=checkpointer)
def graph(numbers: list[int]) -> list[str]:
futures = [add_one(i) for i in numbers]
return [f.result() for f in futures]
```
??? example "Extended example: parallel LLM calls"
This example demonstrates how to run multiple LLM calls in parallel using `@task`. Each call generates a paragraph on a different topic, and results are joined into a single text output.
```python
import uuid
from langchain.chat_models import init_chat_model
from langgraph.func import entrypoint, task
from langgraph.checkpoint.memory import MemorySaver
# Initialize the LLM model
llm = init_chat_model("openai:gpt-3.5-turbo")
# Task that generates a paragraph about a given topic
@task
def generate_paragraph(topic: str) -> str:
response = llm.invoke([
{"role": "system", "content": "You are a helpful assistant that writes educational paragraphs."},
{"role": "user", "content": f"Write a paragraph about {topic}."}
])
return response.content
# Create a checkpointer for persistence
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(topics: list[str]) -> str:
"""Generates multiple paragraphs in parallel and combines them."""
futures = [generate_paragraph(topic) for topic in topics]
paragraphs = [f.result() for f in futures]
return "\n\n".join(paragraphs)
# Run the workflow
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
result = workflow.invoke(["quantum computing", "climate change", "history of aviation"], config=config)
print(result)
```
This example uses LangGraph's concurrency model to improve execution time, especially when tasks involve I/O like LLM completions.
## Calling graphs
The **Functional API** and the [**Graph API**](../concepts/low_level.md) can be used together in the same application as they share the same underlying runtime.
```python
from langgraph.func import entrypoint
from langgraph.graph import StateGraph
builder = StateGraph()
...
some_graph = builder.compile()
@entrypoint()
def some_workflow(some_input: dict) -> dict:
# Call a graph defined using the graph API
result_1 = some_graph.invoke(...)
# Call another graph defined using the graph API
result_2 = another_graph.invoke(...)
return {
"result_1": result_1,
"result_2": result_2
}
```
??? example "Extended example: calling a simple graph from the functional API"
```python
import uuid
from typing import TypedDict
from langgraph.func import entrypoint
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph
# Define the shared state type
class State(TypedDict):
foo: int
# Define a simple transformation node
def double(state: State) -> State:
return {"foo": state["foo"] * 2}
# Build the graph using the Graph API
builder = StateGraph(State)
builder.add_node("double", double)
builder.set_entry_point("double")
graph = builder.compile()
# Define the functional API workflow
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(x: int) -> dict:
result = graph.invoke({"foo": x})
return {"bar": result["foo"]}
# Execute the workflow
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
print(workflow.invoke(5, config=config)) # Output: {'bar': 10}
```
## Call other entrypoints
You can call other **entrypoints** from within an **entrypoint** or a **task**.
```python
@entrypoint() # Will automatically use the checkpointer from the parent entrypoint
def some_other_workflow(inputs: dict) -> int:
return inputs["value"]
@entrypoint(checkpointer=checkpointer)
def my_workflow(inputs: dict) -> int:
value = some_other_workflow.invoke({"value": 1})
return value
```
??? example "Extended example: calling another entrypoint"
```python
import uuid
from langgraph.func import entrypoint
from langgraph.checkpoint.memory import MemorySaver
# Initialize a checkpointer
checkpointer = MemorySaver()
# A reusable sub-workflow that multiplies a number
@entrypoint()
def multiply(inputs: dict) -> int:
return inputs["a"] * inputs["b"]
# Main workflow that invokes the sub-workflow
@entrypoint(checkpointer=checkpointer)
def main(inputs: dict) -> dict:
result = multiply.invoke({"a": inputs["x"], "b": inputs["y"]})
return {"product": result}
# Execute the main workflow
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
print(main.invoke({"x": 6, "y": 7}, config=config)) # Output: {'product': 42}
```
## Streaming
The **Functional API** uses the same streaming mechanism as the **Graph API**. See the [**streaming guide**](../concepts/streaming.md) for more details.
Below is an example of using the streaming API to stream both updates and custom data.
```python hl_lines="17"
from langgraph.func import entrypoint
from langgraph.checkpoint.memory import MemorySaver
from langgraph.config import get_stream_writer # (1)!
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def main(inputs: dict) -> int:
writer = get_stream_writer() # (2)!
writer("Started processing") # (3)!
result = inputs["x"] * 2
writer(f"Result is {result}") # (4)!
return result
config = {"configurable": {"thread_id": "abc"}}
for mode, chunk in main.stream( # (5)!
{"x": 5},
stream_mode=["custom", "updates"], # (6)!
config=config
):
print(f"{mode}: {chunk}")
```
1. Import `get_stream_writer` from `langgraph.config`.
2. Obtain a stream writer instance within the entrypoint.
3. Emit custom data before computation begins.
4. Emit another custom message after computing the result.
5. Use `.stream()` to process streamed output.
6. Specify which streaming modes to use.
```pycon
('custom', 'Started processing')
('custom', 'Result is 10')
('updates', {'main': 10})
```
!!! important "Async with Python < 3.11"
If you are writing async code on Python < 3.11, `get_stream_writer()` will not work. Instead, use the `StreamWriter` class directly. See [Async with Python < 3.11](../how-tos/streaming.md#async) for more details.
```python hl_lines="4"
from langgraph.types import StreamWriter
@entrypoint(checkpointer=checkpointer)
async def main(inputs: dict, writer: StreamWriter) -> int:
...
```
## Retry policy
You can attach a `RetryPolicy` to a task so that failing calls are retried automatically:
```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint, task
from langgraph.types import RetryPolicy
# This variable is just used for demonstration purposes to simulate a network failure.
# It's not something you will have in your actual code.
attempts = 0
# Let's configure the RetryPolicy to retry on ValueError.
# The default RetryPolicy is optimized for retrying specific network errors.
retry_policy = RetryPolicy(retry_on=ValueError)
@task(retry_policy=retry_policy)
def get_info():
global attempts
attempts += 1
if attempts < 2:
raise ValueError('Failure')
return "OK"
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def main(inputs, writer):
return get_info().result()
config = {
"configurable": {
"thread_id": "1"
}
}
main.invoke({'any_input': 'foobar'}, config=config)
```
```pycon
'OK'
```
## Caching tasks
You can cache task results with a `CachePolicy` so that repeated calls with the same input are served from the cache instead of re-running the task:
```python
import time
from langgraph.cache.memory import InMemoryCache
from langgraph.func import entrypoint, task
from langgraph.types import CachePolicy
@task(cache_policy=CachePolicy(ttl=120)) # (1)!
def slow_add(x: int) -> int:
time.sleep(1)
return x * 2
@entrypoint(cache=InMemoryCache())
def main(inputs: dict) -> dict[str, int]:
result1 = slow_add(inputs["x"]).result()
result2 = slow_add(inputs["x"]).result()
return {"result1": result1, "result2": result2}
for chunk in main.stream({"x": 5}, stream_mode="updates"):
print(chunk)
#> {'slow_add': 10}
#> {'slow_add': 10, '__metadata__': {'cached': True}}
#> {'main': {'result1': 10, 'result2': 10}}
```
1. `ttl` is specified in seconds. The cache will be invalidated after this time.
## Resuming after an error
Because task results are saved to the checkpoint, resuming a failed run does not re-execute tasks that already completed successfully:
```python
import time
from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint, task
from langgraph.types import StreamWriter
# This variable is just used for demonstration purposes to simulate a network failure.
# It's not something you will have in your actual code.
attempts = 0
@task()
def get_info():
"""
Simulates a task that fails once before succeeding.
Raises an exception on the first attempt, then returns "OK" on subsequent tries.
"""
global attempts
attempts += 1
if attempts < 2:
raise ValueError("Failure") # Simulate a failure on the first attempt
return "OK"
# Initialize an in-memory checkpointer for persistence
checkpointer = MemorySaver()
@task
def slow_task():
"""
Simulates a slow-running task by introducing a 1-second delay.
"""
time.sleep(1)
return "Ran slow task."
@entrypoint(checkpointer=checkpointer)
def main(inputs, writer: StreamWriter):
"""
Main workflow function that runs the slow_task and get_info tasks sequentially.
Parameters:
- inputs: Dictionary containing workflow input values.
- writer: StreamWriter for streaming custom data.
The workflow first executes `slow_task` and then attempts to execute `get_info`,
which will fail on the first invocation.
"""
slow_task_result = slow_task().result() # Blocking call to slow_task
get_info().result() # Exception will be raised here on the first attempt
return slow_task_result
# Workflow execution configuration with a unique thread identifier
config = {
"configurable": {
"thread_id": "1" # Unique identifier to track workflow execution
}
}
# This invocation will take ~1 second due to the slow_task execution
try:
# First invocation will raise an exception due to the `get_info` task failing
main.invoke({'any_input': 'foobar'}, config=config)
except ValueError:
pass # Handle the failure gracefully
```
When we resume execution, we won't need to re-run the `slow_task` as its result is already saved in the checkpoint.
```python
main.invoke(None, config=config)
```
```pycon
'Ran slow task.'
```
## Human-in-the-loop
The functional API supports [human-in-the-loop](../concepts/human_in_the_loop.md) workflows using the `interrupt` function and the `Command` primitive.
### Basic human-in-the-loop workflow
We will create three [tasks](../concepts/functional_api.md#task):
1. Append `"bar"`.
2. Pause for human input. When resuming, append human input.
3. Append `"qux"`.
```python
from langgraph.func import entrypoint, task
from langgraph.types import Command, interrupt
@task
def step_1(input_query):
"""Append bar."""
return f"{input_query} bar"
@task
def human_feedback(input_query):
"""Append user input."""
feedback = interrupt(f"Please provide feedback: {input_query}")
return f"{input_query} {feedback}"
@task
def step_3(input_query):
"""Append qux."""
return f"{input_query} qux"
```
We can now compose these tasks in an [entrypoint](../concepts/functional_api.md#entrypoint):
```python
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def graph(input_query):
result_1 = step_1(input_query).result()
result_2 = human_feedback(result_1).result()
result_3 = step_3(result_2).result()
return result_3
```
[interrupt()](../how-tos/human_in_the_loop/add-human-in-the-loop.md#pause-using-interrupt) is called inside a task, enabling a human to review and edit the output of the previous task. The results of prior tasks (in this case `step_1`) are persisted, so that they are not run again following the `interrupt`.
Let's send in a query string:
```python
config = {"configurable": {"thread_id": "1"}}
for event in graph.stream("foo", config):
print(event)
print("\n")
```
Note that we've paused with an `interrupt` after `step_1`. The interrupt provides instructions to resume the run. To resume, we issue a [Command](../how-tos/human_in_the_loop/add-human-in-the-loop.md#resume-using-the-command-primitive) containing the data expected by the `human_feedback` task.
```python
# Continue execution
for event in graph.stream(Command(resume="baz"), config):
print(event)
print("\n")
```
After resuming, the run proceeds through the remaining step and terminates as expected.
### Review tool calls
To review tool calls before execution, we add a `review_tool_call` function that calls [`interrupt`](../how-tos/human_in_the_loop/add-human-in-the-loop.md#pause-using-interrupt). When this function is called, execution will be paused until we issue a command to resume it.
Given a tool call, our function will `interrupt` for human review. At that point we can either:
- Accept the tool call
- Revise the tool call and continue
- Generate a custom tool message (e.g., instructing the model to re-format its tool call)
```python
from typing import Union

from langchain_core.messages import ToolCall, ToolMessage
from langgraph.types import interrupt
def review_tool_call(tool_call: ToolCall) -> Union[ToolCall, ToolMessage]:
"""Review a tool call, returning a validated version."""
human_review = interrupt(
{
"question": "Is this correct?",
"tool_call": tool_call,
}
)
review_action = human_review["action"]
review_data = human_review.get("data")
if review_action == "continue":
return tool_call
elif review_action == "update":
updated_tool_call = {**tool_call, **{"args": review_data}}
return updated_tool_call
elif review_action == "feedback":
return ToolMessage(
content=review_data, name=tool_call["name"], tool_call_id=tool_call["id"]
)
```
We can now update our [entrypoint](../concepts/functional_api.md#entrypoint) to review the generated tool calls. If a tool call is accepted or revised, we execute in the same way as before. Otherwise, we just append the `ToolMessage` supplied by the human. The results of prior tasks — in this case the initial model call — are persisted, so that they are not run again following the `interrupt`.
```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph.message import add_messages
from langgraph.types import Command, interrupt
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def agent(messages, previous):
if previous is not None:
messages = add_messages(previous, messages)
llm_response = call_model(messages).result()
while True:
if not llm_response.tool_calls:
break
# Review tool calls
tool_results = []
tool_calls = []
for i, tool_call in enumerate(llm_response.tool_calls):
review = review_tool_call(tool_call)
if isinstance(review, ToolMessage):
tool_results.append(review)
else: # is a validated tool call
tool_calls.append(review)
if review != tool_call:
llm_response.tool_calls[i] = review # update message
# Execute remaining tool calls
tool_result_futures = [call_tool(tool_call) for tool_call in tool_calls]
remaining_tool_results = [fut.result() for fut in tool_result_futures]
# Append to message list
messages = add_messages(
messages,
[llm_response, *tool_results, *remaining_tool_results],
)
# Call model again
llm_response = call_model(messages).result()
# Generate final response
messages = add_messages(messages, llm_response)
return entrypoint.final(value=llm_response, save=messages)
```
## Short-term memory
Short-term memory allows storing information across different **invocations** of the same **thread id**. See [short-term memory](../concepts/functional_api.md#short-term-memory) for more details.
### Manage checkpoints
You can view and delete the information stored by the checkpointer.
#### View thread state (checkpoint)
```python hl_lines="3 6 10"
config = {
"configurable": {
"thread_id": "1",
# optionally provide an ID for a specific checkpoint,
# otherwise the latest checkpoint is shown
# "checkpoint_id": "1f029ca3-1f5b-6704-8004-820c16b69a5a"
}
}
graph.get_state(config)
```
```
StateSnapshot(
values={'messages': [HumanMessage(content="hi! I'm bob"), AIMessage(content='Hi Bob! How are you doing today?'), HumanMessage(content="what's my name?"), AIMessage(content='Your name is Bob.')]}, next=(),
config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1f029ca3-1f5b-6704-8004-820c16b69a5a'}},
metadata={
'source': 'loop',
'writes': {'call_model': {'messages': AIMessage(content='Your name is Bob.')}},
'step': 4,
'parents': {},
'thread_id': '1'
},
created_at='2025-05-05T16:01:24.680462+00:00',
parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1f029ca3-1790-6b0a-8003-baf965b6a38f'}},
tasks=(),
interrupts=()
)
```
#### View the history of the thread (checkpoints)
```python hl_lines="3 6"
config = {
"configurable": {
"thread_id": "1"
}
}
list(graph.get_state_history(config))
```
```
[
StateSnapshot(
values={'messages': [HumanMessage(content="hi! I'm bob"), AIMessage(content='Hi Bob! How are you doing today? Is there anything I can help you with?'), HumanMessage(content="what's my name?"), AIMessage(content='Your name is Bob.')]},
next=(),
config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1f029ca3-1f5b-6704-8004-820c16b69a5a'}},
metadata={'source': 'loop', 'writes': {'call_model': {'messages': AIMessage(content='Your name is Bob.')}}, 'step': 4, 'parents': {}, 'thread_id': '1'},
created_at='2025-05-05T16:01:24.680462+00:00',
parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1f029ca3-1790-6b0a-8003-baf965b6a38f'}},
tasks=(),
interrupts=()
),
StateSnapshot(
values={'messages': [HumanMessage(content="hi! I'm bob"), AIMessage(content='Hi Bob! How are you doing today? Is there anything I can help you with?'), HumanMessage(content="what's my name?")]},
next=('call_model',),
config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1f029ca3-1790-6b0a-8003-baf965b6a38f'}},
metadata={'source': 'loop', 'writes': None, 'step': 3, 'parents': {}, 'thread_id': '1'},
created_at='2025-05-05T16:01:23.863421+00:00',
parent_config={...},
tasks=(PregelTask(id='8ab4155e-6b15-b885-9ce5-bed69a2c305c', name='call_model', path=('__pregel_pull', 'call_model'), error=None, interrupts=(), state=None, result={'messages': AIMessage(content='Your name is Bob.')}),),
interrupts=()
),
StateSnapshot(
values={'messages': [HumanMessage(content="hi! I'm bob"), AIMessage(content='Hi Bob! How are you doing today? Is there anything I can help you with?')]},
next=('__start__',),
config={...},
metadata={'source': 'input', 'writes': {'__start__': {'messages': [{'role': 'user', 'content': "what's my name?"}]}}, 'step': 2, 'parents': {}, 'thread_id': '1'},
created_at='2025-05-05T16:01:23.863173+00:00',
parent_config={...},
tasks=(PregelTask(id='24ba39d6-6db1-4c9b-f4c5-682aeaf38dcd', name='__start__', path=('__pregel_pull', '__start__'), error=None, interrupts=(), state=None, result={'messages': [{'role': 'user', 'content': "what's my name?"}]}),),
interrupts=()
),
StateSnapshot(
values={'messages': [HumanMessage(content="hi! I'm bob"), AIMessage(content='Hi Bob! How are you doing today? Is there anything I can help you with?')]},
next=(),
config={...},
metadata={'source': 'loop', 'writes': {'call_model': {'messages': AIMessage(content='Hi Bob! How are you doing today? Is there anything I can help you with?')}}, 'step': 1, 'parents': {}, 'thread_id': '1'},
created_at='2025-05-05T16:01:23.862295+00:00',
parent_config={...},
tasks=(),
interrupts=()
),
StateSnapshot(
values={'messages': [HumanMessage(content="hi! I'm bob")]},
next=('call_model',),
config={...},
metadata={'source': 'loop', 'writes': None, 'step': 0, 'parents': {}, 'thread_id': '1'},
created_at='2025-05-05T16:01:22.278960+00:00',
parent_config={...},
tasks=(PregelTask(id='8cbd75e0-3720-b056-04f7-71ac805140a0', name='call_model', path=('__pregel_pull', 'call_model'), error=None, interrupts=(), state=None, result={'messages': AIMessage(content='Hi Bob! How are you doing today? Is there anything I can help you with?')}),),
interrupts=()
),
StateSnapshot(
values={'messages': []},
next=('__start__',),
config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1f029ca3-0870-6ce2-bfff-1f3f14c3e565'}},
metadata={'source': 'input', 'writes': {'__start__': {'messages': [{'role': 'user', 'content': "hi! I'm bob"}]}}, 'step': -1, 'parents': {}, 'thread_id': '1'},
created_at='2025-05-05T16:01:22.277497+00:00',
parent_config=None,
tasks=(PregelTask(id='d458367b-8265-812c-18e2-33001d199ce6', name='__start__', path=('__pregel_pull', '__start__'), error=None, interrupts=(), state=None, result={'messages': [{'role': 'user', 'content': "hi! I'm bob"}]}),),
interrupts=()
)
]
```
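#### Delete the checkpoints of a thread
The snapshots above show how to inspect a thread. To remove what has been stored, recent checkpointer releases expose a `delete_thread` method; a minimal sketch (assuming a checkpointer version that provides it, such as an up-to-date `MemorySaver`/`InMemorySaver`):
```python
thread_id = "1"
# Removes all checkpoints saved for this thread; other threads are unaffected.
checkpointer.delete_thread(thread_id)
```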
### Decouple return value from saved value
Use `entrypoint.final` to decouple what is returned to the caller from what is persisted in the checkpoint. This is useful when:
* You want to return a computed result (e.g., a summary or status), but save a different internal value for use on the next invocation.
* You need to control what gets passed to the `previous` parameter on the next run.
```python
from typing import Optional
from langgraph.func import entrypoint
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def accumulate(n: int, *, previous: Optional[int]) -> entrypoint.final[int, int]:
previous = previous or 0
total = previous + n
# Return the *previous* value to the caller but save the *new* total to the checkpoint.
return entrypoint.final(value=previous, save=total)
config = {"configurable": {"thread_id": "my-thread"}}
print(accumulate.invoke(1, config=config)) # 0
print(accumulate.invoke(2, config=config)) # 1
print(accumulate.invoke(3, config=config)) # 3
```
### Chatbot example
Below is an example of a simple chatbot built with the functional API and the `MemorySaver` checkpointer.
The bot remembers the previous conversation and continues from where it left off.
```python
from langchain_core.messages import BaseMessage
from langgraph.graph import add_messages
from langgraph.func import entrypoint, task
from langgraph.checkpoint.memory import MemorySaver
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
@task
def call_model(messages: list[BaseMessage]):
response = model.invoke(messages)
return response
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(inputs: list[BaseMessage], *, previous: list[BaseMessage]):
if previous:
inputs = add_messages(previous, inputs)
response = call_model(inputs).result()
return entrypoint.final(value=response, save=add_messages(inputs, response))
config = {"configurable": {"thread_id": "1"}}
input_message = {"role": "user", "content": "hi! I'm bob"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
input_message = {"role": "user", "content": "what's my name?"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
```
??? example "Extended example: build a simple chatbot"
[How to add thread-level persistence (functional API)](./persistence-functional.ipynb): Shows how to add thread-level persistence to a functional API workflow and implements a simple chatbot.
## Long-term memory
[Long-term memory](../concepts/memory.md#long-term-memory) allows storing information across different **thread ids**. This could be useful for learning information about a given user in one conversation and using it in another.
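As a rough sketch of the idea (assuming an `InMemoryStore` and that the entrypoint function accepts the injected `store` parameter, as in the cross-thread persistence guide linked below), a store shared across threads can be written in one conversation and read in another:
```python
import uuid

from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint
from langgraph.store.base import BaseStore
from langgraph.store.memory import InMemoryStore

checkpointer = MemorySaver()
store = InMemoryStore()  # shared across all threads

@entrypoint(checkpointer=checkpointer, store=store)
def remember_name(inputs: dict, *, store: BaseStore) -> str:
    # Namespace the memories by a (hypothetical) user ID so different users don't collide.
    namespace = ("memories", inputs["user_id"])
    if "name" in inputs:
        # Save the user's name so other threads can read it later.
        store.put(namespace, "name", {"name": inputs["name"]})
    item = store.get(namespace, "name")
    return item.value["name"] if item else "unknown"

# Written in one thread...
config_1 = {"configurable": {"thread_id": str(uuid.uuid4())}}
remember_name.invoke({"user_id": "u1", "name": "Bob"}, config=config_1)

# ...and visible from a different thread.
config_2 = {"configurable": {"thread_id": str(uuid.uuid4())}}
print(remember_name.invoke({"user_id": "u1"}, config=config_2))  # -> Bob
```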
??? example "Extended example: add long-term memory"
[How to add cross-thread persistence (functional API)](./cross-thread-persistence-functional.ipynb): Shows how to add cross-thread persistence to a functional API workflow and implements a simple chatbot.
## Workflows
* [Workflows and agents](../tutorials/workflows.md): more examples of how to build workflows using the Functional API.
## Agents
* [How to create an agent from scratch (Functional API)](./react-agent-from-scratch-functional.ipynb): Shows how to create a simple agent from scratch using the functional API.
* [How to build a multi-agent network](./multi-agent-network-functional.ipynb): Shows how to build a multi-agent network using the functional API.
* [How to add multi-turn conversation in a multi-agent application (functional API)](./multi-agent-multi-turn-convo-functional.ipynb): allow an end-user to engage in a multi-turn conversation with one or more agents.
## Integrate with other libraries
* [Add LangGraph's features to other frameworks using the functional API](./autogen-integration-functional.ipynb): Add LangGraph features like persistence, memory and streaming to other agent frameworks that do not provide them out of the box.
---
how-tos/streaming.md
---
# Stream outputs
You can [stream outputs](../concepts/streaming.md) from a LangGraph agent or workflow.
## Supported stream modes
Pass one or more of the following stream modes as a list to the [`stream()`][langgraph.graph.state.CompiledStateGraph.stream] or [`astream()`][langgraph.graph.state.CompiledStateGraph.astream] methods:
| Mode | Description |
|------|-------------|
| `values` | Streams the full value of the state after each step of the graph. |
| `updates` | Streams the updates to the state after each step of the graph. If multiple updates are made in the same step (e.g., multiple nodes are run), those updates are streamed separately. |
| `custom` | Streams custom data from inside your graph nodes. |
| `messages` | Streams 2-tuples (LLM token, metadata) from any graph nodes where an LLM is invoked. |
| `debug` | Streams as much information as possible throughout the execution of the graph. |
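For example (a minimal sketch, assuming a compiled `graph` and some `inputs`), a single mode can be passed as a string, while a list of modes yields `(mode, chunk)` tuples:
```python
# Single mode: each chunk is the raw payload for that mode.
for chunk in graph.stream(inputs, stream_mode="updates"):
    print(chunk)

# Multiple modes: each chunk arrives as a (mode, payload) tuple.
for mode, chunk in graph.stream(inputs, stream_mode=["updates", "messages"]):
    print(mode, chunk)
```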
## Stream from an agent
### Agent progress
To stream agent progress, use the [`stream()`][langgraph.graph.state.CompiledStateGraph.stream] or [`astream()`][langgraph.graph.state.CompiledStateGraph.astream] methods with `stream_mode="updates"`. This emits an event after every agent step.
For example, if you have an agent that calls a tool once, you should see the following updates:
* **LLM node**: AI message with tool call requests
* **Tool node**: Tool message with execution result
* **LLM node**: Final AI response
=== "Sync"
```python hl_lines="5 7"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
for chunk in agent.stream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode="updates"
):
print(chunk)
print("\n")
```
=== "Async"
```python hl_lines="5 7"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
async for chunk in agent.astream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode="updates"
):
print(chunk)
print("\n")
```
### LLM tokens
To stream tokens as they are produced by the LLM, use `stream_mode="messages"`:
=== "Sync"
```python hl_lines="5 7"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
for token, metadata in agent.stream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode="messages"
):
print("Token", token)
print("Metadata", metadata)
print("\n")
```
=== "Async"
```python hl_lines="5 7"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
async for token, metadata in agent.astream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode="messages"
):
print("Token", token)
print("Metadata", metadata)
print("\n")
```
### Tool updates
To stream updates from tools as they are executed, you can use [get_stream_writer][langgraph.config.get_stream_writer].
=== "Sync"
```python hl_lines="1 5 7 17"
from langgraph.config import get_stream_writer
def get_weather(city: str) -> str:
"""Get weather for a given city."""
writer = get_stream_writer()
# stream any arbitrary data
writer(f"Looking up data for city: {city}")
return f"It's always sunny in {city}!"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
for chunk in agent.stream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode="custom"
):
print(chunk)
print("\n")
```
=== "Async"
```python hl_lines="1 5 7 17"
from langgraph.config import get_stream_writer
def get_weather(city: str) -> str:
"""Get weather for a given city."""
writer = get_stream_writer()
# stream any arbitrary data
writer(f"Looking up data for city: {city}")
return f"It's always sunny in {city}!"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
async for chunk in agent.astream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode="custom"
):
print(chunk)
print("\n")
```
!!! Note
If you add `get_stream_writer` inside your tool, you won't be able to invoke the tool outside of a LangGraph execution context.
### Stream multiple modes
You can specify multiple streaming modes by passing stream mode as a list: `stream_mode=["updates", "messages", "custom"]`:
=== "Sync"
```python hl_lines="8"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
for stream_mode, chunk in agent.stream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode=["updates", "messages", "custom"]
):
print(chunk)
print("\n")
```
=== "Async"
```python hl_lines="8"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_weather],
)
async for stream_mode, chunk in agent.astream(
{"messages": [{"role": "user", "content": "what is the weather in sf"}]},
stream_mode=["updates", "messages", "custom"]
):
print(chunk)
print("\n")
```
### Disable streaming
In some applications you might need to disable streaming of individual tokens for a given model. This is useful in [multi-agent](../agents/multi-agent.md) systems to control which agents stream their output.
See the [Models](../agents/models.md#disable-streaming) guide to learn how to disable streaming.
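For example, a minimal sketch of the idea (assuming the `get_weather` tool from the examples above, and using the same `disable_streaming=True` flag covered later in this guide):
```python
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent

# This agent's model will not emit token-by-token output, so it stays quiet
# under stream_mode="messages" while other agents can keep streaming.
quiet_model = init_chat_model(
    "anthropic:claude-3-7-sonnet-latest",
    disable_streaming=True,
)
agent = create_react_agent(model=quiet_model, tools=[get_weather])
```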
## Stream from a workflow
### Basic usage example
LangGraph graphs expose the [`.stream()`][langgraph.pregel.Pregel.stream] (sync) and [`.astream()`][langgraph.pregel.Pregel.astream] (async) methods to yield streamed outputs as iterators.
=== "Sync"
```python
for chunk in graph.stream(inputs, stream_mode="updates"):
print(chunk)
```
=== "Async"
```python
async for chunk in graph.astream(inputs, stream_mode="updates"):
print(chunk)
```
??? example "Extended example: streaming updates"
```python hl_lines="24 26"
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
topic: str
joke: str
def refine_topic(state: State):
return {"topic": state["topic"] + " and cats"}
def generate_joke(state: State):
return {"joke": f"This is a joke about {state['topic']}"}
graph = (
StateGraph(State)
.add_node(refine_topic)
.add_node(generate_joke)
.add_edge(START, "refine_topic")
.add_edge("refine_topic", "generate_joke")
.add_edge("generate_joke", END)
.compile()
)
for chunk in graph.stream( # (1)!
{"topic": "ice cream"},
stream_mode="updates", # (2)!
):
print(chunk)
```
1. The `stream()` method returns an iterator that yields streamed outputs.
2. Set `stream_mode="updates"` to stream only the updates to the graph state after each node. Other stream modes are also available. See [supported stream modes](#supported-stream-modes) for details.
```output
{'refine_topic': {'topic': 'ice cream and cats'}}
{'generate_joke': {'joke': 'This is a joke about ice cream and cats'}}
```
### Stream multiple modes
You can pass a list as the `stream_mode` parameter to stream multiple modes at once.
The streamed outputs will be tuples of `(mode, chunk)` where `mode` is the name of the stream mode and `chunk` is the data streamed by that mode.
=== "Sync"
```python
for mode, chunk in graph.stream(inputs, stream_mode=["updates", "custom"]):
print(chunk)
```
=== "Async"
```python
async for mode, chunk in graph.astream(inputs, stream_mode=["updates", "custom"]):
print(chunk)
```
### Stream graph state
Use the stream modes `updates` and `values` to stream the state of the graph as it executes.
* `updates` streams the **updates** to the state after each step of the graph.
* `values` streams the **full value** of the state after each step of the graph.
```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
class State(TypedDict):
topic: str
joke: str
def refine_topic(state: State):
return {"topic": state["topic"] + " and cats"}
def generate_joke(state: State):
return {"joke": f"This is a joke about {state['topic']}"}
graph = (
StateGraph(State)
.add_node(refine_topic)
.add_node(generate_joke)
.add_edge(START, "refine_topic")
.add_edge("refine_topic", "generate_joke")
.add_edge("generate_joke", END)
.compile()
)
```
=== "updates"
Use this to stream only the **state updates** returned by the nodes after each step. The streamed outputs include the name of the node as well as the update.
```python hl_lines="3"
for chunk in graph.stream(
{"topic": "ice cream"},
stream_mode="updates",
):
print(chunk)
```
=== "values"
Use this to stream the **full state** of the graph after each step.
```python hl_lines="3"
for chunk in graph.stream(
{"topic": "ice cream"},
stream_mode="values",
):
print(chunk)
```
### Stream subgraph outputs
To include outputs from [subgraphs](../concepts/subgraphs.md) in the streamed outputs, you can set `subgraphs=True` in the `.stream()` method of the parent graph. This will stream outputs from both the parent graph and any subgraphs.
The outputs will be streamed as tuples `(namespace, data)`, where `namespace` is a tuple with the path to the node where a subgraph is invoked, e.g. `("parent_node:", "child_node:")`.
```python hl_lines="3"
for chunk in graph.stream(
{"foo": "foo"},
subgraphs=True, # (1)!
stream_mode="updates",
):
print(chunk)
```
1. Set `subgraphs=True` to stream outputs from subgraphs.
??? example "Extended example: streaming from subgraphs"
```python hl_lines="39"
from langgraph.graph import START, StateGraph
from typing import TypedDict
# Define subgraph
class SubgraphState(TypedDict):
foo: str # note that this key is shared with the parent graph state
bar: str
def subgraph_node_1(state: SubgraphState):
return {"bar": "bar"}
def subgraph_node_2(state: SubgraphState):
return {"foo": state["foo"] + state["bar"]}
subgraph_builder = StateGraph(SubgraphState)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_node(subgraph_node_2)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph_builder.add_edge("subgraph_node_1", "subgraph_node_2")
subgraph = subgraph_builder.compile()
# Define parent graph
class ParentState(TypedDict):
foo: str
def node_1(state: ParentState):
return {"foo": "hi! " + state["foo"]}
builder = StateGraph(ParentState)
builder.add_node("node_1", node_1)
builder.add_node("node_2", subgraph)
builder.add_edge(START, "node_1")
builder.add_edge("node_1", "node_2")
graph = builder.compile()
for chunk in graph.stream(
{"foo": "foo"},
stream_mode="updates",
subgraphs=True, # (1)!
):
print(chunk)
```
1. Set `subgraphs=True` to stream outputs from subgraphs.
```
((), {'node_1': {'foo': 'hi! foo'}})
(('node_2:dfddc4ba-c3c5-6887-5012-a243b5b377c2',), {'subgraph_node_1': {'bar': 'bar'}})
(('node_2:dfddc4ba-c3c5-6887-5012-a243b5b377c2',), {'subgraph_node_2': {'foo': 'hi! foobar'}})
((), {'node_2': {'foo': 'hi! foobar'}})
```
**Note** that we are receiving not only the node updates, but also the namespaces, which tell us which graph (or subgraph) we are streaming from.
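For instance, a minimal sketch of acting only on updates that come from a subgraph (assuming a parent graph that invokes a subgraph from a node named `node_2`, as in the example above):
```python
for namespace, chunk in graph.stream(
    {"foo": "foo"},
    stream_mode="updates",
    subgraphs=True,
):
    # namespace is () for the parent graph and e.g. ("node_2:<task-id>",)
    # for nodes executed inside the subgraph invoked from `node_2`.
    if namespace and namespace[0].startswith("node_2:"):
        print("subgraph update:", chunk)
```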
### Debugging {#debug}
Use the `debug` streaming mode to stream as much information as possible throughout the execution of the graph. The streamed outputs include the name of the node as well as the full state.
```python hl_lines="3"
for chunk in graph.stream(
{"topic": "ice cream"},
stream_mode="debug",
):
print(chunk)
```
### LLM tokens {#messages}
Use the `messages` streaming mode to stream Large Language Model (LLM) outputs **token by token** from any part of your graph, including nodes, tools, subgraphs, or tasks.
The streamed output from [`messages` mode](#supported-stream-modes) is a tuple `(message_chunk, metadata)` where:
- `message_chunk`: the token or message segment from the LLM.
- `metadata`: a dictionary containing details about the graph node and LLM invocation.
> If your LLM is not available as a LangChain integration, you can stream its outputs using `custom` mode instead. See [use with any LLM](#use-with-any-llm) for details.
!!! warning "Manual config required for async in Python < 3.11"
When using Python < 3.11 with async code, you must explicitly pass `RunnableConfig` to `ainvoke()` to enable proper streaming. See [Async with Python < 3.11](#async) for details or upgrade to Python 3.11+.
```python hl_lines="17 33"
from dataclasses import dataclass
from langchain.chat_models import init_chat_model
from langgraph.graph import StateGraph, START
@dataclass
class MyState:
topic: str
joke: str = ""
llm = init_chat_model(model="openai:gpt-4o-mini")
def call_model(state: MyState):
"""Call the LLM to generate a joke about a topic"""
llm_response = llm.invoke( # (1)!
[
{"role": "user", "content": f"Generate a joke about {state.topic}"}
]
)
return {"joke": llm_response.content}
graph = (
StateGraph(MyState)
.add_node(call_model)
.add_edge(START, "call_model")
.compile()
)
for message_chunk, metadata in graph.stream( # (2)!
{"topic": "ice cream"},
stream_mode="messages",
):
if message_chunk.content:
print(message_chunk.content, end="|", flush=True)
```
1. Note that the message events are emitted even when the LLM is run using `.invoke` rather than `.stream`.
2. The "messages" stream mode returns an iterator of tuples `(message_chunk, metadata)` where `message_chunk` is the token streamed by the LLM and `metadata` is a dictionary with information about the graph node where the LLM was called and other information.
#### Filter by LLM invocation
You can associate `tags` with LLM invocations to filter the streamed tokens by LLM invocation.
```python hl_lines="10"
from langchain.chat_models import init_chat_model
llm_1 = init_chat_model(model="openai:gpt-4o-mini", tags=['joke']) # (1)!
llm_2 = init_chat_model(model="openai:gpt-4o-mini", tags=['poem']) # (2)!
graph = ... # define a graph that uses these LLMs
async for msg, metadata in graph.astream( # (3)!
{"topic": "cats"},
stream_mode="messages",
):
if metadata["tags"] == ["joke"]: # (4)!
print(msg.content, end="|", flush=True)
```
1. llm_1 is tagged with "joke".
2. llm_2 is tagged with "poem".
3. The `stream_mode` is set to "messages" to stream LLM tokens. The `metadata` contains information about the LLM invocation, including the tags.
4. Filter the streamed tokens by the `tags` field in the metadata to only include the tokens from the LLM invocation with the "joke" tag.
??? example "Extended example: filtering by tags"
```python hl_lines="42"
from typing import TypedDict
from langchain.chat_models import init_chat_model
from langgraph.graph import START, StateGraph
joke_model = init_chat_model(model="openai:gpt-4o-mini", tags=["joke"]) # (1)!
poem_model = init_chat_model(model="openai:gpt-4o-mini", tags=["poem"]) # (2)!
class State(TypedDict):
topic: str
joke: str
poem: str
async def call_model(state, config):
topic = state["topic"]
print("Writing joke...")
# Note: Passing the config through explicitly is required for python < 3.11
# Since context var support wasn't added before then: https://docs.python.org/3/library/asyncio-task.html#creating-tasks
joke_response = await joke_model.ainvoke(
[{"role": "user", "content": f"Write a joke about {topic}"}],
config, # (3)!
)
print("\n\nWriting poem...")
poem_response = await poem_model.ainvoke(
[{"role": "user", "content": f"Write a short poem about {topic}"}],
config, # (3)!
)
return {"joke": joke_response.content, "poem": poem_response.content}
graph = (
StateGraph(State)
.add_node(call_model)
.add_edge(START, "call_model")
.compile()
)
async for msg, metadata in graph.astream(
{"topic": "cats"},
stream_mode="messages", # (4)!
):
if metadata["tags"] == ["joke"]: # (4)!
print(msg.content, end="|", flush=True)
```
1. The `joke_model` is tagged with "joke".
2. The `poem_model` is tagged with "poem".
3. The `config` is passed through explicitly to ensure the context vars are propagated correctly. This is required for Python < 3.11 when using async code. Please see the [async section](#async) for more details.
4. The `stream_mode` is set to "messages" to stream LLM tokens. The `metadata` contains information about the LLM invocation, including the tags.
#### Filter by node
To stream tokens only from specific nodes, use `stream_mode="messages"` and filter the outputs by the `langgraph_node` field in the streamed metadata:
```python hl_lines="3 5"
for msg, metadata in graph.stream( # (1)!
inputs,
stream_mode="messages",
):
if msg.content and metadata["langgraph_node"] == "some_node_name": # (2)!
...
```
1. The "messages" stream mode returns a tuple of `(message_chunk, metadata)` where `message_chunk` is the token streamed by the LLM and `metadata` is a dictionary with information about the graph node where the LLM was called and other information.
2. Filter the streamed tokens by the `langgraph_node` field in the metadata to only include the tokens from the `write_poem` node.
??? example "Extended example: streaming LLM tokens from specific nodes"
```python hl_lines="40 44"
from typing import TypedDict
from langgraph.graph import START, StateGraph
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o-mini")
class State(TypedDict):
topic: str
joke: str
poem: str
def write_joke(state: State):
topic = state["topic"]
joke_response = model.invoke(
[{"role": "user", "content": f"Write a joke about {topic}"}]
)
return {"joke": joke_response.content}
def write_poem(state: State):
topic = state["topic"]
poem_response = model.invoke(
[{"role": "user", "content": f"Write a short poem about {topic}"}]
)
return {"poem": poem_response.content}
graph = (
StateGraph(State)
.add_node(write_joke)
.add_node(write_poem)
# write both the joke and the poem concurrently
.add_edge(START, "write_joke")
.add_edge(START, "write_poem")
.compile()
)
for msg, metadata in graph.stream( # (1)!
{"topic": "cats"},
stream_mode="messages",
):
if msg.content and metadata["langgraph_node"] == "write_poem": # (2)!
print(msg.content, end="|", flush=True)
```
1. The "messages" stream mode returns a tuple of `(message_chunk, metadata)` where `message_chunk` is the token streamed by the LLM and `metadata` is a dictionary with information about the graph node where the LLM was called and other information.
2. Filter the streamed tokens by the `langgraph_node` field in the metadata to only include the tokens from the `write_poem` node.
### Stream custom data
To send **custom user-defined data** from inside a LangGraph node or tool, follow these steps:
1. Use `get_stream_writer()` to access the stream writer and emit custom data.
2. Set `stream_mode="custom"` when calling `.stream()` or `.astream()` to get the custom data in the stream. You can combine multiple modes (e.g., `["updates", "custom"]`), but at least one must be `"custom"`.
!!! warning "No `get_stream_writer()` in async for Python < 3.11"
In async code running on Python < 3.11, `get_stream_writer()` will not work.
Instead, add a `writer` parameter to your node or tool and pass it manually.
See [Async with Python < 3.11](#async) for usage examples.
=== "node"
```python
from typing import TypedDict
from langgraph.config import get_stream_writer
from langgraph.graph import StateGraph, START
class State(TypedDict):
query: str
answer: str
def node(state: State):
writer = get_stream_writer() # (1)!
writer({"custom_key": "Generating custom data inside node"}) # (2)!
return {"answer": "some data"}
graph = (
StateGraph(State)
.add_node(node)
.add_edge(START, "node")
.compile()
)
inputs = {"query": "example"}
# Usage
for chunk in graph.stream(inputs, stream_mode="custom"): # (3)!
print(chunk)
```
1. Get the stream writer to send custom data.
2. Emit a custom key-value pair (e.g., progress update).
3. Set `stream_mode="custom"` to receive the custom data in the stream.
=== "tool"
```python hl_lines="8 10"
from langchain_core.tools import tool
from langgraph.config import get_stream_writer
@tool
def query_database(query: str) -> str:
"""Query the database."""
writer = get_stream_writer() # (1)!
writer({"data": "Retrieved 0/100 records", "type": "progress"}) # (2)!
# perform query
writer({"data": "Retrieved 100/100 records", "type": "progress"}) # (3)!
return "some-answer"
graph = ... # define a graph that uses this tool
for chunk in graph.stream(inputs, stream_mode="custom"): # (4)!
print(chunk)
```
1. Access the stream writer to send custom data.
2. Emit a custom key-value pair (e.g., progress update).
3. Emit another custom key-value pair.
4. Set `stream_mode="custom"` to receive the custom data in the stream.
### Use with any LLM
You can use `stream_mode="custom"` to stream data from **any LLM API** — even if that API does **not** implement the LangChain chat model interface.
This lets you integrate raw LLM clients or external services that provide their own streaming interfaces, making LangGraph highly flexible for custom setups.
```python hl_lines="5 8 20"
from langgraph.config import get_stream_writer
def call_arbitrary_model(state):
"""Example node that calls an arbitrary model and streams the output"""
writer = get_stream_writer() # (1)!
# Assume you have a streaming client that yields chunks
for chunk in your_custom_streaming_client(state["topic"]): # (2)!
writer({"custom_llm_chunk": chunk}) # (3)!
return {"result": "completed"}
graph = (
StateGraph(State)
.add_node(call_arbitrary_model)
# Add other nodes and edges as needed
.compile()
)
for chunk in graph.stream(
{"topic": "cats"},
stream_mode="custom", # (4)!
):
# The chunk will contain the custom data streamed from the llm
print(chunk)
```
1. Get the stream writer to send custom data.
2. Generate LLM tokens using your custom streaming client.
3. Use the writer to send custom data to the stream.
4. Set `stream_mode="custom"` to receive the custom data in the stream.
??? example "Extended example: streaming arbitrary chat model"
```python
import operator
import json
from typing import TypedDict
from typing_extensions import Annotated
from langgraph.graph import StateGraph, START
from langgraph.config import get_stream_writer
from openai import AsyncOpenAI
openai_client = AsyncOpenAI()
model_name = "gpt-4o-mini"
async def stream_tokens(model_name: str, messages: list[dict]):
response = await openai_client.chat.completions.create(
messages=messages, model=model_name, stream=True
)
role = None
async for chunk in response:
delta = chunk.choices[0].delta
if delta.role is not None:
role = delta.role
if delta.content:
yield {"role": role, "content": delta.content}
# this is our tool
async def get_items(place: str) -> str:
"""Use this tool to list items one might find in a place you're asked about."""
writer = get_stream_writer()
response = ""
async for msg_chunk in stream_tokens(
model_name,
[
{
"role": "user",
"content": (
"Can you tell me what kind of items "
f"i might find in the following place: '{place}'. "
"List at least 3 such items separating them by a comma. "
"And include a brief description of each item."
),
}
],
):
response += msg_chunk["content"]
writer(msg_chunk)
return response
class State(TypedDict):
messages: Annotated[list[dict], operator.add]
# this is the tool-calling graph node
async def call_tool(state: State):
ai_message = state["messages"][-1]
tool_call = ai_message["tool_calls"][-1]
function_name = tool_call["function"]["name"]
if function_name != "get_items":
raise ValueError(f"Tool {function_name} not supported")
function_arguments = tool_call["function"]["arguments"]
arguments = json.loads(function_arguments)
function_response = await get_items(**arguments)
tool_message = {
"tool_call_id": tool_call["id"],
"role": "tool",
"name": function_name,
"content": function_response,
}
return {"messages": [tool_message]}
graph = (
StateGraph(State)
.add_node(call_tool)
.add_edge(START, "call_tool")
.compile()
)
```
Let's invoke the graph with an AI message that includes a tool call:
```python
inputs = {
"messages": [
{
"content": None,
"role": "assistant",
"tool_calls": [
{
"id": "1",
"function": {
"arguments": '{"place":"bedroom"}',
"name": "get_items",
},
"type": "function",
}
],
}
]
}
async for chunk in graph.astream(
inputs,
stream_mode="custom",
):
print(chunk["content"], end="|", flush=True)
```
### Disable streaming for specific chat models
If your application mixes models that support streaming with those that do not, you may need to explicitly disable streaming for
models that do not support it.
Set `disable_streaming=True` when initializing the model.
=== "init_chat_model"
```python hl_lines="5"
from langchain.chat_models import init_chat_model
model = init_chat_model(
"anthropic:claude-3-7-sonnet-latest",
disable_streaming=True # (1)!
)
```
1. Set `disable_streaming=True` to disable streaming for the chat model.
=== "chat model interface"
```python
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="o1-preview", disable_streaming=True) # (1)!
```
1. Set `disable_streaming=True` to disable streaming for the chat model.
### Async with Python < 3.11 { #async }
In Python versions < 3.11, [asyncio tasks](https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task) do not support the `context` parameter.
This limits LangGraph's ability to automatically propagate context, and affects LangGraph's streaming mechanisms in two key ways:
1. You **must** explicitly pass [`RunnableConfig`](https://python.langchain.com/docs/concepts/runnables/#runnableconfig) into async LLM calls (e.g., `ainvoke()`), as callbacks are not automatically propagated.
2. You **cannot** use `get_stream_writer()` in async nodes or tools — you must pass a `writer` argument directly.
??? example "Extended example: async LLM call with manual config"
```python hl_lines="16 29"
from typing import TypedDict
from langgraph.graph import START, StateGraph
from langchain.chat_models import init_chat_model
llm = init_chat_model(model="openai:gpt-4o-mini")
class State(TypedDict):
topic: str
joke: str
async def call_model(state, config): # (1)!
topic = state["topic"]
print("Generating joke...")
joke_response = await llm.ainvoke(
[{"role": "user", "content": f"Write a joke about {topic}"}],
config, # (2)!
)
return {"joke": joke_response.content}
graph = (
StateGraph(State)
.add_node(call_model)
.add_edge(START, "call_model")
.compile()
)
async for chunk, metadata in graph.astream(
{"topic": "ice cream"},
stream_mode="messages", # (3)!
):
if chunk.content:
print(chunk.content, end="|", flush=True)
```
1. Accept `config` as an argument in the async node function.
2. Pass `config` to `llm.ainvoke()` to ensure proper context propagation.
3. Set `stream_mode="messages"` to stream LLM tokens.
??? example "Extended example: async custom streaming with stream writer"
```python hl_lines="9 22"
from typing import TypedDict
from langgraph.graph import START, StateGraph
from langgraph.types import StreamWriter
class State(TypedDict):
topic: str
joke: str
async def generate_joke(state: State, writer: StreamWriter): # (1)!
writer({"custom_key": "Streaming custom data while generating a joke"})
return {"joke": f"This is a joke about {state['topic']}"}
graph = (
StateGraph(State)
.add_node(generate_joke)
.add_edge(START, "generate_joke")
.compile()
)
async for chunk in graph.astream(
{"topic": "ice cream"},
stream_mode="custom", # (2)!
):
print(chunk)
```
1. Add `writer` as an argument in the function signature of the async node or tool. LangGraph will automatically pass the stream writer to the function.
2. Set `stream_mode="custom"` to receive the custom data in the stream.
---
how-tos/subgraph.md
---
# Use subgraphs
This guide explains the mechanics of using [subgraphs](../concepts/subgraphs.md). A common application of subgraphs is to build [multi-agent](../concepts/multi_agent.md) systems.
When adding subgraphs, you need to define how the parent graph and the subgraph communicate:
* [Shared state schemas](#shared-state-schemas) — parent and subgraph have **shared state keys** in their state [schemas](../concepts/low_level.md#state)
* [Different state schemas](#different-state-schemas) — **no shared state keys** in parent and subgraph [schemas](../concepts/low_level.md#state)
## Setup
```bash
pip install -U langgraph
```
!!! tip "Set up LangSmith for LangGraph development"
Sign up for [LangSmith](https://smith.langchain.com) to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started [here](https://docs.smith.langchain.com).
## Shared state schemas
A common case is for the parent graph and subgraph to communicate over a shared state key (channel) in the [schema](../concepts/low_level.md#state). For example, in [multi-agent](../concepts/multi_agent.md) systems, the agents often communicate over a shared [messages](../concepts/low_level.md#why-use-messages) key.
If your subgraph shares state keys with the parent graph, you can follow these steps to add it to your graph:
1. Define the subgraph workflow (`subgraph_builder` in the example below) and compile it
2. Pass the compiled subgraph to the `.add_node` method when defining the parent graph workflow
```python
from typing_extensions import TypedDict
from langgraph.graph.state import StateGraph, START
class State(TypedDict):
foo: str
# Subgraph
def subgraph_node_1(state: State):
return {"foo": "hi! " + state["foo"]}
subgraph_builder = StateGraph(State)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph = subgraph_builder.compile()
# Parent graph
builder = StateGraph(State)
builder.add_node("node_1", subgraph)
builder.add_edge(START, "node_1")
graph = builder.compile()
```
??? example "Full example: shared state schemas"
```python
from typing_extensions import TypedDict
from langgraph.graph.state import StateGraph, START
# Define subgraph
class SubgraphState(TypedDict):
foo: str # (1)!
bar: str # (2)!
def subgraph_node_1(state: SubgraphState):
return {"bar": "bar"}
def subgraph_node_2(state: SubgraphState):
# note that this node is using a state key ('bar') that is only available in the subgraph
# and is sending update on the shared state key ('foo')
return {"foo": state["foo"] + state["bar"]}
subgraph_builder = StateGraph(SubgraphState)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_node(subgraph_node_2)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph_builder.add_edge("subgraph_node_1", "subgraph_node_2")
subgraph = subgraph_builder.compile()
# Define parent graph
class ParentState(TypedDict):
foo: str
def node_1(state: ParentState):
return {"foo": "hi! " + state["foo"]}
builder = StateGraph(ParentState)
builder.add_node("node_1", node_1)
builder.add_node("node_2", subgraph)
builder.add_edge(START, "node_1")
builder.add_edge("node_1", "node_2")
graph = builder.compile()
for chunk in graph.stream({"foo": "foo"}):
print(chunk)
```
1. This key is shared with the parent graph state
2. This key is private to the `SubgraphState` and is not visible to the parent graph
```
{'node_1': {'foo': 'hi! foo'}}
{'node_2': {'foo': 'hi! foobar'}}
```
## Different state schemas
For more complex systems you might want to define subgraphs that have a **completely different schema** from the parent graph (no shared keys). For example, you might want to keep a private message history for each of the agents in a [multi-agent](../concepts/multi_agent.md) system.
If that's the case for your application, you need to define a node **function that invokes the subgraph**. This function needs to transform the input (parent) state to the subgraph state before invoking the subgraph, and transform the results back to the parent state before returning the state update from the node.
```python
from typing_extensions import TypedDict
from langgraph.graph.state import StateGraph, START
class SubgraphState(TypedDict):
bar: str
# Subgraph
def subgraph_node_1(state: SubgraphState):
return {"bar": "hi! " + state["bar"]}
subgraph_builder = StateGraph(SubgraphState)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph = subgraph_builder.compile()
# Parent graph
class State(TypedDict):
foo: str
def call_subgraph(state: State):
subgraph_output = subgraph.invoke({"bar": state["foo"]}) # (1)!
return {"foo": subgraph_output["bar"]} # (2)!
builder = StateGraph(State)
builder.add_node("node_1", call_subgraph)
builder.add_edge(START, "node_1")
graph = builder.compile()
```
1. Transform the state to the subgraph state
2. Transform response back to the parent state
??? example "Full example: different state schemas"
```python
from typing_extensions import TypedDict
from langgraph.graph.state import StateGraph, START
# Define subgraph
class SubgraphState(TypedDict):
# note that none of these keys are shared with the parent graph state
bar: str
baz: str
def subgraph_node_1(state: SubgraphState):
return {"baz": "baz"}
def subgraph_node_2(state: SubgraphState):
return {"bar": state["bar"] + state["baz"]}
subgraph_builder = StateGraph(SubgraphState)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_node(subgraph_node_2)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph_builder.add_edge("subgraph_node_1", "subgraph_node_2")
subgraph = subgraph_builder.compile()
# Define parent graph
class ParentState(TypedDict):
foo: str
def node_1(state: ParentState):
return {"foo": "hi! " + state["foo"]}
def node_2(state: ParentState):
response = subgraph.invoke({"bar": state["foo"]}) # (1)!
return {"foo": response["bar"]} # (2)!
builder = StateGraph(ParentState)
builder.add_node("node_1", node_1)
builder.add_node("node_2", node_2)
builder.add_edge(START, "node_1")
builder.add_edge("node_1", "node_2")
graph = builder.compile()
for chunk in graph.stream({"foo": "foo"}, subgraphs=True):
print(chunk)
```
1. Transform the state to the subgraph state
2. Transform response back to the parent state
```
((), {'node_1': {'foo': 'hi! foo'}})
(('node_2:9c36dd0f-151a-cb42-cbad-fa2f851f9ab7',), {'subgraph_node_1': {'baz': 'baz'}})
(('node_2:9c36dd0f-151a-cb42-cbad-fa2f851f9ab7',), {'subgraph_node_2': {'bar': 'hi! foobaz'}})
((), {'node_2': {'foo': 'hi! foobaz'}})
```
??? example "Full example: different state schemas (two levels of subgraphs)"
This is an example with two levels of subgraphs: parent -> child -> grandchild.
```python
# Grandchild graph
from typing_extensions import TypedDict
from langgraph.graph.state import StateGraph, START, END
class GrandChildState(TypedDict):
my_grandchild_key: str
def grandchild_1(state: GrandChildState) -> GrandChildState:
# NOTE: child or parent keys will not be accessible here
return {"my_grandchild_key": state["my_grandchild_key"] + ", how are you"}
grandchild = StateGraph(GrandChildState)
grandchild.add_node("grandchild_1", grandchild_1)
grandchild.add_edge(START, "grandchild_1")
grandchild.add_edge("grandchild_1", END)
grandchild_graph = grandchild.compile()
# Child graph
class ChildState(TypedDict):
my_child_key: str
def call_grandchild_graph(state: ChildState) -> ChildState:
# NOTE: parent or grandchild keys won't be accessible here
grandchild_graph_input = {"my_grandchild_key": state["my_child_key"]} # (1)!
grandchild_graph_output = grandchild_graph.invoke(grandchild_graph_input)
return {"my_child_key": grandchild_graph_output["my_grandchild_key"] + " today?"} # (2)!
child = StateGraph(ChildState)
child.add_node("child_1", call_grandchild_graph) # (3)!
child.add_edge(START, "child_1")
child.add_edge("child_1", END)
child_graph = child.compile()
# Parent graph
class ParentState(TypedDict):
my_key: str
def parent_1(state: ParentState) -> ParentState:
# NOTE: child or grandchild keys won't be accessible here
return {"my_key": "hi " + state["my_key"]}
def parent_2(state: ParentState) -> ParentState:
return {"my_key": state["my_key"] + " bye!"}
def call_child_graph(state: ParentState) -> ParentState:
child_graph_input = {"my_child_key": state["my_key"]} # (4)!
child_graph_output = child_graph.invoke(child_graph_input)
return {"my_key": child_graph_output["my_child_key"]} # (5)!
parent = StateGraph(ParentState)
parent.add_node("parent_1", parent_1)
parent.add_node("child", call_child_graph) # (6)!
parent.add_node("parent_2", parent_2)
parent.add_edge(START, "parent_1")
parent.add_edge("parent_1", "child")
parent.add_edge("child", "parent_2")
parent.add_edge("parent_2", END)
parent_graph = parent.compile()
for chunk in parent_graph.stream({"my_key": "Bob"}, subgraphs=True):
print(chunk)
```
1. We're transforming the state from the child state channels (`my_child_key`) to the grandchild state channels (`my_grandchild_key`)
2. We're transforming the state from the grandchild state channels (`my_grandchild_key`) back to the child state channels (`my_child_key`)
3. We're passing a function here instead of just a compiled graph (`grandchild_graph`)
4. We're transforming the state from the parent state channels (`my_key`) to the child state channels (`my_child_key`)
5. We're transforming the state from the child state channels (`my_child_key`) back to the parent state channels (`my_key`)
6. We're passing a function here instead of just a compiled graph (`child_graph`)
```
((), {'parent_1': {'my_key': 'hi Bob'}})
(('child:2e26e9ce-602f-862c-aa66-1ea5a4655e3b', 'child_1:781bb3b1-3971-84ce-810b-acf819a03f9c'), {'grandchild_1': {'my_grandchild_key': 'hi Bob, how are you'}})
(('child:2e26e9ce-602f-862c-aa66-1ea5a4655e3b',), {'child_1': {'my_child_key': 'hi Bob, how are you today?'}})
((), {'child': {'my_key': 'hi Bob, how are you today?'}})
((), {'parent_2': {'my_key': 'hi Bob, how are you today? bye!'}})
```
## Add persistence
You only need to **provide the checkpointer when compiling the parent graph**. LangGraph will automatically propagate the checkpointer to the child subgraphs.
```python
from langgraph.graph import START, StateGraph
from langgraph.checkpoint.memory import InMemorySaver
from typing_extensions import TypedDict
class State(TypedDict):
foo: str
# Subgraph
def subgraph_node_1(state: State):
return {"foo": state["foo"] + "bar"}
subgraph_builder = StateGraph(State)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph = subgraph_builder.compile()
# Parent graph
builder = StateGraph(State)
builder.add_node("node_1", subgraph)
builder.add_edge(START, "node_1")
checkpointer = InMemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
If you want the subgraph to **have its own memory**, you can compile it with `checkpointer=True`. This is useful in [multi-agent](../concepts/multi_agent.md) systems if you want agents to keep track of their internal message histories:
```python
subgraph_builder = StateGraph(...)
subgraph = subgraph_builder.compile(checkpointer=True)
```
## View subgraph state
When you enable [persistence](../concepts/persistence.md), you can [inspect the graph state](../concepts/persistence.md#checkpoints) (checkpoint) via `graph.get_state(config)`. To view the subgraph state, you can use `graph.get_state(config, subgraphs=True)`.
!!! important "Available **only** when interrupted"
Subgraph state can only be viewed **when the subgraph is interrupted**. Once you resume the graph, you won't be able to access the subgraph state.
??? example "View interrupted subgraph state"
```python
from langgraph.graph import START, StateGraph
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import interrupt, Command
from typing_extensions import TypedDict
class State(TypedDict):
foo: str
# Subgraph
def subgraph_node_1(state: State):
value = interrupt("Provide value:")
return {"foo": state["foo"] + value}
subgraph_builder = StateGraph(State)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph = subgraph_builder.compile()
# Parent graph
builder = StateGraph(State)
builder.add_node("node_1", subgraph)
builder.add_edge(START, "node_1")
checkpointer = InMemorySaver()
graph = builder.compile(checkpointer=checkpointer)
config = {"configurable": {"thread_id": "1"}}
graph.invoke({"foo": ""}, config)
parent_state = graph.get_state(config)
subgraph_state = graph.get_state(config, subgraphs=True).tasks[0].state # (1)!
# resume the subgraph
graph.invoke(Command(resume="bar"), config)
```
1. This will be available only when the subgraph is interrupted. Once you resume the graph, you won't be able to access the subgraph state.
## Stream subgraph outputs
To include outputs from subgraphs in the streamed outputs, you can set `subgraphs=True` in the `.stream()` method of the parent graph. This will stream outputs from both the parent graph and any subgraphs.
```python
for chunk in graph.stream(
{"foo": "foo"},
subgraphs=True, # (1)!
stream_mode="updates",
):
print(chunk)
```
1. Set `subgraphs=True` to stream outputs from subgraphs.
??? example "Stream from subgraphs"
```python
from typing_extensions import TypedDict
from langgraph.graph.state import StateGraph, START
# Define subgraph
class SubgraphState(TypedDict):
foo: str
bar: str
def subgraph_node_1(state: SubgraphState):
return {"bar": "bar"}
def subgraph_node_2(state: SubgraphState):
# note that this node is using a state key ('bar') that is only available in the subgraph
# and is sending update on the shared state key ('foo')
return {"foo": state["foo"] + state["bar"]}
subgraph_builder = StateGraph(SubgraphState)
subgraph_builder.add_node(subgraph_node_1)
subgraph_builder.add_node(subgraph_node_2)
subgraph_builder.add_edge(START, "subgraph_node_1")
subgraph_builder.add_edge("subgraph_node_1", "subgraph_node_2")
subgraph = subgraph_builder.compile()
# Define parent graph
class ParentState(TypedDict):
foo: str
def node_1(state: ParentState):
return {"foo": "hi! " + state["foo"]}
builder = StateGraph(ParentState)
builder.add_node("node_1", node_1)
builder.add_node("node_2", subgraph)
builder.add_edge(START, "node_1")
builder.add_edge("node_1", "node_2")
graph = builder.compile()
for chunk in graph.stream(
{"foo": "foo"},
stream_mode="updates",
subgraphs=True, # (1)!
):
print(chunk)
```
1. Set `subgraphs=True` to stream outputs from subgraphs.
```
((), {'node_1': {'foo': 'hi! foo'}})
(('node_2:e58e5673-a661-ebb0-70d4-e298a7fc28b7',), {'subgraph_node_1': {'bar': 'bar'}})
(('node_2:e58e5673-a661-ebb0-70d4-e298a7fc28b7',), {'subgraph_node_2': {'foo': 'hi! foobar'}})
((), {'node_2': {'foo': 'hi! foobar'}})
```
---
how-tos/multi_agent.md
---
# Build multi-agent systems
A single agent might struggle if it needs to specialize in multiple domains or manage many tools. To tackle this, you can break your agent into smaller, independent agents and compose them into a [multi-agent system](../concepts/multi_agent.md).
In multi-agent systems, agents need to communicate with each other. They do so via [handoffs](#handoffs) — a primitive that describes which agent to hand control to and the payload to send to that agent.
This guide covers the following:
* implementing [handoffs](#handoffs) between agents
* using handoffs and the prebuilt [agent](../agents/agents.md) to [build a custom multi-agent system](#build-a-multi-agent-system)
To get started with building multi-agent systems, check out LangGraph's [prebuilt implementations](#prebuilt-implementations) of two of the most popular multi-agent architectures — [supervisor](../agents/multi-agent.md#supervisor) and [swarm](../agents/multi-agent.md#swarm).
## Handoffs
To set up communication between the agents in a multi-agent system you can use [**handoffs**](../concepts/multi_agent.md#handoffs) — a pattern where one agent *hands off* control to another. Handoffs allow you to specify:
- **destination**: target agent to navigate to (e.g., name of the LangGraph node to go to)
- **payload**: information to pass to that agent (e.g., state update)
### Create handoffs
To implement handoffs, you can return `Command` objects from your agent nodes or tools:
```python hl_lines="13 14 23 24 25"
from typing import Annotated
from langchain_core.tools import tool, InjectedToolCallId
from langgraph.prebuilt import create_react_agent, InjectedState
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.types import Command
def create_handoff_tool(*, agent_name: str, description: str | None = None):
name = f"transfer_to_{agent_name}"
description = description or f"Transfer to {agent_name}"
@tool(name, description=description)
def handoff_tool(
state: Annotated[MessagesState, InjectedState], # (1)!
tool_call_id: Annotated[str, InjectedToolCallId],
) -> Command:
tool_message = {
"role": "tool",
"content": f"Successfully transferred to {agent_name}",
"name": name,
"tool_call_id": tool_call_id,
}
return Command( # (2)!
goto=agent_name, # (3)!
update={"messages": state["messages"] + [tool_message]}, # (4)!
graph=Command.PARENT, # (5)!
)
return handoff_tool
```
1. Access the [state](../concepts/low_level.md#state) of the agent that is calling the handoff tool using the [InjectedState][langgraph.prebuilt.InjectedState] annotation.
2. The `Command` primitive allows specifying a state update and a node transition as a single operation, making it useful for implementing handoffs.
3. Name of the agent or node to hand off to.
4. Take the agent's messages and **add** them to the parent's **state** as part of the handoff. The next agent will see the parent state.
5. Indicate to LangGraph that we need to navigate to the agent node in a **parent** multi-agent graph.
!!! tip
If you want to use tools that return `Command`, you can either use prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] / [`ToolNode`][langgraph.prebuilt.tool_node.ToolNode] components, or implement your own tool-executing node that collects `Command` objects returned by the tools and returns a list of them, e.g.:
```python
def call_tools(state):
# gather the tool calls issued by the last AI message
tool_calls = state["messages"][-1].tool_calls
commands = [tools_by_name[tool_call["name"]].invoke(tool_call) for tool_call in tool_calls]
return commands
```
!!! important
This handoff implementation assumes that:
- each agent receives the overall message history (across all agents) in the multi-agent system as its input. If you want more control over agent inputs, see [this section](#control-agent-inputs)
- each agent outputs its internal message history to the overall message history of the multi-agent system. If you want more control over **how agent outputs are added**, wrap the agent in a separate node function:
```python hl_lines="5"
def call_hotel_assistant(state):
# return agent's final response,
# excluding inner monologue
response = hotel_assistant.invoke(state)
return {"messages": response["messages"][-1]}
```
### Control agent inputs
You can use the [`Send()`][langgraph.types.Send] primitive to directly send data to the worker agents during the handoff. For example, you can request that the calling agent populate a task description for the next agent:
```python hl_lines="5 26"
from typing import Annotated
from langchain_core.tools import tool, InjectedToolCallId
from langgraph.prebuilt import InjectedState
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.types import Command, Send
def create_task_description_handoff_tool(
*, agent_name: str, description: str | None = None
):
name = f"transfer_to_{agent_name}"
description = description or f"Ask {agent_name} for help."
@tool(name, description=description)
def handoff_tool(
# this is populated by the calling agent
task_description: Annotated[
str,
"Description of what the next agent should do, including all of the relevant context.",
],
# these parameters are ignored by the LLM
state: Annotated[MessagesState, InjectedState],
) -> Command:
task_description_message = {"role": "user", "content": task_description}
agent_input = {**state, "messages": [task_description_message]}
return Command(
goto=[Send(agent_name, agent_input)],
graph=Command.PARENT,
)
return handoff_tool
```
See the multi-agent [supervisor](../tutorials/multi_agent/agent_supervisor.ipynb#4-create-delegation-tasks) example for a full example of using [`Send()`][langgraph.types.Send] in handoffs.
## Build a multi-agent system
You can use handoffs in any agents built with LangGraph. We recommend using the prebuilt [agent](../agents/overview.md) or [`ToolNode`](./tool-calling.md#toolnode), as they natively support handoff tools returning `Command`. Below is an example of how you can implement a multi-agent system for booking travel using handoffs:
```python hl_lines="16 17 21 22 28 29"
from langgraph.prebuilt import create_react_agent
from langgraph.graph import StateGraph, START, MessagesState
def create_handoff_tool(*, agent_name: str, description: str | None = None):
# same implementation as above
...
return Command(...)
# Handoffs
transfer_to_hotel_assistant = create_handoff_tool(agent_name="hotel_assistant")
transfer_to_flight_assistant = create_handoff_tool(agent_name="flight_assistant")
# Define agents
flight_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[..., transfer_to_hotel_assistant],
name="flight_assistant"
)
hotel_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[..., transfer_to_flight_assistant],
name="hotel_assistant"
)
# Define multi-agent graph
multi_agent_graph = (
StateGraph(MessagesState)
.add_node(flight_assistant)
.add_node(hotel_assistant)
.add_edge(START, "flight_assistant")
.compile()
)
```
??? example "Full example: Multi-agent system for booking travel"
```python hl_lines="56 57 66 67 68 94 96 100 102 124"
from typing import Annotated
from langchain_core.messages import convert_to_messages
from langchain_core.tools import tool, InjectedToolCallId
from langgraph.prebuilt import create_react_agent, InjectedState
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.types import Command
# We'll use `pretty_print_messages` helper to render the streamed agent outputs nicely
def pretty_print_message(message, indent=False):
pretty_message = message.pretty_repr(html=True)
if not indent:
print(pretty_message)
return
indented = "\n".join("\t" + c for c in pretty_message.split("\n"))
print(indented)
def pretty_print_messages(update, last_message=False):
is_subgraph = False
if isinstance(update, tuple):
ns, update = update
# skip parent graph updates in the printouts
if len(ns) == 0:
return
graph_id = ns[-1].split(":")[0]
print(f"Update from subgraph {graph_id}:")
print("\n")
is_subgraph = True
for node_name, node_update in update.items():
update_label = f"Update from node {node_name}:"
if is_subgraph:
update_label = "\t" + update_label
print(update_label)
print("\n")
messages = convert_to_messages(node_update["messages"])
if last_message:
messages = messages[-1:]
for m in messages:
pretty_print_message(m, indent=is_subgraph)
print("\n")
def create_handoff_tool(*, agent_name: str, description: str | None = None):
name = f"transfer_to_{agent_name}"
description = description or f"Transfer to {agent_name}"
@tool(name, description=description)
def handoff_tool(
state: Annotated[MessagesState, InjectedState], # (1)!
tool_call_id: Annotated[str, InjectedToolCallId],
) -> Command:
tool_message = {
"role": "tool",
"content": f"Successfully transferred to {agent_name}",
"name": name,
"tool_call_id": tool_call_id,
}
return Command( # (2)!
goto=agent_name, # (3)!
update={"messages": state["messages"] + [tool_message]}, # (4)!
graph=Command.PARENT, # (5)!
)
return handoff_tool
# Handoffs
transfer_to_hotel_assistant = create_handoff_tool(
agent_name="hotel_assistant",
description="Transfer user to the hotel-booking assistant.",
)
transfer_to_flight_assistant = create_handoff_tool(
agent_name="flight_assistant",
description="Transfer user to the flight-booking assistant.",
)
# Simple agent tools
def book_hotel(hotel_name: str):
"""Book a hotel"""
return f"Successfully booked a stay at {hotel_name}."
def book_flight(from_airport: str, to_airport: str):
"""Book a flight"""
return f"Successfully booked a flight from {from_airport} to {to_airport}."
# Define agents
flight_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[book_flight, transfer_to_hotel_assistant],
prompt="You are a flight booking assistant",
name="flight_assistant"
)
hotel_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[book_hotel, transfer_to_flight_assistant],
prompt="You are a hotel booking assistant",
name="hotel_assistant"
)
# Define multi-agent graph
multi_agent_graph = (
StateGraph(MessagesState)
.add_node(flight_assistant)
.add_node(hotel_assistant)
.add_edge(START, "flight_assistant")
.compile()
)
# Run the multi-agent graph
for chunk in multi_agent_graph.stream(
{
"messages": [
{
"role": "user",
"content": "book a flight from BOS to JFK and a stay at McKittrick Hotel"
}
]
},
subgraphs=True
):
pretty_print_messages(chunk)
```
1. Access agent's state
2. The `Command` primitive allows specifying a state update and a node transition as a single operation, making it useful for implementing handoffs.
3. Name of the agent or node to hand off to.
4. Take the agent's messages and **add** them to the parent's **state** as part of the handoff. The next agent will see the parent state.
5. Indicate to LangGraph that we need to navigate to the agent node in a **parent** multi-agent graph.
## Multi-turn conversation
Users might want to engage in a *multi-turn conversation* with one or more agents. To build a system that can handle this, you can create a node that uses an [`interrupt`][langgraph.types.interrupt] to collect user input and routes back to the **active** agent.
The agents can then be implemented as nodes in a graph that executes agent steps and determines the next action:
1. **Wait for user input** to continue the conversation, or
2. **Route to another agent** (or back to itself, such as in a loop) via a [handoff](#handoffs)
```python
from typing import Literal
from langgraph.types import Command, interrupt
def human(state) -> Command[Literal["agent", "another_agent"]]:
"""A node for collecting user input."""
user_input = interrupt(value="Ready for user input.")
# Determine the active agent.
active_agent = ...
...
return Command(
update={
"messages": [{
"role": "human",
"content": user_input,
}]
},
goto=active_agent
)
def agent(state) -> Command[Literal["agent", "another_agent", "human"]]:
# The condition for routing/halting can be anything, e.g. LLM tool call / structured output, etc.
goto = get_next_agent(...) # 'agent' / 'another_agent'
if goto:
return Command(goto=goto, update={"my_state_key": "my_state_value"})
else:
return Command(goto="human") # Go to human node
```
??? example "Full example: multi-agent system for travel recommendations"
In this example, we will build a team of travel assistant agents that can communicate with each other via handoffs.
We will create 2 agents:
* travel_advisor: can help with travel destination recommendations. Can ask hotel_advisor for help.
* hotel_advisor: can help with hotel recommendations. Can ask travel_advisor for help.
```python
from typing import Literal
from langchain_anthropic import ChatAnthropic
from langgraph.graph import MessagesState, StateGraph, START
from langgraph.prebuilt import create_react_agent, InjectedState
from langgraph.types import Command, interrupt
from langgraph.checkpoint.memory import MemorySaver
# NOTE: `get_travel_recommendations`, `get_hotel_recommendations`, and
# `make_handoff_tool` are assumed to be defined elsewhere (e.g., a handoff tool
# factory similar to `create_handoff_tool` above).
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
class MultiAgentState(MessagesState):
last_active_agent: str
# Define travel advisor tools and ReAct agent
travel_advisor_tools = [
get_travel_recommendations,
make_handoff_tool(agent_name="hotel_advisor"),
]
travel_advisor = create_react_agent(
model,
travel_advisor_tools,
prompt=(
"You are a general travel expert that can recommend travel destinations (e.g. countries, cities, etc). "
"If you need hotel recommendations, ask 'hotel_advisor' for help. "
"You MUST include human-readable response before transferring to another agent."
),
)
def call_travel_advisor(
state: MultiAgentState,
) -> Command[Literal["hotel_advisor", "human"]]:
# You can also add additional logic like changing the input to the agent / output from the agent, etc.
# NOTE: we're invoking the ReAct agent with the full history of messages in the state
response = travel_advisor.invoke(state)
update = {**response, "last_active_agent": "travel_advisor"}
return Command(update=update, goto="human")
# Define hotel advisor tools and ReAct agent
hotel_advisor_tools = [
get_hotel_recommendations,
make_handoff_tool(agent_name="travel_advisor"),
]
hotel_advisor = create_react_agent(
model,
hotel_advisor_tools,
prompt=(
"You are a hotel expert that can provide hotel recommendations for a given destination. "
"If you need help picking travel destinations, ask 'travel_advisor' for help."
"You MUST include human-readable response before transferring to another agent."
),
)
def call_hotel_advisor(
state: MultiAgentState,
) -> Command[Literal["travel_advisor", "human"]]:
response = hotel_advisor.invoke(state)
update = {**response, "last_active_agent": "hotel_advisor"}
return Command(update=update, goto="human")
def human_node(
state: MultiAgentState, config
) -> Command[Literal["hotel_advisor", "travel_advisor", "human"]]:
"""A node for collecting user input."""
user_input = interrupt(value="Ready for user input.")
active_agent = state["last_active_agent"]
return Command(
update={
"messages": [
{
"role": "human",
"content": user_input,
}
]
},
goto=active_agent,
)
builder = StateGraph(MultiAgentState)
builder.add_node("travel_advisor", call_travel_advisor)
builder.add_node("hotel_advisor", call_hotel_advisor)
# This adds a node to collect human input, which will route
# back to the active agent.
builder.add_node("human", human_node)
# We'll always start with a general travel advisor.
builder.add_edge(START, "travel_advisor")
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
Let's test a multi-turn conversation with this application.
```python
import uuid
thread_config = {"configurable": {"thread_id": str(uuid.uuid4())}}
inputs = [
# 1st round of conversation,
{
"messages": [
{"role": "user", "content": "i wanna go somewhere warm in the caribbean"}
]
},
# Since we're using `interrupt`, we'll need to resume using the Command primitive.
# 2nd round of conversation,
Command(
resume="could you recommend a nice hotel in one of the areas and tell me which area it is."
),
# 3rd round of conversation,
Command(
resume="i like the first one. could you recommend something to do near the hotel?"
),
]
for idx, user_input in enumerate(inputs):
print()
print(f"--- Conversation Turn {idx + 1} ---")
print()
print(f"User: {user_input}")
print()
for update in graph.stream(
user_input,
config=thread_config,
stream_mode="updates",
):
for node_id, value in update.items():
if isinstance(value, dict) and value.get("messages", []):
last_message = value["messages"][-1]
if isinstance(last_message, dict) or last_message.type != "ai":
continue
print(f"{node_id}: {last_message.content}")
```
```
--- Conversation Turn 1 ---
User: {'messages': [{'role': 'user', 'content': 'i wanna go somewhere warm in the caribbean'}]}
travel_advisor: Based on the recommendations, Aruba would be an excellent choice for your Caribbean getaway! Aruba is known as "One Happy Island" and offers:
- Year-round warm weather with consistent temperatures around 82°F (28°C)
- Beautiful white sand beaches like Eagle Beach and Palm Beach
- Clear turquoise waters perfect for swimming and snorkeling
- Minimal rainfall and location outside the hurricane belt
- A blend of Caribbean and Dutch culture
- Great dining options and nightlife
- Various water sports and activities
Would you like me to get some specific hotel recommendations in Aruba for your stay? I can transfer you to our hotel advisor who can help with accommodations.
--- Conversation Turn 2 ---
User: Command(resume='could you recommend a nice hotel in one of the areas and tell me which area it is.')
hotel_advisor: Based on the recommendations, I can suggest two excellent options:
1. The Ritz-Carlton, Aruba - Located in Palm Beach
- This luxury resort is situated in the vibrant Palm Beach area
- Known for its exceptional service and amenities
- Perfect if you want to be close to dining, shopping, and entertainment
- Features multiple restaurants, a casino, and a world-class spa
- Located on a pristine stretch of Palm Beach
2. Bucuti & Tara Beach Resort - Located in Eagle Beach
- An adults-only boutique resort on Eagle Beach
- Known for being more intimate and peaceful
- Award-winning for its sustainability practices
- Perfect for a romantic getaway or peaceful vacation
- Located on one of the most beautiful beaches in the Caribbean
Would you like more specific information about either of these properties or their locations?
--- Conversation Turn 3 ---
User: Command(resume='i like the first one. could you recommend something to do near the hotel?')
travel_advisor: Near the Ritz-Carlton in Palm Beach, here are some highly recommended activities:
1. Visit the Palm Beach Plaza Mall - Just a short walk from the hotel, featuring shopping, dining, and entertainment
2. Try your luck at the Stellaris Casino - It's right in the Ritz-Carlton
3. Take a sunset sailing cruise - Many depart from the nearby pier
4. Visit the California Lighthouse - A scenic landmark just north of Palm Beach
5. Enjoy water sports at Palm Beach:
- Jet skiing
- Parasailing
- Snorkeling
- Stand-up paddleboarding
Would you like more specific information about any of these activities or would you like to know about other options in the area?
```
## Prebuilt implementations
LangGraph comes with prebuilt implementations of two of the most popular multi-agent architectures:
- [supervisor](../agents/multi-agent.md#supervisor) — individual agents are coordinated by a central supervisor agent. The supervisor controls all communication flow and task delegation, making decisions about which agent to invoke based on the current context and task requirements. You can use the [`langgraph-supervisor`](https://github.com/langchain-ai/langgraph-supervisor-py) library to create a supervisor multi-agent system.
- [swarm](../agents/multi-agent.md#swarm) — agents dynamically hand off control to one another based on their specializations. The system remembers which agent was last active, ensuring that on subsequent interactions the conversation resumes with that agent. You can use the [`langgraph-swarm`](https://github.com/langchain-ai/langgraph-swarm-py) library to create a swarm multi-agent system.
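As a rough orientation, here is a minimal sketch of a supervisor system built with `langgraph-supervisor`. It assumes the library's `create_supervisor` helper and reuses simple booking tools like those in the handoff example above; treat the exact arguments as illustrative rather than a definitive reference:
```python
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

def book_hotel(hotel_name: str):
    """Book a hotel"""
    return f"Successfully booked a stay at {hotel_name}."

def book_flight(from_airport: str, to_airport: str):
    """Book a flight"""
    return f"Successfully booked a flight from {from_airport} to {to_airport}."

# Worker agents, same shape as in the handoff-based example above
flight_assistant = create_react_agent(
    model="anthropic:claude-3-5-sonnet-latest",
    tools=[book_flight],
    prompt="You are a flight booking assistant",
    name="flight_assistant",
)
hotel_assistant = create_react_agent(
    model="anthropic:claude-3-5-sonnet-latest",
    tools=[book_hotel],
    prompt="You are a hotel booking assistant",
    name="hotel_assistant",
)

# The supervisor model decides which worker agent to delegate each request to
supervisor = create_supervisor(
    [flight_assistant, hotel_assistant],
    model=init_chat_model("anthropic:claude-3-5-sonnet-latest"),
    prompt="You manage a flight booking assistant and a hotel booking assistant. Delegate work to them.",
).compile()

supervisor.invoke(
    {"messages": [{"role": "user", "content": "book a flight from BOS to JFK"}]}
)
```
The `langgraph-swarm` library exposes a similar `create_swarm` helper for the swarm architecture.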
---
how-tos/tool-calling.md
---
# Call tools
[Tools](../concepts/tools.md) encapsulate a callable function and its input schema. These can be passed to compatible [chat models](https://python.langchain.com/docs/concepts/chat_models), allowing the model to decide whether to invoke a tool and determine the appropriate arguments.
You can [define your own tools](#define-a-tool) or use [prebuilt tools](#prebuilt-tools).
## Define a tool
Define a basic tool with the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator:
```python hl_lines="3"
from langchain_core.tools import tool
@tool
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
```
## Run a tool
Tools conform to the [Runnable interface](https://python.langchain.com/docs/concepts/runnables/), which means you can run a tool using the `invoke` method:
```python
multiply.invoke({"a": 6, "b": 7}) # returns 42
```
If the tool is invoked with `type="tool_call"`, it will return a [ToolMessage](https://python.langchain.com/docs/concepts/messages/#toolmessage):
```python
tool_call = {
"type": "tool_call",
"id": "1",
"name": "multiply",
"args": {"a": 42, "b": 7}
}
multiply.invoke(tool_call) # returns a ToolMessage object
```
Output:
```pycon
ToolMessage(content='294', name='multiply', tool_call_id='1')
```
## Use in an agent
To create a tool-calling agent, you can use the prebuilt [create_react_agent][langgraph.prebuilt.chat_agent_executor.create_react_agent]:
```python hl_lines="2 9"
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
@tool
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet",
tools=[multiply]
)
agent.invoke({"messages": [{"role": "user", "content": "what's 42 x 7?"}]})
```
## Use in a workflow
If you are writing a custom workflow, you will need to:
1. register the tools with the chat model
2. call the tool if the model decides to use it
Use `model.bind_tools()` to register the tools with the model.
```python hl_lines="5"
from langchain.chat_models import init_chat_model
model = init_chat_model(model="claude-3-5-haiku-latest")
model_with_tools = model.bind_tools([multiply])
```
The model determines whether a tool invocation is necessary and generates a tool call with the appropriate arguments; executing the tool is then your workflow's responsibility.
??? example "Extended example: attach tools to a chat model"
```python hl_lines="10"
from langchain_core.tools import tool
from langchain.chat_models import init_chat_model
@tool
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
model = init_chat_model(model="claude-3-5-haiku-latest")
model_with_tools = model.bind_tools([multiply])
response_message = model_with_tools.invoke("what's 42 x 7?")
tool_call = response_message.tool_calls[0]
multiply.invoke(tool_call)
```
```pycon
ToolMessage(
content='294',
name='multiply',
tool_call_id='toolu_0176DV4YKSD8FndkeuuLj36c'
)
```
### ToolNode
To execute tools in custom workflows, use the prebuilt [`ToolNode`][langgraph.prebuilt.tool_node.ToolNode] or implement your own custom node.
`ToolNode` is a specialized node for executing tools in a workflow. It provides the following features:
* Supports both synchronous and asynchronous tools.
* Executes multiple tools concurrently.
* Handles errors during tool execution (`handle_tool_errors=True`, enabled by default). See [handling tool errors](#handle-errors) for more details.
`ToolNode` operates on [`MessagesState`](../concepts/low_level.md#messagesstate):
* **Input**: `MessagesState`, where the last message is an `AIMessage` containing the `tool_calls` parameter.
* **Output**: `MessagesState` updated with the resulting [`ToolMessage`](https://python.langchain.com/docs/concepts/messages/#toolmessage) from executed tools.
```python hl_lines="1 14"
from langgraph.prebuilt import ToolNode
def get_weather(location: str):
"""Call to get the current weather."""
if location.lower() in ["sf", "san francisco"]:
return "It's 60 degrees and foggy."
else:
return "It's 90 degrees and sunny."
def get_coolest_cities():
"""Get a list of coolest cities"""
return "nyc, sf"
tool_node = ToolNode([get_weather, get_coolest_cities])
tool_node.invoke({"messages": [...]})
```
??? example "Single tool call"
```python hl_lines="14"
from langchain_core.messages import AIMessage
from langchain_core.tools import tool
from langgraph.prebuilt import ToolNode
# Define tools
@tool
def get_weather(location: str):
"""Call to get the current weather."""
if location.lower() in ["sf", "san francisco"]:
return "It's 60 degrees and foggy."
else:
return "It's 90 degrees and sunny."
tool_node = ToolNode([get_weather])
message_with_single_tool_call = AIMessage(
content="",
tool_calls=[
{
"name": "get_weather",
"args": {"location": "sf"},
"id": "tool_call_id",
"type": "tool_call",
}
],
)
tool_node.invoke({"messages": [message_with_single_tool_call]})
```
```
{'messages': [ToolMessage(content="It's 60 degrees and foggy.", name='get_weather', tool_call_id='tool_call_id')]}
```
??? example "Multiple tool calls"
```python hl_lines="17 37"
from langchain_core.messages import AIMessage
from langgraph.prebuilt import ToolNode
# Define tools
def get_weather(location: str):
"""Call to get the current weather."""
if location.lower() in ["sf", "san francisco"]:
return "It's 60 degrees and foggy."
else:
return "It's 90 degrees and sunny."
def get_coolest_cities():
"""Get a list of coolest cities"""
return "nyc, sf"
tool_node = ToolNode([get_weather, get_coolest_cities])
message_with_multiple_tool_calls = AIMessage(
content="",
tool_calls=[
{
"name": "get_coolest_cities",
"args": {},
"id": "tool_call_id_1",
"type": "tool_call",
},
{
"name": "get_weather",
"args": {"location": "sf"},
"id": "tool_call_id_2",
"type": "tool_call",
},
],
)
tool_node.invoke({"messages": [message_with_multiple_tool_calls]}) # (1)!
```
1. `ToolNode` will execute both tools in parallel
```
{
'messages': [
ToolMessage(content='nyc, sf', name='get_coolest_cities', tool_call_id='tool_call_id_1'),
ToolMessage(content="It's 60 degrees and foggy.", name='get_weather', tool_call_id='tool_call_id_2')
]
}
```
??? example "Use with a chat model"
```python hl_lines="11 14 17"
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import ToolNode
def get_weather(location: str):
"""Call to get the current weather."""
if location.lower() in ["sf", "san francisco"]:
return "It's 60 degrees and foggy."
else:
return "It's 90 degrees and sunny."
tool_node = ToolNode([get_weather])
model = init_chat_model(model="claude-3-5-haiku-latest")
model_with_tools = model.bind_tools([get_weather]) # (1)!
response_message = model_with_tools.invoke("what's the weather in sf?")
tool_node.invoke({"messages": [response_message]})
```
1. Use `.bind_tools()` to attach the tool schema to the chat model
```
{'messages': [ToolMessage(content="It's 60 degrees and foggy.", name='get_weather', tool_call_id='toolu_01Pnkgw5JeTRxXAU7tyHT4UW')]}
```
??? example "Use in a tool-calling agent"
This is an example of creating a tool-calling agent from scratch using `ToolNode`. You can also use LangGraph's prebuilt [agent](../agents/agents.md).
```python hl_lines="12 15 33"
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import ToolNode
from langgraph.graph import StateGraph, MessagesState, START, END
def get_weather(location: str):
"""Call to get the current weather."""
if location.lower() in ["sf", "san francisco"]:
return "It's 60 degrees and foggy."
else:
return "It's 90 degrees and sunny."
tool_node = ToolNode([get_weather])
model = init_chat_model(model="claude-3-5-haiku-latest")
model_with_tools = model.bind_tools([get_weather])
def should_continue(state: MessagesState):
messages = state["messages"]
last_message = messages[-1]
if last_message.tool_calls:
return "tools"
return END
def call_model(state: MessagesState):
messages = state["messages"]
response = model_with_tools.invoke(messages)
return {"messages": [response]}
builder = StateGraph(MessagesState)
# Define the two nodes we will cycle between
builder.add_node("call_model", call_model)
builder.add_node("tools", tool_node)
builder.add_edge(START, "call_model")
builder.add_conditional_edges("call_model", should_continue, ["tools", END])
builder.add_edge("tools", "call_model")
graph = builder.compile()
graph.invoke({"messages": [{"role": "user", "content": "what's the weather in sf?"}]})
```
```
{
'messages': [
HumanMessage(content="what's the weather in sf?"),
AIMessage(
content=[{'text': "I'll help you check the weather in San Francisco right now.", 'type': 'text'}, {'id': 'toolu_01A4vwUEgBKxfFVc5H3v1CNs', 'input': {'location': 'San Francisco'}, 'name': 'get_weather', 'type': 'tool_use'}],
tool_calls=[{'name': 'get_weather', 'args': {'location': 'San Francisco'}, 'id': 'toolu_01A4vwUEgBKxfFVc5H3v1CNs', 'type': 'tool_call'}]
),
ToolMessage(content="It's 60 degrees and foggy."),
AIMessage(content="The current weather in San Francisco is 60 degrees and foggy. Typical San Francisco weather with its famous marine layer!")
]
}
```
## Tool customization
### Parameter descriptions
Auto-generate descriptions from docstrings:
```python hl_lines="1 3"
from langchain_core.tools import tool
@tool("multiply_tool", parse_docstring=True)
def multiply(a: int, b: int) -> int:
"""Multiply two numbers.
Args:
a: First operand
b: Second operand
"""
return a * b
```
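To confirm that the docstring descriptions were picked up, you can inspect the tool's argument schema. This quick check uses the `args` property that LangChain tools expose; the exact output shape may vary slightly between versions:
```python
print(multiply.args)
# Roughly:
# {'a': {'description': 'First operand', 'title': 'A', 'type': 'integer'},
#  'b': {'description': 'Second operand', 'title': 'B', 'type': 'integer'}}
```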
### Explicit input schema
Define schemas using `args_schema`:
```python hl_lines="9"
from pydantic import BaseModel, Field
from langchain_core.tools import tool
class MultiplyInputSchema(BaseModel):
"""Multiply two numbers"""
a: int = Field(description="First operand")
b: int = Field(description="Second operand")
@tool("multiply_tool", args_schema=MultiplyInputSchema)
def multiply(a: int, b: int) -> int:
return a * b
```
### Tool name
Override the default tool name (function name) using the first argument:
```python hl_lines="3"
from langchain_core.tools import tool
@tool("multiply_tool")
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
```
## Context management
Tools within LangGraph sometimes require context data, such as runtime-only arguments (e.g., user IDs or session details), that should not be controlled by the model. LangGraph provides three methods for managing such context:
| Type | Usage Scenario | Mutable | Lifetime |
|-----------------------------------------|------------------------------------------|---------|--------------------------|
| [Configuration](#configuration) | Static, immutable runtime data | ❌ | Single invocation |
| [Short-term memory](#short-term-memory) | Dynamic, changing data during invocation | ✅ | Single invocation |
| [Long-term memory](#long-term-memory) | Persistent, cross-session data | ✅ | Across multiple sessions |
### Configuration
Use configuration when you have **immutable** runtime data that tools require, such as user identifiers. You pass these arguments via [`RunnableConfig`](https://python.langchain.com/docs/concepts/runnables/#runnableconfig) at invocation and access them in the tool:
```python hl_lines="5 13"
from langchain_core.tools import tool
from langchain_core.runnables import RunnableConfig
@tool
def get_user_info(config: RunnableConfig) -> str:
"""Retrieve user information based on user ID."""
user_id = config["configurable"].get("user_id")
return "User is John Smith" if user_id == "user_123" else "Unknown user"
# Invocation example with an agent
agent.invoke(
{"messages": [{"role": "user", "content": "look up user info"}]},
config={"configurable": {"user_id": "user_123"}}
)
```
??? example "Extended example: Access config in tools"
```python hl_lines="6 9 19"
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
def get_user_info(
config: RunnableConfig,
) -> str:
"""Look up user info."""
user_id = config["configurable"].get("user_id")
return "User is John Smith" if user_id == "user_123" else "Unknown user"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_user_info],
)
agent.invoke(
{"messages": [{"role": "user", "content": "look up user information"}]},
config={"configurable": {"user_id": "user_123"}}
)
```
### Short-term memory
Short-term memory maintains **dynamic** state that changes during a single execution.
To **access** (read) the graph state inside the tools, you can use a special parameter **annotation** — [`InjectedState`][langgraph.prebuilt.InjectedState]:
```python hl_lines="12"
from typing import Annotated, NotRequired
from langchain_core.tools import tool
from langgraph.prebuilt import InjectedState, create_react_agent
from langgraph.prebuilt.chat_agent_executor import AgentState
class CustomState(AgentState):
# The user_name field in short-term state
user_name: NotRequired[str]
@tool
def get_user_name(
state: Annotated[CustomState, InjectedState]
) -> str:
"""Retrieve the current user-name from state."""
# Return stored name or a default if not set
return state.get("user_name", "Unknown user")
# Example agent setup
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_user_name],
state_schema=CustomState,
)
# Invocation: reads the name from state (initially empty)
agent.invoke({"messages": "what's my name?"})
```
Use a tool that returns a `Command` to **update** `user_name` and append a confirmation message:
```python hl_lines="12 13 14 15 16 17"
from typing import Annotated
from langgraph.types import Command
from langchain_core.messages import ToolMessage
from langchain_core.tools import tool, InjectedToolCallId
@tool
def update_user_name(
new_name: str,
tool_call_id: Annotated[str, InjectedToolCallId]
) -> Command:
"""Update user-name in short-term memory."""
return Command(update={
"user_name": new_name,
"messages": [
ToolMessage(f"Updated user name to {new_name}", tool_call_id=tool_call_id)
]
})
```
!!! important
If you want to use tools that return `Command` and update graph state, you can either use prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] / [`ToolNode`][langgraph.prebuilt.tool_node.ToolNode] components, or implement your own tool-executing node that collects `Command` objects returned by the tools and returns a list of them, e.g.:
```python
def call_tools(state):
# gather the tool calls issued by the last AI message
tool_calls = state["messages"][-1].tool_calls
commands = [tools_by_name[tool_call["name"]].invoke(tool_call) for tool_call in tool_calls]
return commands
```
### Long-term memory
Use [long-term memory](../concepts/memory.md#long-term-memory) to store user-specific or application-specific data across conversations. This is useful for applications like chatbots, where you want to remember user preferences or other information.
To use long-term memory, you need to:
1. [Configure a store](memory/add-memory.md#add-long-term-memory) to persist data across invocations.
2. Use the [`get_store`][langgraph.config.get_store] function to access the store from within tools or prompts.
To **access** information in the store:
```python hl_lines="4 11 13"
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool
from langgraph.graph import StateGraph
from langgraph.config import get_store
@tool
def get_user_info(config: RunnableConfig) -> str:
"""Look up user info."""
# Same as that provided to `builder.compile(store=store)`
# or `create_react_agent`
store = get_store()
user_id = config["configurable"].get("user_id")
user_info = store.get(("users",), user_id)
return str(user_info.value) if user_info else "Unknown user"
builder = StateGraph(...)
...
graph = builder.compile(store=store)
```
??? example "Access long-term memory"
```python hl_lines="7 9 22 24 30 36"
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool
from langgraph.config import get_store
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
store = InMemoryStore() # (1)!
store.put( # (2)!
("users",), # (3)!
"user_123", # (4)!
{
"name": "John Smith",
"language": "English",
} # (5)!
)
@tool
def get_user_info(config: RunnableConfig) -> str:
"""Look up user info."""
# Same as that provided to `create_react_agent`
store = get_store() # (6)!
user_id = config["configurable"].get("user_id")
user_info = store.get(("users",), user_id) # (7)!
return str(user_info.value) if user_info else "Unknown user"
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[get_user_info],
store=store # (8)!
)
# Run the agent
agent.invoke(
{"messages": [{"role": "user", "content": "look up user information"}]},
config={"configurable": {"user_id": "user_123"}}
)
```
1. The `InMemoryStore` is a store that stores data in memory. In a production setting, you would typically use a database or other persistent storage. Please review the [store documentation](../reference/store.md) for more options. If you're deploying with **LangGraph Platform**, the platform will provide a production-ready store for you.
2. For this example, we write some sample data to the store using the `put` method. Please see the [BaseStore.put][langgraph.store.base.BaseStore.put] API reference for more details.
3. The first argument is the namespace. This is used to group related data together. In this case, we are using the `users` namespace to group user data.
4. A key within the namespace. This example uses a user ID for the key.
5. The data that we want to store for the given user.
6. The `get_store` function is used to access the store. You can call it from anywhere in your code, including tools and prompts. This function returns the store that was passed to the agent when it was created.
7. The `get` method is used to retrieve data from the store. The first argument is the namespace, and the second argument is the key. This will return a `StoreValue` object, which contains the value and metadata about the value.
8. The `store` is passed to the agent. This enables the agent to access the store when running tools. You can also use the `get_store` function to access the store from anywhere in your code.
To **update** information in the store:
```python hl_lines="4 11 13"
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool
from langgraph.graph import StateGraph
from langgraph.config import get_store
@tool
def save_user_info(user_info: str, config: RunnableConfig) -> str:
"""Save user info."""
# Same as that provided to `builder.compile(store=store)`
# or `create_react_agent`
store = get_store()
user_id = config["configurable"].get("user_id")
store.put(("users",), user_id, user_info)
return "Successfully saved user info."
builder = StateGraph(...)
...
graph = builder.compile(store=store)
```
??? example "Update long-term memory"
```python hl_lines="18 20 26 32"
from typing_extensions import TypedDict
from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool
from langgraph.config import get_store
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
store = InMemoryStore() # (1)!
class UserInfo(TypedDict): # (2)!
name: str
@tool
def save_user_info(user_info: UserInfo, config: RunnableConfig) -> str: # (3)!
"""Save user info."""
# Same as that provided to `create_react_agent`
store = get_store() # (4)!
user_id = config["configurable"].get("user_id")
store.put(("users",), user_id, user_info) # (5)!
return "Successfully saved user info."
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[save_user_info],
store=store
)
# Run the agent
agent.invoke(
{"messages": [{"role": "user", "content": "My name is John Smith"}]},
config={"configurable": {"user_id": "user_123"}} # (6)!
)
# You can access the store directly to get the value
store.get(("users",), "user_123").value
```
1. The `InMemoryStore` is a store that stores data in memory. In a production setting, you would typically use a database or other persistent storage. Please review the [store documentation](../reference/store.md) for more options. If you're deploying with **LangGraph Platform**, the platform will provide a production-ready store for you.
2. The `UserInfo` class is a `TypedDict` that defines the structure of the user information. The LLM will use this to format the response according to the schema.
3. The `save_user_info` function is a tool that allows an agent to update user information. This could be useful for a chat application where the user wants to update their profile information.
4. The `get_store` function is used to access the store. You can call it from anywhere in your code, including tools and prompts. This function returns the store that was passed to the agent when it was created.
5. The `put` method is used to store data in the store. The first argument is the namespace, and the second argument is the key. This will store the user information in the store.
6. The `user_id` is passed in the config. This is used to identify the user whose information is being updated.
## Advanced tool features
### Immediate return
Use `return_direct=True` to immediately return a tool's result without executing additional logic.
This is useful for tools that should not trigger further processing or tool calls, allowing you to return results directly to the user.
```python hl_lines="1"
@tool(return_direct=True)
def add(a: int, b: int) -> int:
"""Add two numbers"""
return a + b
```
??? example "Extended example: Using return_direct in a prebuilt agent"
```python hl_lines="4"
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
@tool(return_direct=True)
def add(a: int, b: int) -> int:
"""Add two numbers"""
return a + b
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[add]
)
agent.invoke(
{"messages": [{"role": "user", "content": "what's 3 + 5?"}]}
)
```
!!! important "Using without prebuilt components"
If you are building a custom workflow and are not relying on `create_react_agent` or `ToolNode`, you will also
need to implement the control flow to handle `return_direct=True`.
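A minimal sketch of that control flow might look like the following; the conditional edge `route_after_tools` and the `"call_model"` node name are hypothetical and only illustrate one way to end the loop when a `return_direct` tool was called:
```python
from langchain_core.tools import tool
from langgraph.graph import END

@tool(return_direct=True)
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

tools_by_name = {t.name: t for t in [add]}

def route_after_tools(state):
    """Conditional edge: stop if the last AI message called a tool
    that was declared with `return_direct=True`."""
    last_ai_message = next(m for m in reversed(state["messages"]) if m.type == "ai")
    if any(tools_by_name[c["name"]].return_direct for c in last_ai_message.tool_calls):
        return END  # return the tool result directly to the user
    return "call_model"  # assumed name of the model-calling node
```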
### Force tool use
If you need to force a specific tool to be used, you will need to configure this
at the **model** level using the `tool_choice` parameter in the `bind_tools` method.
Force specific tool usage via `tool_choice`:
```python hl_lines="11"
@tool(return_direct=True)
def greet(user_name: str) -> str:
"""Greet user."""
return f"Hello {user_name}!"
tools = [greet]
configured_model = model.bind_tools(
tools,
# Force the use of the 'greet' tool
tool_choice={"type": "tool", "name": "greet"}
)
```
??? example "Extended example: Force tool usage in an agent"
To force the agent to use specific tools, you can set the `tool_choice` option in `model.bind_tools()`:
```python hl_lines="3 11"
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
@tool(return_direct=True)
def greet(user_name: str) -> str:
"""Greet user."""
return f"Hello {user_name}!"
tools = [greet]
agent = create_react_agent(
model=model.bind_tools(tools, tool_choice={"type": "tool", "name": "greet"}),
tools=tools
)
agent.invoke(
{"messages": [{"role": "user", "content": "Hi, I am Bob"}]}
)
```
!!! Warning "Avoid infinite loops"
Forcing tool usage without stopping conditions can create infinite loops. Use one of the following safeguards:
- Mark the tool with [`return_direct=True`](#immediate-return) to end the loop after execution.
- Set [`recursion_limit`](../concepts/low_level.md#recursion-limit) to restrict the number of execution steps.
!!! tip "Tool choice configuration"
The `tool_choice` parameter configures which tool the model must call. This is useful when you want to guarantee that a specific tool is invoked for a particular task, overriding the model's default behavior of choosing a tool based on its own reasoning.
Note that not all models support this feature, and the exact configuration may vary depending on the model you are using.
### Disable parallel calls
For supported providers, you can disable parallel tool calling by setting `parallel_tool_calls=False` via the `model.bind_tools()` method:
```python hl_lines="3"
model.bind_tools(
tools,
parallel_tool_calls=False
)
```
??? example "Extended example: disable parallel tool calls in a prebuilt agent"
```python hl_lines="15"
from langchain.chat_models import init_chat_model
from langgraph.prebuilt import create_react_agent
def add(a: int, b: int) -> int:
"""Add two numbers"""
return a + b
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
model = init_chat_model("anthropic:claude-3-5-sonnet-latest", temperature=0)
tools = [add, multiply]
agent = create_react_agent(
# disable parallel tool calls
model=model.bind_tools(tools, parallel_tool_calls=False),
tools=tools
)
agent.invoke(
{"messages": [{"role": "user", "content": "what's 3 + 5 and 4 * 7?"}]}
)
```
### Handle errors
LangGraph provides built-in error handling for tool execution through the prebuilt [ToolNode][langgraph.prebuilt.tool_node.ToolNode] component, which is used both on its own and inside prebuilt agents.
By **default**, `ToolNode` catches exceptions raised during tool execution and returns them as `ToolMessage` objects with a status indicating an error.
```python
from langchain_core.messages import AIMessage
from langgraph.prebuilt import ToolNode
def multiply(a: int, b: int) -> int:
if a == 42:
raise ValueError("The ultimate error")
return a * b
# Default error handling (enabled by default)
tool_node = ToolNode([multiply])
message = AIMessage(
content="",
tool_calls=[{
"name": "multiply",
"args": {"a": 42, "b": 7},
"id": "tool_call_id",
"type": "tool_call"
}]
)
result = tool_node.invoke({"messages": [message]})
```
Output:
```pycon
{'messages': [
ToolMessage(
content="Error: ValueError('The ultimate error')\n Please fix your mistakes.",
name='multiply',
tool_call_id='tool_call_id',
status='error'
)
]}
```
#### Disable error handling
To propagate exceptions directly, disable error handling:
```python
tool_node = ToolNode([multiply], handle_tool_errors=False)
```
With error handling disabled, exceptions raised by tools will propagate up, requiring explicit management.
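For example, reusing the `multiply` tool and `message` from the snippets above, you might catch the propagated exception yourself; this is a minimal sketch of one possible recovery strategy:
```python
tool_node = ToolNode([multiply], handle_tool_errors=False)

try:
    # The ValueError raised inside `multiply` now propagates out of `invoke`
    result = tool_node.invoke({"messages": [message]})
except Exception as exc:
    # Apply whatever recovery strategy fits your application
    print(f"Tool execution failed: {exc}")
```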
#### Custom error messages
Provide a custom error message by setting `handle_tool_errors` to a string:
```python
tool_node = ToolNode(
[multiply],
handle_tool_errors="Can't use 42 as the first operand, please switch operands!"
)
```
Example output:
```pycon
{'messages': [
ToolMessage(
content="Can't use 42 as the first operand, please switch operands!",
name='multiply',
tool_call_id='tool_call_id',
status='error'
)
]}
```
#### Error handling in agents
Error handling in prebuilt agents (`create_react_agent`) leverages `ToolNode`:
```python
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=[multiply]
)
# Default error handling
agent.invoke({"messages": [{"role": "user", "content": "what's 42 x 7?"}]})
```
To disable or customize error handling in prebuilt agents, explicitly pass a configured `ToolNode`:
```python
custom_tool_node = ToolNode(
[multiply],
handle_tool_errors="Cannot use 42 as a first operand!"
)
agent_custom = create_react_agent(
model="anthropic:claude-3-7-sonnet-latest",
tools=custom_tool_node
)
agent_custom.invoke({"messages": [{"role": "user", "content": "what's 42 x 7?"}]})
```
### Handle large numbers of tools
As the number of available tools grows, you may want to limit the scope of the LLM's selection, to decrease token consumption and to help manage sources of error in LLM reasoning.
To address this, you can dynamically adjust the tools available to a model by retrieving relevant tools at runtime using semantic search.
See the [`langgraph-bigtool`](https://github.com/langchain-ai/langgraph-bigtool) prebuilt library for a ready-to-use implementation.
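The core idea is to narrow the tool set before binding it to the model. The keyword filter below is only a stand-in for real semantic search, and the tools and the `model` variable are illustrative assumptions rather than part of the library:
```python
from langchain_core.tools import tool

@tool
def book_flight(from_airport: str, to_airport: str) -> str:
    """Book a flight between two airports."""
    return f"Booked a flight from {from_airport} to {to_airport}."

@tool
def book_hotel(hotel_name: str) -> str:
    """Book a stay at a hotel."""
    return f"Booked a stay at {hotel_name}."

all_tools = [book_flight, book_hotel]

def select_tools(query: str, tools: list, limit: int = 5) -> list:
    """Keep the tools whose name or description overlaps with the query.
    In practice you would replace this with embedding-based retrieval."""
    words = set(query.lower().split())
    scored = [
        (sum(w in f"{t.name} {t.description}".lower() for w in words), t)
        for t in tools
    ]
    return [t for score, t in sorted(scored, key=lambda s: -s[0])[:limit] if score > 0]

relevant_tools = select_tools("book a flight from BOS to JFK", all_tools)
# Bind only the relevant subset; `model` is assumed to be a chat model
# configured elsewhere, as in the previous examples.
model_with_tools = model.bind_tools(relevant_tools)
```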
## Prebuilt tools
### LLM provider tools
You can use prebuilt tools from model providers by passing a dictionary with tool specs to the `tools` parameter of `create_react_agent`. For example, to use the `web_search_preview` tool from OpenAI:
```python
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(
model="openai:gpt-4o-mini",
tools=[{"type": "web_search_preview"}]
)
response = agent.invoke(
{"messages": ["What was a positive news story from today?"]}
)
```
Please consult the documentation for the specific model you are using to see which tools are available and how to use them.
### LangChain tools
Additionally, LangChain supports a wide range of prebuilt tool integrations for interacting with APIs, databases, file systems, web data, and more. These tools extend the functionality of agents and enable rapid development.
You can browse the full list of available integrations in the [LangChain integrations directory](https://python.langchain.com/docs/integrations/tools/).
Some commonly used tool categories include:
- **Search**: Bing, SerpAPI, Tavily
- **Code interpreters**: Python REPL, Node.js REPL
- **Databases**: SQL, MongoDB, Redis
- **Web data**: Web scraping and browsing
- **APIs**: OpenWeatherMap, NewsAPI, and others
These integrations can be configured and added to your agents using the same `tools` parameter shown in the examples above.
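For instance, a prebuilt search tool can be dropped into an agent the same way as a custom tool. This is a minimal sketch that assumes the `langchain-community` package (and its Tavily dependency) is installed and a `TAVILY_API_KEY` environment variable is set:
```python
from langchain_community.tools.tavily_search import TavilySearchResults
from langgraph.prebuilt import create_react_agent

# Prebuilt LangChain tool; requires `pip install langchain-community`
# and a TAVILY_API_KEY environment variable.
search = TavilySearchResults(max_results=2)

agent = create_react_agent(
    model="anthropic:claude-3-7-sonnet-latest",
    tools=[search],
)
agent.invoke(
    {"messages": [{"role": "user", "content": "what's a recent LangGraph release?"}]}
)
```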
---
how-tos/use-remote-graph.md
---
# How to interact with the deployment using RemoteGraph
!!! info "Prerequisites"
- [LangGraph Platform](../concepts/langgraph_platform.md)
- [LangGraph Server](../concepts/langgraph_server.md)
`RemoteGraph` is an interface that allows you to interact with your LangGraph Platform deployment as if it were a regular, locally-defined LangGraph graph (e.g. a `CompiledGraph`). This guide shows you how you can initialize a `RemoteGraph` and interact with it.
## Initializing the graph
When initializing a `RemoteGraph`, you must always specify:
- `name`: the name of the graph you want to interact with. This is the same graph name you use in `langgraph.json` configuration file for your deployment.
- `api_key`: a valid LangSmith API key. It can be set as an environment variable (`LANGSMITH_API_KEY`) or passed directly via the `api_key` argument. The API key can also be provided via the `client` / `sync_client` arguments if `LangGraphClient` / `SyncLangGraphClient` were initialized with the `api_key` argument.
Additionally, you have to provide one of the following:
- `url`: URL of the deployment you want to interact with. If you pass `url` argument, both sync and async clients will be created using the provided URL, headers (if provided) and default configuration values (e.g. timeout, etc).
- `client`: a `LangGraphClient` instance for interacting with the deployment asynchronously (e.g. using `.astream()`, `.ainvoke()`, `.aget_state()`, `.aupdate_state()`, etc.)
- `sync_client`: a `SyncLangGraphClient` instance for interacting with the deployment synchronously (e.g. using `.stream()`, `.invoke()`, `.get_state()`, `.update_state()`, etc.)
!!! Note
If you pass a `client` or `sync_client` along with the `url` argument, the clients take precedence over the `url` argument. If none of the `client` / `sync_client` / `url` arguments is provided, `RemoteGraph` will raise a `ValueError` at runtime.
### Using URL
=== "Python"
```python
from langgraph.pregel.remote import RemoteGraph
url = "<DEPLOYMENT_URL>"
graph_name = "agent"
remote_graph = RemoteGraph(graph_name, url=url)
```
=== "JavaScript"
```ts
import { RemoteGraph } from "@langchain/langgraph/remote";
const url = "<DEPLOYMENT_URL>";
const graphName = "agent";
const remoteGraph = new RemoteGraph({ graphId: graphName, url });
```
### Using clients
=== "Python"
```python
from langgraph_sdk import get_client, get_sync_client
from langgraph.pregel.remote import RemoteGraph
url = "<DEPLOYMENT_URL>"
graph_name = "agent"
client = get_client(url=url)
sync_client = get_sync_client(url=url)
remote_graph = RemoteGraph(graph_name, client=client, sync_client=sync_client)
```
=== "JavaScript"
```ts
import { Client } from "@langchain/langgraph-sdk";
import { RemoteGraph } from "@langchain/langgraph/remote";
const client = new Client({ apiUrl: "<DEPLOYMENT_URL>" });
const graphName = "agent";
const remoteGraph = new RemoteGraph({ graphId: graphName, client });
```
## Invoking the graph
Since `RemoteGraph` is a `Runnable` that implements the same methods as `CompiledGraph`, you can interact with it the same way you normally would with a compiled graph, i.e. by calling `.invoke()`, `.stream()`, `.get_state()`, `.update_state()`, etc (as well as their async counterparts).
### Asynchronously
!!! Note
To use the graph asynchronously, you must provide either the `url` or `client` when initializing the `RemoteGraph`.
=== "Python"
```python
# invoke the graph
result = await remote_graph.ainvoke({
"messages": [{"role": "user", "content": "what's the weather in sf"}]
})
# stream outputs from the graph
async for chunk in remote_graph.astream({
"messages": [{"role": "user", "content": "what's the weather in la"}]
}):
print(chunk)
```
=== "JavaScript"
```ts
// invoke the graph
const result = await remoteGraph.invoke({
messages: [{role: "user", content: "what's the weather in sf"}]
})
// stream outputs from the graph
for await (const chunk of await remoteGraph.stream({
  messages: [{ role: "user", content: "what's the weather in la" }]
})) {
  console.log(chunk);
}
```
### Synchronously
!!! Note
To use the graph synchronously, you must provide either the `url` or `sync_client` when initializing the `RemoteGraph`.
=== "Python"
```python
# invoke the graph
result = remote_graph.invoke({
"messages": [{"role": "user", "content": "what's the weather in sf"}]
})
# stream outputs from the graph
for chunk in remote_graph.stream({
"messages": [{"role": "user", "content": "what's the weather in la"}]
}):
print(chunk)
```
## Thread-level persistence
By default, the graph runs (i.e. `.invoke()` or `.stream()` invocations) are stateless - the checkpoints and the final state of the graph are not persisted. If you would like to persist the outputs of the graph run (for example, to enable human-in-the-loop features), you can create a thread and provide the thread ID via the `config` argument, same as you would with a regular compiled graph:
=== "Python"
```python
from langgraph_sdk import get_sync_client
from langgraph.pregel.remote import RemoteGraph
url = "<DEPLOYMENT_URL>"
graph_name = "agent"
sync_client = get_sync_client(url=url)
remote_graph = RemoteGraph(graph_name, url=url)
# create a thread (or use an existing thread instead)
thread = sync_client.threads.create()
# invoke the graph with the thread config
config = {"configurable": {"thread_id": thread["thread_id"]}}
result = remote_graph.invoke({
"messages": [{"role": "user", "content": "what's the weather in sf"}]
}, config=config)
# verify that the state was persisted to the thread
thread_state = remote_graph.get_state(config)
print(thread_state)
```
=== "JavaScript"
```ts
import { Client } from "@langchain/langgraph-sdk";
import { RemoteGraph } from "@langchain/langgraph/remote";
const url = "<DEPLOYMENT_URL>";
const graphName = "agent";
const client = new Client({ apiUrl: url });
const remoteGraph = new RemoteGraph({ graphId: graphName, url });
// create a thread (or use an existing thread instead)
const thread = await client.threads.create();
// invoke the graph with the thread config
const config = { configurable: { thread_id: thread.thread_id }};
const result = await remoteGraph.invoke({
messages: [{ role: "user", content: "what's the weather in sf" }],
}, config);
// verify that the state was persisted to the thread
const threadState = await remoteGraph.getState(config);
console.log(threadState);
```
## Using as a subgraph
!!! Note
If you need to use a `checkpointer` with a graph that has a `RemoteGraph` subgraph node, make sure to use UUIDs as thread IDs.
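For instance, a thread config that satisfies this requirement might look like the following minimal sketch:
```python
import uuid

# Use a UUID string as the thread ID when combining a checkpointer
# with a RemoteGraph subgraph node.
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
```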
Since the `RemoteGraph` behaves the same way as a regular `CompiledGraph`, it can be also used as a subgraph in another graph. For example:
=== "Python"
```python
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.pregel.remote import RemoteGraph
url = "<DEPLOYMENT_URL>"
graph_name = "agent"
remote_graph = RemoteGraph(graph_name, url=url)
# define parent graph
builder = StateGraph(MessagesState)
# add remote graph directly as a node
builder.add_node("child", remote_graph)
builder.add_edge(START, "child")
graph = builder.compile()
# invoke the parent graph
result = graph.invoke({
"messages": [{"role": "user", "content": "what's the weather in sf"}]
})
print(result)
# stream outputs from both the parent graph and subgraph
for chunk in graph.stream({
"messages": [{"role": "user", "content": "what's the weather in sf"}]
}, subgraphs=True):
print(chunk)
```
=== "JavaScript"
```ts
import { MessagesAnnotation, StateGraph, START } from "@langchain/langgraph";
import { RemoteGraph } from "@langchain/langgraph/remote";
const url = "<DEPLOYMENT_URL>";
const graphName = "agent";
const remoteGraph = new RemoteGraph({ graphId: graphName, url });
// define parent graph and add remote graph directly as a node
const graph = new StateGraph(MessagesAnnotation)
.addNode("child", remoteGraph)
.addEdge(START, "child")
.compile()
// invoke the parent graph
const result = await graph.invoke({
messages: [{ role: "user", content: "what's the weather in sf" }]
});
console.log(result);
// stream outputs from both the parent graph and subgraph
for await (const chunk of await graph.stream({
messages: [{ role: "user", content: "what's the weather in la" }]
}, { subgraphs: true })) {
console.log(chunk);
}
```
---
how-tos/multi_agent.ipynb
---
# Build multi-agent systems
A single agent might struggle if it needs to specialize in multiple domains or manage many tools. To tackle this, you can break your agent into smaller, independent agents and compose them into a multi-agent system.
In multi-agent systems, agents need to communicate with each other. They do so via handoffs — a primitive that describes which agent to hand control to and the payload to send to that agent.
This guide covers the following:
* implementing handoffs between agents
* using handoffs and the prebuilt agent to build a custom multi-agent system
To get started with building multi-agent systems, check out LangGraph prebuilt implementations of two of the most popular multi-agent architectures — supervisor and swarm.
## Handoffs
To set up communication between the agents in a multi-agent system you can use **handoffs** — a pattern where one agent *hands off* control to another. Handoffs allow you to specify:
- **destination**: target agent to navigate to (e.g., name of the LangGraph node to go to)
- **payload**: information to pass to that agent (e.g., state update)
### Create handoffs
To implement handoffs, you can return `Command` objects from your agent nodes or tools:
```python hl_lines="13 14 23 24 25"
from typing import Annotated
from langchain_core.tools import tool, InjectedToolCallId
from langgraph.prebuilt import create_react_agent, InjectedState
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.types import Command
def create_handoff_tool(*, agent_name: str, description: str | None = None):
name = f"transfer_to_{agent_name}"
description = description or f"Transfer to {agent_name}"
@tool(name, description=description)
def handoff_tool(
state: Annotated[MessagesState, InjectedState], # (1)!
tool_call_id: Annotated[str, InjectedToolCallId],
) -> Command:
tool_message = {
"role": "tool",
"content": f"Successfully transferred to {agent_name}",
"name": name,
"tool_call_id": tool_call_id,
}
return Command( # (2)!
goto=agent_name, # (3)!
update={"messages": state["messages"] + [tool_message]}, # (4)!
graph=Command.PARENT, # (5)!
)
return handoff_tool
```
1. Access the state of the agent that is calling the handoff tool using the [InjectedState][langgraph.prebuilt.InjectedState] annotation. See this guide for more information.
2. The `Command` primitive allows specifying a state update and a node transition as a single operation, making it useful for implementing handoffs.
3. Name of the agent or node to hand off to.
4. Take the agent's messages and **add** them to the parent's **state** as part of the handoff. The next agent will see the parent state.
5. Indicate to LangGraph that we need to navigate to the agent node in the **parent** multi-agent graph.
!!! tip
If you want to use tools that return `Command`, you can either use prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] / [`ToolNode`][langgraph.prebuilt.tool_node.ToolNode] components, or implement your own tool-executing node that collects `Command` objects returned by the tools and returns a list of them, e.g.:
```python
def call_tools(state):
    # e.g. collect the tool calls from the last AI message in the state
    tool_calls = state["messages"][-1].tool_calls
    commands = [tools_by_name[tool_call["name"]].invoke(tool_call) for tool_call in tool_calls]
    return commands
```
!!! Important
This handoff implementation assumes that:
- each agent receives the overall message history (across all agents) in the multi-agent system as its input. If you want more control over agent inputs, see this section
- each agent outputs its internal message history to the overall message history of the multi-agent system. If you want more control over **how agent outputs are added**, wrap the agent in a separate node function:
```python hl_lines="5"
def call_hotel_assistant(state):
# return agent's final response,
# excluding inner monologue
response = hotel_assistant.invoke(state)
return {"messages": response["messages"][-1]}
```
### Control agent inputs
You can use the [`Send()`][langgraph.types.Send] primitive to directly send data to the worker agents during the handoff. For example, you can request that the calling agent populate a task description for the next agent:
```python hl_lines="5 26"
from typing import Annotated
from langchain_core.tools import tool, InjectedToolCallId
from langgraph.prebuilt import InjectedState
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.types import Command, Send
def create_task_description_handoff_tool(
*, agent_name: str, description: str | None = None
):
name = f"transfer_to_{agent_name}"
description = description or f"Ask {agent_name} for help."
@tool(name, description=description)
def handoff_tool(
# this is populated by the calling agent
task_description: Annotated[
str,
"Description of what the next agent should do, including all of the relevant context.",
],
# these parameters are ignored by the LLM
state: Annotated[MessagesState, InjectedState],
) -> Command:
task_description_message = {"role": "user", "content": task_description}
agent_input = {**state, "messages": [task_description_message]}
return Command(
goto=[Send(agent_name, agent_input)],
graph=Command.PARENT,
)
return handoff_tool
```
See the multi-agent supervisor tutorial for a full example of using [`Send()`][langgraph.types.Send] in handoffs.
## Build a multi-agent system
You can use handoffs in any agents built with LangGraph. We recommend using the prebuilt agent or `ToolNode`, as they natively support handoff tools returning `Command`. Below is an example of how you can implement a multi-agent system for booking travel using handoffs:
```python hl_lines="16 17 21 22 28 29"
from langgraph.prebuilt import create_react_agent
from langgraph.graph import StateGraph, START, MessagesState
def create_handoff_tool(*, agent_name: str, description: str | None = None):
# same implementation as above
...
return Command(...)
# Handoffs
transfer_to_hotel_assistant = create_handoff_tool(agent_name="hotel_assistant")
transfer_to_flight_assistant = create_handoff_tool(agent_name="flight_assistant")
# Define agents
flight_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[..., transfer_to_hotel_assistant],
name="flight_assistant"
)
hotel_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[..., transfer_to_flight_assistant],
name="hotel_assistant"
)
# Define multi-agent graph
multi_agent_graph = (
StateGraph(MessagesState)
.add_node(flight_assistant)
.add_node(hotel_assistant)
.add_edge(START, "flight_assistant")
.compile()
)
```
??? example "Full example: Multi-agent system for booking travel"
```python hl_lines="56 57 66 67 68 94 96 100 102 124"
from typing import Annotated
from langchain_core.messages import convert_to_messages
from langchain_core.tools import tool, InjectedToolCallId
from langgraph.prebuilt import create_react_agent, InjectedState
from langgraph.graph import StateGraph, START, MessagesState
from langgraph.types import Command
# We'll use `pretty_print_messages` helper to render the streamed agent outputs nicely
def pretty_print_message(message, indent=False):
pretty_message = message.pretty_repr(html=True)
if not indent:
print(pretty_message)
return
indented = "\n".join("\t" + c for c in pretty_message.split("\n"))
print(indented)
def pretty_print_messages(update, last_message=False):
is_subgraph = False
if isinstance(update, tuple):
ns, update = update
# skip parent graph updates in the printouts
if len(ns) == 0:
return
graph_id = ns[-1].split(":")[0]
print(f"Update from subgraph {graph_id}:")
print("\n")
is_subgraph = True
for node_name, node_update in update.items():
update_label = f"Update from node {node_name}:"
if is_subgraph:
update_label = "\t" + update_label
print(update_label)
print("\n")
messages = convert_to_messages(node_update["messages"])
if last_message:
messages = messages[-1:]
for m in messages:
pretty_print_message(m, indent=is_subgraph)
print("\n")
def create_handoff_tool(*, agent_name: str, description: str | None = None):
name = f"transfer_to_{agent_name}"
description = description or f"Transfer to {agent_name}"
@tool(name, description=description)
def handoff_tool(
state: Annotated[MessagesState, InjectedState], # (1)!
tool_call_id: Annotated[str, InjectedToolCallId],
) -> Command:
tool_message = {
"role": "tool",
"content": f"Successfully transferred to {agent_name}",
"name": name,
"tool_call_id": tool_call_id,
}
return Command( # (2)!
goto=agent_name, # (3)!
update={"messages": state["messages"] + [tool_message]}, # (4)!
graph=Command.PARENT, # (5)!
)
return handoff_tool
# Handoffs
transfer_to_hotel_assistant = create_handoff_tool(
agent_name="hotel_assistant",
description="Transfer user to the hotel-booking assistant.",
)
transfer_to_flight_assistant = create_handoff_tool(
agent_name="flight_assistant",
description="Transfer user to the flight-booking assistant.",
)
# Simple agent tools
def book_hotel(hotel_name: str):
"""Book a hotel"""
return f"Successfully booked a stay at {hotel_name}."
def book_flight(from_airport: str, to_airport: str):
"""Book a flight"""
return f"Successfully booked a flight from {from_airport} to {to_airport}."
# Define agents
flight_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[book_flight, transfer_to_hotel_assistant],
prompt="You are a flight booking assistant",
name="flight_assistant"
)
hotel_assistant = create_react_agent(
model="anthropic:claude-3-5-sonnet-latest",
tools=[book_hotel, transfer_to_flight_assistant],
prompt="You are a hotel booking assistant",
name="hotel_assistant"
)
# Define multi-agent graph
multi_agent_graph = (
StateGraph(MessagesState)
.add_node(flight_assistant)
.add_node(hotel_assistant)
.add_edge(START, "flight_assistant")
.compile()
)
# Run the multi-agent graph
for chunk in multi_agent_graph.stream(
{
"messages": [
{
"role": "user",
"content": "book a flight from BOS to JFK and a stay at McKittrick Hotel"
}
]
},
subgraphs=True
):
pretty_print_messages(chunk)
```
1. Access agent's state
2. The `Command` primitive allows specifying a state update and a node transition as a single operation, making it useful for implementing handoffs.
3. Name of the agent or node to hand off to.
4. Take the agent's messages and **add** them to the parent's **state** as part of the handoff. The next agent will see the parent state.
5. Indicate to LangGraph that we need to navigate to the agent node in the **parent** multi-agent graph.
## Multi-turn conversation
Users might want to engage in a *multi-turn conversation* with one or more agents. To build a system that can handle this, you can create a node that uses an [`interrupt`][langgraph.types.interrupt] to collect user input and routes back to the **active** agent.
The agents can then be implemented as nodes in a graph that executes agent steps and determines the next action:
1. **Wait for user input** to continue the conversation, or
2. **Route to another agent** (or back to itself, such as in a loop) via a handoff
```python
from typing import Literal
from langgraph.types import Command, interrupt

def human(state) -> Command[Literal["agent", "another_agent"]]:
"""A node for collecting user input."""
user_input = interrupt(value="Ready for user input.")
# Determine the active agent.
active_agent = ...
...
return Command(
update={
"messages": [{
"role": "human",
"content": user_input,
}]
},
goto=active_agent
)
def agent(state) -> Command[Literal["agent", "another_agent", "human"]]:
# The condition for routing/halting can be anything, e.g. LLM tool call / structured output, etc.
goto = get_next_agent(...) # 'agent' / 'another_agent'
if goto:
return Command(goto=goto, update={"my_state_key": "my_state_value"})
else:
return Command(goto="human") # Go to human node
```
??? example "Full example: multi-agent system for travel recommendations"
In this example, we will build a team of travel assistant agents that can communicate with each other via handoffs.
We will create 2 agents:
* travel_advisor: can help with travel destination recommendations. Can ask hotel_advisor for help.
* hotel_advisor: can help with hotel recommendations. Can ask travel_advisor for help.
```python
from langchain_anthropic import ChatAnthropic
from langgraph.graph import MessagesState, StateGraph, START
from langgraph.prebuilt import create_react_agent, InjectedState
from langgraph.types import Command, interrupt
from langgraph.checkpoint.memory import MemorySaver
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
class MultiAgentState(MessagesState):
last_active_agent: str
# Define travel advisor tools and ReAct agent
travel_advisor_tools = [
get_travel_recommendations,
make_handoff_tool(agent_name="hotel_advisor"),
]
travel_advisor = create_react_agent(
model,
travel_advisor_tools,
prompt=(
"You are a general travel expert that can recommend travel destinations (e.g. countries, cities, etc). "
"If you need hotel recommendations, ask 'hotel_advisor' for help. "
"You MUST include human-readable response before transferring to another agent."
),
)
def call_travel_advisor(
state: MultiAgentState,
) -> Command[Literal["hotel_advisor", "human"]]:
# You can also add additional logic like changing the input to the agent / output from the agent, etc.
# NOTE: we're invoking the ReAct agent with the full history of messages in the state
response = travel_advisor.invoke(state)
update = {**response, "last_active_agent": "travel_advisor"}
return Command(update=update, goto="human")
# Define hotel advisor tools and ReAct agent
hotel_advisor_tools = [
get_hotel_recommendations,
make_handoff_tool(agent_name="travel_advisor"),
]
hotel_advisor = create_react_agent(
model,
hotel_advisor_tools,
prompt=(
"You are a hotel expert that can provide hotel recommendations for a given destination. "
"If you need help picking travel destinations, ask 'travel_advisor' for help."
"You MUST include human-readable response before transferring to another agent."
),
)
def call_hotel_advisor(
state: MultiAgentState,
) -> Command[Literal["travel_advisor", "human"]]:
response = hotel_advisor.invoke(state)
update = {**response, "last_active_agent": "hotel_advisor"}
return Command(update=update, goto="human")
def human_node(
state: MultiAgentState, config
) -> Command[Literal["hotel_advisor", "travel_advisor", "human"]]:
"""A node for collecting user input."""
user_input = interrupt(value="Ready for user input.")
active_agent = state["last_active_agent"]
return Command(
update={
"messages": [
{
"role": "human",
"content": user_input,
}
]
},
goto=active_agent,
)
builder = StateGraph(MultiAgentState)
builder.add_node("travel_advisor", call_travel_advisor)
builder.add_node("hotel_advisor", call_hotel_advisor)
# This adds a node to collect human input, which will route
# back to the active agent.
builder.add_node("human", human_node)
# We'll always start with a general travel advisor.
builder.add_edge(START, "travel_advisor")
checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
Let's test a multi-turn conversation with this application.
```python
import uuid
thread_config = {"configurable": {"thread_id": str(uuid.uuid4())}}
inputs = [
# 1st round of conversation,
{
"messages": [
{"role": "user", "content": "i wanna go somewhere warm in the caribbean"}
]
},
# Since we're using `interrupt`, we'll need to resume using the Command primitive.
# 2nd round of conversation,
Command(
resume="could you recommend a nice hotel in one of the areas and tell me which area it is."
),
# 3rd round of conversation,
Command(
resume="i like the first one. could you recommend something to do near the hotel?"
),
]
for idx, user_input in enumerate(inputs):
print()
print(f"--- Conversation Turn {idx + 1} ---")
print()
print(f"User: {user_input}")
print()
for update in graph.stream(
user_input,
config=thread_config,
stream_mode="updates",
):
for node_id, value in update.items():
if isinstance(value, dict) and value.get("messages", []):
last_message = value["messages"][-1]
if isinstance(last_message, dict) or last_message.type != "ai":
continue
print(f"{node_id}: {last_message.content}")
```
```
--- Conversation Turn 1 ---
User: {'messages': [{'role': 'user', 'content': 'i wanna go somewhere warm in the caribbean'}]}
travel_advisor: Based on the recommendations, Aruba would be an excellent choice for your Caribbean getaway! Aruba is known as "One Happy Island" and offers:
- Year-round warm weather with consistent temperatures around 82°F (28°C)
- Beautiful white sand beaches like Eagle Beach and Palm Beach
- Clear turquoise waters perfect for swimming and snorkeling
- Minimal rainfall and location outside the hurricane belt
- A blend of Caribbean and Dutch culture
- Great dining options and nightlife
- Various water sports and activities
Would you like me to get some specific hotel recommendations in Aruba for your stay? I can transfer you to our hotel advisor who can help with accommodations.
--- Conversation Turn 2 ---
User: Command(resume='could you recommend a nice hotel in one of the areas and tell me which area it is.')
hotel_advisor: Based on the recommendations, I can suggest two excellent options:
1. The Ritz-Carlton, Aruba - Located in Palm Beach
- This luxury resort is situated in the vibrant Palm Beach area
- Known for its exceptional service and amenities
- Perfect if you want to be close to dining, shopping, and entertainment
- Features multiple restaurants, a casino, and a world-class spa
- Located on a pristine stretch of Palm Beach
2. Bucuti & Tara Beach Resort - Located in Eagle Beach
- An adults-only boutique resort on Eagle Beach
- Known for being more intimate and peaceful
- Award-winning for its sustainability practices
- Perfect for a romantic getaway or peaceful vacation
- Located on one of the most beautiful beaches in the Caribbean
Would you like more specific information about either of these properties or their locations?
--- Conversation Turn 3 ---
User: Command(resume='i like the first one. could you recommend something to do near the hotel?')
travel_advisor: Near the Ritz-Carlton in Palm Beach, here are some highly recommended activities:
1. Visit the Palm Beach Plaza Mall - Just a short walk from the hotel, featuring shopping, dining, and entertainment
2. Try your luck at the Stellaris Casino - It's right in the Ritz-Carlton
3. Take a sunset sailing cruise - Many depart from the nearby pier
4. Visit the California Lighthouse - A scenic landmark just north of Palm Beach
5. Enjoy water sports at Palm Beach:
- Jet skiing
- Parasailing
- Snorkeling
- Stand-up paddleboarding
Would you like more specific information about any of these activities or would you like to know about other options in the area?
```
## Prebuilt implementations
LangGraph comes with prebuilt implementations of two of the most popular multi-agent architectures:
- supervisor — individual agents are coordinated by a central supervisor agent. The supervisor controls all communication flow and task delegation, making decisions about which agent to invoke based on the current context and task requirements. You can use the [`langgraph-supervisor`](https://github.com/langchain-ai/langgraph-supervisor-py) library to create supervisor multi-agent systems.
- swarm — agents dynamically hand off control to one another based on their specializations. The system remembers which agent was last active, ensuring that on subsequent interactions, the conversation resumes with that agent. You can use the [`langgraph-swarm`](https://github.com/langchain-ai/langgraph-swarm-py) library to create swarm multi-agent systems.
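As an illustration, a supervisor over the two booking agents defined earlier in this guide could be assembled roughly as follows. This is a hedged sketch that assumes the `create_supervisor` helper from `langgraph-supervisor`; check that library's documentation, since the exact signature may differ by version:
```python
from langgraph_supervisor import create_supervisor
from langchain.chat_models import init_chat_model

# `flight_assistant` and `hotel_assistant` are the agents created above
supervisor = create_supervisor(
    agents=[flight_assistant, hotel_assistant],
    model=init_chat_model("anthropic:claude-3-5-sonnet-latest"),
    prompt="You manage a flight-booking assistant and a hotel-booking assistant. Delegate work to them.",
).compile()

supervisor.invoke(
    {"messages": [{"role": "user", "content": "book a flight from BOS to JFK and a stay at McKittrick Hotel"}]}
)
```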
---
how-tos/disable-streaming.ipynb
---
# How to disable streaming for models that don't support it
!!! info "Prerequisites"
This guide assumes familiarity with streaming in LangGraph.
Some chat models, including the new O1 models from OpenAI (depending on when you're reading this), do not support streaming. This can lead to issues when using the [astream_events API](https://python.langchain.com/docs/concepts/#astream_events), as it calls models in streaming mode, expecting streaming to function properly.
In this guide, we'll show you how to disable streaming for models that don't support it, ensuring they're never called in streaming mode, even when invoked through the astream_events API.
```python
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState
from langgraph.graph import StateGraph, START, END
llm = ChatOpenAI(model="o1-preview", temperature=1)
graph_builder = StateGraph(MessagesState)
def chatbot(state: MessagesState):
return {"messages": [llm.invoke(state["messages"])]}
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)
graph = graph_builder.compile()
```
```python
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
```
## Without disabling streaming
Now that we've defined our graph, let's try to call `astream_events` without disabling streaming. This should throw an error because the `o1` model does not support streaming natively:
```python
input = {"messages": {"role": "user", "content": "how many r's are in strawberry?"}}
try:
async for event in graph.astream_events(input, version="v2"):
if event["event"] == "on_chat_model_end":
print(event["data"]["output"].content, end="", flush=True)
except Exception:
print("Streaming not supported!")
```
```output
Streaming not supported!
```
An error occurred, as we expected. Luckily, there is an easy fix!
## Disabling streaming
Now, without making any changes to our graph, let's set the [disable_streaming](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html#langchain_core.language_models.chat_models.BaseChatModel.disable_streaming) parameter on our model to `True`, which will solve the problem:
```python
llm = ChatOpenAI(model="o1-preview", temperature=1, disable_streaming=True)
graph_builder = StateGraph(MessagesState)
def chatbot(state: MessagesState):
return {"messages": [llm.invoke(state["messages"])]}
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)
graph = graph_builder.compile()
```
And now, rerunning with the same input, we should see no errors:
```python
input = {"messages": {"role": "user", "content": "how many r's are in strawberry?"}}
async for event in graph.astream_events(input, version="v2"):
if event["event"] == "on_chat_model_end":
print(event["data"]["output"].content, end="", flush=True)
```
```output
There are three "r"s in the word "strawberry".
```
---
how-tos/multi-agent-multi-turn-convo-functional.ipynb
---
# How to add multi-turn conversation in a multi-agent application (functional API)
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- Multi-agent systems
- Human-in-the-loop
- Functional API
- Command
- LangGraph Glossary
In this how-to guide, we’ll build an application that allows an end-user to engage in a *multi-turn conversation* with one or more agents. We'll create a node that uses an `interrupt` to collect user input and routes back to the **active** agent.
The agents will be implemented as tasks in a workflow that executes agent steps and determines the next action:
1. **Wait for user input** to continue the conversation, or
2. **Route to another agent** (or back to itself, such as in a loop) via a **handoff**.
```python
from langgraph.func import entrypoint, task
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool
from langgraph.types import interrupt
# Define a tool to signal intent to hand off to a different agent
# Note: this is not using Command(goto) syntax for navigating to different agents:
# `workflow()` below handles the handoffs explicitly
@tool(return_direct=True)
def transfer_to_hotel_advisor():
"""Ask hotel advisor agent for help."""
return "Successfully transferred to hotel advisor"
# define an agent
travel_advisor_tools = [transfer_to_hotel_advisor, ...]
travel_advisor = create_react_agent(model, travel_advisor_tools)
# define a task that calls an agent
@task
def call_travel_advisor(messages):
response = travel_advisor.invoke({"messages": messages})
return response["messages"]
# define the multi-agent network workflow
@entrypoint(checkpointer)
def workflow(messages):
call_active_agent = call_travel_advisor
while True:
agent_messages = call_active_agent(messages).result()
ai_msg = get_last_ai_msg(agent_messages)
if not ai_msg.tool_calls:
user_input = interrupt(value="Ready for user input.")
messages = messages + [{"role": "user", "content": user_input}]
continue
messages = messages + agent_messages
call_active_agent = get_next_agent(messages)
return entrypoint.final(value=agent_messages[-1], save=messages)
```
## Setup
First, let's install the required packages
```python
# %%capture --no-stderr
# %pip install -U langgraph langchain-anthropic
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("ANTHROPIC_API_KEY")
```
```output
ANTHROPIC_API_KEY: ········
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started here.
In this example we will build a team of travel assistant agents that can communicate with each other.
We will create 2 agents:
* `travel_advisor`: can help with travel destination recommendations. Can ask `hotel_advisor` for help.
* `hotel_advisor`: can help with hotel recommendations. Can ask `travel_advisor` for help.
This is a fully-connected network - every agent can talk to any other agent.
```python
import random
from typing_extensions import Literal
from langchain_core.tools import tool
@tool
def get_travel_recommendations():
"""Get recommendation for travel destinations"""
return random.choice(["aruba", "turks and caicos"])
@tool
def get_hotel_recommendations(location: Literal["aruba", "turks and caicos"]):
"""Get hotel recommendations for a given destination."""
return {
"aruba": [
"The Ritz-Carlton, Aruba (Palm Beach)"
"Bucuti & Tara Beach Resort (Eagle Beach)"
],
"turks and caicos": ["Grace Bay Club", "COMO Parrot Cay"],
}[location]
@tool(return_direct=True)
def transfer_to_hotel_advisor():
"""Ask hotel advisor agent for help."""
return "Successfully transferred to hotel advisor"
@tool(return_direct=True)
def transfer_to_travel_advisor():
"""Ask travel advisor agent for help."""
return "Successfully transferred to travel advisor"
```
!!! note "Transfer tools"
You might have noticed that we're using `@tool(return_direct=True)` in the transfer tools. This is done so that individual agents (e.g., `travel_advisor`) can exit the ReAct loop early once these tools are called. This is the desired behavior, as we want to detect when the agent calls this tool and hand control off _immediately_ to a different agent.
**NOTE**: This is meant to work with the prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] -- if you are building a custom agent, make sure to manually add logic for handling early exit for tools that are marked with `return_direct`.
Let's now create our agents using the prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] and our multi-agent workflow. Note that we will be calling [`interrupt`][langgraph.types.interrupt] every time after we get the final response from each of the agents.
```python
import uuid
from langchain_core.messages import AIMessage
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langgraph.graph import add_messages
from langgraph.func import entrypoint, task
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import interrupt, Command
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
# Define travel advisor ReAct agent
travel_advisor_tools = [
get_travel_recommendations,
transfer_to_hotel_advisor,
]
travel_advisor = create_react_agent(
model,
travel_advisor_tools,
prompt=(
"You are a general travel expert that can recommend travel destinations (e.g. countries, cities, etc). "
"If you need hotel recommendations, ask 'hotel_advisor' for help. "
"You MUST include human-readable response before transferring to another agent."
),
)
@task
def call_travel_advisor(messages):
# You can also add additional logic like changing the input to the agent / output from the agent, etc.
# NOTE: we're invoking the ReAct agent with the full history of messages in the state
response = travel_advisor.invoke({"messages": messages})
return response["messages"]
# Define hotel advisor ReAct agent
hotel_advisor_tools = [get_hotel_recommendations, transfer_to_travel_advisor]
hotel_advisor = create_react_agent(
model,
hotel_advisor_tools,
prompt=(
"You are a hotel expert that can provide hotel recommendations for a given destination. "
"If you need help picking travel destinations, ask 'travel_advisor' for help."
"You MUST include human-readable response before transferring to another agent."
),
)
@task
def call_hotel_advisor(messages):
response = hotel_advisor.invoke({"messages": messages})
return response["messages"]
checkpointer = MemorySaver()
def string_to_uuid(input_string):
return str(uuid.uuid5(uuid.NAMESPACE_URL, input_string))
@entrypoint(checkpointer=checkpointer)
def multi_turn_graph(messages, previous):
previous = previous or []
messages = add_messages(previous, messages)
call_active_agent = call_travel_advisor
while True:
agent_messages = call_active_agent(messages).result()
messages = add_messages(messages, agent_messages)
# Find the last AI message
# If one of the handoff tools is called, the last message returned
# by the agent will be a ToolMessage because we set them to have
# "return_direct=True". This means that the last AIMessage will
# have tool calls.
# Otherwise, the last returned message will be an AIMessage with
# no tool calls, which means we are ready for new input.
ai_msg = next(m for m in reversed(agent_messages) if isinstance(m, AIMessage))
if not ai_msg.tool_calls:
user_input = interrupt(value="Ready for user input.")
# Add user input as a human message
# NOTE: we generate unique ID for the human message based on its content
# it's important, since on subsequent invocations previous user input (interrupt) values
# will be looked up again and we will attempt to add them again here
# `add_messages` deduplicates messages based on the ID, ensuring correct message history
human_message = {
"role": "user",
"content": user_input,
"id": string_to_uuid(user_input),
}
messages = add_messages(messages, [human_message])
continue
tool_call = ai_msg.tool_calls[-1]
if tool_call["name"] == "transfer_to_hotel_advisor":
call_active_agent = call_hotel_advisor
elif tool_call["name"] == "transfer_to_travel_advisor":
call_active_agent = call_travel_advisor
else:
raise ValueError(f"Expected transfer tool, got '{tool_call['name']}'")
return entrypoint.final(value=agent_messages[-1], save=messages)
```
## Test multi-turn conversation
Let's test a multi-turn conversation with this application.
```python
thread_config = {"configurable": {"thread_id": uuid.uuid4()}}
inputs = [
# 1st round of conversation,
{
"role": "user",
"content": "i wanna go somewhere warm in the caribbean",
"id": str(uuid.uuid4()),
},
# Since we're using `interrupt`, we'll need to resume using the Command primitive.
# 2nd round of conversation,
Command(
resume="could you recommend a nice hotel in one of the areas and tell me which area it is."
),
# 3rd round of conversation,
Command(
resume="i like the first one. could you recommend something to do near the hotel?"
),
]
for idx, user_input in enumerate(inputs):
print()
print(f"--- Conversation Turn {idx + 1} ---")
print()
print(f"User: {user_input}")
print()
for update in multi_turn_graph.stream(
user_input,
config=thread_config,
stream_mode="updates",
):
for node_id, value in update.items():
if isinstance(value, list) and value:
last_message = value[-1]
if isinstance(last_message, dict) or last_message.type != "ai":
continue
print(f"{node_id}: {last_message.content}")
```
```output
--- Conversation Turn 1 ---
User: {'role': 'user', 'content': 'i wanna go somewhere warm in the caribbean', 'id': 'f48d82a7-7efa-43f5-ad4c-541758c95f61'}
call_travel_advisor: Based on the recommendations, Aruba would be an excellent choice for your Caribbean getaway! Known as "One Happy Island," Aruba offers:
- Year-round warm weather with consistent temperatures around 82°F (28°C)
- Beautiful white sand beaches like Eagle Beach and Palm Beach
- Crystal clear waters perfect for swimming and snorkeling
- Minimal rainfall and location outside the hurricane belt
- Rich culture blending Dutch and Caribbean influences
- Various activities from water sports to desert-like landscape exploration
- Excellent dining and shopping options
Would you like me to help you find suitable accommodations in Aruba? I can transfer you to our hotel advisor who can recommend specific hotels based on your preferences.
--- Conversation Turn 2 ---
User: Command(resume='could you recommend a nice hotel in one of the areas and tell me which area it is.')
call_hotel_advisor: I can recommend two excellent options in different areas:
1. The Ritz-Carlton, Aruba - Located in Palm Beach
- Luxury beachfront resort
- Located in the vibrant Palm Beach area, known for its lively atmosphere
- Close to restaurants, shopping, and nightlife
- Perfect for those who want a more active vacation with plenty of amenities nearby
2. Bucuti & Tara Beach Resort - Located in Eagle Beach
- Adults-only boutique resort
- Situated on the quieter Eagle Beach
- Known for its romantic atmosphere and excellent service
- Ideal for couples seeking a more peaceful, intimate setting
Would you like more specific information about either of these properties or their locations?
--- Conversation Turn 3 ---
User: Command(resume='i like the first one. could you recommend something to do near the hotel?')
call_travel_advisor: Near The Ritz-Carlton in Palm Beach, here are some popular activities you can enjoy:
1. Palm Beach Strip - Take a walk along this bustling strip filled with restaurants, shops, and bars
2. Visit the Bubali Bird Sanctuary - Just a short distance away
3. Try your luck at the Stellaris Casino - Located right in The Ritz-Carlton
4. Water Sports at Palm Beach - Right in front of the hotel you can:
- Go parasailing
- Try jet skiing
- Take a sunset sailing cruise
5. Visit the Palm Beach Plaza Mall - High-end shopping just a short walk away
6. Enjoy dinner at Madame Janette's - One of Aruba's most famous restaurants nearby
Would you like more specific information about any of these activities or other suggestions in the area?
```
---
how-tos/run-id-langsmith.ipynb
---
# How to pass custom run ID or set tags and metadata for graph runs in LangSmith
!!! info "Prerequisites"
This guide assumes familiarity with [LangSmith](https://docs.smith.langchain.com) and the prebuilt ReAct agent.
Debugging graph runs can sometimes be difficult to do in an IDE or terminal. [LangSmith](https://docs.smith.langchain.com) lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read the [LangSmith documentation](https://docs.smith.langchain.com) for more information on how to get started.
To make it easier to identify and analyze traces generated during graph invocation, you can set additional configuration at run time (see [RunnableConfig](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig)):
| **Field** | **Type** | **Description** |
|-------------|---------------------|--------------------------------------------------------------------------------------------------------------------|
| run_name | `str` | Name for the tracer run for this call. Defaults to the name of the class. |
| run_id | `UUID` | Unique identifier for the tracer run for this call. If not provided, a new UUID will be generated. |
| tags | `List[str]` | Tags for this call and any sub-calls (e.g., a Chain calling an LLM). You can use these to filter calls. |
| metadata | `Dict[str, Any]` | Metadata for this call and any sub-calls (e.g., a Chain calling an LLM). Keys should be strings, values should be JSON-serializable. |
LangGraph graphs implement the [LangChain Runnable Interface](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) and accept a second argument (`RunnableConfig`) in methods like `invoke`, `ainvoke`, `stream` etc.
The LangSmith platform will allow you to search and filter traces based on `run_name`, `run_id`, `tags` and `metadata`.
## TLDR
```python
import uuid
# Generate a random UUID -- it must be a UUID
config = {"run_id": uuid.uuid4()}, "tags": ["my_tag1"], "metadata": {"a": 5}}
# Works with all standard Runnable methods
# like invoke, batch, ainvoke, astream_events etc
graph.stream(inputs, config, stream_mode="values")
```
The rest of this how-to guide will show a full agent example.
## Setup
First, let's install the required packages and set our API keys
```shell
pip install --quiet -U langgraph langchain_openai
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
_set_env("LANGSMITH_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started here.
## Define the graph
For this example we will use the [prebuilt ReAct agent](https://langchain-ai.github.io/langgraph/how-tos/create-react-agent/).
```python
from langchain_openai import ChatOpenAI
from typing import Literal
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool
# First we initialize the model we want to use.
model = ChatOpenAI(model="gpt-4o", temperature=0)
# For this tutorial we will use a custom tool that returns pre-defined values for the weather in two cities (NYC & SF)
@tool
def get_weather(city: Literal["nyc", "sf"]):
"""Use this to get weather information."""
if city == "nyc":
return "It might be cloudy in nyc"
elif city == "sf":
return "It's always sunny in sf"
else:
raise AssertionError("Unknown city")
tools = [get_weather]
# Define the graph
graph = create_react_agent(model, tools=tools)
```
## Run your graph
Now that we've defined our graph, let's run it once and view the trace in LangSmith. To make our trace easy to find in LangSmith, we will pass a custom `run_id` (along with a `run_name` and tags) in the config.
This assumes that you have set your `LANGSMITH_API_KEY` environment variable.
Note that you can also configure which project to trace to by setting the `LANGCHAIN_PROJECT` environment variable; by default, runs will be traced to the `default` project.
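If you want runs to go to a specific project, one option is to set the environment variable in code before invoking the graph. A minimal sketch (the project name below is just a placeholder):
```python
import os

# Optional: send traces to a named LangSmith project instead of "default".
# "my-langgraph-project" is a placeholder -- use any project name you like.
os.environ["LANGCHAIN_PROJECT"] = "my-langgraph-project"
```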
```python
import uuid
def print_stream(stream):
for s in stream:
message = s["messages"][-1]
if isinstance(message, tuple):
print(message)
else:
message.pretty_print()
inputs = {"messages": [("user", "what is the weather in sf")]}
config = {"run_name": "agent_007", "tags": ["cats are awesome"]}
print_stream(graph.stream(inputs, config, stream_mode="values"))
```
```output
================================ Human Message =================================
what is the weather in sf
================================== Ai Message ==================================
Tool Calls:
get_weather (call_9ZudXyMAdlUjptq9oMGtQo8o)
Call ID: call_9ZudXyMAdlUjptq9oMGtQo8o
Args:
city: sf
================================= Tool Message =================================
Name: get_weather
It's always sunny in sf
================================== Ai Message ==================================
The weather in San Francisco is currently sunny.
```
## View the trace in LangSmith
Now that we've run our graph, let's head over to LangSmith and view our trace. First, click into the project that you traced to (in our case, the `default` project). You should see a run with the custom run name "agent_007".
In addition, you will be able to filter traces after the fact using the tags or metadata provided; for example, you can filter for runs tagged with "cats are awesome".
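If you prefer to filter programmatically rather than in the UI, here is a rough sketch using the `langsmith` SDK (assuming the package is installed and `LANGSMITH_API_KEY` is set; the filter string follows LangSmith's query syntax):
```python
from langsmith import Client

client = Client()
# List runs in the "default" project that carry the tag we set above.
runs = client.list_runs(
    project_name="default",
    filter='has(tags, "cats are awesome")',
)
for run in runs:
    print(run.name, run.id)
```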
---
how-tos/create-react-agent-manage-message-history.ipynb
---
# How to manage conversation history in a ReAct Agent
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- Prebuilt create_react_agent
- Persistence
- Short-term Memory
- [Trimming Messages](https://python.langchain.com/docs/how_to/trim_messages/)
Message history can grow quickly and exceed LLM context window size, whether you're building chatbots with many conversation turns or agentic systems with numerous tool calls. There are several strategies for managing the message history:
* message trimming — remove first or last N messages in the history
* summarization — summarize earlier messages in the history and replace them with a summary
* custom strategies (e.g., message filtering, etc.)
To manage message history in `create_react_agent`, you need to define a `pre_model_hook` function or [runnable](https://python.langchain.com/docs/concepts/runnables/) that takes the graph state and returns a state update:
* Trimming example:
```python hl_lines="1 2 3 4 19 25"
from langchain_core.messages.utils import (
trim_messages,
count_tokens_approximately
)
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import InMemorySaver
# This function will be called every time before the node that calls LLM
def pre_model_hook(state):
trimmed_messages = trim_messages(
state["messages"],
strategy="last",
token_counter=count_tokens_approximately,
max_tokens=384,
start_on="human",
end_on=("human", "tool"),
)
# You can return updated messages either under `llm_input_messages` or
# `messages` key (see the note below)
return {"llm_input_messages": trimmed_messages}
checkpointer = InMemorySaver()
agent = create_react_agent(
model,
tools,
pre_model_hook=pre_model_hook,
checkpointer=checkpointer,
)
```
* Summarization example:
```python hl_lines="1 20 27 28"
from langmem.short_term import SummarizationNode
from langchain_core.messages.utils import count_tokens_approximately
from langgraph.prebuilt.chat_agent_executor import AgentState
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from typing import Any
model = ChatOpenAI(model="gpt-4o")
summarization_node = SummarizationNode(
token_counter=count_tokens_approximately,
model=model,
max_tokens=384,
max_summary_tokens=128,
output_messages_key="llm_input_messages",
)
class State(AgentState):
# NOTE: we're adding this key to keep track of previous summary information
# to make sure we're not summarizing on every LLM call
context: dict[str, Any]
checkpointer = InMemorySaver()
graph = create_react_agent(
model,
tools,
pre_model_hook=summarization_node,
state_schema=State,
checkpointer=checkpointer,
)
```
!!! important
* To **keep the original message history unmodified** in the graph state and pass the updated history **only as the input to the LLM**, return the updated messages under the `llm_input_messages` key
* To **overwrite the original message history** in the graph state with the updated history, return the updated messages under the `messages` key
To overwrite the `messages` key, you need to do the following:
```python
from langchain_core.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
def pre_model_hook(state):
updated_messages = ...
return {
"messages": [RemoveMessage(id=REMOVE_ALL_MESSAGES), *updated_messages]
...
}
```
## Setup
First, let's install the required packages and set our API keys
```shell
pip install -U langgraph langchain-openai langmem
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started [here](https://docs.smith.langchain.com).
## Keep the original message history unmodified
Let's build a ReAct agent with a step that manages the conversation history: when the length of the history exceeds a specified number of tokens, we will call the [`trim_messages`](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.trim_messages.html) utility, which will reduce the history while satisfying LLM provider constraints.
There are two ways that the updated message history can be applied inside ReAct agent:
* **Keep the original message history unmodified** in the graph state and pass the updated history **only as the input to the LLM**
* **Overwrite the original message history** in the graph state with the updated history
Let's start by implementing the first one. We'll need to first define model and tools for our agent:
```python
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o", temperature=0)
def get_weather(location: str) -> str:
"""Use this to get weather information."""
if any([city in location.lower() for city in ["nyc", "new york city"]]):
return "It might be cloudy in nyc, with a chance of rain and temperatures up to 80 degrees."
elif any([city in location.lower() for city in ["sf", "san francisco"]]):
return "It's always sunny in sf"
else:
return f"I am not sure what the weather is in {location}"
tools = [get_weather]
```
Now let's implement `pre_model_hook` — a function that will be added as a new node and called every time **before** the node that calls the LLM (the `agent` node).
Our implementation will wrap the `trim_messages` call and return the trimmed messages under `llm_input_messages`. This will **keep the original message history unmodified** in the graph state and pass the updated history **only as the input to the LLM**.
```python hl_lines="4 5 6 7 22 29"
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages.utils import (
trim_messages,
count_tokens_approximately,
)
# This function will be added as a new node in ReAct agent graph
# that will run every time before the node that calls the LLM.
# The messages returned by this function will be the input to the LLM.
def pre_model_hook(state):
trimmed_messages = trim_messages(
state["messages"],
strategy="last",
token_counter=count_tokens_approximately,
max_tokens=384,
start_on="human",
end_on=("human", "tool"),
)
return {"llm_input_messages": trimmed_messages}
checkpointer = InMemorySaver()
graph = create_react_agent(
model,
tools,
pre_model_hook=pre_model_hook,
checkpointer=checkpointer,
)
```
```python
from IPython.display import display, Image
display(Image(graph.get_graph().draw_mermaid_png()))
```
We'll also define a utility to render the agent outputs nicely:
```python
def print_stream(stream, output_messages_key="llm_input_messages"):
for chunk in stream:
for node, update in chunk.items():
print(f"Update from node: {node}")
messages_key = (
output_messages_key if node == "pre_model_hook" else "messages"
)
for message in update[messages_key]:
if isinstance(message, tuple):
print(message)
else:
message.pretty_print()
print("\n\n")
```
Now let's run the agent with a few different queries to reach the specified max tokens limit:
```python
config = {"configurable": {"thread_id": "1"}}
inputs = {"messages": [("user", "What's the weather in NYC?")]}
result = graph.invoke(inputs, config=config)
inputs = {"messages": [("user", "What's it known for?")]}
result = graph.invoke(inputs, config=config)
```
Let's see how many tokens we have in the message history so far:
```python
messages = result["messages"]
count_tokens_approximately(messages)
```
```output
415
```
You can see that we are close to the `max_tokens` threshold, so on the next invocation we should see `pre_model_hook` kick in and trim the message history. Let's run it again:
```python
inputs = {"messages": [("user", "where can i find the best bagel?")]}
print_stream(graph.stream(inputs, config=config, stream_mode="updates"))
```
```output
Update from node: pre_model_hook
================================ Human Message =================================
What's it known for?
================================== Ai Message ==================================
New York City is known for a variety of iconic landmarks, cultural institutions, and vibrant neighborhoods. Some of the most notable features include:
1. **Statue of Liberty**: A symbol of freedom and democracy, located on Liberty Island.
2. **Times Square**: Known for its bright lights, Broadway theaters, and bustling atmosphere.
3. **Central Park**: A large public park offering a natural retreat in the middle of the city.
4. **Empire State Building**: An iconic skyscraper offering panoramic views of the city.
5. **Broadway**: Famous for its world-class theater productions.
6. **Wall Street**: The financial hub of the United States.
7. **Museums**: Including the Metropolitan Museum of Art, Museum of Modern Art (MoMA), and the American Museum of Natural History.
8. **Diverse Cuisine**: A melting pot of cultures offering a wide range of culinary experiences.
9. **Cultural Diversity**: A rich tapestry of cultures and communities from around the world.
10. **Fashion**: A global fashion capital, hosting events like New York Fashion Week.
These are just a few highlights of what makes New York City a unique and vibrant place.
================================ Human Message =================================
where can i find the best bagel?
Update from node: agent
================================== Ai Message ==================================
New York City is famous for its bagels, and there are several places renowned for serving some of the best. Here are a few top spots where you can find excellent bagels in NYC:
1. **Ess-a-Bagel**: Known for their large, chewy bagels with a variety of spreads and toppings.
2. **Russ & Daughters**: A classic spot offering traditional bagels with high-quality smoked fish and cream cheese.
3. **H&H Bagels**: Famous for their fresh, hand-rolled bagels.
4. **Murray’s Bagels**: Offers a wide selection of bagels and spreads, with a no-toasting policy to preserve freshness.
5. **Absolute Bagels**: Known for their authentic, fluffy bagels and a variety of cream cheese options.
6. **Tompkins Square Bagels**: Offers creative bagel sandwiches and a wide range of spreads.
7. **Bagel Hole**: Known for their smaller, denser bagels with a crispy crust.
Each of these places has its own unique style and flavor, so it might be worth trying a few to find your personal favorite!
```
You can see that the `pre_model_hook` node now only returned the last 3 messages, as expected. However, the existing message history is untouched:
```python
updated_messages = graph.get_state(config).values["messages"]
assert [(m.type, m.content) for m in updated_messages[: len(messages)]] == [
(m.type, m.content) for m in messages
]
```
## Overwrite the original message history
Let's now change the `pre_model_hook` to **overwrite** the message history in the graph state. To do this, we’ll return the updated messages under the `messages` key. We’ll also include a special `RemoveMessage(REMOVE_ALL_MESSAGES)` object, which tells `create_react_agent` to remove previous messages from the graph state:
```python hl_lines="16 23"
from langchain_core.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
def pre_model_hook(state):
trimmed_messages = trim_messages(
state["messages"],
strategy="last",
token_counter=count_tokens_approximately,
max_tokens=384,
start_on="human",
end_on=("human", "tool"),
)
# NOTE that we're now returning the messages under the `messages` key
# We also remove the existing messages in the history to ensure we're overwriting the history
return {"messages": [RemoveMessage(REMOVE_ALL_MESSAGES)] + trimmed_messages}
checkpointer = InMemorySaver()
graph = create_react_agent(
model,
tools,
pre_model_hook=pre_model_hook,
checkpointer=checkpointer,
)
```
Now let's run the agent with the same queries as before:
```python
config = {"configurable": {"thread_id": "1"}}
inputs = {"messages": [("user", "What's the weather in NYC?")]}
result = graph.invoke(inputs, config=config)
inputs = {"messages": [("user", "What's it known for?")]}
result = graph.invoke(inputs, config=config)
messages = result["messages"]
inputs = {"messages": [("user", "where can i find the best bagel?")]}
print_stream(
graph.stream(inputs, config=config, stream_mode="updates"),
output_messages_key="messages",
)
```
```output
Update from node: pre_model_hook
================================ Remove Message ================================
================================ Human Message =================================
What's it known for?
================================== Ai Message ==================================
New York City is known for a variety of iconic landmarks, cultural institutions, and vibrant neighborhoods. Some of the most notable features include:
1. **Statue of Liberty**: A symbol of freedom and democracy, located on Liberty Island.
2. **Times Square**: Known for its bright lights, Broadway theaters, and bustling atmosphere.
3. **Central Park**: A large public park offering a natural oasis amidst the urban environment.
4. **Empire State Building**: An iconic skyscraper offering panoramic views of the city.
5. **Broadway**: Famous for its world-class theater productions and musicals.
6. **Wall Street**: The financial hub of the United States, located in the Financial District.
7. **Museums**: Including the Metropolitan Museum of Art, Museum of Modern Art (MoMA), and the American Museum of Natural History.
8. **Diverse Cuisine**: A melting pot of cultures, offering a wide range of international foods.
9. **Cultural Diversity**: Known for its diverse population and vibrant cultural scene.
10. **Brooklyn Bridge**: An iconic suspension bridge connecting Manhattan and Brooklyn.
These are just a few highlights, as NYC is a city with endless attractions and activities.
================================ Human Message =================================
where can i find the best bagel?
Update from node: agent
================================== Ai Message ==================================
New York City is famous for its bagels, and there are several places renowned for serving some of the best. Here are a few top spots where you can find delicious bagels in NYC:
1. **Ess-a-Bagel**: Known for its large, chewy bagels and a wide variety of spreads and toppings. Locations in Midtown and the East Village.
2. **Russ & Daughters**: A historic appetizing store on the Lower East Side, famous for its bagels with lox and cream cheese.
3. **Absolute Bagels**: Located on the Upper West Side, this spot is popular for its fresh, fluffy bagels.
4. **Murray’s Bagels**: Known for its traditional, hand-rolled bagels. Located in Greenwich Village.
5. **Tompkins Square Bagels**: Offers a wide selection of bagels and creative cream cheese flavors. Located in the East Village.
6. **Bagel Hole**: A small shop in Park Slope, Brooklyn, known for its classic, no-frills bagels.
7. **Leo’s Bagels**: Located in the Financial District, known for its authentic New York-style bagels.
Each of these places has its own unique style and flavor, so it might be worth trying a few to find your personal favorite!
```
You can see that the `pre_model_hook` node returned the last 3 messages again. However, this time, the message history is modified in the graph state as well:
```python
updated_messages = graph.get_state(config).values["messages"]
assert (
# First 2 messages in the new history are the same as last 2 messages in the old
[(m.type, m.content) for m in updated_messages[:2]]
== [(m.type, m.content) for m in messages[-2:]]
)
```
## Summarizing message history
Finally, let's apply a different strategy for managing message history — summarization. Just as with trimming, you can choose to keep the original message history unmodified or overwrite it. The example below will only show the former.
We will use the prebuilt [`SummarizationNode`](https://langchain-ai.github.io/langmem/guides/summarization/#using-summarizationnode) from the `langmem` library. Once the message history reaches the token limit, the summarization node will summarize earlier messages to make sure they fit into `max_tokens`.
```python hl_lines="1 20 28 29"
from langmem.short_term import SummarizationNode
from langgraph.prebuilt.chat_agent_executor import AgentState
from typing import Any
model = ChatOpenAI(model="gpt-4o")
summarization_model = model.bind(max_tokens=128)
summarization_node = SummarizationNode(
token_counter=count_tokens_approximately,
model=summarization_model,
max_tokens=384,
max_summary_tokens=128,
output_messages_key="llm_input_messages",
)
class State(AgentState):
# NOTE: we're adding this key to keep track of previous summary information
# to make sure we're not summarizing on every LLM call
context: dict[str, Any]
checkpointer = InMemorySaver()
graph = create_react_agent(
# limit the output size to ensure consistent behavior
model.bind(max_tokens=256),
tools,
pre_model_hook=summarization_node,
state_schema=State,
checkpointer=checkpointer,
)
```
```python
config = {"configurable": {"thread_id": "1"}}
inputs = {"messages": [("user", "What's the weather in NYC?")]}
result = graph.invoke(inputs, config=config)
inputs = {"messages": [("user", "What's it known for?")]}
result = graph.invoke(inputs, config=config)
inputs = {"messages": [("user", "where can i find the best bagel?")]}
print_stream(graph.stream(inputs, config=config, stream_mode="updates"))
```
```output
Update from node: pre_model_hook
================================ System Message ================================
Summary of the conversation so far: The user asked about the current weather in New York City. In response, the assistant provided information that it might be cloudy, with a chance of rain, and temperatures reaching up to 80 degrees.
================================ Human Message =================================
What's it known for?
================================== Ai Message ==================================
New York City, often referred to as NYC, is known for its:
1. **Landmarks and Iconic Sites**:
- **Statue of Liberty**: A symbol of freedom and democracy.
- **Central Park**: A vast green oasis in the middle of the city.
- **Empire State Building**: Once the tallest building in the world, offering stunning views of the city.
- **Times Square**: Known for its bright lights and bustling atmosphere.
2. **Cultural Institutions**:
- **Broadway**: Renowned for theatrical performances and musicals.
- **Metropolitan Museum of Art** and **Museum of Modern Art (MoMA)**: World-class art collections.
- **American Museum of Natural History**: Known for its extensive exhibits ranging from dinosaurs to space exploration.
3. **Diverse Neighborhoods and Cuisine**:
- NYC is famous for having a melting pot of cultures, reflected in neighborhoods like Chinatown, Little Italy, and Harlem.
- The city offers a wide range of international cuisines, from street food to high-end dining.
4. **Financial District**:
- Home to Wall Street, the New York Stock Exchange (NYSE), and other major financial institutions.
5. **Media and Entertainment**:
- Major hub for television, film, and media, with numerous studios and networks based there.
6. **Fashion**:
- Often referred to as one of the "Big Four" fashion capitals, hosting events like New York Fashion Week.
7. **Sports**:
- Known for its passionate sports culture with teams like the Yankees (MLB), Mets (MLB), Knicks (NBA), and Rangers (NHL).
These elements, among others, contribute to NYC's reputation as a vibrant and dynamic city.
================================ Human Message =================================
where can i find the best bagel?
Update from node: agent
================================== Ai Message ==================================
Finding the best bagel in New York City can be subjective, as there are many beloved spots across the city. However, here are some renowned bagel shops you might want to try:
1. **Ess-a-Bagel**: Known for its chewy and flavorful bagels, located in Midtown and Stuyvesant Town.
2. **Bagel Hole**: A favorite for traditionalists, offering classic and dense bagels, located in Park Slope, Brooklyn.
3. **Russ & Daughters**: A legendary appetizing store on the Lower East Side, famous for their bagels with lox.
4. **Murray’s Bagels**: Located in Greenwich Village, known for their fresh and authentic New York bagels.
5. **Absolute Bagels**: Located on the Upper West Side, they’re known for their fresh, fluffy bagels with a variety of spreads.
6. **Tompkins Square Bagels**: In the East Village, famous for their creative cream cheese options and fresh bagels.
7. **Zabar’s**: A landmark on the Upper West Side known for their classic bagels and smoked fish.
Each of these spots offers a unique take on the classic New York bagel experience, and trying several might be the best way to discover your personal favorite!
```
You can see that the earlier messages have now been replaced with the summary of the earlier conversation!
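Since we returned the summarized history under `llm_input_messages`, the messages stored in the graph state are still the full, unmodified history. A quick, hedged way to confirm this from the checkpoint:
```python
# The graph state still holds the complete message history;
# only the input passed to the LLM was summarized.
state_messages = graph.get_state(config).values["messages"]
print(len(state_messages))
```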
---
how-tos/autogen-integration-functional.ipynb
---
# How to integrate LangGraph (functional API) with AutoGen, CrewAI, and other frameworks
LangGraph is a framework for building agentic and multi-agent applications. LangGraph can be easily integrated with other agent frameworks.
The primary reasons you might want to integrate LangGraph with other agent frameworks:
- create multi-agent systems where individual agents are built with different frameworks
- leverage LangGraph to add features like persistence, streaming, short- and long-term memory, and more
The simplest way to integrate agents from other frameworks is by calling those agents inside a LangGraph node:
```python
import autogen
from langgraph.func import entrypoint, task
autogen_agent = autogen.AssistantAgent(name="assistant", ...)
user_proxy = autogen.UserProxyAgent(name="user_proxy", ...)
@task
def call_autogen_agent(messages):
response = user_proxy.initiate_chat(
autogen_agent,
message=messages[-1],
...
)
...
@entrypoint()
def workflow(messages):
response = call_autogen_agent(messages).result()
return response
workflow.invoke(
[
{
"role": "user",
"content": "Find numbers between 10 and 30 in fibonacci sequence",
}
]
)
```
In this guide we show how to build a LangGraph chatbot that integrates with AutoGen, but you can follow the same approach with other frameworks.
## Setup
```python
%pip install autogen langgraph
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
```output
OPENAI_API_KEY: ········
```
## Define AutoGen agent
Here we define our AutoGen agent, adapted from the official tutorial [here](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_web_info.ipynb).
```python
import autogen
import os
config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]
llm_config = {
"timeout": 600,
"cache_seed": 42,
"config_list": config_list,
"temperature": 0,
}
autogen_agent = autogen.AssistantAgent(
name="assistant",
llm_config=llm_config,
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={
"work_dir": "web",
"use_docker": False,
}, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
llm_config=llm_config,
system_message="Reply TERMINATE if the task has been solved at full satisfaction. Otherwise, reply CONTINUE, or the reason why the task is not solved yet.",
)
```
---
## Create the workflow
We will now create a LangGraph chatbot graph that calls the AutoGen agent.
```python
from langchain_core.messages import convert_to_openai_messages, BaseMessage
from langgraph.func import entrypoint, task
from langgraph.graph import add_messages
from langgraph.checkpoint.memory import MemorySaver
@task
def call_autogen_agent(messages: list[BaseMessage]):
# convert to openai-style messages
messages = convert_to_openai_messages(messages)
response = user_proxy.initiate_chat(
autogen_agent,
message=messages[-1],
# pass previous message history as context
carryover=messages[:-1],
)
# get the final response from the agent
content = response.chat_history[-1]["content"]
return {"role": "assistant", "content": content}
# add short-term memory for storing conversation history
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(messages: list[BaseMessage], previous: list[BaseMessage]):
messages = add_messages(previous or [], messages)
response = call_autogen_agent(messages).result()
return entrypoint.final(value=response, save=add_messages(messages, response))
```
## Run the graph
We can now run the graph.
```python hl_lines="2 11"
# pass the thread ID to persist agent outputs for future interactions
config = {"configurable": {"thread_id": "1"}}
for chunk in workflow.stream(
[
{
"role": "user",
"content": "Find numbers between 10 and 30 in fibonacci sequence",
}
],
config,
):
print(chunk)
```
```output
user_proxy (to assistant):
Find numbers between 10 and 30 in fibonacci sequence
--------------------------------------------------------------------------------
assistant (to user_proxy):
To find numbers between 10 and 30 in the Fibonacci sequence, we can generate the Fibonacci sequence and check which numbers fall within this range. Here's a plan:
1. Generate Fibonacci numbers starting from 0.
2. Continue generating until the numbers exceed 30.
3. Collect and print the numbers that are between 10 and 30.
Let's implement this in Python:
\`\`\`python
# filename: fibonacci_range.py
def fibonacci_sequence():
a, b = 0, 1
while a <= 30:
if 10 <= a <= 30:
print(a)
a, b = b, a + b
fibonacci_sequence()
\`\`\`
This script will print the Fibonacci numbers between 10 and 30. Please execute the code to see the result.
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):
exitcode: 0 (execution succeeded)
Code output:
13
21
--------------------------------------------------------------------------------
assistant (to user_proxy):
The Fibonacci numbers between 10 and 30 are 13 and 21.
These numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1.
The sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
As you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.
TERMINATE
--------------------------------------------------------------------------------
{'call_autogen_agent': {'role': 'assistant', 'content': 'The Fibonacci numbers between 10 and 30 are 13 and 21. \n\nThese numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1. \n\nThe sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...\n\nAs you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.\n\nTERMINATE'}}
{'workflow': {'role': 'assistant', 'content': 'The Fibonacci numbers between 10 and 30 are 13 and 21. \n\nThese numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1. \n\nThe sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...\n\nAs you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.\n\nTERMINATE'}}
```
Since we're leveraging LangGraph's [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/) features we can now continue the conversation using the same thread ID -- LangGraph will automatically pass previous history to the AutoGen agent:
```python hl_lines="8"
for chunk in workflow.stream(
[
{
"role": "user",
"content": "Multiply the last number by 3",
}
],
config,
):
print(chunk)
```
```output
user_proxy (to assistant):
Multiply the last number by 3
Context:
Find numbers between 10 and 30 in fibonacci sequence
The Fibonacci numbers between 10 and 30 are 13 and 21.
These numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1.
The sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
As you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.
TERMINATE
--------------------------------------------------------------------------------
assistant (to user_proxy):
The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:
21 * 3 = 63
TERMINATE
--------------------------------------------------------------------------------
{'call_autogen_agent': {'role': 'assistant', 'content': 'The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:\n\n21 * 3 = 63\n\nTERMINATE'}}
{'workflow': {'role': 'assistant', 'content': 'The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:\n\n21 * 3 = 63\n\nTERMINATE'}}
```
---
how-tos/react-agent-structured-output.ipynb
---
# How to force tool-calling agent to structure output
You might want your agent to return its output in a structured format. For example, if the output of the agent is used by some other downstream software, you may want the output to be in the same structured format every time the agent is invoked to ensure consistency.
This notebook will walk through two different options for forcing a tool calling agent to structure its output. We will be using a basic [ReAct agent](https://langchain-ai.github.io/langgraph/how-tos/create-react-agent/) (a model node and a tool-calling node) together with a third node at the end that will format the response for the user. Both of the options will use the same graph structure as shown in the diagram below, but will have different mechanisms under the hood.
**Option 1**
The first way you can force your tool calling agent to have structured output is to bind the output you would like as an additional tool for the `agent` node to use. In contrast to the basic ReAct agent, the `agent` node in this case is not selecting between `tools` and `END` but rather selecting between the specific tools it calls. The expected flow in this case is that the LLM in the `agent` node will first select the action tool, and after receiving the action tool output it will call the response tool, which will then route to the `respond` node which simply structures the arguments from the `agent` node tool call.
**Pros and Cons**
The benefit of this approach is that it requires only one LLM, which can save money and latency. The downside to this option is that it isn't guaranteed that the single LLM will call the correct tool when you want it to. We can help the LLM by setting `tool_choice` to `any` when we use `bind_tools`, which forces the LLM to select at least one tool at every turn, but this is far from a foolproof strategy. In addition, another downside is that the agent might call *multiple* tools, so we need to check for this explicitly in our routing function (or, if we are using OpenAI, we can set `parallel_tool_calls=False` to ensure only one tool is called at a time).
**Option 2**
The second way you can force your tool calling agent to have structured output is to use a second LLM (in this case `model_with_structured_output`) to respond to the user.
In this case, you will define a basic ReAct agent normally, but instead of having the `agent` node choose between the `tools` node and ending the conversation, the `agent` node will choose between the `tools` node and the `respond` node. The `respond` node will contain a second LLM that uses structured output, and once called will return directly to the user. You can think of this method as basic ReAct with one extra step before responding to the user.
**Pros and Cons**
The benefit of this method is that it guarantees structured output (as long as `.with_structured_output` works as expected with the LLM). The downside to using this approach is that it requires making an additional LLM call before responding to the user, which can increase costs as well as latency. In addition, by not providing the `agent` node LLM with information about the desired output schema there is a risk that the `agent` LLM will fail to call the correct tools required to answer in the correct output schema.
Note that both of these options will follow the exact same graph structure (see the diagram above), in that they are both exact replicas of the basic ReAct architecture but with a `respond` node before the end.
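To make the OpenAI-specific knob mentioned under Option 1 concrete, here is a hedged sketch (the rest of this guide uses Anthropic, and the weather tool below is only a placeholder for the sketch):
```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def get_weather(city: str) -> str:
    """Placeholder tool used only for this sketch."""
    return f"It's sunny in {city}"


# Force the model to call at least one tool per turn and disallow parallel
# tool calls, so the routing logic only ever has to inspect a single call.
oai_model = ChatOpenAI(model="gpt-4o", temperature=0)
oai_model_with_tools = oai_model.bind_tools(
    [get_weather],
    tool_choice="any",
    parallel_tool_calls=False,
)
```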
## Setup
First, let's install the required packages and set our API keys
```shell
pip install -U langgraph langchain_anthropic
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("ANTHROPIC_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started [here](https://docs.smith.langchain.com).
## Define model, tools, and graph state
Now we can define how we want to structure our output, define our graph state, and also our tools and the models we are going to use.
To use structured output, we will use the `with_structured_output` method from LangChain, which you can read more about [here](https://python.langchain.com/docs/how_to/structured_output/).
We are going to use a single tool in this example for finding the weather, and will return a structured weather response to the user.
```python
from pydantic import BaseModel, Field
from typing import Literal
from langchain_core.tools import tool
from langchain_anthropic import ChatAnthropic
from langgraph.graph import MessagesState
class WeatherResponse(BaseModel):
"""Respond to the user with this"""
temperature: float = Field(description="The temperature in fahrenheit")
wind_directon: str = Field(
description="The direction of the wind in abbreviated form"
)
wind_speed: float = Field(description="The speed of the wind in km/h")
# Inherit 'messages' key from MessagesState, which is a list of chat messages
class AgentState(MessagesState):
# Final structured response from the agent
final_response: WeatherResponse
@tool
def get_weather(city: Literal["nyc", "sf"]):
"""Use this to get weather information."""
if city == "nyc":
return "It is cloudy in NYC, with 5 mph winds in the North-East direction and a temperature of 70 degrees"
elif city == "sf":
return "It is 75 degrees and sunny in SF, with 3 mph winds in the South-East direction"
else:
raise AssertionError("Unknown city")
tools = [get_weather]
model = ChatAnthropic(model="claude-3-opus-20240229")
model_with_tools = model.bind_tools(tools)
model_with_structured_output = model.with_structured_output(WeatherResponse)
```
## Option 1: Bind output as tool
Let's now examine how we would use the single LLM option.
### Define Graph
The graph definition is very similar to the one above; the only difference is that we no longer call an LLM in the `respond` node, and instead bind the `WeatherResponse` tool to our LLM that already contains the `get_weather` tool.
```python
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
tools = [get_weather, WeatherResponse]
# Force the model to use tools by passing tool_choice="any"
model_with_response_tool = model.bind_tools(tools, tool_choice="any")
# Define the function that calls the model
def call_model(state: AgentState):
response = model_with_response_tool.invoke(state["messages"])
# We return a list, because this will get added to the existing list
return {"messages": [response]}
# Define the function that responds to the user
def respond(state: AgentState):
# Construct the final answer from the arguments of the last tool call
weather_tool_call = state["messages"][-1].tool_calls[0]
response = WeatherResponse(**weather_tool_call["args"])
# Since we're using tool calling to return structured output,
# we need to add a tool message corresponding to the WeatherResponse tool call,
# This is due to LLM providers' requirement that AI messages with tool calls
# need to be followed by a tool message for each tool call
tool_message = {
"type": "tool",
"content": "Here is your structured response",
"tool_call_id": weather_tool_call["id"],
}
# We return the final answer
return {"final_response": response, "messages": [tool_message]}
# Define the function that determines whether to continue or not
def should_continue(state: AgentState):
messages = state["messages"]
last_message = messages[-1]
# If there is only one tool call and it is the response tool call we respond to the user
if (
len(last_message.tool_calls) == 1
and last_message.tool_calls[0]["name"] == "WeatherResponse"
):
return "respond"
# Otherwise we will use the tool node again
else:
return "continue"
# Define a new graph
workflow = StateGraph(AgentState)
# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("respond", respond)
workflow.add_node("tools", ToolNode(tools))
# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")
# We now add a conditional edge
workflow.add_conditional_edges(
"agent",
should_continue,
{
"continue": "tools",
"respond": "respond",
},
)
workflow.add_edge("tools", "agent")
workflow.add_edge("respond", END)
graph = workflow.compile()
```
### Usage
Now we can run our graph to check that it worked as intended:
```python
answer = graph.invoke(input={"messages": [("human", "what's the weather in SF?")]})[
"final_response"
]
```
```python
answer
```
```output
WeatherResponse(temperature=75.0, wind_directon='SE', wind_speed=3.0)
```
Again, the agent returned a `WeatherResponse` object as we expected.
## Option 2: 2 LLMs
Let's now dive into how we would use a second LLM to force structured output.
### Define Graph
We can now define our graph:
```python
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_core.messages import HumanMessage
# Define the function that calls the model
def call_model(state: AgentState):
response = model_with_tools.invoke(state["messages"])
# We return a list, because this will get added to the existing list
return {"messages": [response]}
# Define the function that responds to the user
def respond(state: AgentState):
# We call the model with structured output in order to return the same format to the user every time
# state['messages'][-2] is the last ToolMessage in the convo, which we convert to a HumanMessage for the model to use
# We could also pass the entire chat history, but this saves tokens since all we care to structure is the output of the tool
response = model_with_structured_output.invoke(
[HumanMessage(content=state["messages"][-2].content)]
)
# We return the final answer
return {"final_response": response}
# Define the function that determines whether to continue or not
def should_continue(state: AgentState):
messages = state["messages"]
last_message = messages[-1]
# If there is no function call, then we respond to the user
if not last_message.tool_calls:
return "respond"
# Otherwise if there is, we continue
else:
return "continue"
# Define a new graph
workflow = StateGraph(AgentState)
# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("respond", respond)
workflow.add_node("tools", ToolNode(tools))
# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")
# We now add a conditional edge
workflow.add_conditional_edges(
"agent",
should_continue,
{
"continue": "tools",
"respond": "respond",
},
)
workflow.add_edge("tools", "agent")
workflow.add_edge("respond", END)
graph = workflow.compile()
```
### Usage
We can now invoke our graph to verify that the output is being structured as desired:
```python
answer = graph.invoke(input={"messages": [("human", "what's the weather in SF?")]})[
"final_response"
]
```
```python
answer
```
```output
WeatherResponse(temperature=75.0, wind_directon='SE', wind_speed=4.83)
```
As we can see, the agent returned a `WeatherResponse` object as we expected. It would now be easy to use this agent in a more complex software stack without having to worry about the output of the agent not matching the format expected from the next step in the stack.
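Because `answer` is a Pydantic model, downstream code can access its typed fields directly. A trivial, hedged illustration:
```python
# Downstream code can rely on the structured fields being present and typed.
print(f"Temperature: {answer.temperature}F, wind speed: {answer.wind_speed} km/h")
```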
---
how-tos/persistence-functional.ipynb
---
# How to add thread-level persistence (functional API)
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- Functional API
- Persistence
- Memory
- [Chat Models](https://python.langchain.com/docs/concepts/chat_models/)
!!! info "Not needed for LangGraph API users"
If you're using the LangGraph API, you needn't manually implement a checkpointer. The API automatically handles checkpointing for you. This guide is relevant when implementing LangGraph in your own custom server.
Many AI applications need memory to share context across multiple interactions on the same thread (e.g., multiple turns of a conversation). In the LangGraph functional API, this kind of memory can be added to any [entrypoint()][langgraph.func.entrypoint] workflow using [thread-level persistence](https://langchain-ai.github.io/langgraph/concepts/persistence).
When creating a LangGraph workflow, you can set it up to persist its results by using a [checkpointer](https://langchain-ai.github.io/langgraph/reference/checkpoints/#basecheckpointsaver):
1. Create an instance of a checkpointer:
```python
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
```
2. Pass `checkpointer` instance to the `entrypoint()` decorator:
```python
from langgraph.func import entrypoint
@entrypoint(checkpointer=checkpointer)
def workflow(inputs):
...
```
3. Optionally expose `previous` parameter in the workflow function signature:
```python
@entrypoint(checkpointer=checkpointer)
def workflow(
inputs,
*,
# you can optionally specify `previous` in the workflow function signature
# to access the return value from the workflow as of the last execution
previous
):
previous = previous or []
combined_inputs = previous + inputs
result = do_something(combined_inputs)
...
```
4. Optionally choose which values will be returned from the workflow and which will be saved by the checkpointer as `previous`:
```python
@entrypoint(checkpointer=checkpointer)
def workflow(inputs, *, previous):
...
result = do_something(...)
return entrypoint.final(value=result, save=combine(inputs, result))
```
This guide shows how you can add thread-level persistence to your workflow.
!!! tip "Note"
If you need memory that is __shared__ across multiple conversations or users (cross-thread persistence), check out this how-to guide.
!!! tip "Note"
If you need to add thread-level persistence to a `StateGraph`, check out this how-to guide.
## Setup
First, we need to install the required packages:
```shell
pip install --quiet -U langgraph langchain_anthropic
```
Next, we need to set the API key for Anthropic (the LLM provider we will use).
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("ANTHROPIC_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started [here](https://docs.smith.langchain.com).
## Example: simple chatbot with short-term memory
We will be using a workflow with a single task that calls a [chat model](https://python.langchain.com/docs/concepts/chat_models/).
Let's first define the model we'll be using:
```python
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
```
Now we can define our task and workflow. To add in persistence, we need to pass in a [Checkpointer](https://langchain-ai.github.io/langgraph/reference/checkpoints/#langgraph.checkpoint.base.BaseCheckpointSaver) to the [entrypoint()][langgraph.func.entrypoint] decorator.
```python
from langchain_core.messages import BaseMessage
from langgraph.graph import add_messages
from langgraph.func import entrypoint, task
from langgraph.checkpoint.memory import MemorySaver
@task
def call_model(messages: list[BaseMessage]):
response = model.invoke(messages)
return response
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def workflow(inputs: list[BaseMessage], *, previous: list[BaseMessage]):
if previous:
inputs = add_messages(previous, inputs)
response = call_model(inputs).result()
return entrypoint.final(value=response, save=add_messages(inputs, response))
```
If we try to use this workflow, the context of the conversation will be persisted across interactions:
!!! note "Note"
If you're using LangGraph Platform or LangGraph Studio, you __don't need__ to pass checkpointer to the entrypoint decorator, since it's done automatically.
We can now interact with the agent and see that it remembers previous messages!
```python
config = {"configurable": {"thread_id": "1"}}
input_message = {"role": "user", "content": "hi! I'm bob"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
```
```output
================================== Ai Message ==================================
Hi Bob! I'm Claude. Nice to meet you! How are you today?
```
You can always resume previous threads:
```python
input_message = {"role": "user", "content": "what's my name?"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
```
```output
================================== Ai Message ==================================
Your name is Bob.
```
If we want to start a new conversation, we can pass in a different `thread_id`. Poof! All the memories are gone!
```python
input_message = {"role": "user", "content": "what's my name?"}
for chunk in workflow.stream(
[input_message],
{"configurable": {"thread_id": "2"}},
stream_mode="values",
):
chunk.pretty_print()
```
```output
================================== Ai Message ==================================
I don't know your name unless you tell me. Each conversation I have starts fresh, so I don't have access to any previous interactions or personal information unless you share it with me.
```
!!! tip "Streaming tokens"
If you would like to stream LLM tokens from your chatbot, you can use `stream_mode="messages"`. Check out this how-to guide to learn more.
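As a minimal, hedged sketch of what that looks like with this workflow (reusing the `config` from above; `stream_mode="messages"` yields `(message_chunk, metadata)` tuples):
```python
# Stream LLM tokens as they are generated instead of only final values.
input_message = {"role": "user", "content": "Tell me a one-line joke"}
for msg_chunk, metadata in workflow.stream(
    [input_message], config, stream_mode="messages"
):
    # msg_chunk is an AIMessageChunk; its content may be a string or a list of
    # content blocks depending on the provider.
    print(msg_chunk.content, end="|", flush=True)
```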
---
how-tos/autogen-langgraph-platform.ipynb
---
# How to use LangGraph Platform to deploy CrewAI, AutoGen, and other frameworks
[LangGraph Platform](https://langchain-ai.github.io/langgraph/concepts/langgraph_platform/) provides infrastructure for deploying agents. This integrates seamlessly with LangGraph, but can also work with other frameworks. The way to make this work is to wrap the agent in a single LangGraph node, and have that be the entire graph.
Doing so allows you to deploy to LangGraph Platform and get many of its [benefits](https://langchain-ai.github.io/langgraph/concepts/langgraph_platform/): horizontally scalable infrastructure, a task queue to handle bursty operations, a persistence layer to power short-term memory, and long-term memory support.
In this guide we show how to do this with an AutoGen agent, but this method should work for agents defined in other frameworks like CrewAI, LlamaIndex, and others as well.
## Setup
```python
%pip install autogen langgraph
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
## Define AutoGen agent
Here we define our AutoGen agent, adapted from https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_web_info.ipynb.
```python
import autogen
import os
config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]
llm_config = {
"timeout": 600,
"cache_seed": 42,
"config_list": config_list,
"temperature": 0,
}
autogen_agent = autogen.AssistantAgent(
name="assistant",
llm_config=llm_config,
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={
"work_dir": "web",
"use_docker": False,
}, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
llm_config=llm_config,
system_message="Reply TERMINATE if the task has been solved at full satisfaction. Otherwise, reply CONTINUE, or the reason why the task is not solved yet.",
)
```
## Wrap in LangGraph
We now wrap the AutoGen agent in a single LangGraph node, and make that the entire graph.
The main thing this involves is defining an input and output schema for the node, which you would need to do if deploying this manually anyway, so it's no extra work.
```python
from langgraph.graph import StateGraph, MessagesState
def call_autogen_agent(state: MessagesState):
last_message = state["messages"][-1]
response = user_proxy.initiate_chat(autogen_agent, message=last_message.content)
# get the final response from the agent
content = response.chat_history[-1]["content"]
return {"messages": {"role": "assistant", "content": content}}
graph = StateGraph(MessagesState)
graph.add_node(call_autogen_agent)
graph.set_entry_point("call_autogen_agent")
graph = graph.compile()
```
## Deploy with LangGraph Platform
You can now deploy this as you normally would with LangGraph Platform. See [these instructions](https://langchain-ai.github.io/langgraph/concepts/deployment_options/) for more details.
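Before deploying, it can be useful to smoke test the wrapped graph locally. A minimal, hedged sketch (the question is just an example):
```python
# Quick local check that the wrapped AutoGen agent runs end to end.
result = graph.invoke(
    {
        "messages": [
            {"role": "user", "content": "Find numbers between 10 and 30 in fibonacci sequence"}
        ]
    }
)
print(result["messages"][-1].content)
```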
---
how-tos/multi-agent-network-functional.ipynb
---
# How to build a multi-agent network (functional API)
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- Multi-agent systems
- Functional API
- Command
- LangGraph Glossary
In this how-to guide we will demonstrate how to implement a multi-agent network architecture where each agent can communicate with every other agent (many-to-many connections) and can decide which agent to call next. We will be using the functional API: individual agents will be defined as tasks, and the agent handoffs will be defined in the main [entrypoint()][langgraph.func.entrypoint]:
```python
from langgraph.func import entrypoint, task
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool
# Define a tool to signal intent to hand off to a different agent
@tool(return_direct=True)
def transfer_to_hotel_advisor():
"""Ask hotel advisor agent for help."""
return "Successfully transferred to hotel advisor"
# define an agent
travel_advisor_tools = [transfer_to_hotel_advisor, ...]
travel_advisor = create_react_agent(model, travel_advisor_tools)
# define a task that calls an agent
@task
def call_travel_advisor(messages):
response = travel_advisor.invoke({"messages": messages})
return response["messages"]
# define the multi-agent network workflow
@entrypoint()
def workflow(messages):
call_active_agent = call_travel_advisor
while True:
agent_messages = call_active_agent(messages).result()
messages = messages + agent_messages
call_active_agent = get_next_agent(messages)
return messages
```
## Setup
First, let's install the required packages
```shell
pip install -U langgraph langchain-anthropic
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("ANTHROPIC_API_KEY")
```
```output
ANTHROPIC_API_KEY: ········
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started [here](https://docs.smith.langchain.com).
## Travel agent example
In this example we will build a team of travel assistant agents that can communicate with each other.
We will create 2 agents:
* `travel_advisor`: can help with travel destination recommendations. Can ask `hotel_advisor` for help.
* `hotel_advisor`: can help with hotel recommendations. Can ask `travel_advisor` for help.
This is a fully-connected network - every agent can talk to any other agent.
First, let's create some of the tools that the agents will be using:
```python
import random
from typing_extensions import Literal
from langchain_core.tools import tool
@tool
def get_travel_recommendations():
"""Get recommendation for travel destinations"""
return random.choice(["aruba", "turks and caicos"])
@tool
def get_hotel_recommendations(location: Literal["aruba", "turks and caicos"]):
"""Get hotel recommendations for a given destination."""
return {
"aruba": [
"The Ritz-Carlton, Aruba (Palm Beach)"
"Bucuti & Tara Beach Resort (Eagle Beach)"
],
"turks and caicos": ["Grace Bay Club", "COMO Parrot Cay"],
}[location]
@tool(return_direct=True)
def transfer_to_hotel_advisor():
"""Ask hotel advisor agent for help."""
return "Successfully transferred to hotel advisor"
@tool(return_direct=True)
def transfer_to_travel_advisor():
"""Ask travel advisor agent for help."""
return "Successfully transferred to travel advisor"
```
!!! note "Transfer tools"
You might have noticed that we're using `@tool(return_direct=True)` in the transfer tools. This is done so that individual agents (e.g., `travel_advisor`) can exit the ReAct loop early once these tools are called. This is the desired behavior, as we want to detect when the agent calls this tool and hand control off _immediately_ to a different agent.
**NOTE**: This is meant to work with the prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] -- if you are building a custom agent, make sure to manually add logic for handling early exit for tools that are marked with `return_direct`.
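If you do roll your own agent loop, a rough sketch of that early-exit check might look like the following (assuming the usual `tool_calls` dict format on the AI message and tool objects that expose `return_direct`):
```python
# Hedged sketch: detect whether any tool requested by the model is marked
# `return_direct`, so a custom agent loop can stop after executing it.
def should_exit_early(ai_message, tools_by_name):
    return any(
        tools_by_name[tool_call["name"]].return_direct
        for tool_call in ai_message.tool_calls
    )
```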
Now let's define our agent tasks and combine them into a single multi-agent network workflow:
```python
from langchain_core.messages import AIMessage
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langgraph.graph import add_messages
from langgraph.func import entrypoint, task
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
# Define travel advisor ReAct agent
travel_advisor_tools = [
get_travel_recommendations,
transfer_to_hotel_advisor,
]
travel_advisor = create_react_agent(
model,
travel_advisor_tools,
prompt=(
"You are a general travel expert that can recommend travel destinations (e.g. countries, cities, etc). "
"If you need hotel recommendations, ask 'hotel_advisor' for help. "
"You MUST include human-readable response before transferring to another agent."
),
)
@task
def call_travel_advisor(messages):
# You can also add additional logic like changing the input to the agent / output from the agent, etc.
# NOTE: we're invoking the ReAct agent with the full history of messages in the state
response = travel_advisor.invoke({"messages": messages})
return response["messages"]
# Define hotel advisor ReAct agent
hotel_advisor_tools = [get_hotel_recommendations, transfer_to_travel_advisor]
hotel_advisor = create_react_agent(
model,
hotel_advisor_tools,
prompt=(
"You are a hotel expert that can provide hotel recommendations for a given destination. "
"If you need help picking travel destinations, ask 'travel_advisor' for help."
"You MUST include human-readable response before transferring to another agent."
),
)
@task
def call_hotel_advisor(messages):
response = hotel_advisor.invoke({"messages": messages})
return response["messages"]
@entrypoint()
def workflow(messages):
messages = add_messages([], messages)
call_active_agent = call_travel_advisor
while True:
agent_messages = call_active_agent(messages).result()
messages = add_messages(messages, agent_messages)
ai_msg = next(m for m in reversed(agent_messages) if isinstance(m, AIMessage))
if not ai_msg.tool_calls:
break
tool_call = ai_msg.tool_calls[-1]
if tool_call["name"] == "transfer_to_travel_advisor":
call_active_agent = call_travel_advisor
elif tool_call["name"] == "transfer_to_hotel_advisor":
call_active_agent = call_hotel_advisor
else:
raise ValueError(f"Expected transfer tool, got '{tool_call['name']}'")
return messages
```
Lastly, let's define a helper to render the agent outputs:
```python
from langchain_core.messages import convert_to_messages
def pretty_print_messages(update):
if isinstance(update, tuple):
ns, update = update
# skip parent graph updates in the printouts
if len(ns) == 0:
return
graph_id = ns[-1].split(":")[0]
print(f"Update from subgraph {graph_id}:")
print("\n")
for node_name, node_update in update.items():
print(f"Update from node {node_name}:")
print("\n")
for m in convert_to_messages(node_update["messages"]):
m.pretty_print()
print("\n")
```
Let's test it out using the same input as our original multi-agent system:
```python
for chunk in workflow.stream(
[
{
"role": "user",
"content": "i wanna go somewhere warm in the caribbean. pick one destination and give me hotel recommendations",
}
],
subgraphs=True,
):
pretty_print_messages(chunk)
```
```output
Update from subgraph call_travel_advisor:
Update from node agent:
================================== Ai Message ==================================
[{'text': "I'll help you find a warm Caribbean destination and then get some hotel recommendations for you.\n\nLet me first get some destination recommendations for the Caribbean region.", 'type': 'text'}, {'id': 'toolu_015vT8PkPq1VXvjrDvSpWUwJ', 'input': {}, 'name': 'get_travel_recommendations', 'type': 'tool_use'}]
Tool Calls:
get_travel_recommendations (toolu_015vT8PkPq1VXvjrDvSpWUwJ)
Call ID: toolu_015vT8PkPq1VXvjrDvSpWUwJ
Args:
Update from subgraph call_travel_advisor:
Update from node tools:
================================= Tool Message =================================
Name: get_travel_recommendations
turks and caicos
Update from subgraph call_travel_advisor:
Update from node agent:
================================== Ai Message ==================================
[{'text': "Based on the recommendation, I suggest Turks and Caicos! This beautiful British Overseas Territory is known for its stunning white-sand beaches, crystal-clear turquoise waters, and year-round warm weather. Grace Bay Beach in Providenciales is consistently ranked among the world's best beaches. The islands offer excellent snorkeling, diving, and water sports opportunities, plus a relaxed Caribbean atmosphere.\n\nNow, let me connect you with our hotel advisor to get some specific hotel recommendations for Turks and Caicos.", 'type': 'text'}, {'id': 'toolu_01JY7pNNWFuaWoe9ymxFYiPV', 'input': {}, 'name': 'transfer_to_hotel_advisor', 'type': 'tool_use'}]
Tool Calls:
transfer_to_hotel_advisor (toolu_01JY7pNNWFuaWoe9ymxFYiPV)
Call ID: toolu_01JY7pNNWFuaWoe9ymxFYiPV
Args:
Update from subgraph call_travel_advisor:
Update from node tools:
================================= Tool Message =================================
Name: transfer_to_hotel_advisor
Successfully transferred to hotel advisor
Update from subgraph call_hotel_advisor:
Update from node agent:
================================== Ai Message ==================================
[{'text': 'Let me get some hotel recommendations for Turks and Caicos:', 'type': 'text'}, {'id': 'toolu_0129ELa7jFocn16bowaGNapg', 'input': {'location': 'turks and caicos'}, 'name': 'get_hotel_recommendations', 'type': 'tool_use'}]
Tool Calls:
get_hotel_recommendations (toolu_0129ELa7jFocn16bowaGNapg)
Call ID: toolu_0129ELa7jFocn16bowaGNapg
Args:
location: turks and caicos
Update from subgraph call_hotel_advisor:
Update from node tools:
================================= Tool Message =================================
Name: get_hotel_recommendations
["Grace Bay Club", "COMO Parrot Cay"]
Update from subgraph call_hotel_advisor:
Update from node agent:
================================== Ai Message ==================================
Here are two excellent hotel options in Turks and Caicos:
1. Grace Bay Club: This luxury resort is located on the world-famous Grace Bay Beach. It offers all-oceanfront suites, exceptional dining options, and personalized service. The resort features adult-only and family-friendly sections, making it perfect for any type of traveler.
2. COMO Parrot Cay: This exclusive private island resort offers the ultimate luxury escape. It's known for its pristine beach, world-class spa, and holistic wellness programs. The resort provides an intimate, secluded experience with top-notch amenities and service.
Would you like more specific information about either of these properties or would you like to explore hotels in another destination?
```
Voila! `travel_advisor` picks a destination and then decides to call `hotel_advisor` for more info.
---
how-tos/react-agent-from-scratch.ipynb
---
# How to create a ReAct agent from scratch
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- Tool calling agent
- [Chat Models](https://python.langchain.com/docs/concepts/chat_models/)
- [Messages](https://python.langchain.com/docs/concepts/messages/)
- LangGraph Glossary
Using the prebuilt ReAct agent [create_react_agent][langgraph.prebuilt.chat_agent_executor.create_react_agent] is a great way to get started, but sometimes you might want more control and customization. In those cases, you can create a custom ReAct agent. This guide shows how to implement a ReAct agent from scratch using LangGraph.
## Setup
First, let's install the required packages and set our API keys:
```shell
pip install -U langgraph langchain-openai
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started in the docs.
## Create ReAct agent
Now that you have installed the required packages and set your environment variables, we can code our ReAct agent!
### Define graph state
We are going to define the most basic ReAct state in this example, which will just contain a list of messages.
For your specific use case, feel free to add any other state keys that you need.
```python
from typing import (
Annotated,
Sequence,
TypedDict,
)
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
class AgentState(TypedDict):
"""The state of the agent."""
# add_messages is a reducer
# See https://langchain-ai.github.io/langgraph/concepts/low_level/#reducers
messages: Annotated[Sequence[BaseMessage], add_messages]
```
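As noted above, you can extend this state with whatever keys your use case needs. A purely hypothetical extension might look like the following (the extra keys are illustrative and not used later in this guide):
```python
from typing import Annotated, Sequence, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages


# Hypothetical extension of AgentState with extra, illustrative keys
class ExtendedAgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
    step_count: int  # e.g., how many model/tool rounds have run so far
    user_name: str   # e.g., per-session metadata
```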
### Define model and tools
Next, let's define the tools and model we will use for our example.
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
model = ChatOpenAI(model="gpt-4o-mini")
@tool
def get_weather(location: str):
"""Call to get the weather from a specific location."""
# This is a placeholder for the actual implementation
# Don't let the LLM know this though 😊
if any([city in location.lower() for city in ["sf", "san francisco"]]):
return "It's sunny in San Francisco, but you better look out if you're a Gemini 😈."
else:
return f"I am not sure what the weather is in {location}"
tools = [get_weather]
model = model.bind_tools(tools)
```
### Define nodes and edges
Next let's define our nodes and edges. In our basic ReAct agent there are only two nodes, one for calling the model and one for using tools; however, you can modify this basic structure to work better for your use case. The tool node we define here is a simplified version of the prebuilt [`ToolNode`](https://langchain-ai.github.io/langgraph/how-tos/tool-calling/), which has some additional features.
Perhaps you want to add a node for [adding structured output](https://langchain-ai.github.io/langgraph/how-tos/react-agent-structured-output/) or a node for executing some external action (sending an email, adding a calendar event, etc.). Maybe you just want to change the way the `call_model` node works and how `should_continue` decides whether to call tools - the possibilities are endless and LangGraph makes it easy to customize this basic structure for your specific use case.
```python
import json
from langchain_core.messages import ToolMessage, SystemMessage
from langchain_core.runnables import RunnableConfig
tools_by_name = {tool.name: tool for tool in tools}
# Define our tool node
def tool_node(state: AgentState):
outputs = []
for tool_call in state["messages"][-1].tool_calls:
tool_result = tools_by_name[tool_call["name"]].invoke(tool_call["args"])
outputs.append(
ToolMessage(
content=json.dumps(tool_result),
name=tool_call["name"],
tool_call_id=tool_call["id"],
)
)
return {"messages": outputs}
# Define the node that calls the model
def call_model(
state: AgentState,
config: RunnableConfig,
):
# this is similar to customizing the create_react_agent with 'prompt' parameter, but is more flexible
system_prompt = SystemMessage(
"You are a helpful AI assistant, please respond to the users query to the best of your ability!"
)
response = model.invoke([system_prompt] + state["messages"], config)
# We return a list, because this will get added to the existing list
return {"messages": [response]}
# Define the conditional edge that determines whether to continue or not
def should_continue(state: AgentState):
messages = state["messages"]
last_message = messages[-1]
# If there is no function call, then we finish
if not last_message.tool_calls:
return "end"
# Otherwise if there is, we continue
else:
return "continue"
```
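As one example of the customization mentioned above, a sketch of a `should_continue` variant that also stops after a fixed number of tool rounds might look like this (the `step_count` key is hypothetical and would need to be added to `AgentState` and incremented in a node):
```python
# Hypothetical variant: finish when there are no tool calls OR after a step budget.
# Assumes AgentState carries an extra integer key such as "step_count" (not defined above).
MAX_STEPS = 5


def should_continue_with_budget(state):
    last_message = state["messages"][-1]
    if not last_message.tool_calls:
        return "end"
    if state.get("step_count", 0) >= MAX_STEPS:
        return "end"
    return "continue"
```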
### Define the graph
Now that we have defined all of our nodes and edges, we can define and compile our graph. Depending on whether you have added more nodes or different edges, you will need to edit this to fit your specific use case.
```python
from langgraph.graph import StateGraph, END
# Define a new graph
workflow = StateGraph(AgentState)
# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("tools", tool_node)
# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")
# We now add a conditional edge
workflow.add_conditional_edges(
# First, we define the start node. We use `agent`.
# This means these are the edges taken after the `agent` node is called.
"agent",
# Next, we pass in the function that will determine which node is called next.
should_continue,
# Finally we pass in a mapping.
# The keys are strings, and the values are other nodes.
# END is a special node marking that the graph should finish.
# What will happen is we will call `should_continue`, and then the output of that
# will be matched against the keys in this mapping.
# Based on which one it matches, that node will then be called.
{
# If `tools`, then we call the tool node.
"continue": "tools",
# Otherwise we finish.
"end": END,
},
)
# We now add a normal edge from `tools` to `agent`.
# This means that after `tools` is called, `agent` node is called next.
workflow.add_edge("tools", "agent")
# Now we can compile and visualize our graph
graph = workflow.compile()
from IPython.display import Image, display
try:
display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
# This requires some extra dependencies and is optional
pass
```
## Use ReAct agent
Now that we have created our ReAct agent, let's actually put it to the test!
```python
# Helper function for formatting the stream nicely
def print_stream(stream):
for s in stream:
message = s["messages"][-1]
if isinstance(message, tuple):
print(message)
else:
message.pretty_print()
inputs = {"messages": [("user", "what is the weather in sf")]}
print_stream(graph.stream(inputs, stream_mode="values"))
```
```output
================================ Human Message =================================
what is the weather in sf
================================== Ai Message ==================================
Tool Calls:
get_weather (call_azW0cQ4XjWWj0IAkWAxq9nLB)
Call ID: call_azW0cQ4XjWWj0IAkWAxq9nLB
Args:
location: San Francisco
================================= Tool Message =================================
Name: get_weather
"It's sunny in San Francisco, but you better look out if you're a Gemini \ud83d\ude08."
================================== Ai Message ==================================
The weather in San Francisco is sunny! However, it seems there's a playful warning for Geminis. Enjoy the sunshine!
```
Perfect! The graph correctly calls the `get_weather` tool and responds to the user after receiving the information from the tool.
---
how-tos/many-tools.ipynb
---
# How to handle large numbers of tools
!!! info "Prerequisites"
This guide assumes familiarity with the following:
The subset of available tools to call is generally at the discretion of the model (although many providers also enable the user to [specify or constrain the choice of tool](https://python.langchain.com/docs/how_to/tool_choice/)). As the number of available tools grows, you may want to limit the scope of the LLM's selection, to decrease token consumption and to help manage sources of error in LLM reasoning.
Here we will demonstrate how to dynamically adjust the tools available to a model. Bottom line up front: like [RAG](https://python.langchain.com/docs/concepts/#retrieval) and similar methods, we precede the model invocation with a retrieval step over the available tools. Although we demonstrate one implementation that searches over tool descriptions, the details of the tool selection can be customized as needed.
## Setup
First, let's install the required packages and set our API keys
```shell
pip install --quiet -U langgraph langchain_openai numpy
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started here.
## Define the tools
Let's consider a toy example in which we have one tool for each publicly traded company in the [S&P 500 index](https://en.wikipedia.org/wiki/S%26P_500). Each tool fetches company-specific information based on the year provided as a parameter.
We first construct a registry that associates a unique identifier with each tool. We will represent the tools as LangChain `StructuredTool` objects, which can be bound directly to chat models that support tool calling.
```python
import re
import uuid
from langchain_core.tools import StructuredTool
def create_tool(company: str) -> StructuredTool:
"""Create a placeholder tool for a given company."""
# Remove non-alphanumeric characters and replace spaces with underscores for the tool name
formatted_company = re.sub(r"[^\w\s]", "", company).replace(" ", "_")
def company_tool(year: int) -> str:
# Placeholder function returning static revenue information for the company and year
return f"{company} had revenues of $100 in {year}."
return StructuredTool.from_function(
company_tool,
name=formatted_company,
description=f"Information about {company}",
)
# Abbreviated list of S&P 500 companies for demonstration
s_and_p_500_companies = [
"3M",
"A.O. Smith",
"Abbott",
"Accenture",
"Advanced Micro Devices",
"Yum! Brands",
"Zebra Technologies",
"Zimmer Biomet",
"Zoetis",
]
# Create a tool for each company and store it in a registry with a unique UUID as the key
tool_registry = {
str(uuid.uuid4()): create_tool(company) for company in s_and_p_500_companies
}
```
## Define the graph
### Tool selection
We will construct a node that retrieves a subset of available tools given the information in the state, such as a recent user message. In general, the full scope of [retrieval solutions](https://python.langchain.com/docs/concepts/#retrieval) is available for this step. As a simple solution, we index embeddings of tool descriptions in a vector store and associate user queries to tools via semantic search.
```python
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings
tool_documents = [
Document(
page_content=tool.description,
id=id,
metadata={"tool_name": tool.name},
)
for id, tool in tool_registry.items()
]
vector_store = InMemoryVectorStore(embedding=OpenAIEmbeddings())
document_ids = vector_store.add_documents(tool_documents)
```
### Incorporating with an agent
We will use a typical ReAct agent graph (e.g., as used in the [quickstart](https://langchain-ai.github.io/langgraph/tutorials/introduction/#part-2-enhancing-the-chatbot-with-tools)), with some modifications:
- We add a `selected_tools` key to the state, which stores our selected subset of tools;
- We set the entry point of the graph to be a `select_tools` node, which populates this element of the state;
- We bind the selected subset of tools to the chat model within the `agent` node.
```python
from typing import Annotated
from langchain_openai import ChatOpenAI
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
# Define the state structure using TypedDict.
# It includes a list of messages (processed by add_messages)
# and a list of selected tool IDs.
class State(TypedDict):
messages: Annotated[list, add_messages]
selected_tools: list[str]
builder = StateGraph(State)
# Retrieve all available tools from the tool registry.
tools = list(tool_registry.values())
llm = ChatOpenAI()
# The agent function processes the current state
# by binding selected tools to the LLM.
def agent(state: State):
# Map tool IDs to actual tools
# based on the state's selected_tools list.
selected_tools = [tool_registry[id] for id in state["selected_tools"]]
# Bind the selected tools to the LLM for the current interaction.
llm_with_tools = llm.bind_tools(selected_tools)
# Invoke the LLM with the current messages and return the updated message list.
return {"messages": [llm_with_tools.invoke(state["messages"])]}
# The select_tools function selects tools based on the user's last message content.
def select_tools(state: State):
last_user_message = state["messages"][-1]
query = last_user_message.content
tool_documents = vector_store.similarity_search(query)
return {"selected_tools": [document.id for document in tool_documents]}
builder.add_node("agent", agent)
builder.add_node("select_tools", select_tools)
tool_node = ToolNode(tools=tools)
builder.add_node("tools", tool_node)
builder.add_conditional_edges("agent", tools_condition, path_map=["tools", "__end__"])
builder.add_edge("tools", "agent")
builder.add_edge("select_tools", "agent")
builder.add_edge(START, "select_tools")
graph = builder.compile()
```
```python
from IPython.display import Image, display
try:
display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
# This requires some extra dependencies and is optional
pass
```
```python
user_input = "Can you give me some information about AMD in 2022?"
result = graph.invoke({"messages": [("user", user_input)]})
```
```python
print(result["selected_tools"])
```
```output
['ab9c0d59-3d16-448d-910c-73cf10a26020', 'f5eff8f6-7fb9-47b6-b54f-19872a52db84', '2962e168-9ef4-48dc-8b7c-9227e7956d39', '24a9fb82-19fe-4a88-944e-47bc4032e94a']
```
```python
for message in result["messages"]:
message.pretty_print()
```
```output
================================ Human Message =================================
Can you give me some information about AMD in 2022?
================================== Ai Message ==================================
Tool Calls:
Advanced_Micro_Devices (call_CRxQ0oT7NY7lqf35DaRNTJ35)
Call ID: call_CRxQ0oT7NY7lqf35DaRNTJ35
Args:
year: 2022
================================= Tool Message =================================
Name: Advanced_Micro_Devices
Advanced Micro Devices had revenues of $100 in 2022.
================================== Ai Message ==================================
In 2022, Advanced Micro Devices (AMD) had revenues of $100.
```
## Repeating tool selection
To manage errors from incorrect tool selection, we could revisit the `select_tools` node. One option for implementing this is to modify `select_tools` to generate the vector store query using all messages in the state (e.g., with a chat model) and add an edge routing from `tools` to `select_tools`.
We implement this change below. For demonstration purposes, we simulate an error in the initial tool selection by adding a `hack_remove_tool_condition` to the `select_tools` node, which removes the correct tool on the first iteration of the node. Note that on the second iteration, the agent finishes the run as it has access to the correct tool.
!!! note "Using Pydantic with LangChain"
This notebook uses Pydantic v2 `BaseModel`, which requires `langchain-core >= 0.3`. Using `langchain-core < 0.3` will result in errors due to mixing of Pydantic v1 and v2 `BaseModel` classes.
```python
from langchain_core.messages import HumanMessage, SystemMessage, ToolMessage
from langgraph.pregel.retry import RetryPolicy
from pydantic import BaseModel, Field
class QueryForTools(BaseModel):
"""Generate a query for additional tools."""
query: str = Field(..., description="Query for additional tools.")
def select_tools(state: State):
"""Selects tools based on the last message in the conversation state.
If the last message is from a human, directly uses the content of the message
as the query. Otherwise, constructs a query using a system message and invokes
the LLM to generate tool suggestions.
"""
last_message = state["messages"][-1]
hack_remove_tool_condition = False # Simulate an error in the first tool selection
if isinstance(last_message, HumanMessage):
query = last_message.content
hack_remove_tool_condition = True # Simulate wrong tool selection
else:
assert isinstance(last_message, ToolMessage)
system = SystemMessage(
"Given this conversation, generate a query for additional tools. "
"The query should be a short string containing what type of information "
"is needed. If no further information is needed, "
"set more_information_needed False and populate a blank string for the query."
)
input_messages = [system] + state["messages"]
response = llm.bind_tools([QueryForTools], tool_choice=True).invoke(
input_messages
)
query = response.tool_calls[0]["args"]["query"]
# Search the tool vector store using the generated query
tool_documents = vector_store.similarity_search(query)
if hack_remove_tool_condition:
# Simulate error by removing the correct tool from the selection
selected_tools = [
document.id
for document in tool_documents
if document.metadata["tool_name"] != "Advanced_Micro_Devices"
]
else:
selected_tools = [document.id for document in tool_documents]
return {"selected_tools": selected_tools}
graph_builder = StateGraph(State)
graph_builder.add_node("agent", agent)
graph_builder.add_node(
"select_tools", select_tools, retry_policy=RetryPolicy(max_attempts=3)
)
tool_node = ToolNode(tools=tools)
graph_builder.add_node("tools", tool_node)
graph_builder.add_conditional_edges(
"agent",
tools_condition,
)
graph_builder.add_edge("tools", "select_tools")
graph_builder.add_edge("select_tools", "agent")
graph_builder.add_edge(START, "select_tools")
graph = graph_builder.compile()
```
```python
from IPython.display import Image, display
try:
display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
# This requires some extra dependencies and is optional
pass
```
```python
user_input = "Can you give me some information about AMD in 2022?"
result = graph.invoke({"messages": [("user", user_input)]})
```
```python
for message in result["messages"]:
message.pretty_print()
```
```output
================================ Human Message =================================
Can you give me some information about AMD in 2022?
================================== Ai Message ==================================
Tool Calls:
Accenture (call_qGmwFnENwwzHOYJXiCAaY5Mx)
Call ID: call_qGmwFnENwwzHOYJXiCAaY5Mx
Args:
year: 2022
================================= Tool Message =================================
Name: Accenture
Accenture had revenues of $100 in 2022.
================================== Ai Message ==================================
Tool Calls:
Advanced_Micro_Devices (call_u9e5UIJtiieXVYi7Y9GgyDpn)
Call ID: call_u9e5UIJtiieXVYi7Y9GgyDpn
Args:
year: 2022
================================= Tool Message =================================
Name: Advanced_Micro_Devices
Advanced Micro Devices had revenues of $100 in 2022.
================================== Ai Message ==================================
In 2022, AMD had revenues of $100.
```
## Next steps
This guide provides a minimal implementation for dynamically selecting tools. There is a host of possible improvements and optimizations:
- **Repeating tool selection**: Here, we repeated tool selection by modifying the `select_tools` node. Another option is to equip the agent with a `reselect_tools` tool, allowing it to re-select tools at its discretion (a minimal sketch of such a tool follows this list).
- **Optimizing tool selection**: In general, the full scope of [retrieval solutions](https://python.langchain.com/docs/concepts/#retrieval) is available for tool selection. Additional options include:
- Group tools and retrieve over groups;
- Use a chat model to select tools or groups of tools.
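Below is a minimal, hypothetical sketch of such a `reselect_tools` tool, assuming the `vector_store` of tool descriptions defined earlier in this guide. In a real graph the selected IDs would be written back to state (for example, via a `Command`) rather than just returned as text:
```python
from langchain_core.tools import tool


@tool
def reselect_tools(query: str) -> str:
    """Search the tool index for additional tools relevant to the query."""
    # `vector_store` is the index of tool descriptions built earlier in this guide.
    docs = vector_store.similarity_search(query)
    # Hypothetical sketch: a real implementation would update `selected_tools`
    # in the graph state; here we only report the matching tool names.
    return ", ".join(doc.metadata["tool_name"] for doc in docs)
```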
---
how-tos/react-agent-from-scratch-functional.ipynb
---
# How to create a ReAct agent from scratch (Functional API)
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- [Chat Models](https://python.langchain.com/docs/concepts/chat_models)
- [Messages](https://python.langchain.com/docs/concepts/messages)
- [Tool Calling](https://python.langchain.com/docs/concepts/tool_calling/)
- Entrypoints and Tasks
This guide demonstrates how to implement a ReAct agent using the LangGraph Functional API.
The ReAct agent is a tool-calling agent that operates as follows:
1. Queries are issued to a chat model;
2. If the model generates no tool calls, we return the model response.
3. If the model generates tool calls, we execute the tool calls with available tools, append them as [tool messages](https://python.langchain.com/docs/concepts/messages/) to our message list, and repeat the process.
This is a simple and versatile set-up that can be extended with memory, human-in-the-loop capabilities, and other features. See the dedicated how-to guides for examples.
## Setup
First, let's install the required packages and set our API keys:
```shell
pip install -U langgraph langchain-openai
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started in the docs.
## Create ReAct agent
Now that you have installed the required packages and set your environment variables, we can create our agent.
### Define model and tools
Let's first define the tools and model we will use for our example. Here we will use a single placeholder tool that gets a description of the weather for a location.
We will use an [OpenAI](https://python.langchain.com/docs/integrations/providers/openai/) chat model for this example, but any model [supporting tool-calling](https://python.langchain.com/docs/integrations/chat/) will suffice.
```python
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
model = ChatOpenAI(model="gpt-4o-mini")
@tool
def get_weather(location: str):
"""Call to get the weather from a specific location."""
# This is a placeholder for the actual implementation
if any([city in location.lower() for city in ["sf", "san francisco"]]):
return "It's sunny!"
elif "boston" in location.lower():
return "It's rainy!"
else:
return f"I am not sure what the weather is in {location}"
tools = [get_weather]
```
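As noted above, any tool-calling chat model works here. For example, one could swap in the Anthropic model used earlier in this document (shown purely as an illustration):
```python
# Hypothetical swap: replace the OpenAI model with an Anthropic one.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-3-5-sonnet-latest")
```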
### Define tasks
We next define the tasks we will execute. Here there are two different tasks:
1. **Call model**: We want to query our chat model with a list of messages.
2. **Call tool**: If our model generates tool calls, we want to execute them.
```python
from langchain_core.messages import ToolMessage
from langgraph.func import entrypoint, task
tools_by_name = {tool.name: tool for tool in tools}
@task
def call_model(messages):
"""Call model with a sequence of messages."""
response = model.bind_tools(tools).invoke(messages)
return response
@task
def call_tool(tool_call):
tool = tools_by_name[tool_call["name"]]
observation = tool.invoke(tool_call["args"])
return ToolMessage(content=observation, tool_call_id=tool_call["id"])
```
### Define entrypoint
Our entrypoint will handle the orchestration of these two tasks. As described above, when our `call_model` task generates tool calls, the `call_tool` task will generate responses for each. We append all messages to a single messages list.
!!! tip
Note that because tasks return future-like objects, the below implementation executes tools in parallel.
```python
from langgraph.graph.message import add_messages
@entrypoint()
def agent(messages):
llm_response = call_model(messages).result()
while True:
if not llm_response.tool_calls:
break
# Execute tools
tool_result_futures = [
call_tool(tool_call) for tool_call in llm_response.tool_calls
]
tool_results = [fut.result() for fut in tool_result_futures]
# Append to message list
messages = add_messages(messages, [llm_response, *tool_results])
# Call model again
llm_response = call_model(messages).result()
return llm_response
```
## Usage
To use our agent, we invoke it with a messages list. Based on our implementation, these can be LangChain [message](https://python.langchain.com/docs/concepts/messages/) objects or OpenAI-style dicts:
```python
user_message = {"role": "user", "content": "What's the weather in san francisco?"}
print(user_message)
for step in agent.stream([user_message]):
for task_name, message in step.items():
if task_name == "agent":
continue # Just print task updates
print(f"\n{task_name}:")
message.pretty_print()
```
```output
{'role': 'user', 'content': "What's the weather in san francisco?"}
call_model:
================================== Ai Message ==================================
Tool Calls:
get_weather (call_tNnkrjnoz6MNfCHJpwfuEQ0v)
Call ID: call_tNnkrjnoz6MNfCHJpwfuEQ0v
Args:
location: san francisco
call_tool:
================================= Tool Message =================================
It's sunny!
call_model:
================================== Ai Message ==================================
The weather in San Francisco is sunny!
```
Perfect! The graph correctly calls the `get_weather` tool and responds to the user after receiving the information from the tool. Check out the LangSmith trace [here](https://smith.langchain.com/public/d5a0d5ea-bdaa-4032-911e-7db177c8141b/r).
## Add thread-level persistence
Adding thread-level persistence lets us support conversational experiences with our agent: subsequent invocations will append to the prior messages list, retaining the full conversational context.
To add thread-level persistence to our agent:
1. Select a checkpointer: here we will use MemorySaver, a simple in-memory checkpointer.
2. Update our entrypoint to accept the previous messages state as a second argument. Here, we simply append the message updates to the previous sequence of messages.
3. Choose which values will be returned from the workflow and which will be saved by the checkpointer as `previous` using `entrypoint.final` (optional)
```python hl_lines="3 6 7 8 9 30"
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
@entrypoint(checkpointer=checkpointer)
def agent(messages, previous):
if previous is not None:
messages = add_messages(previous, messages)
llm_response = call_model(messages).result()
while True:
if not llm_response.tool_calls:
break
# Execute tools
tool_result_futures = [
call_tool(tool_call) for tool_call in llm_response.tool_calls
]
tool_results = [fut.result() for fut in tool_result_futures]
# Append to message list
messages = add_messages(messages, [llm_response, *tool_results])
# Call model again
llm_response = call_model(messages).result()
# Generate final response
messages = add_messages(messages, llm_response)
return entrypoint.final(value=llm_response, save=messages)
```
We will now need to pass in a config when running our application. The config will specify an identifier for the conversational thread.
!!! tip
Read more about thread-level persistence in our concepts page and how-to guides.
```python
config = {"configurable": {"thread_id": "1"}}
```
We start a thread the same way as before, this time passing in the config:
```python hl_lines="4"
user_message = {"role": "user", "content": "What's the weather in san francisco?"}
print(user_message)
for step in agent.stream([user_message], config):
for task_name, message in step.items():
if task_name == "agent":
continue # Just print task updates
print(f"\n{task_name}:")
message.pretty_print()
```
```output
{'role': 'user', 'content': "What's the weather in san francisco?"}
call_model:
================================== Ai Message ==================================
Tool Calls:
get_weather (call_lubbUSdDofmOhFunPEZLBz3g)
Call ID: call_lubbUSdDofmOhFunPEZLBz3g
Args:
location: San Francisco
call_tool:
================================= Tool Message =================================
It's sunny!
call_model:
================================== Ai Message ==================================
The weather in San Francisco is sunny!
```
When we ask a follow-up question, the model uses the prior context to infer that we are asking about the weather:
```python
user_message = {"role": "user", "content": "How does it compare to Boston, MA?"}
print(user_message)
for step in agent.stream([user_message], config):
for task_name, message in step.items():
if task_name == "agent":
continue # Just print task updates
print(f"\n{task_name}:")
message.pretty_print()
```
```output
{'role': 'user', 'content': 'How does it compare to Boston, MA?'}
call_model:
================================== Ai Message ==================================
Tool Calls:
get_weather (call_8sTKYAhSIHOdjLD5d6gaswuV)
Call ID: call_8sTKYAhSIHOdjLD5d6gaswuV
Args:
location: Boston, MA
call_tool:
================================= Tool Message =================================
It's rainy!
call_model:
================================== Ai Message ==================================
Compared to San Francisco, which is sunny, Boston, MA is experiencing rainy weather.
```
In the [LangSmith trace](https://smith.langchain.com/public/20a1116b-bb3b-44c1-8765-7a28663439d9/r), we can see that the full conversational context is retained in each model call.
---
how-tos/cross-thread-persistence-functional.ipynb
---
# How to add cross-thread persistence (functional API)
!!! info "Prerequisites"
This guide assumes familiarity with the following:
- Functional API
- Persistence
- Memory
- [Chat Models](https://python.langchain.com/docs/concepts/chat_models/)
LangGraph allows you to persist data across **different threads**. For instance, you can store information about users (their names or preferences) in a shared (cross-thread) memory and reuse it in new threads (e.g., new conversations).
When using the functional API, you can set it up to store and retrieve memories by using the [Store](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.BaseStore) interface:
1. Create an instance of a `Store`
```python
from langgraph.store.base import BaseStore
from langgraph.store.memory import InMemoryStore
store = InMemoryStore()
```
2. Pass the `store` instance to the `entrypoint()` decorator and expose `store` parameter in the function signature:
```python
from langgraph.func import entrypoint
@entrypoint(store=store)
def workflow(inputs: dict, store: BaseStore):
my_task(inputs).result()
...
```
In this guide, we will show how to construct and use a workflow that has a shared memory implemented using the [Store](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.BaseStore) interface.
!!! note "Note"
Support for the [`Store`](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.BaseStore) API that is used in this guide was added in LangGraph `v0.2.32`.
Support for __index__ and __query__ arguments of the [`Store`](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.BaseStore) API that is used in this guide was added in LangGraph `v0.2.54`.
!!! tip "Note"
If you need to add cross-thread persistence to a `StateGraph`, check out this how-to guide.
## Setup
First, let's install the required packages and set our API keys
```shell
pip install -U langchain_anthropic langchain_openai langgraph
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("ANTHROPIC_API_KEY")
_set_env("OPENAI_API_KEY")
```
!!! tip "Set up [LangSmith](https://smith.langchain.com) for LangGraph development"
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor your LLM apps built with LangGraph — read more about how to get started [here](https://docs.smith.langchain.com)
## Example: simple chatbot with long-term memory
### Define store
In this example we will create a workflow that will be able to retrieve information about a user's preferences. We will do so by defining an `InMemoryStore` - an object that can store data in memory and query that data.
When storing objects using the `Store` interface you define two things:
* the namespace for the object, a tuple (similar to directories)
* the object key (similar to filenames)
In our example, we'll be using `("memories", <user_id>)` as the namespace and a random UUID as the key for each new memory.
Importantly, to determine the user, we will be passing `user_id` via the config keyword argument of the node function.
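As a quick, self-contained illustration of the namespace/key structure (the values below are hypothetical and separate from the store we define next):
```python
import uuid

from langgraph.store.memory import InMemoryStore

demo_store = InMemoryStore()
namespace = ("memories", "user-123")  # namespace: like a directory per user
key = str(uuid.uuid4())               # key: like a filename
demo_store.put(namespace, key, {"data": "User name is Bob"})
print(demo_store.get(namespace, key).value)  # {'data': 'User name is Bob'}
```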
Let's first define our store!
```python
from langgraph.store.memory import InMemoryStore
from langchain_openai import OpenAIEmbeddings
in_memory_store = InMemoryStore(
index={
"embed": OpenAIEmbeddings(model="text-embedding-3-small"),
"dims": 1536,
}
)
```
### Create workflow
```python
import uuid
from langchain_anthropic import ChatAnthropic
from langchain_core.runnables import RunnableConfig
from langchain_core.messages import BaseMessage
from langgraph.func import entrypoint, task
from langgraph.graph import add_messages
from langgraph.checkpoint.memory import MemorySaver
from langgraph.store.base import BaseStore
model = ChatAnthropic(model="claude-3-5-sonnet-latest")
@task
def call_model(messages: list[BaseMessage], memory_store: BaseStore, user_id: str):
namespace = ("memories", user_id)
last_message = messages[-1]
memories = memory_store.search(namespace, query=str(last_message.content))
info = "\n".join([d.value["data"] for d in memories])
system_msg = f"You are a helpful assistant talking to the user. User info: {info}"
# Store new memories if the user asks the model to remember
if "remember" in last_message.content.lower():
memory = "User name is Bob"
memory_store.put(namespace, str(uuid.uuid4()), {"data": memory})
response = model.invoke([{"role": "system", "content": system_msg}] + messages)
return response
# NOTE: we're passing the store object here when creating a workflow via entrypoint()
@entrypoint(checkpointer=MemorySaver(), store=in_memory_store)
def workflow(
inputs: list[BaseMessage],
*,
previous: list[BaseMessage],
config: RunnableConfig,
store: BaseStore,
):
user_id = config["configurable"]["user_id"]
previous = previous or []
inputs = add_messages(previous, inputs)
response = call_model(inputs, store, user_id).result()
return entrypoint.final(value=response, save=add_messages(inputs, response))
```
!!! note "Note"
If you're using LangGraph Cloud or LangGraph Studio, you __don't need__ to pass store to the entrypoint decorator, since it's done automatically.
### Run the workflow!
Now let's specify a user ID in the config and tell the model our name:
```python
config = {"configurable": {"thread_id": "1", "user_id": "1"}}
input_message = {"role": "user", "content": "Hi! Remember: my name is Bob"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
```
```output
================================== Ai Message ==================================
Hello Bob! Nice to meet you. I'll remember that your name is Bob. How can I help you today?
```
```python
config = {"configurable": {"thread_id": "2", "user_id": "1"}}
input_message = {"role": "user", "content": "what is my name?"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
```
```output
================================== Ai Message ==================================
Your name is Bob.
```
We can now inspect our in-memory store and verify that we have in fact saved the memories for the user:
```python
for memory in in_memory_store.search(("memories", "1")):
print(memory.value)
```
```output
{'data': 'User name is Bob'}
```
Let's now run the workflow for another user to verify that the memories about the first user are self-contained:
```python
config = {"configurable": {"thread_id": "3", "user_id": "2"}}
input_message = {"role": "user", "content": "what is my name?"}
for chunk in workflow.stream([input_message], config, stream_mode="values"):
chunk.pretty_print()
```
```output
================================== Ai Message ==================================
I don't have any information about your name. I can only see our current conversation without any prior context or personal details about you. If you'd like me to know your name, feel free to tell me!
```
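To double-check the isolation, one could also search the second user's namespace directly; assuming nothing has been stored for `user_id` "2", this should print an empty list:
```python
# Hypothetical check: no memories have been stored under user "2".
print(in_memory_store.search(("memories", "2")))  # expected: []
```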
---
how-tos/autogen-integration.ipynb
---
# How to integrate LangGraph with AutoGen, CrewAI, and other frameworks
LangGraph is a framework for building agentic and multi-agent applications. LangGraph can be easily integrated with other agent frameworks.
The primary reasons you might want to integrate LangGraph with other agent frameworks:
- create multi-agent systems where individual agents are built with different frameworks
- leverage LangGraph to add features like persistence, streaming, short and long-term memory and more
The simplest way to integrate agents from other frameworks is by calling those agents inside a LangGraph node:
```python
from langgraph.graph import StateGraph, MessagesState, START
autogen_agent = autogen.AssistantAgent(name="assistant", ...)
user_proxy = autogen.UserProxyAgent(name="user_proxy", ...)
def call_autogen_agent(state: MessagesState):
response = user_proxy.initiate_chat(
autogen_agent,
message=state["messages"][-1],
...
)
...
graph = (
StateGraph(MessagesState)
.add_node(call_autogen_agent)
.add_edge(START, "call_autogen_agent")
.compile()
)
graph.invoke({
"messages": [
{
"role": "user",
"content": "Find numbers between 10 and 30 in fibonacci sequence",
}
]
})
```
In this guide we show how to build a LangGraph chatbot that integrates with AutoGen, but you can follow the same approach with other frameworks.
## Setup
```python
%pip install autogen langgraph
```
```python
import getpass
import os
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
```
```output
OPENAI_API_KEY: ········
```
## Define AutoGen agent
Here we define our AutoGen agent, adapted from the official AutoGen tutorial [here](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_web_info.ipynb).
```python
import autogen
import os
config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]
llm_config = {
"timeout": 600,
"cache_seed": 42,
"config_list": config_list,
"temperature": 0,
}
autogen_agent = autogen.AssistantAgent(
name="assistant",
llm_config=llm_config,
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={
"work_dir": "web",
"use_docker": False,
}, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
llm_config=llm_config,
system_message="Reply TERMINATE if the task has been solved at full satisfaction. Otherwise, reply CONTINUE, or the reason why the task is not solved yet.",
)
```
## Create the graph
We will now create a LangGraph chatbot graph that calls the AutoGen agent.
```python
from langchain_core.messages import convert_to_openai_messages
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.checkpoint.memory import MemorySaver
def call_autogen_agent(state: MessagesState):
# convert to openai-style messages
messages = convert_to_openai_messages(state["messages"])
response = user_proxy.initiate_chat(
autogen_agent,
message=messages[-1],
# pass previous message history as context
carryover=messages[:-1],
)
# get the final response from the agent
content = response.chat_history[-1]["content"]
return {"messages": {"role": "assistant", "content": content}}
# add short-term memory for storing conversation history
checkpointer = MemorySaver()
builder = StateGraph(MessagesState)
builder.add_node(call_autogen_agent)
builder.add_edge(START, "call_autogen_agent")
graph = builder.compile(checkpointer=checkpointer)
```
```python
from IPython.display import display, Image
display(Image(graph.get_graph().draw_mermaid_png()))
```
## Run the graph
We can now run the graph.
```python hl_lines="2 13"
# pass the thread ID to persist agent outputs for future interactions
config = {"configurable": {"thread_id": "1"}}
for chunk in graph.stream(
{
"messages": [
{
"role": "user",
"content": "Find numbers between 10 and 30 in fibonacci sequence",
}
]
},
config,
):
print(chunk)
```
```output
user_proxy (to assistant):
Find numbers between 10 and 30 in fibonacci sequence
--------------------------------------------------------------------------------
assistant (to user_proxy):
To find numbers between 10 and 30 in the Fibonacci sequence, we can generate the Fibonacci sequence and check which numbers fall within this range. Here's a plan:
1. Generate Fibonacci numbers starting from 0.
2. Continue generating until the numbers exceed 30.
3. Collect and print the numbers that are between 10 and 30.
Let's implement this in Python:
\`\`\`python
# filename: fibonacci_range.py
def fibonacci_sequence():
a, b = 0, 1
while a <= 30:
if 10 <= a <= 30:
print(a)
a, b = b, a + b
fibonacci_sequence()
\`\`\`
This script will print the Fibonacci numbers between 10 and 30. Please execute the code to see the result.
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
user_proxy (to assistant):
exitcode: 0 (execution succeeded)
Code output:
13
21
--------------------------------------------------------------------------------
assistant (to user_proxy):
The Fibonacci numbers between 10 and 30 are 13 and 21.
These numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1.
The sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
As you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.
TERMINATE
--------------------------------------------------------------------------------
{'call_autogen_agent': {'messages': {'role': 'assistant', 'content': 'The Fibonacci numbers between 10 and 30 are 13 and 21. \n\nThese numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1. \n\nThe sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...\n\nAs you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.\n\nTERMINATE'}}}
```
Since we're leveraging LangGraph's [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/) features we can now continue the conversation using the same thread ID -- LangGraph will automatically pass previous history to the AutoGen agent:
```python hl_lines="10"
for chunk in graph.stream(
{
"messages": [
{
"role": "user",
"content": "Multiply the last number by 3",
}
]
},
config,
):
print(chunk)
```
```output
user_proxy (to assistant):
Multiply the last number by 3
Context:
Find numbers between 10 and 30 in fibonacci sequence
The Fibonacci numbers between 10 and 30 are 13 and 21.
These numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1.
The sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
As you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.
TERMINATE
--------------------------------------------------------------------------------
assistant (to user_proxy):
The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:
21 * 3 = 63
TERMINATE
--------------------------------------------------------------------------------
{'call_autogen_agent': {'messages': {'role': 'assistant', 'content': 'The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:\n\n21 * 3 = 63\n\nTERMINATE'}}}
```
---
concepts/langgraph_self_hosted_control_plane.md
---
# Self-Hosted Control Plane
There are two versions of the self-hosted deployment: [Self-Hosted Data Plane](./deployment_options.md#self-hosted-data-plane) and [Self-Hosted Control Plane](./deployment_options.md#self-hosted-control-plane).
!!! info "Important"
The Self-Hosted Control Plane deployment option requires an [Enterprise](plans.md) plan.
## Requirements
- You use the `langgraph-cli` and/or [LangGraph Studio](./langgraph_studio.md) app to test your graph locally.
- You use the `langgraph build` command to build an image.
- You have a Self-Hosted LangSmith instance deployed.
- You are using Ingress for your LangSmith instance. All agents will be deployed as Kubernetes services behind this ingress.
## Self-Hosted Control Plane
The [Self-Hosted Control Plane](./langgraph_self_hosted_control_plane.md) deployment option is a fully self-hosted model for deployment where you manage the [control plane](./langgraph_control_plane.md) and [data plane](./langgraph_data_plane.md) in your cloud. This option gives you full control and responsibility of the control plane and data plane infrastructure.
| | [Control plane](../concepts/langgraph_control_plane.md) | [Data plane](../concepts/langgraph_data_plane.md) |
|-------------------|-------------------|------------|
| **What is it?** | <ul><li>Control plane UI for creating deployments and revisions</li><li>Control plane APIs for creating deployments and revisions</li></ul> | <ul><li>Data plane "listener" for reconciling deployments with control plane state</li><li>LangGraph Servers</li><li>Postgres, Redis, etc.</li></ul> |
| **Where is it hosted?** | Your cloud | Your cloud |
| **Who provisions and manages it?** | You | You |
### Architecture

### Compute Platforms
- **Kubernetes**: The Self-Hosted Control Plane deployment option supports deploying control plane and data plane infrastructure to any Kubernetes cluster.
!!! tip
If you would like to enable this on your LangSmith instance, please follow the [Self-Hosted Control Plane deployment guide](../cloud/deployment/self_hosted_control_plane.md).
---
concepts/plans.md
---
---
search:
boost: 2
---
# LangGraph Platform Plans
## Overview
LangGraph Platform is a solution for deploying agentic applications in production.
There are three different plans for using it.
- **Developer**: All [LangSmith](https://smith.langchain.com/) users have access to this plan. You can sign up for this plan simply by creating a LangSmith account. This gives you access to the [Standalone Container (Lite)](./deployment_options.md) deployment option.
- **Plus**: All [LangSmith](https://smith.langchain.com/) users with a [Plus account](https://docs.smith.langchain.com/administration/pricing) have access to this plan. You can sign up for this plan simply by upgrading your LangSmith account to the Plus plan type. This gives you access to the [Cloud](./deployment_options.md#cloud-saas) deployment option.
- **Enterprise**: This is separate from LangSmith plans. You can sign up for this plan by contacting sales@langchain.dev. This gives you access to all [deployment options](./deployment_options.md).
## Plan Details
| | Developer | Plus | Enterprise |
|------------------------------------------------------------------|---------------------------------------------|-------------------------------------------------------|-----------------------------------------------------|
| Deployment Options | Standalone Container (Lite) | Cloud SaaS | <ul><li>Cloud SaaS</li><li>Self-Hosted Data Plane</li><li>Self-Hosted Control Plane</li><li>Standalone Container (Enterprise)</li></ul> |
| Usage | Free, limited to 1M [nodes executed](../concepts/faq.md#what-does-nodes-executed-mean-for-langgraph-platform-usage) per year | See [Pricing](https://www.langchain.com/langgraph-platform-pricing) | Custom |
| APIs for retrieving and updating state and conversational history | ✅ | ✅ | ✅ |
| APIs for retrieving and updating long-term memory | ✅ | ✅ | ✅ |
| Horizontally scalable task queues and servers | ✅ | ✅ | ✅ |
| Real-time streaming of outputs and intermediate steps | ✅ | ✅ | ✅ |
| Assistants API (configurable templates for LangGraph apps) | ✅ | ✅ | ✅ |
| Cron scheduling | -- | ✅ | ✅ |
| LangGraph Studio for prototyping | ✅ | ✅ | ✅ |
| Authentication & authorization to call the LangGraph APIs | -- | Coming Soon! | Coming Soon! |
| Smart caching to reduce traffic to LLM API | -- | Coming Soon! | Coming Soon! |
| Publish/subscribe API for state | -- | Coming Soon! | Coming Soon! |
| Scheduling prioritization | -- | Coming Soon! | Coming Soon! |
For pricing information, see [LangGraph Platform Pricing](https://www.langchain.com/langgraph-platform-pricing).
## Related
For more information, please see:
* [Deployment Options conceptual guide](./deployment_options.md)
* [LangGraph Platform Pricing](https://www.langchain.com/langgraph-platform-pricing)
* [LangSmith Plans](https://docs.smith.langchain.com/administration/pricing)
---
concepts/subgraphs.md
---
# Subgraphs
A subgraph is a [graph](./low_level.md#graphs) that is used as a [node](./low_level.md#nodes) in another graph — this is the concept of encapsulation applied to LangGraph. Subgraphs allow you to build complex systems with multiple components that are themselves graphs.

Some reasons for using subgraphs are:
- building [multi-agent systems](./multi_agent.md)
- when you want to reuse a set of nodes in multiple graphs
- when you want different teams to work on different parts of the graph independently, you can define each part as a subgraph, and as long as the subgraph interface (the input and output schemas) is respected, the parent graph can be built without knowing any details of the subgraph
The main question when adding subgraphs is how the parent graph and subgraph communicate, i.e. how they pass the [state](./low_level.md#state) between each other during the graph execution. There are two scenarios:
* parent and subgraph have **shared state keys** in their state [schemas](./low_level.md#state). In this case, you can [include the subgraph as a node in the parent graph](../how-tos/subgraph.md#shared-state-schemas)
```python hl_lines="12 17"
from langgraph.graph import StateGraph, MessagesState, START
# Subgraph
def call_model(state: MessagesState):
response = model.invoke(state["messages"])
return {"messages": response}
subgraph_builder = StateGraph(State)
subgraph_builder.add_node(call_model)
...
subgraph = subgraph_builder.compile()
# Parent graph
builder = StateGraph(State)
builder.add_node("subgraph_node", subgraph)
builder.add_edge(START, "subgraph_node")
graph = builder.compile()
...
graph.invoke({"messages": [{"role": "user", "content": "hi!"}]})
```
* parent graph and subgraph have **different schemas** (no shared state keys in their state [schemas](./low_level.md#state)). In this case, you have to [call the subgraph from inside a node in the parent graph](../how-tos/subgraph.md#different-state-schemas), which lets you transform the state before and after calling the subgraph
```python hl_lines="7 11 19 28"
from typing_extensions import TypedDict, Annotated
from langchain_core.messages import AnyMessage
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.graph.message import add_messages
class SubgraphMessagesState(TypedDict):
subgraph_messages: Annotated[list[AnyMessage], add_messages]
# Subgraph
def call_model(state: SubgraphMessagesState):
response = model.invoke(state["subgraph_messages"])
return {"subgraph_messages": response}
subgraph_builder = StateGraph(SubgraphMessagesState)
subgraph_builder.add_node("call_model_from_subgraph", call_model)
subgraph_builder.add_edge(START, "call_model_from_subgraph")
...
subgraph = subgraph_builder.compile()
# Parent graph
def call_subgraph(state: MessagesState):
response = subgraph.invoke({"subgraph_messages": state["messages"]})
return {"messages": response["subgraph_messages"]}
builder = StateGraph(State)
builder.add_node("subgraph_node", call_subgraph)
builder.add_edge(START, "subgraph_node")
graph = builder.compile()
...
graph.invoke({"messages": [{"role": "user", "content": "hi!"}]})
```
---
concepts/scalability_and_resilience.md
---
---
search:
boost: 2
---
# Scalability & Resilience
LangGraph Platform is designed to scale horizontally with your workload. Each instance of the service is stateless, and keeps no resources in memory. The service is designed to gracefully handle new instances being added or removed, including hard shutdown cases.
## Server scalability
As you add more instances to a service, they will share the HTTP load as long as an appropriate load balancer mechanism is placed in front of them. In most deployment modalities we configure a load balancer for the service automatically; in the “self-hosted without control plane” modality it’s your responsibility to add a load balancer. Since the instances are stateless, any load-balancing strategy will work; session stickiness is neither needed nor recommended. Any instance of the server can communicate with any queue instance (through Redis PubSub), meaning that requests to cancel or stream an in-progress run can be handled by any arbitrary instance.
## Queue scalability
As you add more instances to a service, they will increase run throughput linearly, as each instance is configured to handle a set number of concurrent runs (by default 10). Each attempt for each run is handled by a single instance, with exactly-once semantics enforced through Postgres’s MVCC model (refer to the section below for crash-resilience details). Attempts that fail due to transient database errors are retried up to 3 times. We do not make use of long-lived transactions or locks, which enables more efficient use of Postgres resources.
## Resilience
While a run is being handled by a queue instance, a periodic heartbeat timestamp will be recorded in Redis by that queue worker.
When a graceful shutdown request is received (SIGINT), an instance enters shutdown mode, which:
- stops accepting new HTTP requests
- gives any in-progress runs a limited number of seconds to finish (runs that do not finish in time are put back in the queue)
- stops the instance from picking up more runs from the queue
If a hard shutdown occurs due to a server crash or an infrastructure failure, any runs that were in progress will be picked up by an internal sweeper task that looks for in-progress runs that have breached their heartbeat window. The sweeper runs every 2 minutes and will put the runs back in the queue for another instance to pick them up.
## Postgres resilience
For deployment modalities where we manage the Postgres database, we have periodic backups and continuously replicated standby replicas for automatic failover. This Postgres configuration is available in the [Cloud SaaS deployment option](../concepts/langgraph_cloud.md) for [`Production` deployment types](../concepts/langgraph_control_plane.md#deployment-types) only.
All communication with Postgres implements retries for retry-able errors. If Postgres is momentarily unavailable, such as during a database restart, most/all traffic should continue to succeed. Prolonged failure of Postgres will render the LangGraph Server unavailable.
## Redis resilience
All data that requires durable storage is stored in Postgres, not Redis. Redis is used only for ephemeral metadata, and communication between instances. Therefore we place no durability requirements on Redis.
All communication with Redis implements retries for retry-able errors. If Redis is momentarily unavailable, such as during a database restart, most/all traffic should continue to succeed. Prolonged failure of Redis will render the LangGraph Server unavailable.
---
concepts/time-travel.md
---
---
search:
boost: 2
---
# Time Travel ⏱️
When working with non-deterministic systems that make model-based decisions (e.g., agents powered by LLMs), it can be useful to examine their decision-making process in detail:
1. 🤔 **Understand reasoning**: Analyze the steps that led to a successful result.
2. 🐞 **Debug mistakes**: Identify where and why errors occurred.
3. 🔍 **Explore alternatives**: Test different paths to uncover better solutions.
LangGraph provides [time travel functionality](../how-tos/human_in_the_loop/time-travel.md) to support these use cases. Specifically, you can resume execution from a prior checkpoint — either replaying the same state or modifying it to explore alternatives. In all cases, resuming past execution produces a new fork in the history.
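As a rough sketch, assuming a `graph` that was compiled with a checkpointer and has already run on thread `"1"` (see the Persistence concept for how checkpoints are created), resuming from an earlier checkpoint looks like this:
```python
# Minimal sketch: resume from a prior checkpoint on an existing thread.
config = {"configurable": {"thread_id": "1"}}

# Inspect prior checkpoints for this thread (most recent first).
history = list(graph.get_state_history(config))
selected = history[-2]  # pick an earlier checkpoint

# Resume from that checkpoint; steps after it are re-executed as a new fork.
graph.invoke(None, config=selected.config)
```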
!!! tip
For information on how to use time travel, see [Use time travel](../how-tos/human_in_the_loop/time-travel.md) and [Time travel using Server API](../cloud/how-tos/human_in_the_loop_time_travel.md).
---
concepts/agentic_concepts.md
---
---
search:
boost: 2
---
# Agent architectures
Many LLM applications implement a particular control flow of steps before and / or after LLM calls. As an example, [RAG](https://github.com/langchain-ai/rag-from-scratch) performs retrieval of documents relevant to a user question, and passes those documents to an LLM in order to ground the model's response in the provided document context.
Instead of hard-coding a fixed control flow, we sometimes want LLM systems that can pick their own control flow to solve more complex problems! This is one definition of an [agent](https://blog.langchain.dev/what-is-an-agent/): *an agent is a system that uses an LLM to decide the control flow of an application.* There are many ways that an LLM can control an application:
- An LLM can route between two potential paths
- An LLM can decide which of many tools to call
- An LLM can decide whether the generated answer is sufficient or more work is needed
As a result, there are many different types of [agent architectures](https://blog.langchain.dev/what-is-a-cognitive-architecture/), which give an LLM varying levels of control.

## Router
A router allows an LLM to select a single step from a specified set of options. This is an agent architecture that exhibits a relatively limited level of control because the LLM usually focuses on making a single decision and produces a specific output from a limited set of pre-defined options. Routers typically employ a few different concepts to achieve this.
### Structured Output
Structured outputs with LLMs work by providing a specific format or schema that the LLM should follow in its response. This is similar to tool calling, but more general. While tool calling typically involves selecting and using predefined functions, structured outputs can be used for any type of formatted response. Common methods to achieve structured outputs include:
1. Prompt engineering: Instructing the LLM to respond in a specific format via the system prompt.
2. Output parsers: Using post-processing to extract structured data from LLM responses.
3. Tool calling: Leveraging built-in tool calling capabilities of some LLMs to generate structured outputs.
Structured outputs are crucial for routing as they ensure the LLM's decision can be reliably interpreted and acted upon by the system. Learn more about [structured outputs in this how-to guide](https://python.langchain.com/docs/how_to/structured_output/).
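As a minimal sketch, a router can be built with LangChain's `with_structured_output`; the model name and the `Route` schema below are illustrative assumptions:
```python
from typing import Literal

from pydantic import BaseModel
from langchain.chat_models import init_chat_model


class Route(BaseModel):
    """Decision about which path to take next."""
    step: Literal["search", "summarize"]


# Any provider that supports structured output works; the model is illustrative.
llm = init_chat_model("openai:gpt-4o-mini")
router = llm.with_structured_output(Route)

decision = router.invoke("Find recent news about LangGraph and summarize it.")
print(decision.step)  # e.g., "search"
```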
## Tool-calling agent
While a router allows an LLM to make a single decision, more complex agent architectures expand the LLM's control in two key ways:
1. Multi-step decision making: The LLM can make a series of decisions, one after another, instead of just one.
2. Tool access: The LLM can choose from and use a variety of tools to accomplish tasks.
[ReAct](https://arxiv.org/abs/2210.03629) is a popular general purpose agent architecture that combines these expansions, integrating three core concepts.
1. [Tool calling](#tool-calling): Allowing the LLM to select and use various tools as needed.
2. [Memory](#memory): Enabling the agent to retain and use information from previous steps.
3. [Planning](#planning): Empowering the LLM to create and follow multi-step plans to achieve goals.
This architecture allows for more complex and flexible agent behaviors, going beyond simple routing to enable dynamic problem-solving with multiple steps. Unlike the original [paper](https://arxiv.org/abs/2210.03629), today's agents rely on LLMs' [tool calling](#tool-calling) capabilities and operate on a list of [messages](./low_level.md#why-use-messages).
In LangGraph, you can use the prebuilt [agent](../agents/agents.md#2-create-an-agent) to get started with tool-calling agents.
### Tool calling
Tools are useful whenever you want an agent to interact with external systems. External systems (e.g., APIs) often require a particular input schema or payload, rather than natural language. When we bind an API, for example, as a tool, we give the model awareness of the required input schema. The model will choose to call a tool based upon the natural language input from the user and it will return an output that adheres to the tool's required schema.
[Many LLM providers support tool calling](https://python.langchain.com/docs/integrations/chat/), and the [tool calling interface](https://blog.langchain.dev/improving-core-tool-interfaces-and-docs-in-langchain/) in LangChain is simple: you can pass any Python `function` into `ChatModel.bind_tools(function)`.
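For example, a minimal sketch of binding a tool to a chat model (the model name and `get_weather` function are illustrative):
```python
from langchain.chat_models import init_chat_model


def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's sunny in {city}."


# Any tool-calling model works; the model name is illustrative.
llm = init_chat_model("openai:gpt-4o-mini")
llm_with_tools = llm.bind_tools([get_weather])

ai_msg = llm_with_tools.invoke("What's the weather in Paris?")
print(ai_msg.tool_calls)  # e.g., [{'name': 'get_weather', 'args': {'city': 'Paris'}, ...}]
```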

### Memory
[Memory](../how-tos/memory/add-memory.md) is crucial for agents, enabling them to retain and utilize information across multiple steps of problem-solving. It operates on different scales:
1. [Short-term memory](../how-tos/memory/add-memory.md#add-short-term-memory): Allows the agent to access information acquired during earlier steps in a sequence.
2. [Long-term memory](../how-tos/memory/add-memory.md#add-long-term-memory): Enables the agent to recall information from previous interactions, such as past messages in a conversation.
LangGraph provides full control over memory implementation:
- [`State`](./low_level.md#state): User-defined schema specifying the exact structure of memory to retain.
- [`Checkpointer`](./persistence.md#checkpoints): Mechanism to store state at every step across different interactions within a session.
- [`Store`](./persistence.md#memory-store): Mechanism to store user-specific or application-level data across sessions.
This flexible approach allows you to tailor the memory system to your specific agent architecture needs. Effective memory management enhances an agent's ability to maintain context, learn from past experiences, and make more informed decisions over time. For a practical guide on adding and managing memory, see [Memory](../how-tos/memory/add-memory.md).
### Planning
In a tool-calling [agent](../agents/overview.md#what-is-an-agent), an LLM is called repeatedly in a while-loop. At each step the agent decides which tools to call, and what the inputs to those tools should be. Those tools are then executed, and the outputs are fed back into the LLM as observations. The while-loop terminates when the agent decides it has enough information to solve the user request and it is not worth calling any more tools.
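Schematically, that loop looks like the following sketch. This is illustrative only, not the prebuilt agent's implementation; the model name and tool are assumptions:
```python
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage, ToolMessage


def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's sunny in {city}."


tools_by_name = {"get_weather": get_weather}
llm_with_tools = init_chat_model("openai:gpt-4o-mini").bind_tools([get_weather])

messages = [HumanMessage("What's the weather in Paris?")]
while True:
    ai_msg = llm_with_tools.invoke(messages)  # decide: call a tool or answer
    messages.append(ai_msg)
    if not ai_msg.tool_calls:
        break  # the model decided it has enough information
    for call in ai_msg.tool_calls:  # execute the chosen tools
        result = tools_by_name[call["name"]](**call["args"])
        messages.append(ToolMessage(content=result, tool_call_id=call["id"]))

print(messages[-1].content)
```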
## Custom agent architectures
While routers and tool-calling agents (like ReAct) are common, [customizing agent architectures](https://blog.langchain.dev/why-you-should-outsource-your-agentic-infrastructure-but-own-your-cognitive-architecture/) often leads to better performance for specific tasks. LangGraph offers several powerful features for building tailored agent systems:
### Human-in-the-loop
Human involvement can significantly enhance agent reliability, especially for sensitive tasks. This can involve:
- Approving specific actions
- Providing feedback to update the agent's state
- Offering guidance in complex decision-making processes
Human-in-the-loop patterns are crucial when full automation isn't feasible or desirable. Learn more in our [human-in-the-loop guide](./human_in_the_loop.md).
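For instance, here is a sketch of a node that pauses for human approval using `interrupt`. The state keys are illustrative; the graph must be compiled with a checkpointer, and execution is resumed with `Command(resume=...)` as described in the human-in-the-loop guide:
```python
from langgraph.types import interrupt


def approval_node(state: dict):
    # Pause the graph and surface a question to a human reviewer.
    decision = interrupt({"question": "Approve this action?", "action": state.get("action")})
    # The value passed via Command(resume=...) is returned by interrupt().
    return {"approved": decision}
```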
### Parallelization
Parallel processing is vital for efficient multi-agent systems and complex tasks. LangGraph supports parallelization through its [Send](./low_level.md#send) API, enabling:
- Concurrent processing of multiple states
- Implementation of map-reduce-like operations
- Efficient handling of independent subtasks
For practical implementation, see our [map-reduce tutorial](../how-tos/graph-api.md#map-reduce-and-the-send-api).
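As a minimal sketch of a map-reduce fan-out with `Send` (the node names, state keys, and joke-writing task are illustrative):
```python
import operator
from typing_extensions import TypedDict, Annotated

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send


class State(TypedDict):
    topics: list[str]
    jokes: Annotated[list[str], operator.add]  # reducer accumulates parallel results


def fan_out(state: State):
    # One Send per topic: each write_joke invocation runs in parallel
    # with its own private input.
    return [Send("write_joke", {"topic": t}) for t in state["topics"]]


def write_joke(payload: dict):
    return {"jokes": [f"A joke about {payload['topic']}"]}


builder = StateGraph(State)
builder.add_node("write_joke", write_joke)
builder.add_conditional_edges(START, fan_out, ["write_joke"])
builder.add_edge("write_joke", END)
graph = builder.compile()

graph.invoke({"topics": ["cats", "bears"], "jokes": []})
```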
### Subgraphs
[Subgraphs](./subgraphs.md) are essential for managing complex agent architectures, particularly in [multi-agent systems](./multi_agent.md). They allow:
- Isolated state management for individual agents
- Hierarchical organization of agent teams
- Controlled communication between agents and the main system
Subgraphs communicate with the parent graph through overlapping keys in the state schema. This enables flexible, modular agent design. For implementation details, refer to our [subgraph how-to guide](../how-tos/subgraph.md).
### Reflection
Reflection mechanisms can significantly improve agent reliability by:
1. Evaluating task completion and correctness
2. Providing feedback for iterative improvement
3. Enabling self-correction and learning
While often LLM-based, reflection can also use deterministic methods. For instance, in coding tasks, compilation errors can serve as feedback. This approach is demonstrated in [this video using LangGraph for self-corrective code generation](https://www.youtube.com/watch?v=MvNdgmM7uyc).
By leveraging these features, LangGraph enables the creation of sophisticated, task-specific agent architectures that can handle complex workflows, collaborate effectively, and continuously improve their performance.
---
concepts/assistants.md
---
# Assistants
**Assistants** allow you to manage configurations (like prompts, LLM selection, tools) separately from your graph's core logic, enabling rapid changes that don't alter the graph architecture. It is a way to create multiple specialized versions of the same graph architecture, each optimized for different use cases through configuration variations rather than structural changes.
For example, imagine a general-purpose writing agent built on a common graph architecture. While the structure remains the same, different writing styles—such as blog posts and tweets—require tailored configurations to optimize performance. To support these variations, you can create multiple assistants (e.g., one for blogs and another for tweets) that share the underlying graph but differ in model selection and system prompt.

The LangGraph Cloud API provides several endpoints for creating and managing assistants and their versions. See the [API reference](../cloud/reference/api/api_ref.html#tag/assistants) for more details.
!!! info
Assistants are a [LangGraph Platform](langgraph_platform.md) concept. They are not available in the open source LangGraph library.
## Configuration
Assistants build on the LangGraph open source concept of [configuration](low_level.md#configuration).
While configuration is available in the open source LangGraph library, assistants are only present in [LangGraph Platform](langgraph_platform.md). This is due to the fact that assistants are tightly coupled to your deployed graph. Upon deployment, LangGraph Server will automatically create a default assistant for each graph using the graph's default configuration settings.
In practice, an assistant is just an _instance_ of a graph with a specific configuration. Therefore, multiple assistants can reference the same graph but can contain different configurations (e.g. prompts, models, tools). The LangGraph Server API provides several endpoints for creating and managing assistants. See the [API reference](../cloud/reference/api/api_ref.html) and [this how-to](../cloud/how-tos/configuration_cloud.md) for more details on how to create assistants.
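For example, here is a sketch of creating two assistants over the same deployed graph using the Python SDK. The URL, graph id, assistant names, and config keys are illustrative assumptions:
```python
import asyncio

from langgraph_sdk import get_client


async def main():
    client = get_client(url="http://localhost:2024")

    # Two assistants that share one graph but differ only in configuration.
    blog = await client.assistants.create(
        graph_id="writer",
        config={"configurable": {"model": "openai:gpt-4o", "system_prompt": "Write long-form blog posts."}},
        name="blog-writer",
    )
    tweet = await client.assistants.create(
        graph_id="writer",
        config={"configurable": {"model": "openai:gpt-4o-mini", "system_prompt": "Write short tweets."}},
        name="tweet-writer",
    )
    print(blog["assistant_id"], tweet["assistant_id"])


asyncio.run(main())
```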
## Versioning
Assistants support versioning to track changes over time.
Once you've created an assistant, subsequent edits to that assistant will create new versions. See [this how-to](../cloud/how-tos/configuration_cloud.md#create-a-new-version-for-your-assistant) for more details on how to manage assistant versions.
## Execution
A **run** is an invocation of an assistant. Each run may have its own input, configuration, and metadata, which may affect execution and output of the underlying graph. A run can optionally be executed on a [thread](./persistence.md#threads).
The LangGraph Platform API provides several endpoints for creating and managing runs. See the [API reference](../cloud/reference/api/api_ref.html#tag/thread-runs/) for more details.
---
concepts/langgraph_control_plane.md
---
---
search:
boost: 2
---
# LangGraph Control Plane
The term "control plane" is used broadly to refer to the control plane UI where users create and update [LangGraph Servers](./langgraph_server.md) (deployments) and the control plane APIs that support the UI experience.
When a user makes an update through the control plane UI, the update is stored in the control plane state. The [LangGraph Data Plane](./langgraph_data_plane.md) "listener" application polls for these updates by calling the control plane APIs.
## Control Plane UI
From the control plane UI, you can:
- View a list of outstanding deployments.
- View details of an individual deployment.
- Create a new deployment.
- Update a deployment.
- Update environment variables for a deployment.
- View build and server logs of a deployment.
- View deployment metrics such as CPU and memory usage.
- Delete a deployment.
The Control Plane UI is embedded in [LangSmith](https://docs.smith.langchain.com/langgraph_cloud).
## Control Plane API
This section describes the data model of the control plane API. The API is used to create, update, and delete deployments. See the [control plane API reference](../cloud/reference/api/api_ref_control_plane.md) for more details.
### Deployment
A deployment is an instance of a LangGraph Server. A single deployment can have many revisions.
### Revision
A revision is an iteration of a deployment. When a new deployment is created, an initial revision is automatically created. To deploy code changes or update secrets for a deployment, a new revision must be created.
## Control Plane Features
This section describes various features of the control plane.
### Deployment Types
For simplicity, the control plane offers two deployment types with different resource allocations: `Development` and `Production`.
| **Deployment Type** | **CPU/Memory** | **Scaling** | **Database** |
|---------------------|-----------------|---------------------|----------------------------------------------------------------------------------|
| Development | 1 CPU, 1 GB RAM | Up to 1 container | 10 GB disk, no backups |
| Production | 2 CPU, 2 GB RAM | Up to 10 containers | Autoscaling disk, automatic backups, highly available (multi-zone configuration) |
CPU and memory resources are per container.
!!! warning "Immutable Deployment Type"
Once a deployment is created, the deployment type cannot be changed.
!!! info "Resource Customization"
For `Production` type deployments, resources can be manually increased on a case-by-case basis depending on use case and capacity constraints. Contact support@langchain.dev to request an increase in resources.
For `Development` type deployments, database disk size can be manually increased on a case-by-case basis depending on use case and capacity constraints. For most use cases, [TTLs](../how-tos/ttl/configure_ttl.md) should be configured to manage disk usage. Contact support@langchain.dev to request an increase in resources.
Resources for [Self-Hosted Data Plane](../concepts/langgraph_self_hosted_data_plane.md) and [Self-Hosted Control Plane](../concepts/langgraph_self_hosted_control_plane.md) deployments can be fully customized.
### Database Provisioning
The control plane and [LangGraph Data Plane](./langgraph_data_plane.md) "listener" application coordinate to automatically create a Postgres database for each deployment. The database serves as the [persistence layer](../concepts/persistence.md) for the deployment.
When implementing a LangGraph application, a [checkpointer](../concepts/persistence.md#checkpointer-libraries) does not need to be configured by the developer. Instead, a checkpointer is automatically configured for the graph. Any checkpointer configured for a graph will be replaced by the one that is automatically configured.
There is no direct access to the database. All access to the database occurs through the [LangGraph Server](../concepts/langgraph_server.md).
The database is never deleted until the deployment itself is deleted.
!!! info
A custom Postgres instance can be configured for [Self-Hosted Data Plane](../concepts/langgraph_self_hosted_data_plane.md) and [Self-Hosted Control Plane](../concepts/langgraph_self_hosted_control_plane.md) deployments.
### Asynchronous Deployment
Infrastructure for deployments and revisions is provisioned and deployed asynchronously. It is not deployed immediately after submission. Currently, deployment can take up to several minutes.
- When a new deployment is created, a new database is created for the deployment. Database creation is a one-time step. This step contributes to a longer deployment time for the initial revision of the deployment.
- When a subsequent revision is created for a deployment, there is no database creation step. The deployment time for a subsequent revision is significantly faster compared to the deployment time of the initial revision.
- The deployment process for each revision contains a build step, which can take up to a few minutes.
The control plane and [LangGraph Data Plane](./langgraph_data_plane.md) "listener" application coordinate to achieve asynchronous deployments.
### Monitoring
After a deployment is ready, the control plane monitors the deployment and records various metrics, such as:
- CPU and memory usage of the deployment.
- Number of container restarts.
- Number of replicas (this will increase with [autoscaling](../concepts/langgraph_data_plane.md#autoscaling)).
- [Postgres](../concepts/langgraph_data_plane.md#postgres) CPU, memory usage, and disk usage.
These metrics are displayed as charts in the Control Plane UI.
### LangSmith Integration
A [LangSmith](https://docs.smith.langchain.com/) tracing project is automatically created for each deployment. The tracing project has the same name as the deployment. When creating a deployment, the `LANGCHAIN_TRACING` and `LANGSMITH_API_KEY`/`LANGCHAIN_API_KEY` environment variables do not need to be specified; they are set automatically by the control plane.
When a deployment is deleted, the traces and the tracing project are not deleted.
---
concepts/mcp.md
---
# MCP
[Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) is an open protocol that standardizes how applications provide tools and context to language models. LangGraph agents can use tools defined on MCP servers through the `langchain-mcp-adapters` library.

Install the `langchain-mcp-adapters` library to use MCP tools in LangGraph:
```bash
pip install langchain-mcp-adapters
```
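Once installed, a minimal sketch of loading MCP tools for use in a LangGraph agent might look like the following. The server name, script path, and transport are illustrative placeholders, and this assumes the `MultiServerMCPClient` / `get_tools` API from `langchain-mcp-adapters`:
```python
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient


async def main():
    # Illustrative connection config: a local math server spoken to over stdio.
    client = MultiServerMCPClient(
        {
            "math": {
                "command": "python",
                "args": ["/path/to/math_server.py"],  # placeholder path
                "transport": "stdio",
            }
        }
    )
    tools = await client.get_tools()
    # The resulting tools can be bound to a model or passed to an agent.
    print([tool.name for tool in tools])


asyncio.run(main())
```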
## Authenticate to an MCP server
You can set up [custom authentication middleware](../how-tos/auth/custom_auth.md) to authenticate a user with an MCP server to get access to user-scoped tools within your LangGraph Platform deployment.
!!! note
Custom authentication is a LangGraph Platform feature.
An example architecture for this flow:
```mermaid
sequenceDiagram
%% Actors
participant ClientApp as Client
participant AuthProv as Auth Provider
participant LangGraph as LangGraph Backend
participant SecretStore as Secret Store
participant MCPServer as MCP Server
%% Platform login / AuthN
ClientApp ->> AuthProv: 1. Login (username / password)
AuthProv -->> ClientApp: 2. Return token
ClientApp ->> LangGraph: 3. Request with token
Note over LangGraph: 4. Validate token (@auth.authenticate)
LangGraph -->> AuthProv: 5. Fetch user info
AuthProv -->> LangGraph: 6. Confirm validity
%% Fetch user tokens from secret store
LangGraph ->> SecretStore: 6a. Fetch user tokens
SecretStore -->> LangGraph: 6b. Return tokens
Note over LangGraph: 7. Apply access control (@auth.on.*)
%% MCP round-trip
Note over LangGraph: 8. Build MCP client with user token
LangGraph ->> MCPServer: 9. Call MCP tool (with header)
Note over MCPServer: 10. MCP validates header and runs tool
MCPServer -->> LangGraph: 11. Tool response
%% Return to caller
LangGraph -->> ClientApp: 12. Return resources / tool output
```
For more information, see [MCP endpoint in LangGraph Server](../concepts/server-mcp.md#use-user-scoped-mcp-tools-in-your-deployment).
---
concepts/persistence.md
---
---
search:
boost: 2
---
# Persistence
LangGraph has a built-in persistence layer, implemented through checkpointers. When you compile a graph with a checkpointer, the checkpointer saves a `checkpoint` of the graph state at every super-step. Those checkpoints are saved to a `thread`, which can be accessed after graph execution. Because `threads` allow access to a graph's state after execution, several powerful capabilities, including human-in-the-loop, memory, time travel, and fault-tolerance, are all possible. Below, we'll discuss each of these concepts in more detail.

!!! info "LangGraph API handles checkpointing automatically"
When using the LangGraph API, you don't need to implement or configure checkpointers manually. The API handles all persistence infrastructure for you behind the scenes.
## Threads
A thread is a unique ID or thread identifier assigned to each checkpoint saved by a checkpointer. It contains the accumulated state of a sequence of [runs](./assistants.md#execution). When a run is executed, the [state](../concepts/low_level.md#state) of the underlying graph of the assistant will be persisted to the thread.
When invoking a graph with a checkpointer, you **must** specify a `thread_id` as part of the `configurable` portion of the config:
```python
{"configurable": {"thread_id": "1"}}
```
A thread's current and historical state can be retrieved. To persist state, a thread must be created prior to executing a run. The LangGraph Platform API provides several endpoints for creating and managing threads and thread state. See the [API reference](../cloud/reference/api/api_ref.html#tag/threads) for more details.
## Checkpoints
The state of a thread at a particular point in time is called a checkpoint. A checkpoint is a snapshot of the graph state saved at each super-step and is represented by a `StateSnapshot` object with the following key properties:
- `config`: Config associated with this checkpoint.
- `metadata`: Metadata associated with this checkpoint.
- `values`: Values of the state channels at this point in time.
- `next`: A tuple of the node names to execute next in the graph.
- `tasks`: A tuple of `PregelTask` objects that contain information about next tasks to be executed. If the step was previously attempted, it will include error information. If a graph was interrupted [dynamically](../how-tos/human_in_the_loop/breakpoints.md#dynamic-breakpoints) from within a node, tasks will contain additional data associated with interrupts.
Checkpoints are persisted and can be used to restore the state of a thread at a later time.
Let's see what checkpoints are saved when a simple graph is invoked as follows:
```python
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import InMemorySaver
from typing import Annotated
from typing_extensions import TypedDict
from operator import add

class State(TypedDict):
    foo: str
    bar: Annotated[list[str], add]

def node_a(state: State):
    return {"foo": "a", "bar": ["a"]}

def node_b(state: State):
    return {"foo": "b", "bar": ["b"]}

workflow = StateGraph(State)
workflow.add_node(node_a)
workflow.add_node(node_b)
workflow.add_edge(START, "node_a")
workflow.add_edge("node_a", "node_b")
workflow.add_edge("node_b", END)

checkpointer = InMemorySaver()
graph = workflow.compile(checkpointer=checkpointer)

config = {"configurable": {"thread_id": "1"}}
graph.invoke({"foo": ""}, config)
```
After we run the graph, we expect to see exactly 4 checkpoints:
* empty checkpoint with `START` as the next node to be executed
* checkpoint with the user input `{'foo': '', 'bar': []}` and `node_a` as the next node to be executed
* checkpoint with the outputs of `node_a` `{'foo': 'a', 'bar': ['a']}` and `node_b` as the next node to be executed
* checkpoint with the outputs of `node_b` `{'foo': 'b', 'bar': ['a', 'b']}` and no next nodes to be executed
Note that the `bar` channel values contain the outputs of both nodes, because we defined a reducer for the `bar` channel.
### Get state
When interacting with the saved graph state, you **must** specify a [thread identifier](#threads). You can view the *latest* state of the graph by calling `graph.get_state(config)`. This will return a `StateSnapshot` object that corresponds to the latest checkpoint associated with the thread ID provided in the config or a checkpoint associated with a checkpoint ID for the thread, if provided.
```python
# get the latest state snapshot
config = {"configurable": {"thread_id": "1"}}
graph.get_state(config)
# get a state snapshot for a specific checkpoint_id
config = {"configurable": {"thread_id": "1", "checkpoint_id": "1ef663ba-28fe-6528-8002-5a559208592c"}}
graph.get_state(config)
```
In our example, the output of `get_state` will look like this:
```
StateSnapshot(
    values={'foo': 'b', 'bar': ['a', 'b']},
    next=(),
    config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28fe-6528-8002-5a559208592c'}},
    metadata={'source': 'loop', 'writes': {'node_b': {'foo': 'b', 'bar': ['b']}}, 'step': 2},
    created_at='2024-08-29T19:19:38.821749+00:00',
    parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f9-6ec4-8001-31981c2c39f8'}},
    tasks=()
)
```
### Get state history
You can get the full history of the graph execution for a given thread by calling `graph.get_state_history(config)`. This will return a list of `StateSnapshot` objects associated with the thread ID provided in the config. Importantly, the checkpoints will be ordered chronologically with the most recent checkpoint / `StateSnapshot` being the first in the list.
```python
config = {"configurable": {"thread_id": "1"}}
list(graph.get_state_history(config))
```
In our example, the output of `get_state_history` will look like this:
```
[
    StateSnapshot(
        values={'foo': 'b', 'bar': ['a', 'b']},
        next=(),
        config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28fe-6528-8002-5a559208592c'}},
        metadata={'source': 'loop', 'writes': {'node_b': {'foo': 'b', 'bar': ['b']}}, 'step': 2},
        created_at='2024-08-29T19:19:38.821749+00:00',
        parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f9-6ec4-8001-31981c2c39f8'}},
        tasks=(),
    ),
    StateSnapshot(
        values={'foo': 'a', 'bar': ['a']},
        next=('node_b',),
        config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f9-6ec4-8001-31981c2c39f8'}},
        metadata={'source': 'loop', 'writes': {'node_a': {'foo': 'a', 'bar': ['a']}}, 'step': 1},
        created_at='2024-08-29T19:19:38.819946+00:00',
        parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f4-6b4a-8000-ca575a13d36a'}},
        tasks=(PregelTask(id='6fb7314f-f114-5413-a1f3-d37dfe98ff44', name='node_b', error=None, interrupts=()),),
    ),
    StateSnapshot(
        values={'foo': '', 'bar': []},
        next=('node_a',),
        config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f4-6b4a-8000-ca575a13d36a'}},
        metadata={'source': 'loop', 'writes': None, 'step': 0},
        created_at='2024-08-29T19:19:38.817813+00:00',
        parent_config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f0-6c66-bfff-6723431e8481'}},
        tasks=(PregelTask(id='f1b14528-5ee5-579c-949b-23ef9bfbed58', name='node_a', error=None, interrupts=()),),
    ),
    StateSnapshot(
        values={'bar': []},
        next=('__start__',),
        config={'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef663ba-28f0-6c66-bfff-6723431e8481'}},
        metadata={'source': 'input', 'writes': {'foo': ''}, 'step': -1},
        created_at='2024-08-29T19:19:38.816205+00:00',
        parent_config=None,
        tasks=(PregelTask(id='6d27aa2e-d72b-5504-a36f-8620e54a76dd', name='__start__', error=None, interrupts=()),),
    )
]
```

### Replay
It's also possible to play back a prior graph execution. If we `invoke` a graph with a `thread_id` and a `checkpoint_id`, then we will *re-play* the previously executed steps _before_ the checkpoint that corresponds to the `checkpoint_id`, and only execute the steps _after_ the checkpoint.
* `thread_id` is the ID of a thread.
* `checkpoint_id` is an identifier that refers to a specific checkpoint within a thread.
You must pass these when invoking the graph as part of the `configurable` portion of the config:
```python
config = {"configurable": {"thread_id": "1", "checkpoint_id": "0c62ca34-ac19-445d-bbb0-5b4984975b2a"}}
graph.invoke(None, config=config)
```
Importantly, LangGraph knows whether a particular step has been executed previously. If it has, LangGraph simply *re-plays* that particular step in the graph and does not re-execute the step, but only for the steps _before_ the provided `checkpoint_id`. All of the steps _after_ `checkpoint_id` will be executed (i.e., a new fork), even if they have been executed previously. See this [how to guide on time-travel to learn more about replaying](../how-tos/human_in_the_loop/time-travel.md).

### Update state
In addition to re-playing the graph from specific `checkpoints`, we can also *edit* the graph state. We do this using `graph.update_state()`. This method accepts three different arguments:
#### `config`
The config should contain `thread_id` specifying which thread to update. When only the `thread_id` is passed, we update (or fork) the current state. Optionally, if we include `checkpoint_id` field, then we fork that selected checkpoint.
#### `values`
These are the values that will be used to update the state. Note that this update is treated exactly as any update from a node is treated. This means that these values will be passed to the [reducer](./low_level.md#reducers) functions, if they are defined for some of the channels in the graph state. This means that `update_state` does NOT automatically overwrite the channel values for every channel, but only for the channels without reducers. Let's walk through an example.
Let's assume you have defined the state of your graph with the following schema (see full example above):
```python
from typing import Annotated
from typing_extensions import TypedDict
from operator import add

class State(TypedDict):
    foo: int
    bar: Annotated[list[str], add]
```
Let's now assume the current state of the graph is
```
{"foo": 1, "bar": ["a"]}
```
If you update the state as below:
```
graph.update_state(config, {"foo": 2, "bar": ["b"]})
```
Then the new state of the graph will be:
```
{"foo": 2, "bar": ["a", "b"]}
```
The `foo` key (channel) is completely changed (because there is no reducer specified for that channel, so `update_state` overwrites it). However, there is a reducer specified for the `bar` key, and so it appends `"b"` to the state of `bar`.
#### `as_node`
The final thing you can optionally specify when calling `update_state` is `as_node`. If you provided it, the update will be applied as if it came from node `as_node`. If `as_node` is not provided, it will be set to the last node that updated the state, if not ambiguous. The reason this matters is that the next steps to execute depend on the last node to have given an update, so this can be used to control which node executes next. See this [how to guide on time-travel to learn more about forking state](../how-tos/human_in_the_loop/time-travel.md).
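For example, reusing the `graph` and `config` from the example above, the following call applies the update as if it had come from `node_a`:
```python
# The update is attributed to node_a, so whatever normally follows node_a
# will be scheduled to run next.
graph.update_state(config, {"foo": 2, "bar": ["b"]}, as_node="node_a")
```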

## Memory Store

A [state schema](low_level.md#schema) specifies a set of keys that are populated as a graph is executed. As discussed above, state can be written by a checkpointer to a thread at each graph step, enabling state persistence.
But, what if we want to retain some information *across threads*? Consider the case of a chatbot where we want to retain specific information about the user across *all* chat conversations (e.g., threads) with that user!
With checkpointers alone, we cannot share information across threads. This motivates the need for the [`Store`](../reference/store.md#langgraph.store.base.BaseStore) interface. As an illustration, we can define an `InMemoryStore` to store information about a user across threads. We simply compile our graph with a checkpointer, as before, and with our new `in_memory_store` variable.
!!! info "LangGraph API handles stores automatically"
When using the LangGraph API, you don't need to implement or configure stores manually. The API handles all storage infrastructure for you behind the scenes.
### Basic Usage
First, let's showcase this in isolation without using LangGraph.
```python
from langgraph.store.memory import InMemoryStore
in_memory_store = InMemoryStore()
```
Memories are namespaced by a `tuple`, which in this specific example will be `(<user_id>, "memories")`. The namespace can be any length and represent anything; it does not have to be user-specific.
```python
user_id = "1"
namespace_for_memory = (user_id, "memories")
```
We use the `store.put` method to save memories to our namespace in the store. When we do this, we specify the namespace, as defined above, and a key-value pair for the memory: the key is simply a unique identifier for the memory (`memory_id`) and the value (a dictionary) is the memory itself.
```python
import uuid

memory_id = str(uuid.uuid4())
memory = {"food_preference" : "I like pizza"}
in_memory_store.put(namespace_for_memory, memory_id, memory)
```
We can read out memories in our namespace using the `store.search` method, which will return all memories for a given user as a list. The most recent memory is the last in the list.
```python
memories = in_memory_store.search(namespace_for_memory)
memories[-1].dict()
{'value': {'food_preference': 'I like pizza'},
'key': '07e0caf4-1631-47b7-b15f-65515d4c1843',
'namespace': ['1', 'memories'],
'created_at': '2024-10-02T17:22:31.590602+00:00',
'updated_at': '2024-10-02T17:22:31.590605+00:00'}
```
Each memory type is a Python class ([`Item`](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.Item)) with certain attributes. We can access it as a dictionary by converting via `.dict` as above.
The attributes it has are:
- `value`: The value (itself a dictionary) of this memory
- `key`: A unique key for this memory in this namespace
- `namespace`: A list of strings, the namespace of this memory type
- `created_at`: Timestamp for when this memory was created
- `updated_at`: Timestamp for when this memory was updated
### Semantic Search
Beyond simple retrieval, the store also supports semantic search, allowing you to find memories based on meaning rather than exact matches. To enable this, configure the store with an embedding model:
```python
from langchain.embeddings import init_embeddings
store = InMemoryStore(
index={
"embed": init_embeddings("openai:text-embedding-3-small"), # Embedding provider
"dims": 1536, # Embedding dimensions
"fields": ["food_preference", "$"] # Fields to embed
}
)
```
Now when searching, you can use natural language queries to find relevant memories:
```python
# Find memories about food preferences
# (This can be done after putting memories into the store)
memories = store.search(
namespace_for_memory,
query="What does the user like to eat?",
limit=3 # Return top 3 matches
)
```
You can control which parts of your memories get embedded by configuring the `fields` parameter or by specifying the `index` parameter when storing memories:
```python
# Store with specific fields to embed
store.put(
namespace_for_memory,
str(uuid.uuid4()),
{
"food_preference": "I love Italian cuisine",
"context": "Discussing dinner plans"
},
index=["food_preference"] # Only embed "food_preferences" field
)
# Store without embedding (still retrievable, but not searchable)
store.put(
namespace_for_memory,
str(uuid.uuid4()),
{"system_info": "Last updated: 2024-01-01"},
index=False
)
```
### Using in LangGraph
With this all in place, we use the `in_memory_store` in LangGraph. The `in_memory_store` works hand-in-hand with the checkpointer: the checkpointer saves state to threads, as discussed above, and the `in_memory_store` allows us to store arbitrary information for access *across* threads. We compile the graph with both the checkpointer and the `in_memory_store` as follows.
```python
from langgraph.checkpoint.memory import InMemorySaver
# We need this because we want to enable threads (conversations)
checkpointer = InMemorySaver()
# ... Define the graph ...
# Compile the graph with the checkpointer and store
graph = graph.compile(checkpointer=checkpointer, store=in_memory_store)
```
We invoke the graph with a `thread_id`, as before, and also with a `user_id`, which we'll use to namespace our memories to this particular user as we showed above.
```python
# Invoke the graph
user_id = "1"
config = {"configurable": {"thread_id": "1", "user_id": user_id}}

# First let's just say hi to the AI
for update in graph.stream(
    {"messages": [{"role": "user", "content": "hi"}]}, config, stream_mode="updates"
):
    print(update)
```
We can access the `in_memory_store` and the `user_id` in *any node* by passing `store: BaseStore` and `config: RunnableConfig` as node arguments. Here's how we might use semantic search in a node to find relevant memories:
```python
def update_memory(state: MessagesState, config: RunnableConfig, *, store: BaseStore):
    # Get the user id from the config
    user_id = config["configurable"]["user_id"]

    # Namespace the memory
    namespace = (user_id, "memories")

    # ... Analyze conversation and create a new memory

    # Create a new memory ID
    memory_id = str(uuid.uuid4())

    # We create a new memory
    store.put(namespace, memory_id, {"memory": memory})
```
As we showed above, we can also access the store in any node and use the `store.search` method to get memories. Recall the memories are returned as a list of objects that can be converted to a dictionary.
```python
memories[-1].dict()
{'value': {'food_preference': 'I like pizza'},
'key': '07e0caf4-1631-47b7-b15f-65515d4c1843',
'namespace': ['1', 'memories'],
'created_at': '2024-10-02T17:22:31.590602+00:00',
'updated_at': '2024-10-02T17:22:31.590605+00:00'}
```
We can access the memories and use them in our model call.
```python
def call_model(state: MessagesState, config: RunnableConfig, *, store: BaseStore):
    # Get the user id from the config
    user_id = config["configurable"]["user_id"]

    # Namespace the memory
    namespace = (user_id, "memories")

    # Search based on the most recent message
    memories = store.search(
        namespace,
        query=state["messages"][-1].content,
        limit=3
    )
    info = "\n".join([d.value["memory"] for d in memories])

    # ... Use memories in the model call
```
If we create a new thread, we can still access the same memories so long as the `user_id` is the same.
```python
# Invoke the graph
config = {"configurable": {"thread_id": "2", "user_id": "1"}}

# Let's say hi again
for update in graph.stream(
    {"messages": [{"role": "user", "content": "hi, tell me about my memories"}]}, config, stream_mode="updates"
):
    print(update)
```
When we use LangGraph Platform, either locally (e.g., in LangGraph Studio) or in a deployment, the base store is available by default and does not need to be specified during graph compilation. To enable semantic search, however, you **do** need to configure the indexing settings in your `langgraph.json` file. For example:
```json
{
    ...
    "store": {
        "index": {
            "embed": "openai:text-embedding-3-small",
            "dims": 1536,
            "fields": ["$"]
        }
    }
}
```
See the [deployment guide](../cloud/deployment/semantic_search.md) for more details and configuration options.
## Checkpointer libraries
Under the hood, checkpointing is powered by checkpointer objects that conform to the [BaseCheckpointSaver][langgraph.checkpoint.base.BaseCheckpointSaver] interface. LangGraph provides several checkpointer implementations, all implemented via standalone, installable libraries:
* `langgraph-checkpoint`: The base interface for checkpoint savers ([BaseCheckpointSaver][langgraph.checkpoint.base.BaseCheckpointSaver]) and the serialization/deserialization interface ([SerializerProtocol][langgraph.checkpoint.serde.base.SerializerProtocol]). Includes an in-memory checkpointer implementation ([InMemorySaver][langgraph.checkpoint.memory.InMemorySaver]) for experimentation. LangGraph comes with `langgraph-checkpoint` included.
* `langgraph-checkpoint-sqlite`: An implementation of the LangGraph checkpointer that uses a SQLite database ([SqliteSaver][langgraph.checkpoint.sqlite.SqliteSaver] / [AsyncSqliteSaver][langgraph.checkpoint.sqlite.aio.AsyncSqliteSaver]). Ideal for experimentation and local workflows. Needs to be installed separately.
* `langgraph-checkpoint-postgres`: An advanced checkpointer that uses a Postgres database ([PostgresSaver][langgraph.checkpoint.postgres.PostgresSaver] / [AsyncPostgresSaver][langgraph.checkpoint.postgres.aio.AsyncPostgresSaver]), used in LangGraph Platform. Ideal for production use. Needs to be installed separately.
### Checkpointer interface
Each checkpointer conforms to [BaseCheckpointSaver][langgraph.checkpoint.base.BaseCheckpointSaver] interface and implements the following methods:
* `.put` - Store a checkpoint with its configuration and metadata.
* `.put_writes` - Store intermediate writes linked to a checkpoint (i.e. [pending writes](#pending-writes)).
* `.get_tuple` - Fetch a checkpoint tuple for a given configuration (`thread_id` and `checkpoint_id`). This is used to populate `StateSnapshot` in `graph.get_state()`.
* `.list` - List checkpoints that match a given configuration and filter criteria. This is used to populate state history in `graph.get_state_history()`.
If the checkpointer is used with asynchronous graph execution (i.e. executing the graph via `.ainvoke`, `.astream`, `.abatch`), asynchronous versions of the above methods will be used (`.aput`, `.aput_writes`, `.aget_tuple`, `.alist`).
!!! note
For running your graph asynchronously, you can use `InMemorySaver`, or async versions of Sqlite/Postgres checkpointers -- `AsyncSqliteSaver` / `AsyncPostgresSaver` checkpointers.
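For instance, here is a sketch of running the earlier `workflow` example asynchronously with `AsyncSqliteSaver`. It assumes `langgraph-checkpoint-sqlite` is installed and that `from_conn_string` is used as an async context manager:
```python
import asyncio

from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver


async def main():
    # AsyncSqliteSaver.from_conn_string manages the database connection for us.
    async with AsyncSqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
        graph = workflow.compile(checkpointer=checkpointer)
        config = {"configurable": {"thread_id": "1"}}
        await graph.ainvoke({"foo": ""}, config)


asyncio.run(main())
```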
### Serializer
When checkpointers save the graph state, they need to serialize the channel values in the state. This is done using serializer objects.
`langgraph_checkpoint` defines the [protocol][langgraph.checkpoint.serde.base.SerializerProtocol] for implementing serializers and provides a default implementation ([JsonPlusSerializer][langgraph.checkpoint.serde.jsonplus.JsonPlusSerializer]) that handles a wide variety of types, including LangChain and LangGraph primitives, datetimes, enums and more.
#### Serialization with `pickle`
The default serializer, [`JsonPlusSerializer`][langgraph.checkpoint.serde.jsonplus.JsonPlusSerializer], uses ormsgpack and JSON under the hood, which are not suitable for all types of objects.
If you want to fall back to pickle for objects not currently supported by our msgpack encoder (such as Pandas dataframes),
you can use the `pickle_fallback` argument of the `JsonPlusSerializer`:
```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.serde.jsonplus import JsonPlusSerializer
# ... Define the graph ...
graph.compile(
checkpointer=MemorySaver(serde=JsonPlusSerializer(pickle_fallback=True))
)
```
#### Encryption
Checkpointers can optionally encrypt all persisted state. To enable this, pass an instance of [`EncryptedSerializer`][langgraph.checkpoint.serde.encrypted.EncryptedSerializer] to the `serde` argument of any `BaseCheckpointSaver` implementation. The easiest way to create an encrypted serializer is via [`from_pycryptodome_aes`][langgraph.checkpoint.serde.encrypted.EncryptedSerializer.from_pycryptodome_aes], which reads the AES key from the `LANGGRAPH_AES_KEY` environment variable (or accepts a `key` argument):
```python
import sqlite3
from langgraph.checkpoint.serde.encrypted import EncryptedSerializer
from langgraph.checkpoint.sqlite import SqliteSaver
serde = EncryptedSerializer.from_pycryptodome_aes() # reads LANGGRAPH_AES_KEY
checkpointer = SqliteSaver(sqlite3.connect("checkpoint.db"), serde=serde)
```
```python
from langgraph.checkpoint.serde.encrypted import EncryptedSerializer
from langgraph.checkpoint.postgres import PostgresSaver
serde = EncryptedSerializer.from_pycryptodome_aes()
checkpointer = PostgresSaver.from_conn_string("postgresql://...", serde=serde)
checkpointer.setup()
```
When running on LangGraph Platform, encryption is automatically enabled whenever `LANGGRAPH_AES_KEY` is present, so you only need to provide the environment variable. Other encryption schemes can be used by implementing [`CipherProtocol`][langgraph.checkpoint.serde.base.CipherProtocol] and supplying it to `EncryptedSerializer`.
## Capabilities
### Human-in-the-loop
First, checkpointers facilitate [human-in-the-loop workflows](agentic_concepts.md#human-in-the-loop) by allowing humans to inspect, interrupt, and approve graph steps. Checkpointers are needed for these workflows because the human has to be able to view the state of a graph at any point in time, and the graph has to be able to resume execution after the human has made any updates to the state. See [these how-to guides](../how-tos/human_in_the_loop/breakpoints.md) for concrete examples.
### Memory
Second, checkpointers allow for ["memory"](../concepts/memory.md) between interactions. In the case of repeated human interactions (like conversations) any follow up messages can be sent to that thread, which will retain its memory of previous ones. See [Add memory](../how-tos/memory/add-memory.md) for information on how to add and manage conversation memory using checkpointers.
### Time Travel
Third, checkpointers allow for ["time travel"](time-travel.md), allowing users to replay prior graph executions to review and / or debug specific graph steps. In addition, checkpointers make it possible to fork the graph state at arbitrary checkpoints to explore alternative trajectories.
### Fault-tolerance
Lastly, checkpointing also provides fault-tolerance and error recovery: if one or more nodes fail at a given superstep, you can restart your graph from the last successful step.
#### Pending writes
Additionally, when a graph node fails mid-execution at a given superstep, LangGraph stores pending checkpoint writes from any other nodes that completed successfully at that superstep, so that whenever we resume graph execution from that superstep we don't re-run the successful nodes.
---
concepts/pregel.md
---
---
search:
boost: 2
---
# LangGraph runtime
[Pregel][langgraph.pregel.Pregel] implements LangGraph's runtime, managing the execution of LangGraph applications.
Compiling a [StateGraph][langgraph.graph.StateGraph] or creating an [entrypoint][langgraph.func.entrypoint] produces a [Pregel][langgraph.pregel.Pregel] instance that can be invoked with input.
This guide explains the runtime at a high level and provides instructions for directly implementing applications with Pregel.
> **Note:** The [Pregel][langgraph.pregel.Pregel] runtime is named after [Google's Pregel algorithm](https://research.google/pubs/pub37252/), which describes an efficient method for large-scale parallel computation using graphs.
## Overview
In LangGraph, Pregel combines [**actors**](https://en.wikipedia.org/wiki/Actor_model) and **channels** into a single application. **Actors** read data from channels and write data to channels. Pregel organizes the execution of the application into multiple steps, following the **Pregel Algorithm**/**Bulk Synchronous Parallel** model.
Each step consists of three phases:
- **Plan**: Determine which **actors** to execute in this step. For example, in the first step, select the **actors** that subscribe to the special **input** channels; in subsequent steps, select the **actors** that subscribe to channels updated in the previous step.
- **Execution**: Execute all selected **actors** in parallel, until all complete, or one fails, or a timeout is reached. During this phase, channel updates are invisible to actors until the next step.
- **Update**: Update the channels with the values written by the **actors** in this step.
Repeat until no **actors** are selected for execution, or a maximum number of steps is reached.
## Actors
An **actor** is a `PregelNode`. It subscribes to channels, reads data from them, and writes data to them. It can be thought of as an **actor** in the Pregel algorithm. `PregelNodes` implement LangChain's Runnable interface.
## Channels
Channels are used to communicate between actors (PregelNodes). Each channel has a value type, an update type, and an update function – which takes a sequence of updates and modifies the stored value. Channels can be used to send data from one chain to another, or to send data from a chain to itself in a future step. LangGraph provides a number of built-in channels:
- [LastValue][langgraph.channels.LastValue]: The default channel, stores the last value sent to the channel, useful for input and output values, or for sending data from one step to the next.
- [Topic][langgraph.channels.Topic]: A configurable PubSub Topic, useful for sending multiple values between **actors**, or for accumulating output. Can be configured to deduplicate values or to accumulate values over the course of multiple steps.
- [BinaryOperatorAggregate][langgraph.channels.BinaryOperatorAggregate]: stores a persistent value, updated by applying a binary operator to the current value and each update sent to the channel, useful for computing aggregates over multiple steps; e.g., `total = BinaryOperatorAggregate(int, operator.add)`
## Examples
While most users will interact with Pregel through the [StateGraph][langgraph.graph.StateGraph] API or
the [entrypoint][langgraph.func.entrypoint] decorator, it is possible to interact with Pregel directly.
Below are a few different examples to give you a sense of the Pregel API.
=== "Single node"
```python
from langgraph.channels import EphemeralValue
from langgraph.pregel import Pregel, NodeBuilder

node1 = (
    NodeBuilder().subscribe_only("a")
    .do(lambda x: x + x)
    .write_to("b")
)

app = Pregel(
    nodes={"node1": node1},
    channels={
        "a": EphemeralValue(str),
        "b": EphemeralValue(str),
    },
    input_channels=["a"],
    output_channels=["b"],
)

app.invoke({"a": "foo"})
```
```pycon
{'b': 'foofoo'}
```
=== "Multiple nodes"
```python
from langgraph.channels import LastValue, EphemeralValue
from langgraph.pregel import Pregel, NodeBuilder

node1 = (
    NodeBuilder().subscribe_only("a")
    .do(lambda x: x + x)
    .write_to("b")
)

node2 = (
    NodeBuilder().subscribe_only("b")
    .do(lambda x: x + x)
    .write_to("c")
)

app = Pregel(
    nodes={"node1": node1, "node2": node2},
    channels={
        "a": EphemeralValue(str),
        "b": LastValue(str),
        "c": EphemeralValue(str),
    },
    input_channels=["a"],
    output_channels=["b", "c"],
)

app.invoke({"a": "foo"})
```
```pycon
{'b': 'foofoo', 'c': 'foofoofoofoo'}
```
=== "Topic"
```python
from langgraph.channels import EphemeralValue, Topic
from langgraph.pregel import Pregel, NodeBuilder
node1 = (
NodeBuilder().subscribe_only("a")
.do(lambda x: x + x)
.write_to("b", "c")
)
node2 = (
NodeBuilder().subscribe_to("b")
.do(lambda x: x["b"] + x["b"])
.write_to("c")
)
app = Pregel(
nodes={"node1": node1, "node2": node2},
channels={
"a": EphemeralValue(str),
"b": EphemeralValue(str),
"c": Topic(str, accumulate=True),
},
input_channels=["a"],
output_channels=["c"],
)
app.invoke({"a": "foo"})
```
```pycon
{'c': ['foofoo', 'foofoofoofoo']}
```
=== "BinaryOperatorAggregate"
This example demonstrates how to use the `BinaryOperatorAggregate` channel to implement a reducer.
```python
from langgraph.channels import EphemeralValue, BinaryOperatorAggregate
from langgraph.pregel import Pregel, NodeBuilder
node1 = (
NodeBuilder().subscribe_only("a")
.do(lambda x: x + x)
.write_to("b", "c")
)
node2 = (
NodeBuilder().subscribe_only("b")
.do(lambda x: x + x)
.write_to("c")
)
def reducer(current, update):
if current:
return current + " | " + update
else:
return update
app = Pregel(
nodes={"node1": node1, "node2": node2},
channels={
"a": EphemeralValue(str),
"b": EphemeralValue(str),
"c": BinaryOperatorAggregate(str, operator=reducer),
},
input_channels=["a"],
output_channels=["c"],
)
app.invoke({"a": "foo"})
```
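Given the reducer above, the first update to `c` is stored as-is and the second is joined with `" | "`, so the output should look roughly like this:
```pycon
{'c': 'foofoo | foofoofoofoo'}
```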
=== "Cycle"
This example demonstrates how to introduce a cycle in the graph, by having
a chain write to a channel it subscribes to. Execution will continue
until a None value is written to the channel.
```python
from langgraph.channels import EphemeralValue
from langgraph.pregel import Pregel, NodeBuilder, ChannelWriteEntry
example_node = (
NodeBuilder().subscribe_only("value")
.do(lambda x: x + x if len(x) < 10 else None)
.write_to(ChannelWriteEntry("value", skip_none=True))
)
app = Pregel(
nodes={"example_node": example_node},
channels={
"value": EphemeralValue(str),
},
input_channels=["value"],
output_channels=["value"],
)
app.invoke({"value": "a"})
```
```pycon
{'value': 'aaaaaaaaaaaaaaaa'}
```
## High-level API
LangGraph provides two high-level APIs for creating a Pregel application: the [StateGraph (Graph API)](./low_level.md) and the [Functional API](functional_api.md).
=== "StateGraph (Graph API)"
The [StateGraph (Graph API)][langgraph.graph.StateGraph] is a higher-level abstraction that simplifies the creation of Pregel applications. It allows you to define a graph of nodes and edges. When you compile the graph, the StateGraph API automatically creates the Pregel application for you.
```python
from typing import TypedDict, Optional
from langgraph.constants import START
from langgraph.graph import StateGraph
class Essay(TypedDict):
topic: str
content: Optional[str]
score: Optional[float]
def write_essay(essay: Essay):
return {
"content": f"Essay about {essay['topic']}",
}
def score_essay(essay: Essay):
return {
"score": 10
}
builder = StateGraph(Essay)
builder.add_node(write_essay)
builder.add_node(score_essay)
builder.add_edge(START, "write_essay")
# Compile the graph.
# This will return a Pregel instance.
graph = builder.compile()
```
The compiled Pregel instance will be associated with a list of nodes and channels. You can inspect the nodes and channels by printing them.
```python
print(graph.nodes)
```
You will see something like this:
```pycon
{'__start__': <PregelNode ...>,
 'write_essay': <PregelNode ...>,
 'score_essay': <PregelNode ...>}
```
```python
print(graph.channels)
```
You should see something like this:
```pycon
{'topic': <LastValue ...>,
 'content': <LastValue ...>,
 'score': <LastValue ...>,
 '__start__': <EphemeralValue ...>,
 'write_essay': <EphemeralValue ...>,
 'score_essay': <EphemeralValue ...>,
 'branch:__start__:__self__:write_essay': <EphemeralValue ...>,
 'branch:__start__:__self__:score_essay': <EphemeralValue ...>,
 'branch:write_essay:__self__:write_essay': <EphemeralValue ...>,
 'branch:write_essay:__self__:score_essay': <EphemeralValue ...>,
 'branch:score_essay:__self__:write_essay': <EphemeralValue ...>,
 'branch:score_essay:__self__:score_essay': <EphemeralValue ...>,
 'start:write_essay': <EphemeralValue ...>}
```
=== "Functional API"
In the [Functional API](functional_api.md), you can use an [`entrypoint`][langgraph.func.entrypoint] to create
a Pregel application. The `entrypoint` decorator allows you to define a function that takes input and returns output.
```python
from typing import TypedDict, Optional
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.func import entrypoint
class Essay(TypedDict):
topic: str
content: Optional[str]
score: Optional[float]
checkpointer = InMemorySaver()
@entrypoint(checkpointer=checkpointer)
def write_essay(essay: Essay):
return {
"content": f"Essay about {essay['topic']}",
}
print("Nodes: ")
print(write_essay.nodes)
print("Channels: ")
print(write_essay.channels)
```
```pycon
Nodes:
{'write_essay': <PregelNode ...>}
Channels:
{'__start__': <EphemeralValue ...>, '__end__': <LastValue ...>, '__previous__': <LastValue ...>}
```
---
concepts/memory.md
---
---
search:
boost: 2
---
# Memory
[Memory](../how-tos/memory/add-memory.md) is a system that remembers information about previous interactions. For AI agents, memory is crucial because it lets them remember previous interactions, learn from feedback, and adapt to user preferences. As agents tackle more complex tasks with numerous user interactions, this capability becomes essential for both efficiency and user satisfaction.
This conceptual guide covers two types of memory, based on their recall scope:
- [Short-term memory](#short-term-memory), or [thread](persistence.md#threads)-scoped memory, tracks the ongoing conversation by maintaining message history within a session. LangGraph manages short-term memory as a part of your agent's [state](low_level.md#state). State is persisted to a database using a [checkpointer](persistence.md#checkpoints) so the thread can be resumed at any time. Short-term memory updates when the graph is invoked or a step is completed, and the State is read at the start of each step.
- [Long-term memory](#long-term-memory) stores user-specific or application-level data across sessions and is shared _across_ conversational threads. It can be recalled _at any time_ and _in any thread_. Memories are scoped to any custom namespace, not just within a single thread ID. LangGraph provides [stores](persistence.md#memory-store) ([reference doc](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.BaseStore)) to let you save and recall long-term memories.

## Short-term memory
[Short-term memory](../how-tos/memory/add-memory.md#add-short-term-memory) lets your application remember previous interactions within a single [thread](persistence.md#threads) or conversation. A [thread](persistence.md#threads) organizes multiple interactions in a session, similar to the way email groups messages in a single conversation.
LangGraph manages short-term memory as part of the agent's state, persisted via thread-scoped checkpoints. This state typically includes the conversation history along with other stateful data, such as uploaded files, retrieved documents, or generated artifacts. By storing these in the graph's state, the bot can access the full context for a given conversation while maintaining separation between different threads.
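As a minimal sketch, enabling short-term memory amounts to compiling your graph with a checkpointer and invoking it with a `thread_id` (the node body below is a placeholder for your model call):
```python
from langchain_core.messages import AIMessage
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, MessagesState, START

def chatbot(state: MessagesState):
    # Placeholder: call your chat model with state["messages"] here
    return {"messages": [AIMessage("Hello!")]}

builder = StateGraph(MessagesState)
builder.add_node(chatbot)
builder.add_edge(START, "chatbot")
graph = builder.compile(checkpointer=InMemorySaver())

# Invocations that share a thread_id resume the same conversation history.
config = {"configurable": {"thread_id": "thread-1"}}
graph.invoke({"messages": [("user", "Hi there!")]}, config)
```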
### Manage short-term memory
Conversation history is the most common form of short-term memory, and long conversations pose a challenge to today's LLMs. A full history may not fit inside an LLM's context window, resulting in an irrecoverable error. Even if your LLM supports the full context length, most LLMs still perform poorly over long contexts. They get "distracted" by stale or off-topic content, all while suffering from slower response times and higher costs.
Chat models accept context using messages, which include developer-provided instructions (a system message) and user inputs (human messages). In chat applications, messages alternate between human inputs and model responses, resulting in a list of messages that grows longer over time. Because context windows are limited and token-rich message lists can be costly, many applications can benefit from using techniques to manually remove or forget stale information.
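For example, one lightweight technique is trimming older messages before each model call. Below is a minimal sketch using LangChain's `trim_messages` utility; the messages and budget are illustrative.
```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

history = [
    SystemMessage("You are a helpful assistant."),
    HumanMessage("Hi, I'm Lance."),
    AIMessage("Hello Lance!"),
    HumanMessage("What's my name?"),
]

# token_counter=len treats each message as one unit, so max_tokens=3 keeps
# roughly the last three messages while always retaining the system message.
trimmed = trim_messages(
    history,
    strategy="last",
    token_counter=len,
    max_tokens=3,
    include_system=True,
)
```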

For more information on common techniques for managing messages, see the [Add and manage memory](../how-tos/memory/add-memory.md#manage-short-term-memory) guide.
## Long-term memory
[Long-term memory](../how-tos/memory/add-memory.md#add-long-term-memory) in LangGraph allows systems to retain information across different conversations or sessions. Unlike short-term memory, which is **thread-scoped**, long-term memory is saved within custom "namespaces."
Long-term memory is a complex challenge without a one-size-fits-all solution. However, the following questions provide a framework to help you navigate the different techniques:
- [What is the type of memory?](#memory-types) Humans use memories to remember facts ([semantic memory](#semantic-memory)), experiences ([episodic memory](#episodic-memory)), and rules ([procedural memory](#procedural-memory)). AI agents can use memory in the same ways. For example, AI agents can use memory to remember specific facts about a user to accomplish a task.
- [When do you want to update memories?](#writing-memories) Memory can be updated as part of an agent's application logic (e.g., "on the hot path"). In this case, the agent typically decides to remember facts before responding to a user. Alternatively, memory can be updated as a background task (logic that runs in the background / asynchronously and generates memories). We explain the tradeoffs between these approaches in the [section below](#writing-memories).
### Memory types
Different applications require various types of memory. Although the analogy isn't perfect, examining [human memory types](https://www.psychologytoday.com/us/basics/memory/types-of-memory?ref=blog.langchain.dev) can be insightful. Some research (e.g., the [CoALA paper](https://arxiv.org/pdf/2309.02427)) has even mapped these human memory types to those used in AI agents.
| Memory Type | What is Stored | Human Example | Agent Example |
|-------------|----------------|---------------|---------------|
| [Semantic](#semantic-memory) | Facts | Things I learned in school | Facts about a user |
| [Episodic](#episodic-memory) | Experiences | Things I did | Past agent actions |
| [Procedural](#procedural-memory) | Instructions | Instincts or motor skills | Agent system prompt |
#### Semantic memory
[Semantic memory](https://en.wikipedia.org/wiki/Semantic_memory), both in humans and AI agents, involves the retention of specific facts and concepts. In humans, it can include information learned in school and the understanding of concepts and their relationships. For AI agents, semantic memory is often used to personalize applications by remembering facts or concepts from past interactions.
!!! note
Semantic memory is different from "semantic search," which is a technique for finding similar content using "meaning" (usually as embeddings). Semantic memory is a term from psychology, referring to storing facts and knowledge, while semantic search is a method for retrieving information based on meaning rather than exact matches.
##### Profile
Semantic memories can be managed in different ways. For example, memories can be a single, continuously updated "profile" of well-scoped and specific information about a user, organization, or other entity (including the agent itself). A profile is generally just a JSON document with various key-value pairs you've selected to represent your domain.
When remembering a profile, you will want to make sure that you are **updating** the profile each time. As a result, you will want to pass in the previous profile and [ask the model to generate a new profile](https://github.com/langchain-ai/memory-template) (or some [JSON patch](https://github.com/hinthornw/trustcall) to apply to the old profile). This can become error-prone as the profile gets larger; it may help to split the profile into multiple documents, or to use **strict** decoding when generating documents to ensure the memory schema remains valid.
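For illustration, a profile can be as simple as a small, well-scoped document like the one below; the fields are hypothetical and entirely up to your domain.
```python
from typing_extensions import TypedDict

# A hypothetical profile schema for a single user
class UserProfile(TypedDict, total=False):
    name: str
    language: str
    preferences: list[str]

profile: UserProfile = {
    "name": "Lance",
    "language": "English",
    "preferences": ["short, direct responses"],
}
```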

##### Collection
Alternatively, memories can be a collection of documents that are continuously updated and extended over time. Each individual memory can be more narrowly scoped and easier to generate, which means that you're less likely to **lose** information over time. It's easier for an LLM to generate _new_ objects for new information than reconcile new information with an existing profile. As a result, a document collection tends to lead to [higher recall downstream](https://en.wikipedia.org/wiki/Precision_and_recall).
However, this shifts some complexity to memory updating. The model must now _delete_ or _update_ existing items in the list, which can be tricky. In addition, some models may default to over-inserting and others may default to over-updating. See the [Trustcall](https://github.com/hinthornw/trustcall) package for one way to manage this, and consider evaluation (e.g., with a tool like [LangSmith](https://docs.smith.langchain.com/tutorials/Developers/evaluation)) to help you tune the behavior.
Working with document collections also shifts complexity to memory **search** over the list. The `Store` currently supports both [semantic search](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.SearchOp.query) and [filtering by content](https://langchain-ai.github.io/langgraph/reference/store/#langgraph.store.base.SearchOp.filter).
Finally, using a collection of memories can make it challenging to provide comprehensive context to the model. While individual memories may follow a specific schema, this structure might not capture the full context or relationships between memories. As a result, when using these memories to generate responses, the model may lack important contextual information that would be more readily available in a unified profile approach.
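A minimal sketch of the collection approach with the LangGraph store (the namespace and facts are illustrative): each new memory is saved as its own document under a fresh key, and retrieval searches the namespace instead of reading back one large profile.
```python
import uuid
from langgraph.store.memory import InMemoryStore

store = InMemoryStore()
namespace = ("user-123", "memories")

# Each new fact becomes its own document, keyed by a fresh UUID.
store.put(namespace, str(uuid.uuid4()), {"fact": "Prefers Python over JavaScript"})
store.put(namespace, str(uuid.uuid4()), {"fact": "Works on a data platform team"})

# Later, search the collection rather than loading a single profile document.
memories = store.search(namespace)
```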

Regardless of memory management approach, the central point is that the agent will use the semantic memories to [ground its responses](https://python.langchain.com/docs/concepts/rag/), which often leads to more personalized and relevant interactions.
#### Episodic memory
[Episodic memory](https://en.wikipedia.org/wiki/Episodic_memory), in both humans and AI agents, involves recalling past events or actions. The [CoALA paper](https://arxiv.org/pdf/2309.02427) frames this well: facts can be written to semantic memory, whereas *experiences* can be written to episodic memory. For AI agents, episodic memory is often used to help an agent remember how to accomplish a task.
In practice, episodic memories are often implemented through [few-shot example prompting](https://python.langchain.com/docs/concepts/few_shot_prompting/), where agents learn from past sequences to perform tasks correctly. Sometimes it's easier to "show" than "tell" and LLMs learn well from examples. Few-shot learning lets you ["program"](https://x.com/karpathy/status/1627366413840322562) your LLM by updating the prompt with input-output examples to illustrate the intended behavior. While various [best-practices](https://python.langchain.com/docs/concepts/#1-generating-examples) can be used to generate few-shot examples, often the challenge lies in selecting the most relevant examples based on user input.
Note that the memory [store](persistence.md#memory-store) is just one way to store data as few-shot examples. If you want more developer involvement, or want to tie few-shots more closely to your evaluation harness, you can also use a [LangSmith Dataset](https://docs.smith.langchain.com/evaluation/how_to_guides/datasets/index_datasets_for_dynamic_few_shot_example_selection) to store your data. Dynamic few-shot example selectors can then be used out of the box to achieve this same goal: LangSmith indexes the dataset for you and retrieves the few-shot examples most relevant to the user input based on keyword similarity ([using a BM25-like algorithm](https://docs.smith.langchain.com/how_to_guides/datasets/index_datasets_for_dynamic_few_shot_example_selection)).
See this how-to [video](https://www.youtube.com/watch?v=37VaU7e7t5o) for example usage of dynamic few-shot example selection in LangSmith. Also, see this [blog post](https://blog.langchain.dev/few-shot-prompting-to-improve-tool-calling-performance/) showcasing few-shot prompting to improve tool calling performance and this [blog post](https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/) using few-shot examples to align an LLM with human preferences.
#### Procedural memory
[Procedural memory](https://en.wikipedia.org/wiki/Procedural_memory), in both humans and AI agents, involves remembering the rules used to perform tasks. In humans, procedural memory is like the internalized knowledge of how to perform tasks, such as riding a bike via basic motor skills and balance. Episodic memory, on the other hand, involves recalling specific experiences, such as the first time you successfully rode a bike without training wheels or a memorable bike ride through a scenic route. For AI agents, procedural memory is a combination of model weights, agent code, and the agent's prompt that collectively determine the agent's functionality.
In practice, it is fairly uncommon for agents to modify their model weights or rewrite their code. However, it is more common for agents to modify their own prompts.
One effective approach to refining an agent's instructions is through ["Reflection"](https://blog.langchain.dev/reflection-agents/) or meta-prompting. This involves prompting the agent with its current instructions (e.g., the system prompt) along with recent conversations or explicit user feedback. The agent then refines its own instructions based on this input. This method is particularly useful for tasks where instructions are challenging to specify upfront, as it allows the agent to learn and adapt from its interactions.
For example, we built a [Tweet generator](https://www.youtube.com/watch?v=Vn8A3BxfplE) using external feedback and prompt re-writing to produce high-quality paper summaries for Twitter. In this case, the specific summarization prompt was difficult to specify *a priori*, but it was fairly easy for a user to critique the generated Tweets and provide feedback on how to improve the summarization process.
The pseudo-code below shows how you might implement this with the LangGraph memory [store](persistence.md#memory-store): the store saves the prompt, the `update_instructions` node gets the current prompt (as well as feedback from the conversation with the user captured in `state["messages"]`), updates it, and saves the new prompt back to the store. The `call_model` node then gets the updated prompt from the store and uses it to generate a response.
```python
# Node that *uses* the instructions
def call_model(state: State, store: BaseStore):
    namespace = ("agent_instructions",)
    # store.get returns a single Item; its payload is available under .value
    instructions = store.get(namespace, key="agent_a")
    # Application logic
    prompt = prompt_template.format(instructions=instructions.value["instructions"])
    ...

# Node that updates instructions
def update_instructions(state: State, store: BaseStore):
    namespace = ("agent_instructions",)
    current_instructions = store.search(namespace)[0]
    # Memory logic
    prompt = prompt_template.format(
        instructions=current_instructions.value["instructions"],
        conversation=state["messages"],
    )
    output = llm.invoke(prompt)
    new_instructions = output["new_instructions"]
    store.put(namespace, "agent_a", {"instructions": new_instructions})
    ...
```

### Writing memories
There are two primary methods for agents to write memories: ["in the hot path"](#in-the-hot-path) and ["in the background"](#in-the-background).

#### In the hot path
Creating memories during runtime offers both advantages and challenges. On the positive side, this approach allows for real-time updates, making new memories immediately available for use in subsequent interactions. It also enables transparency, as users can be notified when memories are created and stored.
However, this method also presents challenges. It may increase complexity if the agent requires a new tool to decide what to commit to memory. In addition, the process of reasoning about what to save to memory can impact agent latency. Finally, the agent must multitask between memory creation and its other responsibilities, potentially affecting the quantity and quality of memories created.
As an example, ChatGPT uses a [save_memories](https://openai.com/index/memory-and-new-controls-for-chatgpt/) tool to upsert memories as content strings, deciding whether and how to use this tool with each user message. See our [memory-agent](https://github.com/langchain-ai/memory-agent) template as a reference implementation.
#### In the background
Creating memories as a separate background task offers several advantages. It eliminates latency in the primary application, separates application logic from memory management, and allows for more focused task completion by the agent. This approach also provides flexibility in timing memory creation to avoid redundant work.
However, this method has its own challenges. Determining the frequency of memory writing becomes crucial, as infrequent updates may leave other threads without new context. Deciding when to trigger memory formation is also important. Common strategies include scheduling after a set time period (with rescheduling if new events occur), using a cron schedule, or allowing manual triggers by users or the application logic.
See our [memory-service](https://github.com/langchain-ai/memory-template) template as a reference implementation.
### Memory storage
LangGraph stores long-term memories as JSON documents in a [store](persistence.md#memory-store). Each memory is organized under a custom `namespace` (similar to a folder) and a distinct `key` (like a file name). Namespaces often include user or org IDs or other labels that make it easier to organize information. This structure enables hierarchical organization of memories. Cross-namespace searching is then supported through content filters.
```python
from langgraph.store.memory import InMemoryStore
def embed(texts: list[str]) -> list[list[float]]:
    # Replace with an actual embedding function or LangChain embeddings object
    return [[1.0, 2.0]] * len(texts)  # one 2-dimensional vector per input text
# InMemoryStore saves data to an in-memory dictionary. Use a DB-backed store in production.
store = InMemoryStore(index={"embed": embed, "dims": 2})
user_id = "my-user"
application_context = "chitchat"
namespace = (user_id, application_context)
store.put(
namespace,
"a-memory",
{
"rules": [
"User likes short, direct language",
"User only speaks English & python",
],
"my-key": "my-value",
},
)
# get the "memory" by ID
item = store.get(namespace, "a-memory")
# search for "memories" within this namespace, filtering on content equivalence, sorted by vector similarity
items = store.search(
namespace, filter={"my-key": "my-value"}, query="language preferences"
)
```
For more information about the memory store, see the [Persistence](persistence.md#memory-store) guide.
---
concepts/deployment_options.md
---
---
search:
boost: 2
---
# Deployment Options
## Free deployment
There are two free options for deploying LangGraph applications via the LangGraph Server:
1. [Local](../tutorials/langgraph-platform/local-server.md): Deploy for local testing and development.
1. [Standalone Container (Lite)](../concepts/langgraph_standalone_container.md): A limited version of Standalone Container for deployments unlikely to see more than 1 million node executions per year and that do not need crons or other enterprise features. The Standalone Container (Lite) deployment option is free with a LangSmith API key.
## Production deployment
There are 4 main options for deploying with the [LangGraph Platform](langgraph_platform.md):
1. [Cloud SaaS](#cloud-saas)
1. [Self-Hosted Data Plane](#self-hosted-data-plane)
1. [Self-Hosted Control Plane](#self-hosted-control-plane)
1. [Standalone Container](#standalone-container)
A quick comparison:
| | **Cloud SaaS** | **Self-Hosted Data Plane** | **Self-Hosted Control Plane** | **Standalone Container** |
|----------------------|----------------|----------------------------|-------------------------------|--------------------------|
| **[Control plane UI/API](../concepts/langgraph_control_plane.md)** | Yes | Yes | Yes | No |
| **CI/CD** | Managed internally by platform | Managed externally by you | Managed externally by you | Managed externally by you |
| **Data/compute residency** | LangChain's cloud | Your cloud | Your cloud | Your cloud |
| **LangSmith compatibility** | Trace to LangSmith SaaS | Trace to LangSmith SaaS | Trace to Self-Hosted LangSmith | Optional tracing |
| **[Server version compatibility](../concepts/langgraph_server.md#server-versions)** | Enterprise | Enterprise | Enterprise | Lite, Enterprise |
| **[Pricing](https://www.langchain.com/pricing-langgraph-platform)** | Plus | Enterprise | Enterprise | Developer |
## Cloud SaaS
The [Cloud SaaS](./langgraph_cloud.md) deployment option is a fully managed model for deployment where we manage the [control plane](./langgraph_control_plane.md) and [data plane](./langgraph_data_plane.md) in our cloud. This option provides a simple way to deploy and manage your LangGraph Servers.
Connect your GitHub repositories to the platform and deploy your LangGraph Servers from the [control plane UI](./langgraph_control_plane.md#control-plane-ui). The build process (i.e. CI/CD) is managed internally by the platform.
For more information, please see:
* [Cloud SaaS Conceptual Guide](./langgraph_cloud.md)
* [How to deploy to Cloud SaaS](../cloud/deployment/cloud.md)
## Self-Hosted Data Plane
!!! info "Important"
The Self-Hosted Data Plane deployment option requires an [Enterprise](../concepts/plans.md) plan.
The [Self-Hosted Data Plane](./langgraph_self_hosted_data_plane.md) deployment option is a "hybrid" model for deployment where we manage the [control plane](./langgraph_control_plane.md) in our cloud and you manage the [data plane](./langgraph_data_plane.md) in your cloud. This option provides a way to securely manage your data plane infrastructure, while offloading control plane management to us.
Build a Docker image using the [LangGraph CLI](./langgraph_cli.md) and deploy your LangGraph Server from the [control plane UI](./langgraph_control_plane.md#control-plane-ui).
Supported Compute Platforms: [Kubernetes](https://kubernetes.io/), [Amazon ECS](https://aws.amazon.com/ecs/) (coming soon!)
For more information, please see:
* [Self-Hosted Data Plane Conceptual Guide](./langgraph_self_hosted_data_plane.md)
* [How to deploy the Self-Hosted Data Plane](../cloud/deployment/self_hosted_data_plane.md)
## Self-Hosted Control Plane
!!! info "Important"
The Self-Hosted Control Plane deployment option requires an [Enterprise](../concepts/plans.md) plan.
The [Self-Hosted Control Plane](./langgraph_self_hosted_control_plane.md) deployment option is a fully self-hosted model for deployment where you manage the [control plane](./langgraph_control_plane.md) and [data plane](./langgraph_data_plane.md) in your cloud. This option gives you full control and responsibility of the control plane and data plane infrastructure.
Build a Docker image using the [LangGraph CLI](./langgraph_cli.md) and deploy your LangGraph Server from the [control plane UI](./langgraph_control_plane.md#control-plane-ui).
Supported Compute Platforms: [Kubernetes](https://kubernetes.io/)
For more information, please see:
* [Self-Hosted Control Plane Conceptual Guide](./langgraph_self_hosted_control_plane.md)
* [How to deploy the Self-Hosted Control Plane](../cloud/deployment/self_hosted_control_plane.md)
## Standalone Container
The [Standalone Container](./langgraph_standalone_container.md) deployment option is the least restrictive model for deployment. Deploy standalone instances of a LangGraph Server in your cloud, using any of the [available](./plans.md) license options.
Build a Docker image using the [LangGraph CLI](./langgraph_cli.md) and deploy your LangGraph Server using the container deployment tooling of your choice. Images can be deployed to any compute platform.
For more information, please see:
* [Standalone Container Conceptual Guide](./langgraph_standalone_container.md)
* [How to deploy a Standalone Container](../cloud/deployment/standalone_container.md)
## Related
For more information, please see:
* [LangGraph Platform plans](./plans.md)
* [LangGraph Platform pricing](https://www.langchain.com/langgraph-platform-pricing)
---
concepts/server-mcp.md
---
---
tags:
- mcp
- platform
hide:
- tags
---
# MCP endpoint in LangGraph Server
The [Model Context Protocol (MCP)](./mcp.md) is an open protocol for describing tools and data sources in a model-agnostic format, enabling LLMs to discover and use them via a structured API.
[LangGraph Server](./langgraph_server.md) implements MCP using the [Streamable HTTP transport](https://spec.modelcontextprotocol.io/specification/2025-03-26/basic/transports/#streamable-http). This allows LangGraph **agents** to be exposed as **MCP tools**, making them usable with any MCP-compliant client supporting Streamable HTTP.
The MCP endpoint is available at `/mcp` on [LangGraph Server](./langgraph_server.md).
## Requirements
To use MCP, ensure you have the following dependencies installed:
- `langgraph-api >= 0.2.3`
- `langgraph-sdk >= 0.1.61`
Install them with:
```bash
pip install "langgraph-api>=0.2.3" "langgraph-sdk>=0.1.61"
```
## Usage overview
To enable MCP:
- Upgrade to `langgraph-api>=0.2.3`. If you are deploying to LangGraph Platform, this is done for you automatically when you create a new revision.
- MCP tools (agents) will be automatically exposed.
- Connect with any MCP-compliant client that supports Streamable HTTP.
### Client
Use an MCP-compliant client to connect to the LangGraph server. The following examples show how to connect using different programming languages.
=== "JavaScript/TypeScript"
```bash
npm install @modelcontextprotocol/sdk
```
> **Note**
> Replace `serverUrl` with your LangGraph server URL and configure authentication headers as needed.
```js
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";
// Connects to the LangGraph MCP endpoint
async function connectClient(url) {
const baseUrl = new URL(url);
const client = new Client({
name: 'streamable-http-client',
version: '1.0.0'
});
const transport = new StreamableHTTPClientTransport(baseUrl);
await client.connect(transport);
console.log("Connected using Streamable HTTP transport");
console.log(JSON.stringify(await client.listTools(), null, 2));
return client;
}
const serverUrl = "http://localhost:2024/mcp";
connectClient(serverUrl)
.then(() => {
console.log("Client connected successfully");
})
.catch(error => {
console.error("Failed to connect client:", error);
});
```
=== "Python"
Install the adapter with:
```bash
pip install langchain-mcp-adapters
```
Here is an example of how to connect to a remote MCP endpoint and use an agent as a tool:
```python
# Create connection parameters for the streamable HTTP transport
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client
import asyncio
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
server_params = {
"url": "https://mcp-finance-agent.xxx.us.langgraph.app/mcp",
"headers": {
"X-Api-Key":"lsv2_pt_your_api_key"
}
}
async def main():
async with streamablehttp_client(**server_params) as (read, write, _):
async with ClientSession(read, write) as session:
# Initialize the connection
await session.initialize()
# Load the remote graph as if it was a tool
tools = await load_mcp_tools(session)
# Create and run a react agent with the tools
agent = create_react_agent("openai:gpt-4.1", tools)
# Invoke the agent with a message
agent_response = await agent.ainvoke({"messages": "What can the finance agent do for me?"})
print(agent_response)
if __name__ == "__main__":
asyncio.run(main())
```
## Expose an agent as an MCP tool
When deployed, your agent will appear as a tool in the MCP endpoint
with this configuration:
- **Tool name**: The agent's name.
- **Tool description**: The agent's description.
- **Tool input schema**: The agent's input schema.
### Setting name and description
You can set the name and description of your agent in `langgraph.json`:
```json
{
"graphs": {
"my_agent": {
"path": "./my_agent/agent.py:graph",
"description": "A description of what the agent does"
}
},
"env": ".env"
}
```
After deployment, you can update the name and description using the LangGraph SDK.
### Schema
Define clear, minimal input and output schemas to avoid exposing unnecessary internal complexity to the LLM.
The default [MessagesState](./low_level.md#messagesstate) uses `AnyMessage`, which supports many message types but is too general for direct LLM exposure.
Instead, define **custom agents or workflows** that use explicitly typed input and output structures.
For example, a workflow answering documentation questions might look like this:
```python
from langgraph.graph import StateGraph, START, END
from typing_extensions import TypedDict
# Define input schema
class InputState(TypedDict):
question: str
# Define output schema
class OutputState(TypedDict):
answer: str
# Combine input and output
class OverallState(InputState, OutputState):
pass
# Define the processing node
def answer_node(state: InputState):
# Replace with actual logic and do something useful
return {"answer": "bye", "question": state["question"]}
# Build the graph with explicit schemas
builder = StateGraph(OverallState, input_schema=InputState, output_schema=OutputState)
builder.add_node(answer_node)
builder.add_edge(START, "answer_node")
builder.add_edge("answer_node", END)
graph = builder.compile()
# Run the graph
print(graph.invoke({"question": "hi"}))
```
For more details, see the [low-level concepts guide](https://langchain-ai.github.io/langgraph/concepts/low_level/#state).
## Use user-scoped MCP tools in your deployment
!!! tip "Prerequisites"
You have added your own [custom auth middleware](https://langchain-ai.github.io/langgraph/how-tos/auth/custom_auth/) that populates the `langgraph_auth_user` object, making it accessible through configurable context for every node in your graph.
To make user-scoped tools available to your LangGraph Platform deployment, start by implementing a snippet like the following:
```python
from langchain_mcp_adapters.client import MultiServerMCPClient
async def get_mcp_tools_node(state, config):
user = config["configurable"].get("langgraph_auth_user")
# e.g., user["github_token"], user["email"], etc.
client = MultiServerMCPClient({
"github": {
"transport": "streamable_http", # (1)
"url": "https://my-github-mcp-server/mcp", # (2)
"headers": {
"Authorization": f"Bearer {user['github_token']}"
}
}
})
tools = await client.get_tools() # (3)
return {"tools": tools}
```
1. MCP only supports adding headers to requests made to servers using the `streamable_http` and `sse` transports.
2. Your MCP server URL.
3. Get available tools from your MCP server.
## Session behavior
The current LangGraph MCP implementation does not support sessions. Each `/mcp` request is stateless and independent.
## Authentication
The `/mcp` endpoint uses the same authentication as the rest of the LangGraph API. Refer to the [authentication guide](./auth.md) for setup details.
## Disable MCP
To disable the MCP endpoint, set `disable_mcp` to `true` in your `langgraph.json` configuration file:
```json
{
"http": {
"disable_mcp": true
}
}
```
This will prevent the server from exposing the `/mcp` endpoint.
---
concepts/langgraph_cli.md
---
---
search:
boost: 2
---
# LangGraph CLI
**LangGraph CLI** is a multi-platform command-line tool for building and running the [LangGraph API server](./langgraph_server.md) locally. The resulting server includes all API endpoints for your graph's runs, threads, assistants, etc. as well as the other services required to run your agent, including a managed database for checkpointing and storage.
## Installation
The LangGraph CLI can be installed via pip or [Homebrew](https://brew.sh/):
=== "pip"
```bash
pip install langgraph-cli
```
=== "Homebrew"
```bash
brew install langgraph-cli
```
## Commands
LangGraph CLI provides the following core functionality:
| Command | Description |
| -------- | -------|
| [`langgraph build`](../cloud/reference/cli.md#build) | Builds a Docker image for the [LangGraph API server](./langgraph_server.md) that can be directly deployed. |
| [`langgraph dev`](../cloud/reference/cli.md#dev) | Starts a lightweight development server that requires no Docker installation. This server is ideal for rapid development and testing. This is available in version 0.1.55 and up. |
| [`langgraph dockerfile`](../cloud/reference/cli.md#dockerfile) | Generates a [Dockerfile](https://docs.docker.com/reference/dockerfile/) that can be used to build images for and deploy instances of the [LangGraph API server](./langgraph_server.md). This is useful if you want to further customize the dockerfile or deploy in a more custom way. |
| [`langgraph up`](../cloud/reference/cli.md#up) | Starts an instance of the [LangGraph API server](./langgraph_server.md) locally in a docker container. This requires the docker server to be running locally. It also requires a LangSmith API key for local development or a license key for production use. |
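For example, a typical local workflow might look like the following (the image tag is illustrative):
```bash
# Start a lightweight development server (no Docker required)
langgraph dev

# Build a deployable Docker image for the LangGraph API server
langgraph build -t my-langgraph-image
```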
For more information, see the [LangGraph CLI Reference](../cloud/reference/cli.md).
---
concepts/human_in_the_loop.md
---
---
search:
boost: 2
tags:
- human-in-the-loop
- hil
- overview
hide:
- tags
---
# Human-in-the-loop
To review, edit, and approve tool calls in an agent or workflow, [use LangGraph's human-in-the-loop features](../how-tos/human_in_the_loop/add-human-in-the-loop.md) to enable human intervention at any point in a workflow. This is especially useful in large language model (LLM)-driven applications where model output may require validation, correction, or additional context.
!!! tip
For information on how to use human-in-the-loop, see [Enable human intervention](../how-tos/human_in_the_loop/add-human-in-the-loop.md) and [Human-in-the-loop using Server API](../cloud/how-tos/add-human-in-the-loop.md).
## Key capabilities
* **Persistent execution state**: LangGraph allows you to pause execution **indefinitely** — for minutes, hours, or even days — until human input is received. This is possible because LangGraph checkpoints the graph state after each step, which allows the system to persist execution context and later resume the workflow, continuing from where it left off. This supports asynchronous human review or input without time constraints.
* **Flexible integration points**: HIL logic can be introduced at any point in the workflow. This allows targeted human involvement, such as approving API calls, correcting outputs, or guiding conversations.
## Patterns
There are four typical design patterns that you can implement using `interrupt` and `Command`:
- [Approve or reject](../how-tos/human_in_the_loop/add-human-in-the-loop.md#approve-or-reject): Pause the graph before a critical step, such as an API call, to review and approve the action. If the action is rejected, you can prevent the graph from executing the step, and potentially take an alternative action. This pattern often involves routing the graph based on the human's input.
- [Edit graph state](../how-tos/human_in_the_loop/add-human-in-the-loop.md#review-and-edit-state): Pause the graph to review and edit the graph state. This is useful for correcting mistakes or updating the state with additional information. This pattern often involves updating the state with the human's input.
- [Review tool calls](../how-tos/human_in_the_loop/add-human-in-the-loop.md#review-tool-calls): Pause the graph to review and edit tool calls requested by the LLM before tool execution.
- [Validate human input](../how-tos/human_in_the_loop/add-human-in-the-loop.md#validate-human-input): Pause the graph to validate human input before proceeding with the next step.
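As a rough sketch of the first pattern, a node can call `interrupt` to pause for a human decision and then route with `Command` once the run is resumed (the node and routing targets here are hypothetical):
```python
from langgraph.types import Command, interrupt

def approval_node(state):
    # Pause the run; the payload is surfaced to the caller, and the value
    # supplied on resume is returned from interrupt().
    decision = interrupt({"question": "Approve the pending API call?"})
    if decision == "approve":
        return Command(goto="call_api")
    return Command(goto="handle_rejection")

# Resume the paused run with the human's answer:
# graph.invoke(Command(resume="approve"), config=thread_config)
```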
---
concepts/langgraph_data_plane.md
---
---
search:
boost: 2
---
# LangGraph Data Plane
The term "data plane" is used broadly to refer to [LangGraph Servers](./langgraph_server.md) (deployments), the corresponding infrastructure for each server, and the "listener" application that continuously polls for updates from the [LangGraph Control Plane](./langgraph_control_plane.md).
## Server Infrastructure
In addition to the [LangGraph Server](./langgraph_server.md) itself, the following infrastructure components for each server are also included in the broad definition of "data plane":
- Postgres
- Redis
- Secrets store
- Autoscalers
## "Listener" Application
The data plane "listener" application periodically calls [control plane APIs](../concepts/langgraph_control_plane.md#control-plane-api) to:
- Determine if new deployments should be created.
- Determine if existing deployments should be updated (i.e. new revisions).
- Determine if existing deployments should be deleted.
In other words, the data plane "listener" reads the latest state of the control plane (desired state) and takes action to reconcile outstanding deployments (current state) to match the latest state.
## Postgres
Postgres is the persistence layer for all user, run, and long-term memory data in a LangGraph Server. It stores checkpoints (see more info [here](./persistence.md)), server resources (threads, runs, assistants, and crons), and items saved in the long-term memory store (see more info [here](./persistence.md#memory-store)).
## Redis
Redis is used in each LangGraph Server as a way for server and queue workers to communicate, and to store ephemeral metadata. No user or run data is stored in Redis.
### Communication
All runs in a LangGraph Server are executed by a pool of background workers that are part of each deployment. In order to enable some features for those runs (such as cancellation and output streaming) we need a channel for two-way communication between the server and the worker handling a particular run. We use Redis to organize that communication.
1. A Redis list is used as a mechanism to wake up a worker as soon as a new run is created. Only a sentinel value is stored in this list, no actual run information. The run information is then retrieved from Postgres by the worker.
2. A combination of a Redis string and Redis PubSub channel is used for the server to communicate a run cancellation request to the appropriate worker.
3. A Redis PubSub channel is used by the worker to broadcast streaming output from an agent while the run is being handled. Any open `/stream` request in the server will subscribe to that channel and forward any events to the response as they arrive. No events are stored in Redis at any time.
### Ephemeral metadata
Runs in a LangGraph Server may be retried for specific failures (currently only for transient Postgres errors encountered during the run). In order to limit the number of retries (currently limited to 3 attempts per run) we record the attempt number in a Redis string when it is picked up. This contains no run-specific info other than its ID, and expires after a short delay.
## Data Plane Features
This section describes various features of the data plane.
### Autoscaling
[`Production` type](../concepts/langgraph_control_plane.md#deployment-types) deployments automatically scale up to 10 containers. Scaling is based on 3 metrics:
1. CPU utilization
1. Memory utilization
1. Number of pending (in progress) [runs](./assistants.md#execution)
For CPU utilization, the autoscaler targets 75% utilization. This means the autoscaler will scale the number of containers up or down to ensure that CPU utilization is at or near 75%. For memory utilization, the autoscaler targets 75% utilization as well.
For number of pending runs, the autoscaler targets 10 pending runs. For example, if the current number of containers is 1, but the number of pending runs is 20, the autoscaler will scale up the deployment to 2 containers (20 pending runs / 2 containers = 10 pending runs per container).
Each metric is computed independently and the autoscaler will determine the scaling action based on the metric that results in the largest number of containers.
Scale down actions are delayed for 30 minutes before any action is taken. In other words, if the autoscaler decides to scale down a deployment, it will first wait for 30 minutes before scaling down. After 30 minutes, the metrics are recomputed and the deployment will scale down if the recomputed metrics result in a lower number of containers than the current number. Otherwise, the deployment remains scaled up. This "cool down" period ensures that deployments do not scale up and down too frequently.
### Static IP Addresses
!!! info "Only for Cloud SaaS"
Static IP addresses are only available for [Cloud SaaS](../concepts/langgraph_cloud.md) deployments.
All traffic from deployments created after January 6th 2025 will come through a NAT gateway. This NAT gateway will have several static IP addresses depending on the data region. Refer to the table below for the list of static IP addresses:
| US | EU |
|----------------|----------------|
| 35.197.29.146 | 34.13.192.67 |
| 34.145.102.123 | 34.147.105.64 |
| 34.169.45.153 | 34.90.22.166 |
| 34.82.222.17 | 34.147.36.213 |
| 35.227.171.135 | 34.32.137.113 |
| 34.169.88.30 | 34.91.238.184 |
| 34.19.93.202 | 35.204.101.241 |
| 34.19.34.50 | 35.204.48.32 |
### Custom Postgres
!!! info
Custom Postgres instances are only available for [Self-Hosted Data Plane](../concepts/langgraph_self_hosted_data_plane.md) and [Self-Hosted Control Plane](../concepts/langgraph_self_hosted_control_plane.md) deployments.
A custom Postgres instance can be used instead of the [one automatically created by the control plane](./langgraph_control_plane.md#database-provisioning). Specify the [`POSTGRES_URI_CUSTOM`](../cloud/reference/env_var.md#postgres_uri_custom) environment variable to use a custom Postgres instance.
Multiple deployments can share the same Postgres instance. For example, for `Deployment A`, `POSTGRES_URI_CUSTOM` can be set to `postgres://<user>:<password>@/<database_name_1>?host=<hostname>` and for `Deployment B`, `POSTGRES_URI_CUSTOM` can be set to `postgres://<user>:<password>@/<database_name_2>?host=<hostname>`. `<database_name_1>` and `<database_name_2>` are different databases within the same instance, but `<hostname>` is shared. **The same database cannot be used for separate deployments**.
### Custom Redis
!!! info
Custom Redis instances are only available for [Self-Hosted Data Plane](../concepts/langgraph_self_hosted_data_plane.md) and [Self-Hosted Control Plane](../concepts/langgraph_self_hosted_control_plane.md) deployments.
A custom Redis instance can be used instead of the one automatically created by the control plane. Specify the [REDIS_URI_CUSTOM](../cloud/reference/env_var.md#redis_uri_custom) environment variable to use a custom Redis instance.
Multiple deployments can share the same Redis instance. For example, for `Deployment A`, `REDIS_URI_CUSTOM` can be set to `redis://<hostname>:<port>/1` and for `Deployment B`, `REDIS_URI_CUSTOM` can be set to `redis://<hostname>:<port>/2`. `1` and `2` are different database numbers within the same instance, but `<hostname>` is shared. **The same database number cannot be used for separate deployments**.
### LangSmith Tracing
LangGraph Server is automatically configured to send traces to LangSmith. See the table below for details with respect to each deployment option.
| Cloud SaaS | Self-Hosted Data Plane | Self-Hosted Control Plane | Standalone Container |
|------------|------------------------|---------------------------|----------------------|
| Required<br>Trace to LangSmith SaaS. | Optional<br>Disable tracing or trace to LangSmith SaaS. | Optional<br>Disable tracing or trace to Self-Hosted LangSmith. | Optional<br>Disable tracing, trace to LangSmith SaaS, or trace to Self-Hosted LangSmith. |
### Telemetry
LangGraph Server is automatically configured to report telemetry metadata for billing purposes. See the table below for details with respect to each deployment option.
| Cloud SaaS | Self-Hosted Data Plane | Self-Hosted Control Plane | Standalone Container |
|------------|------------------------|---------------------------|----------------------|
| Telemetry sent to LangSmith SaaS. | Telemetry sent to LangSmith SaaS. | Self-reported usage (audit) for air-gapped license key.<br>Telemetry sent to LangSmith SaaS for LangGraph Platform License Key. | Self-reported usage (audit) for air-gapped license key.<br>Telemetry sent to LangSmith SaaS for LangGraph Platform License Key. |
### Licensing
LangGraph Server is automatically configured to perform license key validation. See the table below for details with respect to each deployment option.
| Cloud SaaS | Self-Hosted Data Plane | Self-Hosted Control Plane | Standalone Container |
|------------|------------------------|---------------------------|----------------------|
| LangSmith API Key validated against LangSmith SaaS. | LangSmith API Key validated against LangSmith SaaS. | Air-gapped license key or LangGraph Platform License Key validated against LangSmith SaaS. | Air-gapped license key or LangGraph Platform License Key validated against LangSmith SaaS. |
---
concepts/langgraph_studio.md
---
---
search:
boost: 2
---
# LangGraph Studio
!!! info "Prerequisites"
- [LangGraph Platform](./langgraph_platform.md)
- [LangGraph Server](./langgraph_server.md)
- [LangGraph CLI](./langgraph_cli.md)
LangGraph Studio is a specialized agent IDE that enables visualization, interaction, and debugging of agentic systems that implement the LangGraph Server API protocol. Studio also integrates with LangSmith to enable tracing, evaluation, and prompt engineering.

## Features
Key features of LangGraph Studio:
- Visualize your graph architecture
- [Run and interact with your agent](../cloud/how-tos/invoke_studio.md)
- [Manage assistants](../cloud/how-tos/studio/manage_assistants.md)
- [Manage threads](../cloud/how-tos/threads_studio.md)
- [Iterate on prompts](../cloud/how-tos/iterate_graph_studio.md)
- [Run experiments over a dataset](../cloud/how-tos/studio/run_evals.md)
- Manage [long term memory](memory.md)
- Debug agent state via [time travel](time-travel.md)
LangGraph Studio works for graphs that are deployed on [LangGraph Platform](../cloud/quick_start.md) or for graphs that are running locally via the [LangGraph Server](../tutorials/langgraph-platform/local-server.md).
Studio supports two modes:
### Graph mode
Graph mode exposes the full feature set of Studio and is useful when you want as much detail as possible about the execution of your agent, including the nodes traversed, intermediate states, and LangSmith integrations (such as adding to datasets and playground).
### Chat mode
Chat mode is a simpler UI for iterating on and testing chat-specific agents. It is useful for business users and those who want to test overall agent behavior. Chat mode is only supported for graphs whose state includes or extends [`MessagesState`](https://langchain-ai.github.io/langgraph/how-tos/graph-api/#messagesstate).
## Learn more
- See this guide on how to [get started](../cloud/how-tos/studio/quick_start.md) with LangGraph Studio.
---
concepts/langgraph_server.md
---
---
search:
boost: 2
---
# LangGraph Server
**LangGraph Server** offers an API for creating and managing agent-based applications. It is built on the concept of [assistants](assistants.md), which are agents configured for specific tasks, and includes built-in [persistence](persistence.md#memory-store) and a **task queue**. This versatile API supports a wide range of agentic application use cases, from background processing to real-time interactions.
Use LangGraph Server to create and manage [assistants](assistants.md), [threads](./persistence.md#threads), [runs](./assistants.md#execution), [cron jobs](../cloud/concepts/cron_jobs.md), [webhooks](../cloud/concepts/webhooks.md), and more.
!!! tip "API reference"
For detailed information on the API endpoints and data models, see [LangGraph Platform API reference docs](../cloud/reference/api/api_ref.html).
## Server versions
There are two versions of LangGraph Server:
- `Lite` is a limited version of the LangGraph Server that you can run locally or in a self-hosted manner (up to 1 million [nodes executed](../concepts/faq.md#what-does-nodes-executed-mean-for-langgraph-platform-usage) per year).
- `Enterprise` is the full version of the LangGraph Server. To use the `Enterprise` version, you must acquire a license key that you will need to specify when running the Docker image. To acquire a license key, please email sales@langchain.dev.
Feature Differences:
| | Lite | Enterprise |
|-------|------------|------------|
| [Cron Jobs](../cloud/concepts/cron_jobs.md) |❌|✅|
| [Custom Authentication](../concepts/auth.md) |❌|✅|
| [Deployment options](../concepts/deployment_options.md) | Standalone container | Cloud SaaS, Self-Hosted Data Plane, Self-Hosted Control Plane, Standalone container |
## Application structure
To deploy a LangGraph Server application, you need to specify the graph(s) you want to deploy, as well as any relevant configuration settings, such as dependencies and environment variables.
Read the [application structure](./application_structure.md) guide to learn how to structure your LangGraph application for deployment.
## Parts of a deployment
When you deploy LangGraph Server, you are deploying one or more [graphs](#graphs), a database for [persistence](persistence.md), and a task queue.
### Graphs
When you deploy a graph with LangGraph Server, you are deploying a "blueprint" for an [Assistant](assistants.md).
An [Assistant](assistants.md) is a graph paired with specific configuration settings. You can create multiple assistants per graph, each with unique settings to accommodate different use cases
that can be served by the same graph.
Upon deployment, LangGraph Server will automatically create a default assistant for each graph using the graph's default configuration settings.
!!! note
We often think of a graph as implementing an [agent](agentic_concepts.md), but a graph does not necessarily need to implement an agent. For example, a graph could implement a simple
chatbot that only supports back-and-forth conversation, without the ability to influence any application control flow. In reality, as applications get more complex, a graph will often implement a more complex flow that may use [multiple agents](./multi_agent.md) working in tandem.
### Persistence and task queue
LangGraph Server leverages a database for [persistence](persistence.md) and a task queue.
Currently, only [Postgres](https://www.postgresql.org/) is supported as a database for LangGraph Server and [Redis](https://redis.io/) as the task queue.
If you're deploying using [LangGraph Platform](./langgraph_cloud.md), these components are managed for you. If you're deploying LangGraph Server on your own infrastructure, you'll need to set up and manage these components yourself.
Please review the [deployment options](./deployment_options.md) guide for more information on how these components are set up and managed.
## Learn more
* LangGraph [Application Structure](./application_structure.md) guide explains how to structure your LangGraph application for deployment.
* The [LangGraph Platform API Reference](../cloud/reference/api/api_ref.html) provides detailed information on the API endpoints and data models.
---
concepts/template_applications.md
---
---
search:
boost: 2
---
# Template Applications
Templates are open source reference applications designed to help you get started quickly when building with LangGraph. They provide working examples of common agentic workflows that can be customized to your needs.
You can create an application from a template using the LangGraph CLI.
!!! info "Requirements"
- Python >= 3.11
- [LangGraph CLI](https://langchain-ai.github.io/langgraph/cloud/reference/cli/): Requires `langgraph-cli[inmem]` >= 0.1.58
## Install the LangGraph CLI
=== "Python"
```bash
pip install "langgraph-cli[inmem]" --upgrade
```
Or via [`uv`](https://docs.astral.sh/uv/getting-started/installation/) (recommended):
```bash
uvx --from "langgraph-cli[inmem]" langgraph dev --help
```
=== "JS"
```bash
npx @langchain/langgraph-cli --help
```
## Available Templates
| Template | Description | Python | JS/TS |
|---------------------------|------------------------------------------------------------------------------------------|------------------------------------------------------------------|---------------------------------------------------------------------|
| **New LangGraph Project** | A simple, minimal chatbot with memory. | [Repo](https://github.com/langchain-ai/new-langgraph-project) | [Repo](https://github.com/langchain-ai/new-langgraphjs-project) |
| **ReAct Agent** | A simple agent that can be flexibly extended to many tools. | [Repo](https://github.com/langchain-ai/react-agent) | [Repo](https://github.com/langchain-ai/react-agent-js) |
| **Memory Agent** | A ReAct-style agent with an additional tool to store memories for use across threads. | [Repo](https://github.com/langchain-ai/memory-agent) | [Repo](https://github.com/langchain-ai/memory-agent-js) |
| **Retrieval Agent** | An agent that includes a retrieval-based question-answering system. | [Repo](https://github.com/langchain-ai/retrieval-agent-template) | [Repo](https://github.com/langchain-ai/retrieval-agent-template-js) |
| **Data-Enrichment Agent** | An agent that performs web searches and organizes its findings into a structured format. | [Repo](https://github.com/langchain-ai/data-enrichment) | [Repo](https://github.com/langchain-ai/data-enrichment-js) |
## 🌱 Create a LangGraph App
To create a new app from a template, use the `langgraph new` command.
=== "Python"
```bash
langgraph new
```
Or via [`uv`](https://docs.astral.sh/uv/getting-started/installation/) (recommended):
```bash
uvx --from "langgraph-cli[inmem]" langgraph new
```
=== "JS"
```bash
npx @langchain/langgraph-cli new
```
## Next Steps
Review the `README.md` file in the root of your new LangGraph app for more information about the template and how to customize it.
After configuring the app properly and adding your API keys, you can start the app using the LangGraph CLI:
=== "Python"
```bash
langgraph dev
```
Or via [`uv`](https://docs.astral.sh/uv/getting-started/installation/) (recommended):
```bash
uvx --from "langgraph-cli[inmem]" --with-editable . langgraph dev
```
??? info "Missing Local Package?"
If you are not using `uv` and run into a "`ModuleNotFoundError`" or "`ImportError`" even after installing the local package (`pip install -e .`), you likely need to install the CLI into your local virtual environment so that the CLI is "aware" of the local package. Do this by running `python -m pip install "langgraph-cli[inmem]"` and re-activating your virtual environment before running `langgraph dev`.
=== "JS"
```bash
npx @langchain/langgraph-cli dev
```
See the following guides for more information on how to deploy your app:
- **[Launch Local LangGraph Server](../tutorials/langgraph-platform/local-server.md)**: This quick start guide shows how to start a LangGraph Server locally for the **ReAct Agent** template. The steps are similar for other templates.
- **[Deploy to LangGraph Platform](../cloud/quick_start.md)**: Deploy your LangGraph app using LangGraph Platform.
---
concepts/langgraph_standalone_container.md
---
---
search:
boost: 2
---
# Standalone Container
To deploy a [LangGraph Server](../concepts/langgraph_server.md), follow the how-to guide for [how to deploy a Standalone Container](../cloud/deployment/standalone_container.md).
## Overview
The Standalone Container deployment option is the least restrictive model for deployment. There is no [control plane](./langgraph_control_plane.md). [Data plane](./langgraph_data_plane.md) infrastructure is managed by you.
| | [Control plane](../concepts/langgraph_control_plane.md) | [Data plane](../concepts/langgraph_data_plane.md) |
|-------------------|-------------------|------------|
| **What is it?** | n/a | <ul><li>LangGraph Servers</li><li>Postgres, Redis, etc.</li></ul> |
| **Where is it hosted?** | n/a | Your cloud |
| **Who provisions and manages it?** | n/a | You |
!!! warning
LangGraph Platform should not be deployed in serverless environments. Scale to zero may cause task loss and scaling up will not work reliably.
## Architecture

## Compute Platforms
### Kubernetes
The Standalone Container deployment option supports deploying data plane infrastructure to a Kubernetes cluster.
### Docker
The Standalone Container deployment option supports deploying data plane infrastructure to any Docker-supported compute platform.
## Lite vs. Enterprise
The Standalone Container deployment option supports both of the [server versions](../concepts/langgraph_server.md#langgraph-server):
- The `Lite` version is free, but has limited features.
- The `Enterprise` version has custom pricing and is fully featured.
For more details on feature differences, see [LangGraph Server](../concepts/langgraph_server.md#server-versions).
---
concepts/breakpoints.md
---
---
search:
boost: 2
---
# Breakpoints
[Breakpoints](../how-tos/human_in_the_loop/breakpoints.md) pause graph execution at defined points and let you step through each stage. They use LangGraph's [**persistence layer**](./persistence.md), which saves the graph state after each step.
With breakpoints, you can inspect the graph's state and node inputs at any point. Execution pauses **indefinitely** until you resume, as the checkpointer preserves the state.
{: style="max-height:400px"}
An example graph consisting of 3 sequential steps with a breakpoint before step_3.
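As a minimal sketch of the graph in the figure (the step implementations are assumptions), a static breakpoint can be set at compile time with `interrupt_before`; a checkpointer is required so the paused state can be saved:
```python
from typing_extensions import TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    value: str


def step_1(state: State):
    return {"value": state["value"] + " -> step_1"}


def step_2(state: State):
    return {"value": state["value"] + " -> step_2"}


def step_3(state: State):
    return {"value": state["value"] + " -> step_3"}


builder = StateGraph(State)
builder.add_node(step_1)
builder.add_node(step_2)
builder.add_node(step_3)
builder.add_edge(START, "step_1")
builder.add_edge("step_1", "step_2")
builder.add_edge("step_2", "step_3")
builder.add_edge("step_3", END)

# Pause before step_3; the checkpointer saves the state so execution can resume later.
graph = builder.compile(checkpointer=MemorySaver(), interrupt_before=["step_3"])

config = {"configurable": {"thread_id": "1"}}
graph.invoke({"value": "start"}, config)  # runs step_1 and step_2, then pauses
graph.invoke(None, config)                # resumes and runs step_3
```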
!!! tip
For information on how to use breakpoints, see [Set breakpoints](../how-tos/human_in_the_loop/breakpoints.md) and [Set breakpoints using Server API](../cloud/how-tos/human_in_the_loop_breakpoint.md).
---
concepts/langgraph_platform.md
---
---
search:
boost: 2
---
# LangGraph Platform
Develop, deploy, scale, and manage agents with **LangGraph Platform** — the purpose-built platform for long-running, agentic workflows.
!!! tip "Get started with LangGraph Platform"
Check out the [LangGraph Platform quickstart](../tutorials/langgraph-platform/local-server.md) for instructions on how to use LangGraph Platform to run a LangGraph application locally.
## Why use LangGraph Platform?
LangGraph Platform makes it easy to get your agent running in production — whether it’s built with LangGraph or another framework — so you can focus on your app logic, not infrastructure. Deploy with one click to get a live endpoint, and use our robust APIs and built-in task queues to handle production scale.
- **[Streaming Support](../cloud/how-tos/streaming.md)**: As agents grow more sophisticated, they often benefit from streaming both token outputs and intermediate states back to the user. Without this, users are left waiting for potentially long operations with no feedback. LangGraph Server provides multiple streaming modes optimized for various application needs.
- **[Background Runs](../cloud/how-tos/background_run.md)**: For agents that take longer to process (e.g., hours), maintaining an open connection can be impractical. The LangGraph Server supports launching agent runs in the background and provides both polling endpoints and webhooks to monitor run status effectively.
- **Support for long runs**: Regular server setups often encounter timeouts or disruptions when handling requests that take a long time to complete. LangGraph Server’s API provides robust support for these tasks by sending regular heartbeat signals, preventing unexpected connection closures during prolonged processes.
- **Handling Burstiness**: Certain applications, especially those with real-time user interaction, may experience "bursty" request loads where numerous requests hit the server simultaneously. LangGraph Server includes a task queue, ensuring requests are handled consistently without loss, even under heavy loads.
- **[Double-texting](../cloud/how-tos/interrupt_concurrent.md)**: In user-driven applications, it’s common for users to send multiple messages rapidly. This “double texting” can disrupt agent flows if not handled properly. LangGraph Server offers built-in strategies to address and manage such interactions.
- **[Checkpointers and memory management](persistence.md#checkpoints)**: For agents needing persistence (e.g., conversation memory), deploying a robust storage solution can be complex. LangGraph Platform includes optimized [checkpointers](persistence.md#checkpoints) and a [memory store](persistence.md#memory-store), managing state across sessions without the need for custom solutions.
- **[Human-in-the-loop support](../cloud/how-tos/human_in_the_loop_breakpoint.md)**: In many applications, users require a way to intervene in agent processes. LangGraph Server provides specialized endpoints for human-in-the-loop scenarios, simplifying the integration of manual oversight into agent workflows.
- **[LangGraph Studio](./langgraph_studio.md)**: Enables visualization, interaction, and debugging of agentic systems that implement the LangGraph Server API protocol. Studio also integrates with LangSmith to enable tracing, evaluation, and prompt engineering.
- **[Deployment](./deployment_options.md)**: There are four ways to deploy on LangGraph Platform: [Cloud SaaS](../concepts/langgraph_cloud.md), [Self-Hosted Data Plane](../concepts/langgraph_self_hosted_data_plane.md), [Self-Hosted Control Plane](../concepts/langgraph_self_hosted_control_plane.md), and [Standalone Container](../concepts/langgraph_standalone_container.md).
---
concepts/sdk.md
---
---
search:
boost: 2
---
# LangGraph SDK
LangGraph Platform provides a Python SDK and a JS/TS SDK for interacting with [LangGraph Server](./langgraph_server.md).
!!! tip "Python SDK reference"
For detailed information about the Python SDK, see [Python SDK reference docs](../cloud/reference/sdk/python_sdk_ref.md).
## Installation
You can install the packages using the appropriate package manager for your language:
=== "Python"
```bash
pip install langgraph-sdk
```
=== "JS"
```bash
yarn add @langchain/langgraph-sdk
```
## Python sync vs. async
The Python SDK provides both synchronous (`get_sync_client`) and asynchronous (`get_client`) clients for interacting with LangGraph Server:
=== "Sync"
```python
from langgraph_sdk import get_sync_client
client = get_sync_client(url=..., api_key=...)
client.assistants.search()
```
=== "Async"
```python
from langgraph_sdk import get_client
client = get_client(url=..., api_key=...)
await client.assistants.search()
```
## Learn more
- [Python SDK Reference](../cloud/reference/sdk/python_sdk_ref.md)
- [LangGraph CLI API Reference](../cloud/reference/cli.md)
- [JS/TS SDK Reference](../cloud/reference/sdk/js_ts_sdk_ref.md)
---
concepts/streaming.md
---
---
search:
boost: 2
---
# Streaming
LangGraph implements a streaming system that surfaces real-time updates from graph runs to your app, allowing for responsive and transparent user experiences.
There are three main categories of data you can stream:
1. **Workflow progress** — get state updates after each graph node is executed.
2. **LLM tokens** — stream language model tokens as they’re generated.
3. **Custom updates** — emit user-defined signals (e.g., “Fetched 10/100 records”).
## What’s possible with LangGraph streaming
- [**Stream LLM tokens**](../how-tos/streaming.md#messages) — capture token streams from anywhere: inside nodes, subgraphs, or tools.
- [**Emit progress notifications from tools**](../how-tos/streaming.md#stream-custom-data) — send custom updates or progress signals directly from tool functions.
- [**Stream from subgraphs**](../how-tos/streaming.md#stream-subgraph-outputs) — include outputs from both the parent graph and any nested subgraphs.
- [**Use any LLM**](../how-tos/streaming.md#use-with-any-llm) — stream tokens from any LLM, even if it's not a LangChain model, by using the `custom` streaming mode.
- [**Use multiple streaming modes**](../how-tos/streaming.md#stream-multiple-modes) — choose from `values` (full state), `updates` (state deltas), `messages` (LLM tokens + metadata), `custom` (arbitrary user data), or `debug` (detailed traces).
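To make these modes concrete, here is a minimal sketch (the `State` schema and `summarize` node are assumptions) that streams both `updates` and `values` from a compiled graph:
```python
from typing_extensions import TypedDict

from langgraph.graph import StateGraph, START


class State(TypedDict):
    topic: str
    summary: str


def summarize(state: State):
    # Placeholder for an LLM call; with a real model, "messages" mode would stream its tokens.
    return {"summary": f"A short summary about {state['topic']}."}


builder = StateGraph(State)
builder.add_node(summarize)
builder.add_edge(START, "summarize")
graph = builder.compile()

# Passing a list of modes yields (mode, chunk) tuples:
# "updates" emits per-node state deltas, "values" emits the full state after each step.
for mode, chunk in graph.stream({"topic": "streaming"}, stream_mode=["updates", "values"]):
    print(mode, chunk)
```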
---
concepts/application_structure.md
---
---
search:
boost: 2
---
# Application Structure
## Overview
A LangGraph application consists of one or more graphs, a configuration file (`langgraph.json`), a file that specifies dependencies, and an optional `.env` file that specifies environment variables.
This guide shows the typical structure of an application and how to specify the information required to deploy it using the LangGraph Platform.
## Key Concepts
To deploy using the LangGraph Platform, the following information should be provided:
1. A [LangGraph configuration file](#configuration-file-concepts) (`langgraph.json`) that specifies the dependencies, graphs, and environment variables to use for the application.
2. The [graphs](#graphs) that implement the logic of the application.
3. A file that specifies [dependencies](#dependencies) required to run the application.
4. [Environment variables](#environment-variables) that are required for the application to run.
## File Structure
Below are examples of directory structures for Python and JavaScript applications:
=== "Python (requirements.txt)"
```plaintext
my-app/
├── my_agent # all project code lies within here
│ ├── utils # utilities for your graph
│ │ ├── __init__.py
│ │ ├── tools.py # tools for your graph
│ │ ├── nodes.py # node functions for your graph
│ │ └── state.py # state definition of your graph
│ ├── __init__.py
│ └── agent.py # code for constructing your graph
├── .env # environment variables
├── requirements.txt # package dependencies
└── langgraph.json # configuration file for LangGraph
```
=== "Python (pyproject.toml)"
```plaintext
my-app/
├── my_agent # all project code lies within here
│ ├── utils # utilities for your graph
│ │ ├── __init__.py
│ │ ├── tools.py # tools for your graph
│ │ ├── nodes.py # node functions for your graph
│ │ └── state.py # state definition of your graph
│ ├── __init__.py
│ └── agent.py # code for constructing your graph
├── .env # environment variables
├── langgraph.json # configuration file for LangGraph
└── pyproject.toml # dependencies for your project
```
=== "JS (package.json)"
```plaintext
my-app/
├── src # all project code lies within here
│ ├── utils # optional utilities for your graph
│ │ ├── tools.ts # tools for your graph
│ │ ├── nodes.ts # node functions for your graph
│ │ └── state.ts # state definition of your graph
│ └── agent.ts # code for constructing your graph
├── package.json # package dependencies
├── .env # environment variables
└── langgraph.json # configuration file for LangGraph
```
!!! note
The directory structure of a LangGraph application can vary depending on the programming language and the package manager used.
## Configuration File {#configuration-file-concepts}
The `langgraph.json` file is a JSON file that specifies the dependencies, graphs, environment variables, and other settings required to deploy a LangGraph application.
See the [LangGraph configuration file reference](../cloud/reference/cli.md#configuration-file) for details on all supported keys in the JSON file.
!!! tip
The [LangGraph CLI](./langgraph_cli.md) defaults to using the configuration file `langgraph.json` in the current directory.
### Examples
=== "Python"
* The dependencies involve a custom local package and the `langchain_openai` package.
* A single graph will be loaded from the file `./your_package/your_file.py` with the variable `agent`.
* The environment variables are loaded from the `.env` file.
```json
{
"dependencies": [
"langchain_openai",
"./your_package"
],
"graphs": {
"my_agent": "./your_package/your_file.py:agent"
},
"env": "./.env"
}
```
=== "JavaScript"
* The dependencies will be loaded from a dependency file in the local directory (e.g., `package.json`).
* A single graph will be loaded from the file `./your_package/your_file.js` with the function `agent`.
* The environment variable `OPENAI_API_KEY` is set inline.
```json
{
"dependencies": [
"."
],
"graphs": {
"my_agent": "./your_package/your_file.js:agent"
},
"env": {
"OPENAI_API_KEY": "secret-key"
}
}
```
## Dependencies
A LangGraph application may depend on other Python packages or JavaScript libraries (depending on the programming language in which the application is written).
You will generally need to specify the following information for dependencies to be set up correctly:
1. A file in the directory that specifies the dependencies (e.g. `requirements.txt`, `pyproject.toml`, or `package.json`).
2. A `dependencies` key in the [LangGraph configuration file](#configuration-file-concepts) that specifies the dependencies required to run the LangGraph application.
3. Any additional binaries or system libraries can be specified using the `dockerfile_lines` key in the [LangGraph configuration file](#configuration-file-concepts).
## Graphs
Use the `graphs` key in the [LangGraph configuration file](#configuration-file-concepts) to specify which graphs will be available in the deployed LangGraph application.
You can specify one or more graphs in the configuration file. Each graph is identified by a name (which should be unique) and a path to either (1) a compiled graph or (2) a function that creates a graph.
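For example, a hypothetical `./your_package/your_file.py` referenced by the `graphs` key could expose either a compiled graph or a factory function (the node logic below is a placeholder):
```python
# ./your_package/your_file.py (hypothetical module referenced by the "graphs" key)
from langgraph.graph import StateGraph, MessagesState, START


def chatbot(state: MessagesState):
    # Placeholder node; a real application would call an LLM here.
    return {"messages": [{"role": "ai", "content": "Hello!"}]}


builder = StateGraph(MessagesState)
builder.add_node(chatbot)
builder.add_edge(START, "chatbot")

# Option 1: a compiled graph, referenced as "./your_package/your_file.py:agent"
agent = builder.compile()


# Option 2: a function that creates a graph, referenced as "./your_package/your_file.py:make_agent"
def make_agent():
    return builder.compile()
```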
## Environment Variables
If you're working with a deployed LangGraph application locally, you can configure environment variables in the `env` key of the [LangGraph configuration file](#configuration-file-concepts).
For a production deployment, you will typically want to configure the environment variables in the deployment environment.
---
concepts/multi_agent.md
---
---
search:
boost: 2
---
# Multi-agent systems
An [agent](./agentic_concepts.md#agent-architectures) is _a system that uses an LLM to decide the control flow of an application_. As you develop these systems, they might grow more complex over time, making them harder to manage and scale. For example, you might run into the following problems:
- agent has too many tools at its disposal and makes poor decisions about which tool to call next
- context grows too complex for a single agent to keep track of
- there is a need for multiple specialization areas in the system (e.g. planner, researcher, math expert, etc.)
To tackle these, you might consider breaking your application into multiple smaller, independent agents and composing them into a **multi-agent system**. These independent agents can be as simple as a prompt and an LLM call, or as complex as a [ReAct](./agentic_concepts.md#tool-calling-agent) agent (and more!).
The primary benefits of using multi-agent systems are:
- **Modularity**: Separate agents make it easier to develop, test, and maintain agentic systems.
- **Specialization**: You can create expert agents focused on specific domains, which helps with the overall system performance.
- **Control**: You can explicitly control how agents communicate (as opposed to relying on function calling).
## Multi-agent architectures

There are several ways to connect agents in a multi-agent system:
- **Network**: each agent can communicate with [every other agent](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/multi-agent-collaboration/). Any agent can decide which other agent to call next.
- **Supervisor**: each agent communicates with a single [supervisor](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/) agent. Supervisor agent makes decisions on which agent should be called next.
- **Supervisor (tool-calling)**: this is a special case of supervisor architecture. Individual agents can be represented as tools. In this case, a supervisor agent uses a tool-calling LLM to decide which of the agent tools to call, as well as the arguments to pass to those agents.
- **Hierarchical**: you can define a multi-agent system with [a supervisor of supervisors](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/hierarchical_agent_teams/). This is a generalization of the supervisor architecture and allows for more complex control flows.
- **Custom multi-agent workflow**: each agent communicates with only a subset of agents. Parts of the flow are deterministic, and only some agents can decide which other agents to call next.
### Handoffs
In multi-agent architectures, agents can be represented as graph nodes. Each agent node executes its step(s) and decides whether to finish execution or route to another agent, including potentially routing to itself (e.g., running in a loop). A common pattern in multi-agent interactions is **handoffs**, where one agent *hands off* control to another. Handoffs allow you to specify:
- __destination__: target agent to navigate to (e.g., name of the node to go to)
- __payload__: [information to pass to that agent](#communication-and-state-management) (e.g., state update)
To implement handoffs in LangGraph, agent nodes can return a [`Command`](./low_level.md#command) object that allows you to combine both control flow and state updates:
```python
def agent(state) -> Command[Literal["agent", "another_agent"]]:
# the condition for routing/halting can be anything, e.g. LLM tool call / structured output, etc.
goto = get_next_agent(...) # 'agent' / 'another_agent'
return Command(
# Specify which agent to call next
goto=goto,
# Update the graph state
update={"my_state_key": "my_state_value"}
)
```
In a more complex scenario where each agent node is itself a graph (i.e., a [subgraph](./subgraphs.md)), a node in one of the agent subgraphs might want to navigate to a different agent. For example, if you have two agents, `alice` and `bob` (subgraph nodes in a parent graph), and `alice` needs to navigate to `bob`, you can set `graph=Command.PARENT` in the `Command` object:
```python
def some_node_inside_alice(state):
return Command(
goto="bob",
update={"my_state_key": "my_state_value"},
# specify which graph to navigate to (defaults to the current graph)
graph=Command.PARENT,
)
```
!!! note
If you need to support visualization for subgraphs communicating using `Command(graph=Command.PARENT)`, you would need to wrap them in a node function with a `Command` annotation, e.g. instead of this:
```python
builder.add_node(alice)
```
you would need to do this:
```python
def call_alice(state) -> Command[Literal["bob"]]:
return alice.invoke(state)
builder.add_node("alice", call_alice)
```
#### Handoffs as tools
One of the most common agent types is a [tool-calling agent](../agents/overview.md). For those types of agents, a common pattern is wrapping a handoff in a tool call, e.g.:
```python
from langchain_core.tools import tool
@tool
def transfer_to_bob():
"""Transfer to bob."""
return Command(
# name of the agent (node) to go to
goto="bob",
# data to send to the agent
update={"my_state_key": "my_state_value"},
# indicate to LangGraph that we need to navigate to
# agent node in a parent graph
graph=Command.PARENT,
)
```
This is a special case of updating the graph state from tools where, in addition to the state update, the control flow is included as well.
!!! important
If you want to use tools that return `Command`, you can either use prebuilt [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent] / [`ToolNode`][langgraph.prebuilt.tool_node.ToolNode] components, or implement your own tool-executing node that collects `Command` objects returned by the tools and returns a list of them, e.g.:
```python
def call_tools(state):
    # `tools_by_name` maps tool names to tool objects; the last AI message
    # in the state contains the tool calls issued by the LLM.
    tool_calls = state["messages"][-1].tool_calls
    commands = [tools_by_name[tool_call["name"]].invoke(tool_call) for tool_call in tool_calls]
    return commands
```
Let's now take a closer look at the different multi-agent architectures.
### Network
In this architecture, agents are defined as graph nodes. Each agent can communicate with every other agent (many-to-many connections) and can decide which agent to call next. This architecture is good for problems that do not have a clear hierarchy of agents or a specific sequence in which agents should be called.
```python
from typing import Literal
from langchain_openai import ChatOpenAI
from langgraph.types import Command
from langgraph.graph import StateGraph, MessagesState, START, END
model = ChatOpenAI()
def agent_1(state: MessagesState) -> Command[Literal["agent_2", "agent_3", END]]:
# you can pass relevant parts of the state to the LLM (e.g., state["messages"])
# to determine which agent to call next. a common pattern is to call the model
# with a structured output (e.g. force it to return an output with a "next_agent" field)
response = model.invoke(...)
# route to one of the agents or exit based on the LLM's decision
# if the LLM returns "__end__", the graph will finish execution
return Command(
goto=response["next_agent"],
update={"messages": [response["content"]]},
)
def agent_2(state: MessagesState) -> Command[Literal["agent_1", "agent_3", END]]:
response = model.invoke(...)
return Command(
goto=response["next_agent"],
update={"messages": [response["content"]]},
)
def agent_3(state: MessagesState) -> Command[Literal["agent_1", "agent_2", END]]:
...
return Command(
goto=response["next_agent"],
update={"messages": [response["content"]]},
)
builder = StateGraph(MessagesState)
builder.add_node(agent_1)
builder.add_node(agent_2)
builder.add_node(agent_3)
builder.add_edge(START, "agent_1")
network = builder.compile()
```
### Supervisor
In this architecture, we define agents as nodes and add a supervisor node (LLM) that decides which agent nodes should be called next. We use [`Command`](./low_level.md#command) to route execution to the appropriate agent node based on supervisor's decision. This architecture also lends itself well to running multiple agents in parallel or using [map-reduce](../how-tos/graph-api.md#map-reduce-and-the-send-api) pattern.
```python
from typing import Literal
from langchain_openai import ChatOpenAI
from langgraph.types import Command
from langgraph.graph import StateGraph, MessagesState, START, END
model = ChatOpenAI()
def supervisor(state: MessagesState) -> Command[Literal["agent_1", "agent_2", END]]:
# you can pass relevant parts of the state to the LLM (e.g., state["messages"])
# to determine which agent to call next. a common pattern is to call the model
# with a structured output (e.g. force it to return an output with a "next_agent" field)
response = model.invoke(...)
# route to one of the agents or exit based on the supervisor's decision
# if the supervisor returns "__end__", the graph will finish execution
return Command(goto=response["next_agent"])
def agent_1(state: MessagesState) -> Command[Literal["supervisor"]]:
# you can pass relevant parts of the state to the LLM (e.g., state["messages"])
# and add any additional logic (different models, custom prompts, structured output, etc.)
response = model.invoke(...)
return Command(
goto="supervisor",
update={"messages": [response]},
)
def agent_2(state: MessagesState) -> Command[Literal["supervisor"]]:
response = model.invoke(...)
return Command(
goto="supervisor",
update={"messages": [response]},
)
builder = StateGraph(MessagesState)
builder.add_node(supervisor)
builder.add_node(agent_1)
builder.add_node(agent_2)
builder.add_edge(START, "supervisor")
supervisor = builder.compile()
```
Check out this [tutorial](https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/) for an example of supervisor multi-agent architecture.
### Supervisor (tool-calling)
In this variant of the [supervisor](#supervisor) architecture, we define a supervisor [agent](./agentic_concepts.md#agent-architectures) which is responsible for calling sub-agents. The sub-agents are exposed to the supervisor as tools, and the supervisor agent decides which tool to call next. The supervisor agent follows a [standard implementation](./agentic_concepts.md#tool-calling-agent) as an LLM running in a while loop calling tools until it decides to stop.
```python
from typing import Annotated
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import InjectedState, create_react_agent
model = ChatOpenAI()
# this is the agent function that will be called as tool
# notice that you can pass the state to the tool via InjectedState annotation
def agent_1(state: Annotated[dict, InjectedState]):
# you can pass relevant parts of the state to the LLM (e.g., state["messages"])
# and add any additional logic (different models, custom prompts, structured output, etc.)
response = model.invoke(...)
# return the LLM response as a string (expected tool response format)
# this will be automatically turned to ToolMessage
# by the prebuilt create_react_agent (supervisor)
return response.content
def agent_2(state: Annotated[dict, InjectedState]):
response = model.invoke(...)
return response.content
tools = [agent_1, agent_2]
# the simplest way to build a supervisor w/ tool-calling is to use prebuilt ReAct agent graph
# that consists of a tool-calling LLM node (i.e. supervisor) and a tool-executing node
supervisor = create_react_agent(model, tools)
```
### Hierarchical
As you add more agents to your system, it might become too hard for the supervisor to manage all of them. The supervisor might start making poor decisions about which agent to call next, or the context might become too complex for a single supervisor to keep track of. In other words, you end up with the same problems that motivated the multi-agent architecture in the first place.
To address this, you can design your system _hierarchically_. For example, you can create separate, specialized teams of agents managed by individual supervisors, and a top-level supervisor to manage the teams.
```python
from typing import Literal
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command
model = ChatOpenAI()
# define team 1 (same as the single supervisor example above)
def team_1_supervisor(state: MessagesState) -> Command[Literal["team_1_agent_1", "team_1_agent_2", END]]:
response = model.invoke(...)
return Command(goto=response["next_agent"])
def team_1_agent_1(state: MessagesState) -> Command[Literal["team_1_supervisor"]]:
response = model.invoke(...)
return Command(goto="team_1_supervisor", update={"messages": [response]})
def team_1_agent_2(state: MessagesState) -> Command[Literal["team_1_supervisor"]]:
response = model.invoke(...)
return Command(goto="team_1_supervisor", update={"messages": [response]})
team_1_builder = StateGraph(MessagesState)
team_1_builder.add_node(team_1_supervisor)
team_1_builder.add_node(team_1_agent_1)
team_1_builder.add_node(team_1_agent_2)
team_1_builder.add_edge(START, "team_1_supervisor")
team_1_graph = team_1_builder.compile()
# define team 2 (same as the single supervisor example above)
class Team2State(MessagesState):
next: Literal["team_2_agent_1", "team_2_agent_2", "__end__"]
def team_2_supervisor(state: Team2State):
...
def team_2_agent_1(state: Team2State):
...
def team_2_agent_2(state: Team2State):
...
team_2_builder = StateGraph(Team2State)
...
team_2_graph = team_2_builder.compile()
# define top-level supervisor
def top_level_supervisor(state: MessagesState) -> Command[Literal["team_1_graph", "team_2_graph", END]]:
# you can pass relevant parts of the state to the LLM (e.g., state["messages"])
# to determine which team to call next. a common pattern is to call the model
# with a structured output (e.g. force it to return an output with a "next_team" field)
response = model.invoke(...)
# route to one of the teams or exit based on the supervisor's decision
# if the supervisor returns "__end__", the graph will finish execution
return Command(goto=response["next_team"])
builder = StateGraph(MessagesState)
builder.add_node(top_level_supervisor)
builder.add_node("team_1_graph", team_1_graph)
builder.add_node("team_2_graph", team_2_graph)
builder.add_edge(START, "top_level_supervisor")
builder.add_edge("team_1_graph", "top_level_supervisor")
builder.add_edge("team_2_graph", "top_level_supervisor")
graph = builder.compile()
```
### Custom multi-agent workflow
In this architecture we add individual agents as graph nodes and define the order in which agents are called ahead of time, in a custom workflow. In LangGraph the workflow can be defined in two ways:
- **Explicit control flow (normal edges)**: LangGraph allows you to explicitly define the control flow of your application (i.e., the sequence in which agents communicate) via [normal graph edges](./low_level.md#normal-edges). This is the most deterministic variant of this architecture: we always know ahead of time which agent will be called next.
- **Dynamic control flow (Command)**: in LangGraph you can allow LLMs to decide parts of your application control flow. This can be achieved by using [`Command`](./low_level.md#command). A special case of this is a [supervisor tool-calling](#supervisor-tool-calling) architecture. In that case, the tool-calling LLM powering the supervisor agent will make decisions about the order in which the tools (agents) are being called.
```python
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START
model = ChatOpenAI()
def agent_1(state: MessagesState):
response = model.invoke(...)
return {"messages": [response]}
def agent_2(state: MessagesState):
response = model.invoke(...)
return {"messages": [response]}
builder = StateGraph(MessagesState)
builder.add_node(agent_1)
builder.add_node(agent_2)
# define the flow explicitly
builder.add_edge(START, "agent_1")
builder.add_edge("agent_1", "agent_2")
```
## Communication and state management
The most important thing when building multi-agent systems is figuring out how the agents communicate.
A common, generic way for agents to communicate is via a list of messages. This opens up the following questions:
- Do agents communicate [**via handoffs or via tool calls**](#handoffs-vs-tool-calls)?
- What messages are [**passed from one agent to the next**](#message-passing-between-agents)?
- How are [**handoffs represented in the list of messages**](#representing-handoffs-in-message-history)?
- How do you [**manage state for subagents**](#state-management-for-subagents)?
Additionally, if you are dealing with more complex agents or wish to keep individual agent state separate from the multi-agent system state, you may need to use [**different state schemas**](#using-different-state-schemas).
### Handoffs vs tool calls
What is the "payload" that is being passed around between agents? In most of the architectures discussed above, the agents communicate via [handoffs](#handoffs) and pass the [graph state](./low_level.md#state) as part of the handoff payload. Specifically, agents pass around lists of messages as part of the graph state. In the case of the [supervisor with tool-calling](#supervisor-tool-calling), the payloads are tool call arguments.

### Message passing between agents
The most common way for agents to communicate is via a shared state channel, typically a list of messages. This assumes that there is always at least a single channel (key) in the state that is shared by the agents (e.g., `messages`). When communicating via a shared message list, there is an additional consideration: should the agents [share the full history](#sharing-full-thought-process) of their thought process or only [the final result](#sharing-only-final-results)?

#### Sharing full thought process
Agents can **share the full history** of their thought process (i.e., "scratchpad") with all other agents. This "scratchpad" would typically look like a [list of messages](./low_level.md#why-use-messages). The benefit of sharing the full thought process is that it might help other agents make better decisions and improve reasoning ability for the system as a whole. The downside is that as the number of agents and their complexity grows, the "scratchpad" will grow quickly and might require additional strategies for [memory management](../how-tos/memory/add-memory.md).
#### Sharing only final results
Agents can have their own private "scratchpad" and only **share the final result** with the rest of the agents. This approach might work better for systems with many agents or agents that are more complex. In this case, you would need to define agents with [different state schemas](#using-different-state-schemas).
For agents called as tools, the supervisor determines the inputs based on the tool schema. Additionally, LangGraph allows [passing state](../how-tos/tool-calling.md#short-term-memory) to individual tools at runtime, so subordinate agents can access parent state, if needed.
#### Indicating agent name in messages
It can be helpful to indicate which agent a particular AI message is from, especially for long message histories. Some LLM providers (like OpenAI) support adding a `name` parameter to messages — you can use that to attach the agent name to the message. If that is not supported, you can consider manually injecting the agent name into the message content, e.g., `<name>alice</name><content>message from alice</content>`.
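For example, with LangChain message objects the agent name can be attached via the message's `name` field (a minimal sketch):
```python
from langchain_core.messages import AIMessage

# Tag the message with the originating agent; providers that support message
# names (e.g., OpenAI) will receive it, and it also helps when inspecting history.
msg = AIMessage(content="The flight has been rebooked.", name="alice")
```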
### Representing handoffs in message history
Handoffs are typically done via the LLM calling a dedicated [handoff tool](#handoffs-as-tools). This is represented as an [AI message](https://python.langchain.com/docs/concepts/messages/#aimessage) with tool calls that is passed to the next agent (LLM). Most LLM providers don't support receiving AI messages with tool calls **without** corresponding tool messages.
You therefore have two options:
1. Add an extra [tool message](https://python.langchain.com/docs/concepts/messages/#toolmessage) to the message list, e.g., "Successfully transferred to agent X"
2. Remove the AI message with the tool calls
In practice, we see that most developers opt for option (1).
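A minimal sketch of option (1), using a hypothetical `transfer_to_bob` tool call and tool call ID:
```python
from langchain_core.messages import AIMessage, ToolMessage

# The AI message produced by the agent that initiated the handoff.
ai_message = AIMessage(
    content="",
    tool_calls=[{"name": "transfer_to_bob", "args": {}, "id": "call_1", "type": "tool_call"}],
)

# Pair it with a tool message so the next LLM sees a complete tool-call exchange.
handoff_messages = [
    ai_message,
    ToolMessage(content="Successfully transferred to agent bob", tool_call_id="call_1"),
]
```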
### State management for subagents
A common practice is to have multiple agents communicating on a shared message list, but only [adding their final messages to the list](#sharing-only-final-results). This means that any intermediate messages (e.g., tool calls) are not saved in this list.
What if you __do__ want to save these messages so that if this particular subagent is invoked in the future you can pass those back in?
There are two high-level approaches to achieve that:
1. Store these messages in the shared message list, but filter the list before passing it to the subagent LLM. For example, you can choose to filter out all tool calls from **other** agents.
2. Store a separate message list for each agent (e.g., `alice_messages`) in the subagent's graph state. This would be their "view" of what the message history looks like.
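To illustrate approach (2), a subagent can maintain its own message channel alongside the shared one (the `alice_messages` key and node logic are assumptions):
```python
from typing import Annotated

from langgraph.graph import MessagesState
from langgraph.graph.message import add_messages


class MultiAgentState(MessagesState):
    # Alice's private view of the conversation, kept alongside the shared `messages` list.
    alice_messages: Annotated[list, add_messages]


def alice(state: MultiAgentState):
    # Alice would reason over `state["alice_messages"]` (including her intermediate tool calls),
    # but only publishes a final answer to the shared `messages` channel.
    final_answer = {"role": "ai", "content": "Here is my final answer.", "name": "alice"}
    return {"messages": [final_answer], "alice_messages": [final_answer]}
```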
### Using different state schemas
An agent might need to have a different state schema from the rest of the agents. For example, a search agent might only need to keep track of queries and retrieved documents. There are two ways to achieve this in LangGraph:
- Define [subgraph](./subgraphs.md) agents with a separate state schema. If there are no shared state keys (channels) between the subgraph and the parent graph, it’s important to [add input / output transformations](../how-tos/subgraph.md#different-state-schemas) so that the parent graph knows how to communicate with the subgraphs.
- Define agent node functions with a [private input state schema](../how-tos/graph-api.md/#pass-private-state-between-nodes) that is distinct from the overall graph state schema. This allows passing information that is only needed for executing that particular agent.
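As a sketch of the first approach, a search subgraph with its own schema can be wrapped in a parent-graph node that translates to and from `MessagesState` (the `SearchState` schema and retrieval step are assumptions):
```python
from typing_extensions import TypedDict

from langgraph.graph import StateGraph, MessagesState, START


class SearchState(TypedDict):
    query: str
    documents: list[str]


def run_search(state: SearchState):
    # Placeholder for a real retrieval step.
    return {"documents": [f"doc about {state['query']}"]}


search_builder = StateGraph(SearchState)
search_builder.add_node(run_search)
search_builder.add_edge(START, "run_search")
search_agent = search_builder.compile()


def call_search_agent(state: MessagesState):
    # Input transformation: parent messages -> subgraph query.
    query = state["messages"][-1].content
    result = search_agent.invoke({"query": query, "documents": []})
    # Output transformation: subgraph documents -> parent messages.
    summary = f"Found {len(result['documents'])} document(s) for '{query}'."
    return {"messages": [{"role": "ai", "content": summary, "name": "search_agent"}]}


builder = StateGraph(MessagesState)
builder.add_node(call_search_agent)
builder.add_edge(START, "call_search_agent")
graph = builder.compile()
```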
---
concepts/langgraph_cloud.md
---
---
search:
boost: 2
---
# Cloud SaaS
To deploy a [LangGraph Server](../concepts/langgraph_server.md), follow the how-to guide for [how to deploy to Cloud SaaS](../cloud/deployment/cloud.md).
## Overview
The Cloud SaaS deployment option is a fully managed model for deployment where we manage the [control plane](./langgraph_control_plane.md) and [data plane](./langgraph_data_plane.md) in our cloud.
| | [Control plane](../concepts/langgraph_control_plane.md) | [Data plane](../concepts/langgraph_data_plane.md) |
|-------------------|-------------------|------------|
| **What is it?** | <ul><li>Control plane UI for creating deployments and revisions</li><li>Control plane APIs for creating deployments and revisions</li></ul> | <ul><li>Data plane "listener" for reconciling deployments with control plane state</li><li>LangGraph Servers</li><li>Postgres, Redis, etc.</li></ul> |
| **Where is it hosted?** | LangChain's cloud | LangChain's cloud |
| **Who provisions and manages it?** | LangChain | LangChain |
## Architecture

---
concepts/functional_api.md
---
---
search:
boost: 2
---
# Functional API concepts
## Overview
The **Functional API** allows you to add LangGraph's key features — [persistence](./persistence.md), [memory](../how-tos/memory/add-memory.md), [human-in-the-loop](./human_in_the_loop.md), and [streaming](./streaming.md) — to your applications with minimal changes to your existing code.
It is designed to integrate these features into existing code that may use standard language primitives for branching and control flow, such as `if` statements, `for` loops, and function calls. Unlike many data orchestration frameworks that require restructuring code into an explicit pipeline or DAG, the Functional API allows you to incorporate these capabilities without enforcing a rigid execution model.
The Functional API uses two key building blocks:
- **`@entrypoint`** – Marks a function as the starting point of a workflow, encapsulating logic and managing execution flow, including handling long-running tasks and interrupts.
- **`@task`** – Represents a discrete unit of work, such as an API call or data processing step, that can be executed asynchronously within an entrypoint. Tasks return a future-like object that can be awaited or resolved synchronously.
This provides a minimal abstraction for building workflows with state management and streaming.
!!! tip
For information on how to use the functional API, see [Use Functional API](../how-tos/use-functional-api.md).
## Functional API vs. Graph API
For users who prefer a more declarative approach, LangGraph's [Graph API](./low_level.md) allows you to define workflows using a Graph paradigm. Both APIs share the same underlying runtime, so you can use them together in the same application.
Here are some key differences:
- **Control flow**: The Functional API does not require thinking about graph structure. You can use standard Python constructs to define workflows. This will usually trim the amount of code you need to write.
- **Short-term memory**: The **Graph API** requires declaring a [**State**](./low_level.md#state) and may require defining [**reducers**](./low_level.md#reducers) to manage updates to the graph state. `@entrypoint` and `@task` do not require explicit state management, as their state is scoped to the function and is not shared across functions.
- **Checkpointing**: Both APIs generate and use checkpoints. In the **Graph API** a new checkpoint is generated after every [superstep](./low_level.md). In the **Functional API**, when tasks are executed, their results are saved to an existing checkpoint associated with the given entrypoint instead of creating a new checkpoint.
- **Visualization**: The Graph API makes it easy to visualize the workflow as a graph which can be useful for debugging, understanding the workflow, and sharing with others. The Functional API does not support visualization as the graph is dynamically generated during runtime.
## Example
Below we demonstrate a simple application that writes an essay and [interrupts](human_in_the_loop.md) to request human review.
```python
import time

from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint, task
from langgraph.types import interrupt
@task
def write_essay(topic: str) -> str:
"""Write an essay about the given topic."""
time.sleep(1) # A placeholder for a long-running task.
return f"An essay about topic: {topic}"
@entrypoint(checkpointer=MemorySaver())
def workflow(topic: str) -> dict:
"""A simple workflow that writes an essay and asks for a review."""
essay = write_essay("cat").result()
is_approved = interrupt({
# Any json-serializable payload provided to interrupt as argument.
# It will be surfaced on the client side as an Interrupt when streaming data
# from the workflow.
"essay": essay, # The essay we want reviewed.
# We can add any additional information that we need.
# For example, introduce a key called "action" with some instructions.
"action": "Please approve/reject the essay",
})
return {
"essay": essay, # The essay that was generated
"is_approved": is_approved, # Response from HIL
}
```
??? example "Detailed Explanation"
This workflow will write an essay about the topic "cat" and then pause to get a review from a human. The workflow can be interrupted for an indefinite amount of time until a review is provided.
When the workflow is resumed, it executes from the very start, but because the result of the `write_essay` task was already saved, the task result will be loaded from the checkpoint instead of being recomputed.
```python
import time
import uuid
from langgraph.func import entrypoint, task
from langgraph.types import interrupt
from langgraph.checkpoint.memory import MemorySaver
@task
def write_essay(topic: str) -> str:
"""Write an essay about the given topic."""
time.sleep(1) # This is a placeholder for a long-running task.
return f"An essay about topic: {topic}"
@entrypoint(checkpointer=MemorySaver())
def workflow(topic: str) -> dict:
"""A simple workflow that writes an essay and asks for a review."""
essay = write_essay("cat").result()
is_approved = interrupt({
# Any json-serializable payload provided to interrupt as argument.
# It will be surfaced on the client side as an Interrupt when streaming data
# from the workflow.
"essay": essay, # The essay we want reviewed.
# We can add any additional information that we need.
# For example, introduce a key called "action" with some instructions.
"action": "Please approve/reject the essay",
})
return {
"essay": essay, # The essay that was generated
"is_approved": is_approved, # Response from HIL
}
thread_id = str(uuid.uuid4())
config = {
"configurable": {
"thread_id": thread_id
}
}
for item in workflow.stream("cat", config):
print(item)
```
```pycon
{'write_essay': 'An essay about topic: cat'}
{'__interrupt__': (Interrupt(value={'essay': 'An essay about topic: cat', 'action': 'Please approve/reject the essay'}, resumable=True, ns=['workflow:f7b8508b-21c0-8b4c-5958-4e8de74d2684'], when='during'),)}
```
An essay has been written and is ready for review. Once the review is provided, we can resume the workflow:
```python
from langgraph.types import Command
# Get review from a user (e.g., via a UI)
# In this case, we're using a bool, but this can be any json-serializable value.
human_review = True
for item in workflow.stream(Command(resume=human_review), config):
print(item)
```
```pycon
{'workflow': {'essay': 'An essay about topic: cat', 'is_approved': True}}
```
The workflow has been completed and the review has been added to the essay.
## Entrypoint
The [`@entrypoint`][langgraph.func.entrypoint] decorator can be used to create a workflow from a function. It encapsulates workflow logic and manages execution flow, including handling *long-running tasks* and [interrupts](./human_in_the_loop.md).
### Definition
An **entrypoint** is defined by decorating a function with the `@entrypoint` decorator.
The function **must accept a single positional argument**, which serves as the workflow input. If you need to pass multiple pieces of data, use a dictionary as the input type for the first argument.
Decorating a function with an `entrypoint` produces a [`Pregel`][langgraph.pregel.Pregel.stream] instance which helps to manage the execution of the workflow (e.g., handles streaming, resumption, and checkpointing).
You will usually want to pass a **checkpointer** to the `@entrypoint` decorator to enable persistence and use features like **human-in-the-loop**.
=== "Sync"
```python
from langgraph.func import entrypoint
@entrypoint(checkpointer=checkpointer)
def my_workflow(some_input: dict) -> int:
# some logic that may involve long-running tasks like API calls,
# and may be interrupted for human-in-the-loop.
...
return result
```
=== "Async"
```python
from langgraph.func import entrypoint
@entrypoint(checkpointer=checkpointer)
async def my_workflow(some_input: dict) -> int:
# some logic that may involve long-running tasks like API calls,
# and may be interrupted for human-in-the-loop
...
return result
```
!!! important "Serialization"
The **inputs** and **outputs** of entrypoints must be JSON-serializable to support checkpointing. Please see the [serialization](#serialization) section for more details.
### Injectable parameters
When declaring an `entrypoint`, you can request access to additional parameters that will be injected automatically at run time. These parameters include:
| Parameter | Description |
|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **previous** | Access the state associated with the previous `checkpoint` for the given thread. See [short-term-memory](#short-term-memory). |
| **store** | An instance of [BaseStore][langgraph.store.base.BaseStore]. Useful for [long-term memory](../how-tos/use-functional-api.md#long-term-memory). |
| **writer** | Use this to access the StreamWriter when working with async Python < 3.11. See [streaming with the functional API](../how-tos/use-functional-api.md#streaming) for details. |
| **config** | For accessing run time configuration. See [RunnableConfig](https://python.langchain.com/docs/concepts/runnables/#runnableconfig) for information. |
!!! important
Declare the parameters with the appropriate name and type annotation.
??? example "Requesting Injectable Parameters"
```python
from typing import Any

from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint
from langgraph.store.base import BaseStore
from langgraph.store.memory import InMemoryStore
from langgraph.types import StreamWriter

checkpointer = MemorySaver()  # A checkpointer for persistence / short-term memory
in_memory_store = InMemoryStore()  # An instance of InMemoryStore for long-term memory
@entrypoint(
checkpointer=checkpointer, # Specify the checkpointer
store=in_memory_store # Specify the store
)
def my_workflow(
some_input: dict, # The input (e.g., passed via `invoke`)
*,
previous: Any = None, # For short-term memory
store: BaseStore, # For long-term memory
writer: StreamWriter, # For streaming custom data
config: RunnableConfig # For accessing the configuration passed to the entrypoint
) -> ...:
```
### Executing
Using the [`@entrypoint`](#entrypoint) yields a [`Pregel`][langgraph.pregel.Pregel.stream] object that can be executed using the `invoke`, `ainvoke`, `stream`, and `astream` methods.
=== "Invoke"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
my_workflow.invoke(some_input, config) # Wait for the result synchronously
```
=== "Async Invoke"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
await my_workflow.ainvoke(some_input, config) # Await result asynchronously
```
=== "Stream"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
for chunk in my_workflow.stream(some_input, config):
print(chunk)
```
=== "Async Stream"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
async for chunk in my_workflow.astream(some_input, config):
print(chunk)
```
### Resuming
Resuming an execution after an [interrupt][langgraph.types.interrupt] can be done by passing a **resume** value to the [Command][langgraph.types.Command] primitive.
=== "Invoke"
```python
from langgraph.types import Command
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
my_workflow.invoke(Command(resume=some_resume_value), config)
```
=== "Async Invoke"
```python
from langgraph.types import Command
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
await my_workflow.ainvoke(Command(resume=some_resume_value), config)
```
=== "Stream"
```python
from langgraph.types import Command
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
for chunk in my_workflow.stream(Command(resume=some_resume_value), config):
print(chunk)
```
=== "Async Stream"
```python
from langgraph.types import Command
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
async for chunk in my_workflow.astream(Command(resume=some_resume_value), config):
print(chunk)
```
**Resuming after an error**
To resume after an error, run the `entrypoint` with an input of `None` and the same **thread id** (config).
This assumes that the underlying **error** has been resolved and execution can proceed successfully.
=== "Invoke"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
my_workflow.invoke(None, config)
```
=== "Async Invoke"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
await my_workflow.ainvoke(None, config)
```
=== "Stream"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
for chunk in my_workflow.stream(None, config):
print(chunk)
```
=== "Async Stream"
```python
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
async for chunk in my_workflow.astream(None, config):
print(chunk)
```
### Short-term memory
When an `entrypoint` is defined with a `checkpointer`, it stores information between successive invocations on the same **thread id** in [checkpoints](persistence.md#checkpoints).
This allows accessing the state from the previous invocation using the `previous` parameter.
By default, the `previous` parameter is the return value of the previous invocation.
```python
@entrypoint(checkpointer=checkpointer)
def my_workflow(number: int, *, previous: Any = None) -> int:
previous = previous or 0
return number + previous
config = {
"configurable": {
"thread_id": "some_thread_id"
}
}
my_workflow.invoke(1, config) # 1 (previous was None)
my_workflow.invoke(2, config) # 3 (previous was 1 from the previous invocation)
```
#### `entrypoint.final`
[entrypoint.final][langgraph.func.entrypoint.final] is a special primitive that can be returned from an entrypoint and allows **decoupling** the value that is **saved in the checkpoint** from the **return value of the entrypoint**.
The first value is the return value of the entrypoint, and the second value is the value that will be saved in the checkpoint. The type annotation is `entrypoint.final[return_type, save_type]`.
```python
@entrypoint(checkpointer=checkpointer)
def my_workflow(number: int, *, previous: Any = None) -> entrypoint.final[int, int]:
previous = previous or 0
# This will return the previous value to the caller, saving
# 2 * number to the checkpoint, which will be used in the next invocation
# for the `previous` parameter.
return entrypoint.final(value=previous, save=2 * number)
config = {
"configurable": {
"thread_id": "1"
}
}
my_workflow.invoke(3, config) # 0 (previous was None)
my_workflow.invoke(1, config) # 6 (previous was 3 * 2 from the previous invocation)
```
## Task
A **task** represents a discrete unit of work, such as an API call or data processing step. It has two key characteristics:
* **Asynchronous Execution**: Tasks are designed to be executed asynchronously, allowing multiple operations to run concurrently without blocking.
* **Checkpointing**: Task results are saved to a checkpoint, enabling resumption of the workflow from the last saved state. (See [persistence](persistence.md) for more details).
### Definition
Tasks are defined using the `@task` decorator, which wraps a regular Python function.
```python
from langgraph.func import task
@task()
def slow_computation(input_value):
# Simulate a long-running operation
...
return result
```
!!! important "Serialization"
The **outputs** of tasks must be JSON-serializable to support checkpointing.
### Execution
**Tasks** can only be called from within an **entrypoint**, another **task**, or a [state graph node](./low_level.md#nodes).
Tasks *cannot* be called directly from the main application code.
When you call a **task**, it returns *immediately* with a future object. A future is a placeholder for a result that will be available later.
To obtain the result of a **task**, you can either wait for it synchronously (using `result()`) or await it asynchronously (using `await`).
=== "Synchronous Invocation"
```python
@entrypoint(checkpointer=checkpointer)
def my_workflow(some_input: int) -> int:
future = slow_computation(some_input)
return future.result() # Wait for the result synchronously
```
=== "Asynchronous Invocation"
```python
@entrypoint(checkpointer=checkpointer)
async def my_workflow(some_input: int) -> int:
return await slow_computation(some_input) # Await result asynchronously
```
## When to use a task
**Tasks** are useful in the following scenarios:
- **Checkpointing**: When you need to save the result of a long-running operation to a checkpoint, so you don't need to recompute it when resuming the workflow.
- **Human-in-the-loop**: If you're building a workflow that requires human intervention, you MUST use **tasks** to encapsulate any randomness (e.g., API calls) to ensure that the workflow can be resumed correctly. See the [determinism](#determinism) section for more details.
- **Parallel Execution**: For I/O-bound tasks, **tasks** enable parallel execution, allowing multiple operations to run concurrently without blocking (e.g., calling multiple APIs).
- **Observability**: Wrapping operations in **tasks** provides a way to track the progress of the workflow and monitor the execution of individual operations using [LangSmith](https://docs.smith.langchain.com/).
- **Retryable Work**: When work needs to be retried to handle failures or inconsistencies, **tasks** provide a way to encapsulate and manage the retry logic.
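For example, here is a minimal sketch of the parallel-execution case (the `fetch_page` task, URLs, and in-memory checkpointer are illustrative):
```python
import requests
from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint, task

@task
def fetch_page(url: str) -> str:
    """I/O-bound work wrapped in a task so multiple calls can run concurrently."""
    return requests.get(url).text[:100]

@entrypoint(checkpointer=MemorySaver())
def fetch_all(urls: list[str]) -> list[str]:
    # Each call returns a future immediately; gather the results afterwards.
    futures = [fetch_page(url) for url in urls]
    return [f.result() for f in futures]

config = {"configurable": {"thread_id": "1"}}
fetch_all.invoke(["https://www.example.com", "https://www.example.org"], config)
```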
## Serialization
There are two key aspects to serialization in LangGraph:
1. `@entrypoint` inputs and outputs must be JSON-serializable.
2. `@task` outputs must be JSON-serializable.
These requirements are necessary for enabling checkpointing and workflow resumption. Use Python primitives like dictionaries, lists, strings, numbers, and booleans to ensure that your inputs and outputs are serializable.
Serialization ensures that workflow state, such as task results and intermediate values, can be reliably saved and restored. This is critical for enabling human-in-the-loop interactions, fault tolerance, and parallel execution.
Providing non-serializable inputs or outputs will result in a runtime error when a workflow is configured with a checkpointer.
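As a minimal sketch, assuming an in-memory checkpointer, an entrypoint whose input and output are plain dictionaries of primitives checkpoints cleanly; returning something like an open file handle or a database connection would not:
```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import entrypoint

@entrypoint(checkpointer=MemorySaver())
def summarize(inputs: dict) -> dict:
    # dict in, dict out: both are JSON-serializable, so checkpointing works.
    return {"summary": inputs["text"][:50], "length": len(inputs["text"])}

summarize.invoke({"text": "Some text to summarize."}, {"configurable": {"thread_id": "1"}})
```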
## Determinism
To utilize features like **human-in-the-loop**, any randomness should be encapsulated inside of **tasks**. This guarantees that when execution is halted (e.g., for human in the loop) and then resumed, it will follow the same *sequence of steps*, even if **task** results are non-deterministic.
LangGraph achieves this behavior by persisting **task** and [**subgraph**](./subgraphs.md) results as they execute. A well-designed workflow ensures that resuming execution follows the *same sequence of steps*, allowing previously computed results to be retrieved correctly without having to re-execute them. This is particularly useful for long-running **tasks** or **tasks** with non-deterministic results, as it avoids repeating previously completed work and allows execution to resume from essentially the same point.
While different runs of a workflow can produce different results, resuming a **specific** run should always follow the same sequence of recorded steps. This allows LangGraph to efficiently look up **task** and **subgraph** results that were executed prior to the graph being interrupted and avoid recomputing them.
## Idempotency
Idempotency ensures that running the same operation multiple times produces the same result. This helps prevent duplicate API calls and redundant processing if a step is rerun due to a failure. Always place API calls inside **task** functions for checkpointing, and design them to be idempotent in case of re-execution. Re-execution can occur if a **task** starts but does not complete successfully; if the workflow is then resumed, the **task** will run again. Use idempotency keys or verify existing results to avoid duplication.
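As a sketch, assuming a hypothetical orders API that deduplicates requests via an `Idempotency-Key` header, an idempotent task might look like this:
```python
import requests
from langgraph.func import task

@task
def create_order(order_id: str, payload: dict) -> dict:
    # If this task is re-executed on resume, the provider sees the same
    # idempotency key and does not create a duplicate order.
    response = requests.post(
        "https://api.example.com/orders",  # hypothetical endpoint
        json=payload,
        headers={"Idempotency-Key": order_id},
    )
    return response.json()
```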
## Common Pitfalls
### Handling side effects
Encapsulate side effects (e.g., writing to a file, sending an email) in tasks to ensure they are not executed multiple times when resuming a workflow.
=== "Incorrect"
In this example, a side effect (writing to a file) is directly included in the workflow, so it will be executed a second time when resuming the workflow.
```python hl_lines="5 6"
@entrypoint(checkpointer=checkpointer)
def my_workflow(inputs: dict) -> int:
# This code will be executed a second time when resuming the workflow.
# Which is likely not what you want.
with open("output.txt", "w") as f:
f.write("Side effect executed")
value = interrupt("question")
return value
```
=== "Correct"
In this example, the side effect is encapsulated in a task, ensuring consistent execution upon resumption.
```python hl_lines="3 4"
from langgraph.func import task
@task
def write_to_file():
with open("output.txt", "w") as f:
f.write("Side effect executed")
@entrypoint(checkpointer=checkpointer)
def my_workflow(inputs: dict) -> int:
# The side effect is now encapsulated in a task.
write_to_file().result()
value = interrupt("question")
return value
```
### Non-deterministic control flow
Operations that might give different results each time (like getting current time or random numbers) should be encapsulated in tasks to ensure that on resume, the same result is returned.
* In a task: Get random number (5) → interrupt → resume → (returns 5 again) → ...
* Not in a task: Get random number (5) → interrupt → resume → get new random number (7) → ...
This is especially important when using **human-in-the-loop** workflows with multiple `interrupt` calls. LangGraph keeps a list
of resume values for each task/entrypoint. When an interrupt is encountered, it's matched with the corresponding resume value.
This matching is strictly **index-based**, so the order of the resume values should match the order of the interrupts.
If the order of execution is not maintained when resuming, one `interrupt` call may be matched with the wrong `resume` value, leading to incorrect results.
Please read the section on [determinism](#determinism) for more details.
=== "Incorrect"
In this example, the workflow uses the current time to determine which task to execute. This is non-deterministic because the result of the workflow depends on the time at which it is executed.
```python hl_lines="6"
from langgraph.func import entrypoint
@entrypoint(checkpointer=checkpointer)
def my_workflow(inputs: dict) -> int:
t0 = inputs["t0"]
t1 = time.time()
delta_t = t1 - t0
if delta_t > 1:
result = slow_task(1).result()
value = interrupt("question")
else:
result = slow_task(2).result()
value = interrupt("question")
return {
"result": result,
"value": value
}
```
=== "Correct"
In this example, the workflow uses the input `t0` to determine which task to execute. This is deterministic because the result of the workflow depends only on the input.
```python hl_lines="5 6 12"
import time
from langgraph.func import task
@task
def get_time() -> float:
return time.time()
@entrypoint(checkpointer=checkpointer)
def my_workflow(inputs: dict) -> int:
t0 = inputs["t0"]
t1 = get_time().result()
delta_t = t1 - t0
if delta_t > 1:
result = slow_task(1).result()
value = interrupt("question")
else:
result = slow_task(2).result()
value = interrupt("question")
return {
"result": result,
"value": value
}
```
---
concepts/durable_execution.md
---
---
search:
boost: 2
---
# Durable Execution
**Durable execution** is a technique in which a process or workflow saves its progress at key points, allowing it to pause and later resume exactly where it left off. This is particularly useful in scenarios that require [human-in-the-loop](./human_in_the_loop.md), where users can inspect, validate, or modify the process before continuing, and in long-running tasks that might encounter interruptions or errors (e.g., calls to an LLM timing out). By preserving completed work, durable execution enables a process to resume without reprocessing previous steps -- even after a significant delay (e.g., a week later).
LangGraph's built-in [persistence](./persistence.md) layer provides durable execution for workflows, ensuring that the state of each execution step is saved to a durable store. This capability guarantees that if a workflow is interrupted -- whether by a system failure or for [human-in-the-loop](./human_in_the_loop.md) interactions -- it can be resumed from its last recorded state.
!!! tip
If you are using LangGraph with a checkpointer, you already have durable execution enabled. You can pause and resume workflows at any point, even after interruptions or failures.
To make the most of durable execution, ensure that your workflow is designed to be [deterministic](#determinism-and-consistent-replay) and [idempotent](#determinism-and-consistent-replay) and wrap any side effects or non-deterministic operations inside [tasks](./functional_api.md#task). You can use [tasks](./functional_api.md#task) from both the [StateGraph (Graph API)](./low_level.md) and the [Functional API](./functional_api.md).
## Requirements
To leverage durable execution in LangGraph, you need to:
1. Enable [persistence](./persistence.md) in your workflow by specifying a [checkpointer](./persistence.md#checkpointer-libraries) that will save workflow progress.
2. Specify a [thread identifier](./persistence.md#threads) when executing a workflow. This will track the execution history for a particular instance of the workflow.
3. Wrap any non-deterministic operations (e.g., random number generation) or operations with side effects (e.g., file writes, API calls) inside [tasks][langgraph.func.task] to ensure that when a workflow is resumed, these operations are not repeated for the particular run, and instead their results are retrieved from the persistence layer. For more information, see [Determinism and Consistent Replay](#determinism-and-consistent-replay).
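A minimal sketch of requirements 1 and 2 with the Graph API might look like the following (the state and node are illustrative; requirement 3 is covered in the sections below):
```python
import uuid
from typing_extensions import TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    value: str

def process(state: State):
    return {"value": state["value"].upper()}

builder = StateGraph(State)
builder.add_node("process", process)
builder.add_edge(START, "process")
builder.add_edge("process", END)

# 1. Enable persistence by specifying a checkpointer.
graph = builder.compile(checkpointer=MemorySaver())

# 2. Specify a thread identifier to track this workflow instance.
config = {"configurable": {"thread_id": str(uuid.uuid4())}}
graph.invoke({"value": "raw input"}, config)
```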
## Determinism and Consistent Replay
When you resume a workflow run, the code does **NOT** resume from the **same line of code** where execution stopped; instead, it will identify an appropriate [starting point](#starting-points-for-resuming-workflows) from which to pick up where it left off. This means that the workflow will replay all steps from the [starting point](#starting-points-for-resuming-workflows) until it reaches the point where it was stopped.
As a result, when you are writing a workflow for durable execution, you must wrap any non-deterministic operations (e.g., random number generation) and any operations with side effects (e.g., file writes, API calls) inside [tasks](./functional_api.md#task) or [nodes](./low_level.md#nodes).
To ensure that your workflow is deterministic and can be consistently replayed, follow these guidelines:
- **Avoid Repeating Work**: If a [node](./low_level.md#nodes) contains multiple operations with side effects (e.g., logging, file writes, or network calls), wrap each operation in a separate **task**. This ensures that when the workflow is resumed, the operations are not repeated, and their results are retrieved from the persistence layer.
- **Encapsulate Non-Deterministic Operations:** Wrap any code that might yield non-deterministic results (e.g., random number generation) inside **tasks** or **nodes**. This ensures that, upon resumption, the workflow follows the exact recorded sequence of steps with the same outcomes.
- **Use Idempotent Operations**: When possible, ensure that side effects (e.g., API calls, file writes) are idempotent. This means that if an operation is retried after a failure in the workflow, it will have the same effect as the first time it was executed. This is particularly important for operations that result in data writes. If a **task** starts but fails to complete successfully, the workflow's resumption will re-run the **task**, relying on recorded outcomes to maintain consistency. Use idempotency keys or verify existing results to avoid unintended duplication, ensuring a smooth and predictable workflow execution.
For some examples of pitfalls to avoid, see the [Common Pitfalls](./functional_api.md#common-pitfalls) section in the functional API, which shows
how to structure your code using **tasks** to avoid these issues. The same principles apply to the [StateGraph (Graph API)][langgraph.graph.state.StateGraph].
## Using tasks in nodes
If a [node](./low_level.md#nodes) contains multiple operations, you may find it easier to convert each operation into a **task** rather than refactor the operations into individual nodes.
=== "Original"
```python hl_lines="16"
from typing import NotRequired
from typing_extensions import TypedDict
import uuid
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
import requests
# Define a TypedDict to represent the state
class State(TypedDict):
url: str
result: NotRequired[str]
def call_api(state: State):
"""Example node that makes an API request."""
result = requests.get(state['url']).text[:100] # Side-effect
return {
"result": result
}
# Create a StateGraph builder and add a node for the call_api function
builder = StateGraph(State)
builder.add_node("call_api", call_api)
# Connect the start and end nodes to the call_api node
builder.add_edge(START, "call_api")
builder.add_edge("call_api", END)
# Specify a checkpointer
checkpointer = MemorySaver()
# Compile the graph with the checkpointer
graph = builder.compile(checkpointer=checkpointer)
# Define a config with a thread ID.
thread_id = uuid.uuid4()
config = {"configurable": {"thread_id": thread_id}}
# Invoke the graph
graph.invoke({"url": "https://www.example.com"}, config)
```
=== "With task"
```python hl_lines="19 23"
from typing import NotRequired
from typing_extensions import TypedDict
import uuid
from langgraph.checkpoint.memory import MemorySaver
from langgraph.func import task
from langgraph.graph import StateGraph, START, END
import requests
# Define a TypedDict to represent the state
class State(TypedDict):
urls: list[str]
results: NotRequired[list[str]]
@task
def _make_request(url: str):
"""Make a request."""
return requests.get(url).text[:100]
def call_api(state: State):
"""Example node that makes an API request."""
requests = [_make_request(url) for url in state['urls']]
results = [request.result() for request in requests]
return {
"results": results
}
# Create a StateGraph builder and add a node for the call_api function
builder = StateGraph(State)
builder.add_node("call_api", call_api)
# Connect the start and end nodes to the call_api node
builder.add_edge(START, "call_api")
builder.add_edge("call_api", END)
# Specify a checkpointer
checkpointer = MemorySaver()
# Compile the graph with the checkpointer
graph = builder.compile(checkpointer=checkpointer)
# Define a config with a thread ID.
thread_id = uuid.uuid4()
config = {"configurable": {"thread_id": thread_id}}
# Invoke the graph
graph.invoke({"urls": ["https://www.example.com"]}, config)
```
## Resuming Workflows
Once you have enabled durable execution in your workflow, you can resume execution for the following scenarios:
- **Pausing and Resuming Workflows:** Use the [interrupt][langgraph.types.interrupt] function to pause a workflow at specific points and the [Command][langgraph.types.Command] primitive to resume it with updated state. See [**Human-in-the-Loop**](./human_in_the_loop.md) for more details.
- **Recovering from Failures:** Automatically resume workflows from the last successful checkpoint after an exception (e.g., an LLM provider outage). This involves re-executing the workflow with the same thread identifier while providing `None` as the input value (see this [example](../how-tos/use-functional-api.md#resuming-after-an-error) with the functional API, and the sketch below).
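For the failure-recovery case, a sketch (assuming `graph` was compiled with a checkpointer; the input shape is illustrative) looks like this:
```python
config = {"configurable": {"thread_id": "some-thread"}}
try:
    graph.invoke({"url": "https://www.example.com"}, config)
except Exception:
    # e.g., a transient LLM or network outage partway through the run
    pass

# Re-invoke with `None` and the same thread ID to resume from the
# last successful checkpoint instead of starting over.
graph.invoke(None, config)
```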
## Starting Points for Resuming Workflows
* If you're using a [StateGraph (Graph API)][langgraph.graph.state.StateGraph], the starting point is the beginning of the [**node**](./low_level.md#nodes) where execution stopped.
* If you're making a subgraph call inside a node, the starting point will be the **parent** node that called the subgraph that was halted.
Inside the subgraph, the starting point will be the specific [**node**](./low_level.md#nodes) where execution stopped.
* If you're using the Functional API, the starting point is the beginning of the [**entrypoint**](./functional_api.md#entrypoint) where execution stopped.
---
concepts/why-langgraph.md
---
# Overview
LangGraph is built for developers who want to build powerful, adaptable AI agents. Developers choose LangGraph for:
- **Reliability and controllability.** Steer agent actions with moderation checks and human-in-the-loop approvals. LangGraph persists context for long-running workflows, keeping your agents on course.
- **Low-level and extensible.** Build custom agents with fully descriptive, low-level primitives free from rigid abstractions that limit customization. Design scalable multi-agent systems, with each agent serving a specific role tailored to your use case.
- **First-class streaming support.** With token-by-token streaming and streaming of intermediate steps, LangGraph gives users clear visibility into agent reasoning and actions as they unfold in real time.
## Learn LangGraph basics
To get acquainted with LangGraph's key concepts and features, complete the following LangGraph basics tutorial series:
1. [Build a basic chatbot](../tutorials/get-started/1-build-basic-chatbot.md)
2. [Add tools](../tutorials/get-started/2-add-tools.md)
3. [Add memory](../tutorials/get-started/3-add-memory.md)
4. [Add human-in-the-loop controls](../tutorials/get-started/4-human-in-the-loop.md)
5. [Customize state](../tutorials/get-started/5-customize-state.md)
6. [Time travel](../tutorials/get-started/6-time-travel.md)
In completing this series of tutorials, you will build a support chatbot in LangGraph that can:
* ✅ **Answer common questions** by searching the web
* ✅ **Maintain conversation state** across calls
* ✅ **Route complex queries** to a human for review
* ✅ **Use custom state** to control its behavior
* ✅ **Rewind and explore** alternative conversation paths
---
concepts/low_level.md
---
---
search:
boost: 2
---
# Graph API concepts
## Graphs
At its core, LangGraph models agent workflows as graphs. You define the behavior of your agents using three key components:
1. [`State`](#state): A shared data structure that represents the current snapshot of your application. It can be any Python type, but is typically a `TypedDict` or Pydantic `BaseModel`.
2. [`Nodes`](#nodes): Python functions that encode the logic of your agents. They receive the current `State` as input, perform some computation or side-effect, and return an updated `State`.
3. [`Edges`](#edges): Python functions that determine which `Node` to execute next based on the current `State`. They can be conditional branches or fixed transitions.
By composing `Nodes` and `Edges`, you can create complex, looping workflows that evolve the `State` over time. The real power, though, comes from how LangGraph manages that `State`. To emphasize: `Nodes` and `Edges` are nothing more than Python functions - they can contain an LLM or just good ol' Python code.
In short: _nodes do the work, edges tell what to do next_.
LangGraph's underlying graph algorithm uses [message passing](https://en.wikipedia.org/wiki/Message_passing) to define a general program. When a Node completes its operation, it sends messages along one or more edges to other node(s). These recipient nodes then execute their functions, pass the resulting messages to the next set of nodes, and the process continues. Inspired by Google's [Pregel](https://research.google/pubs/pregel-a-system-for-large-scale-graph-processing/) system, the program proceeds in discrete "super-steps."
A super-step can be considered a single iteration over the graph nodes. Nodes that run in parallel are part of the same super-step, while nodes that run sequentially belong to separate super-steps. At the start of graph execution, all nodes begin in an `inactive` state. A node becomes `active` when it receives a new message (state) on any of its incoming edges (or "channels"). The active node then runs its function and responds with updates. At the end of each super-step, nodes with no incoming messages vote to `halt` by marking themselves as `inactive`. The graph execution terminates when all nodes are `inactive` and no messages are in transit.
### StateGraph
The `StateGraph` class is the main graph class to use. It is parameterized by a user-defined `State` object.
### Compiling your graph
To build your graph, you first define the [state](#state), you then add [nodes](#nodes) and [edges](#edges), and then you compile it. What exactly is compiling your graph and why is it needed?
Compiling is a pretty simple step. It provides a few basic checks on the structure of your graph (no orphaned nodes, etc). It is also where you can specify runtime args like [checkpointers](./persistence.md) and breakpoints. You compile your graph by just calling the `.compile` method:
```python
graph = graph_builder.compile(...)
```
You **MUST** compile your graph before you can use it.
## State
The first thing you do when you define a graph is define the `State` of the graph. The `State` consists of the [schema of the graph](#schema) as well as [`reducer` functions](#reducers) which specify how to apply updates to the state. The schema of the `State` will be the input schema to all `Nodes` and `Edges` in the graph, and can be either a `TypedDict` or a `Pydantic` model. All `Nodes` will emit updates to the `State` which are then applied using the specified `reducer` function.
### Schema
The main documented way to specify the schema of a graph is by using `TypedDict`. However, we also support [using a Pydantic BaseModel](../how-tos/graph-api.md#use-pydantic-models-for-graph-state) as your graph state to add **default values** and additional data validation.
By default, the graph will have the same input and output schemas. If you want to change this, you can also specify explicit input and output schemas directly. This is useful when you have a lot of keys, and some are explicitly for input and others for output. See the [guide here](../how-tos/graph-api.md#define-input-and-output-schemas) for how to use.
#### Multiple schemas
Typically, all graph nodes communicate with a single schema. This means that they will read and write to the same state channels. But, there are cases where we want more control over this:
- Internal nodes can pass information that is not required in the graph's input / output.
- We may also want to use different input / output schemas for the graph. The output might, for example, only contain a single relevant output key.
It is possible to have nodes write to private state channels inside the graph for internal node communication. We can simply define a private schema, `PrivateState`. See [this guide](../how-tos/graph-api.md#pass-private-state-between-nodes) for more detail.
It is also possible to define explicit input and output schemas for a graph. In these cases, we define an "internal" schema that contains _all_ keys relevant to graph operations. But, we also define `input` and `output` schemas that are sub-sets of the "internal" schema to constrain the input and output of the graph. See [this guide](../how-tos/graph-api.md#define-input-and-output-schemas) for more detail.
Let's look at an example:
```python
class InputState(TypedDict):
user_input: str
class OutputState(TypedDict):
graph_output: str
class OverallState(TypedDict):
foo: str
user_input: str
graph_output: str
class PrivateState(TypedDict):
bar: str
def node_1(state: InputState) -> OverallState:
# Write to OverallState
return {"foo": state["user_input"] + " name"}
def node_2(state: OverallState) -> PrivateState:
# Read from OverallState, write to PrivateState
return {"bar": state["foo"] + " is"}
def node_3(state: PrivateState) -> OutputState:
# Read from PrivateState, write to OutputState
return {"graph_output": state["bar"] + " Lance"}
builder = StateGraph(OverallState, input_schema=InputState, output_schema=OutputState)
builder.add_node("node_1", node_1)
builder.add_node("node_2", node_2)
builder.add_node("node_3", node_3)
builder.add_edge(START, "node_1")
builder.add_edge("node_1", "node_2")
builder.add_edge("node_2", "node_3")
builder.add_edge("node_3", END)
graph = builder.compile()
graph.invoke({"user_input":"My"})
{'graph_output': 'My name is Lance'}
```
There are two subtle and important points to note here:
1. We pass `state: InputState` as the input schema to `node_1`. But, we write out to `foo`, a channel in `OverallState`. How can we write out to a state channel that is not included in the input schema? This is because a node _can write to any state channel in the graph state._ The graph state is the union of the state channels defined at initialization, which includes `OverallState` and the filters `InputState` and `OutputState`.
2. We initialize the graph with `StateGraph(OverallState,input_schema=InputState,output_schema=OutputState)`. So, how can we write to `PrivateState` in `node_2`? How does the graph gain access to this schema if it was not passed in the `StateGraph` initialization? We can do this because _nodes can also declare additional state channels_ as long as the state schema definition exists. In this case, the `PrivateState` schema is defined, so we can add `bar` as a new state channel in the graph and write to it.
### Reducers
Reducers are key to understanding how updates from nodes are applied to the `State`. Each key in the `State` has its own independent reducer function. If no reducer function is explicitly specified, it is assumed that all updates to that key should override it. There are a few different types of reducers, starting with the default:
#### Default Reducer
These two examples show how to use the default reducer:
**Example A:**
```python
from typing_extensions import TypedDict
class State(TypedDict):
foo: int
bar: list[str]
```
In this example, no reducer functions are specified for any key. Let's assume the input to the graph is `{"foo": 1, "bar": ["hi"]}`. Let's then assume the first `Node` returns `{"foo": 2}`. This is treated as an update to the state. Notice that the `Node` does not need to return the whole `State` schema - just an update. After applying this update, the `State` would then be `{"foo": 2, "bar": ["hi"]}`. If the second node returns `{"bar": ["bye"]}` then the `State` would then be `{"foo": 2, "bar": ["bye"]}`
**Example B:**
```python
from typing import Annotated
from typing_extensions import TypedDict
from operator import add
class State(TypedDict):
foo: int
bar: Annotated[list[str], add]
```
In this example, we've used the `Annotated` type to specify a reducer function (`operator.add`) for the second key (`bar`). Note that the first key remains unchanged. Let's assume the input to the graph is `{"foo": 1, "bar": ["hi"]}`. Let's then assume the first `Node` returns `{"foo": 2}`. This is treated as an update to the state. Notice that the `Node` does not need to return the whole `State` schema - just an update. After applying this update, the `State` would then be `{"foo": 2, "bar": ["hi"]}`. If the second node returns `{"bar": ["bye"]}` then the `State` would then be `{"foo": 2, "bar": ["hi", "bye"]}`. Notice here that the `bar` key is updated by adding the two lists together.
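Reducers can also be ordinary Python functions that take the current value and the update and return the new value. As a sketch, a reducer that merges dictionary updates rather than overwriting them might be declared like this (the `metadata` key is illustrative):
```python
from typing import Annotated
from typing_extensions import TypedDict

def merge_dicts(current: dict, update: dict) -> dict:
    # Combine the incoming update with the existing value instead of replacing it.
    return {**current, **update}

class State(TypedDict):
    foo: int                                # default reducer: overwrite
    metadata: Annotated[dict, merge_dicts]  # custom reducer: merge
```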
### Working with Messages in Graph State
#### Why use messages?
Most modern LLM providers have a chat model interface that accepts a list of messages as input. LangChain's [`ChatModel`](https://python.langchain.com/docs/concepts/#chat-models) in particular accepts a list of `Message` objects as inputs. These messages come in a variety of forms such as `HumanMessage` (user input) or `AIMessage` (LLM response). To read more about what message objects are, please refer to [this](https://python.langchain.com/docs/concepts/#messages) conceptual guide.
#### Using Messages in your Graph
In many cases, it is helpful to store prior conversation history as a list of messages in your graph state. To do so, we can add a key (channel) to the graph state that stores a list of `Message` objects and annotate it with a reducer function (see `messages` key in the example below). The reducer function is vital to telling the graph how to update the list of `Message` objects in the state with each state update (for example, when a node sends an update). If you don't specify a reducer, every state update will overwrite the list of messages with the most recently provided value. If you wanted to simply append messages to the existing list, you could use `operator.add` as a reducer.
However, you might also want to manually update messages in your graph state (e.g. human-in-the-loop). If you were to use `operator.add`, the manual state updates you send to the graph would be appended to the existing list of messages, instead of updating existing messages. To avoid that, you need a reducer that can keep track of message IDs and overwrite existing messages when they are updated. To achieve this, you can use the prebuilt `add_messages` function. For brand new messages, it will simply append to the existing list, but it will also handle updates for existing messages correctly.
#### Serialization
In addition to keeping track of message IDs, the `add_messages` function will also try to deserialize messages into LangChain `Message` objects whenever a state update is received on the `messages` channel. See more information on LangChain serialization/deserialization [here](https://python.langchain.com/docs/how_to/serialization/). This allows sending graph inputs / state updates in the following format:
```python
# this is supported
{"messages": [HumanMessage(content="message")]}
# and this is also supported
{"messages": [{"type": "human", "content": "message"}]}
```
Since the state updates are always deserialized into LangChain `Messages` when using `add_messages`, you should use dot notation to access message attributes, like `state["messages"][-1].content`. Below is an example of a graph that uses `add_messages` as its reducer function.
```python
from langchain_core.messages import AnyMessage
from langgraph.graph.message import add_messages
from typing import Annotated
from typing_extensions import TypedDict
class GraphState(TypedDict):
messages: Annotated[list[AnyMessage], add_messages]
```
#### MessagesState
Since having a list of messages in your state is so common, there exists a prebuilt state called `MessagesState` which makes it easy to use messages. `MessagesState` is defined with a single `messages` key which is a list of `AnyMessage` objects and uses the `add_messages` reducer. Typically, there is more state to track than just messages, so we see people subclass this state and add more fields, like:
```python
from langgraph.graph import MessagesState
class State(MessagesState):
documents: list[str]
```
## Nodes
In LangGraph, nodes are typically Python functions (sync or async) where the **first** positional argument is the [state](#state), and (optionally) the **second** positional argument is a "config", containing optional [configurable parameters](#configuration) (such as a `thread_id`).
Similar to `NetworkX`, you add these nodes to a graph using the [add_node][langgraph.graph.StateGraph.add_node] method:
```python
from typing_extensions import TypedDict
from langchain_core.runnables import RunnableConfig
from langgraph.graph import StateGraph
class State(TypedDict):
input: str
results: str
builder = StateGraph(State)
def my_node(state: State, config: RunnableConfig):
print("In node: ", config["configurable"]["user_id"])
return {"results": f"Hello, {state['input']}!"}
# The second argument is optional
def my_other_node(state: State):
return state
builder.add_node("my_node", my_node)
builder.add_node("other_node", my_other_node)
...
```
Behind the scenes, functions are converted to [RunnableLambda](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableLambda.html#langchain_core.runnables.base.RunnableLambda)s, which add batch and async support to your function, along with native tracing and debugging.
If you add a node to a graph without specifying a name, it will be given a default name equivalent to the function name.
```python
builder.add_node(my_node)
# You can then create edges to/from this node by referencing it as `"my_node"`
```
### `START` Node
The `START` Node is a special node that represents the node that sends user input to the graph. The main purpose for referencing this node is to determine which nodes should be called first.
```python
from langgraph.graph import START
graph.add_edge(START, "node_a")
```
### `END` Node
The `END` Node is a special node that represents a terminal node. This node is referenced when you want to denote which edges have no actions after they are done.
```python
from langgraph.graph import END
graph.add_edge("node_a", END)
```
### Node Caching
LangGraph supports caching of tasks/nodes based on the input to the node. To use caching:
* Specify a cache when compiling a graph (or specifying an entrypoint)
* Specify a cache policy for nodes. Each cache policy supports:
* `key_func`, used to generate a cache key based on the input to a node; defaults to a hash of the input computed with pickle.
* `ttl`, the time to live for the cache in seconds. If not specified, the cache will never expire.
For example:
```py
import time
from typing_extensions import TypedDict
from langgraph.graph import StateGraph
from langgraph.cache.memory import InMemoryCache
from langgraph.types import CachePolicy
class State(TypedDict):
x: int
result: int
builder = StateGraph(State)
def expensive_node(state: State) -> dict[str, int]:
# expensive computation
time.sleep(2)
return {"result": state["x"] * 2}
builder.add_node("expensive_node", expensive_node, cache_policy=CachePolicy(ttl=3))
builder.set_entry_point("expensive_node")
builder.set_finish_point("expensive_node")
graph = builder.compile(cache=InMemoryCache())
print(graph.invoke({"x": 5}, stream_mode='updates')) # (1)!
[{'expensive_node': {'result': 10}}]
print(graph.invoke({"x": 5}, stream_mode='updates')) # (2)!
[{'expensive_node': {'result': 10}, '__metadata__': {'cached': True}}]
```
1. First run takes the full two seconds (due to the mocked expensive computation).
2. Second run utilizes cache and returns quickly.
## Edges
Edges define how the logic is routed and how the graph decides to stop. This is a big part of how your agents work and how different nodes communicate with each other. There are a few key types of edges:
- Normal Edges: Go directly from one node to the next.
- Conditional Edges: Call a function to determine which node(s) to go to next.
- Entry Point: Which node to call first when user input arrives.
- Conditional Entry Point: Call a function to determine which node(s) to call first when user input arrives.
A node can have MULTIPLE outgoing edges. If a node has multiple outgoing edges, **all** of those destination nodes will be executed in parallel as a part of the next super-step.
### Normal Edges
If you **always** want to go from node A to node B, you can use the [add_edge][langgraph.graph.StateGraph.add_edge] method directly.
```python
graph.add_edge("node_a", "node_b")
```
### Conditional Edges
If you want to **optionally** route to 1 or more edges (or optionally terminate), you can use the [add_conditional_edges][langgraph.graph.StateGraph.add_conditional_edges] method. This method accepts the name of a node and a "routing function" to call after that node is executed:
```python
graph.add_conditional_edges("node_a", routing_function)
```
Similar to nodes, the `routing_function` accepts the current `state` of the graph and returns a value.
By default, the return value of `routing_function` is used as the name of the node (or list of nodes) to send the state to next. All those nodes will be run in parallel as a part of the next super-step.
You can optionally provide a dictionary that maps the `routing_function`'s output to the name of the next node.
```python
graph.add_conditional_edges("node_a", routing_function, {True: "node_b", False: "node_c"})
```
!!! tip
Use [`Command`](#command) instead of conditional edges if you want to combine state updates and routing in a single function.
### Entry Point
The entry point is the first node(s) that are run when the graph starts. You can use the [`add_edge`][langgraph.graph.StateGraph.add_edge] method from the virtual [`START`][langgraph.constants.START] node to the first node to execute to specify where to enter the graph.
```python
from langgraph.graph import START
graph.add_edge(START, "node_a")
```
### Conditional Entry Point
A conditional entry point lets you start at different nodes depending on custom logic. You can use [`add_conditional_edges`][langgraph.graph.StateGraph.add_conditional_edges] from the virtual [`START`][langgraph.constants.START] node to accomplish this.
```python
from langgraph.graph import START
graph.add_conditional_edges(START, routing_function)
```
You can optionally provide a dictionary that maps the `routing_function`'s output to the name of the next node.
```python
graph.add_conditional_edges(START, routing_function, {True: "node_b", False: "node_c"})
```
## `Send`
By default, `Nodes` and `Edges` are defined ahead of time and operate on the same shared state. However, there can be cases where the exact edges are not known ahead of time and/or you may want different versions of `State` to exist at the same time. A common example of this is with [map-reduce](https://langchain-ai.github.io/langgraph/how-tos/map-reduce/) design patterns. In this design pattern, a first node may generate a list of objects, and you may want to apply some other node to all those objects. The number of objects may be unknown ahead of time (meaning the number of edges may not be known) and the input `State` to the downstream `Node` should be different (one for each generated object).
To support this design pattern, LangGraph supports returning [`Send`][langgraph.types.Send] objects from conditional edges. `Send` takes two arguments: first is the name of the node, and second is the state to pass to that node.
```python
def continue_to_jokes(state: OverallState):
return [Send("generate_joke", {"subject": s}) for s in state['subjects']]
graph.add_conditional_edges("node_a", continue_to_jokes)
```
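Expanding that snippet into a runnable sketch (the node names and hard-coded subjects are illustrative), a map-reduce graph with `Send` might look like this:
```python
import operator
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send

class OverallState(TypedDict):
    subjects: list[str]
    jokes: Annotated[list[str], operator.add]  # reducer collects results from the parallel branches

def generate_topics(state: OverallState):
    return {"subjects": ["cats", "dogs"]}

def generate_joke(state: dict):
    # Receives the per-item state sent via `Send`, not the shared `OverallState`.
    return {"jokes": [f"A joke about {state['subject']}"]}

def continue_to_jokes(state: OverallState):
    return [Send("generate_joke", {"subject": s}) for s in state["subjects"]]

builder = StateGraph(OverallState)
builder.add_node("generate_topics", generate_topics)
builder.add_node("generate_joke", generate_joke)
builder.add_edge(START, "generate_topics")
builder.add_conditional_edges("generate_topics", continue_to_jokes)
builder.add_edge("generate_joke", END)
graph = builder.compile()

graph.invoke({"subjects": []})  # e.g. {'subjects': ['cats', 'dogs'], 'jokes': ['A joke about cats', 'A joke about dogs']}
```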
## `Command`
It can be useful to combine control flow (edges) and state updates (nodes). For example, you might want to BOTH perform state updates AND decide which node to go to next in the SAME node. LangGraph provides a way to do so by returning a [`Command`][langgraph.types.Command] object from node functions:
```python
def my_node(state: State) -> Command[Literal["my_other_node"]]:
return Command(
# state update
update={"foo": "bar"},
# control flow
goto="my_other_node"
)
```
With `Command` you can also achieve dynamic control flow behavior (identical to [conditional edges](#conditional-edges)):
```python
def my_node(state: State) -> Command[Literal["my_other_node"]]:
if state["foo"] == "bar":
return Command(update={"foo": "baz"}, goto="my_other_node")
```
!!! important
When returning `Command` in your node functions, you must add return type annotations with the list of node names the node is routing to, e.g. `Command[Literal["my_other_node"]]`. This is necessary for the graph rendering and tells LangGraph that `my_node` can navigate to `my_other_node`.
Check out this [how-to guide](../how-tos/graph-api.md#combine-control-flow-and-state-updates-with-command) for an end-to-end example of how to use `Command`.
### When should I use Command instead of conditional edges?
Use `Command` when you need to **both** update the graph state **and** route to a different node. For example, when implementing [multi-agent handoffs](./multi_agent.md#handoffs) where it's important to route to a different agent and pass some information to that agent.
Use [conditional edges](#conditional-edges) to route between nodes conditionally without updating the state.
### Navigating to a node in a parent graph
If you are using [subgraphs](./subgraphs.md), you might want to navigate from a node within a subgraph to a different subgraph (i.e. a different node in the parent graph). To do so, you can specify `graph=Command.PARENT` in `Command`:
```python
def my_node(state: State) -> Command[Literal["other_subgraph"]]:
return Command(
update={"foo": "bar"},
goto="other_subgraph", # where `other_subgraph` is a node in the parent graph
graph=Command.PARENT
)
```
!!! note
Setting `graph` to `Command.PARENT` will navigate to the closest parent graph.
!!! important "State updates with `Command.PARENT`"
When you send updates from a subgraph node to a parent graph node for a key that's shared by both parent and subgraph [state schemas](#schema), you **must** define a [reducer](#reducers) for the key you're updating in the parent graph state. See this [example](../how-tos/graph-api.md#navigate-to-a-node-in-a-parent-graph).
This is particularly useful when implementing [multi-agent handoffs](./multi_agent.md#handoffs).
Check out [this guide](../how-tos/graph-api.md#navigate-to-a-node-in-a-parent-graph) for detail.
### Using inside tools
A common use case is updating graph state from inside a tool. For example, in a customer support application you might want to look up customer information based on their account number or ID in the beginning of the conversation.
Refer to [this guide](../how-tos/graph-api.md#use-inside-tools) for detail.
### Human-in-the-loop
`Command` is an important part of human-in-the-loop workflows: when using `interrupt()` to collect user input, `Command` is then used to supply the input and resume execution via `Command(resume="User input")`. Check out [this conceptual guide](./human_in_the_loop.md) for more information.
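A compact sketch of that flow (the state shape and question are illustrative):
```python
from typing_extensions import TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt

class State(TypedDict):
    plan: str
    approved: str

def human_review(state: State):
    # Pause the graph and surface a question to the user.
    answer = interrupt(f"Do you approve this plan? {state['plan']}")
    return {"approved": answer}

builder = StateGraph(State)
builder.add_node("human_review", human_review)
builder.add_edge(START, "human_review")
builder.add_edge("human_review", END)
graph = builder.compile(checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "1"}}
graph.invoke({"plan": "Ship it"}, config)         # pauses at the interrupt
graph.invoke(Command(resume="approved"), config)  # resumes with the user's input
```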
## Graph Migrations
LangGraph can easily handle migrations of graph definitions (nodes, edges, and state) even when using a checkpointer to track state.
- For threads at the end of the graph (i.e. not interrupted) you can change the entire topology of the graph (i.e. all nodes and edges, remove, add, rename, etc)
- For threads currently interrupted, we support all topology changes other than renaming / removing nodes (as that thread could now be about to enter a node that no longer exists) -- if this is a blocker please reach out and we can prioritize a solution.
- For modifying state, we have full backwards and forwards compatibility for adding and removing keys
- State keys that are renamed lose their saved state in existing threads
- State keys whose types change in incompatible ways could currently cause issues in threads with state from before the change -- if this is a blocker please reach out and we can prioritize a solution.
## Configuration
When creating a graph, you can also mark that certain parts of the graph are configurable. This is commonly done to enable easily switching between models or system prompts. This allows you to create a single "cognitive architecture" (the graph) but have multiple different instances of it.
You can optionally specify a `config_schema` when creating a graph.
```python
class ConfigSchema(TypedDict):
llm: str
graph = StateGraph(State, config_schema=ConfigSchema)
```
You can then pass this configuration into the graph using the `configurable` config field.
```python
config = {"configurable": {"llm": "anthropic"}}
graph.invoke(inputs, config=config)
```
You can then access and use this configuration inside a node or conditional edge:
```python
def node_a(state, config):
llm_type = config.get("configurable", {}).get("llm", "openai")
llm = get_llm(llm_type)
...
```
See [this guide](../how-tos/graph-api.md#add-runtime-configuration) for a full breakdown on configuration.
### Recursion Limit
The recursion limit sets the maximum number of [super-steps](#graphs) the graph can execute during a single execution. Once the limit is reached, LangGraph will raise `GraphRecursionError`. By default this value is set to 25 steps. The recursion limit can be set on any graph at runtime, and is passed to `.invoke`/`.stream` via the config dictionary. Importantly, `recursion_limit` is a standalone `config` key and, unlike other user-defined configuration, should not be passed inside the `configurable` key. See the example below:
```python
graph.invoke(inputs, config={"recursion_limit": 5, "configurable":{"llm": "anthropic"}})
```
Read [this how-to](https://langchain-ai.github.io/langgraph/how-tos/recursion-limit/) to learn more about how the recursion limit works.
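If you prefer to handle the limit rather than let the error propagate, you can catch it. A sketch, assuming `graph` and `inputs` from the surrounding examples:
```python
from langgraph.errors import GraphRecursionError

try:
    graph.invoke(inputs, config={"recursion_limit": 5})
except GraphRecursionError:
    # The graph ran 5 super-steps without reaching END.
    print("Recursion limit reached; raise the limit or revisit the loop logic.")
```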
## Visualization
It's often nice to be able to visualize graphs, especially as they get more complex. LangGraph comes with several built-in ways to visualize graphs. See [this how-to guide](../how-tos/graph-api.md#visualize-your-graph) for more info.
---
concepts/tools.md
---
# Tools
Many AI applications interact with users via natural language. However, some use cases require models to interface directly with external systems—such as APIs, databases, or file systems—using structured input. In these scenarios, [tool calling](../how-tos/tool-calling.md) enables models to generate requests that conform to a specified input schema.
**Tools** encapsulate a callable function and its input schema. These can be passed to compatible [chat models](https://python.langchain.com/docs/concepts/chat_models), allowing the model to decide whether to invoke a tool and with what arguments.
## Tool calling

Tool calling is typically **conditional**. Based on the user input and available tools, the model may choose to issue a tool call request. This request is returned in an `AIMessage` object, which includes a `tool_calls` field that specifies the tool name and input arguments:
```python
llm_with_tools.invoke("What is 2 multiplied by 3?")
# -> AIMessage(tool_calls=[{'name': 'multiply', 'args': {'a': 2, 'b': 3}, ...}])
```
If the input is unrelated to any tool, the model returns only a natural language message:
```python
llm_with_tools.invoke("Hello world!") # -> AIMessage(content="Hello!")
```
Importantly, the model does not execute the tool—it only generates a request. A separate executor (such as a runtime or agent) is responsible for handling the tool call and returning the result.
See the [tool calling guide](../how-tos/tool-calling.md) for more details.
## Prebuilt tools
LangChain provides prebuilt tool integrations for common external systems including APIs, databases, file systems, and web data.
Browse the [integrations directory](https://python.langchain.com/docs/integrations/tools/) for available tools.
Common categories:
* **Search**: Bing, SerpAPI, Tavily
* **Code execution**: Python REPL, Node.js REPL
* **Databases**: SQL, MongoDB, Redis
* **Web data**: Scraping and browsing
* **APIs**: OpenWeatherMap, NewsAPI, etc.
## Custom tools
You can define custom tools using the `@tool` decorator or plain Python functions. For example:
```python
from langchain_core.tools import tool
@tool
def multiply(a: int, b: int) -> int:
"""Multiply two numbers."""
return a * b
```
See the [tool calling guide](../how-tos/tool-calling.md) for more details.
## Tool execution
While the model determines when to call a tool, execution of the tool call must be handled by a runtime component.
LangGraph provides prebuilt components for this:
* [`ToolNode`][langgraph.prebuilt.tool_node.ToolNode]: A prebuilt node that executes tools.
* [`create_react_agent`][langgraph.prebuilt.chat_agent_executor.create_react_agent]: Constructs a full agent that manages tool calling automatically.
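As a minimal sketch, reusing the `multiply` tool defined above (the model identifier is illustrative and assumes the corresponding provider package is installed), the prebuilt agent wires tool execution up for you:
```python
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(
    model="anthropic:claude-3-5-sonnet-latest",  # illustrative model identifier
    tools=[multiply],
)
agent.invoke({"messages": [{"role": "user", "content": "What is 2 multiplied by 3?"}]})
```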
---
concepts/langgraph_self_hosted_data_plane.md
---
---
search:
boost: 2
---
# Self-Hosted Data Plane
There are two versions of the self-hosted deployment: [Self-Hosted Data Plane](./deployment_options.md#self-hosted-data-plane) and [Self-Hosted Control Plane](./deployment_options.md#self-hosted-control-plane).
!!! info "Important"
The Self-Hosted Data Plane deployment option requires an [Enterprise](plans.md) plan.
## Requirements
- You use the `langgraph-cli` and/or the [LangGraph Studio](./langgraph_studio.md) app to test your graph locally.
- You use the `langgraph build` command to build an image.
## Self-Hosted Data Plane
The [Self-Hosted Data Plane](../cloud/deployment/self_hosted_data_plane.md) deployment option is a "hybrid" model for deployment where we manage the [control plane](./langgraph_control_plane.md) in our cloud and you manage the [data plane](./langgraph_data_plane.md) in your cloud. This option provides a way to securely manage your data plane infrastructure, while offloading control plane management to us. When using the Self-Hosted Data Plane version, you authenticate with a [LangSmith](https://smith.langchain.com/) API key.
| | [Control plane](../concepts/langgraph_control_plane.md) | [Data plane](../concepts/langgraph_data_plane.md) |
|-------------------|-------------------|------------|
| **What is it?** | <ul><li>Control plane UI for creating deployments and revisions</li><li>Control plane APIs for creating deployments and revisions</li></ul> | <ul><li>Data plane "listener" for reconciling deployments with control plane state</li><li>LangGraph Servers</li><li>Postgres, Redis, etc.</li></ul> |
| **Where is it hosted?** | LangChain's cloud | Your cloud |
| **Who provisions and manages it?** | LangChain | You |
For information on how to deploy a [LangGraph Server](../concepts/langgraph_server.md) to the Self-Hosted Data Plane, see [Deploy to Self-Hosted Data Plane](../cloud/deployment/self_hosted_data_plane.md).
### Architecture

### Compute Platforms
- **Kubernetes**: The Self-Hosted Data Plane deployment option supports deploying data plane infrastructure to any Kubernetes cluster.
- **Amazon ECS**: Coming soon!
!!! tip
If you would like to deploy to Kubernetes, you can follow the [Self-Hosted Data Plane deployment guide](../cloud/deployment/self_hosted_data_plane.md).
---
concepts/faq.md
---
---
search:
boost: 2
---
# FAQ
Common questions and their answers!
## Do I need to use LangChain to use LangGraph? What’s the difference?
No. LangGraph is an orchestration framework for complex agentic systems and is more low-level and controllable than LangChain agents. LangChain provides a standard interface to interact with models and other components, useful for straightforward chains and retrieval flows.
## How is LangGraph different from other agent frameworks?
Other agentic frameworks can work for simple, generic tasks but fall short for complex tasks bespoke to a company’s needs. LangGraph provides a more expressive framework to handle companies’ unique tasks without restricting users to a single black-box cognitive architecture.
## Does LangGraph impact the performance of my app?
LangGraph will not add any overhead to your code and is specifically designed with streaming workflows in mind.
## Is LangGraph open source? Is it free?
Yes. LangGraph is an MIT-licensed open-source library and is free to use.
## How are LangGraph and LangGraph Platform different?
LangGraph is a stateful, orchestration framework that brings added control to agent workflows. LangGraph Platform is a service for deploying and scaling LangGraph applications, with an opinionated API for building agent UXs, plus an integrated developer studio.
| Features | LangGraph (open source) | LangGraph Platform |
|---------------------|-----------------------------------------------------------|--------------------------------------------------------------------------------------------------------|
| Description | Stateful orchestration framework for agentic applications | Scalable infrastructure for deploying LangGraph applications |
| SDKs | Python and JavaScript | Python and JavaScript |
| HTTP APIs | None | Yes - useful for retrieving & updating state or long-term memory, or creating a configurable assistant |
| Streaming | Basic | Dedicated mode for token-by-token messages |
| Checkpointer | Community contributed | Supported out-of-the-box |
| Persistence Layer | Self-managed | Managed Postgres with efficient storage |
| Deployment | Self-managed | • Cloud SaaS • Free self-hosted • Enterprise (paid self-hosted) |
| Scalability | Self-managed | Auto-scaling of task queues and servers |
| Fault-tolerance | Self-managed | Automated retries |
| Concurrency Control | Simple threading | Supports double-texting |
| Scheduling | None | Cron scheduling |
| Monitoring | None | Integrated with LangSmith for observability |
| IDE integration | LangGraph Studio | LangGraph Studio |
## Is LangGraph Platform open source?
No. LangGraph Platform is proprietary software.
There is a free, self-hosted version of LangGraph Platform with access to basic features. The Cloud SaaS deployment option and the Self-Hosted deployment options are paid services. [Contact our sales team](https://www.langchain.com/contact-sales) to learn more.
For more information, see our [LangGraph Platform pricing page](https://www.langchain.com/pricing-langgraph-platform).
## Does LangGraph work with LLMs that don't support tool calling?
Yes! You can use LangGraph with any LLM. The main reason we use LLMs that support tool calling is that this is often the most convenient way to have the LLM make its decision about what to do. If your LLM does not support tool calling, you can still use it; you just need to write a bit of logic to convert the raw LLM string response into a decision about what to do.
## Does LangGraph work with OSS LLMs?
Yes! LangGraph is totally ambivalent to what LLMs are used under the hood. The main reason we use closed LLMs in most of the tutorials is that they seamlessly support tool calling, while OSS LLMs often don't. But tool calling is not necessary (see [this section](#does-langgraph-work-with-llms-that-dont-support-tool-calling)) so you can totally use LangGraph with OSS LLMs.
## Can I use LangGraph Studio without logging in to LangSmith?
Yes! You can use the [development version of LangGraph Server](../tutorials/langgraph-platform/local-server.md) to run the backend locally.
This will connect to the studio frontend hosted as part of LangSmith.
If you set an environment variable of `LANGSMITH_TRACING=false`, then no traces will be sent to LangSmith.
## What does "nodes executed" mean for LangGraph Platform usage?
**Nodes Executed** is the aggregate number of nodes in a LangGraph application that are called and completed successfully during an invocation of the application. If a node in the graph is not called during execution or ends in an error state, it will not be counted. If a node is called and completes successfully multiple times, each occurrence will be counted.
---
concepts/double_texting.md
---
---
search:
boost: 2
---
# Double Texting
!!! info "Prerequisites"
- [LangGraph Server](./langgraph_server.md)
Many times users might interact with your graph in unintended ways.
For instance, a user may send one message and, before the graph has finished running, send a second message.
More generally, users may invoke the graph a second time before the first run has finished.
We call this "double texting".
Currently, LangGraph only addresses this as part of [LangGraph Platform](langgraph_platform.md), not in the open source.
The reason is that handling double texting requires knowing how the graph is deployed, and since LangGraph Platform manages deployment, that logic needs to live there.
If you do not want to use LangGraph Platform, we describe the options we have implemented in detail below.

## Reject
This is the simplest option: it rejects any follow-up runs and does not allow double texting.
See the [how-to guide](../cloud/how-tos/reject_concurrent.md) for configuring the reject double text option.
## Enqueue
This is a relatively simple option that lets the first run complete in full and then sends the new input as a separate run.
See the [how-to guide](../cloud/how-tos/enqueue_concurrent.md) for configuring the enqueue double text option.
## Interrupt
This option interrupts the current execution but saves all the work done up until that point.
It then inserts the user input and continues from there.
If you enable this option, your graph should be able to handle weird edge cases that may arise.
For example, you could have called a tool but not yet gotten back a result from running that tool.
You may need to remove that tool call in order to not have a dangling tool call.
See the [how-to guide](../cloud/how-tos/interrupt_concurrent.md) for configuring the interrupt double text option.
## Rollback
This option interrupts the current execution AND rolls back all work done up until that point, including the original run input. It then sends the new user input in, basically as if it was the original input.
See the [how-to guide](../cloud/how-tos/rollback_concurrent.md) for configuring the rollback double text option.
---
concepts/langgraph_components.md
---
## Components
The LangGraph Platform consists of components that work together to support the development, deployment, debugging, and monitoring of LangGraph applications:
- [LangGraph Server](./langgraph_server.md): The server defines an opinionated API and architecture that incorporates best practices for deploying agentic applications, allowing you to focus on building your agent logic rather than developing server infrastructure.
- [LangGraph CLI](./langgraph_cli.md): LangGraph CLI is a command-line interface that helps you build and interact with a local LangGraph application.
- [LangGraph Studio](./langgraph_studio.md): LangGraph Studio is a specialized IDE that can connect to a LangGraph Server to enable visualization, interaction, and debugging of the application locally.
- [Python/JS SDK](./sdk.md): The Python/JS SDK provides a programmatic way to interact with deployed LangGraph Applications.
- [Remote Graph](../how-tos/use-remote-graph.md): A RemoteGraph allows you to interact with any deployed LangGraph application as though it were running locally.
- [LangGraph control plane](./langgraph_control_plane.md): The LangGraph Control Plane refers to the Control Plane UI where users create and update LangGraph Servers and the Control Plane APIs that support the UI experience.
- [LangGraph data plane](./langgraph_data_plane.md): The LangGraph Data Plane refers to LangGraph Servers, the corresponding infrastructure for each server, and the "listener" application that continuously polls for updates from the LangGraph Control Plane.
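For instance, a minimal RemoteGraph sketch might look like the following (the graph name `"agent"` and the local URL are assumptions for your deployment; see the linked how-to for details):
```python
from langgraph.pregel.remote import RemoteGraph

# Connect to a deployed graph by name; the URL points at your LangGraph Server
remote_graph = RemoteGraph("agent", url="http://localhost:2024")

# RemoteGraph exposes the same interface as a locally compiled graph
result = remote_graph.invoke(
    {"messages": [{"role": "user", "content": "Hello!"}]}
)
```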

---
concepts/auth.md
---
---
search:
boost: 2
---
# Authentication & Access Control
LangGraph Platform provides a flexible authentication and authorization system that can integrate with most authentication schemes.
## Core Concepts
### Authentication vs Authorization
While often used interchangeably, these terms represent distinct security concepts:
- [**Authentication**](#authentication) ("AuthN") verifies _who_ you are. This runs as middleware for every request.
- [**Authorization**](#authorization) ("AuthZ") determines _what you can do_. This validates the user's privileges and roles on a per-resource basis.
In LangGraph Platform, authentication is handled by your [`@auth.authenticate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.authenticate) handler, and authorization is handled by your [`@auth.on`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.on) handlers.
## Default Security Models
LangGraph Platform provides different security defaults:
### LangGraph Platform
- Uses LangSmith API keys by default
- Requires valid API key in `x-api-key` header
- Can be customized with your auth handler
!!! note "Custom auth"
Custom auth **is supported** for all plans in LangGraph Platform.
### Self-Hosted
- No default authentication
- Complete flexibility to implement your security model
- You control all aspects of authentication and authorization
!!! note "Custom auth"
Custom auth is supported for **Enterprise** self-hosted deployments.
Standalone Container (Lite) deployments do not support custom auth natively.
## System Architecture
A typical authentication setup involves three main components:
1. **Authentication Provider** (Identity Provider/IdP)
* A dedicated service that manages user identities and credentials
* Handles user registration, login, password resets, etc.
* Issues tokens (JWT, session tokens, etc.) after successful authentication
* Examples: Auth0, Supabase Auth, Okta, or your own auth server
2. **LangGraph Backend** (Resource Server)
* Your LangGraph application that contains business logic and protected resources
* Validates tokens with the auth provider
* Enforces access control based on user identity and permissions
* Doesn't store user credentials directly
3. **Client Application** (Frontend)
* Web app, mobile app, or API client
* Collects time-sensitive user credentials and sends them to the auth provider
* Receives tokens from auth provider
* Includes these tokens in requests to LangGraph backend
Here's how these components typically interact:
```mermaid
sequenceDiagram
participant Client as Client App
participant Auth as Auth Provider
participant LG as LangGraph Backend
Client->>Auth: 1. Login (username/password)
Auth-->>Client: 2. Return token
Client->>LG: 3. Request with token
Note over LG: 4. Validate token (@auth.authenticate)
LG-->>Auth: 5. Fetch user info
Auth-->>LG: 6. Confirm validity
Note over LG: 7. Apply access control (@auth.on.*)
LG-->>Client: 8. Return resources
```
Your [`@auth.authenticate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.authenticate) handler in LangGraph handles steps 4-6, while your [`@auth.on`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.on) handlers implement step 7.
## Authentication
Authentication in LangGraph runs as middleware on every request. Your [`@auth.authenticate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.authenticate) handler receives request information and should:
1. Validate the credentials
2. Return [user info](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.MinimalUserDict) containing the user's identity and user information if valid
3. Raise an [HTTP exception](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.exceptions.HTTPException) or AssertionError if invalid
```python
from langgraph_sdk import Auth

auth = Auth()

@auth.authenticate
async def authenticate(headers: dict) -> Auth.types.MinimalUserDict:
    # Validate credentials (e.g., API key, JWT token)
    api_key = headers.get("x-api-key")
    if not api_key or not is_valid_key(api_key):
        raise Auth.exceptions.HTTPException(
            status_code=401,
            detail="Invalid API key"
        )

    # Return user info - only identity and is_authenticated are required
    # Add any additional fields you need for authorization
    return {
        "identity": "user-123",  # Required: unique user identifier
        "is_authenticated": True,  # Optional: assumed True by default
        "permissions": ["read", "write"],  # Optional: for permission-based auth
        # You can add more custom fields if you want to implement other auth patterns
        "role": "admin",
        "org_id": "org-456"
    }
```
The returned user information is available:
- To your authorization handlers via [`ctx.user`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.AuthContext)
- In your application via `config["configurable"]["langgraph_auth_user"]`
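For example, a node can read the authenticated user from its config. This is a minimal sketch, assuming the user object exposes an `identity` attribute the same way auth handlers do (`ctx.user.identity`); any custom fields you return from your handler are carried along as well:
```python
from langchain_core.runnables import RunnableConfig

def greet_user(state: dict, config: RunnableConfig):
    # User info returned by your @auth.authenticate handler
    user = config["configurable"]["langgraph_auth_user"]
    # "identity" is the required field; custom fields (e.g., "role") are also available
    return {"messages": [f"Hello, {user.identity}!"]}
```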
??? tip "Supported Parameters"
The [`@auth.authenticate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.authenticate) handler can accept any of the following parameters by name:
* request (Request): The raw ASGI request object
* body (dict): The parsed request body
* path (str): The request path, e.g., "/threads/abcd-1234-abcd-1234/runs/abcd-1234-abcd-1234/stream"
* method (str): The HTTP method, e.g., "GET"
* path_params (dict[str, str]): URL path parameters, e.g., {"thread_id": "abcd-1234-abcd-1234", "run_id": "abcd-1234-abcd-1234"}
* query_params (dict[str, str]): URL query parameters, e.g., {"stream": "true"}
* headers (dict[bytes, bytes]): Request headers
* authorization (str | None): The Authorization header value (e.g., "Bearer <token>")
In many of our tutorials, we will just show the "authorization" parameter to be concise, but you can opt to accept more information as needed
to implement your custom authentication scheme.
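For example, a handler that only needs the `authorization` parameter might validate a bearer token like this (the `verify_token` helper is hypothetical and stands in for a call to your identity provider):
```python
from langgraph_sdk import Auth

auth = Auth()

@auth.authenticate
async def authenticate(authorization: str | None) -> Auth.types.MinimalUserDict:
    # `authorization` is the raw Authorization header, e.g. "Bearer <token>"
    if not authorization:
        raise Auth.exceptions.HTTPException(status_code=401, detail="Missing credentials")
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token:
        raise Auth.exceptions.HTTPException(status_code=401, detail="Invalid auth scheme")

    user = await verify_token(token)  # hypothetical helper that calls your IdP
    return {"identity": user["id"], "is_authenticated": True}
```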
### Agent authentication
Custom authentication permits delegated access. The values you return in `@auth.authenticate` are added to the run context, giving agents user-scoped credentials that let them access resources on the user's behalf.
```mermaid
sequenceDiagram
%% Actors
participant ClientApp as Client
participant AuthProv as Auth Provider
participant LangGraph as LangGraph Backend
participant SecretStore as Secret Store
participant ExternalService as External Service
%% Platform login / AuthN
ClientApp ->> AuthProv: 1. Login (username / password)
AuthProv -->> ClientApp: 2. Return token
ClientApp ->> LangGraph: 3. Request with token
Note over LangGraph: 4. Validate token (@auth.authenticate)
LangGraph -->> AuthProv: 5. Fetch user info
AuthProv -->> LangGraph: 6. Confirm validity
%% Fetch user tokens from secret store
LangGraph ->> SecretStore: 6a. Fetch user tokens
SecretStore -->> LangGraph: 6b. Return tokens
Note over LangGraph: 7. Apply access control (@auth.on.*)
%% External Service round-trip
LangGraph ->> ExternalService: 8. Call external service (with header)
Note over ExternalService: 9. External service validates header and executes action
ExternalService -->> LangGraph: 10. Service response
%% Return to caller
LangGraph -->> ClientApp: 11. Return resources
```
After authentication, the platform creates a special configuration object that is passed to your graph and all nodes via the configurable context.
This object contains information about the current user, including any custom fields you return from your [`@auth.authenticate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.authenticate) handler.
To enable an agent to act on behalf of the user, use [custom authentication middleware](../how-tos/auth/custom_auth.md). This will allow the agent to interact with external systems like MCP servers, external databases, and even other agents on behalf of the user.
For more information, see the [Use custom auth](../how-tos/auth/custom_auth.md#enable-agent-authentication) guide.
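As an illustration of the flow above, a node might exchange the authenticated identity for a user-scoped credential and call an external service. This is only a sketch: `fetch_user_token` is a hypothetical helper backed by your own secret store, and the GitHub endpoint is just an example target.
```python
import httpx
from langchain_core.runnables import RunnableConfig

async def list_user_repos(state: dict, config: RunnableConfig):
    # User info injected by the platform after @auth.authenticate
    user = config["configurable"]["langgraph_auth_user"]

    # Hypothetical: look up a token scoped to this user in your secret store
    token = await fetch_user_token(user.identity, service="github")

    # Call the external service on the user's behalf
    async with httpx.AsyncClient() as http:
        resp = await http.get(
            "https://api.github.com/user/repos",
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
    return {"repos": [repo["name"] for repo in resp.json()]}
```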
### Agent authentication with MCP
For information on how to authenticate an agent to an MCP server, see the [MCP conceptual guide](../concepts/mcp.md).
## Authorization
After authentication, LangGraph calls your [`@auth.on`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.on) handlers to control access to specific resources (e.g., threads, assistants, crons). These handlers can:
1. Add metadata to be saved during resource creation by mutating the `value["metadata"]` dictionary directly. See the [supported actions table](#supported-actions) for the list of types the value can take for each action.
2. Filter resources by metadata during search/list or read operations by returning a [filter dictionary](#filter-operations).
3. Raise an HTTP exception if access is denied.
To implement simple user-scoped access control, you can use a single [`@auth.on`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.on) handler for all resources and actions. To apply different controls depending on the resource and action, use [resource-specific handlers](#resource-specific-handlers). See the [Supported Resources](#supported-resources) section for a full list of the resources that support access control.
```python
@auth.on
async def add_owner(
    ctx: Auth.types.AuthContext,
    value: dict  # The payload being sent to this access method
) -> dict:  # Returns a filter dict that restricts access to resources
    """Authorize all access to threads, runs, crons, and assistants.

    This handler does two things:
        - Adds a value to resource metadata (to persist with the resource so it can be filtered later)
        - Returns a filter (to restrict access to existing resources)

    Args:
        ctx: Authentication context containing user info, permissions, the path, and more.
        value: The request payload sent to the endpoint. For creation
            operations, this contains the resource parameters. For read
            operations, this contains the resource being accessed.

    Returns:
        A filter dictionary that LangGraph uses to restrict access to resources.
        See [Filter Operations](#filter-operations) for supported operators.
    """
    # Create filter to restrict access to just this user's resources
    filters = {"owner": ctx.user.identity}

    # Get or create the metadata dictionary in the payload
    # This is where we store persistent info about the resource
    metadata = value.setdefault("metadata", {})

    # Add owner to metadata - if this is a create or update operation,
    # this information will be saved with the resource
    # So we can filter by it later in read operations
    metadata.update(filters)

    # Return filters to restrict access
    # These filters are applied to ALL operations (create, read, update, search, etc.)
    # to ensure users can only access their own resources
    return filters
```
### Resource-Specific Handlers {#resource-specific-handlers}
You can register handlers for specific resources and actions by chaining the resource and action names together with the [`@auth.on`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.Auth.on) decorator.
When a request is made, the most specific handler that matches that resource and action is called. Below is an example of registering handlers for specific resources and actions to achieve the following setup:
1. Authenticated users are able to create threads, read threads, and create runs on threads
2. Only users with the "assistants:create" permission are allowed to create new assistants
3. All other endpoints (e.g., delete assistant, crons, store) are disabled for all users.
!!! tip "Supported Handlers"
For a full list of supported resources and actions, see the [Supported Resources](#supported-resources) section below.
```python
from typing import Any

# Generic / global handler catches calls that aren't handled by more specific handlers
@auth.on
async def reject_unhandled_requests(ctx: Auth.types.AuthContext, value: Any) -> False:
    print(f"Request to {ctx.path} by {ctx.user.identity}")
    raise Auth.exceptions.HTTPException(
        status_code=403,
        detail="Forbidden"
    )

# Matches the "threads" resource and all actions - create, read, update, delete, search
# Since this is **more specific** than the generic @auth.on handler, it will take precedence
# over the generic handler for all actions on the "threads" resource
@auth.on.threads
async def on_threads(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.value
):
    if "write" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="User lacks the required permissions."
        )
    # Setting metadata on the thread being created
    # will ensure that the resource contains an "owner" field
    # Then any time a user tries to access this thread or runs within the thread,
    # we can filter by owner
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}

# Thread creation. This will match only on thread create actions
# Since this is **more specific** than both the generic @auth.on handler and the @auth.on.threads handler,
# it will take precedence for any "create" actions on the "threads" resource
@auth.on.threads.create
async def on_thread_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.create.value
):
    # Setting metadata on the thread being created
    # will ensure that the resource contains an "owner" field
    # Then any time a user tries to access this thread or runs within the thread,
    # we can filter by owner
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}

# Reading a thread. Since this is also more specific than the generic @auth.on handler and the @auth.on.threads handler,
# it will take precedence for any "read" actions on the "threads" resource
@auth.on.threads.read
async def on_thread_read(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.read.value
):
    # Since we are reading (and not creating) a thread,
    # we don't need to set metadata. We just need to
    # return a filter to ensure users can only see their own threads
    return {"owner": ctx.user.identity}

# Run creation, streaming, updates, etc.
# This takes precedence over the generic @auth.on handler and the @auth.on.threads handler
@auth.on.threads.create_run
async def on_run_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.create_run.value
):
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    # Inherit thread's access control
    return {"owner": ctx.user.identity}

# Assistant creation
@auth.on.assistants.create
async def on_assistant_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.assistants.create.value
):
    if "assistants:create" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="User lacks the required permissions."
        )
```
Notice that we are mixing global and resource-specific handlers in the above example. Since each request is handled by the most specific handler, a request to create a `thread` would match the `on_thread_create` handler but NOT the `reject_unhandled_requests` handler. A request to delete an assistant, however, would be handled by the global handler, since we don't have a more specific handler for that resource and action.
### Filter Operations {#filter-operations}
Authorization handlers can return `None`, a boolean, or a filter dictionary.
- `None` and `True` mean "authorize access to all underlying resources"
- `False` means "deny access to all underlying resources" (raises a 403 exception)
- A metadata filter dictionary will restrict access to resources
A filter dictionary is a dictionary with keys that match the resource metadata. It supports three operators:
- A bare value is shorthand for an exact match (equivalent to `$eq`, below). For example, `{"owner": user_id}` will include only resources whose metadata contains `{"owner": user_id}`
- `$eq`: Exact match (e.g., `{"owner": {"$eq": user_id}}`) - this is equivalent to the shorthand above, `{"owner": user_id}`
- `$contains`: List membership (e.g., `{"allowed_users": {"$contains": user_id}}`). The value must be an element of the list, and the corresponding metadata field on the stored resource must be a list/container type.
A dictionary with multiple keys is treated using a logical `AND` filter. For example, `{"owner": org_id, "allowed_users": {"$contains": user_id}}` will only match resources with metadata whose "owner" is `org_id` and whose "allowed_users" list contains `user_id`.
See the reference [here](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.FilterType) for more information.
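For example, a read handler could grant access based on list membership rather than strict ownership. This sketch assumes your create/update handlers populate an `allowed_users` metadata field:
```python
@auth.on.threads.read
async def allow_shared_threads(ctx: Auth.types.AuthContext, value: dict):
    # Only match threads whose "allowed_users" metadata list contains this user
    return {"allowed_users": {"$contains": ctx.user.identity}}
```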
## Common Access Patterns
Here are some typical authorization patterns:
### Single-Owner Resources
This common pattern lets you scope all threads, assistants, crons, and runs to a single user. It's useful for common single-user use cases like regular chatbot-style apps.
```python
@auth.on
async def owner_only(ctx: Auth.types.AuthContext, value: dict):
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}
```
### Permission-based Access
This pattern lets you control access based on **permissions**. It's useful if you want certain roles to have broader or more restricted access to resources.
```python
# In your auth handler:
@auth.authenticate
async def authenticate(headers: dict) -> Auth.types.MinimalUserDict:
    ...
    return {
        "identity": "user-123",
        "is_authenticated": True,
        "permissions": ["threads:write", "threads:read"]  # Define permissions in auth
    }

def _default(ctx: Auth.types.AuthContext, value: dict):
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}

@auth.on.threads.create
async def create_thread(ctx: Auth.types.AuthContext, value: dict):
    if "threads:write" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="Unauthorized"
        )
    return _default(ctx, value)

@auth.on.threads.read
async def read_thread(ctx: Auth.types.AuthContext, value: dict):
    if "threads:read" not in ctx.permissions and "threads:write" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="Unauthorized"
        )
    return _default(ctx, value)
```
## Supported Resources
LangGraph provides three levels of authorization handlers, from most general to most specific:
1. **Global Handler** (`@auth.on`): Matches all resources and actions
2. **Resource Handler** (e.g., `@auth.on.threads`, `@auth.on.assistants`, `@auth.on.crons`): Matches all actions for a specific resource
3. **Action Handler** (e.g., `@auth.on.threads.create`, `@auth.on.threads.read`): Matches a specific action on a specific resource
The most specific matching handler will be used. For example, `@auth.on.threads.create` takes precedence over `@auth.on.threads` for thread creation.
If a more specific handler is registered, the more general handler will not be called for that resource and action.
???+ tip "Type Safety"
Each handler has type hints available for its `value` parameter at `Auth.types.on...value`. For example:
```python
@auth.on.threads.create
async def on_thread_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.create.value  # Specific type for thread creation
):
    ...

@auth.on.threads
async def on_threads(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.value  # Union type of all thread actions
):
    ...

@auth.on
async def on_all(
    ctx: Auth.types.AuthContext,
    value: dict  # Union type of all possible actions
):
    ...
```
More specific handlers provide better type hints since they handle fewer action types.
#### Supported actions and types {#supported-actions}
Here are all the supported action handlers:
| Resource | Handler | Description | Value Type |
|----------|---------|-------------|------------|
| **Threads** | `@auth.on.threads.create` | Thread creation | [`ThreadsCreate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.ThreadsCreate) |
| | `@auth.on.threads.read` | Thread retrieval | [`ThreadsRead`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.ThreadsRead) |
| | `@auth.on.threads.update` | Thread updates | [`ThreadsUpdate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.ThreadsUpdate) |
| | `@auth.on.threads.delete` | Thread deletion | [`ThreadsDelete`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.ThreadsDelete) |
| | `@auth.on.threads.search` | Listing threads | [`ThreadsSearch`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.ThreadsSearch) |
| | `@auth.on.threads.create_run` | Creating or updating a run | [`RunsCreate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.RunsCreate) |
| **Assistants** | `@auth.on.assistants.create` | Assistant creation | [`AssistantsCreate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.AssistantsCreate) |
| | `@auth.on.assistants.read` | Assistant retrieval | [`AssistantsRead`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.AssistantsRead) |
| | `@auth.on.assistants.update` | Assistant updates | [`AssistantsUpdate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.AssistantsUpdate) |
| | `@auth.on.assistants.delete` | Assistant deletion | [`AssistantsDelete`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.AssistantsDelete) |
| | `@auth.on.assistants.search` | Listing assistants | [`AssistantsSearch`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.AssistantsSearch) |
| **Crons** | `@auth.on.crons.create` | Cron job creation | [`CronsCreate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.CronsCreate) |
| | `@auth.on.crons.read` | Cron job retrieval | [`CronsRead`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.CronsRead) |
| | `@auth.on.crons.update` | Cron job updates | [`CronsUpdate`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.CronsUpdate) |
| | `@auth.on.crons.delete` | Cron job deletion | [`CronsDelete`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.CronsDelete) |
| | `@auth.on.crons.search` | Listing cron jobs | [`CronsSearch`](../cloud/reference/sdk/python_sdk_ref.md#langgraph_sdk.auth.types.CronsSearch) |
???+ note "About Runs"
Runs are scoped to their parent thread for access control. This means permissions are typically inherited from the thread, reflecting the conversational nature of the data model. All run operations (reading, listing) except creation are controlled by the thread's handlers.
There is a specific `create_run` handler for creating new runs because it accepts additional arguments, which you can view in the handler's value type.
## Next Steps
For implementation details:
- Check out the introductory tutorial on [setting up authentication](../tutorials/auth/getting_started.md)
- See the how-to guide on implementing [custom auth handlers](../how-tos/auth/custom_auth.md)