# Streaming
Streaming is critical for making LLM applications feel responsive to end users.
When creating a streaming run, the streaming mode determines what kinds of data are streamed back to the API client.
## Supported streaming modes
LangGraph Platform supports the following streaming modes:
| Mode | Description | LangGraph Library Method |
|---|---|---|
| `values` | Stream the full graph state after each super-step. Guide | `.stream()` / `.astream()` with `stream_mode="values"` |
| `updates` | Stream only the updates to the graph state after each node. Guide | `.stream()` / `.astream()` with `stream_mode="updates"` |
| `messages-tuple` | Stream LLM tokens for any messages generated inside the graph (useful for chat apps). Guide | `.stream()` / `.astream()` with `stream_mode="messages"` |
| `debug` | Stream debug information throughout graph execution. Guide | `.stream()` / `.astream()` with `stream_mode="debug"` |
| `custom` | Stream custom data. Guide | `.stream()` / `.astream()` with `stream_mode="custom"` |
| `events` | Stream all events (including the state of the graph); mainly useful when migrating large LCEL apps. Guide | `.astream_events()` |
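For context on the library methods referenced in the table, here is a minimal sketch of `stream_mode="values"` in the LangGraph library itself; the state schema and node are illustrative assumptions, not part of any real application:

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

# Hypothetical state schema and node, purely for illustration.
class State(TypedDict):
    count: int

def increment(state: State) -> State:
    return {"count": state["count"] + 1}

builder = StateGraph(State)
builder.add_node("increment", increment)
builder.add_edge(START, "increment")
builder.add_edge("increment", END)
graph = builder.compile()

# stream_mode="values" yields the full graph state after each super-step.
for chunk in graph.stream({"count": 0}, stream_mode="values"):
    print(chunk)  # {'count': 0}, then {'count': 1}
```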
You can also combine multiple streaming modes at the same time, as sketched below. See the how-to guide for configuration details.
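A minimal sketch of combining modes with the SDK (the deployment URL, `assistant_id`, and `inputs` are assumed placeholders): pass a list to `stream_mode`, then inspect each chunk's `event` field to see which mode produced it.

```python
from langgraph_sdk import get_client

client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

async for chunk in client.runs.stream(
    None,
    assistant_id,
    input=inputs,
    stream_mode=["updates", "debug"],  # combine several modes in one run
):
    # chunk.event identifies the mode that produced this chunk
    # (e.g. "updates" or "debug")
    print(chunk.event, chunk.data)
```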
## Stateless runs
If you don't want to persist the outputs of a streaming run in the checkpointer DB, you can create a stateless run without creating a thread:
**Python**

```python
from langgraph_sdk import get_client

client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

async for chunk in client.runs.stream(
    None,  # (1)!
    assistant_id,
    input=inputs,
    stream_mode="updates"
):
    print(chunk.data)
```

1. We are passing `None` instead of a `thread_id` UUID.
**JavaScript**

```js
import { Client } from "@langchain/langgraph-sdk";

const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });

// create a streaming run
const streamResponse = client.runs.stream(
  null, // (1)!
  assistantID,
  {
    input,
    streamMode: "updates"
  }
);

for await (const chunk of streamResponse) {
  console.log(chunk.data);
}
```

1. We are passing `null` instead of a `thread_id` UUID.
## Join and stream
LangGraph Platform allows you to join an active background run and stream outputs from it. To do so, use the LangGraph SDK's `client.runs.join_stream` method:
**Python**

```python
from langgraph_sdk import get_client

client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

async for chunk in client.runs.join_stream(
    thread_id,
    run_id,  # (1)!
):
    print(chunk)
```

1. This is the `run_id` of an existing run you want to join.
**JavaScript**

```js
import { Client } from "@langchain/langgraph-sdk";

const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });

const streamResponse = client.runs.joinStream(
  threadID,
  runId // (1)!
);

for await (const chunk of streamResponse) {
  console.log(chunk);
}
```

1. This is the `run_id` of an existing run you want to join.
**Outputs not buffered**

When you use `.join_stream`, output is not buffered, so any output produced before joining will not be received.
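Putting the pieces together, here is a minimal sketch of the full flow (with `assistant_id` and `inputs` as assumed placeholders): create a thread, start a background run, then join its stream.

```python
from langgraph_sdk import get_client

client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

# Create a thread and kick off a background run on it.
thread = await client.threads.create()
run = await client.runs.create(
    thread["thread_id"],
    assistant_id,
    input=inputs,
)

# Join the active run. Because output is not buffered, only chunks
# produced after this point are received.
async for chunk in client.runs.join_stream(
    thread["thread_id"],
    run["run_id"],
):
    print(chunk)
```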
## API Reference
For API usage and implementation, refer to the API reference.