
Streaming

Streaming is critical for making LLM applications feel responsive to end users.
When creating a streaming run, the streaming mode determines what kinds of data are streamed back to the API client.

Supported streaming modes

LangGraph Platform supports the following streaming modes:

| Mode | Description | LangGraph Library Method |
| --- | --- | --- |
| values | Stream the full graph state after each super-step. | .stream() / .astream() with stream_mode="values" |
| updates | Stream only the updates to the graph state after each node. | .stream() / .astream() with stream_mode="updates" |
| messages-tuple | Stream LLM tokens for any messages generated inside the graph (useful for chat apps). | .stream() / .astream() with stream_mode="messages" |
| debug | Stream debug information throughout graph execution. | .stream() / .astream() with stream_mode="debug" |
| custom | Stream custom data. | .stream() / .astream() with stream_mode="custom" |
| events | Stream all events, including the state of the graph; mainly useful when migrating large LCEL apps. | .astream_events() |
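
For reference, here is a minimal sketch of how the library-side methods in the table are used directly, outside the Platform API. The graph itself (a single refine_topic node over a toy state) is purely illustrative:

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    topic: str

def refine_topic(state: State):
    # illustrative node: appends to the topic
    return {"topic": state["topic"] + " and cats"}

builder = StateGraph(State)
builder.add_node("refine_topic", refine_topic)
builder.add_edge(START, "refine_topic")
builder.add_edge("refine_topic", END)
graph = builder.compile()

# "updates" yields one chunk per node, keyed by node name
for chunk in graph.stream({"topic": "ice cream"}, stream_mode="updates"):
    print(chunk)  # {'refine_topic': {'topic': 'ice cream and cats'}}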

✅ You can also combine multiple modes at the same time. See the how-to guide for configuration details.
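
As a sketch of what combining modes looks like with the Python SDK (assuming thread_id, assistant_id, and inputs are already defined): pass a list for stream_mode, and use each chunk's event field to tell the modes apart.

from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

async for chunk in client.runs.stream(
    thread_id,
    assistant_id,
    input=inputs,
    stream_mode=["updates", "debug"],  # a list combines modes in one run
):
    # chunk.event names the mode that produced the chunk
    # (a "metadata" event carrying the run_id typically arrives first)
    print(chunk.event, chunk.data)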

Stateless runs

If you don't want to persist the outputs of a streaming run in the checkpointer DB, you can create a stateless run without creating a thread:

Python

from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

async for chunk in client.runs.stream(
    None,  # (1)!
    assistant_id,
    input=inputs,
    stream_mode="updates"
):
    print(chunk.data)
  1. We are passing None instead of a thread_id UUID.

JavaScript

import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });

// create a streaming run
const streamResponse = client.runs.stream(
  null,  // (1)!
  assistantID,
  {
    input,
    streamMode: "updates"
  }
);
for await (const chunk of streamResponse) {
  console.log(chunk.data);
}
  1. We are passing null instead of a thread_id UUID.

cURL

curl --request POST \
--url <DEPLOYMENT_URL>/runs/stream \
--header 'Content-Type: application/json' \
--header 'x-api-key: <API_KEY>' \
--data "{
  \"assistant_id\": \"agent\",
  \"input\": <inputs>,
  \"stream_mode\": \"updates\"
}"

Join and stream

LangGraph Platform allows you to join an active background run and stream its output. To do so, use the LangGraph SDK's client.runs.join_stream method:

Python

from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

async for chunk in client.runs.join_stream(
    thread_id,
    run_id,  # (1)!
):
    print(chunk)
  1. This is the run_id of an existing run you want to join.

JavaScript

import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: <DEPLOYMENT_URL>, apiKey: <API_KEY> });

const streamResponse = client.runs.joinStream(
  threadID,
  runID  // (1)!
);
for await (const chunk of streamResponse) {
  console.log(chunk);
}
  1. This is the run_id of an existing run you want to join.

cURL

curl --request GET \
--url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/<RUN_ID>/stream \
--header 'Content-Type: application/json' \
--header 'x-api-key: <API_KEY>'

Outputs not buffered

When you use .join_stream, output is not buffered: you only receive output produced after you join, and anything emitted earlier is not replayed.
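
Putting the pieces together, here is a sketch of the typical workflow (assuming thread_id, assistant_id, and inputs are defined as in the earlier examples): start a background run with client.runs.create, then attach to it later. Because of the buffering behavior above, only output produced after the join arrives.

from langgraph_sdk import get_client
client = get_client(url=<DEPLOYMENT_URL>, api_key=<API_KEY>)

# start a background run; the returned dict includes its run_id
run = await client.runs.create(thread_id, assistant_id, input=inputs)

# ... do other work while the run executes in the background ...

# attach to the run; output produced before this point is not replayed
async for chunk in client.runs.join_stream(thread_id, run["run_id"]):
    print(chunk)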

API Reference

For API usage and implementation, refer to the API reference.