agent-protocol

Agent Streaming Protocol

This directory contains the Agent Protocol streaming specification and generated language bindings:

The streaming protocol is a thread-centric event and command protocol for observing and controlling long-running agent executions. It is designed for multiple transports, supports filtered subscriptions, and makes streamed model output, tool activity, graph state, checkpoints, lifecycle status, and human-in-the-loop interactions available through a common envelope.

Design Goals

The protocol is built around a few stable primitives:

Transport Model

The CDDL schema defines a single payload model that can be used over SSE/HTTP, WebSocket, and in-process transports.

SSE/HTTP

SSE uses connection-scoped subscriptions:

The events request body is an EventStreamRequest:

{
  "channels": ["messages", "updates", "lifecycle"],
  "namespaces": [[]],
  "depth": 2,
  "since": 123
}

Each SSE connection is its own subscription. Closing the connection unsubscribes from that stream. A client may open multiple event streams for the same thread, for example one stream for low-latency model tokens and another for state or checkpoint updates.
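As a minimal sketch, a client could assemble one request body per stream as shown below. The helper name build_event_stream_request is hypothetical, and treating namespaces, depth, and since as optional is an assumption read off the example above rather than taken from the schema.

```python
import json
from typing import Optional


def build_event_stream_request(
    channels: list[str],
    namespaces: Optional[list[list[str]]] = None,
    depth: Optional[int] = None,
    since: Optional[int] = None,
) -> dict:
    """Build an EventStreamRequest body, leaving out filters that are not set."""
    if not channels:
        raise ValueError("at least one channel is required")
    body: dict = {"channels": list(channels)}
    if namespaces is not None:
        body["namespaces"] = [list(ns) for ns in namespaces]
    if depth is not None:
        body["depth"] = depth
    if since is not None:
        body["since"] = since
    return body


# Two independent subscriptions for the same thread: one for model tokens,
# one for state and checkpoint updates resumed after sequence number 123.
token_stream = build_event_stream_request(["messages"], namespaces=[[]])
state_stream = build_event_stream_request(["values", "checkpoints"], since=123)
print(json.dumps(token_stream))
print(json.dumps(state_stream))
```

Each body would be sent on its own SSE connection, so closing one stream leaves the other subscription untouched.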

WebSocket

WebSocket uses in-band commands over a single full-duplex connection:

Subscriptions persist across run boundaries and are removed explicitly or when the WebSocket closes.

Once the upgrade succeeds, the WebSocket uses the same top-level message framing described below. Clients send Command objects, and servers send CommandResponse, ErrorResponse, and unsolicited Event objects on the same connection.

Top-Level Framing

Client-to-server messages are commands:

{
  "id": 1,
  "method": "run.start",
  "params": {
    "assistantId": "agent",
    "input": {
      "messages": [{ "role": "user", "content": "Hello" }]
    }
  }
}

Server-to-client messages are command responses, error responses, or events.

Successful responses include the original command id and a typed result:

{
  "type": "success",
  "id": 1,
  "result": {
    "runId": "run_123"
  },
  "meta": {
    "appliedThroughSeq": 42
  }
}

Error responses include the original command id when available:

{
  "type": "error",
  "id": 1,
  "error": "invalid_argument",
  "message": "assistantId is required"
}

Events are unsolicited server pushes:

{
  "type": "event",
  "eventId": "evt_123",
  "seq": 43,
  "method": "messages",
  "params": {
    "namespace": [],
    "timestamp": 1710000000000,
    "data": {
      "event": "message-start",
      "role": "ai",
      "id": "msg_123"
    }
  }
}

The optional eventId maps to the SSE id: field and is used for transport reconnection. The optional seq is a monotonic sequence number used for ordering and replay.
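The framing above can be handled with a small dispatcher. The following is a sketch only: classify_frame and PendingCommands are hypothetical names, and it assumes every server frame carries the type discriminator shown in the examples.

```python
import json
from typing import Any, Optional


def classify_frame(raw: str) -> tuple[str, dict[str, Any]]:
    """Parse a server frame and return its kind: success, error, event, or unknown."""
    frame = json.loads(raw)
    kind = frame.get("type")
    if kind in ("success", "error", "event"):
        return kind, frame
    # Unrecognized frame types are surfaced instead of raising.
    return "unknown", frame


class PendingCommands:
    """Tracks in-flight commands by id so responses can be matched to them."""

    def __init__(self) -> None:
        self._next_id = 1
        self._pending: dict[int, str] = {}

    def make_command(self, method: str, params: dict[str, Any]) -> dict[str, Any]:
        command = {"id": self._next_id, "method": method, "params": params}
        self._pending[command["id"]] = method
        self._next_id += 1
        return command

    def resolve(self, frame: dict[str, Any]) -> Optional[str]:
        """Return the method this response answers, or None if it is unmatched."""
        return self._pending.pop(frame.get("id"), None)


pending = PendingCommands()
command = pending.make_command("run.start", {"assistantId": "agent", "input": {}})
kind, frame = classify_frame('{"type": "success", "id": 1, "result": {"runId": "run_123"}}')
print(kind, pending.resolve(frame))
```

Events have no command id, so they bypass the pending table and go straight to channel handlers.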

Threads, Runs, and Input

The protocol is thread-centric. A thread is the durable identity for state, checkpoints, run history, and stream routing. A server may create a thread lazily when it receives the first run.start command for a thread that does not exist yet.

run.start is the main entry point for execution input:

The command carries the target assistantId, arbitrary graph input, optional runtime config, and optional metadata.

Human-in-the-loop control uses the input module:

Namespaces

A namespace is a path of strings identifying a location in the agent tree:

Subscriptions can filter by namespace prefix and optional depth. This allows a client to observe the whole tree, a single subgraph, or a bounded region under a subgraph without receiving unrelated events.
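The filter can be sketched as a prefix comparison with a depth bound. The function name namespace_matches and the sample namespace strings are hypothetical, and the sketch assumes depth counts how many levels below the prefix are still included.

```python
from typing import Optional


def namespace_matches(
    event_ns: list[str], prefix: list[str], depth: Optional[int] = None
) -> bool:
    """Return True if event_ns lies under prefix, within depth levels if given."""
    if event_ns[: len(prefix)] != prefix:
        return False
    if depth is None:
        return True
    # Assumption: depth 0 means the prefix itself, depth 1 adds direct children.
    return len(event_ns) - len(prefix) <= depth


print(namespace_matches(["researcher", "search"], []))                 # whole tree
print(namespace_matches(["researcher", "search"], ["researcher"], 1))  # bounded region
print(namespace_matches(["writer"], ["researcher"]))                   # unrelated subgraph
```

An empty prefix with no depth matches every event, which corresponds to observing the whole tree.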

Channels

Channels are the primary subscription unit. A client requests one or more channels and receives only matching events.

messages

The messages channel streams transcript messages and content blocks. It uses explicit event boundaries:

  1. message-start
  2. content-block-start
  3. zero or more content-block-delta events
  4. content-block-finish
  5. repeat content block events for additional blocks
  6. message-finish

Content blocks do not interleave within a single message. Block N finishes before block N + 1 starts. This matches common LLM provider streaming behavior and keeps client assembly deterministic.

Delta events carry explicit delta variants: text-delta appends to the active block’s text field, reasoning-delta appends to its reasoning field, data-delta appends encoded data chunks to its base64 field, and block-delta shallow-merges fields onto the active block. For example:

{
  "event": "content-block-delta",
  "index": 0,
  "delta": {
    "type": "text-delta",
    "text": "Hello "
  }
}

Multimodal data streams use data-delta for encoded chunks:

{
  "event": "content-block-delta",
  "index": 1,
  "delta": {
    "type": "data-delta",
    "data": "UklGR...",
    "encoding": "base64"
  }
}

Tool call arguments stream as chunk content and finalize as parsed tool calls:

{
  "event": "content-block-delta",
  "index": 1,
  "delta": {
    "type": "block-delta",
    "fields": {
      "type": "tool_call_chunk",
      "id": "call_123",
      "name": "search",
      "args": "{\"query\":"
    }
  }
}

{
  "event": "content-block-finish",
  "index": 1,
  "content": {
    "type": "tool_call",
    "id": "call_123",
    "name": "search",
    "args": {
      "query": "weather"
    }
  }
}

message-finish may include token usage for AI-authored messages. Unrecoverable model-call failures are emitted as message error events.
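Client-side assembly of these events can be sketched as a small state machine. MessageAssembler is a hypothetical name. The sketch also assumes that content-block-start may carry an initial content object, that reasoning-delta carries its chunk in a reasoning key, and that base64 is the field data-delta appends to; none of these are confirmed by the examples above.

```python
from typing import Any, Optional


class MessageAssembler:
    """Rebuilds one message from messages-channel event payloads."""

    def __init__(self) -> None:
        self.message: Optional[dict[str, Any]] = None
        self.blocks: dict[int, dict[str, Any]] = {}
        self.finished = False

    def apply(self, data: dict[str, Any]) -> None:
        event = data.get("event")
        if event == "message-start":
            self.message = {"id": data.get("id"), "role": data.get("role")}
            self.blocks = {}
            self.finished = False
        elif event == "content-block-start":
            # Assumption: the start event may carry an initial "content" block.
            self.blocks[data["index"]] = dict(data.get("content") or {})
        elif event == "content-block-delta":
            block = self.blocks.setdefault(data["index"], {})
            self._apply_delta(block, data["delta"])
        elif event == "content-block-finish":
            # A finalized block, when present, replaces the accumulated one.
            if "content" in data:
                self.blocks[data["index"]] = dict(data["content"])
        elif event == "message-finish":
            self.finished = True
        # Unknown events are ignored for forward compatibility.

    @staticmethod
    def _apply_delta(block: dict[str, Any], delta: dict[str, Any]) -> None:
        kind = delta.get("type")
        if kind == "text-delta":
            block["text"] = block.get("text", "") + delta.get("text", "")
        elif kind == "reasoning-delta":
            # Assumption: the chunk is carried in a "reasoning" key.
            block["reasoning"] = block.get("reasoning", "") + delta.get("reasoning", "")
        elif kind == "data-delta":
            block["base64"] = block.get("base64", "") + delta.get("data", "")
        elif kind == "block-delta":
            block.update(delta.get("fields", {}))
        # Unknown delta variants are skipped rather than raising.

    def content(self) -> list[dict[str, Any]]:
        """Return the blocks ordered by index."""
        return [self.blocks[i] for i in sorted(self.blocks)]


assembler = MessageAssembler()
for payload in [
    {"event": "message-start", "role": "ai", "id": "msg_123"},
    {"event": "content-block-start", "index": 0, "content": {"type": "text", "text": ""}},
    {"event": "content-block-delta", "index": 0, "delta": {"type": "text-delta", "text": "Hello "}},
    {"event": "content-block-delta", "index": 0, "delta": {"type": "text-delta", "text": "world"}},
    {"event": "content-block-finish", "index": 0},
    {"event": "message-finish"},
]:
    assembler.apply(payload)
print(assembler.content())
```

Because blocks never interleave, the assembler needs no buffering beyond the per-index dictionary.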

tools

The tools channel exposes the tool execution lifecycle:

  1. tool-started
  2. zero or more tool-output-delta events for streaming tools
  3. tool-finished or tool-error

Tool events are correlated by toolCallId. To connect a tool execution back to its tool call content block, match the block's id against toolCallId.
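The correlation can be sketched as a lookup keyed by id. The helper names index_tool_calls and apply_tool_event are hypothetical, and the sketch assumes tools-channel payloads carry an event field alongside toolCallId, mirroring the messages channel.

```python
from typing import Any

# Terminal tool events mapped to the status they leave behind.
TERMINAL = {"tool-finished": "finished", "tool-error": "error"}


def index_tool_calls(blocks: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Index finalized tool_call content blocks by their id."""
    return {
        block["id"]: block
        for block in blocks
        if block.get("type") == "tool_call" and "id" in block
    }


def apply_tool_event(status: dict[str, str], data: dict[str, Any]) -> None:
    """Update the per-call status table from one tools-channel payload."""
    call_id = data.get("toolCallId")
    if call_id is None:
        return
    event = data.get("event")
    if event == "tool-started":
        status[call_id] = "running"
    elif event in TERMINAL:
        status[call_id] = TERMINAL[event]
    # tool-output-delta and unknown events leave the status unchanged.


calls = index_tool_calls(
    [{"type": "tool_call", "id": "call_123", "name": "search", "args": {"query": "weather"}}]
)
status: dict[str, str] = {}
for payload in [
    {"event": "tool-started", "toolCallId": "call_123"},
    {"event": "tool-output-delta", "toolCallId": "call_123"},
    {"event": "tool-finished", "toolCallId": "call_123"},
]:
    apply_tool_event(status, payload)

for call_id, state in status.items():
    print(calls[call_id]["name"], state)
```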

lifecycle

The lifecycle channel tracks root run and subgraph status:

Lifecycle events include a namespace, optional graphName, optional error, optional checkpoint, and optional cause.

The cause field explains why a child namespace started. Current cause variants include:

Consumers should tolerate unknown cause types so new cause variants can be added without breaking existing clients.

input

The input channel carries human-in-the-loop requests. An input.requested event contains an interruptId and application-defined payload. Clients answer with input.respond, passing the same namespace and interrupt ID.

values

The values channel carries full graph state snapshots. When a subscription is created, the first replayed values event is the current full state, giving the client a stable baseline before applying deltas from other channels.

updates

The updates channel carries per-node or per-step state deltas:

{
  "method": "updates",
  "params": {
    "namespace": [],
    "timestamp": 1710000000000,
    "data": {
      "node": "agent",
      "values": {
        "messages": []
      }
    }
  }
}

Clients that need complete state should subscribe to values for an initial snapshot and use updates for incremental changes.

checkpoints

The checkpoints channel emits lightweight checkpoint envelopes. Each envelope includes:

This lets clients build branching and time-travel interfaces without streaming full checkpoint state. Full state can be fetched lazily with state.get or in bulk with state.listCheckpoints.

A checkpoint event is emitted in the same superstep as the corresponding values event. Clients subscribed to both can correlate them by namespace and step, or by adjacent event sequence numbers.

tasks

The tasks channel carries Pregel task creation and result events. Its payload shape is intentionally open because it follows the runtime task representation.

custom and custom:*

The custom channel carries user-defined payloads emitted from graph code. The base custom channel uses a name and payload envelope. Namespaced custom channels such as custom:progress allow applications to define more specific event lanes while keeping the protocol extensible.

State and Checkpoints

The state module provides command-level access to graph state:

State access complements streaming. Streams provide live observation; state commands provide explicit reads and time-travel operations.

Replay and Reconnection

Servers may keep a ring buffer of recent events per thread. Clients use sequence numbers to recover missed events:

The server replays matching buffered events after the requested point and then switches to live delivery. If the requested event is no longer buffered, servers should report that the client missed events so the client can resync through state commands.
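A server-side ring buffer with this replay behavior could look roughly like the sketch below. EventBuffer and replay_since are hypothetical names, and the missed-event check assumes seq increases by exactly one per event within a thread, which the text above does not guarantee.

```python
from collections import deque
from typing import Any


class EventBuffer:
    """Fixed-size buffer of recent events for one thread."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        # A bounded deque drops the oldest event once capacity is reached.
        self._events: deque = deque(maxlen=capacity)

    def append(self, event: dict[str, Any]) -> None:
        self._events.append(event)

    def replay_since(self, since: int) -> tuple[list[dict[str, Any]], bool]:
        """Return (events with seq > since, whether older events were lost)."""
        if not self._events:
            return [], False
        oldest = self._events[0]["seq"]
        # Assumption: contiguous seq values, so a gap between the requested
        # point and the oldest buffered event means events were evicted.
        missed = since + 1 < oldest
        return [e for e in self._events if e["seq"] > since], missed


buffer = EventBuffer(capacity=3)
for seq in range(1, 8):
    buffer.append({"type": "event", "seq": seq, "method": "updates"})

events, missed = buffer.replay_since(5)
print([e["seq"] for e in events], missed)
events, missed = buffer.replay_since(2)
print([e["seq"] for e in events], missed)
```

When missed is true, the client would resync through state commands before trusting the replayed events.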

Extensibility

Most records include an Extensible tail, allowing additional text-keyed fields for forward compatibility. Consumers should ignore unknown fields and default unknown tagged variants to a safe fallback instead of failing closed.

Content blocks are especially extensible. New block types can be added to LangChain content block definitions and flow through the same message lifecycle. During streaming, encoded data chunks can use data-delta, while block-specific incremental fields can use block-delta without changing the transport or channel model.

Generated Bindings

The js and py directories contain generated type bindings for the CDDL schema. They are intended for typing protocol payloads, not as runtime clients. The packages do not include transport implementations, connection management, or helper APIs.

When the CDDL schema changes, regenerate the language bindings from streaming/protocol.cddl and keep the generated files in sync.