Memory API Reference

Functions:

create_memory_manager

create_memory_manager(
    model: str | BaseChatModel,
    /,
    *,
    schemas: Sequence[Union[BaseModel, type]] = (Memory,),
    instructions: str = _MEMORY_INSTRUCTIONS,
    enable_inserts: bool = True,
    enable_updates: bool = True,
    enable_deletes: bool = False,
) -> Runnable[MemoryState, list[ExtractedMemory]]

Create a memory manager that processes conversation messages and generates structured memory entries.

This function creates an async callable that analyzes conversation messages and existing memories to generate or update structured memory entries. It can identify implicit preferences, important context, and key information from conversations, organizing them into well-structured memories that can be used to improve future interactions.

The manager supports both unstructured string-based memories and structured memories defined by Pydantic models. It returns the extracted or updated memories to the caller rather than persisting them itself; use create_memory_store_manager for store-backed management.

Parameters:

  • model (Union[str, BaseChatModel]) –

    The language model to use for memory enrichment. Can be a model name string or a BaseChatModel instance.

  • schemas (Sequence[Union[BaseModel, type]], default: (Memory,) ) –

    List of Pydantic models defining the structure of memory entries. Each model should define the fields and validation rules for a type of memory. Defaults to (Memory,), which produces unstructured string-based memories.

  • instructions (str, default: _MEMORY_INSTRUCTIONS ) –

    Custom instructions for memory generation and organization. These guide how the model extracts and structures information from conversations. Defaults to predefined memory instructions.

  • enable_inserts (bool, default: True ) –

    Whether to allow creating new memory entries. When False, the manager will only update existing memories. Defaults to True.

  • enable_updates (bool, default: True ) –

    Whether to allow updating existing memories that are outdated or contradicted by new information. Defaults to True.

  • enable_deletes (bool, default: False ) –

    Whether to allow deleting existing memories that are outdated or contradicted by new information. Defaults to False.

Returns:

  • manager ( Runnable[MemoryState, list[ExtractedMemory]] ) –

    A runnable that processes conversations and returns a list of ExtractedMemory entries. Whether each memory's content is a plain string or a Pydantic model instance depends on the schemas provided.

Examples

Basic unstructured memory enrichment:

from langmem import create_memory_manager

manager = create_memory_manager("anthropic:claude-3-5-sonnet-latest")

conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

# Extract memories from conversation
memories = await manager.ainvoke({"messages": conversation})
print(memories[0][1])  # First memory's content
# Output: "User prefers dark mode for all applications"

Structured memory enrichment with Pydantic models:

from pydantic import BaseModel
from langmem import create_memory_manager

class PreferenceMemory(BaseModel):
    """Store the user's preference"""
    category: str
    preference: str
    context: str

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory]
)

# Same conversation, but with structured output
conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"}
]
memories = await manager.ainvoke({"messages": conversation})
print(memories[0][1])
# Output:
# PreferenceMemory(
#     category="ui",
#     preference="dark_mode",
#     context="User explicitly stated preference for dark mode in all applications"
# )

Working with existing memories:

conversation = [
    {
        "role": "user",
        "content": "Actually I changed my mind, dark mode hurts my eyes",
    },
    {"role": "assistant", "content": "I'll update your preference"},
]

# The manager will upsert, updating the existing memory instead of always creating a new one
updated_memories = await manager.ainvoke(
    {"messages": conversation, "existing": memories}
)
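
Each returned memory can be inspected to confirm that the earlier entry was revised in place rather than duplicated (a sketch; index 1 is the memory content, as in the first example, and index 0 is assumed to be its id):

for mem in updated_memories:
    # mem[0]: memory id (assumed), mem[1]: memory content
    print(mem[0], mem[1])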

Insertion-only memories:

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    enable_updates=False,
    enable_deletes=False,
)

conversation = [
    {
        "role": "user",
        "content": "Actually I changed my mind, dark mode is the best mode",
    },
    {"role": "assistant", "content": "I'll update your preference"},
]

# The manager will only create new memories
updated_memories = await manager.ainvoke(
    {"messages": conversation, "existing": memories}
)
print(updated_memories)

Controlling the maximum number of extraction and synthesis steps:

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
)

conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

# Set max steps for extraction and synthesis
max_steps = 3
memories = await manager.ainvoke(
    {"messages": conversation, "max_steps": max_steps}
)
print(memories)
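
Allowing deletions (a sketch, reusing the PreferenceMemory schema and the memories extracted above; enable_deletes lets the manager drop entries that the new conversation contradicts or retracts):

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    enable_deletes=True,
)

conversation = [
    {"role": "user", "content": "Please forget my theme preference, I no longer care"},
    {"role": "assistant", "content": "Understood, I'll remove that preference"},
]

# Contradicted or retracted memories may be removed rather than updated;
# inspect the returned list to see which entries remain.
updated_memories = await manager.ainvoke(
    {"messages": conversation, "existing": memories}
)
print(updated_memories)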

create_memory_store_manager

create_memory_store_manager(
    model: str | BaseChatModel,
    /,
    *,
    schemas: list | None = None,
    instructions: str = _MEMORY_INSTRUCTIONS,
    enable_inserts: bool = True,
    enable_deletes: bool = True,
    query_model: str | BaseChatModel | None = None,
    query_limit: int = 5,
    namespace: tuple[str, ...] = (
        "memories",
        "{langgraph_user_id}",
    ),
) -> MemoryStoreManager

Enriches memories stored in the configured BaseStore.

The system automatically searches for relevant memories, extracts new information, updates existing memories, and maintains a versioned history of all changes.

Parameters:

  • model (Union[str, BaseChatModel]) –

    The primary language model to use for memory enrichment. Can be a model name string or a BaseChatModel instance.

  • schemas (Optional[list], default: None ) –

    List of Pydantic models defining the structure of memory entries. Each model should define the fields and validation rules for a type of memory. If None, uses unstructured string-based memories. Defaults to None.

  • instructions (str, default: _MEMORY_INSTRUCTIONS ) –

    Custom instructions for memory generation and organization. These guide how the model extracts and structures information from conversations. Defaults to predefined memory instructions.

  • enable_inserts (bool, default: True ) –

    Whether to allow creating new memory entries. When False, the manager will only update existing memories. Defaults to True.

  • enable_deletes (bool, default: True ) –

    Whether to allow deleting existing memories that are outdated or contradicted by new information. Defaults to True.

  • query_model (Optional[Union[str, BaseChatModel]], default: None ) –

    Optional separate model for memory search queries. Using a smaller, faster model here can improve performance. If None, uses the primary model. Defaults to None.

  • query_limit (int, default: 5 ) –

    Maximum number of relevant memories to retrieve for each conversation. Higher limits provide more context but may slow down processing. Defaults to 5.

  • namespace (tuple[str, ...], default: ('memories', '{langgraph_user_id}') ) –

    Storage namespace structure for organizing memories. Supports templated values like "{langgraph_user_id}" which are populated from the runtime context. Defaults to ("memories", "{langgraph_user_id}").

Returns:

  • manager ( MemoryStoreManager ) –

    A runnable that processes conversations and automatically manages memories in the LangGraph BaseStore.

The basic data flow works as follows:

sequenceDiagram
participant Client
participant Manager
participant Store
participant LLM

Client->>Manager: conversation history
Manager->>Store: find similar memories
Store-->>Manager: memories
Manager->>LLM: analyze & extract
LLM-->>Manager: memory updates
Manager->>Store: apply changes
Manager-->>Client: updated memories

Examples

Run memory extraction "inline" within your LangGraph app. By default, each "memory" is a simple string:

import os

from anthropic import AsyncAnthropic
from langchain_core.runnables import RunnableConfig
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore

from langmem import create_memory_store_manager

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager("anthropic:claude-3-5-sonnet-latest", namespace=("memories", "{langgraph_user_id}"))
client = AsyncAnthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))


@entrypoint(store=store)
async def my_agent(message: str, config: RunnableConfig):
    memories = await store.asearch(
        ("memories", config["configurable"]["langgraph_user_id"]),
        query=message,
    )
    llm_response = await client.messages.create(
        model="claude-3-5-sonnet-latest",
        system="You are a helpful assistant.\n\n## Memories from the user:"
        f"\n<memories>\n{memories}\n</memories>",
        max_tokens=2048,
        messages=[{"role": "user", "content": message}],
    )
    response = {"role": "assistant", "content": llm_response.content[0].text}

    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]},
    )
    return response["content"]


response_1 = await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config={"configurable": {"langgraph_user_id": "user123"}},
)
print("response_1:", response_1)
# Later conversation - automatically retrieves and uses the stored preference
response_2 = await my_agent.ainvoke(
    "What theme do I prefer?",
    config={"configurable": {"langgraph_user_id": "user123"}},
)
print("response_2:", response_2)
# You can also list the memories in the user's namespace manually:
print(store.search(("memories", "user123")))

You can customize what each memory looks like by defining schemas:

from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from pydantic import BaseModel

from langmem import create_memory_store_manager

class PreferenceMemory(BaseModel):
    """Store preferences about the user."""
    category: str
    preference: str
    context: str


store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    namespace=("project", "team_1", "{langgraph_user_id}"),
)


@entrypoint(store=store)
async def my_agent(message: str):
    # Hard code the response :)
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response


# Store structured memory
await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config={"configurable": {"langgraph_user_id": "user123"}},
)

# See the extracted memories yourself
print(store.search(("memories", "user123")))

# Memory is automatically stored and can be retrieved in future conversations
# The system will also automatically update it if preferences change
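
To inspect the structured payloads themselves, you can read each search hit's stored value (a sketch, assuming the standard LangGraph store item interface):

for item in store.search(("project", "team_1", "user123")):
    # item.value holds the serialized PreferenceMemory data
    print(item.key, item.value)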

By default, relevant memories are recalled by directly embedding the new messages. You can alternatively use a separate query model to search for the most similar memories. Here's how it works:

    sequenceDiagram
        participant Client
        participant Manager
        participant QueryLLM
        participant Store
        participant MainLLM

        Client->>Manager: messages
        Manager->>QueryLLM: generate search query
        QueryLLM-->>Manager: optimized query
        Manager->>Store: find memories
        Store-->>Manager: memories
        Manager->>MainLLM: analyze & extract
        MainLLM-->>Manager: memory updates
        Manager->>Store: apply changes
        Manager-->>Client: result
Using an LLM to search for memories

from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",  # Main model for memory processing
    query_model="anthropic:claude-3-5-haiku-latest",  # Faster model for search
    query_limit=10,  # Retrieve more relevant memories
    namespace=("memories", "{langgraph_user_id}"),
)


@entrypoint(store=store)
async def my_agent(message: str):
    # Hard code the response :)
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response


await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config={"configurable": {"langgraph_user_id": "user123"}},
)

# See the extracted memories yourself
print(store.search(("memories", "user123")))

In the examples above, we called the manager in the main thread. In a real application, you'll likely want to run the manager in the background, either in a background thread or on a separate server. To do so, you can use the ReflectionExecutor class:

sequenceDiagram
    participant Agent
    participant Background
    participant Store

    Agent->>Agent: process message
    Agent-->>User: response
    Agent->>Background: schedule enrichment<br/>(after_seconds=0)
    Note over Background,Store: Memory processing happens<br/>in background thread
Running reflections in the background

Background enrichment using @entrypoint:

from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest", namespace=("memories", "{user_id}")
)
reflection = ReflectionExecutor(manager, store=store)
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest", tools=[], store=store
)


@entrypoint(store=store)
async def chat(messages: list):
    response = await agent.ainvoke({"messages": messages})

    fut = reflection.submit(
        {
            "messages": response["messages"],
        },
        # We'll schedule this immediately.
        # Adding a delay lets you **debounce** and deduplicate reflection work
        # whenever the user is actively engaging with the agent.
        after_seconds=0,
    )

    return fut


fut = await chat.ainvoke(
    [{"role": "user", "content": "I prefer dark mode in my apps"}],
    config={"configurable": {"user_id": "user-123"}},
)
# Inspect the result
fut.result()  # Wait for the reflection to complete; only needed here so we can search the store right away
print(store.search(("memories", "user-123")))
