Memory API Reference

Functions:

create_memory_manager

create_memory_manager(
    model: str | BaseChatModel,
    /,
    *,
    schemas: Sequence[S] = (Memory,),
    instructions: str = _MEMORY_INSTRUCTIONS,
    enable_inserts: bool = True,
    enable_updates: bool = True,
    enable_deletes: bool = False,
) -> Runnable[MemoryState, list[ExtractedMemory]]

Create a memory manager that processes conversation messages and generates structured memory entries.

This function creates a runnable that analyzes conversation messages and existing memories to generate or update structured memory entries. It can identify implicit preferences, important context, and key information from conversations, organizing them into well-structured memories that can be used to improve future interactions.

The manager supports both unstructured string-based memories and structured memories defined by Pydantic models. It returns the extracted memories rather than persisting them; use create_memory_store_manager if you want memories managed in a store automatically.

Parameters:

  • model (Union[str, BaseChatModel]) –

    The language model to use for memory enrichment. Can be a model name string or a BaseChatModel instance.

  • schemas (Sequence[S], default: (Memory,) ) –

    Sequence of Pydantic models defining the structure of memory entries. Each model should define the fields and validation rules for a type of memory. Defaults to (Memory,), which produces unstructured string-based memories.

  • instructions (str, default: _MEMORY_INSTRUCTIONS ) –

    Custom instructions for memory generation and organization. These guide how the model extracts and structures information from conversations. Defaults to predefined memory instructions.

  • enable_inserts (bool, default: True ) –

    Whether to allow creating new memory entries. When False, the manager will only update existing memories. Defaults to True.

  • enable_updates (bool, default: True ) –

    Whether to allow updating existing memories that are outdated or contradicted by new information. Defaults to True.

  • enable_deletes (bool, default: False ) –

    Whether to allow deleting existing memories that are outdated or contradicted by new information. Defaults to False.

Returns:

  • manager ( Runnable[MemoryState, list[ExtractedMemory]] ) –

    A runnable that processes conversations and returns a list of ExtractedMemory entries. The structure of each extracted memory depends on the schemas provided.

Examples

Basic unstructured memory enrichment:

from langmem import create_memory_manager

manager = create_memory_manager("anthropic:claude-3-5-sonnet-latest")

conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

# Extract memories from conversation
memories = await manager.ainvoke({"messages": conversation})
print(memories[0][1])  # First memory's content
# Output: "User prefers dark mode for all applications"
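
Each item in the returned list behaves like a two-item tuple of (id, content), which is why memories[0][1] above prints the content. A minimal sketch of unpacking the results by position (the variable names here are illustrative, not part of the documented API):

for memory_id, content in memories:
    # memory_id identifies the memory; content holds the extracted text or model instance
    print(memory_id, content)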

Structured memory enrichment with Pydantic models:

from pydantic import BaseModel
from langmem import create_memory_manager

class PreferenceMemory(BaseModel):
    """Store the user's preference"""
    category: str
    preference: str
    context: str

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory]
)

# Same conversation, but with structured output
conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"}
]
memories = await manager.ainvoke({"messages": conversation})
print(memories[0][1])
# Output:
# PreferenceMemory(
#     category="ui",
#     preference="dark_mode",
#     context="User explicitly stated preference for dark mode in all applications"
# )

Working with existing memories:

conversation = [
    {
        "role": "user",
        "content": "Actually I changed my mind, dark mode hurts my eyes",
    },
    {"role": "assistant", "content": "I'll update your preference"},
]

# The manager will upsert, updating the existing memory instead of always creating a new one
updated_memories = await manager.ainvoke(
    {"messages": conversation, "existing": memories}
)

Insertion-only memories:

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    enable_updates=False,
    enable_deletes=False,
)

conversation = [
    {
        "role": "user",
        "content": "Actually I changed my mind, dark mode is the best mode",
    },
    {"role": "assistant", "content": "I'll update your preference"},
]

# The manager will only create new memories
updated_memories = await manager.ainvoke(
    {"messages": conversation, "existing": memories}
)
print(updated_memories)
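
Allowing deletions (a minimal sketch reusing the PreferenceMemory schema and memories from above; whether a deletion is actually emitted depends on the model's judgment):

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    enable_deletes=True,
)

conversation = [
    {"role": "user", "content": "Please forget my theme preference entirely"},
    {"role": "assistant", "content": "I'll remove that preference"},
]

# With deletions enabled, outdated or retracted memories may be removed
updated_memories = await manager.ainvoke(
    {"messages": conversation, "existing": memories}
)
print(updated_memories)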

Setting the maximum number of steps for extraction and synthesis:

manager = create_memory_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
)

conversation = [
    {"role": "user", "content": "I prefer dark mode in all my apps"},
    {"role": "assistant", "content": "I'll remember that preference"},
]

# Set max steps for extraction and synthesis
max_steps = 3
memories = await manager.ainvoke(
    {"messages": conversation, "max_steps": max_steps}
)
print(memories)

create_memory_store_manager

create_memory_store_manager(
    model: str | BaseChatModel,
    /,
    *,
    schemas: list[S] | None = None,
    instructions: str = _MEMORY_INSTRUCTIONS,
    default: str | dict | S | None = None,
    default_factory: Callable[
        [RunnableConfig], str | dict | S
    ]
    | None = None,
    enable_inserts: bool = True,
    enable_deletes: bool = False,
    query_model: str | BaseChatModel | None = None,
    query_limit: int = 5,
    namespace: tuple[str, ...] = (
        "memories",
        "{langgraph_user_id}",
    ),
    store: BaseStore | None = None,
    phases: list[MemoryPhase] | None = None,
) -> MemoryStoreManager

Enriches memories stored in the configured BaseStore.

The system automatically searches for relevant memories, extracts new information, updates existing memories, and maintains a versioned history of all changes.

Parameters:

  • model (Union[str, BaseChatModel]) –

    The primary language model to use for memory enrichment. Can be a model name string or a BaseChatModel instance.

  • schemas (Optional[list], default: None ) –

    List of Pydantic models defining the structure of memory entries. Each model should define the fields and validation rules for a type of memory. If None, uses unstructured string-based memories. Defaults to None.

  • instructions (str, default: _MEMORY_INSTRUCTIONS ) –

    Custom instructions for memory generation and organization. These guide how the model extracts and structures information from conversations. Defaults to predefined memory instructions.

  • default (str | dict | S | None, default: None ) –

    Default value to persist to the store if no other memories are found. This is mostly useful when managing a profile memory that you want to initialize with some default values. The resulting memory is stored under the "default" key in the configured namespace. Defaults to None.

  • default_factory (Callable[[RunnableConfig], str | dict | S], default: None ) –

    A factory function to generate the default value. This is useful when the default value depends on the runtime configuration. Defaults to None.

  • enable_inserts (bool, default: True ) –

    Whether to allow creating new memory entries. When False, the manager will only update existing memories. Defaults to True.

  • enable_deletes (bool, default: False ) –

    Whether to allow deleting existing memories that are outdated or contradicted by new information. Defaults to False.

  • query_model (Optional[Union[str, BaseChatModel]], default: None ) –

    Optional separate model for memory search queries. Using a smaller, faster model here can improve performance. If None, uses the primary model. Defaults to None.

  • query_limit (int, default: 5 ) –

    Maximum number of relevant memories to retrieve for each conversation. Higher limits provide more context but may slow down processing. Defaults to 5.

  • namespace (tuple[str, ...], default: ('memories', '{langgraph_user_id}') ) –

    Storage namespace structure for organizing memories. Supports templated values like "{langgraph_user_id}" which are populated from the runtime context. Defaults to ("memories", "{langgraph_user_id}").

  • store (Optional[BaseStore], default: None ) –

    The store to use for memory storage. If None, uses the store configured in the LangGraph config. Defaults to None. When using LangGraph Platform, the server will manage the store for you.

  • phases (Optional[list], default: None ) –

    List of MemoryPhase objects defining the phases of the memory enrichment process. Defaults to None.

Returns:

  • manager ( MemoryStoreManager ) –

    A runnable that processes conversations and automatically manages memories in the configured LangGraph BaseStore.

The basic data flow works as follows:

sequenceDiagram
participant Client
participant Manager
participant Store
participant LLM

Client->>Manager: conversation history
Manager->>Store: find similar memories
Store-->>Manager: memories
Manager->>LLM: analyze & extract
LLM-->>Manager: memory updates
Manager->>Store: apply changes
Manager-->>Client: updated memories
Examples

Run memory extraction "inline" within your LangGraph app. By default, each "memory" is a simple string:

import os

from anthropic import AsyncAnthropic
from langchain_core.runnables import RunnableConfig
from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore

from langmem import create_memory_store_manager

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
)
client = AsyncAnthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))


@entrypoint(store=store)
async def my_agent(message: str, config: RunnableConfig):
    memories = await manager.asearch(
        query=message,
        config=config,
    )
    llm_response = await client.messages.create(
        model="claude-3-5-sonnet-latest",
        system="You are a helpful assistant.\n\n## Memories from the user:"
        f"\n<memories>\n{memories}\n</memories>",
        max_tokens=2048,
        messages=[{"role": "user", "content": message}],
    )
    response = {"role": "assistant", "content": llm_response.content[0].text}

    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]},
    )
    return response["content"]

config = {"configurable": {"langgraph_user_id": "user123"}}
response_1 = await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config=config,
)
print("response_1:", response_1)
# Later conversation - automatically retrieves and uses the stored preference
response_2 = await my_agent.ainvoke(
    "What theme do I prefer?",
    config=config,
)
print("response_2:", response_2)
# You can also search the memories in the user's namespace manually:
print(manager.search(query="app preferences", config=config))
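
Because the namespace template resolves to ("memories", "user123") for this config, you can also query the underlying store directly. A minimal sketch, assuming the standard BaseStore search API:

# Query the store directly using the resolved namespace
print(store.search(("memories", "user123"), query="app preferences"))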

You can customize the structure of each memory by defining schemas:

from langgraph.func import entrypoint
from langgraph.store.memory import InMemoryStore
from pydantic import BaseModel

from langmem import create_memory_store_manager

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)

class PreferenceMemory(BaseModel):
    """Store preferences about the user."""
    category: str
    preference: str
    context: str


manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    schemas=[PreferenceMemory],
    namespace=("project", "team_1", "{langgraph_user_id}"),
)


@entrypoint(store=store)
async def my_agent(message: str):
    # Hard code the response :)
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response


# Store structured memory
config = {"configurable": {"langgraph_user_id": "user123"}}
await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config=config,
)

# See the extracted memories yourself
print(manager.search(query="app preferences", config=config))

# Memory is automatically stored and can be retrieved in future conversations
# The system will also automatically update it if preferences change
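
For example, a later conversation that contradicts the stored preference should lead the manager to revise the existing entry rather than pile up duplicates. A minimal sketch reusing the agent above (the exact update behavior depends on the model):

await my_agent.ainvoke(
    "Actually, I changed my mind: light mode is easier on my eyes",
    config=config,
)
print(manager.search(query="app preferences", config=config))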

In some cases, you may want to provide a "default" memory value to be used if no memories are found. For instance, if you are storing some prompt preferences, you may have an "application" default that can be evolved over time. This can be done by setting the default parameter:

manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
    # Note: This default value must be compatible with the schemas
    # you provided above. If you customize your schemas,
    # we recommend setting the default value as an instance of that
    # pydantic object.
    default="Use a concise and professional tone in all responses. The user likes light mode.",
)


# ... same agent as before ...
@entrypoint(store=store)
async def my_agent(message: str):
    # Hard code the response :)
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response


# Store structured memory
config = {"configurable": {"langgraph_user_id": "user124"}}
await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config=config,
)

# See the extracted memories yourself
print(manager.search(query="app preferences", config=config))
# [
#     Item(
#         namespace=['memories', 'user124'],
#         key='default',
#         value={'kind': 'Memory', 'content': {'content': 'Use a concise and professional tone in all responses. The user prefers dark mode in all apps'}},
#         created_at='2025-04-14T22:20:25.148884+00:00',
#         updated_at='2025-04-14T22:20:25.148892+00:00',
#         score=None
#     )
# ]

You can even make the default depend on the runtime configuration by providing a default_factory:

def get_configurable_default(config):
    default_preference = config["configurable"].get(
        "preference", "Use a concise and professional tone in all responses."
    )
    return default_preference


manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",
    namespace=("memories", "{langgraph_user_id}"),
    default_factory=get_configurable_default,
)


# ... same agent as before ...
@entrypoint(store=store)
async def my_agent(message: str):
    # Hard code the response :)
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response


# Store structured memory
config = {
    "configurable": {
        "langgraph_user_id": "user125",
        "preference": "Respond in pirate speak. User likes light mode",
    }
}
await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config=config,
)

# See the extracted memories yourself
print(manager.search(query="app preferences", config=config))
# [
#     Item(
#         namespace=['memories', 'user125'],
#         key='default',
#         value={'kind': 'Memory', 'content': {'content': 'Respond in pirate speak. User prefers dark mode in all apps'}},
#         created_at='2025-04-14T22:20:25.148884+00:00',
#         updated_at='2025-04-14T22:20:25.148892+00:00',
#         score=None
#     )
# ]

By default, relevant memories are recalled by directly embedding the new messages. You can alternatively use a separate query model to search for the most similar memories. Here's how it works:

    sequenceDiagram
        participant Client
        participant Manager
        participant QueryLLM
        participant Store
        participant MainLLM

        Client->>Manager: messages
        Manager->>QueryLLM: generate search query
        QueryLLM-->>Manager: optimized query
        Manager->>Store: find memories
        Store-->>Manager: memories
        Manager->>MainLLM: analyze & extract
        MainLLM-->>Manager: memory updates
        Manager->>Store: apply changes
        Manager-->>Client: result
Using an LLM to search for memories
from langmem import create_memory_store_manager
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest",  # Main model for memory processing
    query_model="anthropic:claude-3-5-haiku-latest",  # Faster model for search
    query_limit=10,  # Retrieve more relevant memories
    namespace=("memories", "{langgraph_user_id}"),
)


@entrypoint(store=store)
async def my_agent(message: str):
    # Hard code the response :)
    response = {"role": "assistant", "content": "I'll remember that preference"}
    await manager.ainvoke(
        {"messages": [{"role": "user", "content": message}, response]}
    )
    return response

config = {"configurable": {"langgraph_user_id": "user123"}}
await my_agent.ainvoke(
    "I prefer dark mode in all my apps",
    config=config,
)

# See the extracted memories yourself
print(manager.search(config=config))

In the examples above, we were calling the manager in the main thread. In a real application, you'll likely want to background the execution of the manager, either by executing it in a background thread or on a separate server. To do so, you can use the ReflectionExecutor class:

sequenceDiagram
    participant Agent
    participant Background
    participant Store

    Agent->>Agent: process message
    Agent-->>User: response
    Agent->>Background: schedule enrichment<br/>(after_seconds=0)
    Note over Background,Store: Memory processing happens<br/>in background thread
Running reflections in the background

Background enrichment using @entrypoint:

from langmem import create_memory_store_manager, ReflectionExecutor
from langgraph.prebuilt import create_react_agent
from langgraph.store.memory import InMemoryStore
from langgraph.func import entrypoint

store = InMemoryStore(
    index={
        "dims": 1536,
        "embed": "openai:text-embedding-3-small",
    }
)
manager = create_memory_store_manager(
    "anthropic:claude-3-5-sonnet-latest", namespace=("memories", "{user_id}")
)
reflection = ReflectionExecutor(manager, store=store)
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest", tools=[], store=store
)


@entrypoint(store=store)
async def chat(messages: list):
    response = await agent.ainvoke({"messages": messages})

    fut = reflection.submit(
        {
            "messages": response["messages"],
        },
        # We'll schedule this immediately.
        # Adding a delay lets you **debounce** and deduplicate reflection work
        # whenever the user is actively engaging with the agent.
        after_seconds=0,
    )

    return fut

config = {"configurable": {"user_id": "user-123"}}
fut = await chat.ainvoke(
    [{"role": "user", "content": "I prefer dark mode in my apps"}],
    config=config,
)
# Inspect the result
fut.result()  # Wait for the reflection to complete (only needed here so we can inspect the results inline)
print(manager.search(query="app preferences", config=config))
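
In production you will usually pass a nonzero after_seconds so that, as the comment above notes, rapid follow-up messages debounce and deduplicate pending reflection work instead of each triggering its own run. A minimal sketch of the same submit call with a delay:

# Wait 30 seconds before processing memories; later submissions within
# that window debounce the pending reflection work
fut = reflection.submit(
    {"messages": response["messages"]},
    after_seconds=30,
)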
