How to add semantic search to your agent's memory

This guide shows how to enable semantic search in your agent's memory store. This lets your agent retrieve items from long-term memory by semantic similarity.

Dependencies and environment setup

First, install this guide's required dependencies.

npm install \
  @langchain/langgraph \
  @langchain/openai \
  @langchain/core \
  uuid \
  zod

Next, we need to set our API key for OpenAI (the LLM we will use).

export OPENAI_API_KEY=your-api-key

Optionally, we can set an API key for LangSmith tracing, which will give us best-in-class observability.

export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_CALLBACKS_BACKGROUND="true"
export LANGCHAIN_API_KEY=your-api-key

Here we create our memory store with an index configuration.

By default, stores are configured without semantic/vector search. You can opt in to indexing items when creating the store by providing an IndexConfig to the store's constructor.

If your store class does not implement this interface, or if you do not pass in an index configuration, semantic search is disabled, and all index arguments passed to put will have no effect.
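For contrast, here is a minimal sketch of a store created without an index configuration (the plainStore name is ours, for illustration only); it still works as a key-value store, but semantic search stays disabled:

import { InMemoryStore } from "@langchain/langgraph";

// No index configuration: get/put/search by namespace still work,
// but there is no semantic similarity search.
const plainStore = new InMemoryStore();

// Index arguments passed to put would be ignored here, since indexing is disabled.
await plainStore.put(["user_123", "memories"], "note", { text: "hello" });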

Now, let's create that store!

import { OpenAIEmbeddings } from "@langchain/openai";
import { InMemoryStore } from "@langchain/langgraph";

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const store = new InMemoryStore({
  index: {
    embeddings,
    dims: 1536,
  }
});

The anatomy of a memory

Before we get into semantic search, let's look at how memories are structured, and how to store them:

const namespace = ["user_123", "memories"];
const memoryKey = "favorite_food";
const memoryValue = { text: "I love pizza" };

await store.put(namespace, memoryKey, memoryValue);

As you can see, memories are composed of a namespace, a key, and a value.

Namespaces are multi-dimensional values (arrays of strings) that allow you to segment memory according to the needs of your application. In this case, we're segmenting memories by user by using a User ID ("user_123") as the first dimension of our namespace array.

Keys are arbitrary strings that identify the memory within the namespace. If you write to the same key in the same namespace multiple times, you'll overwrite the memory that was stored under that key.

Values are objects that represent the actual memory being stored. They can be any object, so long as it's serializable. You can structure these objects according to the needs of your application.
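For example, because the first dimension of our namespace is a user ID, memories written for one user are invisible to lookups scoped to another. Here is a quick sketch (user_456 is a hypothetical second user, not used elsewhere in this guide):

// Hypothetical second user: same key, different namespace
await store.put(["user_456", "memories"], "favorite_food", { text: "I love sushi" });

// Lookups are scoped by namespace, so user_123's memory is untouched
const item = await store.get(["user_123", "memories"], "favorite_food");
console.log(item?.value.text); // "I love pizza"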

Simple memory retrieval

Let's add some more memories to our store and then fetch one of them by its key to check that it stored properly.

await store.put(
  ["user_123", "memories"],
  "italian_food",
  { text: "I prefer Italian food" }
);
await store.put(
  ["user_123", "memories"],
  "spicy_food",
  { text: "I don't like spicy food" }
);
await store.put(
  ["user_123", "memories"],
  "occupation",
  { text: "I am an airline pilot" }
);

// That occupation is too lofty - let's overwrite
// it with something more... down-to-earth
await store.put(
  ["user_123", "memories"],
  "occupation",
  { text: "I am a tunnel engineer" }
);

// Now let's check that our occupation memory was overwritten
const occupation = await store.get(["user_123", "memories"], "occupation");
console.log(occupation.value.text);
I am a tunnel engineer

Searching memories with natural language

Now that we've seen how to store and retrieve memories by namespace and key, let's look at how memories are retrieved using semantic search.

Imagine that we had a big pile of memories that we wanted to search, and we didn't know the key that corresponds to the memory that we want to retrieve. Semantic search allows us to search our memory store without keys, by performing a natural language query using text embeddings. We demonstrate this in the following example:

const memories = await store.search(["user_123", "memories"], {
  query: "What is my occupation?",
  limit: 3,
});

for (const memory of memories) {
  console.log(`Memory: ${memory.value.text} (similarity: ${memory.score})`);
}
Memory: I am a tunnel engineer (similarity: 0.3070681445327329)
Memory: I prefer Italian food (similarity: 0.1435366180543232)
Memory: I love pizza (similarity: 0.10650935500808985)

Simple Example: Long-term semantic memory in a ReAct agent

Let's look at a simple example of providing an agent with long-term memory.

Long-term memory can be thought of in two phases: storage and recall.

In the example below we handle storage by giving the agent a tool that it can use to create new memories.

To handle recall we'll add a prompt step that queries the memory store using the text from the user's chat message. We'll then inject the results of that query into the system message.

Simple memory storage tool

Let's start off by creating a tool that lets the LLM store new memories:

import { tool } from "@langchain/core/tools";
import { LangGraphRunnableConfig } from "@langchain/langgraph";

import { z } from "zod";
import { v4 as uuidv4 } from "uuid";

const upsertMemoryTool = tool(async (
  { content },
  config: LangGraphRunnableConfig
): Promise<string> => {
  const store = config.store as InMemoryStore;
  if (!store) {
    throw new Error("No store provided to tool.");
  }
  await store.put(
    ["user_123", "memories"],
    uuidv4(), // give each memory its own unique ID
    { text: content }
  );
  return "Stored memory.";
}, {
  name: "upsert_memory",
  schema: z.object({
    content: z.string().describe("The content of the memory to store."),
  }),
  description: "Upsert long-term memories.",
});

In the tool above, we use a UUID as the key so that the memory store can accumulate memories endlessly without worrying about key conflicts. We do this instead of accumulating memories into a single object or array because the memory store indexes items by key. Giving each memory its own key in the store allows each memory to be assigned its own unique embedding vector that can be matched to the search query.
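To see the effect, here is a small aside (written to a scratch namespace of our own so it doesn't disturb the running example): each put creates a separately indexed item, so a query can match an individual memory rather than one big blob.

import { v4 as uuidv4 } from "uuid";

// Each put creates its own item, each with its own embedding
await store.put(["scratch", "memories"], uuidv4(), { text: "I play chess on weekends" });
await store.put(["scratch", "memories"], uuidv4(), { text: "My cat is named Mochi" });

// The hobby-related memory can match on its own, with its own score
const hits = await store.search(["scratch", "memories"], {
  query: "hobbies and games",
  limit: 1,
});
console.log(hits[0]?.value.text); // likely "I play chess on weekends"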

Simple semantic recall mechanism

Now that we have a tool for storing memories, let's create a prompt function that we can use with createReactAgent to handle the recall mechanism.

Note that if you weren't using createReactAgent here, you could use this same function as the first node in your graph and it would work just as well (see the sketch after the code below).

import { MessagesAnnotation } from "@langchain/langgraph";

const addMemories = async (
  state: typeof MessagesAnnotation.State,
  config: LangGraphRunnableConfig
) => {
  const store = config.store as InMemoryStore;

  if (!store) {
    throw new Error("No store provided to prompt function.");
  }

  // Search based on user's last message
  const items = await store.search(
    ["user_123", "memories"], 
    { 
      // Assume it's not a complex message
      query: state.messages[state.messages.length - 1].content as string,
      limit: 4 
    }
  );

  const memories = items.length
    ? `## Memories of user\n${
      items.map(item => `${item.value.text} (similarity: ${item.score})`).join("\n")
    }`
    : "";

  // Add retrieved memories to system message
  return [
    { role: "system", content: `You are a helpful assistant.\n${memories}` },
    ...state.messages
  ];
};
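For reference, here is a minimal sketch of that alternative: using addMemories as the first node of a hand-built graph. This is not part of the guide's running example, and the node and variable names are our own:

import { StateGraph, START, END } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4o-mini" });

const handBuiltGraph = new StateGraph(MessagesAnnotation)
  // addMemories builds the prompt (system message with recalled memories
  // plus the chat history); we then call the model with it
  .addNode("callModel", async (state, config) => {
    const promptMessages = await addMemories(state, config);
    const response = await model.invoke(promptMessages);
    return { messages: [response] };
  })
  .addEdge(START, "callModel")
  .addEdge("callModel", END)
  .compile({ store });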

Putting it all together

Finally, let's put it all together into an agent, using createReactAgent. Notice that we're not adding a checkpointer here. The examples below will not be reusing message history. All details not contained in the input messages will be coming from the recall mechanism defined above.

import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";

const agent = createReactAgent({
  llm: new ChatOpenAI({ model: "gpt-4o-mini" }),
  tools: [upsertMemoryTool],
  prompt: addMemories,
  store: store
});

Using our sample agent

Now that we've got everything put together, let's test it out!

First, let's define a helper function that we can use to print messages in the conversation.

import {
  BaseMessage,
  isSystemMessage,
  isAIMessage,
  isHumanMessage,
  isToolMessage,
  AIMessage,
  HumanMessage,
  ToolMessage,
  SystemMessage,
} from "@langchain/core/messages";

function printMessages(messages: BaseMessage[]) {
  for (const message of messages) {
    if (isSystemMessage(message)) {
      const systemMessage = message as SystemMessage;
      console.log(`System: ${systemMessage.content}`);
    } else if (isHumanMessage(message)) {
      const humanMessage = message as HumanMessage;
      console.log(`User: ${humanMessage.content}`);
    } else if (isAIMessage(message)) {
      const aiMessage = message as AIMessage;
      if (aiMessage.content) {
        console.log(`Assistant: ${aiMessage.content}`);
      }
      if (aiMessage.tool_calls) {
        for (const toolCall of aiMessage.tool_calls) {
          console.log(`\t${toolCall.name}(${JSON.stringify(toolCall.args)})`);
        }
      }
    } else if (isToolMessage(message)) {
      const toolMessage = message as ToolMessage;
      console.log(
        `\t\t${toolMessage.name} -> ${JSON.stringify(toolMessage.content)}`
      );
    }
  }
}

Now if we run the agent and print the messages, we can see that the agent remembers the food preferences that we added to the store at the very beginning of this demo!

let result = await agent.invoke({
  messages: [
    {
      role: "user",
      content: "I'm hungry. What should I eat?",
    },
  ],
});

printMessages(result.messages);
User: I'm hungry. What should I eat?
Assistant: Since you prefer Italian food and love pizza, how about ordering a pizza? You could choose a classic Margherita or customize it with your favorite toppings, making sure to keep it non-spicy. Enjoy your meal!

Storing new memories

Now that we know that the recall mechanism works, let's see if we can get our example agent to store a new memory:

result = await agent.invoke({
  messages: [
    {
      role: "user",
      content: "Please remember that every Thursday is trash day.",
    },
  ],
});

printMessages(result.messages);
User: Please remember that every Thursday is trash day.
    upsert_memory({"content":"Every Thursday is trash day."})
        upsert_memory -> "Stored memory."
Assistant: I've remembered that every Thursday is trash day!

And now that it has stored it, let's see if it remembers.

Remember: there's no checkpointer here. Every time we invoke the agent, it's an entirely new conversation.

result = await agent.invoke({
  messages: [
    {
      role: "user",
      content: "When am I supposed to take out the garbage?",
    },
  ],
});

printMessages(result.messages);
User: When am I supposed to take out the garbage?
Assistant: You take out the garbage every Thursday, as it's trash day for you.

Advanced usage

The example above is quite simple, but hopefully it helps you imagine how you might interweave storage and recall mechanisms into your own agents. In the sections below we touch on a few more topics that might help as you get into more advanced use cases.

Multi-vector indexing

You can store and search different aspects of memories separately to improve recall or to omit certain fields from the semantic indexing process.

import { InMemoryStore } from "@langchain/langgraph";

// Configure store to embed both memory content and emotional context
const multiVectorStore = new InMemoryStore({
  index: {
    embeddings: embeddings,
    dims: 1536,
    fields: ["memory", "emotional_context"],
  },
});

// Store memories with different content/emotion pairs
await multiVectorStore.put(["user_123", "memories"], "mem1", {
  memory: "Had pizza with friends at Mario's",
  emotional_context: "felt happy and connected",
  this_isnt_indexed: "I prefer ravioli though",
});
await multiVectorStore.put(["user_123", "memories"], "mem2", {
  memory: "Ate alone at home",
  emotional_context: "felt a bit lonely",
  this_isnt_indexed: "I like pie",
});

// Search focusing on emotional state - matches mem2
const results = await multiVectorStore.search(["user_123", "memories"], {
  query: "times they felt isolated",
  limit: 1,
});

console.log("Expect mem 2");

for (const r of results) {
  console.log(`Item: ${r.key}; Score(${r.score})`);
  console.log(`Memory: ${r.value.memory}`);
  console.log(`Emotion: ${r.value.emotional_context}`);
}
Expect mem 2
Item: mem2; Score(0.58961641225287)
Memory: Ate alone at home
Emotion: felt a bit lonely

Override fields at storage time

You can override which fields to embed when storing a specific memory by passing an index argument (a list of fields) to put, regardless of the store's default configuration.

import { InMemoryStore } from "@langchain/langgraph";

const overrideStore = new InMemoryStore({
  index: {
    embeddings: embeddings,
    dims: 1536,
    // Default to embed memory field
    fields: ["memory"],
  }
});

// Store one memory with default indexing
await overrideStore.put(["user_123", "memories"], "mem1", {
  memory: "I love spicy food",
  context: "At a Thai restaurant",
});

// Store another memory, overriding which fields
// to embed via put's index argument
await overrideStore.put(
  ["user_123", "memories"],
  "mem2",
  {
    memory: "I love spicy food",
    context: "At a Thai restaurant",
  },
  // Override: only embed the context field
  ["context"]
);

// Search about food - matches mem1 (using default field)
console.log("Expect mem1");
const results2 = await overrideStore.search(["user_123", "memories"], {
  query: "what food do they like",
  limit: 1,
});

for (const r of results2) {
  console.log(`Item: ${r.key}; Score(${r.score})`);
  console.log(`Memory: ${r.value.memory}`);
}

// Search about restaurant atmosphere - matches mem2 (using overridden field)
console.log("Expect mem2");
const results3 = await overrideStore.search(["user_123", "memories"], {
  query: "restaurant environment",
  limit: 1,
});

for (const r of results3) {
  console.log(`Item: ${r.key}; Score(${r.score})`);
  console.log(`Memory: ${r.value.memory}`);
}
Expect mem1
Item: mem1; Score(0.3375009515587189)
Memory: I love spicy food
Expect mem2
Item: mem2; Score(0.1920732213417712)
Memory: I love spicy food
