Memory in DeepAgents: How Agents Learn Across Conversations
Filesystem-backed memory, user versus agent scope, and background consolidation, with the TypeScript to wire each one
Memory in DeepAgents is just files. The agent reads and writes memory with the same filesystem tools it uses for everything else, and a backend decides where those files actually live. That is the whole model. Once it clicks, the rest is configuration.
I wrote earlier about short-term versus long-term agent memory at a concept level. This post is the DeepAgents-specific follow-up: the TypeScript for getting persistence right, the scoping decisions that matter, and the security traps that come with shared memory. It is part three of my series on building with DeepAgents.
To be clear about scope: this is about long-term memory, the kind that survives across conversations. Short-term memory (the conversation history and scratch files inside one session) is managed automatically as part of agent state, and I covered how that gets compressed in the context engineering post.
How memory works, in three steps
- You point the agent at memory files by passing paths to `memory`. A backend controls where those files are stored and who can read them.
- The agent reads memory. It can load files into the system prompt at startup, or read them on demand during the conversation. Skills, for example, use on-demand loading: only descriptions at startup, full content when relevant.
- The agent updates memory, optionally. When it learns something, it uses its `edit_file` tool to update a memory file. That can happen during the conversation or in the background between conversations.
Not all memory is writable. Developer-defined skills and organization policies are usually read-only. We will get to why that matters.
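To make the "files plus a backend" model concrete, here is a toy sketch of path-prefix routing: paths under one prefix go to a durable store, everything else falls back to session scratch space. Every name in it (`FileStore`, `createPrefixRouter`) is hypothetical and invented for illustration. It is not how DeepAgents implements its backends, only the general shape the configs in this post rely on.

```typescript
// Toy model only: these names are hypothetical, not the DeepAgents API.
type FileStore = Map<string, string>;

function createPrefixRouter(fallback: FileStore, routes: Record<string, FileStore>) {
  // Pick the store with the longest matching path prefix, else the fallback.
  function resolve(path: string): FileStore {
    let best: string | undefined;
    for (const prefix of Object.keys(routes)) {
      const longer = best === undefined || prefix.length > best.length;
      if (path.startsWith(prefix) && longer) best = prefix;
    }
    const matched = best === undefined ? undefined : routes[best];
    return matched ?? fallback;
  }
  return {
    read: (path: string): string | undefined => resolve(path).get(path),
    write: (path: string, content: string): void => {
      resolve(path).set(path, content);
    },
  };
}

const sessionFiles: FileStore = new Map(); // scratch space for one conversation
const durableFiles: FileStore = new Map(); // stands in for a persistent store

const files = createPrefixRouter(sessionFiles, { "/memories/": durableFiles });
files.write("/memories/AGENTS.md", "Prefers concise answers.");
files.write("/scratch/plan.md", "Step 1: read the repo.");
```

The agent-facing calls are identical for both paths. Only the routing decides which write outlives the conversation.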
Scoping is the decision that counts
The single most important choice is the backend namespace, because it decides who shares a memory file. Two patterns cover most cases.
Agent-scoped memory
Set the namespace to the assistant ID and every conversation, for every user, reads and writes the same file. The agent builds up one shared identity: accumulated knowledge, learned preferences, refined approaches.
```typescript
import { createDeepAgent, CompositeBackend, StateBackend, StoreBackend } from "deepagents";

const agent = createDeepAgent({
  memory: ["/memories/AGENTS.md"],
  skills: ["/skills/"],
  backend: new CompositeBackend(
    // Default backend: files that live in agent state for one conversation.
    new StateBackend(),
    {
      // Everything under these prefixes persists, keyed by assistant ID,
      // so all users of this assistant share the same files.
      "/memories/": new StoreBackend({
        namespace: (ctx) => [ctx.runtime.serverInfo.assistantId],
      }),
      "/skills/": new StoreBackend({
        namespace: (ctx) => [ctx.runtime.serverInfo.assistantId],
      }),
    },
  ),
});
```

This is great for a single shared assistant that should get better over time. It is dangerous when users should not see each other's data, which is the case for most consumer products.
User-scoped memory
Namespace by user ID and each user gets an isolated copy. Core instructions stay fixed, but preferences and history are per person. User A's notes never leak into User B's conversation.
```typescript
import { createDeepAgent, CompositeBackend, StateBackend, StoreBackend } from "deepagents";

const agent = createDeepAgent({
  memory: ["/memories/preferences.md"],
  skills: ["/skills/"],
  backend: new CompositeBackend(
    new StateBackend(),
    {
      // Keyed by user ID: each user reads and writes an isolated copy.
      "/memories/": new StoreBackend({
        namespace: (ctx) => [ctx.runtime.context.userId],
      }),
      "/skills/": new StoreBackend({
        namespace: (ctx) => [ctx.runtime.context.userId],
      }),
    },
  ),
});
```

Default to user scope unless you have a specific reason to share. It is the safe choice, and switching from shared to isolated later is a painful migration.
You do not pre-populate these files from the agent. You seed them from application code with the Store API (store.put), and the agent creates or edits them on demand as users share things worth remembering.
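A toy sketch of what namespacing and seeding amount to. The in-memory store below is hypothetical; only the `put(namespace, key, value)` call shape is borrowed from the Store API mentioned above (the real one is asynchronous). The `{ content }` value shape is a placeholder I made up, not the file format `StoreBackend` expects, so check the docs before seeding real data.

```typescript
// Hypothetical in-memory stand-in for a namespaced store.
type MemoryValue = { content: string }; // assumed shape, for illustration only

function createToyStore() {
  const records = new Map<string, MemoryValue>();
  // Combine namespace parts and key into one lookup key.
  const slot = (namespace: string[], key: string): string =>
    JSON.stringify([namespace, key]);
  return {
    put(namespace: string[], key: string, value: MemoryValue): void {
      records.set(slot(namespace, key), value);
    },
    get(namespace: string[], key: string): MemoryValue | undefined {
      return records.get(slot(namespace, key));
    },
  };
}

const store = createToyStore();

// Seed from application code: same path, different namespace per user.
store.put(["user-a"], "/memories/preferences.md", { content: "Prefers TypeScript." });
store.put(["user-b"], "/memories/preferences.md", { content: "Prefers Python." });

// An agent-scoped namespace would instead be shared by every user.
store.put(["assistant-1"], "/memories/AGENTS.md", { content: "Shared notes." });

const forUserA = store.get(["user-a"], "/memories/preferences.md");
const forUserB = store.get(["user-b"], "/memories/preferences.md");
```

The path is the same for both users. The namespace is the only thing keeping their files apart, which is why it is the decision that counts.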
The six dimensions of memory
Beyond paths and scope, it helps to think about memory along a few axes. The docs lay them out and they map cleanly onto real decisions:
| Dimension | The question | Options |
|---|---|---|
| Duration | How long does it last? | Short-term (one conversation) or long-term (across them) |
| Information type | What kind of thing is it? | Episodic (experiences), procedural (skills), semantic (facts) |
| Scope | Who can see and change it? | User, agent, or organization |
| Update strategy | When is it written? | During the conversation, or between conversations |
| Retrieval | How is it read? | Loaded into the prompt, or on demand |
| Permissions | Can the agent write? | Read-write (default) or read-only |
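One way to keep these axes straight is to write each memory file down as a small descriptor. The sketch below does that with a hypothetical `MemoryProfile` type. It is a planning aid of my own, not a DeepAgents type, and the `needsWriteReview` rule is just one reasonable reading of the table: shared memory the agent can write deserves a second look.

```typescript
// Hypothetical descriptor for planning purposes; not part of DeepAgents.
interface MemoryProfile {
  duration: "short-term" | "long-term";
  informationType: "episodic" | "procedural" | "semantic";
  scope: "user" | "agent" | "organization";
  updateStrategy: "during-conversation" | "between-conversations";
  retrieval: "loaded-into-prompt" | "on-demand";
  permissions: "read-write" | "read-only";
}

// Per-user preferences written on the hot path.
const userPreferences: MemoryProfile = {
  duration: "long-term",
  informationType: "semantic",
  scope: "user",
  updateStrategy: "during-conversation",
  retrieval: "loaded-into-prompt",
  permissions: "read-write",
};

// One AGENTS.md shared by every user of the assistant.
const sharedAgentNotes: MemoryProfile = { ...userPreferences, scope: "agent" };

// Flag memory that more than one user reads and the agent can write.
function needsWriteReview(profile: MemoryProfile): boolean {
  return profile.scope !== "user" && profile.permissions === "read-write";
}
```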
Most of what people call "memory" is semantic: facts and preferences in files like AGENTS.md. Procedural memory is skills (covered in the skills post). Episodic memory is the interesting one.
Episodic memory: remembering how, not just what
Episodic memory stores past experiences: what happened, in what order, and how it turned out. Semantic memory tells the agent "this user prefers TypeScript." Episodic memory lets it recall how it solved a problem last time.
DeepAgents already persists every conversation as a checkpointed thread, which is the mechanism episodic memory needs. To make past conversations searchable, you wrap thread search in a tool:
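```typescript
import { Client } from "@langchain/langgraph-sdk";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const client = new Client({ apiUrl: "<DEPLOYMENT_URL>" });

const searchPastConversations = tool(
  async ({ query }, runtime) => {
    // Identity comes from the runtime, never from model-supplied arguments.
    const userId = runtime.serverInfo.user.identity;
    // Assumes threads were created with `userId` in their metadata.
    // `query` is not used to filter here: this returns up to five of the
    // user's threads and leaves relevance to the model.
    const threads = await client.threads.search({
      metadata: { userId },
      limit: 5,
    });
    const results = [];
    for (const thread of threads) {
      // The SDK's thread objects expose the ID as `thread_id`.
      const history = await client.threads.getHistory(thread.thread_id);
      results.push(history);
    }
    return JSON.stringify(results);
  },
  {
    name: "search_past_conversations",
    description: "Search past conversations for relevant context.",
    // The handler destructures `query`, so the tool needs an input schema.
    schema: z.object({
      query: z.string().describe("What to look for in past conversations."),
    }),
  }
);
```

The concrete payoff: a coding agent can look back at a past debugging session and skip straight to the likely root cause instead of rediscovering it. You pull userId from runtime context rather than passing it as a parameter, so the model cannot ask for someone else's history.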
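The tool above accepts a query but only filters by user, so everything it fetches goes back to the model. If that gets noisy, one option is to rank the fetched threads before returning them. Here is a minimal keyword-overlap sketch. The `PastThread` shape and `rankThreads` helper are hypothetical, real thread history is richer than a list of strings, and a production version would more likely use embeddings than word matching.

```typescript
// Hypothetical, simplified shape: real thread history carries more structure.
interface PastThread {
  threadId: string;
  messages: string[];
}

// Lowercase, split on non-alphanumerics, and drop very short tokens.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 2);
}

// Score = number of distinct query terms that appear anywhere in the thread.
function rankThreads(query: string, threads: PastThread[], limit = 3): PastThread[] {
  const terms = Array.from(new Set(tokenize(query)));
  const scored = threads.map((thread) => {
    const words = new Set(tokenize(thread.messages.join(" ")));
    const score = terms.filter((term) => words.has(term)).length;
    return { thread, score };
  });
  return scored
    .filter((entry) => entry.score > 0) // drop threads with no overlap
    .sort((a, b) => b.score - a.score) // best match first
    .slice(0, limit)
    .map((entry) => entry.thread);
}

const ranked = rankThreads("webpack cache error", [
  { threadId: "t1", messages: ["The build fails with a webpack cache error", "Clearing the cache fixed it"] },
  { threadId: "t2", messages: ["Plan a trip to Lisbon"] },
  { threadId: "t3", messages: ["A webpack config question"] },
]);
```

Here `ranked` keeps `t1` and `t3`, in that order, and drops the unrelated thread.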
Background consolidation: writing memory off the hot path
By default the agent writes memory mid-conversation. That is simple and immediate, but it adds latency and makes the agent multitask. The alternative is to consolidate between conversations with a separate agent, sometimes called sleep-time compute.
| Approach | Pros | Cons |
|---|---|---|
| Hot path (during) | Available immediately, transparent | Adds latency, agent multitasks |
| Background (between) | No user-facing latency, can synthesize across many conversations | Not available until next conversation, needs a second agent |
For most apps the hot path is fine. Add background consolidation when latency matters or when memory quality across many sessions matters more than immediacy. The pattern is a consolidation agent (itself a deep agent) that reads recent history, extracts key facts, merges them into the store, and runs on a cron:
```typescript
import { Client } from "@langchain/langgraph-sdk";

const client = new Client({ apiUrl: "<DEPLOYMENT_URL>" });

const cronJob = await client.crons.create(
  "consolidation_agent",
  {
    schedule: "0 */6 * * *", // every 6 hours, UTC
    input: { messages: [{ role: "user", content: "Consolidate recent memories." }] },
  },
);
```

Keep the cron interval and the consolidation agent's lookback window in sync. If the agent looks back 6 hours, run it every 6 hours. Run it more often and you reprocess the same conversations. Run it less often and you drop memories that fall outside the window. Pick a cadence that tracks real usage; consolidating far more often than people actually talk to the agent just burns tokens on no-op runs.
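The interval-versus-lookback rule is simple arithmetic, and it is easy to get backwards, so here is a small sketch of it. The helper names are hypothetical. Nothing here calls the cron API; it only checks whether a given pairing reprocesses history, skips it, or lines up.

```typescript
type Coverage = "exact" | "overlap" | "gap";

// Compare how often the job runs with how far back each run reads.
function compareWindow(intervalHours: number, lookbackHours: number): Coverage {
  if (lookbackHours === intervalHours) return "exact";
  return lookbackHours > intervalHours ? "overlap" : "gap";
}

// Hours per run that are either reprocessed (overlap) or never read (gap).
function mismatchHours(intervalHours: number, lookbackHours: number): number {
  return Math.abs(lookbackHours - intervalHours);
}

// Earliest timestamp a run starting at `now` should read from.
function windowStart(now: Date, lookbackHours: number): Date {
  return new Date(now.getTime() - lookbackHours * 60 * 60 * 1000);
}

const everySixLookbackSix = compareWindow(6, 6); // "exact"
const everyThreeLookbackSix = compareWindow(3, 6); // "overlap": 3 hours reprocessed per run
const everyTwelveLookbackSix = compareWindow(12, 6); // "gap": 6 hours never read
```

A more robust design stores the timestamp of the last successful run and reads from there, so a skipped or delayed run does not silently drop a window. That is a variation on the pattern, not something the setup above does for you.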
The security part you cannot skip
Shared memory is a prompt-injection vector. If one user can write to memory that another user reads, a malicious user can plant instructions in shared state. Three mitigations, in order:
- Default to user scope (`userId`) unless you have a real reason to share.
- Make shared policies read-only. Populate organization memory from application code, not the agent, and use policy hooks on the backend to reject writes to those paths.
- Put human-in-the-loop validation in front of writes to sensitive shared paths. An interrupt that requires approval before the agent writes to shared memory closes the loop.
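The second and third mitigations boil down to a decision made before any write goes through. The sketch below shows that decision as a plain function. `WritePolicy` and `decideWrite` are hypothetical names of mine; DeepAgents' actual policy-hook and interrupt APIs are not shown here, so treat this as the logic you would wire into them rather than the wiring itself.

```typescript
// Hypothetical guard logic; not the DeepAgents policy-hook API.
type WriteDecision = "allow" | "deny" | "needs-approval";

interface WritePolicy {
  readOnlyPrefixes: string[]; // populated only by application code
  approvalPrefixes: string[]; // shared paths a human must sign off on
}

// Assumes `path` is already normalized (no ".." segments).
function decideWrite(path: string, policy: WritePolicy): WriteDecision {
  const under = (prefixes: string[]): boolean =>
    prefixes.some((prefix) => path.startsWith(prefix));
  if (under(policy.readOnlyPrefixes)) return "deny";
  if (under(policy.approvalPrefixes)) return "needs-approval";
  return "allow";
}

const policy: WritePolicy = {
  readOnlyPrefixes: ["/policies/"],
  approvalPrefixes: ["/shared/"],
};

const policyWrite = decideWrite("/policies/compliance.md", policy); // "deny"
const sharedWrite = decideWrite("/shared/team-notes.md", policy); // "needs-approval"
const userWrite = decideWrite("/memories/preferences.md", policy); // "allow"
```

Read-only is checked first on purpose: a path that matches both lists should be rejected outright, not sent to a human who might approve it.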
Organization-level memory follows the user-scoped pattern but namespaces by org, and it is typically read-only for exactly this reason:
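```typescript
import { createDeepAgent, CompositeBackend, StateBackend, StoreBackend } from "deepagents";

const agent = createDeepAgent({
  memory: ["/memories/preferences.md", "/policies/compliance.md"],
  backend: new CompositeBackend(
    new StateBackend(),
    {
      // Per-user, agent-writable memory.
      "/memories/": new StoreBackend({
        namespace: (ctx) => [ctx.runtime.context.userId],
      }),
      // Per-organization policy. This config alone does not make it
      // read-only: seed it from application code and reject agent writes
      // with a backend policy hook.
      "/policies/": new StoreBackend({
        namespace: (ctx) => [ctx.runtime.context.orgId],
      }),
    },
  ),
});
```

Two operational notes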
Concurrent writes to the same file can cause last-write-wins conflicts. For user-scoped memory this is rare, since a user usually has one active conversation. For agent or org scope, either serialize writes through background consolidation or split memory into separate files per topic to cut contention. In practice a single lost write is rarely fatal; the model usually retries or recovers.
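To see why last-write-wins bites on shared files, here is a toy simulation of two conversations updating the same memory file, followed by the serialized alternative. Both functions are hypothetical illustrations. Real writes go through the backend, and "serializing" in practice means funnelling updates through one writer such as the consolidation agent.

```typescript
// Unsafe: both writers read the same snapshot, then overwrite each other.
function lastWriteWins(initial: string): string {
  const shared = { content: initial };
  const snapshotA = shared.content; // conversation A reads
  const snapshotB = shared.content; // conversation B reads the same version
  shared.content = snapshotA + "- Deploys on Fridays\n"; // A writes
  shared.content = snapshotB + "- Prefers tabs\n"; // B writes, A's line is lost
  return shared.content;
}

// Safer: queue updates and apply each one against the latest content.
type Update = (current: string) => string;

function applySerially(initial: string, updates: Update[]): string {
  return updates.reduce((current, update) => update(current), initial);
}

const base = "- Uses pnpm\n";
const clobbered = lastWriteWins(base);
const merged = applySerially(base, [
  (current) => current + "- Deploys on Fridays\n",
  (current) => current + "- Prefers tabs\n",
]);
```

`clobbered` ends up with only the second writer's line, while `merged` keeps both. Splitting memory into per-topic files attacks the same problem from the other side by making it less likely that two writers touch the same file at all.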
And if you run multiple agents in one deployment, add assistantId to the namespace so each agent gets its own memory:
```typescript
new StoreBackend({
  namespace: (ctx) => [
    ctx.runtime.serverInfo.assistantId,
    ctx.runtime.context.userId,
  ],
})
```

Use LangSmith tracing to audit what the agent writes. Every file write shows up as a tool call, so you can see exactly what landed in memory and when.
Where I would start
Start with user-scoped, read-write, hot-path memory and one AGENTS.md per user. It is the simplest thing that is also safe. Add episodic search when the agent does complex multi-step work and would benefit from recalling past sessions. Add background consolidation only when latency or cross-session quality forces your hand. Reach for read-only org memory the moment you have shared policy that the agent must follow but must not edit.
Next in the series: skills in DeepAgents, how to give an agent deep capabilities without paying for them on every turn.

Folarin Akinloye is an AI Engineer based in London, UK. He builds production-ready agentic AI systems, multi-agent architectures, and sophisticated RAG implementations, and writes about the engineering decisions behind them.