
OpenClaw Memory: How Persistent Context Actually Works (and What Breaks It)

Three layers, three failure modes. A grounded look at what OpenClaw memory actually is, why it forgets, and what hosting changes about the failure pattern.


Ask anyone running OpenClaw what frustrates them most, and "memory" shows up faster than any other word. The assistant forgets what you told it yesterday. It loses the thread mid-task. It re-asks for files it already read. The complaint is real, but the diagnosis is usually wrong.

OpenClaw does not have a memory system in the way most people mean. It has three loosely coupled mechanisms that together produce something that sometimes feels like memory — and sometimes very much does not. Understanding which layer is failing, and why, is the difference between a tool that helps and a tool you keep apologizing for.

What "memory" actually means inside OpenClaw

Memory is whatever the model can act on right now, which is a narrower thing than whatever you have stored on disk. At any given moment OpenClaw is working with a context window — a finite chunk of tokens that includes your current message, the recent conversation, the contents of any files it has read, and a system prompt. Anything outside that window does not exist to the model unless something pulls it back in.

This is where the first expectation breaks. Users picture memory as a database the assistant queries. The model treats it as a working surface. A note sitting on disk is not memory until something loads it into the window.
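
To make the distinction concrete, here is a minimal Python sketch of a context window as a token budget. Everything in it is hypothetical: the `ContextWindow` class, the 50-token budget, and the rough four-characters-per-token estimate are illustrative assumptions, not OpenClaw internals.

```python
from dataclasses import dataclass, field


@dataclass
class ContextWindow:
    """Toy model of a context window with a fixed token budget (hypothetical)."""
    budget: int
    items: list[tuple[str, str]] = field(default_factory=list)  # (label, text)

    @staticmethod
    def estimate_tokens(text: str) -> int:
        # Crude assumption: roughly 4 characters per token. Real tokenizers differ.
        return max(1, len(text) // 4)

    def used(self) -> int:
        return sum(self.estimate_tokens(text) for _, text in self.items)

    def load(self, label: str, text: str) -> bool:
        """Add text if it fits; return False when the budget would be exceeded."""
        if self.used() + self.estimate_tokens(text) > self.budget:
            return False
        self.items.append((label, text))
        return True

    def visible(self, label: str) -> bool:
        """True only if something with this label was loaded into the window."""
        return any(name == label for name, _ in self.items)


window = ContextWindow(budget=50)
window.load("system_prompt", "You are a helpful assistant.")
window.load("user_message", "What did we decide about the schema?")

# The note exists on disk, but nothing has loaded it yet.
note_on_disk = "Decision: drop the legacy_id column."
print(window.visible("note"))  # False

# Only after an explicit load does it become something the model can act on.
window.load("note", note_on_disk)
print(window.visible("note"))  # True
```

The storage question (is the note on disk?) and the visibility question (is it in the window?) are answered by different code paths, which is why they fail independently.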

The three layers that produce persistent context

Persistent context comes from three distinct layers that operate on different timescales and have different rules for loading and writing.

Session context is the active window. It holds the current conversation, the files OpenClaw has opened, and the reasoning it has produced in this exact session. It is ephemeral. When the session ends, it is gone unless something else captured it first.

Project files are conventions and notes stored on disk — most commonly the CLAUDE.md at the root of a project, plus any nested files in subdirectories. These are read once when the session starts and then live inside the context window for the rest of that session.

The persistent store is everything else: notes you wrote in earlier sessions, structured knowledge graphs, scratchpads, vault entries. It only enters the active window when something — usually a tool call, a search, or an explicit instruction — pulls a snippet forward.

Each layer has its own write path, its own read trigger, and its own characteristic way of breaking. Conflating them is what makes troubleshooting memory feel like chasing smoke.

Where each layer fails

Each layer breaks for a different reason, and the failures rarely surface as errors — they show up as the assistant quietly doing the wrong thing.

Session context fails when the window fills. The model has a hard token budget. As the conversation grows, older turns get compressed into shorter summaries, and eventually evicted altogether. Compression flattens specifics into generic descriptions, so the assistant remembers that you "discussed the database schema" but not which columns you decided to drop. Tool output, especially file reads, eats this budget faster than people expect.
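
The compress-then-evict pattern can be sketched in a few lines of Python. This is an illustration under stated assumptions, not OpenClaw's actual compaction logic: `fit_to_budget`, `summarize`, the token estimate, and the budgets are all hypothetical, and the "summary" is a simple truncation standing in for a model-written one.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def total_tokens(turns: list[str]) -> int:
    return sum(estimate_tokens(t) for t in turns)


def summarize(text: str, keep_chars: int = 32) -> str:
    # Stand-in for a model-written summary: keep only the opening words.
    short = "[summary] " + text[:keep_chars]
    return short if len(short) < len(text) else text


def fit_to_budget(turns: list[str], budget: int) -> list[str]:
    """Compress the oldest turns first; evict them if that is still not enough."""
    window = list(turns)
    for i in range(len(window)):
        if total_tokens(window) <= budget:
            break
        window[i] = summarize(window[i])
    while window and total_tokens(window) > budget:
        window.pop(0)  # eviction: the oldest turn disappears entirely
    return window


turns = [
    "We discussed the database schema and decided to drop the legacy_id and tmp_flag columns.",
    "-- schema.sql --\n" + "col INT,\n" * 40,  # a file read, by far the largest item
    "Now write the migration for that change.",
]

compressed = fit_to_budget(turns, budget=120)
print(compressed[0])  # [summary] We discussed the database schema

evicted = fit_to_budget(turns, budget=15)
print(len(evicted))  # 1
```

In the first call the file read consumes most of the budget, so the oldest turn is the one that gets flattened: the summary still says the schema was discussed, but the column names are gone. In the second call the budget is tight enough that only the latest message survives.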

Project files fail when they go stale or sprawl. A CLAUDE.md that worked six months ago can quietly contradict the current state of the codebase. Nested instructions from subdirectory files can contradict each other. And there is a soft ceiling beyond which the file is still read in full but stops doing useful work, because the few lines relevant to the current task are buried among instructions that are not.
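
A small audit script can surface candidates for review. The sketch below is hypothetical: `audit_project_files` is an illustrative name, and the 300-line and 180-day thresholds are arbitrary assumptions rather than documented limits. Modification time is only a rough proxy; it cannot tell you whether a file actually contradicts the codebase.

```python
import time
from pathlib import Path


def audit_project_files(root, max_lines=300, max_age_days=180, name="CLAUDE.md"):
    """List every project file under root with its size, age, and review flags."""
    root = Path(root)
    now = time.time()
    findings = []
    for path in sorted(root.rglob(name)):  # includes root/CLAUDE.md and nested copies
        lines = len(path.read_text(encoding="utf-8").splitlines())
        age_days = (now - path.stat().st_mtime) / 86400
        flags = []
        if lines > max_lines:
            flags.append("sprawl")
        if age_days > max_age_days:
            flags.append("possibly stale")
        findings.append({
            "path": path.relative_to(root).as_posix(),
            "lines": lines,
            "age_days": round(age_days),
            "flags": flags,
        })
    return findings
```

Running `audit_project_files(".")` from a project root returns one entry per file. More than one entry is itself a prompt to check whether the nested files disagree with each other.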

The persistent store fails in two opposing ways. The first is that nothing writes to it: the model finishes a useful session, the user closes the laptop, and no record of the decisions survives. The second is that retrieval misses: the store grows large enough that searches return stale or off-topic snippets instead of the relevant one, and the model has no way to know what it should have surfaced.
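
Both failure modes show up in a toy version of a persistent store. The sketch below assumes a JSON Lines file and naive keyword-overlap search; `write_back`, `retrieve`, and the record fields are hypothetical names chosen for illustration, and real stores typically use more capable retrieval.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_back(store: Path, topic: str, decision: str) -> None:
    """Append one decision record. If nothing calls this, nothing survives."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "topic": topic,
        "decision": decision,
    }
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def retrieve(store: Path, query: str, k: int = 1) -> list[dict]:
    """Return up to k records that share the most words with the query."""
    if not store.exists():
        return []
    terms = set(query.lower().split())
    scored = []
    with store.open(encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            words = set(f"{record['topic']} {record['decision']}".lower().split())
            score = len(terms & words)
            if score:
                scored.append((score, record))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:k]]
```

With a single record written as `write_back(store, "database schema", "drop legacy_id column")`, the query `"schema columns"` finds it, while `"table layout"` returns nothing even though it asks about the same decision. The store is intact in both cases; the second is a pure retrieval miss.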

A quick map of the three layers

Here is the same picture in table form, in case it helps to see the asymmetries side by side.

| Layer | Where it lives | Loaded when | How it breaks |
| --- | --- | --- | --- |
| Session context | Active context window | Always present during the session | Token limits, compression, eviction |
| Project files | Disk (CLAUDE.md, conventions) | At session start | Stale content, conflicting instructions, sprawl |
| Persistent store | Disk (notes, vault, knowledge graph) | Only when explicitly retrieved | Missed write-backs, retrieval misses |

Why a local setup tends to have all three problems at once

Local installs concentrate every failure mode onto a single machine that is rarely set up to manage them. A laptop sleeps. It restarts. It is shared with browser tabs, video calls, and a dozen other applications competing for RAM and disk. Whatever maintenance the persistent store needs (periodic compaction, indexing, write-back of finished sessions) has to happen on a machine that may not be powered on when the scheduled job is supposed to fire.

Sync across devices makes this sharper. If your CLAUDE.md and notes vault live on iCloud or Dropbox, two machines can drift in opposite directions and neither will warn you. A useful note written on the laptop yesterday may not be on the desktop today, depending on which way the sync arrows happen to be pointing.
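
Drift of this kind is detectable if you compare content hashes rather than trusting the sync client. The following is a minimal sketch, assuming each machine can produce a snapshot of its own copy of the vault; `snapshot` and `drift` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 digest of its contents."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def drift(snap_a: dict[str, str], snap_b: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing on either side and files whose contents differ."""
    shared = snap_a.keys() & snap_b.keys()
    return {
        "only_in_a": sorted(snap_a.keys() - snap_b.keys()),
        "only_in_b": sorted(snap_b.keys() - snap_a.keys()),
        "differs": sorted(p for p in shared if snap_a[p] != snap_b[p]),
    }
```

A snapshot is a plain dictionary, so it can be saved as JSON on one machine and compared on the other. An empty report means the two copies agree; anything else names the exact files that drifted.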

The point is not that local setups are doomed. It is that the three layers each demand a habit, and the habits compound — when one slips, the next one usually slips with it.

What managed hosting changes about the failure modes

Managed hosting does not invent a new kind of memory; it stabilizes the conditions under which the three existing layers operate. The persistent store lives on infrastructure that stays online between your sessions, so scheduled maintenance — compaction, indexing, write-back — actually runs on the cadence it was configured for. The filesystem is a single source of truth across every device you connect from, so there is no sync drift between laptops.

Clowdbot runs OpenClaw as a managed service for this exact set of reasons. The session layer still has a context window and still compresses under load — no host can change that. But the project and persistent layers stop quietly degrading between sessions, because nothing they depend on is going to sleep when you close your laptop. We covered the broader trade-offs in what running OpenClaw locally actually means for your machine, and the cost and operational differences in our comparison of hosting models.

When local memory is the right call

Hosted memory is not always the better answer. If your work is confined to a single machine, with a single user, and a project small enough that a 300-line CLAUDE.md covers everything you need OpenClaw to know, the gains from hosting are marginal. If your context contains data that should not leave the device, whether because the environment is air-gapped, the data is regulated, or it is simply sensitive, local is the correct default, and the discipline of vault hygiene becomes a security feature rather than a chore. For a broader view of what OpenClaw can and cannot do regardless of where it runs, see our honest capabilities guide, and for the security implications of where memory physically lives, the security overview.

Frequently asked questions

Does OpenClaw remember conversations forever?

No. By default, a conversation lives only as long as the session that hosts it. Anything you want OpenClaw to recall in a future session has to be written somewhere (a project file, a notes vault, a knowledge graph) and then loaded back in: project files at session start, everything else when something retrieves it.

Why does my assistant forget things I told it earlier in the same conversation?

That is the session-context layer hitting its token ceiling. As newer turns and file reads fill the window, older content gets compressed into summaries and eventually evicted. The information is not lost from disk if it was ever written there, but the model can no longer see it without re-loading.

Can I move memory between a local and a hosted setup?

Yes. Project files and notes are plain files on disk; moving them is a matter of copying a directory. What changes is which machine is responsible for keeping the persistent store healthy over time.
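
As a rough illustration of the copy step, the sketch below stages a memory directory with Python's standard library. The `export_memory` name and the paths are hypothetical, and how the staged copy reaches a hosted environment depends on the provider, so that part is left out.

```python
import shutil
from pathlib import Path


def export_memory(src: Path, dst: Path) -> int:
    """Copy a memory directory to dst and return how many files dst now holds."""
    # dirs_exist_ok (Python 3.8+) lets the copy merge into an existing directory.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sum(1 for p in dst.rglob("*") if p.is_file())
```

Calling `export_memory(Path("my-project/notes"), Path("staging/notes"))` would copy the notes tree as is. Comparing the returned count against the source is a cheap sanity check before deleting anything.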

Does compression mean I permanently lose information?

Only from the context window. If the underlying notes were written to a project file or a persistent store, they remain on disk and can be loaded into a future session. Compression is a windowing problem, not a deletion event.

How big can a CLAUDE.md file get before it stops being useful?

There is no hard limit, but practical usefulness drops well before file size becomes the problem. A CLAUDE.md that tries to cover everything tends to dilute the signal the model uses to prioritize. Tighter, well-structured project files outperform sprawling ones.

Memory in OpenClaw is not a single feature. It is the product of three layers, each with its own discipline. Once you can name which layer is failing, the fix usually presents itself — and a hosted setup mostly helps because it removes the failure modes that come from running maintenance on a machine that is also your laptop.