Why Your AI Agent Gets Dumber Over Time: OpenClaw Three-Layer Memory in Practice
The Problem: 917 Lines of Warning
In February 2026, we ran a memory health check on our team's 4 agents. The results were sobering:
| Agent | Before | After | Reduction (by lines) |
|---|---|---|---|
| Agent-1 | 917 lines / 30KB | 88 lines / 2KB | 90% |
| Strategist | 348 lines / 14KB | 73 lines / 3KB | 79% |
| Tech Advisor | 121 lines / 3KB | — | didn't need it |
| Creative | 43 lines / 4KB | — | didn't need it |
What does 917 lines mean in practice? If the agent's context window is 128K tokens, a 917-line MEMORY.md eats roughly 1/4 of it. That leaves dramatically less room for actual tasks, reasoning, and responses.
The symptoms are easy to spot: slower responses, worse instruction-following, and an agent that gets dumber the more it remembers.
Root Cause: AI Doesn't Forget on Its Own
The human brain has a natural forgetting curve — unimportant things fade automatically. AI doesn't.
Every memory feels "important" when it's written. Three days later it might be completely obsolete. Without a pruning mechanism, bloat is inevitable. Not if — when.
The typical path:
Did something → "write it down" → MEMORY.md +5 lines
Did something else → "write that too" → +5 lines
Project ended → nobody cleaned up → historical details occupy space forever
Three months later → 300+ lines → agent burns a growing share of its context window just reading memory
Core insight: AI won't forget on its own, so you have to teach it to forget.
The Solution: Three-Layer Architecture
Layer 1: Hot Memory — MEMORY.md (≤150 lines)
What it is: The "factory settings" the agent reads every time it wakes up.
What goes here:
- Currently active rules
- Team roles and responsibilities
- SOPs (standard operating procedures)
- Key parameters
Hard limit: ≤150 lines. Not a suggestion — a constraint.
Analogy: Muscle memory — things you know without having to think.
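A hard limit is easiest to keep when something checks it. The following is a minimal sketch, not part of OpenClaw: the function names and the assumption that MEMORY.md sits in the current directory are illustrative only.

```python
from pathlib import Path

HOT_MEMORY_LIMIT = 150  # the hard limit for hot memory stated above


def count_lines(text: str) -> int:
    """Count lines as an editor would; a trailing newline adds no extra line."""
    return len(text.splitlines())


def check_hot_memory(path: Path, limit: int = HOT_MEMORY_LIMIT) -> tuple[bool, int]:
    """Return (within_limit, line_count) for the file at `path`."""
    n = count_lines(path.read_text(encoding="utf-8"))
    return n <= limit, n


if __name__ == "__main__":
    memory_file = Path("MEMORY.md")  # assumed location, adjust to your workspace
    if memory_file.exists():
        ok, n = check_hot_memory(memory_file)
        status = "OK" if ok else "over limit, prune"
        print(f"{memory_file}: {n} lines ({status})")
```

Running a check like this on a schedule or before each session is one way to turn the limit from a suggestion into a constraint.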
What good hot memory looks like:
# MEMORY.md - Agent-1
## Core Principles
- servasyy speaks → current task pauses
- If it can be solved in 30 seconds, don't overthink it
## Team Roles
- Agent-1: coordination
- Strategist: strategy design
- Tech Advisor: code implementation
## Collaboration Rules
- @Agent-1 = I respond
- Important info goes to memory/
Under 30 lines, but contains everything that would cause mistakes if missed.
Common mistakes:
❌ Putting project details in hot memory:
# Wrong
- Project X client is Mr. Wang, phone 138-xxxx-xxxx
- Last bug was in user_service.py line 43
This was relevant once. In three months the project may not exist. Put it in cold memory.
❌ Putting daily tasks in hot memory:
# Wrong
- Need to finish xxx feature today
- Meeting at 3pm
These are one-time items. They belong in the raw log, not hot memory.
Before vs. after:
Before (217 lines):
# Experience from 2024-11
- That bug was finally fixed, the cause was...
- Client Mr. Wang said...
- Conclusion from discussion with Zhang San...
(200+ lines of this)
After (88 lines):
# MEMORY.md - Agent-1
## Core Rules
- servasyy speaks > current task
- Coordinator doesn't write code
## Team Roles
- Agent-1: receive requirements, coordinate
- Strategist: strategy design
- Tech Advisor: code implementation
- Creative: content creation
## Collaboration SOP
1. Receive requirement → confirm with Strategist
2. Strategist agrees → assign tasks
3. Execute → sync status
4. Review → notify servasyy
60% smaller, all core rules intact.
Layer 2: Cold Memory — memory/archive/YYYY-MM.md
What it is: Not auto-injected. Retrieved on demand via memory_search.
What goes here:
- Historical lessons
- Completed project experience
- Outdated but potentially useful decisions
Organization: Monthly archives with descriptive titles for searchability.
Analogy: An experience library — you look things up when needed, you don't carry it all around.
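To make "retrieved on demand" concrete, here is a rough stand-in for a lookup over the archive files. It is a plain case-insensitive keyword scan written for illustration; it is not the memory_search tool itself, whose interface is not shown in this article, and the function name is hypothetical.

```python
from pathlib import Path


def search_archive(archive_dir: Path, keyword: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each archive line containing `keyword`.

    Illustrative keyword scan only; a real memory search may rank or embed text.
    """
    needle = keyword.lower()
    hits = []
    for path in sorted(archive_dir.glob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((path.name, lineno, line.strip()))
    return hits
```

Even with a scan this simple, results are only as good as the words in the archive, which is why descriptive titles matter.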
Common mistakes:
❌ Dumping everything into hot memory because "it's already written, might as well keep it"
❌ Archiving to cold memory and never searching it — what's the point of archiving if you never retrieve?
❌ Vague titles:
# 2026-02.md ← wrong
# 2026-02 Project Retrospective and Decisions ← right
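One way to keep titles descriptive is to make the archive helper refuse entries without one. This is a sketch under assumptions: the helper names are hypothetical, and the per-entry heading layout (date plus title) is one possible convention rather than a required format.

```python
from datetime import date
from pathlib import Path


def archive_path(root: Path, day: date) -> Path:
    """Build memory/archive/YYYY-MM.md for the month containing `day`."""
    return root / "archive" / f"{day:%Y-%m}.md"


def append_to_archive(root: Path, day: date, title: str, body: str) -> Path:
    """Append one titled entry to the monthly archive, creating it if needed."""
    if not title.strip():
        raise ValueError("archive entries need a descriptive title")
    path = archive_path(root, day)
    path.parent.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    with path.open("a", encoding="utf-8") as f:
        f.write(f"## {day.isoformat()} {title.strip()}\n{body.strip()}\n\n")
    return path
```

For example, `append_to_archive(Path("memory"), date(2026, 2, 12), "Project Retrospective and Decisions", "...")` would append to memory/archive/2026-02.md.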
Layer 3: Raw Log — memory/YYYY-MM-DD.md
What it is: Daily stream of events, for retrospective review.
What goes here: Everything that happened today, raw.
Don't archive to cold memory unless there's a lesson worth extracting.
Analogy: A dashcam — you don't watch it daily, but you replay it when something goes wrong.
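A raw log only works as a dashcam if writing to it is cheap. The sketch below appends one timestamped bullet per event to memory/YYYY-MM-DD.md; the function names and the bullet format are assumptions made for illustration.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def raw_log_path(root: Path, now: datetime) -> Path:
    """Build memory/YYYY-MM-DD.md for the day containing `now`."""
    return root / f"{now:%Y-%m-%d}.md"


def log_event(root: Path, text: str, now: Optional[datetime] = None) -> Path:
    """Append a single key-point line to today's raw log (append-only)."""
    now = now or datetime.now()
    path = raw_log_path(root, now)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {now:%H:%M} {text.strip()}\n")
    return path
```

Opening the file in append mode means earlier entries are never rewritten, which keeps the log usable as an audit trail.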
When to archive to cold memory:
- There's a lesson worth remembering (e.g., a mistake and its root cause)
- There's a decision that needs to be traceable (e.g., why you chose option A)
- There's reusable project experience
Everything else stays in the raw log.
Common mistakes:
❌ Deleting raw logs the same day — don't. Logs are your safety net.
❌ Writing too much detail in raw logs — nobody wants to read an essay. Record key points.
❌ Trying to promote everything to hot memory — wait for the monthly retrospective to decide what's worth upgrading.
The Three-Question Decision Framework
When deciding where a memory belongs, ask three questions:
Q1: If I don't read this on next startup, will I make a mistake? Yes → P0, keep in hot memory. Examples: current team roles, review SOPs, hard rules.
Q2: Might I need to look this up someday? Maybe → P1, archive to cold memory. Examples: last month's project lessons, root cause analysis of a failure.
Q3: Neither of the above? Leave it in the raw log. No extra processing needed. Examples: conversation records, temporary debugging sessions.
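The three questions reduce to a small decision function. In this sketch the two booleans are the answers to Q1 and Q2, which a human or the agent still has to supply; the `Tier` enum and `classify` name are illustrative.

```python
from enum import Enum


class Tier(Enum):
    HOT = "P0: keep in MEMORY.md"
    COLD = "P1: archive to memory/archive/YYYY-MM.md"
    RAW = "leave in the raw log"


def classify(mistake_if_unread: bool, may_need_lookup: bool) -> Tier:
    """Apply Q1, then Q2, then fall through to Q3."""
    if mistake_if_unread:   # Q1: skipping it on startup would cause a mistake
        return Tier.HOT
    if may_need_lookup:     # Q2: might be looked up someday
        return Tier.COLD
    return Tier.RAW         # Q3: neither, no extra processing


# Examples taken from the questions above
print(classify(True, True))    # current team roles
print(classify(False, True))   # last month's project lessons
print(classify(False, False))  # temporary debugging session
```

Note the order: Q1 wins even when Q2 is also true, so a rule that is both critical and searchable stays in hot memory.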
Pruning Steps
- Read the full MEMORY.md
- Apply the three questions to each entry
- P0 stays, P1 moves to memory/archive/YYYY-MM.md
- Verify: pruned MEMORY.md ≤ 150 lines
Note: create the archive directory first if it doesn't exist: mkdir -p memory/archive
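The pruning steps can be sketched as a single pass over the entries. This is an illustration under assumptions: entries are treated as already split into strings, `decide` is a hypothetical callback that returns "P0" or "P1" after applying the three questions, and anything else is returned separately rather than silently discarded.

```python
from typing import Callable, Iterable

HOT_MEMORY_LIMIT = 150


def prune(
    entries: Iterable[str],
    decide: Callable[[str], str],
    limit: int = HOT_MEMORY_LIMIT,
) -> tuple[list[str], list[str], list[str]]:
    """Split entries into (hot, cold, rest) and verify the hot-memory limit.

    hot  -> stays in MEMORY.md (P0)
    cold -> moves to memory/archive/YYYY-MM.md (P1)
    rest -> neither label; left for the caller to review
    """
    hot, cold, rest = [], [], []
    for entry in entries:
        label = decide(entry)
        if label == "P0":
            hot.append(entry)
        elif label == "P1":
            cold.append(entry)
        else:
            rest.append(entry)
    if len(hot) > limit:
        raise ValueError(f"hot memory still has {len(hot)} entries, limit is {limit}")
    return hot, cold, rest
```

Raising when the limit is exceeded makes the final verify step impossible to skip: the prune either lands under the limit or fails loudly.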
Results
Quantitative:
- Strategist: per-session injection dropped from ~14KB to ~3KB (79% savings)
- Agent-1: per-session injection dropped from ~30KB to ~2KB (93% savings by size; 90% by line count)
- Significant reduction in token consumption across the team
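The percentages follow directly from the before and after figures in the health-check table. A short check, using only those reported numbers:

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after`."""
    return (before - after) / before * 100


# (before, after) pairs taken from the health-check table
sizes_kb = {"Agent-1": (30, 2), "Strategist": (14, 3)}
line_counts = {"Agent-1": (917, 88), "Strategist": (348, 73)}

for name in sizes_kb:
    by_size = reduction_pct(*sizes_kb[name])
    by_lines = reduction_pct(*line_counts[name])
    print(f"{name}: {by_size:.0f}% by size, {by_lines:.0f}% by lines")
# Agent-1: 93% by size, 90% by lines
# Strategist: 79% by size, 79% by lines
```

The two measures diverge for Agent-1 because its removed lines were longer on average than the ones that remained.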
Qualitative:
- Better instruction-following (core rules no longer buried in noise)
- Historical knowledge preserved (cold memory is always searchable)
- Automatic alerts prevent unnoticed bloat
Core Principles
- Hot memory stays lean: ≤150 lines is a hard constraint, not a guideline
- Cold memory stays active: archive regularly, keep it searchable
- Raw logs stay intact: don't delete them, they're your audit trail
Memory management isn't about "remembering more" — it's about "remembering accurately." In a finite context window, every memory competes with future tasks for space. Pruning isn't loss. It's focus.
Data source: 2026-02-12 live test, 4-agent memory health check report.