Literature review · 6 episodes

Agent Memory and Aging

Memory is inference, not retrieval

The benchmark work on memory staleness draws the field's sharpest line: having the new fact in the prompt isn't enough if nothing flags the old one as superseded, especially for 'propagated' conflicts, where common-sense reasoning has to retire a belief no one explicitly revoked. The same model can ace staleness recognition yet fail catastrophically when a question quietly assumes the stale fact, and popular memory frameworks underperform the raw model on exactly these cases; moving adjudication to write time recovers most, but not all, of the gap E031. Mechanistically, the picture is worse for small backbones: routing competence (choosing among add, update, and delete) comes online before content comprehension, producing a silent-failure regime in which a model confidently overwrites memories it doesn't understand. End-to-end benchmarks miss this because the JSON stays valid E023.
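To make the write-time idea concrete, here is a minimal sketch of adjudication at write time. The source does not describe the actual mechanism, so everything here is an assumption for illustration: the `Fact` and `MemoryStore` names, the `(subject, attribute)` slot key, and the rule that a new write retires any live fact in the same slot.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    # Index of the newer fact that replaced this one, or None if still live.
    superseded_by: Optional[int] = None


@dataclass
class MemoryStore:
    """Hypothetical store that resolves direct conflicts when a fact is written."""
    facts: List[Fact] = field(default_factory=list)

    def write(self, subject: str, attribute: str, value: str) -> int:
        new_index = len(self.facts)
        # Adjudicate now: retire any live fact occupying the same slot.
        for fact in self.facts:
            same_slot = (fact.subject, fact.attribute) == (subject, attribute)
            if same_slot and fact.superseded_by is None:
                fact.superseded_by = new_index
        self.facts.append(Fact(subject, attribute, value))
        return new_index

    def current(self) -> List[Fact]:
        # Only facts that nothing has superseded are shown to the model.
        return [f for f in self.facts if f.superseded_by is None]


store = MemoryStore()
store.write("user", "home_city", "Boston")
store.write("user", "commute", "takes the Red Line")
store.write("user", "home_city", "Denver")
print([f.value for f in store.current()])
# → ['takes the Red Line', 'Denver']
```

The leftover commute entry is the point of the example: slot matching retires the explicitly contradicted city, but the commute fact that depended on it survives. Retiring that entry needs inference over what the move implies, which is the 'propagated' case this sketch deliberately does not handle.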

Consolidation is a learnable skill

Once memory is more than a log, the interesting question is the slow loop: consolidation. Splitting agent memory into a fast writer and a slow consolidator — with forgetting as the default and a 'thief test' reward that scores entries by what masking them does to task success — produces memory banks an order of magnitude smaller at higher success, and the consolidation skill transfers zero-shot across domains and backbones E064. Graph-structured experience with a placebo-controlled training signal (run the executor with and without memory, reward only the difference) lets a 3B copilot write a playbook that improves a frozen 32B executor E106. The most radical move dispenses with the notebook entirely: agents distill experience into flashcards and train them into a small writable slice of their own weights mid-conversation — and the structure of what gets written matters more than the writing, with QA flashcards quadrupling the value of raw transcripts E114.
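Both reward signals above are ablations, and a short sketch shows their shape. This is an illustration under stated assumptions, not the reward either episode's system actually uses: `run_task` is a hypothetical callable that runs the agent with a given memory bank and returns a success score, and `toy_task` is a stand-in for it.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: run the agent with this memory, return task success.
TaskRunner = Callable[[Sequence[str]], float]


def masking_scores(entries: Sequence[str], run_task: TaskRunner) -> List[float]:
    """Score each entry by how much success drops when that entry is masked."""
    baseline = run_task(entries)
    scores = []
    for i in range(len(entries)):
        masked = [e for j, e in enumerate(entries) if j != i]
        scores.append(baseline - run_task(masked))
    return scores


def with_without_reward(entries: Sequence[str], run_task: TaskRunner) -> float:
    """Reward only the difference between running with and without memory."""
    return run_task(entries) - run_task([])


def toy_task(memory: Sequence[str]) -> float:
    # Stand-in task: succeeds only if the one useful entry is present.
    return 1.0 if "door code is 4417" in memory else 0.0


bank = ["door code is 4417", "user said hello on Monday"]
scores = masking_scores(bank, toy_task)
kept = [e for e, s in zip(bank, scores) if s > 0]  # forgetting as the default
print(scores, kept, with_without_reward(bank, toy_task))
```

In this toy run the greeting scores zero and is dropped, which is what forgetting by default looks like in practice: an entry has to show that its absence hurts before it is kept. A real scorer would average over many tasks, because a single run is noisy and redundant entries can mask each other's value.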

Frozen weights still age

Reliability is a lifespan property, not a day-one benchmark score: the memory store, retrieval, and compaction around a frozen model keep changing every session, and agents age through compression, interference, revision, and maintenance failures that look identical in error rates but require opposite repairs. A counterfactual diagnostic ladder separates write, read, and utilization failures without model internals, and a one-paragraph change to the compaction prompt — naming what must be preserved verbatim — extends useful lifespan roughly 4.5x E086. The security corollary belongs here too: persistent memory means an attack can be planted once and fire days later in someone else's session, which is taken up in the security topic E113.
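One way such a ladder could be built from black-box counterfactual runs is sketched below. The source gives no implementation details, so the rungs, their order, and every name here (`diagnose`, `retrieve`, `answer_correct`, exact-string matching of the gold fact) are assumptions for illustration.

```python
from enum import Enum
from typing import Callable, List, Sequence


class Failure(Enum):
    NONE = "none"
    WRITE = "write"
    READ = "read"
    UTILIZATION = "utilization"


def diagnose(
    gold_fact: str,
    store: Sequence[str],
    retrieve: Callable[[Sequence[str]], List[str]],
    answer_correct: Callable[[Sequence[str]], bool],
) -> Failure:
    """Classify one failed question using only black-box reruns of the agent."""
    # Rung 0: the normal pipeline, with whatever retrieval actually returns.
    if answer_correct(retrieve(store)):
        return Failure.NONE
    # Rung 1 (counterfactual): hand the agent the gold fact directly.
    # If it still fails, the problem is in using the fact, not in finding it.
    if not answer_correct([gold_fact]):
        return Failure.UTILIZATION
    # Rung 2: the agent can use the fact, so check whether it was ever stored.
    if gold_fact not in store:
        return Failure.WRITE
    # Stored and usable in isolation, so what reached the context was the
    # problem: the fact was either missing from it or drowned out.
    return Failure.READ


gold = "user lives in Denver"
uses_fact = lambda context: gold in context  # toy agent that can use the fact

print(diagnose(gold, [gold], lambda s: [], uses_fact))             # Failure.READ
print(diagnose(gold, [], lambda s: list(s), uses_fact))            # Failure.WRITE
print(diagnose(gold, [gold], lambda s: list(s), lambda c: False))  # Failure.UTILIZATION
```

The three toy calls produce the same visible symptom, a wrong answer, but the ladder assigns each a different repair target, and it does so without touching model internals.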

Episodes anchoring this topic

When Your AI Assistant Won't Let Go of Old Facts About You
Reframed agent memory from retrieval to inference, with the recognize-but-still-act-on-stale-facts gap.
When Agent Memory Stops Being a Database and Starts Being a Skill
Established consolidation as a learnable, transferable skill with forgetting as the default.
Why Frozen-Weight Agents Still Get Worse Over Time
Named the four mechanistically distinct aging modes of frozen-weight agents and the counterfactual ladder that tells them apart.
Agents That Rewrite Their Own Weights Instead of Just Taking Notes
Moved memory from prompt space into the weights themselves via mid-episode flashcard training.
Giving Agents a Notebook Instead of New Weights: How ExpGraph Lets Frozen Models Learn
Presented the placebo-controlled utility reward for experience and the result that a small copilot can improve a much larger frozen executor.
Why a Small Agent Confidently Overwrites Memories It Doesn't Understand
Reported the mechanistic finding that routing competence precedes content comprehension, which predicts silent memory corruption in small backbones.