Experience Compression Spectrum: an architectural framework unifying memory, skills, and rules in LLM agents
The Experience Compression Spectrum is a new architectural framework that positions memory, skills, and rules of LLM agents along a single axis of increasing compression, from episodic memory (5–20×) through procedural skills (50–500×) to declarative rules (1000×+). The analysis finds that existing systems operate at fixed compression levels and that the research communities working on memory and on skills do not exchange results.
This article was generated using artificial intelligence from primary sources.
What is the Experience Compression Spectrum?
A new arXiv preprint published on 17 April 2026 proposes a unified theoretical framework for how LLM agents handle experience across long-horizon, multi-session deployments. Rather than treating memory, skills, and rules as separate architectural components, authors Xing Zhang and colleagues position them along a single axis of increasing compression:
- Episodic memory — raw records of what happened, compression 5–20×
- Procedural skills — parameterised routines learned from patterns, 50–500×
- Declarative rules — general statements that hold across contexts, 1000×+
The idea is simple but powerful: all three are different levels of the same process — compressing experience into reusable knowledge. The only difference is how much context is lost for the sake of greater conciseness.
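To make the ratios concrete, here is a minimal sketch that treats a compression ratio as raw size divided by stored size and maps it onto the three bands quoted above. The function names, the use of token counts as the size measure, and the "between levels" label for ratios that fall in the gaps between bands are assumptions made for illustration; the paper may define these quantities differently.

```python
def compression_ratio(raw_tokens: int, compressed_tokens: int) -> float:
    """Raw experience size divided by stored size (assumed definition)."""
    if raw_tokens <= 0 or compressed_tokens <= 0:
        raise ValueError("token counts must be positive")
    return raw_tokens / compressed_tokens


def spectrum_level(ratio: float) -> str:
    """Map a ratio to the bands quoted in the article.

    The article gives 5-20x, 50-500x, and 1000x+. Ratios outside those
    bands are reported as 'between levels' rather than guessed.
    """
    if 5 <= ratio <= 20:
        return "episodic memory"
    if 50 <= ratio <= 500:
        return "procedural skill"
    if ratio >= 1000:
        return "declarative rule"
    return "between levels"
```

Under these assumptions, a 10,000-token interaction trace stored as a 1,000-token summary has a ratio of 10 and sits in the episodic band; the same trace reduced to a 50-token routine has a ratio of 200 (skill), and reduced to an 8-token statement has a ratio of 1,250 (rule).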
What does the analysis of existing systems reveal?
The authors identify three systemic problems:
1. Fixed compression level. Most agents operate at one point on the spectrum: some remember everything, others extract rules. But real experience is not uniform. Some things deserve detailed memory (edge cases), others deserve extreme compression (stable procedures). A system locked to one level either stores more than it needs or discards detail it later needs.
2. Memory research and skills research do not communicate. Research communities working on memory (long-term context, RAG, episode replay) and on skills (skill learning, program synthesis) do not exchange results. The authors argue that both lines of work address fundamentally the same problem, compression of experience, but have developed in silos.
3. Evaluation differs by level. How do you measure “good memory” vs. “good skill” vs. “good rule”? Each level has its own benchmarks, making it difficult to compare systems operating at different points on the spectrum.
What are the design principles for full-spectrum agents?
The paper does not propose a concrete implementation but rather principles for agents operating across the full spectrum:
- Dynamic positioning — the agent itself chooses the compression level for each experience, based on pattern frequency and confidence
- Bidirectional movement — skills can be distilled from memory, rules from skills; but also in reverse: when a rule breaks down, the agent must be able to “decompress” back to episodic detail
- Lifecycle management — rules and skills become stale as context changes; experience needs a revision mechanism, not just accumulation
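The three principles above can be sketched as a small policy. This is an illustration only: the paper proposes principles rather than an implementation, so the `PatternStats` record, the `choose_level` and `is_stale` functions, and every threshold value below are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    EPISODE = 0  # keep the raw record
    SKILL = 1    # keep a parameterised routine
    RULE = 2     # keep a general statement


@dataclass
class PatternStats:
    occurrences: int     # how often the pattern was observed
    successes: int       # how often acting on it worked
    last_seen_step: int  # agent step at which it was last observed

    @property
    def confidence(self) -> float:
        return self.successes / self.occurrences if self.occurrences else 0.0


def choose_level(stats: PatternStats,
                 skill_min: int = 3, rule_min: int = 20,
                 skill_conf: float = 0.8, rule_conf: float = 0.95) -> Level:
    """Dynamic positioning: pick a level from frequency and confidence.

    Thresholds are hypothetical. Because the level is recomputed from
    current statistics, a drop in confidence moves a pattern back down
    the spectrum (bidirectional movement).
    """
    if stats.occurrences >= rule_min and stats.confidence >= rule_conf:
        return Level.RULE
    if stats.occurrences >= skill_min and stats.confidence >= skill_conf:
        return Level.SKILL
    return Level.EPISODE


def is_stale(stats: PatternStats, current_step: int, max_age: int = 1000) -> bool:
    """Lifecycle management: flag patterns not seen for max_age steps."""
    return current_step - stats.last_seen_step > max_age
```

With these example thresholds, a pattern seen 25 times with 25 successes is stored as a rule. If five further uses fail (30 occurrences, 25 successes, confidence about 0.83), the same call returns `Level.SKILL`, which is the "decompress when a rule breaks down" direction in miniature.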
Why does this matter?
Long-horizon agents, those operating for weeks or months in the same context (customer support, tech support, personal assistants, coding), need to accumulate and reuse experience. But pure memorisation does not scale (the context window grows, costs grow), and premature compression loses information. The paper argues that compression is a spectrum, not a binary choice, and that the next generation of agents should be designed with that in mind.
Implications for builders
For teams building production agents, the message is architectural: instead of separate modules for memory, skills, and rules, consider a single memory hierarchy with mechanisms for promotion and demotion across levels. Mechanisms such as summarisation pipelines, skill extractors, and rule inducers are parts of the same system — they just operate at different compression levels.
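One way to picture a single hierarchy with promotion and demotion is a store in which every entry carries its level and a link to the entries it was compressed from. The sketch below is an assumption-laden illustration, not the paper's design: `Entry`, `ExperienceStore`, `promote`, and `demote` are hypothetical names, and the compressed text is passed in by the caller, where a real system would plug in its own summariser, skill extractor, or rule inducer.

```python
from dataclasses import dataclass, field

LEVELS = ("episode", "skill", "rule")


@dataclass(eq=False)  # compare entries by identity, not by content
class Entry:
    content: str
    level: str = "episode"
    sources: list = field(default_factory=list)  # entries this was compressed from


class ExperienceStore:
    """One store for all three levels (hypothetical design)."""

    def __init__(self):
        self.entries = []

    def add_episode(self, text: str) -> Entry:
        entry = Entry(text)
        self.entries.append(entry)
        return entry

    def promote(self, sources: list, summary: str) -> Entry:
        """Compress same-level entries into one entry one level up.

        Sources stay in the store so the result can be decompressed later.
        """
        levels = {s.level for s in sources}
        if len(levels) != 1:
            raise ValueError("sources must share exactly one level")
        idx = LEVELS.index(levels.pop())
        if idx == len(LEVELS) - 1:
            raise ValueError("rules are already the most compressed level")
        entry = Entry(summary, LEVELS[idx + 1], list(sources))
        self.entries.append(entry)
        return entry

    def demote(self, entry: Entry) -> list:
        """Retire a skill or rule that broke down; return what it came from."""
        if not entry.sources:
            raise ValueError("entry has no sources to fall back to")
        self.entries.remove(entry)
        return entry.sources
```

Keeping the source links is the design choice that makes demotion possible: when a rule fails on an edge case, the agent can fall back to the skills or episodes behind it instead of having lost them at compression time. The cost is that lower-level entries are retained, so a real system would still need a retention policy.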
The paper is a preprint without experimental results from new models — it is more of a position paper that defines a common vocabulary for the field. But that is precisely its value: teams currently building long-horizon agents can use it as a guide when designing their memory architecture.
Frequently Asked Questions
- What is the difference between memory, skills, and rules in this framework?
  Memory consists of raw episodic records (logs of what happened, 5–20× compression), skills are parameterised procedures the agent can invoke in similar situations (50–500× compression), and rules are declarative statements that hold across contexts (1000×+ compression). All three are ways of compressing experience; they differ only in the level of abstraction.
- Why does it matter that systems operate at a fixed compression level?
  Most agents today either remember everything (expensive, slower over time), compress to fixed procedures (loses specificity), or extract rules (loses context). The authors argue that true long-horizon agents need to dynamically choose a compression level (sometimes a rule, sometimes a full record) depending on the task at hand.
- What is the practical implication for AI agent builders?
  You need to design the memory system as a spectrum, not a single level. That means mechanisms for compressing episodes into skills when a pattern recurs, abstracting skills into rules when justified, and decompressing when the agent encounters an edge case where the rule breaks down.