Context rot is the phrase that stuck.
Anthropic named it last fall in a piece on context engineering for AI agents. The argument: as the window grows, the agent loses track of what mattered. Performance degrades. The fix is curation — the smallest set of high-signal tokens that gets the agent to do the right thing.
They are right. This is the runtime problem, named clearly, with a useful set of patterns: lightweight identifiers, just-in-time retrieval, structured notes, sub-agents.
Eight months later, those techniques are everywhere. Build agents long enough and you arrive at all of them.
But every one of those strategies assumes something the article does not say out loud.
Runtime curation assumes upstream curation.
You can only curate a smaller set of high-signal tokens if you already know what counts as signal. If the source of truth is vague, contradictory, or absent, retrieval can find the document — it cannot make the document know what matters.
Pull on that thread and most product teams have nothing on the other end.
The bug is not always that the context window is too full.
The agent does not become confused because the context is too large. It becomes confused because the work was never reduced to a decision.
A spec that says "improve checkout" does not get sharper because you packed it more tightly. Compression preserves shape. It does not create judgment.
Garbage in, smaller garbage out.
A good spec is already a curated set of high-signal tokens.
It is not a dump of everything you know about the product. It is the slice that compiles into action: objective, outcomes, edge cases, constraints, what good looks like, what must not break.
That is context engineering done by humans, at authoring time, before any retrieval pipeline gets a turn.
The runtime layer is what Anthropic is writing about. The authoring layer is upstream of it, and it is where most of the rot starts.
The pattern transfers.
A spec should not carry every quote, transcript, ticket, dashboard, and Slack argument into the prompt. It should name the claim, link the evidence, and tell the agent when to dereference.
Same just-in-time retrieval. One layer earlier. That is the shape of Pathmode's evidence-to-spec workflow: claims stay compact, evidence stays dereferenceable.
The agent walks back to the source when it needs to, not when the prompt happens to include it.
The two problems do not have the same shape.
Runtime curation is increasingly an engineering problem: retrieval, ranking, summarization, tool design. There are real techniques, and they will keep getting better.
Upstream curation is an editorial problem. Somebody has to decide what matters. That decision does not get cheaper when the model gets smarter. It gets more expensive — because more is built from less, faster, with less correction along the way.
This is the layer that resists automation. It is also the layer that decides whether your factory's output is yours or generic.
Context rot is real. Anthropic is right about the runtime fix: curate harder.
But the harder curation starts before the agent ever sees the prompt.
The upstream fix is intent.
Most teams will spend the next year tuning retrieval while leaving the source material vague. They will get better at feeding agents smaller context. They will not necessarily get better at feeding them truer context.
Don't Just Write Code. Define Intent.
Turn user friction into structured Intent Specs that drive your AI agents.
Get Started for Free