Andrej Karpathy recently described a phase shift in his own practice:
"I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words."
In November, he was coding 80% manually with 20% agent assistance. By December, the ratio had inverted. And then he named what's happening:
"The profession is being dramatically refactored, as the bits contributed by the programmer are increasingly sparse and between."
That word—sparse—is doing a lot of work.
What Remains in the Gaps?
The human contribution to software is not disappearing. It is becoming discontinuous. Concentrated in moments of judgment rather than distributed across hours of implementation.
The code still gets written. The tests can be made to pass. But what the developer actually contributes is no longer the code itself. It is:
- The specification that precedes it.
- The evaluation that follows it.
- The context that shapes it.
Drew Breunig recently published a software library with no code. whenwords formats relative time ("three hours ago," "next Tuesday") in Python, Rust, Go, TypeScript, and Java. The repository contains only three files: a 500-line specification, a test suite, and instructions that say, in effect, "paste this into Claude."
The library generates itself on demand. The spec is the product. The implementation is ephemeral.
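For a sense of scale, the implementation an agent regenerates from such a spec is small and mechanical. Here is a minimal toy sketch of a relative-time formatter in Python; this is my own illustration, not whenwords' actual spec or generated output:

```python
def relative_time(delta_seconds: float) -> str:
    """Format an offset in seconds as a rough English phrase.

    Positive offsets are in the past ("2 hours ago"),
    negative offsets are in the future ("in 1 minute").
    """
    past = delta_seconds >= 0
    s = abs(delta_seconds)
    # Largest unit that fits, coarsest first.
    for name, size in [("day", 86400), ("hour", 3600), ("minute", 60), ("second", 1)]:
        if s >= size:
            n = int(s // size)
            phrase = f"{n} {name}" + ("s" if n != 1 else "")
            break
    else:
        return "just now"
    return f"{phrase} ago" if past else f"in {phrase}"
```

The point is not this particular function; it is that a behavior this well-bounded is fully pinned down by a spec and a test suite, so the code itself carries no extra information worth distributing.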
The Spec Is the Moat Now
If the programmer's contribution is specification and judgment, and if agents reliably translate specification into working code—why distribute implementations at all?
Breunig tested it. Claude has never failed to generate a working whenwords in any language he tried. The code crystallizes locally, on demand.
This has limits. whenwords has 125 tests. SQLite has 51,445. Spec-only distribution works for small, stable, well-defined functions.
But the principle scales: when the cost of generating code approaches zero, the artifact that persists is the specification.
The intent is the durable thing. And if that's true, the intent is also the defensible thing.
The CLAUDE.md Phenomenon
Peter Steinberger's approach is even more aggressive. He runs 3–8 agents simultaneously on a 300,000-line TypeScript codebase. Six hundred commits in a single day. When asked about safeguards, his response was blunt:
"To actually be super diligent you would have to be very attentive, which kind of defeats the point of moving fast. So I think yes, YOLO is the only way of running agents."
What makes it possible is his CLAUDE.md—a living document that encodes context, constraints, and institutional knowledge. The agents read it and work within it.
But unlike Breunig's spec (written upfront, tightly bounded), Steinberger's document accumulates. It is the sediment of decisions made, conversations had, constraints discovered.
This is no longer a static requirements document. It is institutional memory made explicit.
Don't Write Context. Generate It.
Here's where we see the fault line.
A CLAUDE.md is a brilliant runtime artifact. It is the perfect standardized format to hand to an agent. But it is a poor system of record.
When you treat the flat file as the source of truth, it accumulates "sediment"—outdated instructions, conflicting constraints, and noise. You can't query it. You can't visualize dependencies.
The context file should be a derived artifact.
At Pathmode, we believe you shouldn't be hand-editing your CLAUDE.md. You should be generating it.
An IntentSpec isn't a document to replace your workflow—it's a data model that powers it:
- Status: draft → validated → shipped → verified
- Relations: depends_on, enables, blocks
- Evidence: linked directly to user research and friction points
- Outcomes: observable state changes that define success
Pathmode acts as the intent engine that feeds your agents. You can't query a markdown file, but Pathmode can export the perfect, task-specific context to one.
The spec isn't prose. It's a node in a graph. And the graph is the system of record for why we built what we built.
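As a hedged sketch of the idea, here is what a spec-as-graph-node and a derived context file might look like. Pathmode's actual schema is not shown in this post, so every name below (`IntentSpec`, `Status`, `export_context`, the field names) is an assumption drawn from the bullets above, not the real API:

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    DRAFT = "draft"
    VALIDATED = "validated"
    SHIPPED = "shipped"
    VERIFIED = "verified"

@dataclass
class IntentSpec:
    """One node in the intent graph (illustrative schema, not Pathmode's)."""
    id: str
    summary: str
    status: Status = Status.DRAFT
    depends_on: list[str] = field(default_factory=list)  # ids of specs this one needs
    enables: list[str] = field(default_factory=list)
    blocks: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)    # links to research / friction points
    outcomes: list[str] = field(default_factory=list)    # observable success criteria

def export_context(specs: list[IntentSpec], task_id: str) -> str:
    """Derive a task-specific markdown context file from the graph:
    the target spec plus its direct dependencies, nothing else."""
    by_id = {s.id: s for s in specs}
    target = by_id[task_id]
    relevant = [target] + [by_id[d] for d in target.depends_on if d in by_id]
    lines = ["# Context (generated, do not hand-edit)"]
    for s in relevant:
        lines.append(f"## {s.id} [{s.status.value}]")
        lines.append(s.summary)
        lines.extend(f"- Outcome: {o}" for o in s.outcomes)
    return "\n".join(lines)
```

Because the markdown is rendered from the graph rather than hand-edited, it can never drift from the system of record: regenerate it and the sediment is gone.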
What Karpathy Gets Right
His observations about agent-driven development are precise:
- Models make "subtle conceptual errors that a slightly sloppy, hasty junior dev might do."
- They overcomplicate code and bloat abstractions.
- Tenacity is their superpower: "They never get tired, they never get demoralized."
- The real leverage comes from declarative goals: "Don't tell it what to do, give it success criteria and watch it go."
That last point lands hardest.
This is exactly what an IntentSpec is for. Not a task list. Not a ticket. A success criteria contract that the agent loops against until it passes.
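Concretely, that contract pattern is just a retry loop. A minimal sketch, with hypothetical helper names: `generate` stands in for any agent call, `run_tests` for whatever executes the success criteria:

```python
def loop_until_green(generate, run_tests, max_attempts=5):
    """Run an agent against a success-criteria contract:
    regenerate with test feedback until the criteria pass."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)       # agent produces an implementation
        ok, feedback = run_tests(candidate)  # contract: pass/fail plus failure detail
        if ok:
            return candidate, attempt
    raise RuntimeError(f"criteria unmet after {max_attempts} attempts")
```

The design point from the quote is in what the loop does not do: the caller never tells the agent how to fix the code, only feeds back which criteria failed.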
The Atrophy Question
Karpathy notes that his ability to write code manually is already starting to atrophy. But generation and discrimination are different capabilities: you can review code just fine even if you struggle to write it.
If code-writing skills atrophy, what remains?
The ability to define what should exist and why. The ability to evaluate whether it worked. The ability to encode institutional knowledge so that agents don't have to rediscover it.
These are the sparse bits between.
They are not incidental. They are the whole game.
2026: The Year of Metabolizing
Karpathy's prediction:
"LLM agent capabilities have crossed some kind of threshold of coherence around December 2025 and caused a phase shift... 2026 is going to be a high energy year as the industry metabolizes the new capability."
We agree. And we'd add: the organizations that thrive won't be the ones that adopt agents fastest. They'll be the ones that build the intent layer that governs those agents.
Protect the spec. The rest is derived.
Don't Just Write Code. Define Intent.
Turn user friction into structured Intent Specs that drive your AI agents.