In the last thirty days, the two labs that matter most for agentic coding have shipped announcements that look unrelated.
Anthropic published its 2026 Agentic Coding Trends Report. OpenAI shipped Codex for (almost) everything, turning Codex into a general-purpose agent workspace with Goals, an in-app browser, persistent memory, scheduled automations, and ninety-plus plugins for Jira, Notion, Slack, and Microsoft 365.
Different motions. Same underlying pressure.
Agentic coding is becoming operational. The upstream definition of what should happen, why it matters, and how success is verified is still too thin to keep up. Both announcements make this gap more visible: Anthropic by naming it, OpenAI by enlarging the execution surface that sits beneath it.
The labs are building better executors. They are not building the product-intent layer.
Anthropic: developers delegate more than they trust
The Anthropic report is built around a number that should make every engineering leader pause.
Developers delegate 60% of their work to AI, yet fully trust only 0–20% without oversight. The critical skill is now knowing what to delegate and when to intervene.
That is a forty-point gap between delegation and trust. It does not close because the model gets smarter. It does not close because the IDE gets more agentic. It closes when the team can say, in writing, what done looks like — before the work starts and in a form an agent and a human can both verify.
Anthropic also describes the new architecture: "specialist agents working in parallel orchestrated by humans who increasingly resemble conductors rather than musicians." Conductors don't write the score during the performance. They work from one. The score in software is the spec — objective, outcomes, edge cases, constraints, verification — and most teams don't have one.
The report is a forecast. The forty-point trust gap is an artifact of the layer that's missing today.
OpenAI: Codex is becoming an agent workspace, not just a coding tool
OpenAI's April 16 announcement was the inverse move.
Codex is no longer a coding assistant inside an editor. It's a workspace with Goals, browser use, persistent memory, scheduled automations, and ninety-plus plugins reaching into the systems where product work actually happens — Jira, Notion, Slack, Microsoft 365. More than four million developers were using it weekly by mid-April, up from three million two weeks earlier.
This is OpenAI absorbing the execution surface. A Codex Goal can run on a schedule, browse the web, file a Jira ticket, and post into Slack. The agent reaches further into the operational stack every release.
Goals, in this product, are an execution primitive. They tell the agent what to do next — a task, possibly recurring, possibly tool-using. They are powerful and they will get more powerful.
They are not, however, a product-intent system. And nothing in the announcement suggests they're meant to be one.
Goals are not the same as intent
A Codex Goal answers a different question than a product-intent artifact does.
A Goal answers: what should the agent do next? It is task-shaped, agent-scoped, and short-lived. It belongs to the agent and its current run.
An intent spec answers: what does done look like for this product change, why does it matter, and how will we know it worked? It is product-shaped, team-scoped, and durable. It belongs to the product and survives across many agents and many runs.
Both are useful. They are not interchangeable. A Goal describes an action. An intent spec describes a change in the product and the world that several Goals — and several humans — might cooperate to achieve.
This distinction matters because it stays true even as Codex Goals get more capable. Goals can grow richer schedules, better tool routing, more sophisticated retries. They will still describe what an agent should attempt, not what the team is trying to accomplish at the product level.
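To see what "task-shaped and agent-scoped" means in practice, here is a minimal TypeScript sketch of the kind of record a Goal amounts to. Every field name is an illustrative assumption; OpenAI has not published a Goal schema here:

```typescript
// Hypothetical shape of a Goal: illustrative only, not OpenAI's actual schema.
// Note what is present (a task, a schedule, tools, a run) and what is absent
// (an objective, outcomes, evidence, verification).
interface Goal {
  task: string;        // "Each morning, triage new Jira bugs, summarize in Slack"
  schedule?: string;   // optional recurrence, e.g. a cron expression
  tools: string[];     // capabilities this run may use: "browser", "jira", "slack"
  runId: string;       // scoped to one agent and one run; gone when the run ends
}
```

Everything in this record is about the next action. Nothing in it says what the product should look like when the work is done, which is exactly the part the next section is about.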
The missing layer is the Why
Anthropic's report names a trust gap. OpenAI's Codex update enlarges the execution surface that sits below it. Neither move addresses the layer above both: the product team's definition of why this work matters and how success is judged.
Call this the product-intent layer. Three properties make it distinct:
- Persistent. It outlives any single agent run. The agent that runs tomorrow can read the same artifact as the agent that ran today.
- Evidence-anchored. It points to the friction, the quote, the metric, the request that justified the work. A spec that can't show its evidence is just an opinion in a JSON envelope.
- Owned by the product team. Not by the model, not by the IDE, not by the agent's planner. By the people who decide what gets built and how the team will know it worked.
The labs are not the wrong place for this layer to live. They are the wrong shape for it. Anthropic's job is to make the model and its runners better. OpenAI's job is to expand what an agent can reach. Neither charter includes producing and maintaining the durable, team-owned artifact that defines what good means for this product, this quarter, this user.
What an intent spec adds: outcomes, constraints, edge cases, verification
The artifact that fills this layer has a recognizable shape. We call it an IntentSpec. The name is less important than the parts:
- Objective. The change in user behavior or product state — not the feature, the change.
- Outcomes. Observable, decomposable, testable statements of what would prove the objective was met.
- Edge cases. The boring failure modes that ship bugs when no one writes them down.
- Constraints. Invariants that must hold even on the happy path.
- Evidence. The tickets, interviews, transcripts, telemetry that justified the call. Not vibes.
- Verification. How the team — or a grader — checks the work without redoing it.
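As a sketch, the six parts translate naturally into a structured record. The TypeScript shape below is one plausible encoding, not a published format; the field names and types are assumptions:

```typescript
// One plausible encoding of an IntentSpec: a sketch, not a published format.
interface IntentSpec {
  objective: string;        // the change in user behavior or product state
  outcomes: Outcome[];      // observable, decomposable, testable statements
  edgeCases: string[];      // the boring failure modes, written down
  constraints: string[];    // invariants that must hold even on the happy path
  evidence: EvidenceRef[];  // tickets, interviews, transcripts, telemetry
  verification: string;     // how the team, or a grader, checks the work
}

interface Outcome {
  statement: string;        // "Returning users reach checkout in two taps"
  check: string;            // how a human or grader would verify the statement
}

interface EvidenceRef {
  kind: "ticket" | "interview" | "transcript" | "telemetry";
  ref: string;              // a URL or identifier pointing at the source
}
```

The design choice that matters is the `evidence` field: every spec has to point at the friction that justified it, which is what separates an intent artifact from an opinion in a JSON envelope.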
A Codex Goal can carry a slice of this when it dispatches an agent. A Claude Outcome can encode another slice when it grades the result. Neither is the upstream artifact. Both are exports of it.
When the upstream artifact is good, the export is a translation step. When the upstream artifact is missing, every executor downstream — Codex, Claude, Linear, the next teammate — is guessing.
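Reusing the hypothetical shapes from the sketches above, the translation step is mechanical: select the slice an executor needs and reshape it. A hypothetical export into a Goal might look like this:

```typescript
// Hypothetical translation step: derive an agent-scoped Goal from the durable,
// team-owned spec. The Goal carries a slice of the intent; the spec survives.
function exportGoal(spec: IntentSpec, tools: string[], runId: string): Goal {
  const criteria = spec.outcomes.map(o => `- ${o.statement}`).join("\n");
  return {
    task:
      `Work toward: ${spec.objective}\n` +
      `Done means:\n${criteria}\n` +
      `Respect constraints: ${spec.constraints.join("; ")}`,
    tools,
    runId,
  };
}
```

The asymmetry is the point: this function is easy to write when the spec exists, and impossible to fake when it doesn't.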
Trust scales with verification, not with the model
Return to the forty-point gap.
Developers delegate 60% of their work to AI but trust only 0–20% of it without oversight. Better models will close some of that. Better executors will close some of that. Neither closes it the rest of the way, because the gap isn't about the model's competence. It's about the team's ability to say what good looks like in a form that can be checked.
When success criteria are vague, more capable executors produce more convincing wrong answers. When success criteria are explicit, even mid-capability executors produce work the team can ship — because the team can tell when they have.
The 2026 announcements from Anthropic and OpenAI point in the same direction. Agentic coding is becoming operational. The bottleneck is moving upstream. The teams that win this transition will be the ones that treat product intent as a first-class artifact, owned and maintained alongside the code, the tickets, and the telemetry.
The labs are building better executors. The product-intent layer is yours to build.
Don't Just Write Code. Define Intent.
Turn user friction into structured Intent Specs that drive your AI agents.
Get Started for Free