The fastest pull request I've ever seen was also the most expensive.
20 minutes from prompt to PR. Then four hours of rework, a security review, three support tickets, and a Slack thread titled "who approved this?"
We ran the experiment on purpose. Two teams, same task, same AI tools, same codebase. One team prompted their way through it. The other used an Intent Layer. The results weren't even close.
## The Task
Build a "Forgot Password" flow for an existing app. Simple enough—every developer has built one, and every AI agent has generated one.
## Team A: Prompt-Driven
The engineer opened Cursor and typed:
```
Add a forgot password flow. User enters email,
gets a reset link, clicks it, sets new password.
```
The agent delivered in 20 minutes:
- A `/forgot-password` page with an email input
- An API route that sends a reset email
- A `/reset-password` page with a new password form
- Basic validation
It worked. On the happy path, it was fine.
What surfaced over the next 48 hours:
- No rate limiting. A pen test flagged the endpoint on day one.
- Reset tokens never expired. Open security hole.
- No logging. When a user said "I never got the email," support had zero visibility.
- Reset link used HTTP in the email template.
- The UI didn't use the existing design system. It looked like a different app.
None of these were exotic failures. They were obvious requirements—the kind any experienced developer would think of. But the agent wasn't asked to think. It was asked to build.
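To make the first two gaps concrete, here is a minimal sketch of the per-email rate limiting and token expiry the agent skipped. It is illustrative only: an in-memory store with hypothetical function names, not production code (a real service would persist this in Redis or a database).

```python
import secrets
import time

RATE_LIMIT = 3          # max reset requests per email...
RATE_WINDOW = 3600      # ...per hour (seconds)
TOKEN_TTL = 15 * 60     # reset tokens expire after 15 minutes

_requests = {}  # email -> list of request timestamps
_tokens = {}    # token -> (email, issued_at)

def allow_reset_request(email: str, now=None) -> bool:
    """Reject a request once an email exceeds the hourly limit."""
    now = time.time() if now is None else now
    recent = [t for t in _requests.get(email, []) if now - t < RATE_WINDOW]
    if len(recent) >= RATE_LIMIT:
        return False
    recent.append(now)
    _requests[email] = recent
    return True

def issue_token(email: str, now=None) -> str:
    """Mint an unguessable token stamped with its issue time."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _tokens[token] = (email, now)
    return token

def token_is_valid(token: str, now=None) -> bool:
    """A token is valid only inside its 15-minute TTL."""
    now = time.time() if now is None else now
    entry = _tokens.get(token)
    return entry is not None and now - entry[1] < TOKEN_TTL
```

Roughly thirty lines, none of them clever. The point is not that this logic is hard; it is that nobody asked for it.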
## Team B: Intent Layer
The PM and engineer spent 15 minutes writing an IntentSpec before handing anything to the agent:
```yaml
objective:
  statement: "Allow users to reset their password securely"
  success_criteria:
    - User receives reset email within 60 seconds
    - Reset token expires after 15 minutes
    - Flow matches existing auth UI patterns
constraints:
  - "Rate limiting: max 3 requests per email per hour"
  - All links must use HTTPS
  - Log all reset attempts for support team visibility
  - Must integrate with existing email provider (Resend)
outcomes:
  - observable: "Password reset completion rate > 90%"
  - metric: "Support tickets about 'never received email' decrease by 50%"
edge_cases:
  - User enters unknown email → generic "check your email" (no account leaking)
  - User clicks expired link → clear error with retry option
  - Corporate network email delays → fallback help text after 2 minutes
```

Time to first PR: 45 minutes.
The agent generated rate-limited endpoints, expiring tokens with configurable TTL, structured logging with request IDs, HTTPS-only templates, edge case handling, and UI built on the existing design system.
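The structured logging is what gives support the visibility Team A lacked. A minimal sketch of the idea, with hypothetical field and function names, assuming one JSON line per reset attempt:

```python
import json
import uuid

def log_reset_attempt(email: str, outcome: str) -> str:
    """Emit one JSON log line per reset attempt so support can trace
    an "I never got the email" report by request ID."""
    entry = {
        "request_id": str(uuid.uuid4()),
        "event": "password_reset_requested",
        "email": email,
        "outcome": outcome,  # e.g. "email_sent", "rate_limited"
    }
    line = json.dumps(entry)
    print(line)  # in production, ship to the log pipeline instead
    return line
```

With lines like this in place, "I never got the email" becomes a query, not a mystery.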
One minor issue surfaced in review: the fallback help text triggered too early on slow connections. A one-line config change. Nothing structural.
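The spec's anti-enumeration edge case is worth spelling out, because it is the kind of thing a bare prompt never surfaces: the endpoint must return the same response whether or not the account exists. A hypothetical handler sketch:

```python
def handle_forgot_password(email: str, known_emails: set) -> dict:
    """Return an identical response for known and unknown emails,
    so the endpoint can't be used to probe which accounts exist."""
    if email in known_emails:
        pass  # send_reset_email(email) — real sending elided
    # Same status and body either way: no account leaking.
    return {"status": 200, "message": "Check your email for a reset link."}
```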
## The Real Comparison
| | Team A (Prompt) | Team B (Intent Layer) |
|---|---|---|
| Time to first PR | 20 min | 45 min |
| Security issues found post-merge | 3 | 0 |
| UX inconsistencies | 2 | 0 |
| Support burden | Created new tickets | Reduced existing ones |
| Rework after merge | 4+ hours | ~5 min |
| Total time to production | 5+ hours | 50 min |
Team A was faster to the PR. Team B was faster to production. And production is the only finish line that counts.
## The Asymmetry
This isn't a story about "spend more time planning." It's about what the AI receives as input.
A prompt describes what to build. An IntentSpec defines what success looks like. That distinction changes everything downstream:
- Constraints become guardrails. The agent doesn't have to guess about rate limits, security patterns, or design systems. They're in the spec.
- Edge cases become test cases. Instead of discovering failures in production, they're addressed during generation.
- Success criteria become reviewable. The PR reviewer doesn't just check "does it work?"—they check "does it meet the contract?"
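"Edge cases become test cases" is nearly mechanical in practice: each `edge_cases` line in the spec translates into one assertion. A sketch under assumed names, using the TTL and two-minute fallback values from the spec above:

```python
TOKEN_TTL_SECONDS = 15 * 60   # from success_criteria: token expires in 15 min
FALLBACK_DELAY_SECONDS = 120  # from the corporate email-delay edge case

def reset_link_state(issued_at: float, now: float) -> str:
    """'valid' inside the TTL, 'expired' after it. The UI maps
    'expired' to a clear error with a retry option."""
    return "valid" if now - issued_at < TOKEN_TTL_SECONDS else "expired"

def show_fallback_help(requested_at: float, now: float) -> bool:
    """Surface fallback help text once the email has taken
    longer than two minutes to arrive."""
    return now - requested_at >= FALLBACK_DELAY_SECONDS
```

Each function exists because a spec line demanded it, and each spec line is checkable before merge rather than discoverable in production.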
Prompting optimizes for speed to first output. Intent engineering optimizes for speed to shippable output. And the gap between those two things is where most rework, most bugs, and most team friction lives.
## What This Means for Your Team
If you're using AI agents to build, ask yourself: what are you actually feeding them?
| If the input is... | The agent will... |
|---|---|
| A one-line prompt | Build the happy path and guess the rest |
| A detailed prompt | Build more, but still guess at constraints and edge cases |
| An IntentSpec | Build against a contract with explicit boundaries |
The quality ceiling of your AI output is set by the quality floor of your input. No model upgrade fixes a missing constraint. No better prompt compensates for undefined success criteria.
The 15 minutes Team B spent writing the spec wasn't "planning overhead." It was the actual engineering. The agent did the rest.
Prompts are for prototypes. Specs are for products.
The fastest PR means nothing if it's not the last one.
## Don't Just Write Code. Define Intent.
Turn user friction into structured Intent Specs that drive your AI agents.
Get Started for Free