The Residual: Where AI Can't Help (Yet)

AI made some things 10x cheaper. The things it didn't are now the bottleneck.

The Shift

AI Changed Everything — Except the Hard Parts

AI collapsed the cost of writing code, tests, and scaffolding. That's real. GitHub Copilot, Cursor, Claude — they made the mechanical parts of programming dramatically faster.

But software engineering was never just typing. The question that matters is: where does the effort actually go when you're building and shipping software?

Effort per Unit of Work Shipped

Where the Time Actually Goes

Toggle between eras to see how AI reshaped effort allocation — and what it left behind.

[Interactive chart: effort per unit of work shipped, broken down by category; per-unit effort drops from 100 before AI to 58.5 after.]
But you don't ship the same amount…
Throughput Scaling

You Scale Up. You Do More Things.

When per-unit cost drops, you ship more. Normalize total effort back to 100% and the bottlenecks' share of it grows.

[Interactive slider: throughput multiplier, starting at 1.0x; effort shares renormalize to 100% as throughput rises.]
The parts AI didn't help now dominate your time.
PR review, security, architecture, understanding AI output, compliance —
these are the residual bottlenecks.
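
To see the mechanics, here is a plain-Python sketch with made-up per-unit numbers (illustrative stand-ins, not the chart's data): hold team capacity fixed, let AI cut the coding cost, and the untouched categories absorb the freed budget.

```python
# Hypothetical per-unit effort by category (illustrative numbers only).
# AI slashes "write code"; the other categories are untouched.
before = {"write code": 40, "review & verify": 25, "architecture": 20, "compliance": 15}
after  = {"write code": 5,  "review & verify": 25, "architecture": 20, "compliance": 15}

budget = sum(before.values())              # fixed team capacity: 100
throughput = budget / sum(after.values())  # 100 / 65 ~= 1.54x units shipped

# Absolute effort per category once the freed capacity goes into volume:
scaled = {k: round(v * throughput, 1) for k, v in after.items()}
print(round(throughput, 2))  # 1.54
print(scaled)  # review & verify: ~38.5 of the 100 budget, up from 25 before AI
```

Even with AI doing almost all of the typing, the review-and-verify line ends up costing more absolute effort than it did before, because you're shipping half again as much.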
The Six Bottlenecks

What AI Left Behind

Each one is a fundamental constraint that doesn't yield to "more AI".

It Gets Worse

Complexity Is Superlinear

More components don't just add up — they multiply. Every new piece interacts with every existing piece.

Pairwise interactions = n × (n-1) / 2

You shipped 1.5x more features.
But the complexity of managing them grew 2.25x.
The bottlenecks don't just grow proportionally — they grow superlinearly.
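
A quick sanity check in plain Python (arbitrary component counts, chosen only for illustration): with interactions growing as n(n-1)/2, multiplying n by 1.5 multiplies the interaction count by roughly 1.5² = 2.25.

```python
def pairwise_interactions(n: int) -> int:
    """Pairwise interactions among n components: n * (n - 1) / 2."""
    return n * (n - 1) // 2

before = pairwise_interactions(100)  # 100 components -> 4,950 interactions
after = pairwise_interactions(150)   # 1.5x components -> 11,175 interactions

print(after / before)  # ~2.26, converging to 1.5**2 = 2.25 as n grows
```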
The Academic Parallel

This Isn't Just Software

The same pattern is playing out in academic research.

[Paired charts: academic research effort by stage, before AI vs. after AI (normalized to 100%).]

"Writing papers" collapsed. But peer review didn't. Scaled up, it now dominates.

"AI has driven the cost of idea generation down to almost zero… Now the bottleneck is different. People can generate thousands of theories. Now we have to verify them."
— Terence Tao, Fields Medalist

Tao's answer: formal verification with Lean. Proofs that machines check, not humans.
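
To give a flavor of what that looks like (a toy Lean 4 theorem, our example rather than Tao's): the proof checker verifies the claim for every input; no reviewer has to take the author's word for it.

```lean
-- A machine-checked proof: Lean's kernel verifies this statement
-- for all natural numbers. No human reviewer re-derives anything.
theorem add_comm_example (n m : Nat) : n + m = m + n :=
  Nat.add_comm n m
```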

We're doing the same thing for software.

The Current Toolkit

Existing Methods Won't Scale

Testing
= sampling, not proof

Tests check the cases you thought of. Bugs live in the cases you didn't. More code = more blind spots.
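
A minimal illustration (hypothetical function, plain Python): every test below passes, and the bug ships anyway, because a handful of sampled inputs proves nothing about the rest.

```python
def apply_discount(price: float, percent: float) -> float:
    """Apply a percentage discount to a price."""
    return price - price * percent / 100

# The cases we thought of: all green.
assert apply_discount(100, 10) == 90.0
assert apply_discount(50, 0) == 50.0

# The case we didn't: percent > 100 produces a negative price.
# apply_discount(100, 150) == -50.0, and no test ever asks.
```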

Code Review
= social trust, doesn't scale

Effective with 3-5 reviewers. AI-generated PRs already wait 4.6x longer. Double the throughput and it collapses.

Convention
= assumed, not enforced

Style guides, naming rules, architecture patterns. Humans forget them. AI never learned them. They break silently.

Linters & Static Analysis
= shallow, not semantic

Catches formatting and simple patterns. Can't reason about architectural constraints, data flow, or cross-module invariants.
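
Concretely (a hypothetical snippet, plain Python): this passes any standard linter, yet it silently breaks a documented invariant that spans the codebase, exactly the kind of constraint static analysis never sees.

```python
# Documented invariant (enforced nowhere): account balances never go negative.
accounts: dict[str, int] = {"alice": 100}

def withdraw(account: str, amount: int) -> None:
    # Lints clean: types check, style checks pass, nothing is unused.
    # But nothing stops amount > balance; the invariant breaks silently.
    accounts[account] -= amount

withdraw("alice", 150)
print(accounts["alice"])  # -50: a semantic bug no linter flags
```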

These worked when throughput was lower.

They don't work at 1.5x, and they certainly won't work at 3x.