Nobody’s spelling it out plainly, so I will.
Uber’s AI coding budget ran dry in four months. Microsoft pulled Claude Code from most of its internal developers this week.
These aren’t bootstrapped shops hitting a free-tier wall.
These are firms with CFOs, procurement teams, and finance orgs supposedly watching every line item.
Still got ambushed.
That’s the part worth sitting with.
What Actually Happened — The Quick Version
– Uber burned through its entire 2026 AI coding budget before May. Four months into a twelve-month cycle.
– Microsoft cut off Claude Code access for thousands of workers on June 30, routing everyone to GitHub Copilot CLI instead.
– Token prices are dropping. Total invoices aren’t. Per-unit economics improve while agentic AI consumes tokens faster than any price cut can offset.
– Goldman Sachs projects120 quadrillion tokens per month by 2030. A 24x leap from today’s roughly 5 quadrillion.
– Two of the most resource-rich firms in tech just hit a ceiling. If they can overspend, so can you.
The Pitch Nobody Questioned
More AI = faster shipping.
That was the whole pitch.
Incentive programs kicked off.
Leaderboards materialized. Engineers got ranked by how much they tapped the tools. Seems logical enough.
What nobody modeled: enterprise AI tools meter by token.
A standard chatbot session might chew through a few thousand tokens.
An AI coding agent on a multi-step ticket?
Read the codebase, edit, test, catch the failure, retest. Burns tens of thousands per hit.
Fortune cited Gartner estimating inference costs on one-trillion-parameter LLMs will crater 90% by 2030. Good news on the surface. But the same analysis noted enterprise AI spending will keep climbing because agentic models swallow far more tokens per task. Cheaper per unit. Bigger volume. The math doesn’t work if you aren’t tracking consumption.
The Numbers Nobody’s Talking About
Nvidia exec Bryan Catanzaro said compute costs for his crew now outpace employee salaries. The compute invoice eclipses what they pay the humans.
Let that sink in.
Microsoft dropped $5 billion into Anthropic.
Anthropic committed $30 billion to Azure compute. Those deals sit separate from the license cancellations. Different commercial bucket entirely.
But read the room: Microsoft will pay Anthropic billions for model access while simultaneously trimming internal Claude Code seats since the per-head number stopped making sense.
Incentives got miswired. Firms pushed adoption without pricing the adoption.
Where Uber Went Sideways
Be concrete. Details matter here.
Uber’s CTO Praveen Neppali Naga told The Information the enterprise blew through its 2026 AI coding budget in four months.
Not six. Four.
After actively pushing adoption, including internal leaderboards that ranked teams by AI tool usage.
What does that incentive structure create?
Engineers chasing good performance reviews have a direct incentive to use AI tools more. Not smarter. More.
The leaderboard doesn’t grade cost per task.
It tallies usage volume.
That’s the ratchet.
Reward volume over efficiency, you get volume. And AI coding tools compound fast.
Side note: their internal tooling documentation is apparently a mess. Engineers had to guess which model they were even calling. No wonder the meters ran hot.
Small shops getting squeezed right now aren’t the enterprises with lawyers on retainer. They’re indie devs who read “Claude Code for solo devs” guides, signed up for Pro. And are now watching the invoice swell while shipping the same output they shipped before. No CFO to write it off. Just a credit card statement and a choice about which tool gets cut next month.
What Actually Matters When You’re Picking a Tool
Forget benchmark scores.
What actually matters is the cost envelope for how you actually work.
A tool that scores 5% higher on HumanEval but costs three times more per month is worse for most solo operators than a slightly slower tool with predictable pricing.
Microsoft ditched Claude Code and migrated to Copilot CLI. That says something about their cost math. Doesn’t mean Copilot CLI wins overall. Means the billing model fit their flow better than per-seat Claude Code licensing.
For a small agency, the checklist:
– Monthly budget for AI coding tools
– Token cost per average task
– Which tool delivers the steadiest spend for the workflows you actually run
– Pricing model alignment with how you work
Project-based and bursty, or steady and daily? Running five projects a month, each with a two-week intensive build followed by a quiet stretch? An annual subscription with a high monthly fixed cost probably stings.
Pay-per-token might outperform even with a higher per-unit rate.
Microsoft’s call was rational. Not a jab at Anthropic’s tech. Math.
And you get to run those same numbers.
This Is a Governance Problem, Not a Tech Problem
Here’s my problem with how this coverage is playing out.
The framing lands as “AI tools cost too much.” That’s a misread.
AI tools are priced right for what they deliver.
The issue is the incentive structure that pushed usage without metering the usage.
Companies told staff to use more AI.
Staff complied. Nobody had a cost-per-usage model running. Bills showed up, and nobody knew how to make them make sense.
This is a governance failure dressed up as a technology problem.
Governance is something you can install on day one, before signing up for anything.
Set a monthly AI tool budget. Track what each project burns in tokens. Know your cost per task type. Build a simple dashboard. Even a spreadsheet gets the job done. Something showing your burn rate and trend.
If your invoice climbs 30% month over month and your output isn’t climbing 30%, that’s your signal. You’ll catch it in month two instead of month six.
The firms that’ll survive the AI cost reckoning are the ones that learn to use AI efficiently, not just constantly. Maximize value per token, not token usage.
Enterprises are learning this the hard way right now. You don’t have to wait for your own budget crisis to start building the habit. Get ahead of it.
Sources
Fortune. Microsoft and Uber hit AI cost problem
