From NVIDIA & Microsoft to Uber: AI Compute Costs Are Now Higher Than Employee Salaries as Token Usage Explodes

Technology firms are starting to confront how expensive large-scale AI has become, even compared with staff salaries. Bryan Catanzaro, Vice President of applied deep learning at NVIDIA, recently described the situation bluntly, saying, "For my team, the cost of compute is far beyond the costs of the employees," highlighting the financial strain created by intensive AI workloads.

AI Compute Costs Overtake Employee Salaries as Token Usage Surges

Catanzaro's comments carry weight because NVIDIA supplies many of the chips used in global AI infrastructure. If compute already costs more than salaries inside NVIDIA, companies further downstream may face even sharper pressure on margins. That reality is now surfacing in how major enterprises manage AI coding tools and token usage across big internal teams.

A core issue sits in how large language models are priced. Providers charge per token, the small text units models read or generate, according to Fortune. Under this structure, higher productivity and heavier experimentation both produce more tokens. From a billing view, efficient work and wasteful queries can look almost identical, because each token still adds to the invoice.
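
The billing point above can be made concrete with a minimal sketch. The price and token counts are hypothetical figures chosen only for illustration, not any provider's actual rates, and the single flat rate is a simplification, since real price lists often distinguish input tokens from output tokens.

```python
from decimal import Decimal

# Hypothetical flat price, for illustration only (not a real provider's rate).
PRICE_PER_MILLION_TOKENS = Decimal("10.00")


def invoice_cost(tokens: int,
                 price_per_million: Decimal = PRICE_PER_MILLION_TOKENS) -> Decimal:
    """Dollar cost of a token count under simple per-token pricing."""
    return Decimal(tokens) / Decimal(1_000_000) * price_per_million


# Two hypothetical sessions that consume the same number of tokens.
productive_session_tokens = 2_500_000  # e.g. work that ships a feature
wasteful_session_tokens = 2_500_000    # e.g. abandoned retries and dead ends

productive_cost = invoice_cost(productive_session_tokens)
wasteful_cost = invoice_cost(wasteful_session_tokens)

# The invoice depends only on the token count, not on what the tokens achieved.
print("productive:", productive_cost)
print("wasteful:  ", wasteful_cost)
```

Under these assumptions both sessions cost the same, which is why usage dashboards alone say little about whether spending is worthwhile.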

Many big technology companies have actually pushed workers to use more tokens, not fewer. At Amazon, internal guidance has urged staff to "tokenmaxx," which means using as many AI tokens as possible. Within Meta, one employee created a dashboard called "Claudeonomics" so teams could see who generated the greatest AI usage, turning consumption into a competitive metric.

Rising AI Costs Push Microsoft to Reduce Claude Code Usage Internally

Microsoft is now adjusting its own approach to coding assistants. The company has reportedly cancelled most direct Claude Code licences and is moving engineers towards GitHub Copilot CLI instead. This reversal comes around six months after Microsoft opened Claude Code to thousands of developers, project managers, designers and other employees, encouraging broad trials of AI-assisted programming.

Use of Claude Code spread rapidly across Microsoft teams, according to The Verge, and many staff started to rely heavily on the system. That rapid adoption has now led Microsoft to scale back access to the very tool that engineers had integrated into daily workflows. The decision suggests management is wrestling with the financial side of large internal AI deployments.

Microsoft's change does not touch its wider commercial work with Anthropic. The Foundry agreement, which includes investment of up to $5 billion in Anthropic and gives Foundry customers access to Claude models, remains in place. So does Anthropic's commitment to spend $30 billion on Azure compute capacity, a deal that deepens the link between both companies on cloud infrastructure.

Uber CTO Reveals Massive Spending Spike on AI Coding Assistants

Uber has faced similar budget strain from AI coding products. Chief technology officer Praveen Neppalli Naga told The Information in April that the company had burned through its entire 2026 AI coding tools budget in the first four months of the year. The disclosure illustrates how fast AI expenses can climb once usage becomes embedded in software development routines.

Uber had not treated AI tools cautiously either. Management actively encouraged heavy use, even running internal leaderboards that ranked teams by how often members turned to AI. The experiences of Uber and Microsoft together highlight a tension that has often been overlooked: when employers strongly push AI adoption, costs can escalate much faster than expected.

Goldman Sachs Predicts 24x Jump in AI Token Consumption by 2030

Analysts expect token volumes to grow even more as agentic AI spreads. Goldman Sachs predicts that systems able to act autonomously across several steps, rather than answering one-off prompts, could increase token consumption twenty-four times by 2030. Under this forecast, monthly use could reach 120 quadrillion tokens if enterprises deploy AI agents at scale across departments.

Research firm Gartner expects the unit price of tokens to fall sharply over time. It estimates that by 2030, the cost for AI providers to run inference on a one-trillion-parameter large language model could drop by nearly 90% compared with 2025. However, Gartner also warns that lower unit costs will not necessarily reduce enterprise AI bills in practice.

Agentic models generally need many more tokens for each task than simpler systems. That heavier use can more than offset cheaper prices per token, especially if companies keep expanding workloads. Gartner also notes that providers may not pass the full benefit of cost reductions on to customers. As senior director analyst Will Sommer said, "Chief Product Officers should not confuse the deflation of commodity tokens with the democratisation of frontier reasoning."

Metric | 2025 | 2030 (Forecast)
Monthly enterprise token consumption | Baseline | 120 quadrillion tokens (24x increase)
Inference cost for 1T-parameter model | 100% cost reference | ~10% of 2025 cost
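
As a rough illustration of why cheaper tokens need not mean smaller bills, the two forecasts can be multiplied together. This is a sketch under stated assumptions, not a calculation made by either firm: it treats Gartner's "nearly 90%" as exactly 90%, assumes providers pass the full cost reduction through to prices (which Gartner cautions may not happen), and applies Goldman Sachs's 24x volume multiple to a single buyer's usage.

```python
from fractions import Fraction

QUADRILLION = 10**15

# Figures quoted in the article.
FORECAST_MONTHLY_TOKENS_2030 = 120 * QUADRILLION  # Goldman Sachs forecast
VOLUME_MULTIPLE = 24                              # Goldman Sachs: 24x by 2030

# Assumption: "nearly 90%" cheaper is rounded to exactly 90%, i.e. 10% of 2025 cost.
UNIT_COST_RATIO = Fraction(1, 10)


def implied_baseline(forecast_tokens: int, multiple: int) -> Fraction:
    """Monthly token volume implied for the baseline period."""
    return Fraction(forecast_tokens, multiple)


def relative_bill(volume_multiple, unit_cost_ratio) -> Fraction:
    """Future bill as a multiple of today's bill (volume change x price change)."""
    return Fraction(volume_multiple) * Fraction(unit_cost_ratio)


def break_even_volume_multiple(unit_cost_ratio) -> Fraction:
    """Volume growth at which the bill stays flat despite cheaper tokens."""
    return 1 / Fraction(unit_cost_ratio)


baseline = implied_baseline(FORECAST_MONTHLY_TOKENS_2030, VOLUME_MULTIPLE)
bill_ratio = relative_bill(VOLUME_MULTIPLE, UNIT_COST_RATIO)
break_even = break_even_volume_multiple(UNIT_COST_RATIO)

print("implied baseline (quadrillion tokens/month):", baseline / QUADRILLION)
print("2030 bill relative to baseline:", float(bill_ratio))
print("break-even volume multiple:", break_even)
```

With these inputs the implied baseline is 5 quadrillion tokens a month, and a 24x rise in volume at one-tenth the unit cost still leaves the bill about 2.4 times higher. Any volume growth above 10x outweighs the price decline.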

AI Costs and Future Agentic AI Ambitions

These economic pressures contrast with the expansive AI visions voiced by some technology leaders. NVIDIA Chief Executive Jensen Huang has said that at some point, 100 AI agents could work alongside every human employee at his company. Many other executives have also spoken about digital workers taking on tasks across enterprises with limited human oversight.

If token usage keeps growing faster than token prices fall, the returns from these visions may be harder to realise. Fortune notes that such an outcome would mean any widespread shift to AI agents arrives with a larger financial burden than executives have generally discussed in public. Early course corrections at Microsoft and Uber give a first glimpse of that risk.

For now, the experiences of these companies show that the economics of replacing or supporting human labour with AI remain unsettled. Compute costs already exceed staff costs for some advanced teams, while token-based pricing encourages constant usage growth. Until organisations align technical ambition with tighter economic controls, AI costs are likely to stay under intense internal scrutiny.
