From "Token competition" to "Token throttling": With monthly costs reaching up to $7,500 per person, astronomical bills force giants to collectively hit the brakes.
Businesses have shifted their AI spending from "extreme consumption" to "extreme conservation," with industry giants capping the use of AI and agentic tools to cope with rising cost pressure. The budget-control trend has sparked a debate inside companies over cost control versus productivity, while bringing a windfall to infrastructure providers such as Microsoft and Databricks that offer cost optimization, gateway tools, and model routers.
Enterprise AI spending is undergoing a directional reversal. Technology giants that previously used rankings to incentivize employees to consume large amounts of tokens are now setting limits on AI usage. The trend of "tokenmaxxing" (maximum consumption) is quickly giving way to "tokenminimizing" (maximum throttling), with a wave of AI budget control sweeping through Fortune 500 giants such as AT&T, Meta, Uber, Walmart, and Amazon.
According to The Information, AT&T has started limiting employee access to Microsoft's GitHub Copilot. Meta has tightened employee spending on Anthropic and other AI service providers, in sharp contrast to a few months ago, when employees were competing to consume tokens. Bloomberg previously reported that Uber and Walmart have set limits on the use of AI programming tools. According to the Financial Times, Amazon has abolished its internal employee rankings based on AI usage.
The driving force behind this shift is rapidly escalating cost pressure. The enterprises with the most intensive AI usage are spending up to $7,500 per employee per month on AI. Even as model prices continue to decline, agentic tools that call models repeatedly are causing enterprise AI bills to triple, pushing costs past the budget limits of many companies.
This shift is redefining who benefits in the AI market. Demand is rising rapidly for "gateways" and model routers, tools that help enterprises monitor, limit, and optimize AI spending, and companies such as Microsoft, Databricks, and NVIDIA-backed Factory stand to benefit. Software vendors Palantir and Snowflake are also seen as potential beneficiaries of this structural change.
Eye-popping bills: cost control reshapes budget logic
The accumulation of cost pressure is evident. Uber is the most extreme case to date: the company had exhausted its annual AI programming budget by April 2026 and has since adjusted the monthly usage limit for each tool to $1,500 per employee. Walmart has set limits on the use of its internal AI agents. Amazon abolished its usage rankings outright after finding that employees were consuming large amounts of computing power to climb them, driving up costs.
Even at the individual level, consumption is striking. Microsoft found that some engineers were spending $500 to $2,000 per month on token fees for Claude Code alone.
The root of the problem is that the widespread use of agentic tools has structurally changed how tokens are consumed. Unlike a person manually sending single commands, such tools call models repeatedly and automatically while completing a task, significantly increasing actual usage. As a result, even though the price of a single token keeps falling, overall enterprise bills remain high.
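The arithmetic behind that pattern can be sketched in a few lines. Every number below (token counts, call counts, prices) is an assumption chosen for illustration, not a figure from the reporting; the point is only that a larger number of automatic calls can outweigh a lower per-token price.

```python
# Illustrative only: all prices and token counts are assumed, not reported figures.

def task_cost(calls: int, tokens_per_call: int, price_per_million: float) -> float:
    """Dollar cost of one task that makes `calls` model calls."""
    return calls * tokens_per_call * price_per_million / 1_000_000

# One manual prompt at an older, higher (hypothetical) price per million tokens.
manual = task_cost(calls=1, tokens_per_call=4_000, price_per_million=10.0)

# One automated run at half that price per token, but with 40 automatic calls.
agentic = task_cost(calls=40, tokens_per_call=4_000, price_per_million=5.0)

print(f"manual task:    ${manual:.2f}")
print(f"automated task: ${agentic:.2f}")
print(f"ratio:          {agentic / manual:.0f}x")
```

Under these assumed inputs, halving the token price still leaves the automated task about 20 times more expensive than the single manual prompt.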
Divergent reactions: Hit the brakes or keep accelerating?
Not all companies are choosing to tighten up. Box CEO Aaron Levie is quite pleased with his company's position. "We've never celebrated tokenmaxxing," he said. "We don't have rankings, so we haven't gone off track; we don't incentivize the wrong behavior."
Databricks takes a different approach. Nikita Shamgunov, the company's head of engineering, said at an event hosted by Nebius last week that Databricks does not set limits on engineers' AI budgets, so "tokenmaxxing is still present." The stance reflects the view that, for companies confident their employees use AI efficiently, restricting usage may not be cost-effective.
This divergence reveals the tension inherent in token throttling policies: controlling usage can reduce costs, but it may also reduce the productivity gains that AI promised, which were precisely the main justification companies gave for the expenditure.
Infrastructure benefits: cost control tools in high demand
On the other side of the "Token throttling" wave, there is a rising structural demand for cost control infrastructure.
More and more companies are migrating simple tasks from expensive cutting-edge models to more affordable or open-source alternative models to control costs without reducing actual usage. Executives from Palantir and Box have both stated that demand from enterprise customers for such solutions is rapidly increasing.
Infrastructure suppliers are quickly filling this gap. Microsoft and Databricks have each launched "gateway" tools that help enterprises monitor employee AI usage and enforce spending limits. Factory, an NVIDIA-backed AI software development company valued at $1.5 billion, released a new model router earlier this month that aims to automatically assign low-complexity tasks to lower-cost models.
Microsoft CEO Satya Nadella echoed these trends in a post on X last weekend, arguing that AI models should operate like commodities that can be swapped out at any time. He wrote, "None of us want to see a world where every company in every industry hands over value to a few 'one-size-fits-all' models." It is worth noting that the statement comes from the leader of a tech giant facing competitive pressure from Anthropic and OpenAI, which makes the strategic intent behind it equally intriguing.
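The two ideas in that paragraph, a per-employee spending limit and a router that sends simple tasks to cheaper models, can be sketched together. This is a minimal illustration, not how any vendor's product works: the model names, prices, and the `Gateway` class are hypothetical, and the caller supplies the complexity label, whereas a real router would have to infer it.

```python
from dataclasses import dataclass, field

# Hypothetical model names and prices (dollars per million tokens).
PRICES = {"frontier-model": 15.0, "budget-model": 1.0}

@dataclass
class Gateway:
    monthly_limit: float                        # per-employee cap in dollars
    spent: dict = field(default_factory=dict)   # employee -> dollars spent this month

    def route(self, complexity: str) -> str:
        """Send low-complexity tasks to the cheaper model."""
        return "budget-model" if complexity == "low" else "frontier-model"

    def request(self, employee: str, complexity: str, tokens: int) -> str:
        """Return the model used, or "blocked" if the cap would be exceeded."""
        model = self.route(complexity)
        cost = tokens * PRICES[model] / 1_000_000
        used = self.spent.get(employee, 0.0)
        if used + cost > self.monthly_limit:
            return "blocked"
        self.spent[employee] = used + cost      # record spend for monitoring
        return model

# The $1,500 cap mirrors the per-tool limit the article attributes to Uber.
gw = Gateway(monthly_limit=1500.0)
print(gw.request("alice", "low", 200_000))
print(gw.request("alice", "high", 200_000))
print(f"alice has spent ${gw.spent['alice']:.2f}")
```

Keeping the spend ledger inside the gateway is what lets one component both monitor usage and enforce the limit.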
Microsoft's dual strategy: Introducing new pricing, yet emphasizing "cost control"
While actively responding to customer demands for cost reduction, Microsoft this week revealed the pricing structure of its new flagship AI product, Copilot Cowork, which closely resembles the pricing model Anthropic introduced earlier.
Copilot Cowork relies on Anthropic's models to automatically perform complex multi-step tasks within Microsoft Office 365 software. For example, users can send the tool a batch of receipt screenshots, and it will generate a spreadsheet containing the corresponding expense information. This goes far beyond the basic tasks the existing 365 Copilot can handle, such as summarizing emails or creating financial models in Excel.
In terms of pricing, users must first hold a 365 Copilot license, which costs at least $30 per month, and then pay additional fees based on their actual usage of Copilot Cowork. This combined "seat fee + consumption" billing model closely matches the pricing model Anthropic introduced to enterprise customers earlier this year.
Faced with widespread concern among enterprise customers about skyrocketing AI costs, Microsoft Executive Vice President Charles Lamanna stated in a blog post on Tuesday that customers "can choose the way to control costs," including by setting Copilot Cowork usage limits for employees. Microsoft also previewed a feature that lets customers replace the Anthropic models in Copilot Cowork with models from OpenAI or Microsoft itself, claiming lower costs with similar results. According to an employee familiar with the matter, Microsoft is also testing options to replace Anthropic models with open-source models in some scenarios. This positioning indicates that, in the era of "token throttling," maintaining product competitiveness while easing customers' cost concerns has become the central challenge of a new round of competition in the enterprise software market.
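As a rough sketch of how a "seat fee + consumption" bill adds up: the $30 seat fee is the minimum cited above, while the usage charge is a hypothetical input, since the article does not give Copilot Cowork's consumption rates.

```python
SEAT_FEE = 30.0  # dollars per user per month, the minimum cited in the article

def monthly_bill(users: int, usage_charges: float, seat_fee: float = SEAT_FEE) -> float:
    """Fixed seat fees plus metered usage charges, in dollars."""
    return users * seat_fee + usage_charges

# Hypothetical: 100 licensed users and $4,200 of metered usage in one month.
print(f"${monthly_bill(100, 4_200.0):,.2f}")
```

The seat portion is predictable, so the budget risk sits in the usage term, which is the part that usage limits are meant to contain.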
This article is reprinted from the "Wall Street Watch" app, written by Zhang Yaqi; GMTEight editor: Song Zhiying.