THE ECONOMIST: The supply-chain crunch taking a byte out of AI
‘Tokenmaxxing’ techies may need to take a hiatus as the demand for artificial intelligence increases.

A new craze has taken hold of Silicon Valley in recent months.
Techies looking to prove they are among the vanguard of artificial-intelligence adoption have taken to “tokenmaxxing”, competing against one another to burn through the most tokens (as the chunks of text processed by AI models are known).
Between January and March, the weekly tokens processed by OpenRouter, a marketplace for accessing models, quadrupled.
As demand for AI soars, the industry behind it is struggling to keep up.
In March Anthropic, an AI lab whose models are popular with businesses, began throttling access to its tools at busy times, and has since been altering its subscription plans seemingly in a bid to curb usage.
In April its service experienced outages averaging around 30 minutes a day. And it is not alone.
In March OpenAI, a rival, abruptly shut Sora, its video-generation tool, to redirect scarce computing power toward more lucrative uses.
On April 20, GitHub, a coding-collaboration site owned by Microsoft, stopped accepting new subscriptions for its programming bot.
The industry’s response has been to pour ever larger sums of money into new infrastructure.
On April 20 Anthropic announced a $100b partnership with Amazon to secure up to five gigawatts of server capacity, with nearly a fifth due to come online by the end of the year.
On April 24 it said that Google would also invest $40b to help the lab meet its computing needs. On April 27 OpenAI announced that it was reworking its partnership with Microsoft to allow it to distribute all its products through any cloud provider, giving it greater flexibility to tap into computing supply.
The five so-called hyperscalers — Alphabet, Amazon, Meta, Microsoft and Oracle — are investing hundreds of billions of dollars apiece in data centres.

Alphabet, Amazon and Oracle have already raised more than $100b in debt between them this year.
To free up cash, Meta recently announced that it would lay off 10 per cent of its workforce, while Microsoft said that it would offer voluntary redundancies to about 7 per cent of its workers.
Adding more capacity, however, is only getting more challenging.
In America and beyond, political opposition to the construction of data centres is growing.
What is more, the companies making the hardware that fills them — from chips and networking gear to cooling equipment — have been investing far too little to keep pace with demand. The squeeze on capacity, then, looks set to worsen.
Start with the politics. In April legislators in Maine voted in favour of a bill to ban the construction of data centres above 20 megawatts until November 2027.
Although it was subsequently vetoed by the governor, lawmakers in more than 10 other American states are weighing similar measures. According to one count, $156b-worth of data-centre projects were blocked or delayed last year in America by local opposition and litigation.
Other countries, from Ireland to Brazil, are experiencing a growing backlash. Concern over the impact of power-hungry data centres on electricity bills in particular has become widespread — and may intensify further as the war in the Gulf raises energy prices.
Even when data centres are approved for construction and can get hooked up to a power source — be that the grid or, increasingly, their own means of generation — those erecting them are finding it harder to get their hands on the computing equipment needed to operate them.
Ivan Chiam of SemiAnalysis, a research firm, points out that there are not enough chips to fill the data centres now being built.
Consider the graphics-processing units designed by Nvidia, which provide more than two-thirds of the world’s AI computing power.
The price to rent one of its H100 GPUs, launched in 2022, has soared by around 30 per cent since November, as customers unable to get their hands on newer models have looked to older generations.
Competing AI processors are also getting more difficult to obtain.
In April Andy Jassy, Amazon’s boss, said that his company had nearly sold out access to its Trainium2 AI chips.
A significant chunk of the capacity of Trainium4, due next year, “has already been reserved”.
The squeeze also extends to memory chips, in particular the kind of high-bandwidth memory (HBM) that AI models rely on.
All three big producers — SK Hynix, Samsung and Micron — say that most of their supply for 2026 is already sold out.
Some hope of relief came in March when Google unveiled TurboQuant, an algorithm meant to reduce the amount of memory AI needs, causing the share prices of the memory-makers to briefly swoon.
Even so, demand for HBM is expected to outstrip supply for at least the next three years.

The shortages are now spreading to central-processing units.
“Agentic” AI tools that plan, reason and carry out tasks rely more heavily on these types of chips to co-ordinate their work.
Morgan Stanley, an investment bank, estimates that agentic systems require one CPU for every GPU, compared with a ratio of one to 12 for chatbot-style systems.
Indeed, demand for CPUs has been so robust that it has breathed new life into Intel, which not long ago seemed to be heading for collapse.
The market capitalisation of the American chipmaker, which is one of the leading producers of CPUs, has more than doubled over the past six months.
Semi-detached
The crux of the problem is that companies along the AI supply chain are investing far less than the hyperscalers in expanding their capacity.
We examined the planned capital spending this year of the 50 or so largest manufacturers of chips, chipmaking tools, servers, networking gear and cooling equipment, and how it has changed since 2024.
Over that period, the five hyperscalers have increased their combined capital spending by 190 per cent, from $234b to $677b, whereas the hardware suppliers they rely on have increased theirs by only 45 per cent, from $153b to $223b.
Take TSMC, the world’s biggest contract chipmaker and the dominant supplier of cutting-edge GPUs and CPUs.
Its most advanced fabs — those making chips that are five nanometres or smaller — are already running flat out.
C.C. Wei, its boss, has admitted that supply is “very tight”, but that “there are no shortcuts”: building a new fab takes two to three years.
The company plans to spend about $55b in 2026, up by 34 per cent from a year earlier; analysts expect the figure to rise to $65b in 2027.
As a share of sales, however, its capital expenditure has fallen from around half in 2022 to a third this year.
TSMC’s caution has frustrated its customers.
Sam Altman, OpenAI’s boss, has urged the firm to “just build more capacity”.
In March Elon Musk, boss of Tesla and SpaceX, announced plans to build a so-called “Terafab” with the modest ambition of churning out more processing power annually than the entirety of the global semiconductor industry today.
The facility, which Mr Musk has enlisted Intel to help set up, is unlikely to start production until 2028 at the earliest, and even then at a fraction of the scale envisioned.
What is more, Mr Musk may struggle to get his hands on enough of the advanced machines he will need to operate it, which are also in short supply.
That illustrates the mismatch that now clouds the future of AI.
Improving software takes months, whereas expanding supply chains takes years. Hardware-makers are wary of over-building and being stuck with idle capacity.
The craze for “tokenmaxxing”, then, might soon be cut short.
Originally published as AI is confronting a supply-chain crunch
