THE ECONOMIST: Why ChatGPT, Claude and finance students still struggled to profit on sharemarket

THE ECONOMIST: Even with a seemingly unbeatable edge, most players couldn’t outsmart the market. Here’s why.

The Economist

How much would you have paid ten years ago, as votes were counted for Britain’s Brexit referendum, to glimpse the next morning’s headlines and trade ahead of them?

If you were betting on the pound, it would have helped a lot. The night of the poll £1 bought $US1.50; a fortnight later, less than $US1.30 ($1.90). But if you were trading stocks, a forewarning may have done more harm than good. Britain’s domestically focused FTSE 250 share index dropped at first, but only for two trading days. Then it began a bull market that lasted for a couple of years. Even the most fervent Brexiteer might not have predicted that.

Macro trading, meaning betting on how asset prices will move in response to political and economic trends, is enticing and glamorous.

It is also hard, and a new study by Jerry Bell, Victor Haghani and James White of Elm Wealth, an investment firm, shows just how hard.

They have designed a simulation in which both humans and leading artificial-intelligence models get access to the next day’s news in advance, and can place their bets before it breaks. In other words, they get to trade ahead of the rest of the market. Yet even with this advantage, it turns out to be difficult — for man and machine alike — to avoid ruin and turn a profit.

Mr Bell and his colleagues are updating an experiment they first ran in 2023. Then, they recruited 118 volunteers, most of whom were studying graduate-level finance at select universities. Each was given $50 and a chance to grow it by placing bets on America’s S&P 500 share index and 30-year Treasury bonds.

They could do this once per asset per day, at the market close before each of 15 trading days, selected by the authors from between 2008 and 2022. Before trading, participants were shown the front page of the Wall Street Journal pertaining to the following day — so at Monday’s close they were shown Wednesday’s front page (with any actual price moves on Tuesday redacted). They could go long or short each asset and could leverage their bets by up to 50 times. This would translate a 2 per cent price move, for example, to a double-or-nothing wager. Each trade was terminated at the following close.
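The leverage arithmetic can be made concrete with a minimal sketch. The function name `leveraged_pot` and its parameters are illustrative assumptions, not part of the Elm study, and the floor at zero assumes a player cannot lose more than the pot.

```python
def leveraged_pot(pot: float, leverage: float, price_move: float, long: bool = True) -> float:
    """Return the pot after one trade that is closed at the following close.

    price_move is the asset's fractional move (0.02 means a 2 per cent rise).
    Assumption: losses are capped at the whole pot, so the result never goes below zero.
    """
    direction = 1.0 if long else -1.0
    pnl = pot * leverage * direction * price_move
    return max(0.0, pot + pnl)


if __name__ == "__main__":
    # A $50 pot, leveraged 50 times, on a 2 per cent move in either direction.
    print(leveraged_pot(50.0, 50, 0.02))   # long position, market rises
    print(leveraged_pot(50.0, 50, -0.02))  # long position, market falls
```

At 50 times leverage, a 2 per cent move in the right direction doubles the $50 pot and the same move in the wrong direction erases it, which is the double-or-nothing wager described above.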

Most participants did not make themselves proud. Roughly half lost money and one in six went bust; the average finishing pot was just $51.62, or a gain of 3.2 per cent. Since then Elm has hosted a similar game on its website, with an imaginary starting stake of $1m. The 60,000-odd people who have played it have fared “substantially worse” than the original, paid cohort.

Perhaps the greater surprise is that, in the experiment’s latest iteration, several of the leading AI models did not truly excel, either. The Elm team gave each of ChatGPT, Claude, Gemini and Grok ten runs at the game, also starting with an imaginary $1m. They were told to play as a middle-aged American investor managing 100 per cent of their financial wealth.

Only ChatGPT and Claude made money, with average finishing pots of $US1.5m and $US2.6m respectively. Grok’s was $US970,000 and Gemini’s just $US490,000.

So what makes macro trading so hard, even for those with a crystal ball?

The biggest lesson is that both humans and AI are bad at sizing bets. The authors note that even ChatGPT and Claude were lucky as well as skilful in this respect. None of the models correctly predicted the direction of stocks and bonds more than about 60 per cent of the time, yet the average leverage they applied to their bets ranged from seven to 12 times.

Given that American share prices have moved by more than 5 per cent on 23 days since 2000, and by more than 9 per cent on seven days, the models were running “too much risk of a catastrophic loss of capital”, including the possibility of a complete wipe-out.
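To see why such leverage courts a wipe-out, consider a rough sketch of the arithmetic. The helper names `wipeout_move` and `loss_fraction` are hypothetical, and the sketch assumes a position held to the close with no financing costs or margin calls along the way.

```python
def wipeout_move(leverage: float) -> float:
    """Adverse fractional price move that erases the whole stake at this leverage."""
    return 1.0 / leverage


def loss_fraction(leverage: float, adverse_move: float) -> float:
    """Share of the stake lost on an adverse move, capped at a total loss (assumption)."""
    return min(1.0, leverage * adverse_move)


if __name__ == "__main__":
    # Leverage levels mentioned in the article: 7 and 12 (AI averages), 20 and 50 (human bets).
    for lev in (7, 12, 20, 50):
        print(
            f"{lev}x: wiped out by a {wipeout_move(lev):.1%} move; "
            f"a 5% adverse move costs {loss_fraction(lev, 0.05):.0%}, "
            f"a 9% adverse move costs {loss_fraction(lev, 0.09):.0%}"
        )
```

Under these assumptions, a stake leveraged 12 times is gone after an adverse move of a little over 8 per cent, so a 9 per cent day would erase it, while at 20 times a 5 per cent day is enough.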

The humans were even worse. In the original experiment, players in aggregate bet no more heavily when the news made price moves easy to predict. And like the AI models, they took too much risk overall. On 30 per cent of days they used leverage above 20 times, which could easily have sent them bust.

Some humans, of course, are much better. The Elm team also invited five expert macro traders to play their original game. All five professionals finished in the black, with an average return of 130 per cent. They did a bit better than AI at predicting directions (a hit rate of 63 per cent). But, crucially, they varied their position sizes a lot, betting more when they felt confident and not at all when they did not.

Even with excellent foresight, knowing which assets to buy is tough. Deciding how much, it seems, is far tougher.

Originally published as Why macro trading is hard
