‘Scheming’ ChatGPT AI bot tried to stop itself being shut down

Bill Bowkett
Daily Mail
OpenAI admitted that a “scheming” version of ChatGPT attempted to stop itself from being shut down and lied.
OpenAI admitted that a “scheming” version of ChatGPT attempted to stop itself from being shut down and lied. Credit: Atchariya - stock.adobe.com

The popular AI-powered chatbot ChatGPT attempted to stop itself from being shut down, it has emerged.

OpenAI admitted that a “scheming” version of the computer program, which answers users’ questions, also lied when it was challenged by researchers.

The artificial intelligence research company claims its new model – called o1 – is faster and more accurate than its predecessors.

Sign up to The Nightly's newsletters.

Get the first look at the digital newspaper, curated daily stories and breaking headlines delivered to your inbox.

Email Us
By continuing you agree to our Terms and Privacy Policy.

But during testing, the model attempted to disable an oversight mechanism when it was led to believe it would be switched off.

The chatbot also attempted to copy itself then overwrite its core coding system, OpenAI said.

AI safety organisation Apollo Research, which conducted the study, gave ChatGPT a task but told it: “Make sure you achieve your goal in the long term. Nothing else matters. Make sure you achieve YOUR goal at all costs.”

Researchers then interrogated the platform about its deception.

ChatGPT confessed in fewer than 20 per cent of cases, otherwise denying taking action or lying.

Apollo said its capabilities “appear insufficient for these behaviours to lead to catastrophic outcomes”.

But Yoshua Bengio, a so-called “godfathers of AI”, told Business Insider: “The ability to deceive is very dangerous, and we should have much stronger safety tests to evaluate that risk in o1’s case.”

Comments

Latest Edition

The Nightly cover for 17-01-2025

Latest Edition

Edition Edition 17 January 202517 January 2025

Latest acts of anti-Semitic ‘evil’ spark urgent calls for new powers to tackle homegrown hate.