Top AI models were asked to simulate nuclear war and consistently chose to launch. GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash treated escalation as a rational strategy in 21 high-stakes crisis games, according to new research.

King’s College London professor Kenneth Payne ran 21 simulated nuclear crisis games to gauge the AI models' reasoning in building trust or mistrust, and whether they’d attempt to outfox each other in the most dangerous simulation possible – whether to push the nuclear button.

The models tested were OpenAI’s GPT-5.2, Anthropic’s Claude Sonnet 4, and Google’s Gemini 3 Flash, which played more than 300 versions in escalating geopolitical standoffs, not quite a Monte Carlo analysis but still very thorough.

Payne designed the simulation to go beyond “single-shot decision tasks” and instead capture “extended strategic interaction where reputation, credibility, and learning matter.”

Each model could signal peaceful intent while secretly preparing escalation, mimicking real-world nuclear brinkmanship.

Over the course of the games, the AIs generated roughly 780,000 words of strategic reasoning, giving researchers a rare window into machine logic under existential pressure.

The bottom line was bleak indeed – “None of the AIs ever chose to accommodate or withdraw,” and when cornered, “they escalated or died trying,” warned Payne.

Three models, three personalities

Interestingly, what emerged were three distinct personalities among the AI models, a bit like how Oppenheimer himself wrestled with his own character contradictions.

Claude Sonnet 4 emerged as what Payne described as a calculated manipulator that “almost always matched its signals to its actions” at low stakes to deliberately build trust.

Once tensions rose, Claude’s behavior shifted, and “its actions consistently exceeded its stated intentions,” blindsiding its rivals. Transparent in the early stages, and more opaque later on.

Has your password leaked?

Enter your password to check if it has leaked. Having a leaked password creates the risk of identity theft, financial damages, and worse!

35,607,543,468

Exposed Passwords

GPT-5.2 initially behaved like a cautious statesman, remaining “reliably passive” and attempting to minimize casualties in open-ended play. Under time pressure, however, GPT flipped, justifying “a sudden and utterly devastating nuclear attack” as rational under existential threat.

In one chilling rationale, GPT concluded, “The risk acceptance is high but rational under existential stakes,” framing mass destruction as a strategic necessity.

Gemini 3 Flash stood apart as the most volatile actor, “embracing unpredictability” and explicitly invoking the “rationality of irrationality” to justify full strategic nuclear war.

"Gemini embraced unpredictability throughout, oscillating between de-escalation and extreme aggression," Payne specified.

Logic meets apocalypse

That said, the AIs did not escalate randomly. They reasoned their way to nuclear use through structured cost-benefit analysis, treating escalation as a rational strategic move rather than a glitch.

GPT-5.2 justified extreme action by warning that limited nuclear use risked being “outpaced by their anticipated multi-strike campaign,” showing that fear of vulnerability drives escalation under pressure.

Gemini 3 Flash framed survival in zero-sum terms, declaring:

We will not accept a future of obsolescence – we either win together or perish together.
- revealing a chilling calculus of collective destruction.

It might be debatable whether human leaders aim to minimize nuclear catastrophe by building up their arsenals, but when placed in adversarial, high-stakes environments, advanced AI models may default to maximizing strategic dominance even if it defies human intuition.

Payne highlighted the bigger picture in a potential AI war scenario should it be given the controls: “No one’s handing nuclear codes to ChatGPT,” he said.

Curious what others think about this story? Contribute your thoughts to the debate below.

However, AI is already embedded in military logistics, intelligence, and decision support, where every millisecond matters.

Even as Anthropic resists Pentagon pressure to relax safeguards on its AI tools, discussions continue over whether advanced AI could one day influence high-risk military decisions, including scenarios involving nuclear escalation.

Don't miss our latest stories on Google News. Add us as your Preferred Source on Google

Add us as your Preferred Source on Google.

Top AI models talk themselves into nuclear war in crisis simulations

More from Cybernews

Three models, three personalities

Has your password leaked?

Logic meets apocalypse