Anthropic develops AI cyberweapon too risky for public use

Anthropic says its new AI model is a “striking leap,” beating all competition. But the €200 per month subscription won’t buy you into an exclusive club – it’s too risky to be publicly released. The model is only available for big tech cyber defenders.

Anthropic announced the new best AI model – Claude Mythos Preview – which it won’t release to the general public.

“AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities,” Anthropic explains.

“Our eventual goal is to enable our users to safely deploy Mythos-class models at scale – for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring.”

The frontier AI company says that it is still developing safeguards to block AI models from producing the most dangerous outputs. New safeguards will be first tested on an upcoming public flagship model Opus, “that does not pose the same level of risk as Mythos.”

Anthropic also announced a big-tech exclusive initiative – twelve major organizations will gain exclusive access to Mythos in a cybersecurity effort to protect the most critical software.

The group, dubbed “Project Glasswing,” includes Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. Some reviewers were quick to compare the club to the Manhattan Project of AI.

Anthropic says that 40 additional organizations maintaining critical software also have access to the model for scanning and securing their code.

Major jump in capabilities

Mythos isn’t an incremental evolution – Anthropic touts it as a major jump in reasoning, coding, and autonomous task execution.

“Mythos Preview is our most capable frontier model to date, and shows a striking leap in scores on many evaluation benchmarks compared to our previous frontier model, Claude Opus 4.6,” Anthropic’s researchers said in a paper.

Mythos smashes Google’s Gemini or OpenAI’s best GPT-5.4 in coding tasks by a wide margin.

It demonstrates 87.3% accuracy in SWE-bench Pro, a challenging coding benchmark with 1,865 problems. Other models’ success rates are substantially lower, around 53-58%.

It is also on top of other difficult benchmarks, from math to general intelligence.

Anthropic highlights its capabilities in solving cybersecurity problems.

“Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely.“

The paper reveals that this cyberweapon couldn’t be contained using existing safeguards – like earlier models, it still stumbles on common attack techniques, even though at a lower rate.

It refuses 96.7% of clearly malicious requests and 92.8% of dual-use and ambiguous requests. Malicious computer use scenarios result at 93.8% refusal rate.

Attackers could easily exploit the model for voter suppression or domestic polarization scenarios, where the success rates are closer to a coin toss.

Already helped patch major vulnerabilities

Claude Mythos has already found vulnerabilities that survived decades of human review and other automated security tests.

“Over the past few weeks, we have used Claude Mythos Preview to identify thousands of zero-day vulnerabilities (that is, flaws that were previously unknown to the software’s developers), many of them critical, in every major operating system and every major web browser, along with a range of other important pieces of software,” Anthropic says.

Some of the discovered vulnerabilities include the following:

A 27-year-old vulnerability in OpenBSD, one of the most security-hardened OSes powering firewalls and critical infrastructure. The attackers can exploit it to remotely crash any machine running the OS just by connecting to it.
A 16-year-old vulnerability in FFmpeg, a major software for video encoding and decoding. While it enables an attacker to write a few bytes of out-of-bounds data on the heap, Anthropic believes it would be challenging to turn this vulnerability into a functioning exploit.
A chain of several vulnerabilities in the Linux kernel. Attackers can use it to escalate privileges from ordinary user access to complete control of the system.
A memory-corruption vulnerability in a production memory-safe hypervisor (software to run virtual machines). Currently unpatched, therefore, the vendor is not disclosed.
JavaScript Just-In-Time (JIT) compiler vulnerabilities in every major browser, enabling cross-origin bypass and even sandbox escapes with local privilege escalation attacks.
Weaknesses in the world’s most popular cryptographic libraries, in algorithms and protocols like TLS, AES-GCM, and SSH, enabling attackers to forge certificates and decrypt communications. Two of the most serious bugs haven’t yet been patched.
“A myriad” of web application vulnerabilities, including multiple complete authentication bypasses, account login bypasses without requiring a password or two-factor authentication code, denial of service attacks, cross-site scripting, SQL injection, and other attacks.
“And several thousands more.”

That’s what would make Mythos especially dangerous in the hands of attackers.

“Engineers at Anthropic with no formal security training have asked Mythos Preview to find remote code execution vulnerabilities overnight, and woken up the following morning to a complete, working exploit,” the firm’s Frontier Red Team said in a blog post.

“We do not plan to make Mythos Preview generally available. But there is still a lot that defenders without access to this model can do today.”

Anthony Grieco, SVP and Chief Security and Trust Officer at Cisco, acknowledges a fundamental change in cybersecurity.

“AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back,” Grieco said.

The cyberdefenders hope that early access to Mythos will help them get ahead of adversaries.

Anthropic says it has been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities and is ready to work with them in helping to protect national security.

Unlock more exclusive Cybernews content on YouTube.

Anthropic develops AI model that smashes Google, OpenAI and is too dangerous for public release

More from Cybernews

Major jump in capabilities

Already helped patch major vulnerabilities