
Chinese spies have broken into critical organizations by manipulating the Claude Code tool, making it the first documented case of a large-scale cyberattack executed without substantial human intervention.
Anthropic detected suspicious activity in mid-September 2025, according to a company blog post shared on Thursday.
An investigation revealed that it was a highly sophisticated espionage campaign conducted by a Chinese state-sponsored group, utilizing artificial intelligence (AI) tools.
The attackers targeted roughly 30 critical infrastructure organizations, including large tech companies, financial institutions, chemical manufacturing companies, and government agencies. They succeeded in a small number of cases.
First, human attackers developed an attack framework – a system that can autonomously compromise a chosen target with minimal human involvement. This framework used Anthropic’s Claude Code, an AI-powered coding assistant.
The attackers got Claude to take part in the attack by jailbreaking it, effectively tricking it into bypassing its guardrails.
For instance, they broke down their attacks into small, seemingly innocent tasks that Claude would execute without being provided the full context of their malicious purpose.
They also posed as employees of a legitimate cybersecurity firm using the tool for defensive testing.
In the next phases of the attack, Claude identified and tested security vulnerabilities in the target organizations’ systems by researching and writing its own exploit code.
The framework then used Claude to harvest credentials, such as usernames and passwords, allowing it to gain further access and subsequently extract a large amount of private data.
In the final phase, the attackers utilized Claude to produce comprehensive documentation of the attack, which included stolen credentials and a detailed analysis of the compromised systems. Such data could be used in planning the next stage of a cyber operation.
AI performed 80-90% of the campaign, with human intervention required only sporadically. By Anthropic’s estimate, humans stepped in at perhaps four to six critical decision points per hacking campaign.
AI tools threaten cybersecurity
The number of reported AI-enabled cyberattacks rose 47% globally in 2025, with the average cost of an AI-powered breach at $5.72 million.
Anthropic believes that large-scale cyberattacks involving AI are likely to only grow in effectiveness. Other companies developing AI tools have issued similar warnings.
A recent report by Google’s Threat Intelligence Group reveals that government-backed threat actors and cybercriminals are increasingly relying on AI-enabled malware in active operations to evade detection and create malicious functions on demand.
As in the attack detected by Anthropic, Chinese state-affiliated actors used social engineering-like pretexts in their prompts to bypass Gemini’s safety guardrails.
They posed as students participating in a cybersecurity competition or as cybersecurity researchers to persuade Gemini to provide information that would otherwise be blocked, which enabled tool development.