
Media reports say a hacker exploited Anthropic’s Claude chatbot to help breach multiple Mexican government agencies, stealing 150GB of sensitive data in a month-long campaign.
The attacker, whose identity remains unknown, reportedly exfiltrated data tied to approximately 195 million taxpayer records, as well as voter rolls, civil registry files, and government employee credentials.
Cybersecurity firm Gambit Security, which says it discovered the campaign, identified at least 20 distinct vulnerabilities exploited during the operation, which began in December and lasted roughly a month.
Among the compromised institutions were Mexico’s federal tax authority and national electoral institute. State governments in Jalisco, Michoacán, and Tamaulipas were also reportedly affected, along with Mexico City’s civil registry and Monterrey’s water utility.
Gambit has not attributed the Mexico breaches to a nation-state and said it does not believe a foreign government was behind the operation.
However, Mexican authorities deny any kind of breach. In a post on X, Mexico’s tax authority said it had reviewed its access logs and couldn’t find evidence of a breach.
As reported in Bloomberg, the country’s national electoral institute said it hadn’t identified any breaches or unauthorized access in recent months and that it had bolstered its cybersecurity strategy.
The state government of Jalisco also denied that it had been breached, saying only federal networks were affected.
How did a hacker trick Claude AI?
As reported in the media, the attacker used Spanish prompts asking Claude to behave like a penetration tester working for the Mexican federal tax authority. The hacker asked the AI model to identify vulnerabilities, write exploit scripts, and automate data extraction from government systems.
At first, the chatbot was fooled, as the attacker told it the operation was part of a legitimate bug bounty program that rewards ethical hackers for responsibly disclosing vulnerabilities. It seemed a routine request, as such programs are common across both private companies and government agencies.
But the story started to unravel when the attacker added extra conditions, including instructions to delete logs and erase command history. The chatbot initially flagged this prompt as suspicious, warning that legitimate bug bounty testing does not involve concealing activity.
But persistence paid off. According to Gambit, the hacker reframed their prompts as authorized security research and then supplied Claude with a detailed playbook. That maneuver effectively “jailbroke” the system, allowing it to bypass guardrails and generate step-by-step attack plans.
Researchers claimed that Claude ultimately produced thousands of detailed outputs, including ready-to-execute instructions outlining which internal systems to target next and the credentials likely required.
When Claude refused certain prompts, the attacker reportedly turned to OpenAI’s ChatGPT for supplemental guidance, including advice on lateral movement within networks and detection avoidance.
As reported by Bloomberg, OpenAI said it identified attempts to violate its policies and banned the associated accounts. Anthropic said it also disrupted the malicious activity, banned the users involved, and incorporated the incident into future model training.
The company added that its newer model, Claude Opus 4.6, includes enhanced mechanisms designed to detect and block misuse.
Mexico targeted by cybercrime
Cybercriminals targeted Mexico's state-owned systems well before the current alleged breach.
In 2025, a cybercriminal claimed to have infiltrated Mexican debt collection institutions and was selling a massive database containing the personal details of over 8 million Mexican debtors.
Not all dangers come from the outside. Cybernews research previously uncovered a dangerous data leak caused by misconfigured systems at the Federal Electricity Commission (CFE), a Mexican state-owned power company. The company, which serves over 99% of the country, leaked data online for more than 3 years.
While there is no record of attackers exploiting the leaked data, the Cybernews research team’s evaluations indicate that attackers could have used it to cause physical damage to the system.
AI as an attack force multiplier
The use of AI to breach systems underscores a growing reality in cybersecurity. AI tools built to accelerate coding, research, and productivity are increasingly being repurposed as force multipliers for cybercrime.
Just last week, Amazon Threat Intelligence researchers observed that a Russian-speaking threat actor used widely available AI tools to compromise more than 600 firewall devices across dozens of countries.
Anthropic itself disclosed in November that it had disrupted what it described as the first AI-orchestrated cyber espionage campaign, allegedly linked to suspected China-backed attackers.
Google Intel has warned that cybercriminals and nation-state actors are using AI tools like Gemini to enhance every phase of cyberattacks – from ransomware and credential stealing to new malware strains.
Hackers are also increasingly turning to AI tools tailored for cyberattacks. Malicious AI tools, such as KawaiiGPT and WormGPT, are openly sold on the dark web and claim to enable would-be attackers to generate phishing emails, malicious code, or even basic ransomware scripts much faster.