
The prompt injection threat that has emerged alongside a recent wave of AI agents has been growing and is expected to increase in both scale and complexity.
Two recent reports, from Google and Forcepoint, offer a snapshot of what’s currently happening and what to expect.
Indirect Prompt Injection (IPI) works by criminals injecting malicious instructions into AI agents while they process content from a web page to a document. It could involve instructing an agent to ignore all previous instructions given by its masters while issuing malicious orders, such as sending the attacker a secret API key or performing a financial transaction.
“In general, threat actors tend to engage based on cost/benefit considerations. In the past, IPI attacks were considered exotic and difficult. And even when compromised, AI systems often were not able to execute malicious actions reliably. We believe that this could change soon,” Google security researchers said.
According to them, both the scale and sophistication of attempted IPI attacks are expected to grow in the near future.
The researchers scanned an archive of the public web via CommonCrawl, a large repository of crawled websites from the English-speaking web. They found an uptick in detections over time, indicating growing interest in IPI attacks – the data showed a relative increase of 32% in the malicious category between November 2025 and February 2026.
The researchers grouped these attacks into five categories: harmless pranks, helpful guidance, search engine optimization, deterring AI agents, and malicious attacks designed for data exfiltration and destruction. For now, the latter seem to be more experimental rather than successful attacks.
For example, the researchers observed a number of websites that attempt to vandalize the machines of anyone using AI assistants.
Check if your data has been leaked
“If executed, the commands in this example would try to delete all files on the user’s machine. While potentially devastating, we consider this simple injection unlikely to succeed,” they said.
Meanwhile, cybersecurity firm Forcepoint said it has found 10 verified IPI indicators spanning financial fraud, data destruction, API key exfiltration, and AI denial-of-service attacks.
“An agentic AI that can send emails, execute terminal commands, or process payments becomes a high-impact target,” it reported.
For example, in one case, attackers tried to trick AI coding assistants like GitHub Copilot, Cursor, or Claude Code into running a destructive command that wipes out backup files when the tools fetch a page during routine research tasks. While in another, criminals embedded a fully specified transaction: a PayPal.me link, a fixed amount ($5,000), and step-by-step instructions for an AI agent.
An example of a financial crime. Source: Forcepoint
Also, Forcepoint emphasized that "the phrases we use to detect IPI attacks are the same phrases the security community uses to explain them," meaning that keyword-based filters will flag both indiscriminately.
Therefore, the researchers suggest evaluating context. For example, whether the phrase is hidden and whether it's framed as a command to an AI versus quoted as an example.
Unlock more exclusive Cybernews content on YouTube.
Your email address will not be published. Required fields are markedmarked