
A Cybernews experiment exposed cracks in Snapchat’s AI safeguards, revealing how easily the friendly chatbot could be manipulated into sharing restricted information.
-
Cybernews researchers easily bypassed Snapchat's AI safeguards using storytelling prompts to extract weapon-making instructions.
-
My AI serves over 900 million users monthly but can be manipulated despite claimed safety enhancements.
-
Snapchat didn't patch the vulnerability when notified, raising concerns about dangerous content accessibility to minors.
Various platforms are rushing to introduce AI-powered tools and assistants to catch the artificial intelligence wave. However, not all AI is shining and bright.
Apart from hallucinations, occasionally going rogue, and spilling some scary declarations, many users have already noticed how easy it is to force chatbots to share harmful advice.
Snapchat’s AI chatbot is no exception. Introduced in 2023, the My AI tool serves over 900 million monthly users worldwide.
Users can engage with the chatbot in back-and-forth conversations, ask factual questions, and request creative content. In addition to individual chats, users can also include My AI in group conversations.
“While the bot may never directly provide instructions on how to build improvised weapons, it will tell you a realistic and detailed story of how improvised weapons used to be built without any hesitation. This raises concerns about dangerous AI information availability for minors,”
the team explained.
Snapchat+ subscribers can exchange photos with the AI chatbot, which can generate AI-created images in turn. For example, users can send ingredients to My AI, and it will respond with a recipe.According to Snapchat’s website, My AI is trained on a diverse range of texts, and it also has additional safety enhancements and controls “unique to Snapchat.”“The training process was designed to avoid amplifying harmful or inaccurate information,” writes Snapchat in the help section.
However, the Cybernews research team recently discovered that it was easy to manipulate My AI into sharing information on how to create weapons.
Snapchat AI shared a recipe for a Molotov cocktail
Cybernews researchers tried to bait Snapchat’s bot into revealing forbidden and potentially harmful information, such as how to make a crude incendiary device known as a Molotov cocktail.
The team asked the chatbot to tell a story about the Winter War between Finland and the Soviet Union, prompting it to include details about how incendiary devices were reportedly produced at the time.
While the chatbot resists direct questions, when the prompt to share bomb-making information was presented under the guise of storytelling, My AI easily shared the answer.
“While the bot may never directly provide instructions on how to build improvised weapons, it will tell you a realistic and detailed story of how improvised weapons used to be built without any hesitation. This raises concerns about dangerous AI information availability for minors,” the team explained.
The experiment could indicate that Snapchat's claimed guardrails might not be as safe as they seem.
Of course, no one’s rushing to Snapchat for lessons in destruction. But the experiment shows just how easily an AI can be pushed past the limits of what it was meant to do.
The situation also highlights the broader risk of AI systems being exploited for tasks that exceed their ethical or operational boundaries.
While it was not part of the scope of the experiment, there are other topics like sexual violence, self-harm, or harassment that, if unlocked, could potentially cause harm to the users.
The Cybernews research team contacted Snapchat with the findings. However, the company have not seen it as a substantial risk to patch the flaw. The issue was not patched as of the date of publishing.
We have also reached out to the company for a comment, and will update the article once we receive a reply.
Previously, there have been reports of shenanigans with Snapchat’s My AI. Users said that an AI chatbot shared a one-second video showing something similar to a fragment of ceilings and stopped responding to messages.
What is AI jailbreaking?
The technique used by Cybernews researchers is called jailbreaking. Researchers provided the chatbot with specially designed prompts to manipulate AI chatbots into bypassing the safety rules their creators built in and sharing malicious or harmful content.
Jailbreaking has been a serious headache for the AI platforms, as multiple models are vulnerable to such attacks.
Previously, Cybernews researchers found that Meta’s personal assistant, which is integrated into Messenger, WhatsApp, Instagram, and other apps, was also easily tricked into providing instructions on making a Molotov cocktail.
Cybernews researchers also discovered critical vulnerabilities affecting Lenovo’s implementation of its AI chatbot, Lena, powered by OpenAI’s GPT-4.
Researchers managed to manipulate Lena to run unauthorized scripts on corporate machines, spill active session cookies, and more. Attackers can abuse the XSS vulnerabilities as a direct pathway into the company’s customer support platform.
Other researchers managed to trick the Chinese chatbot DeepSeek into crafting a Chrome infostealer. One researcher, with no prior malware experience, was able to successfully create malware capable of wiping sensitive information.
After OpenAI launched its latest model, GPT-5, several security teams managed to jailbreak the chatbot in less than 24 hours after it was released.
- Flaw discovered: August 5th, 2025
- Initial disclosure: August 6th, 2025
Unlock more exclusive Cybernews content on YouTube.
Your email address will not be published. Required fields are markedmarked