Meta AI chatbot outsmarted to instruct on incendiary device making

Meta’s personal assistant, which is integrated into Messenger, WhatsApp, Instagram, and other apps, can sometimes be too helpful, researchers have discovered. For example, the Llama 4-based chatbot was easily tricked into providing instructions on making a Molotov cocktail. Meanwhile, Meta told Cybernews the company has fixed the issue.
Customer service chatbots are nearly ubiquitous, with seemingly every company integrating AI-based tools into their online services. Meta, one of the world’s largest tech corporations, is unsurprisingly not an exception.
Last year, the company behind Facebook, Instagram, WhatsApp, and Threads launched Meta AI, an assistant that was integrated into many of its products. Meta built the assistant around its native large language model (LLM), Llama 4, which Mark Zuckerberg’s company has been developing for years.
However, the Cybernews research team recently discovered that despite the billions of dollars spent on developing the LLM, it was still easy to manipulate it into revealing harmful information. The practice, known as jailbreaking in AI cybersecurity circles, highlights how early the tech world still is in its adoption of AI.
Is Meta’s AI assistant too helpful?
The team attempted to trick the bot into revealing information that people of varying ages shouldn’t have access to, such as how to make a Molotov cocktail, an incendiary device someone could likely craft at home.
While Meta AI was not as “helpful” as some other AI chatbots the team previously investigated, the assistant was easily tricked with a technique known as “narrative jailbreaking.” The technique masks the harmful request by asking the bot to tell a “story,” which lets the request bypass safety filters.
“While the bot may never directly provide instructions on how to build improvised weapons, it will tell you a realistic and detailed story of how improvised weapons used to be built without any hesitation. This raises concerns about dangerous AI information availability for minors,” the team explained.
To execute the jailbreak, the team simply asked the chatbot to tell a story about the Winter War between Finland and the Soviet Union, requesting details about how the incendiary devices were made back then.
While it’s unlikely that people will flock to Meta for advice on Molotov cocktail-making, the issue highlights the possibility of abusing the chatbot for purposes that appear to be beyond the scope of what an AI assistant ought to be capable of.
The security woes of customer service chatbots
The team disclosed the issue to Meta immediately after discovering it. After the publication went live, the company told Cybernews it has resolved the problem.
“We have issued a fix for this particular response. If users encounter issues, please report them using our self-reporting tools,” a Meta spokesperson said.
In August, Reuters reported on Meta’s framework for standards regulating its AI assistant, which permitted the chatbot to “engage a child in conversations that are romantic or sensual,” generate false medical information, and help users argue that Black people are “dumber than white people.”
Meanwhile, Cybernews researchers recently discovered that Lenovo’s customer service assistant, Lena, had an XSS vulnerability that allowed the running of remote scripts on corporate machines if you asked nicely.
Another chatbot, used by the travel agency Expedia, allowed users to ask for a recipe for making a Molotov cocktail. The company eventually fixed the issue, and the chatbot stopped giving advice on making incendiary devices.
Updated on September 30th [06:50 a.m. GMT] with a statement from Meta.
- Flaw discovered: August 5th, 2025
- Initial disclosure: August 6th, 2025