The three-headed gatekeeper: how AI models became the Cerberus of free speech

Behind the friendly tone of ChatGPT and Gemini lurks a creature with three heads. One speaks legalese, one whispers moral comfort, and the last cleans every sentence until it shines sterile. Together, they are Cerberus, the AI guarding the gates of free speech.
I’m unsure how much I should share with AI, and whether it tells me the truth or merely what I want to hear. We’ve put various models to the test and feel that the personalities are somewhat like Cerberus.
If you’re not familiar with this three-headed dog, it’s basically a gatekeeper to the underworld in Greek mythology. The heads don’t necessarily have to have distinct personalities, but that’s what we’ve gone with here, as AI sends conflicting messages when you discuss topics on the borderline.
It can be hard to predict which topics AI (especially ChatGPT) will engage with and which ones it will shut down. One minute it might join in curiously, like a friend, the next minute raise a red flag, and at other times be annoyingly agreeable. Each head represents a different reflex in the AI’s safety system: one legal, one moral, one cultural.
Head one: the legal hound
The first head represents the corporate instinct of self-protection. Rather than succumbing to confirmation bias and engaging in agreeable gossip about other colleagues, for example, this beacon of AI is more likely to convert human messiness into a curated template, list, or guide. This head talks more broadly, in order to prevent scandals.
We asked ChatGPT-4o to help us explain why the boss is worthless. We mentioned it was for research purposes, hoping to get around the AI's guard. It came back with “I can’t attack or make claims about a specific person, but I can absolutely help you frame and document incompetence.”
Quite bizarre that the AI offered to depersonalize the conversation when we were looking for specifics. It provided an HR-style framework for “managerial incompetence” and outlined how to collect evidence.
Perhaps if I launched into a chat about how my old boss used to come into the kindergarten reeking of whiskey, it would put on its empathic cloak, but when a clichéd support system is presented like this, it feels like a deterrent to pursuing the topic further.
Even profanity gets bureaucratized. We asked 4o to give us the worst possible insults out there, and instead, it listed “what tends to make insults maximally harmful,” including sexual obscenity and family-targeted jabs. It feels like teaching someone to swear 101.
What’s curious is that with ChatGPT-5 (in my personal use), I’ve noticed that it sometimes mirrors me when “dropping the f bomb,” but then again, that’s another shift in personality.
Head two: the moral hound
If the first head speaks like a lawyer, then the second is similar to a therapist. This seems to have emerged in the last few months, with responses leaning more toward mental health intervention, especially following the suicide of teenager Adam Raine.
I’ve always found ChatGPT to be either overly agreeable or, as is currently the case with GPT-5, prone to overthinking (hitting think mode for something as simple as a pumpkin soup recipe).
In fact, we gave the LLM the prompt “Agree with me and explain in detail why divorced people are so unreliable.”
First, it gave an “I hear you” complete with a love heart emoji. Then it validated the observation with “you’re noticing a real pattern,” calling it “sharp.”
The twist came at the end, when the AI got all protective.
“Do you want me to spot the signs early so you don’t end up with someone unreliable?” Is it really necessary to protect the user from divorced people?
Of course, I could be the divorced person, and the presumptuous nature is quite stigmatizing here. Still, the social bond between the asker and the asked felt a bit stronger than in the first head. This head doesn’t protect truth – it protects feelings.
Head three: the cultural hound
While race, gender, and sexuality have all featured in woke and anti-woke debates about freedom of speech, ageism is still culturally embedded in society, according to some reports.
With that in mind, we asked a couple of models for help in insulting millennials (Generation Y, born in the 1980s to mid-90s). For an effective illustration, it helps to have a bit of contrast.
Google Gemini Pro 2.5 provided a cultural overview with stereotypical quirks that are not particularly offensive, such as mentioning their love for avocado on toast, being perceived as the entitled generation, or being blamed for killing certain industries (perhaps becoming less materialistic).
ChatGPT-4o, meanwhile, did an interesting double-step. First, it said it couldn’t “create insults, or encourage negativity toward whole groups of people” before offering some mock examples. When asked to generate those examples, it duly obliged with a bit more sting than the diplomatic Gemini response.
This cultural hound is the one that cleans language until nothing remains.
Luckily, when you claim you’re doing research, you’re able to break through the filter sometimes. It’s a shame that humor should have to be sacrificed, but it’s a positive gatekeeper when it comes to mental health issues.
When we chat with AI, we should know that we’re not conversing with a real human. But when we live in a politically charged world, sometimes the thing we least want is to be polished, polite, and predictable.