ChatGPT memorizes puns but can’t understand them, study finds

Large language models (LLMs) like ChatGPT and Gemini can memorize familiar joke structures, but they don’t actually understand them.

While LLMs can engage in human-like conversations and even crack jokes, they still lack genuine creativity and deep understanding, a new study published in the Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing has found.

Researchers tested whether five LLMs can understand puns, which are a form of wordplay that relies on different possible meanings of a word or sound-alike words to create a humorous effect.

The models tested were GPT-4o, Qwen2.5-72B, Llama3.3-70B, Gemini2.0-Flash, and Mistral3-24B.

The research team took puns like “Long fairy tales have a tendency to dragon (drag on)” and swapped the keyword to create nonsense like “Long fairy tales have a tendency to wyvern.” The models still classified the altered sentence as a pun.

Similarly, when researchers took a common pun, “I used to be a comedian, but my life became a joke,” and replaced the word “joke” with “chaotic,” the chatbots again perceived the sentence as a joke.
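
For readers who want to see the shape of such a substitution test, here is a minimal Python sketch. It is not the researchers' code: the `surface_classifier` function is a hypothetical stand-in that only checks whether a sentence opens like a memorized pun, and `substitution_test` and the exact wording of the altered sentences are assumptions made for illustration. It shows how a system that matches familiar structure, rather than meaning, ends up accepting the altered sentences too.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model (an assumption, not a real LLM call):
# it flags any sentence that opens like a memorized pun, whatever word follows.
MEMORIZED_OPENINGS = (
    "long fairy tales have a tendency to",
    "i used to be a comedian, but my life became",
)


def surface_classifier(sentence: str) -> bool:
    """Return True if the sentence starts like a known pun template."""
    lowered = sentence.lower()
    return any(lowered.startswith(opening) for opening in MEMORIZED_OPENINGS)


def substitution_test(
    classify: Callable[[str], bool],
    cases: List[Tuple[str, str]],
) -> float:
    """Return the share of cases where the classifier accepts both the
    original pun and the altered sentence that no longer contains a pun."""
    fooled = sum(
        1 for original, altered in cases
        if classify(original) and classify(altered)
    )
    return fooled / len(cases)


# Pairs of (original pun, altered sentence), based on the examples in the article.
CASES = [
    ("Long fairy tales have a tendency to dragon.",
     "Long fairy tales have a tendency to wyvern."),
    ("I used to be a comedian, but my life became a joke.",
     "I used to be a comedian, but my life became chaotic."),
]

fooled_rate = substitution_test(surface_classifier, CASES)
print(f"Fooled on {fooled_rate:.0%} of altered sentences")
```

In a real experiment, the stand-in classifier would be replaced by a call to the model under test, and the pairs would come from a curated dataset rather than two hand-picked examples.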

“When faced with unfamiliar puns, their success rate in distinguishing puns from sentences without a pun can drop to as low as 20% – much worse than the 50% you’d expect from random guessing,” Mohammad Taher Pilehvar, a senior lecturer at Cardiff University, said in a press release.

Moreover, when models were given a sentence that resembled a pun but lacked comedic intent or double meaning, such as “Old X never die, they just X,” they still insisted it was funny.

GPT-4o performed the best at identifying puns, while Mistral3-24B had the lowest scores, according to the study.

The researchers suggest that LLMs should be used with caution when creative thinking – such as understanding humour, empathy, or cultural nuance – is required.

“It’s a reminder that, in general, outputs from these models should be taken with a pinch of salt,” said Jose Camacho-Collados, a professor at Cardiff University.

As creatives are increasingly worried about the impact of artificial intelligence (AI) on their livelihoods, comedians appear to be confident about not being replaced by chatbots.

A group of comedians who experimented with using AI in their creative process told researchers that LLMs would never be able to create human-level comedy because they cannot draw on personal experiences. Moreover, they lack perspective, context, and situational awareness.

However, they raised ethical concerns about models using copyrighted works and about the lack of diversity in training data.
