
Researchers have concluded that US President Donald Trump’s political speeches – not easy for any human, to be honest – are also difficult to grasp for AI systems like ChatGPT.
Trump’s speeches, while sounding like mad ramblings to some, are actually rich in figurative political language. So much so that researchers decided to use them as a testing ground for the capabilities and limitations of large language models (LLMs).
It turns out that systems like OpenAI’s ChatGPT do struggle to understand figurative language in a political context.
The study, authored by Haohan Men, Xiaoyu Li, and Jinhua Sun from the College of International Studies, National University of Defense Technology, Nanjing, China, is titled “Large language models prompt engineering as a method for embodied cognitive linguistic representation: a case study of political metaphors in Trump’s discourse.”
First things first, though. LLMs are programs trained to understand and generate human language.
By analyzing vast amounts of text and learning patterns in how words and sentences are used, the models can write essays, summarize documents, answer queries, and, perhaps most famously, hold conversations that feel natural.
But, of course, the LLMs don’t really understand language the way we do: they instead rely on pattern recognition to guess what words are likely to follow one another.
Oftentimes, this is convincing. But the models can misinterpret meaning, especially when language is abstract or emotionally charged, and here’s where Trump and his speeches come in.
To test how well an LLM can detect metaphors in political speech, the researchers selected four of Trump’s speeches from mid-2024 to early 2025.
These included his Republican nomination acceptance speech after surviving an assassination attempt, his post-election victory remarks, his inaugural address, and his speech to Congress. They total over 28,000 words.
“Spanning over a year of Trump’s presidential campaign, these speeches cover a variety of occasions and thematic content, offering a comprehensive portrayal of his rhetorical style. They contain numerous vivid, flexible, and highly inflammatory political metaphors, providing ample primary material for this study,” say the researchers.
They prompted the ChatGPT-4 model to go through a step-by-step process: understand the context of the speech, identify potential metaphors, categorize them by theme, and explain their likely emotional or ideological impact.
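For readers curious what such a step-by-step prompt might look like, here is a minimal sketch. It only assembles the prompt text and does not call any model. The step wording, the function name `build_metaphor_prompt`, and the placeholder context are assumptions made for illustration, not the researchers’ actual prompt.

```python
# Hypothetical step descriptions paraphrasing the four stages described above.
# The study's real prompt wording is not given in this article.
STEPS = [
    "Understand the context of the speech.",
    "Identify potential metaphors in the sentence.",
    "Categorize each metaphor by theme.",
    "Explain its likely emotional or ideological impact.",
]


def build_metaphor_prompt(sentence: str, speech_context: str) -> str:
    """Assemble one step-by-step prompt for a single sentence."""
    # Number the steps so the model is asked to work through them in order.
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(STEPS, start=1))
    return (
        f"Speech context: {speech_context}\n"
        f"Sentence: {sentence}\n"
        "Work through the following steps in order:\n"
        f"{numbered}"
    )


# Placeholder context; the phrase is one of the examples quoted in the article.
prompt = build_metaphor_prompt("a series of bold promises", "political speech")
print(prompt)
```

Splitting the task into numbered stages like this is one common way to nudge a model to consider context before it labels anything as a metaphor.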
To be fair, out of 138 sampled sentences, the system correctly identified 119 metaphorical expressions. That’s an accuracy rate of around 86%.
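The percentage follows directly from the two figures quoted above. The short check below uses hypothetical variable names and treats the rate, as the article does, as correct identifications divided by sampled sentences.

```python
# Figures quoted in the article; variable names are illustrative.
sampled_sentences = 138
correct_identifications = 119

# Share of sampled sentences handled correctly.
accuracy = correct_identifications / sampled_sentences
print(f"{accuracy:.1%}")  # prints 86.2%
```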
However, a closer look revealed several recurring problems in the model’s reasoning. These issues provide insight into the limitations of AI when it tries to interpret complex human communication.
The AI system confused metaphors with other forms of expression, such as similes. For example, the model classified the phrase “Washington D.C., which is a horrible killing field” as metaphorical, when it is more accurately described as a literal, emotionally charged comparison.
The model also tended to overanalyze simple expressions. In one case, it flagged the phrase “a series of bold promises” as metaphorical, interpreting it as a spatial metaphor when no such figurative meaning was intended.
Finally, ChatGPT-4 also struggled to correctly classify names and technical terms. For instance, it treated “Iron Dome,” the name of Israel’s missile defense system, as a metaphor instead of a proper noun.
The researchers conclude that these missteps show that, while LLMs can detect surface-level patterns, they often lack the ability to understand meaning in context.
“While LLMs offer significant strengths in analyzing political contexts and the socio-cultural dimensions of metaphors, challenges remain in distinguishing metaphors from similes, ensuring consistency, and addressing complex referential contexts,” the study says.