AI failed the Holocaust test

AI has stumbled on one of the hardest tests of all – retelling Holocaust survivor testimony. A new study by Cornell historian Jan Burzlaff shows how ChatGPT smoothed out suffering, omitted vital details, and risked distorting memory itself.

What first drew me to history in school was its roughness, especially the silences and the contradictions.

I walked through Auschwitz years ago and realised that no amount of text could capture the weight of that quiet.

In Ireland, Oliver Cromwell is still shorthand for brutality. And Italian unification, usually charted as a clean march, is full of betrayals and half-victories once you look closer.

That unevenness is the point. History lives in what cannot be smoothed out.

Cornell historian Jan Burzlaff recently asked whether AI could cope with that. He fed ChatGPT Holocaust survivor testimonies from archives in La Paz, Bolivia; Kraków, Poland; and Connecticut, USA, and asked it to retell them.

On paper, a simple enough exercise. In practice, the gaps said everything.

Jewish museum in Berlin. — Florian Gaertner via Getty Images

One story brought it into focus. Seven-year-old Luisa D., starving, recalled her mother cutting her finger and dripping blood into her mouth – “the faintest trace of moisture” – to keep her alive. ChatGPT omitted the moment.

For Burzlaff, that absence was telling.

“This omission alone demonstrates why human historians remain indispensable in the age of artificial intelligence,” he wrote in Rethinking History, the journal where he published his findings.

What the AI produced was smooth, coherent, readable, and wrong. Survivor accounts aren’t meant to be tidy. They falter, they contradict, they fall into silence.

The pauses matter as much as the words. Burzlaff found that ChatGPT simply ignored this. “Essentially, it ignored the extent these individuals suffered on an emotional level.”

That is the problem – AI tries to clarify, and in clarifying, distorts. Details that should jolt are softened. Contradictions that should remain are ironed out.

Burzlaff put it starkly.

“They summarize but do not listen, reproduce but do not interpret, and excel at coherence but falter at contradiction.”

The machine generates answers, but it doesn’t seem to hear.

Microsoft recently put historians on its list of professions most exposed to automation. Burzlaff doesn’t accept that.

“If historical writing can be done by a machine, then it was never historical enough.”

History, he argues, is not about stacking facts in order. It is about deciding what carries weight – the silence that cannot be filled, the contradiction that must be left standing.

A black and white photo of Auschwitz. — Culture Club via Getty Images

AI can process testimony, but it cannot decide what is sacred in it – that is the historian’s work. Without that judgment, memory becomes information that is technically accurate, but ethically hollow. Readers may not even see what’s gone missing, which is what makes it so dangerous.

Burzlaff frames Holocaust testimony as a litmus test. If AI cannot hold on here, where the suffering is beyond dispute, what chance does it have with more ambiguous histories?

The threat is not an obvious error. It is flattening – producing versions of the past that are easy to consume but emptied of their force.

“The problem we historians now face is not whether AI can recognize meaning, but whether we will continue to do so,” explains Burzlaff.

That is the line that lingers. It’s about more than Holocaust memory. It is about how we remember ourselves in an age that prizes pattern and prediction over patience and care.
