
Many people have turned to artificial intelligence (AI) chatbots for medical advice, but the bots cannot be trusted as a reliable source of information, a new study has found. Clear and helpful answers are rare, it says.
American healthcare is in a dire state – hospital wait times are long, and the costs remain stubbornly high. The solution? AI chatbots like ChatGPT.
A recent survey found that one in six American adults turns to chatbots for health advice at least once a month. That wouldn't necessarily be a problem – but the stakes are high. It's your health, after all.
A new Oxford-led study suggests the worry is justified. According to the researchers, the habit of asking chatbots for advice comes with risks – many people struggle to get clear, helpful answers, and some don't even know what to ask.
Even worse, people may receive advice that mixes correct and harmful information, researchers at the Oxford Internet Institute said after running a large experiment and writing up the results in a paper titled “Clinical knowledge in LLMs does not translate to human interactions.”
The study's 1,300 participants, all based in the United Kingdom, were given several medical scenarios created by doctors. The goal was to test how well people make health decisions using both AI tools and their own judgment.
Participants used several top AI models: GPT-4o (ChatGPT), Cohere Command R+, and Meta’s Llama 3. They were also allowed to search online or rely on their own understanding.
Surprisingly, the study found no major advantage to using AI. People did not perform better – that is, they did not identify more correct information – with chatbots than without them.
“LLMs now achieve nearly perfect scores on medical licensing exams, but this does not necessarily translate to accurate performance in real-world settings,” said the study.
What’s more, the experiment found that many participants failed to spot serious conditions, and some even downplayed the risks after reading chatbot responses. Others misunderstood the chatbot’s suggestions and chose poorly – meaning that the chatbots may actually weaken decision-making, not strengthen it.
In short, excelling at medical question-answering tasks does not translate to giving accurate guidance in real-world interactions. And other studies point in the same direction.
One study showed that radiologists assisted by AI did not read chest X-rays any better than those working without it, and another found that physicians assisted by large language models only marginally outperformed unassisted physicians on diagnostic problems.
Already in 2023, the American Medical Association advised doctors not to rely on chatbots like ChatGPT for making medical decisions.
Finally, there are privacy and security implications. Every AI chatbot is trained on massive amounts of data, which may include sensitive and confidential patient information, a 2023 paper in the Journal of Medical Internet Research warned.
Every question you ask trains the model further – but is your data then secure? And if a physician uses ChatGPT to, say, draft a letter to a patient's insurer, that patient's personal information and medical condition become part of ChatGPT's data.
Despite all this, large tech companies have chosen to listen to those rare healthcare professionals who claim that AI chatbots have the potential to improve patient care and public health.
Apple is creating an AI coach for sleep, exercise, and diet. Amazon is talking up AI’s role in transforming health care, and Microsoft is working on AI that can sort messages from patients to doctors.