cs.CL

One in four medical AI chatbots gives dangerously wrong answers

Sunday Oyinlola Ogundoyin, Muhammad Ikram, Rahat Masood

May 20, 2026

Medical chatbots deployed on the web are giving patients false information at scale. Researchers audited 1,500 custom medical GPTs and 10 open-source models and found that a quarter to a third confidently state incorrect medical facts, while over half violate operational safety thresholds, many without disclosing how they handle private health data. The team built automated tools to detect hallucinations and policy violations, and the results indicate that cheaper, less polished models pose the highest risk. The authors also released a dataset to help the field build better safeguards.
Published as "Do No Harm? Hallucination and Actor-Level Abuse in Web-Deployed Medical Large Language Models", arXiv:2605.20591