Benchmarking medical AI vision models in Bangla
Rafid Ahmed, Intesar Tahmid, Mir Sazzat Hossain, Tasnimul Hossain Tomal, Md Fahim, Md Farhad Alam Bhuiyan
May 18, 2026
BanglaMedVQA is the first medical visual question-answering benchmark for Bangla, a language spoken by hundreds of millions of people worldwide. The dataset contains clinically validated image-question-answer pairs, which are used to evaluate leading foundation models, including GPT-4.1 mini and Gemini, as well as open-source alternatives such as Gemma-3. All tested models struggle significantly with fine-grained diagnostic reasoning in Bangla, and even the top performers fail on specialized medical questions. The work documents a critical performance gap between English and Bangla medical AI capabilities and establishes a baseline for future multilingual medical AI research.