cs.CL

Can machine translation bring science to Africa's languages?

Idris Abdulmumin, Tajuddeen Gwadabe, Shamsuddeen Hassan Muhammad, David Ifeoluwa Adelani, Nomonde Khalo, Ibrahim Said Ahmad, Abiodun Modupe, Anina Mumm, Sibusiso Biyela, Michelle Rabie, Johanna Havemann, Marek Rei, Jade Abbott, Vukosi Marivate

May 28, 2026

African languages are nearly absent from scientific communication, which blocks hundreds of millions of speakers from accessing research. Researchers created AfriScience-MT, a professionally translated dataset covering Amharic, Hausa, Luganda, Northern Sotho, Yorùbá, and isiZulu across physics, medicine, and other fields, with translators coining new scientific terms where none existed. GPT-4 and Gemini achieved COMET scores of 68 or higher; fine-tuned open models such as NLLB-1.3B reached 67.3. The corpus is now public.
Published as "AfriScience-MT: Towards Decolonizing Science in Africa through Text Translation", arXiv:2605.29741.