Who actually annotates NLP datasets, and why won't researchers say?

Maria Kunilovskaya, Gagan Bhatia, Lisa Sophie Albertelli, Yanran Chen, Christian Greisinger, Lotta Kiefer, Christoph Leiter, Subhadeep Roy, Tewodros Achamaleh, Muhammad Arslan Manzoor, Sebastian Pohl, Yufang Hou, Steffen Eger

NLP research depends on human-labeled data, yet papers rarely document who annotated it or how. Researchers analyzed 1,603 papers (2018–2025) using LLM-assisted extraction, validated against 41 hand-reviewed papers, and found that papers report annotator recruitment and annotation volume but routinely omit annotator expertise, compensation, language proficiency, and agreement scores, especially in model evaluation. The audit establishes a taxonomy and a minimum reporting standard to improve annotation transparency and reproducibility.