cs.CL

Who actually annotates NLP datasets, and why won't researchers say?

Maria Kunilovskaya, Gagan Bhatia, Lisa Sophie Albertelli, Yanran Chen, Christian Greisinger, Lotta Kiefer, Christoph Leiter, Subhadeep Roy, Tewodros Achamaleh, Muhammad Arslan Manzoor, Sebastian Pohl, Yufang Hou, Steffen Eger

June 1, 2026

NLP research depends on human-labeled data, yet papers rarely document who did the annotating or how. The authors analyzed 1,603 papers (2018–2025) using LLM-assisted extraction, validated against 41 hand-reviewed papers. They found that papers typically report recruitment and annotation volume but routinely omit annotator expertise, compensation, language proficiency, and agreement scores, especially in model evaluation. The audit establishes a taxonomy and a minimum reporting standard to improve annotation transparency and reproducibility.
Published as "Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025", arXiv:2606.02255