cs.CL

A map of 2 million scientific breakthroughs and their dependencies

Peter A. Jansen

May 14, 2026

Scientific progress builds incrementally: each discovery enables others. This work extracts detailed scientific contributions from 230,000 open-access papers in AI/NLP and links each one to its prerequisites, forming the Scientific Contribution Graph. It also introduces a prediction task: given a set of existing technologies, which discoveries might they enable? Under temporally filtered backtesting, current models reach a mean average precision (MAP) of 0.48, showing steady improvement. The dataset and task could support research impact assessment and help accelerate scientific discovery.
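To make the evaluation concrete, here is a minimal sketch of how MAP under a temporal cutoff is commonly computed. It is not the paper's evaluation code: the function names, the `(prerequisite, contribution, year)` edge format, and the toy data are all assumptions made for illustration, and the paper's exact protocol may differ.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, int]  # (prerequisite, contribution, year): hypothetical format


def temporal_split(edges: List[Edge], cutoff_year: int) -> Tuple[List[Edge], List[Edge]]:
    """Split edges into those visible at the cutoff and those held out as future targets."""
    known = [e for e in edges if e[2] <= cutoff_year]
    future = [e for e in edges if e[2] > cutoff_year]
    return known, future


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]) -> float:
    """Mean of per-query average precision; a query with no ranking scores 0."""
    scores = [average_precision(rankings.get(q, []), rel) for q, rel in gold.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (invented): edges dated after the cutoff become prediction targets.
edges = [
    ("attention", "transformer", 2017),
    ("transformer", "bert", 2018),
    ("transformer", "gpt", 2018),
    ("bert", "roberta", 2019),
]
known, future = temporal_split(edges, cutoff_year=2017)

# Gold answers: for each prerequisite, the contributions it enabled after the cutoff.
gold: Dict[str, Set[str]] = {}
for prereq, contribution, _year in future:
    gold.setdefault(prereq, set()).add(contribution)

# Hypothetical model output: a ranked list of candidate contributions per prerequisite.
rankings = {
    "transformer": ["bert", "lstm", "gpt"],
    "bert": ["elmo", "roberta"],
}
map_score = mean_average_precision(rankings, gold)
print(f"MAP = {map_score:.3f}")
```

The temporal split is what makes this a backtest: the model may only see edges dated at or before the cutoff, and it is scored on how highly it ranks contributions that actually appeared afterward.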
Published as "The Scientific Contribution Graph: Automated Literature-based Technological Roadmapping at Scale", arXiv:2605.15011