cs.AI

When do AI teams actually beat solo AI at science?

Fiona Y. Wong, Markus J. Buehler

May 21, 2026

Scientists gather evidence from scattered sources: different instruments, databases, and disciplines. This paper tests whether coordinating multiple AI agents across domains beats simpler single-source approaches. Across four tasks (molecular structure, paradigm shifts, disease emergence, and exoplanet detection), the authors found that coordination helps when each discipline captures only part of the phenomenon: disease emergence reached 0.944 AUROC and exoplanet vetting reached 0.955 AUROC. When one signal dominates, however, multi-agent coordination mainly improves interpretability and traceability, not accuracy. The results indicate that coordination adds value only in specific regimes, not universally.
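
To make the two regimes concrete, here is a minimal toy sketch, not the paper's method or data. It assumes two hypothetical domain signals, `a` and `b`, and uses an equal-weight sum as a stand-in for coordination. AUROC is computed as the fraction of positive/negative pairs ranked correctly, with ties counted as half.

```python
from itertools import product


def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: every combination of two domain signals in {0, 1, 2}.
samples = list(product(range(3), repeat=2))
a = [x for x, _ in samples]          # signal seen by domain A only
fused = [x + y for x, y in samples]  # assumed stand-in for coordination: equal-weight sum

# Partial regime: the outcome depends on both signals together.
partial_labels = [1 if x + y >= 3 else 0 for x, y in samples]
# Dominant regime: the outcome depends on signal a alone.
dominant_labels = [1 if x == 2 else 0 for x, _ in samples]

partial_single = auroc(a, partial_labels)
partial_fused = auroc(fused, partial_labels)
dominant_single = auroc(a, dominant_labels)
dominant_fused = auroc(fused, dominant_labels)

print(f"partial regime:  single={partial_single:.3f}  fused={partial_fused:.3f}")
print(f"dominant regime: single={dominant_single:.3f}  fused={dominant_fused:.3f}")
# partial regime:  single=0.833  fused=1.000
# dominant regime: single=1.000  fused=0.861
```

In this toy setup, fusing the signals raises AUROC only when each one carries part of the answer. When `a` alone determines the label, naive fusion adds no accuracy and can even dilute it.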
Published as "Cross-domain benchmarks reveal when coordinated AI agents improve scientific inference from partial evidence", arXiv:2605.22300