When do AI teams actually beat solo AI at science?

Scientists gather evidence from scattered sources: different instruments, databases, disciplines. This paper tests whether coordinating multiple AI agents across domains beats simpler single-source approaches. Testing four tasks (molecular structure, paradigm shifts, disease emergence, exoplanet detection), they found coordination helps when each discipline captures only part of the phenomenon—disease emergence hit 0.944 AUROC and exoplanet vetting 0.955 AUROC. But when one signal dominates, multi-agent coordination mainly improves interpretability and traceability, not accuracy. The work cuts through hype by showing coordination adds value only in specific regimes, not universally.