Can a shared ultrasound template reduce the need for expert labeling?

Zhuorui Zhang, Roger Pallarès-López, Xuan Wu, Praneeth Namburi, Brian W. Anthony

Ultrasound video analysis depends on dense expert labels, but annotating every frame is expensive, and clinical images vary widely with probe angle and imaging artifacts. The authors build a cohort-scale neural atlas: a shared canonical coordinate system, trained across thousands of frames from five datasets, that learns to map individual videos into a consistent space using latent embeddings. On cardiac (EchoNet-Dynamic) and musculoskeletal datasets, the method matches or beats dense-correspondence baselines on few-shot transfer tasks while training in minutes on consumer hardware. The learned embeddings also reveal interpretable anatomical variation across patients and enable realistic frame synthesis.