Can a model learn MIL tasks from just a few examples?
Alexander Möllers, Marvin Sextro, Julius Hense, Gabriel Dernbach, Klaus-Robert Müller
June 4, 2026
Multiple instance learning (MIL), where labels are attached to groups ("bags") of instances rather than to individual items, typically requires substantial labeled data. This work pretrains a Perceiver-based model on synthetic bag-structured data, then tests whether it can solve new MIL tasks at inference time from only a handful of labeled examples via in-context learning. The model, pretrained on a mixture of synthetic generators, outperforms supervised baselines across 12 benchmarks without gradient updates or task-specific tuning, addressing the low-label bottleneck in pathology and satellite imagery.
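To make the setup concrete, here is a minimal sketch of what a few-shot MIL task can look like. It is not the paper's model or its data generators: the bag generator, the standard MIL assumption used for labeling (a bag is positive if it contains at least one positive instance), and the max-pooling nearest-centroid predictor are all illustrative assumptions standing in for the pretrained Perceiver. The names `make_bag`, `max_pool`, and `predict_from_context` are hypothetical.

```python
import random


def make_bag(rng, n_instances, positive):
    """Build one synthetic bag of scalar instances plus its bag-level label.

    Assumption (standard MIL, not taken from the paper): a bag is positive
    if at least one of its instances is positive.
    """
    instances = [rng.gauss(0.0, 1.0) for _ in range(n_instances)]
    if positive:
        # Plant a single positive instance; its position is never labeled.
        instances[rng.randrange(n_instances)] = rng.gauss(5.0, 0.5)
    return instances, int(positive)


def max_pool(bag):
    """Summarize a bag by its largest instance value."""
    return max(bag)


def predict_from_context(context, query_bag):
    """Toy stand-in for in-context prediction.

    Uses only the labeled bags in `context` (no gradient updates):
    computes one centroid per class over max-pooled bags and returns
    the label of the centroid nearest to the pooled query bag.
    The context must contain at least one bag of each class.
    """
    centroids = {}
    for label in (0, 1):
        feats = [max_pool(bag) for bag, y in context if y == label]
        centroids[label] = sum(feats) / len(feats)
    q = max_pool(query_bag)
    return min(centroids, key=lambda y: abs(q - centroids[y]))


rng = random.Random(0)
# A handful of labeled bags acts as the "context" supplied at inference time.
context = [make_bag(rng, 10, positive=(i % 2 == 0)) for i in range(6)]
query_bag, query_label = make_bag(rng, 10, positive=True)
prediction = predict_from_context(context, query_bag)
```

The point of the sketch is the interface: only bag-level labels are available, and the predictor receives the labeled bags and the query together at inference time. In the paper, a pretrained model takes the place of the hand-written pooling rule.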