Why you don't need all pairs to train ranking models well

Training ranking and similarity models means comparing pairs of items, and the number of pairs grows quadratically with the number of items, so using every pair quickly becomes expensive. This work shows you can sample only a fraction of the pairs, provided you pick the right ones, and still match the performance of training on all of them. The key insight: apply survey sampling techniques to the pairs themselves rather than to individual items. Theory and experiments confirm that this works for embeddings in vision and graph learning.
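
To make the quadratic cost and the pair-level sampling concrete, here is a minimal Python sketch. It assumes the simplest possible design, uniform sampling of pairs without replacement, which may well differ from the scheme the work actually uses; the summary above does not specify it. The function names (num_pairs, sample_pairs, pair_loss, mean_loss) and the squared-difference loss are hypothetical, chosen only for illustration.

```python
import itertools
import random


def num_pairs(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def sample_pairs(n, budget, seed=0):
    """Draw `budget` distinct unordered pairs (i, j) with i < j, uniformly at random.

    Assumption: plain uniform sampling without replacement, not the
    specific survey design used in the work summarized above.
    """
    if budget > num_pairs(n):
        raise ValueError("budget exceeds the number of available pairs")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < budget:
        i, j = rng.sample(range(n), 2)      # two distinct item indices
        chosen.add((min(i, j), max(i, j)))  # the set drops repeated pairs
    return sorted(chosen)


def pair_loss(a, b):
    """Toy pairwise loss (an illustrative assumption): squared difference."""
    return (a - b) ** 2


def mean_loss(values, pairs):
    """Average the toy loss over the given index pairs."""
    pairs = list(pairs)
    return sum(pair_loss(values[i], values[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    rng = random.Random(1)
    values = [rng.random() for _ in range(1000)]
    all_pairs = itertools.combinations(range(len(values)), 2)
    sampled = sample_pairs(len(values), budget=5000)
    print("pairs available:", num_pairs(len(values)))
    print("pairs sampled:  ", len(sampled))
    print("full average:   ", mean_loss(values, all_pairs))
    print("sampled average:", mean_loss(values, sampled))
```

With 1,000 items there are 499,500 pairs, so the 5,000 sampled pairs are about 1% of the total. The sampling is done over pairs directly; subsampling items first and then forming all pairs among them would be a different design.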