stat.ML

Why you don't need all pairs to train ranking models well

Louise Davy, Stephan Clémençon, Charlotte Laclau

June 1, 2026

Training ranking and similarity models means comparing pairs of items, so the cost grows quadratically with the number of items and quickly becomes prohibitive on large datasets. This work shows that a well-chosen fraction of the pairs is enough to match the performance obtained with all of them. The key idea is to sample the pairs themselves, rather than individual items, using techniques from survey sampling. Theory and experiments support the approach for embedding learning in vision and on graphs.
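To make the idea of estimating a pairwise loss from a subset of pairs concrete, here is a minimal sketch. It is not the paper's method: it draws pairs uniformly at random with replacement, which is only the simplest possible design, whereas the paper studies survey sampling schemes. The hinge loss and the function names (hinge_pair_loss, full_pairwise_loss, sampled_pairwise_loss) are assumptions chosen for illustration.

```python
import random


def hinge_pair_loss(score_i, score_j, label_i, label_j):
    """Pairwise hinge loss (illustrative choice, not taken from the paper).

    Penalizes a pair when the item with the higher label is not scored
    at least 1 above the other item. Pairs with equal labels cost 0.
    """
    if label_i == label_j:
        return 0.0
    sign = 1.0 if label_i > label_j else -1.0
    return max(0.0, 1.0 - sign * (score_i - score_j))


def full_pairwise_loss(scores, labels):
    """Average loss over all n*(n-1)/2 unordered pairs: quadratic cost."""
    n = len(scores)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += hinge_pair_loss(scores[i], scores[j], labels[i], labels[j])
    return total / (n * (n - 1) / 2)


def sampled_pairwise_loss(scores, labels, num_pairs, rng):
    """Average loss over num_pairs pairs drawn uniformly with replacement.

    Assumption: uniform sampling stands in for the survey sampling
    designs discussed in the paper. Cost is linear in num_pairs.
    """
    n = len(scores)
    total = 0.0
    for _ in range(num_pairs):
        i, j = rng.sample(range(n), 2)  # two distinct item indices
        total += hinge_pair_loss(scores[i], scores[j], labels[i], labels[j])
    return total / num_pairs


if __name__ == "__main__":
    rng = random.Random(0)
    n = 300
    labels = [rng.randint(0, 1) for _ in range(n)]
    # Synthetic scores that are loosely correlated with the labels.
    scores = [label + rng.gauss(0.0, 1.0) for label in labels]

    print("all pairs    :", full_pairwise_loss(scores, labels))
    print("2000 sampled :", sampled_pairwise_loss(scores, labels, 2000, rng))
```

Because each pair is drawn with equal probability and the loss is symmetric in its two items, the sampled average is an unbiased estimate of the all-pairs average. In this example it evaluates 2000 pairs instead of 44,850.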
Published as "Doing well with less! On Sampling Techniques for Empirical Pairwise Loss Estimation/Minimization", arXiv:2606.02345