How many training samples do classification algorithms actually need?
Meysam Alishahi, Alexander Munteanu, Simon Omlor, Jeff M. Phillips
May 22, 2026
How many data points must you sample to train a classifier reliably? This work settles the question for logistic, hinge, and ReLU losses with various regularizers, proving tight dimension-free bounds. The authors show that L₂ regularization needs on the order of k²/ε² samples (where k is the parameter count), while L₁ regularization requires only about k/ε². For certain loss functions, the bound drops to linear in k. The key insight is that a refined moment analysis avoids the loose over-counting built into standard sensitivity sampling frameworks, improving on prior cubic bounds.
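To get a feel for the gap between the two rates, the following minimal sketch evaluates k²/ε² and k/ε² for one setting. It is only an illustration of the scaling: constants and logarithmic factors are ignored, the function name `sample_size` and the values of `k` and `eps` are hypothetical, and `eps` is assumed to be the target approximation error.

```python
import math


def sample_size(k: int, eps: float, power: int) -> int:
    """Order-of-magnitude sample count k**power / eps**2.

    Hypothetical helper: constants and log factors are dropped, so the
    result shows scaling only, not an exact requirement.
    """
    if k <= 0 or not (0.0 < eps < 1.0):
        raise ValueError("need k > 0 and 0 < eps < 1")
    return math.ceil(k**power / eps**2)


# Illustrative values, not taken from the paper.
k, eps = 20, 0.1

l2_samples = sample_size(k, eps, power=2)  # k^2 / eps^2 rate quoted for L2
l1_samples = sample_size(k, eps, power=1)  # k / eps^2 rate quoted for L1

print(f"L2-type rate: {l2_samples} samples")
print(f"L1-type rate: {l1_samples} samples")
print(f"ratio: {l2_samples / l1_samples:.0f}")
```

Under these assumptions the ratio between the two counts is k itself, so the difference between the quadratic and linear rates grows with the number of parameters.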