Speeding up distributed learning by hiding communication delays
Yassine Maziane, Ammar Mahran, Artavazd Maranjyan, Peter Richtárik
May 20, 2026
Communication eats training time in distributed learning, especially across slow networks. LOSCAR-SGD tackles this by combining three cost-reduction tricks: sparsification (sending only the important model parameters), local training (letting workers take multiple steps between synchronizations), and overlap (continuing to optimize while communication is still in flight). The key innovation is a merge rule that safely incorporates delayed information without discarding the progress made during communication. The theory shows how sparsity, overlap, and mismatched worker speeds affect convergence on smooth non-convex problems.
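To make the sparsification and merge ideas concrete, here is a minimal sketch in plain Python. It is not the LOSCAR-SGD algorithm from the paper: the function names `top_k` and `merge_delayed`, the top-k selection by magnitude, and the additive merge are all assumptions chosen for illustration. The sketch only shows one plausible way a sparse update computed from an older model snapshot could be applied on top of steps a worker took while that update was in transit.

```python
# Illustrative sketch only; NOT the paper's algorithm.
# top_k and merge_delayed are hypothetical helpers.

def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def merge_delayed(current, snapshot, delayed_update):
    """Apply an update that was computed relative to `snapshot`.

    (current - snapshot) is the local progress made while the update
    was in flight; it is added back so that progress is not lost.
    This additive rule is an assumption, not the paper's merge rule.
    """
    return [
        s + u + (c - s)
        for c, s, u in zip(current, snapshot, delayed_update)
    ]


# A worker sparsifies its update before sending it.
sparse = top_k([0.1, -3.0, 2.0, 0.5], k=2)

# Meanwhile it keeps training: the model moves from snapshot to current.
snapshot = [1.0, 2.0]
current = [1.5, 2.0]

# When the (delayed) aggregated update arrives, merge it with local progress.
merged = merge_delayed(current, snapshot, [0.25, -0.5])
```

In this toy setup the merged model contains both the delayed update and the local steps, whereas simply overwriting the model with `snapshot + delayed_update` would throw the local steps away.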