← Back to Machine Learning (Statistics)
stat.ML

How to learn online without revealing your preferences

Tommaso Cesari, Roberto Colomboni

May 27, 2026

Online learning algorithms must often pick actions (like ad placements) while protecting individual privacy via differential privacy. This work closes a theoretical gap by proving the first algorithm matching the conjectured optimal regret bound for pure-DP decision-making—splitting cost between the hardness of distinguishing close-performing actions and privacy's statistical tax. The method uses exponential-time blocks with random prefixes to convert privacy noise into learnable signal.
Published as Optimal Gap-Dependent Regret for Private Stochastic Decision-Theoretic Online Learning arXiv:2605.29148
Read the original paper →