cs.LG

Learning optimal prices with almost no feedback in a changing market

Xiangyu Yang, Feng Xu, Jian-Qiang Hu, Jiaqiao Hu

May 20, 2026

Online retailers and service providers often know only the revenue from one posted price each day, not the full demand curve, and customer preferences shift over time. This work develops an algorithm that learns to set optimal prices using only that single revenue signal per period, without assuming any specific mathematical form for how demand works. A restarting mechanism periodically refreshes learning to forget stale data; when the pace of change is unknown, a meta-learning layer automatically hedges across multiple restart schedules. Experiments on synthetic and real data show the approach tracks optimal pricing even as market conditions drift.
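The restart idea can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it swaps in a generic one-point (bandit) gradient step, and the function name, step-size schedule, and parameters are all assumptions made for illustration. Each period the learner posts one perturbed price, observes one revenue number, and nudges its price; every `restart_every` periods the learning schedule resets so that old observations stop dominating.

```python
import random


def restarting_one_point_pricing(revenue, horizon, restart_every,
                                 p_min, p_max, step=0.5, delta=0.5, seed=0):
    """Hypothetical sketch: one-point gradient ascent on price with restarts.

    revenue(t, price) returns the single revenue observation for period t.
    Assumes p_max - p_min >= 2 * delta so perturbed prices stay in range.
    """
    rng = random.Random(seed)
    price = (p_min + p_max) / 2.0
    posted = []
    k = 0  # periods elapsed since the last restart
    for t in range(horizon):
        if t % restart_every == 0:
            k = 0  # restart: reset the step and perturbation schedules
        delta_k = delta / (k + 1) ** 0.25  # shrinking perturbation size
        step_k = step / (k + 1) ** 0.75    # shrinking step size
        u = rng.choice((-1.0, 1.0))        # random perturbation direction
        trial = price + delta_k * u        # the one price posted this period
        r = revenue(t, trial)              # the only feedback observed
        grad = r * u / delta_k             # one-point gradient estimate
        # Keep the base price far enough from the bounds that any
        # perturbed price remains inside [p_min, p_max].
        price = min(max(price + step_k * grad, p_min + delta), p_max - delta)
        posted.append(trial)
        k += 1
    return posted
```

Hedging across several restart lengths when the pace of change is unknown, as the summary describes, would sit on top of a routine like this; it is omitted here because the summary does not specify how the schedules are combined.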
Published as "Nonparametric Learning and Earning with One-Point Feedback under Nonstationarity", arXiv:2605.21263.