stat.ML

Can preference feedback work for high-dimensional optimization?

Johanna Menn, Miriam Kober, Paul Brunzema, David Stenger, Sebastian Trimpe

June 1, 2026

Preferential Bayesian optimization learns from pairwise human feedback instead of an explicit objective function, which is useful when you can compare two options but cannot score them absolutely. Existing methods search globally and struggle in many dimensions. This work adapts trust-region local search to preference feedback, using derivatives of the probabilistic model to guide exploration. Tests on optimization benchmarks and policy-search tasks show that local methods substantially reduce cumulative regret compared to global baselines, especially when the landscape is high-dimensional or has sharp peaks.
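To make the idea of pairwise feedback concrete, here is a minimal sketch of how a comparison between two options can be scored under a latent utility. It assumes a probit link on the utility difference, a common choice in preference learning; the summary above does not say which likelihood the paper uses, and the names `preference_probability`, `log_likelihood`, and `noise_scale` are hypothetical, chosen only for illustration.

```python
import math


def preference_probability(f_a: float, f_b: float, noise_scale: float = 1.0) -> float:
    """Probability that option a is preferred over option b.

    Assumption: each latent utility is observed with independent Gaussian
    noise of standard deviation `noise_scale`, which gives a probit link
    on the utility difference. This is an illustrative model, not
    necessarily the one used in the paper.
    """
    z = (f_a - f_b) / (math.sqrt(2.0) * noise_scale)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_likelihood(utilities, duels, noise_scale: float = 1.0) -> float:
    """Log-likelihood of observed duels given candidate latent utilities.

    `utilities` maps an option index to its latent utility value.
    `duels` is a list of (winner_index, loser_index) pairs.
    """
    return sum(
        math.log(preference_probability(utilities[w], utilities[l], noise_scale))
        for w, l in duels
    )


# Hypothetical data: option 2 beat option 0 twice, option 1 beat option 0 once.
duels = [(2, 0), (2, 0), (1, 0)]
consistent = [0.0, 0.5, 1.0]    # utilities that agree with the duels
inconsistent = [1.0, 0.5, 0.0]  # utilities that contradict them
print(log_likelihood(consistent, duels) > log_likelihood(inconsistent, duels))
```

A model fitted to such duels never sees absolute scores, only which option won, so the latent utility is identified only up to shift and scale. That is why the optimizer has to rely on comparisons and on model-derived quantities rather than raw objective values.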
Published as Local Preferential Bayesian Optimization arXiv:2606.02351