Can preference feedback work for high-dimensional optimization?
Johanna Menn, Miriam Kober, Paul Brunzema, David Stenger, Sebastian Trimpe
June 1, 2026
Preferential Bayesian optimization learns from pairwise human feedback instead of an explicit objective function, which is useful when you can compare two options but cannot score them on an absolute scale. Existing methods search globally and struggle in many dimensions. This work adapts trust-region local search to preference feedback, using derivatives of the probabilistic model to guide exploration. Tests on optimization benchmarks and policy-search tasks show that local methods substantially reduce cumulative regret compared to global baselines, especially when the landscape is high-dimensional or has sharp peaks.
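To make the idea of local search driven only by comparisons concrete, here is a minimal sketch. It is not the paper's algorithm: it uses no probabilistic model and no derivatives, and it replaces the human with a simulated oracle that compares values of a hidden test function. The names `prefer`, `local_preference_search`, and `sphere`, and the grow and shrink factors, are assumptions made for illustration. The sketch only shows how a trust region can expand after a preferred candidate and contract after a rejected one.

```python
import random


def sphere(x):
    """Hidden test objective (lower is better); the search never sees its value."""
    return sum(xi * xi for xi in x)


def prefer(f, a, b):
    """Simulated preference oracle: True if option a is preferred over option b."""
    return f(a) < f(b)


def local_preference_search(f, x0, radius=1.0, iters=300, seed=0,
                            grow=1.5, shrink=0.9, min_radius=1e-8):
    """Trust-region-style local search that uses only pairwise comparisons.

    Illustrative stand-in: candidates are sampled uniformly from a box of
    half-width `radius` around the incumbent instead of being chosen by a
    learned preference model.
    """
    rng = random.Random(seed)
    best = list(x0)
    for _ in range(iters):
        # Propose a candidate inside the current trust region.
        cand = [xi + rng.uniform(-radius, radius) for xi in best]
        if prefer(f, cand, best):
            # Candidate wins the duel: move there and expand the region.
            best = cand
            radius *= grow
        else:
            # Incumbent wins: stay put and contract the region.
            radius = max(radius * shrink, min_radius)
    return best


if __name__ == "__main__":
    start = [3.0] * 5
    result = local_preference_search(sphere, start)
    print("start preferred over result:", prefer(sphere, start, result))
```

Because the incumbent changes only when the oracle prefers the candidate, the returned point is never worse than the starting point under the oracle. A model-based method such as the one summarized above would instead choose candidates using its learned preference model.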