cs.LG

Why benchmarks ignore the feature engineering that wins real competitions

Andrej Tschalzev, Nick Erickson, Yuyang Wang, Huzefa Rangwala, Stefan Lüdtke, Heiner Stuckenschmidt, Christian Bartelt

June 1, 2026

Tabular ML benchmarks typically evaluate sophisticated models on raw data and ignore feature engineering, the step that often decides results in practice. TabPrep adds lightweight, pattern-specific feature generators (targeting structural patterns such as feature interactions and logarithmic scales) and consistently improves tree-based, neural, linear, and foundation models across TabArena. The feature engineering often matters more than the model architecture. The code has been released.
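To make the idea of pattern-specific generators concrete, here is a minimal sketch of two such generators, one for logarithmic scales and one for pairwise interactions. It is not TabPrep's code or API: the function names `log_scale_features` and `pairwise_interactions`, the skewness heuristic, and the `skew_threshold` parameter are all assumptions made for illustration.

```python
import numpy as np


def log_scale_features(X, skew_threshold=1.0):
    """Append log1p copies of non-negative, strongly right-skewed columns.

    Hypothetical heuristic: a column is treated as log-scaled when its
    sample skewness exceeds `skew_threshold` (an assumed default).
    """
    X = np.asarray(X, dtype=float)
    new_cols = []
    for j in range(X.shape[1]):
        col = X[:, j]
        std = col.std()
        # log1p needs non-negative input; constant columns carry no signal.
        if col.min() < 0 or std == 0:
            continue
        skew = np.mean(((col - col.mean()) / std) ** 3)
        if skew > skew_threshold:
            new_cols.append(np.log1p(col))
    return np.column_stack([X] + new_cols) if new_cols else X


def pairwise_interactions(X):
    """Append the product of every pair of distinct columns."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    prods = [X[:, i] * X[:, j] for i in range(n) for j in range(i + 1, n)]
    return np.column_stack([X] + prods) if prods else X


if __name__ == "__main__":
    # First column spans several orders of magnitude; second is roughly uniform.
    X = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [10000, 6]], dtype=float)
    print(log_scale_features(X).shape)     # original columns plus one log column
    print(pairwise_interactions(X).shape)  # original columns plus one product
```

Both functions only append columns, so the raw features stay available to the downstream model. That matches the summary's framing of preprocessing as an add-on to any model family, though the actual generators in the released code may be selected and applied differently.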
Published as "TabPrep: Closing the Feature Engineering Gap in Tabular Benchmarks", arXiv:2606.02384