cs.LG

A stabler way to rank which features actually matter in ML models

Lanxin Xiang, Liang Shi, Youhui Ye, Boyu Jiang, Dawei Zhou, Feng Guo

May 14, 2026

SHAP scores for feature importance can swing dramatically depending on the train-test split or random seed, making model interpretation unreliable. RoSHAP models the full distribution of SHAP scores via bootstrap resampling and kernel density estimation, then collapses that distribution into a single ranking criterion that rewards features for being active, strong, and consistent. The authors prove the aggregated score is asymptotically Gaussian, which cuts the computational cost of distribution estimation. In simulations and real-data experiments, RoSHAP identifies true signal features better than single-run SHAP, and models built on RoSHAP-selected features match full-model predictive performance with substantially fewer predictors. That makes it useful for practitioners doing feature selection in noisy settings.
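The general idea of bootstrapping a feature-importance distribution and then ranking on a stability-aware summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. It uses a plain least-squares linear model, for which SHAP values have the closed form `beta_j * (x_ij - mean_j)` when features are treated as independent, and it uses a hand-rolled Gaussian KDE with Silverman's rule-of-thumb bandwidth. The function names and, in particular, `stability_score` are hypothetical stand-ins: the summary does not give the actual RoSHAP criterion, so the score here simply penalises a feature's mean importance by its bootstrap variability.

```python
import numpy as np

def linear_shap(X, beta):
    # Closed-form SHAP values for a linear model, assuming independent
    # features and the data itself as background: beta_j * (x_ij - mean_j).
    return beta * (X - X.mean(axis=0))

def bootstrap_importance(X, y, n_boot=100, seed=0):
    # One row per bootstrap replicate, one column per feature;
    # each entry is the mean absolute SHAP value of that feature.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    out = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample rows with replacement
        Xb, yb = X[idx], y[idx]
        A = np.column_stack([np.ones(n), Xb])     # add an intercept column
        coef, *_ = np.linalg.lstsq(A, yb, rcond=None)
        out[b] = np.abs(linear_shap(Xb, coef[1:])).mean(axis=0)
    return out

def gaussian_kde_1d(samples, grid):
    # Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth.
    n = samples.size
    h = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

def stability_score(importances, eps=1e-12):
    # HYPOTHETICAL criterion (not the RoSHAP formula): mean importance
    # shrunk by its bootstrap spread, so strong and consistent features win.
    mean = importances.mean(axis=0)
    std = importances.std(axis=0, ddof=1)
    return mean ** 2 / (mean + std + eps)

# Synthetic data: features 0 and 1 carry signal, features 2 to 4 are noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 5))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=500)

imp = bootstrap_importance(X, y, n_boot=100, seed=1)
scores = stability_score(imp)
ranking = np.argsort(-scores)                     # best feature first

# Estimated density of feature 0's importance across replicates.
s = imp[:, 0]
grid = np.linspace(s.min() - 3 * s.std(), s.max() + 3 * s.std(), 400)
density = gaussian_kde_1d(s, grid)

print("ranking:", ranking)
```

On this synthetic data the two signal features should rank ahead of the noise features. For a non-linear model, the closed-form `linear_shap` would need to be replaced by a proper SHAP explainer, and each replicate would become correspondingly more expensive.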
Published as "RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution", arXiv:2605.15154.