Can foundation models help predict when models fail on new data?
Shuxuan Li, Zhilin Zhao, Quyu Kong, Wei-Shi Zheng
June 4, 2026
When a model encounters data that differs from its training distribution, predicting its performance without labels is hard: existing methods rely only on the failing model itself. FRAP combines predictions from a foundation model and the target model, aligning them via temperature scaling and weighting them by confidence to create a better performance proxy. Tests across multiple datasets and architectures show consistent, substantial improvements over baseline estimation methods.
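To make the idea of a confidence-weighted, temperature-aligned proxy concrete, here is a minimal NumPy sketch. It is not FRAP's actual formulation, which this summary does not spell out. The function names, the max-probability confidence measure, the normalized weighting rule, and the use of agreement with the proxy as the accuracy estimate are all assumptions made for illustration.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis of a 2-D logit array."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def estimate_accuracy(target_logits, fm_logits, t_target=1.0, t_fm=1.0):
    """Label-free accuracy estimate for the target model (illustrative only).

    Assumed recipe, not taken from the paper:
      1. put both models' outputs on a comparable scale with temperatures,
      2. weight each model per sample by its max softmax probability,
      3. treat the argmax of the weighted mixture as a pseudo-label,
      4. report how often the target model agrees with that pseudo-label.
    """
    p_target = softmax(target_logits, t_target)
    p_fm = softmax(fm_logits, t_fm)

    conf_target = p_target.max(axis=1, keepdims=True)
    conf_fm = p_fm.max(axis=1, keepdims=True)
    w_target = conf_target / (conf_target + conf_fm)  # per-sample weight in (0, 1)

    proxy = w_target * p_target + (1.0 - w_target) * p_fm
    pseudo_labels = proxy.argmax(axis=1)
    target_preds = p_target.argmax(axis=1)
    return float((target_preds == pseudo_labels).mean())


# Toy unlabeled batch: 4 samples, 3 classes (made-up logits).
target_logits = np.array([[5.0, 0.0, 0.0],
                          [0.2, 0.0, 0.0],
                          [0.0, 3.0, 0.0],
                          [0.0, 0.0, 0.1]])
fm_logits = np.array([[4.0, 0.0, 0.0],
                      [0.0, 6.0, 0.0],
                      [0.0, 5.0, 0.0],
                      [6.0, 0.0, 0.0]])
estimated = estimate_accuracy(target_logits, fm_logits)
```

In this sketch the temperatures `t_target` and `t_fm` are plain arguments. In practice they would presumably be fit on held-out labeled source data, which is again an assumption rather than something the summary states.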