Why robustness tricks are really one problem in disguise

Robustness looks like many separate challenges (domain shift, occlusion, compositional generalization), but this paper argues they are one problem: controlling how the encoder responds to nuisances that don't change the label. Its matching principle says the range of the regularizer must cover the covariance of those nuisances. The authors prove closed-form optimality in the linear-Gaussian case, show why CORAL, adversarial training, IRM, and augmentation are different ways of estimating the same object, and test the resulting predictions in 13 pre-registered experiments ranging from ImageNet to Qwen2.5-7B, of which 12 pass.
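
To make the matching principle concrete, here is a toy sketch in a linear setting. It is not the paper's construction or proof. The data-generating process, the quadratic penalty w^T M w, and the names `covers_range`, `fit`, and `nuisance_sensitivity` are all assumptions made for illustration. The idea is to compare a penalty whose range covers the nuisance covariance with one that misses a nuisance direction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Toy data (all hypothetical): one signal feature s and two nuisance features
# that are spuriously correlated with the label in the training set.
s = rng.normal(size=n)
y = s + 0.5 * rng.normal(size=n)
n1 = 0.8 * y + 0.5 * rng.normal(size=n)
n2 = 0.8 * y + 0.5 * rng.normal(size=n)
X = np.column_stack([s, n1, n2])

# Assumed nuisance covariance: nuisances move features 1 and 2 only.
Sigma_n = np.diag([0.0, 1.0, 1.0])


def covers_range(M, Sigma, tol=1e-10):
    """True if range(M) contains range(Sigma)."""
    P = M @ np.linalg.pinv(M)  # orthogonal projector onto range(M)
    return bool(np.linalg.norm((np.eye(len(M)) - P) @ Sigma) < tol)


def fit(X, y, M, lam):
    """Generalized ridge: argmin_w mean((Xw - y)^2) + lam * w^T M w."""
    m = len(y)
    return np.linalg.solve(X.T @ X / m + lam * M, X.T @ y / m)


def nuisance_sensitivity(w, Sigma):
    """Variance of the change in w^T x when x shifts by delta ~ N(0, Sigma)."""
    return float(w @ Sigma @ w)


lam = 1e4
M_full = np.diag([0.0, 1.0, 1.0])     # range covers both nuisance directions
M_partial = np.diag([0.0, 1.0, 0.0])  # range misses the second nuisance

w_full = fit(X, y, M_full, lam)
w_partial = fit(X, y, M_partial, lam)

sens_full = nuisance_sensitivity(w_full, Sigma_n)
sens_partial = nuisance_sensitivity(w_partial, Sigma_n)

print("covers (full, partial):",
      covers_range(M_full, Sigma_n), covers_range(M_partial, Sigma_n))
print("nuisance sensitivity (full, partial):", sens_full, sens_partial)
```

In this toy, the penalty that covers the nuisance covariance drives the predictor's response to nuisance shifts toward zero. The partial penalty leaves the uncovered direction free, and the fit moves its spurious weight onto that direction instead.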