When predictions face adversarial interventions: a game-theoretic view

This paper models a two-player game in which a leader chooses a prediction function for a target variable, and a follower then intervenes on the causal system to maximize their own objective. The leader knows what the follower will target but may not know the follower's true objective. The key result: predictors built on the stable blanket (a specific invariant set of variables that contains the causal parents) are guaranteed to perform at least as well as predictors based on the causal parents alone and, under stated conditions, achieve worst-case optimality across all possible adversarial interventions. The authors prove this for two common classes of follower objectives, provide distribution generalization bounds, and demonstrate the approach on simulated and real-world data.