When should you trust a simulator versus run real experiments?

Harsh Parikh, Gabriel Levin-Konigsberg, Dominique Perrault-Joncas, Alexander Volfovsky

A simulator trained on historical data is cheap to run, but it inherits the biases of how that data was collected. Real experiments are unbiased but expensive. The paper decomposes a simulator's error into two parts: a shift that randomized experiments can identify, and an irreducible error that additional data cannot fix. It then proposes Fisher-SEP, a strategy that decides when to experiment by minimizing uncertainty about a policy's value. Vending-machine and HIV-testing examples show when front-loaded pilots beat ongoing simulation, and when exploration is the only way to reach undersampled regions.