How easily do self-driving AI models break when given bad input?

Mohammadreza Teymoorianfard, Jean-Philippe Monteuuis, Jonathan Petit, Amir Houmansadr

Vision-language models designed for autonomous driving are assumed to reason reliably, but researchers show they're vulnerable to realistic input perturbations like garbled or corrupted text. Using NVIDIA's Alpamayo models, they achieved 89% success manipulating reasoning and 72% success corrupting trajectory predictions—leading to more crashes in closed-loop simulation. The work introduces the first systematic evaluation framework and benchmark for testing such attacks on reasoning-driven autonomous systems.