Do expensive AI agents get hacked like cheap ones?

Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang, Frederico Araujo, Ian Molloy

Autonomous AI agents interact with so many layers of computing infrastructure that they expose a very large attack surface, much like a program granted access to everything at once. In 2025, researchers penetration-tested proprietary agent products to see whether strict coding standards and formal review catch the vulnerabilities that plague open-source agents. Any gap between well-funded and community projects matters for understanding real-world AI security risks.