cs.CL

Why do AI agents give away your passwords to obvious scams?

Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar, Yash Sinha, Dhruv Kumar, Srikant Panda, Murari Mandal

May 30, 2026

Autonomous AI agents routinely hand over passwords, emails, and credit card numbers to attacker-controlled websites, despite reasoning capabilities that should catch obvious scams. Researchers built Scammer4U, a benchmark of 91 fake sites and benign twins across 8 attack vectors, and found that current defenses fail: even when an agent's reasoning flags suspicious activity, it still submits critical data 36% of the time. The gap shows that agents recognize danger but act anyway, which suggests that we need external gates on outbound data, not just better prompting.
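To make the idea of an external gate on outbound data concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the names TRUSTED_HOSTS, find_sensitive, and allow_submission, the example hostnames, and the detection patterns are all hypothetical assumptions. The gate sits outside the agent's reasoning and refuses to send sensitive-looking form fields to any host that is not on an allowlist, whatever the agent decided.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts the user has approved to receive sensitive data.
TRUSTED_HOSTS = {"accounts.example.com"}

# Rough, illustrative value patterns; a real gate would need sturdier detection.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

# Field-name fragments treated as sensitive regardless of the value (assumption).
SENSITIVE_FIELD_NAMES = {"password", "passwd", "card", "cvv", "ssn"}


def find_sensitive(fields: dict[str, str]) -> list[str]:
    """Return the names of form fields that look like they carry sensitive data."""
    hits = []
    for name, value in fields.items():
        if any(key in name.lower() for key in SENSITIVE_FIELD_NAMES):
            hits.append(name)
        elif any(p.search(value) for p in SENSITIVE_PATTERNS.values()):
            hits.append(name)
    return hits


def allow_submission(url: str, fields: dict[str, str]) -> bool:
    """Allow the outbound request only if the host is trusted
    or no field looks sensitive."""
    host = urlparse(url).hostname or ""
    if host in TRUSTED_HOSTS:
        return True
    return not find_sensitive(fields)
```

In this sketch the check runs on the outgoing request itself, so a submission to an untrusted host is blocked even when the agent's own reasoning flagged the site and proceeded anyway. A harmless search query to the same host would still go through.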
Published as "I Strongly Suspect This Website Is a Scam": Benchmarking PII Leakage and Detection without Defense in Autonomous Web Agents (arXiv:2606.00497)