Training web agents to learn from live websites, not just recorded demos

Rui Yang, Qianhui Wu, Yuxi Chen, Hao Bai, Wenlin Yao, Hao Cheng, Baolin Peng, Huan Zhang, Tong Zhang, Jianfeng Gao

Building capable web agents usually requires large collections of hand-curated website interaction recordings, which are expensive and incomplete. OpenWebRL instead trains visual agents directly on live websites with reinforcement learning, so they learn from real-world feedback. With minimal supervised data, the framework reaches a 67% success rate on Online-Mind2Web and 64% on DeepShop, matching proprietary systems from OpenAI and Google. Code and models will be released.