Open framework turns language models into capable autonomous agents

Baolin Peng, Wenlin Yao, Qianhui Wu, Hao Cheng, Xiao Yu, Rui Yang, Tao Ge, Alessandro Sordoni, Xingdi Yuan, Yelong Shen, Pengcheng He, Tong Zhang, Zhou Yu, Jianfeng Gao

Agentic modeling, the training of LLMs to plan, reason, and use tools autonomously, remains difficult to scale in open research because of proprietary infrastructure and training gaps. Orchard addresses this with a lightweight, reusable environment layer (Orchard Env) and three task-specific recipes: Orchard-SWE for code agents (67.5% on SWE-bench Verified after SFT+RL), Orchard-GUI for 4B vision-language web-interaction agents (74.1% on WebVoyager), and Orchard-Claw for personal assistants (73.9% on Claw-Eval). Training combines credit-assignment SFT with balanced adaptive rollout and requires only minimal synthetic data. The code and models are open-sourced.