Can robots learn from their own work?

Scaling robot learning in warehouses requires more than lab data: it needs continuous feedback loops. This work proposes a data flywheel that converts real logistics operations into reusable training assets. A world model generates supervision for difficult parcels that robots rarely encounter, while deployment feedback improves policies over time. WM-DAgger combines world-model-based data synthesis with imitation learning to handle out-of-distribution scenarios.
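
To make the idea concrete, here is a toy sketch of a DAgger-style loop (dataset aggregation) in which a world model, rather than the real robot, supplies the rarely seen states. It is not the authors' implementation. Everything in it is an assumption made for illustration: the 1-D "parcel offset" state, the hand-written `expert_action`, the trivial `world_model_step` standing in for a learned model, and the nearest-neighbour policy.

```python
import random


def expert_action(x: float) -> float:
    """Hypothetical expert: push the offset back toward zero, at most 0.5 per step."""
    return max(-0.5, min(0.5, -x))


def world_model_step(x: float, a: float) -> float:
    """Stand-in for a learned world model: predicts the next offset (assumed dynamics)."""
    return x + a


def policy_action(dataset, x: float) -> float:
    """1-nearest-neighbour policy: copy the action of the closest labelled state."""
    _, nearest_action = min(dataset, key=lambda pair: abs(pair[0] - x))
    return nearest_action


def synthesize_states(dataset, rng, n_starts=20, horizon=5, spread=2.0):
    """Roll the current policy inside the world model from perturbed starts.

    The perturbed starts play the role of out-of-distribution parcels.
    """
    states = []
    for _ in range(n_starts):
        x = rng.uniform(-spread, spread)
        for _ in range(horizon):
            states.append(x)
            x = world_model_step(x, policy_action(dataset, x))
    return states


def mean_error(dataset, test_states) -> float:
    """Average gap between the policy's action and the expert's action."""
    gaps = [abs(policy_action(dataset, x) - expert_action(x)) for x in test_states]
    return sum(gaps) / len(gaps)


def dagger_with_world_model_sketch(rounds=3, seed=0):
    rng = random.Random(seed)
    # Initial demonstrations cover only a narrow band of offsets ("lab data").
    dataset = [(x, expert_action(x)) for x in (rng.uniform(-0.1, 0.1) for _ in range(20))]
    test_states = [i / 10 for i in range(-20, 21)]  # includes offsets far outside the demos

    errors = [mean_error(dataset, test_states)]
    for _ in range(rounds):
        new_states = synthesize_states(dataset, rng)
        # Aggregate: the expert labels the synthesized states, and they join the dataset.
        dataset += [(x, expert_action(x)) for x in new_states]
        errors.append(mean_error(dataset, test_states))
    return errors


if __name__ == "__main__":
    for round_index, err in enumerate(dagger_with_world_model_sketch()):
        print(f"round {round_index}: mean action error = {err:.3f}")
```

Running the script prints the error before any aggregation and after each round. In this toy setup the error drops once the synthesized states are labelled and added, which is the effect the flywheel relies on. A real system would replace each stand-in with learned components.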