cs.AI

Teaching AI to understand how objects change and respond

Kunqi Xu, Jitao Li, Jianglong Ye, Tianshu Tang, Isabella Liu, Sifei Liu, Xueyan Zou

May 18, 2026

Current world models either generate videos or reconstruct scenes, but they do not explicitly represent how objects behave and change state. WorldString learns a state manifold for real-world objects directly from point cloud or RGB-D video input, treating each object as an actionable entity whose intrinsic properties determine its behavior. Because the architecture is fully differentiable, it can be combined with policy learning and neural dynamics models, and the authors position it as a foundational primitive for building physical world models that understand object interactions.
Published as "Actionable World Representation", arXiv:2605.18743