cs.CV

Can physics models learn from real videos without perfect labels?

Chanho Kim, Suhas V. Sumukh, Li Fuxin

May 22, 2026

Most physics simulators trained on videos require perfect state information, such as complete point clouds or tracked particles, which real videos don't provide. This work trains a particle-based dynamics model directly from unlabeled real-world videos using Gaussian splatting and rendering supervision: the model predicts how particles move and rotate, and it learns by comparing rendered outputs to the actual frames. The approach sidesteps the sim-to-real gap that has limited prior methods, and it is demonstrated on a new dataset of 500 videos with diverse object interactions.
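To make the idea of rendering supervision concrete, here is a minimal toy sketch in plain Python. It is not the paper's method. It assumes a 2D grayscale image, isotropic Gaussian blobs in place of 3D Gaussian splatting, a "dynamics model" that is a single shared velocity (no rotation), and finite-difference gradients in place of a differentiable renderer. All names (`render`, `predict`, `rendering_loss`, `fit_velocity`) and constants are hypothetical and chosen for illustration only.

```python
import math

SIZE = 16    # image side length in pixels (toy value, an assumption)
SIGMA = 1.5  # blob standard deviation in pixels (toy value, an assumption)


def render(particles):
    """Splat each particle as an isotropic 2D Gaussian blob onto a grayscale image."""
    image = [[0.0] * SIZE for _ in range(SIZE)]
    for px, py in particles:
        for y in range(SIZE):
            for x in range(SIZE):
                d2 = (x - px) ** 2 + (y - py) ** 2
                image[y][x] += math.exp(-d2 / (2.0 * SIGMA ** 2))
    return image


def photometric_loss(rendered, target):
    """Mean squared pixel error between a rendered image and a target frame."""
    total = 0.0
    for row_r, row_t in zip(rendered, target):
        for r, t in zip(row_r, row_t):
            total += (r - t) ** 2
    return total / (SIZE * SIZE)


def predict(particles, velocity):
    """Toy dynamics model: move every particle by one shared velocity."""
    vx, vy = velocity
    return [(x + vx, y + vy) for x, y in particles]


def rendering_loss(velocity, particles, target_frame):
    """Render the predicted particles and compare them to the observed frame."""
    return photometric_loss(render(predict(particles, velocity)), target_frame)


def fit_velocity(particles, target_frame, lr=20.0, iters=100, eps=1e-3):
    """Gradient descent on the rendering loss using central finite differences."""
    vx, vy = 0.0, 0.0
    for _ in range(iters):
        gx = (rendering_loss((vx + eps, vy), particles, target_frame)
              - rendering_loss((vx - eps, vy), particles, target_frame)) / (2 * eps)
        gy = (rendering_loss((vx, vy + eps), particles, target_frame)
              - rendering_loss((vx, vy - eps), particles, target_frame)) / (2 * eps)
        vx -= lr * gx
        vy -= lr * gy
    return vx, vy


# Usage: the "observed" frame is synthesized here; in practice it would be a video frame.
particles = [(5.0, 5.0), (10.0, 9.0)]
target_frame = render(predict(particles, (2.0, 1.0)))
vx, vy = fit_velocity(particles, target_frame)
```

The point of the sketch is that the only supervision signal is the image itself: no particle tracks or ground-truth states enter the loss, only the difference between what the predicted particles look like when rendered and what the frame shows.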
Published as "Learning a Particle Dynamics Model with Real-world Videos", arXiv:2605.23845