Can reinforcement learning fix video generation's camera control problem?
Zizun Li, Haoyu Guo, Runzhe Teng, Chunhua Shen, Tong He
May 22, 2026
Video generation models struggle to follow precise camera movements and to preserve physical scale when given new instructions. Geo-Align applies reinforcement learning with a geometry-aware reward that measures 3D camera trajectories directly from the generated frames and penalizes rotation and translation errors. Trained on unpaired real and synthetic videos, it outperforms supervised baselines on both camera controllability and visual quality.
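To make the idea of a reward that penalizes rotation and translation errors concrete, here is a minimal sketch in plain Python. It is not the paper's reward: the function names, the weights `w_rot` and `w_trans`, and the choice of geodesic angle plus Euclidean distance are all assumptions made for illustration. It also assumes the camera poses have already been recovered from the generated frames by some pose estimator, which is not shown.

```python
import math

def rotation_error(R_a, R_b):
    """Geodesic angle in radians between two 3x3 rotation matrices."""
    # trace(R_a^T R_b) equals the element-wise (Frobenius) inner product.
    trace = sum(R_a[i][j] * R_b[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)

def translation_error(t_a, t_b):
    """Euclidean distance between two camera positions (same units as input)."""
    return math.dist(t_a, t_b)

def camera_reward(estimated, target, w_rot=1.0, w_trans=1.0):
    """Hypothetical reward: negative weighted mean pose error over a trajectory.

    `estimated` and `target` are equal-length lists of (R, t) pairs, where R is
    a 3x3 nested list and t is a length-3 list. The weights are illustrative.
    """
    if not estimated or len(estimated) != len(target):
        raise ValueError("trajectories must be non-empty and of equal length")
    n = len(estimated)
    rot = sum(rotation_error(Re, Rt) for (Re, _), (Rt, _) in zip(estimated, target)) / n
    trans = sum(translation_error(te, tt) for (_, te), (_, tt) in zip(estimated, target)) / n
    return -(w_rot * rot + w_trans * trans)

def rot_z(theta):
    """Rotation by `theta` radians about the z-axis, used to build test poses."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A one-frame example: the estimated camera is rotated 0.1 rad and sits
# 5 units away from the requested position.
estimated = [(rot_z(0.1), [0.0, 0.0, 0.0])]
target = [(IDENTITY, [3.0, 4.0, 0.0])]
reward = camera_reward(estimated, target)
```

A trajectory that matches the target exactly scores zero, and any deviation lowers the reward. Because the translation term is an absolute distance, a trajectory with the right shape but the wrong scale is also penalized, which is one plausible way a reward could encourage physical scale consistency.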