cs.CV

Why video AI can't tell left from right—and how to fix it

Jongseo Lee, Hyuntak Lee, Sunghun Kim, Sooa Kim, Jihoon Chung, Jinwoo Choi

May 21, 2026

Video-LLMs struggle with directional motion despite their sophisticated architectures: asked "did it move left or right?", most effectively guess at random. The researchers traced where this breaks down and found that motion information is present in the model's hidden states but fails to bind to the correct answer. They built MoDirect, a diagnostic dataset, and DeltaDirect, a method that predicts 2D motion vectors from frame differences. The fix improves real-world motion direction accuracy by 22 points without requiring real-world training data, while maintaining performance on standard benchmarks.
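To make "2D motion vectors from frame differences" concrete, here is a toy sketch in Python with NumPy. It is not DeltaDirect and not taken from the paper: it is a hand-written heuristic, with no learning involved, that reads a direction out of the difference between two synthetic grayscale frames. The helper names (`make_frame`, `motion_vector`, `direction_label`) and the moving-square setup are assumptions made up for illustration.

```python
import numpy as np


def make_frame(x, size=16, box=4, y=6):
    """Hypothetical test frame: a bright square whose left edge is at column x."""
    frame = np.zeros((size, size), dtype=np.float32)
    frame[y:y + box, x:x + box] = 1.0
    return frame


def motion_vector(prev, curr):
    """Estimate a (dx, dy) direction from the difference of two frames.

    Pixels that got brighter mark where the object arrived; pixels that
    got darker mark where it left. The vector between the two centroids
    points along the motion. Its length is NOT a calibrated displacement.
    """
    diff = curr - prev
    gained = np.clip(diff, 0, None)   # regions the object moved into
    lost = np.clip(-diff, 0, None)    # regions the object moved out of
    if gained.sum() == 0 or lost.sum() == 0:
        return np.zeros(2)            # no usable change between frames

    ys, xs = np.mgrid[0:diff.shape[0], 0:diff.shape[1]]

    def centroid(weights):
        total = weights.sum()
        return np.array([(xs * weights).sum(), (ys * weights).sum()]) / total

    return centroid(gained) - centroid(lost)


def direction_label(vec, eps=1e-6):
    """Map a (dx, dy) vector to a label; image y grows downward."""
    dx, dy = vec
    if abs(dx) < eps and abs(dy) < eps:
        return "static"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


if __name__ == "__main__":
    a, b = make_frame(2), make_frame(4)
    print(direction_label(motion_vector(a, b)))
    print(direction_label(motion_vector(b, a)))
```

The point of the sketch is only that the sign of a frame difference already carries direction: swapping the two frames flips the vector. How DeltaDirect actually predicts its vectors and feeds them to the model is described in the paper, not here.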
Published as "Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs" (arXiv:2605.22823)