Can AI understand complex movie plots from video?
Zhengqian Wu, Zhixian Liu, Aodong Chen, Jingyang Zhang, Ruizhe Li, Hanlin Ge, Zhongyuan Wang, Chunxia Xiao, Chao Liang
June 4, 2026
Existing video question-answering systems handle simple factual queries but stumble when asked to track characters, motivations, and plot arcs over hours of content. The researchers built StoryVideoQA, a set of 363K automatically generated QA pairs spanning TV shows and feature films up to 2 hours long, and benchmarked 20 leading models, finding that they lose coherence when tracking storylines. They propose PlotTree, which reorganizes a video into hierarchical plot summaries so that complex narratives can be reasoned about more reliably.
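To make the idea of hierarchical plot summaries concrete, here is a minimal sketch of one way such a tree could be stored and searched. It is not the paper's implementation: the `PlotNode` class, the `locate` function, the word-overlap score, and the example movie are all hypothetical and chosen only for illustration. A real system would presumably use a learned relevance model rather than keyword matching.

```python
from dataclasses import dataclass, field


@dataclass
class PlotNode:
    """Hypothetical node: a text summary covering a time span of the video."""
    summary: str
    start_s: float
    end_s: float
    children: list["PlotNode"] = field(default_factory=list)


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of lowercase words shared by query and text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def locate(node: PlotNode, query: str) -> PlotNode:
    """Descend from the root, following the best-matching child at each level."""
    while node.children:
        best = max(node.children, key=lambda c: overlap(query, c.summary))
        if overlap(query, best.summary) == 0:
            break  # no child is relevant; stay at the coarser summary
        node = best
    return node


# Invented example: a two-hour film split into acts, then scenes.
movie = PlotNode("a detective hunts a thief across the city", 0, 7200, [
    PlotNode("the thief robs the museum", 0, 3600, [
        PlotNode("the thief disables the alarm", 0, 1200),
        PlotNode("the thief escapes with the painting", 1200, 3600),
    ]),
    PlotNode("the detective finds the hideout", 3600, 7200),
])

hit = locate(movie, "thief escapes painting")
print(hit.summary, hit.start_s, hit.end_s)
```

The point of the coarse-to-fine descent is that a question only needs to be compared against a handful of summaries per level, instead of every frame or clip of a two-hour video.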