cs.CV

Keeping video characters consistent across long narratives

Jinzhuo Liu, Jiangning Zhang, Wencan Jiang, Yabiao Wang, Dingkang Liang, Zhucun Xue, Ran Yi, Yong Liu

May 18, 2026

Long autoregressive video generation struggles to keep character identities consistent when prompts change, which leads to identity drift and attribute loss. IAMFlow addresses this with an LLM that extracts entities and assigns them persistent global IDs, paired with a VLM that verifies character attributes against rendered frames instead of relying on implicit similarity matching. An inference acceleration pipeline with asynchronous verification and quantization keeps the computation practical. The authors also introduce NarraStream-Bench, a benchmark with 324 multi-prompt scripts and multimodal evaluation metrics. They report that IAMFlow achieves the best overall performance without any training.
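The identity-tracking idea described above can be illustrated with a small sketch. This is not the authors' implementation: `IdentityMemory`, `register`, `verify`, and `process_prompt` are hypothetical names, the LLM entity extractor is replaced by a plain callable, and the VLM is replaced by a dictionary of observed attributes. It only shows how explicit global IDs plus an attribute check differ from implicit similarity matching.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Identity:
    """One tracked character: a persistent ID plus remembered attributes."""
    global_id: int
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)


class IdentityMemory:
    """Hypothetical registry mapping entity names to persistent global IDs."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Identity] = {}

    def register(self, name: str, attributes: Dict[str, str]) -> Identity:
        key = name.strip().lower()
        identity = self._by_name.get(key)
        if identity is None:
            # First mention: allocate the next global ID.
            identity = Identity(global_id=len(self._by_name), name=key)
            self._by_name[key] = identity
        # Assumption: the first-seen value of an attribute is the reference.
        for attr, value in attributes.items():
            identity.attributes.setdefault(attr, value)
        return identity

    def verify(self, name: str, observed: Dict[str, str]) -> List[str]:
        """Return the attributes whose observed value contradicts memory."""
        identity = self._by_name[name.strip().lower()]
        return [
            attr
            for attr, expected in identity.attributes.items()
            if attr in observed and observed[attr] != expected
        ]


def process_prompt(
    memory: IdentityMemory,
    prompt: str,
    extract_entities: Callable[[str], Dict[str, Dict[str, str]]],
) -> List[int]:
    """Register every entity found in a prompt and return their global IDs.

    `extract_entities` stands in for the LLM call; it is an assumption here.
    """
    return [
        memory.register(name, attrs).global_id
        for name, attrs in extract_entities(prompt).items()
    ]


if __name__ == "__main__":
    # Stub extractor: a lookup table instead of a real LLM.
    scripted = {
        "Mia walks in.": {"Mia": {"hair": "red"}},
        "Leo greets Mia.": {"Leo": {"coat": "blue"}, "Mia": {}},
    }
    memory = IdentityMemory()
    for prompt in scripted:
        print(prompt, process_prompt(memory, prompt, scripted.__getitem__))
    # Stub VLM output: attributes read from a rendered frame.
    print(memory.verify("Mia", {"hair": "black"}))
```

In this sketch a character keeps the same ID across prompts even when a later prompt omits its attributes, and `verify` flags a mismatch that a generation loop could use to trigger correction. How IAMFlow actually stores identities and acts on a failed verification is not specified in this summary.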
Published as "Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory", arXiv:2605.18733