Can avatars read minds and react naturally?
Jianxu Shangguan, Jing Xu, Hang Ye, Xiaoxuan Ma, Yizhou Wang, Wentao Zhu
June 4, 2026
Current AI treats talking to digital humans as two separate problems: language models chat well but look stiff, while video generators look realistic but ignore social reasoning. This work merges the two in a closed-loop system in which an avatar perceives your behavior, infers your mental state, and responds with synchronized speech and expression, while also generating your reactive listening behavior. Tested on a new dataset with psychological personas and hidden social goals, the method outperforms baselines on dialogue quality even when given less information than competing methods, suggesting that explicitly representing uncertainty about what someone thinks produces better conversation.