How to make talking faces without training anything new
Hao Wu, Xiangyang Luo, Hao Wang, Jiawei Zhang, Yi Zhang, Jinwei Wang
May 28, 2026
Talking face generation normally demands task-specific training on massive datasets. This work skips that training by repurposing Stable Diffusion and IP-Adapter, adding three lightweight, parameter-free modules that handle lip sync, identity consistency, and temporal smoothing. The authors report that it beats existing methods on both accuracy and visual quality without modifying any pretrained weights.