Achieving high synchronization in the synthesis of realistic, speech-driven talking
head videos presents a significant challenge. Traditional Generative Adversarial
Network (GAN) methods struggle to maintain consistent facial identity, while Neural Radiance
Fields (NeRF) methods, although they can address this issue, often produce mismatched
lip movements, inadequate facial expressions, and unstable head poses.
A lifelike talking head requires synchronized coordination of subject identity,
lip movements, facial expressions, and head poses. The absence of these synchronizations
is a fundamental flaw, leading to unrealistic and artificial outcomes.
To address the critical issue of synchronization, identified as the "devil"
in creating realistic talking heads, we introduce SyncTalk. This NeRF-based
method effectively maintains subject identity, enhancing synchronization and
realism in talking head synthesis. SyncTalk employs a Face-Sync Controller to
align lip movements with speech and innovatively uses a 3D facial blendshape
model to capture accurate facial expressions. Our Head-Sync Stabilizer optimizes
head poses, achieving more natural head movements. The Portrait-Sync Generator restores
hair details and blends the generated head with the torso for a seamless visual experience.
Extensive experiments and user studies demonstrate that SyncTalk outperforms
state-of-the-art methods in synchronization and realism. We recommend watching
the supplementary video.
Overview of SyncTalk. Given a cropped reference video of a talking head and the corresponding speech,
SyncTalk extracts the Lip Feature, Expression Feature, and Head Pose through
two synchronization modules, the Face-Sync Controller and the Head-Sync Stabilizer.
The Tri-Plane Hash Representation then models the head,
outputting a rough speech-driven video. The Portrait-Sync Generator further restores details such
as hair and background, ultimately producing a high-resolution talking head video.
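To make the data flow in the overview concrete, the sketch below mirrors the described pipeline in PyTorch: two synchronization modules produce lip/expression features and a stabilized head pose, a tri-plane hash representation renders a rough head video, and the Portrait-Sync Generator refines it. The module names follow the paper, but every internal layer, dimension, and function here is a hypothetical placeholder, not the authors' implementation.

```python
# Minimal structural sketch of the SyncTalk pipeline (placeholders only).
import torch
import torch.nn as nn


class FaceSyncController(nn.Module):
    """Placeholder: maps speech features to lip and expression features."""
    def __init__(self, audio_dim=64, feat_dim=32):
        super().__init__()
        self.lip_head = nn.Linear(audio_dim, feat_dim)
        self.expr_head = nn.Linear(audio_dim, feat_dim)

    def forward(self, audio_feat):
        return self.lip_head(audio_feat), self.expr_head(audio_feat)


class HeadSyncStabilizer(nn.Module):
    """Placeholder: estimates a smoothed head pose (rotation + translation)."""
    def __init__(self, frame_dim=128):
        super().__init__()
        self.pose_head = nn.Linear(frame_dim, 6)  # 3 rotation + 3 translation

    def forward(self, frame_feat):
        return self.pose_head(frame_feat)


class TriPlaneHashRenderer(nn.Module):
    """Placeholder for the tri-plane hash head representation and renderer."""
    def __init__(self, feat_dim=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim * 2 + 6, 128), nn.ReLU(),
            nn.Linear(128, 3 * 64 * 64),
        )

    def forward(self, lip, expr, pose):
        cond = torch.cat([lip, expr, pose], dim=-1)
        return self.mlp(cond).view(-1, 3, 64, 64)  # rough head frames


class PortraitSyncGenerator(nn.Module):
    """Placeholder: restores details (hair, background) and blends with torso."""
    def __init__(self):
        super().__init__()
        self.refine = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, rough_frames):
        return self.refine(rough_frames)


def synthesize(audio_feat, frame_feat):
    """End-to-end data flow: speech + reference frames -> refined frames."""
    lip, expr = FaceSyncController()(audio_feat)     # lip / expression features
    pose = HeadSyncStabilizer()(frame_feat)          # stabilized head pose
    rough = TriPlaneHashRenderer()(lip, expr, pose)  # coarse speech-driven render
    return PortraitSyncGenerator()(rough)            # detail restoration


if __name__ == "__main__":
    frames = synthesize(torch.randn(8, 64), torch.randn(8, 128))
    print(frames.shape)  # torch.Size([8, 3, 64, 64])
```

The sketch only illustrates how the four components are chained; feature dimensions, network depths, and the 64x64 output resolution are arbitrary choices for the example.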
@article{peng2023synctalk,
title={SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis},
author={Ziqiao Peng and Wentao Hu and Yue Shi and Xiangyu Zhu and Xiaomei Zhang and Jun He and Hongyan Liu and Zhaoxin Fan},
journal={arXiv preprint arXiv:2311.17590},
year={2023}
}