Generating fine-grained facial animations that accurately portray emotional expressions from only a portrait and an audio recording remains challenging. Existing methods often rely on multiple emotional portraits or a video clip to capture different emotional expressions, while others use emotion labels to drive animation generation. However, these approaches lack precise control over facial emotional expression and suffer from inaccurate lip synchronization.
To address these challenges, we propose a visual attribute-guided audio decoupler. It extracts content vectors that depend solely on the audio content, which stabilizes the subsequent prediction of lip movement coefficients. To achieve more precise emotional expression, we introduce a fine-grained emotion coefficient prediction mechanism. Additionally, we propose an emotion intensity control method based on a fine-grained emotion matrix. Together, these enable effective control over emotional expression in the generated videos and a finer classification of emotion intensity. Finally, a series of 3DMM coefficient generation networks predicts the 3D coefficients, and a rendering network generates the final video.
Our experimental results demonstrate that our proposed method, EmoSpeaker, outperforms existing emotional talking face generation methods in terms of expression variation and lip synchronization.
Please refer to our paper for more details.
We compare EmoSpeaker with recent works in emotional talking-head generation.
[Video comparison, three examples. Panels: Source Image, EmoSpeaker, EAMM]
EmoSpeaker can generate talking heads with eight kinds of emotion.
EmoSpeaker can also generate talking heads at different emotional intensities by adjusting the fine-grained emotion level. The demonstrations below cover 15 fine-grained intensity levels for multiple emotions.
[Video demos at Level 1 and Level 15 for each emotion: Angry, Contempt, Disgusted, Fear, Happy, Sad, Surprised]
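To make the idea of level-based intensity control concrete, here is a minimal sketch of how a fine-grained emotion matrix with 15 levels could be indexed. It is an illustration under stated assumptions, not EmoSpeaker's actual implementation: the function names `build_emotion_matrix` and `select_coefficients`, the coefficient vectors, and the linear blend between a neutral and a peak vector are all hypothetical. Please refer to the paper for how the matrix is really obtained.

```python
from typing import Dict, List

NUM_LEVELS = 15  # number of intensity levels shown in the demos


def build_emotion_matrix(neutral: List[float], peak: List[float],
                         num_levels: int = NUM_LEVELS) -> List[List[float]]:
    """Build a (num_levels x dim) matrix for one emotion.

    ASSUMPTION: row k-1 is a linear blend of a neutral coefficient vector
    and a peak-emotion vector with weight k / num_levels. This blend is a
    stand-in for illustration only.
    """
    if len(neutral) != len(peak):
        raise ValueError("neutral and peak must have the same dimension")
    matrix = []
    for level in range(1, num_levels + 1):
        w = level / num_levels
        matrix.append([(1.0 - w) * n + w * p for n, p in zip(neutral, peak)])
    return matrix


def select_coefficients(matrices: Dict[str, List[List[float]]],
                        emotion: str, level: int) -> List[float]:
    """Return the coefficient row for an emotion at a 1-based intensity level."""
    if emotion not in matrices:
        raise KeyError(f"unknown emotion: {emotion}")
    rows = matrices[emotion]
    if not 1 <= level <= len(rows):
        raise ValueError(f"level must be in 1..{len(rows)}")
    return rows[level - 1]


# Hypothetical 2-dimensional coefficient vectors, for illustration only.
matrices = {"happy": build_emotion_matrix([0.0, 0.0], [1.5, -3.0])}
mild = select_coefficients(matrices, "happy", 1)
strong = select_coefficients(matrices, "happy", 15)
```

In this sketch, moving from Level 1 to Level 15 simply selects a different row of the per-emotion matrix, so intensity becomes a discrete lookup rather than a continuous parameter.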