Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation
Supplementary Videos
1. Comparison with other Interactive Head Avatar Generation Models
Video 1-1. Comparison Results Using the Demo Video from the INFP Project Page
Since the official implementation of INFP (CVPR 2025) is not available, we compare our results with its demo video, as well as with DIM (ECCV 2024).
Video 1-2. Comparison Results with Reproduced INFP*
We reproduce INFP, denoted as INFP*. For more details, please refer to 5715_suppl.pdf.
Video 1-3. Comparison Results with Reproduced INFP*
We reproduce INFP, denoted as INFP*. For more details, please refer to 5715_suppl.pdf.
2. Ablation Studies
Video 2-1. Ablation Study on DPO
The proposed DPO method improves the expressiveness of the interaction, including eyebrow motion and eyeball movement.
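For readers unfamiliar with DPO (Direct Preference Optimization), the following is a minimal sketch of the standard pairwise DPO objective, not the authors' implementation. How the proposed method adapts DPO to avatar motion generation is not stated on this page, so the function name `dpo_loss`, its arguments, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Standard DPO loss for one preference pair (hypothetical helper).

    logp_win / logp_lose: log-likelihoods of the preferred and rejected
    samples under the model being trained.
    ref_logp_win / ref_logp_lose: the same quantities under a frozen
    reference model.
    beta: strength of the implicit regularization toward the reference.
    """
    # Implicit reward margin between the preferred and rejected samples.
    margin = beta * ((logp_win - ref_logp_win) - (logp_lose - ref_logp_lose))
    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss falls as the trained model assigns relatively more likelihood to the preferred sample than the reference model does, which is how preference data (for example, more expressive versus less expressive motion) can steer generation.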
Video 2-2. Ablation Study on User Motion and the Dual Motion Encoder
The integration of user motion through the Dual Motion Encoder improves the avatar’s reactiveness (active listening capability), enabling behaviors such as smiling and focusing.
Video 2-3. Ablation Study on Attention Masks
Naive integration of framewise (or blockwise) causal masks produces temporal inconsistency, resulting in framewise or blockwise jittering.
Video 2-4. Ablation Study on Attention Masks
Naive integration of framewise (or blockwise) causal masks produces temporal inconsistency, resulting in framewise or blockwise jittering.
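To make the two naive baselines in this ablation concrete, here is a minimal sketch of how framewise and blockwise causal attention masks are commonly constructed. It is an illustration under assumptions, not the authors' code: the helper name `causal_mask` and its parameters are hypothetical, and the mask actually proposed in the paper is not reproduced here.

```python
def causal_mask(num_frames, block_size=1):
    """Build a boolean attention mask over frames (hypothetical helper).

    mask[q][k] is True when query frame q may attend to key frame k.
    block_size=1 gives a framewise causal mask (lower triangular).
    block_size>1 gives a blockwise causal mask: frames attend to every
    frame in their own block and in all earlier blocks.
    """
    if num_frames < 0 or block_size < 1:
        raise ValueError("num_frames must be >= 0 and block_size >= 1")
    mask = []
    for q in range(num_frames):
        q_block = q // block_size
        # A key is visible if its block is not later than the query's block.
        mask.append([(k // block_size) <= q_block for k in range(num_frames)])
    return mask


def render(mask):
    """Render the mask as text, '1' for visible and '0' for masked."""
    return "\n".join("".join("1" if v else "0" for v in row) for row in mask)
```

In both variants the visible context changes abruptly at every frame or block boundary, which matches the frame-level or block-level discontinuities that the videos show as jitter.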
3. Comprehensive Analysis
Video 3-1. Comparison Results with Talking Head Avatar Generation Models
Video 3-2. Comparison Results with Talking Head Avatar Generation Models
Video 3-3. Comparison Results with Listening Head Avatar Generation Models
Since the implementations of the baselines are not available, we reuse the baseline results shown in the INFP (CVPR 2025) demo video.
Video 3-4. Comparison Results with Listening Head Avatar Generation Models
Since the implementations of the baselines are not available, we reuse the baseline results shown in the INFP (CVPR 2025) demo video.
4. Human Evaluation
Video 4-1. Example of Human Evaluation Test Sheet
Avatar A: Avatar Forcing / Avatar B: INFP*. Please note that our model produces more expressive facial motions. You can find examples of the answer sheets and instructions in 5715_suppl.pdf.
Video 4-2. Example of Human Evaluation Test Sheet
Avatar A: Avatar Forcing / Avatar B: INFP*. Please note that our model produces more expressive facial motions. You can find examples of the answer sheets and instructions in 5715_suppl.pdf.