Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Supplementary Videos

1. Comparison with other Interactive Head Avatar Generation Models

Video 1-1. Comparison Results Using the Demo Video from the INFP Project Page

Since the official implementation of INFP (CVPR 2025) is not available, we compare our results with its demo video, as well as with DIM (ECCV 2024).



Video 1-2. Comparison Results with Reproduced INFP*

We reproduce INFP and denote the reproduction as INFP*. For more details, please refer to 5715_suppl.pdf.



Video 1-3. Comparison Results with Reproduced INFP*

We reproduce INFP and denote the reproduction as INFP*. For more details, please refer to 5715_suppl.pdf.






2. Ablation Studies

Video 2-1. Ablation Study on DPO

The proposed DPO method improves the expressiveness of the interaction, including eyebrow motion and eyeball movement.
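For readers unfamiliar with DPO, the following is a minimal sketch of the generic Direct Preference Optimization loss for a single preference pair. It is not the variant used in this work, which may differ (see 5715_suppl.pdf). The function name `dpo_loss`, its argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Generic DPO loss for one (chosen, rejected) pair.

    Hypothetical helper for illustration only: inputs are log-likelihoods of
    the preferred and dispreferred samples under the trained policy and under
    a frozen reference model.
    """
    # How much more the policy prefers the chosen sample than the reference does.
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss shrinks as the policy favors the chosen sample more strongly.
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
better = dpo_loss(2.0, -2.0, 0.0, 0.0)
print(neutral > better)
```

In this sketch, a policy that matches the reference yields a loss of log 2, and the loss decreases as the policy raises the likelihood of the preferred sample relative to the rejected one.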



Video 2-2. Ablation Study on User Motion and the Dual Motion Encoder

The integration of user motion through the Dual Motion Encoder improves the avatar’s reactiveness (active listening capability), enabling behaviors such as smiling and focusing.



Video 2-3. Ablation Study on Attention Masks

Naive integration of framewise (or blockwise) causal masks produces temporal inconsistency, resulting in framewise or blockwise jittering.
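To make the two mask types concrete, the sketch below builds plain framewise and blockwise causal attention masks over frames. It illustrates only the naive masks discussed here, not the masking scheme proposed in this work. The function names and the `block_size` parameter are illustrative assumptions.

```python
def blockwise_causal_mask(num_frames, block_size):
    """Return a boolean matrix where mask[i][j] is True if query frame i
    may attend to key frame j.

    Frames are grouped into consecutive blocks of `block_size`. A frame sees
    every frame in its own block and in all earlier blocks, but nothing later.
    """
    if num_frames < 0 or block_size < 1:
        raise ValueError("num_frames must be >= 0 and block_size >= 1")
    return [
        [(j // block_size) <= (i // block_size) for j in range(num_frames)]
        for i in range(num_frames)
    ]


def framewise_causal_mask(num_frames):
    """Framewise causality is the special case of one frame per block."""
    return blockwise_causal_mask(num_frames, 1)


# Print a 4-frame mask with blocks of 2 frames ('x' = attention allowed).
for row in blockwise_causal_mask(4, 2):
    print("".join("x" if allowed else "." for allowed in row))
```

With such masks, the visible context changes abruptly at every frame (or block) boundary, which is one plausible reading of why naive use leads to the framewise or blockwise jittering shown in the video.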



Video 2-4. Ablation Study on Attention Masks

Naive integration of framewise (or blockwise) causal masks produces temporal inconsistency, resulting in framewise or blockwise jittering.





3. Comprehensive Analysis

Video 3-1. Comparison Results with Talking Head Avatar Generation Models



Video 3-2. Comparison Results with Talking Head Avatar Generation Models



Video 3-3. Comparison Results with Listening Head Avatar Generation Models

Since the implementations of the baselines are not available, we reuse the demo video results from INFP (CVPR 2025).



Video 3-4. Comparison Results with Listening Head Avatar Generation Models

Since the implementations of the baselines are not available, we reuse the demo video results from INFP (CVPR 2025).





Video 4-1. Example of Human Evaluation Test Sheet

Avatar A: Avatar Forcing / Avatar B: INFP*. Please note that our model produces more expressive facial motions. You can find examples of the answer sheets and instructions in 5715_suppl.pdf.



Video 4-2. Example of Human Evaluation Test Sheet

Avatar A: Avatar Forcing / Avatar B: INFP*. Please note that our model produces more expressive facial motions. You can find examples of the answer sheets and instructions in 5715_suppl.pdf.





Thank you for watching!