PixSwap: High-Resolution Face Swapping for Effective Reflection of Identity via Pixel-Level Supervision with Synthetic Paired Dataset
Abstract
Face swapping aims to transfer identity features such as the eyes, nose, and lips from a source face to a target face while preserving target attributes such as expression, pose, skin color, and hair. Despite considerable advances in quality over the years, recent studies on high-resolution face swapping still struggle to reflect the source identity and to deliver robust performance. We identify two primary causes of these challenges: (1) the absence of pixel-level supervision and (2) limitations in the model architecture and pipeline. To address the first problem, we construct a pseudo-ground-truth paired dataset that provides the pixel-level supervision essential for handling attributes and identity during face swapping. Models trained on this paired dataset reflect the source identity significantly better than those trained without it. As for the second problem, existing StyleGAN-based approaches often underutilize the source latent vectors or rely heavily on pre-trained models during inference, resulting in incomplete identity representation or reduced robustness. We therefore introduce a novel face-swapping model that leverages the spatial information of the target attributes while fully exploiting the features of the source identity. Thanks to the effective paired dataset and network, our model achieves state-of-the-art performance in both image- and video-level face swapping, notably improving source identity reflection while preserving the target attributes. Extensive experiments validate the superior performance of our model over existing baselines.
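To make the central idea of pixel-level supervision concrete, the following is a minimal PyTorch-style sketch of a training step that penalizes the swapped output pixel-wise against a pseudo ground truth from a synthetic paired dataset. All names here (the generator module, the batch fields, the choice of an L1 loss) are hypothetical illustrations; the abstract does not specify the paper's actual architecture or loss functions.

    # Minimal sketch: pixel-level supervision with a synthetic paired dataset.
    # The generator and batch layout are assumptions, not the paper's method.
    import torch
    import torch.nn.functional as F

    def pixel_supervised_step(generator, batch, optimizer):
        """One training step: swap the source identity onto the target face
        and penalize the result pixel-wise against a pseudo ground truth."""
        source = batch["source"]        # face providing identity (eyes, nose, lips)
        target = batch["target"]        # face providing attributes (pose, expression)
        pseudo_gt = batch["pseudo_gt"]  # synthetic swapped result used as supervision

        swapped = generator(source, target)

        # Pixel-level reconstruction loss, only possible because a paired
        # (source, target, pseudo ground truth) triplet is available.
        loss = F.l1_loss(swapped, pseudo_gt)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

Without such paired data, face-swapping models must fall back on indirect signals (e.g., identity or attribute losses from pre-trained networks), which is the gap the paired dataset is meant to close.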
Related Material
[pdf] [supp] [bibtex]
@InProceedings{Kim_2025_WACV,
    author    = {Kim, Taewoo and Lee, Geonsu and Lee, Hyukgi and Kim, Seongtae and Lee, Younggun},
    title     = {PixSwap: High-Resolution Face Swapping for Effective Reflection of Identity via Pixel-Level Supervision with Synthetic Paired Dataset},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {3742-3751}
}