High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model

Guo, Mingtao; Xing, Guanyu; Liu, Yanli

Mingtao Guo, Guanyu Xing, Yanli Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, pp. 228-238

Abstract

Relightable portrait animation aims to animate a static reference portrait to match the head movements of a driving video while adapting to user-specified or reference lighting conditions. Existing portrait animation methods fail to achieve relightable portraits because they do not separate and manipulate intrinsic (identity and appearance) and extrinsic (pose and lighting) features. In this paper, we present a Lighting-Controllable Video Diffusion model (LCVD) for high-fidelity, relightable portrait animation. We address this limitation by distinguishing these feature types through dedicated subspaces within the feature space of a pre-trained image-to-video diffusion model. Specifically, for each frame, we use the 3D mesh of the reference portrait, the pose of the driving frame, and the specified lighting to render an image called a shading hint. While the reference image represents the intrinsic attributes, the shading hint encodes the extrinsic attributes. In the training phase, we employ a reference adapter to map the reference into an intrinsic feature subspace and a shading adapter to map the shading hints into an extrinsic feature subspace. By merging features from these subspaces, the model achieves nuanced control over lighting and pose in generated animations. Extensive evaluations show that LCVD outperforms state-of-the-art methods in lighting realism, image quality, and video consistency, setting a new benchmark in relightable portrait animation. Our project is available at https://github.com/MingtaoGuo/Relightable-Portrait-Animation
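To make the idea of a shading hint more concrete, the following is a minimal sketch of how surface geometry, a driving pose, and a chosen light could be combined into a per-pixel shading value. It is not the authors' renderer: the Lambertian reflectance model, the single directional light, the yaw-only head rotation, and the names `rotate_yaw`, `normalize`, `shading_hint`, and `ambient` are all assumptions made for illustration.

```python
import math


def rotate_yaw(vec, yaw_rad):
    """Rotate a 3D vector about the vertical (y) axis by yaw_rad radians."""
    x, y, z = vec
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x + s * z, y, -s * x + c * z)


def normalize(vec):
    """Return vec scaled to unit length."""
    length = math.sqrt(sum(c * c for c in vec))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / length for c in vec)


def shading_hint(normals, yaw_rad, light_dir, ambient=0.1):
    """Compute one shading value in [0, 1] per surface normal.

    normals   : surface normals of the reference mesh (hypothetical input)
    yaw_rad   : head yaw taken from the driving frame (assumed yaw-only pose)
    light_dir : direction pointing toward the light (assumed directional light)
    ambient   : constant floor so unlit regions are not fully black (assumption)
    """
    light = normalize(light_dir)
    hint = []
    for normal in normals:
        # Apply the driving pose to the reference geometry.
        nx, ny, nz = rotate_yaw(normalize(normal), yaw_rad)
        # Lambertian term: cosine between normal and light, clamped at zero.
        diffuse = max(0.0, nx * light[0] + ny * light[1] + nz * light[2])
        hint.append(min(1.0, ambient + (1.0 - ambient) * diffuse))
    return hint


if __name__ == "__main__":
    # A normal facing the camera, lit from the front and then from behind.
    print(shading_hint([(0.0, 0.0, 1.0)], 0.0, (0.0, 0.0, 1.0)))
    print(shading_hint([(0.0, 0.0, 1.0)], 0.0, (0.0, 0.0, -1.0)))
```

In this sketch the shading values depend only on geometry, pose, and light, and carry no colour or texture from the portrait. That mirrors the split described in the abstract, where the shading hint encodes the extrinsic attributes and the reference image supplies the intrinsic ones.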

Related Material

@InProceedings{Guo_2025_CVPR,
    author    = {Guo, Mingtao and Xing, Guanyu and Liu, Yanli},
    title     = {High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {228-238}
}