CPTNet: Cascade Pose Transform Network for Single Image Talking Head Animation

Jiale Zhang, Ke Xian, Chengxin Liu, Yinpeng Chen, Zhiguo Cao, Weicai Zhong; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020

Abstract


We study the problem of talking head animation from a single image. Most existing methods focus on generating talking heads for humans; little attention has been paid to creating talking head anime. In this paper, our goal is to synthesize vivid talking heads from a single anime image. To this end, we propose the cascade pose transform network, termed CPTNet, which consists of a face pose transform network and a head pose transform network. Specifically, we introduce a mask generator to animate facial expression (e.g., closing eyes and opening mouth) and a grid generator for head movement animation, followed by a fusion module to generate talking heads. To handle large motion and obtain more accurate results, we design a pose vector decomposition and cascaded refinement strategy. In addition, we create an anime talking head dataset that includes various anime characters and poses to train our model. Extensive experiments on our dataset demonstrate that our model outperforms other methods, generating more accurate and vivid talking heads from a single anime image.
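The abstract describes a two-branch cascade: a face pose transform (mask-based expression editing) followed by a head pose transform (grid-based warping), with the cascade applied repeatedly as refinement. Below is a minimal PyTorch-style sketch of that idea. All module names, channel sizes, the pose-vector split, and the mask-based blend standing in for the fusion module are assumptions for illustration only, not the authors' implementation.

```python
# Hypothetical sketch of the cascade structure described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))


class FacePoseTransform(nn.Module):
    """Predicts a soft mask and a color change to animate facial expression
    (e.g., close eyes / open mouth), conditioned on the expression pose."""
    def __init__(self, pose_dim=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(conv_block(3 + pose_dim, ch), conv_block(ch, ch),
                                 nn.Conv2d(ch, 4, 3, padding=1))  # 3 color + 1 mask channels

    def forward(self, img, pose):
        b, _, h, w = img.shape
        pose_map = pose.view(b, -1, 1, 1).expand(-1, -1, h, w)
        out = self.net(torch.cat([img, pose_map], dim=1))
        color, mask = out[:, :3], torch.sigmoid(out[:, 3:])
        # Mask-based blend of edited and original pixels (stands in for the fusion step).
        return mask * torch.tanh(color) + (1.0 - mask) * img


class HeadPoseTransform(nn.Module):
    """Predicts a dense sampling grid that warps the image for head movement."""
    def __init__(self, pose_dim=3, ch=32):
        super().__init__()
        self.net = nn.Sequential(conv_block(3 + pose_dim, ch), conv_block(ch, ch),
                                 nn.Conv2d(ch, 2, 3, padding=1))  # (dx, dy) flow offsets

    def forward(self, img, pose):
        b, _, h, w = img.shape
        pose_map = pose.view(b, -1, 1, 1).expand(-1, -1, h, w)
        flow = self.net(torch.cat([img, pose_map], dim=1)).permute(0, 2, 3, 1)
        # Identity grid in [-1, 1] plus predicted offsets.
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        identity = torch.stack([xs, ys], dim=-1).to(img.device).unsqueeze(0)
        return F.grid_sample(img, identity + flow, align_corners=True)


class CPTNetSketch(nn.Module):
    """Cascade: animate the face first, then warp for head pose; repeat the cascade
    on the intermediate result to mimic cascaded refinement."""
    def __init__(self, expr_dim=3, head_dim=3):
        super().__init__()
        self.face = FacePoseTransform(expr_dim)
        self.head = HeadPoseTransform(head_dim)

    def forward(self, img, pose, n_stages=2):
        expr_pose, head_pose = pose[:, :3], pose[:, 3:]  # pose vector decomposition (assumed 3+3 split)
        out = img
        for _ in range(n_stages):                        # cascaded refinement
            out = self.head(self.face(out, expr_pose), head_pose)
        return out


# Usage: a 6-D pose (3 expression + 3 head rotation) driving a single 256x256 image.
model = CPTNetSketch()
frame = model(torch.rand(1, 3, 256, 256), torch.rand(1, 6))
print(frame.shape)  # torch.Size([1, 3, 256, 256])
```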

Related Material


[pdf]
[bibtex]
@InProceedings{Zhang_2020_ACCV,
    author    = {Zhang, Jiale and Xian, Ke and Liu, Chengxin and Chen, Yinpeng and Cao, Zhiguo and Zhong, Weicai},
    title     = {CPTNet: Cascade Pose Transform Network for Single Image Talking Head Animation},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}