Focus and Retain: Complement the Broken Pose in Human Image Synthesis

Ge, Pu; Huang, Qiushi; Xiang, Wei; Jing, Xue; Li, Yule; Li, Yiyong; Sun, Zhun

Pu Ge, Qiushi Huang, Wei Xiang, Xue Jing, Yule Li, Yiyong Li, Zhun Sun; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 3370-3379

Abstract

Generating an image of a specific style in a given target pose remains an ill-posed and thus complicated problem. Most recent works treat human pose synthesis as an image spatial transformation problem solved with flow-warping techniques. However, we observe that, due to the inherent ill-posed nature of many complicated human poses, previous methods fail to generate body parts. To tackle this problem, we propose a feature-level flow attention module and an Enhancer Network. The flow attention module produces a flow attention mask that guides the combination of the flow-warped features and the structural pose features. We then apply the Enhancer Network to refine the coarse image by injecting the pose information. We evaluate our method both qualitatively and quantitatively on the DeepFashion, Market-1501, and YouTube dance datasets. Quantitatively, our method achieves an FID of 12.995 on DeepFashion, 25.459 on Market-1501, and 14.516 on the YouTube dance dataset, outperforming some state-of-the-art methods, including Guided-Pix2Pix, Global-Flow-Local-Attn, and CoCosNet.
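
To make the role of the flow attention mask concrete, the following is a minimal sketch of one way a mask could guide the combination of flow-warped features and structural pose features. The abstract does not state the exact formulation, so the convex combination, the sigmoid that squashes the mask, and the names `blend_features`, `warped`, `pose`, and `mask_logits` are all assumptions made for illustration. Plain Python lists stand in for feature tensors so that the example runs without dependencies.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a mask value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def blend_features(warped, pose, mask_logits):
    """Blend two equally sized feature lists with a soft attention mask.

    Assumed form (not taken from the paper):
        out = m * warped + (1 - m) * pose,  with m = sigmoid(mask_logits)
    A mask value near 1 keeps the flow-warped feature; a value near 0
    falls back to the structural pose feature.
    """
    if not (len(warped) == len(pose) == len(mask_logits)):
        raise ValueError("all inputs must have the same length")
    blended = []
    for w, p, logit in zip(warped, pose, mask_logits):
        m = sigmoid(logit)  # hypothetical per-position mask value
        blended.append(m * w + (1.0 - m) * p)
    return blended
```

Under this assumed form, a mask logit of 0 yields the midpoint of the two features, while strongly positive or negative logits select the warped or the pose feature, respectively. This is one plausible reading of how a mask can fill in regions where warping alone is unreliable.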

Related Material

@InProceedings{Ge_2021_WACV,
    author    = {Ge, Pu and Huang, Qiushi and Xiang, Wei and Jing, Xue and Li, Yule and Li, Yiyong and Sun, Zhun},
    title     = {Focus and Retain: Complement the Broken Pose in Human Image Synthesis},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2021},
    pages     = {3370-3379}
}