GraFormer: Graph-Oriented Transformer for 3D Pose Estimation

Weixi Zhao, Weiqiang Wang, Yunjie Tian; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 20438-20447


In 2D-to-3D pose estimation, it is important to exploit the spatial constraints of 2D joints, but these constraints are not yet well modeled. To better model the relations among joints for 3D pose estimation, we propose a simple but effective network, called GraFormer, in which a novel transformer architecture is designed by embedding graph convolution layers after the multi-head attention block. GraFormer is built by repeatedly stacking the GraAttention block and the ChebGConv block. The GraAttention block is a new transformer block designed for processing graph-structured data; it learns better features by capturing global information from all the nodes as well as the explicit adjacency structure of the nodes. To model the implicit high-order connection relations among non-neighboring nodes, the ChebGConv block is introduced to exchange information between non-neighboring nodes and attain a larger receptive field. We empirically show the superiority of GraFormer through extensive experiments on popular public datasets. Specifically, GraFormer outperforms the state-of-the-art GraphSH on the Human3.6M dataset while containing only 18% of its parameters.
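The claim that a Chebyshev graph convolution lets non-neighboring nodes exchange information can be illustrated with a minimal NumPy sketch. This is the generic Chebyshev polynomial graph convolution, not the authors' ChebGConv block; the function names, the 5-node chain graph, and the constant weights are all assumptions made for illustration. Perturbing node 0 leaves node 2 (two hops away) unchanged when only orders 0 and 1 are used, but changes it once the order-2 term is added.

```python
import numpy as np

def scaled_laplacian(adj):
    """Rescale the normalized graph Laplacian so its eigenvalues lie in [-1, 1]."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    mask = deg > 0
    d_inv_sqrt[mask] = deg[mask] ** -0.5
    n = len(adj)
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(n)

def cheb_gconv(x, lap_scaled, weights):
    """Sum of T_k(L~) @ x @ W_k, with one weight matrix per Chebyshev order."""
    t_prev, t_curr = x, lap_scaled @ x          # T_0 x and T_1 x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for w in weights[2:]:
        # Chebyshev recurrence: T_k = 2 L~ T_{k-1} - T_{k-2}
        t_prev, t_curr = t_curr, 2.0 * lap_scaled @ t_curr - t_prev
        out = out + t_curr @ w
    return out

# Hypothetical 5-joint chain 0-1-2-3-4 standing in for a skeleton graph.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
lap_scaled = scaled_laplacian(adj)

x = np.arange(10, dtype=float).reshape(5, 2)    # 2D input per joint
x_shifted = x.copy()
x_shifted[0] += 1.0                             # perturb joint 0 only

# Illustrative constant weights: 2 input channels, 3 output channels.
weights = [np.full((2, 3), 0.1 * (k + 1)) for k in range(3)]

out_order1 = cheb_gconv(x, lap_scaled, weights[:2])
out_order1_shifted = cheb_gconv(x_shifted, lap_scaled, weights[:2])
out_order2 = cheb_gconv(x, lap_scaled, weights)
out_order2_shifted = cheb_gconv(x_shifted, lap_scaled, weights)

# Does the perturbation at joint 0 reach joint 2 (two hops away)?
reaches_at_order1 = not np.allclose(out_order1[2], out_order1_shifted[2])
reaches_at_order2 = not np.allclose(out_order2[2], out_order2_shifted[2])
print(reaches_at_order1, reaches_at_order2)
```

Each additional Chebyshev order extends the receptive field by one hop, which is the general mechanism behind the larger receptive field mentioned above. How GraFormer combines this with attention is described in the paper itself.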

Related Material

@InProceedings{Zhao_2022_CVPR,
    author    = {Zhao, Weixi and Wang, Weiqiang and Tian, Yunjie},
    title     = {GraFormer: Graph-Oriented Transformer for 3D Pose Estimation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {20438-20447}
}