Vision HGNN: An Image is More than a Graph of Nodes

Yan Han, Peihao Wang, Souvik Kundu, Ying Ding, Zhangyang Wang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 19878-19888

Abstract


The realm of graph-based modeling has proven its adaptability across diverse real-world data types. However, its applicability to general computer vision tasks had been limited until the introduction of the Vision Graph Neural Network (ViG). ViG divides input images into patches, conceptualized as nodes, constructing a graph through connections to nearest neighbors. Nonetheless, this method of graph construction confines itself to simple pairwise relationships, leading to surplus edges and unwarranted memory and computation expenses. In this paper, we enhance ViG by transcending conventional "pairwise" linkages and harnessing the power of the hypergraph to encapsulate image information. Our objective is to encompass more intricate inter-patch associations. In both training and inference phases, we adeptly establish and update the hypergraph structure using the Fuzzy C-Means method, ensuring minimal computational burden. This augmentation yields the Vision HyperGraph Neural Network (ViHGNN). The model's efficacy is empirically substantiated through its state-of-the-art performance on both image classification and object detection tasks, courtesy of the hypergraph structure learning module that uncovers higher-order relationships. Our code is available at: https://github.com/VITA-Group/ViHGNN.
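To make the hypergraph idea in the abstract concrete, here is a minimal sketch of how Fuzzy C-Means memberships could be turned into hyperedges over patch features. It is an illustration under stated assumptions, not the authors' implementation: the function names `fuzzy_c_means` and `build_hyperedges`, the fuzzifier `m`, the membership `threshold`, the farthest-point initialisation, and the toy 2-D "patch features" are all hypothetical choices made for this example. The released code at the URL above is the reference for what ViHGNN actually does.

```python
import math


def _memberships(points, centers, m):
    """Soft assignment of every point to every centre (each row sums to 1)."""
    exponent = -2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point lying exactly on a centre does not divide by zero.
        inv = [max(math.dist(p, c), 1e-12) ** exponent for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iters=30):
    """Plain Fuzzy C-Means; returns (centres, membership matrix)."""
    dim = len(points[0])
    # Assumption: deterministic farthest-point initialisation, chosen for reproducibility.
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(farthest))
    for _ in range(n_iters):
        u = _memberships(points, centers, m)
        for j in range(n_clusters):
            # Each centre is the mean of all points weighted by membership ** m.
            w = [row[j] ** m for row in u]
            total = sum(w)
            centers[j] = [
                sum(wi * p[k] for wi, p in zip(w, points)) / total
                for k in range(dim)
            ]
    return centers, _memberships(points, centers, m)


def build_hyperedges(memberships, threshold=0.3):
    """One hyperedge per cluster: all nodes whose membership reaches the threshold."""
    n_clusters = len(memberships[0])
    return [
        [i for i, row in enumerate(memberships) if row[j] >= threshold]
        for j in range(n_clusters)
    ]


# Toy "patch features": two tight groups plus one patch lying between them.
patches = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5.3, 5.3)]
centers, U = fuzzy_c_means(patches, n_clusters=2)
hyperedges = build_hyperedges(U, threshold=0.3)
print(hyperedges)
```

Unlike a k-nearest-neighbour graph, where each edge joins exactly two patches, a hyperedge here joins every patch with sufficient membership in a cluster, and the soft memberships let an ambiguous patch belong to more than one hyperedge.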

Related Material

[bibtex]
@InProceedings{Han_2023_ICCV,
    author    = {Han, Yan and Wang, Peihao and Kundu, Souvik and Ding, Ying and Wang, Zhangyang},
    title     = {Vision HGNN: An Image is More than a Graph of Nodes},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {19878-19888}
}