Object-Relation Reasoning Graph for Action Recognition

Yangjun Ou, Li Mi, Zhenzhong Chen; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 20133-20142


Action recognition is challenging because the attributes of objects and the relationships between them change constantly throughout a video. Existing methods mainly use object-level graphs or scene graphs to represent the dynamics of objects and relationships, but they do not model fine-grained relationship transitions directly. In this paper, we propose an Object-Relation Reasoning Graph (OR2G) for reasoning about actions in videos. By combining an object-level graph (OG) and a relation-level graph (RG), OR2G captures the attribute transitions of objects and simultaneously reasons about the relationship transitions between objects. In addition, we introduce a graph aggregating module (GAM) that applies a multi-head edge-to-node message passing operation. GAM feeds information from the relation nodes back to the object nodes, strengthening the coupling between the object-level graph and the relation-level graph. Experiments on video action recognition demonstrate the effectiveness of our approach compared with state-of-the-art methods.
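
The abstract does not spell out how the edge-to-node message passing is computed, so the following is only a minimal NumPy sketch of one common way to realize a multi-head edge-to-node update: each object node attends over the relation nodes that point to it and adds the aggregated message to its own feature. The function name `edge_to_node`, the projection matrices `Wq`, `Wk`, `Wv`, the residual update, and the toy sizes are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def edge_to_node(obj, rel, receivers, Wq, Wk, Wv, num_heads):
    """Hypothetical multi-head edge-to-node update.

    obj:       (N, D) object-node features
    rel:       (E, D) relation-node features
    receivers: (E,)   index of the object node each relation feeds back to
    Returns the updated object features (N, D) and the attention
    weights (N, E, num_heads).
    """
    N, D = obj.shape
    E = rel.shape[0]
    d = D // num_heads
    q = (obj @ Wq).reshape(N, num_heads, d)   # queries from object nodes
    k = (rel @ Wk).reshape(E, num_heads, d)   # keys from relation nodes
    v = (rel @ Wv).reshape(E, num_heads, d)   # values from relation nodes

    # scores[n, e, h]: affinity between object n and relation e in head h.
    scores = np.einsum("nhd,ehd->neh", q, k) / np.sqrt(d)

    # An object only receives messages from relations that point to it.
    mask = (receivers[None, :] == np.arange(N)[:, None])[:, :, None]
    scores = np.where(mask, scores, -1e9)
    attn = softmax(scores, axis=1) * mask     # zero rows for isolated nodes

    # Aggregate relation values per head, concatenate heads, add residual.
    msg = np.einsum("neh,ehd->nhd", attn, v).reshape(N, D)
    return obj + msg, attn

# Toy example: 3 objects, one relation node per ordered pair (i, j), i != j.
rng = np.random.default_rng(0)
N, D, H = 3, 8, 2
pairs = [(i, j) for i in range(N) for j in range(N) if i != j]
receivers = np.array([j for _, j in pairs])
obj = rng.normal(size=(N, D))
rel = rng.normal(size=(len(pairs), D))
Wq, Wk, Wv = (0.1 * rng.normal(size=(D, D)) for _ in range(3))

updated, attn = edge_to_node(obj, rel, receivers, Wq, Wk, Wv, H)
```

In this sketch the mask restricts each object to its incoming relations, so the attention weights for every object and head sum to one over those relations only; an object with no incoming relation would be left unchanged by the residual update.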

Related Material

@InProceedings{Ou_2022_CVPR,
    author    = {Ou, Yangjun and Mi, Li and Chen, Zhenzhong},
    title     = {Object-Relation Reasoning Graph for Action Recognition},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {20133-20142}
}