Scene Graph Generation by Iterative Message Passing

Danfei Xu, Yuke Zhu, Christopher B. Choy, Li Fei-Fei; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5410-5419

Abstract


Understanding a visual scene goes beyond recognizing individual objects in isolation. Relationships between objects also constitute rich semantic information about the scene. In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image. We propose a novel end-to-end model that generates such structured scene representation from an input image. Our key insight is that the graph generation problem can be formulated as message passing between the primal node graph and its dual edge graph. Our joint inference model can take advantage of contextual cues to make better predictions on objects and their relationships. The experiments show that our model significantly outperforms previous methods on the Visual Genome dataset as well as support relation inference in NYU Depth V2 dataset.

Related Material


[pdf] [arXiv] [poster]
[bibtex]
@InProceedings{Xu_2017_CVPR,
author = {Xu, Danfei and Zhu, Yuke and Choy, Christopher B. and Fei-Fei, Li},
title = {Scene Graph Generation by Iterative Message Passing},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}