A Simple Baseline for Weakly-Supervised Scene Graph Generation

Jing Shi, Yiwu Zhong, Ning Xu, Yin Li, Chenliang Xu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 16393-16402

Abstract


We investigate weakly-supervised scene graph generation, a challenging task because no correspondence between labels and objects is provided. Previous work treats this correspondence as a latent variable that is iteratively updated via nested optimization of the scene graph generation objective. We instead reduce the complexity by decoupling the problem into two stages: an efficient first-order graph matching module, optimized via contrastive learning, recovers the correspondence, which is then used to train a standard scene graph generation model. Extensive experiments show that this simple pipeline surpasses the previous state of the art by more than 30% on the Visual Genome dataset, in terms of both graph matching accuracy and scene graph quality. We believe this work serves as a strong baseline for future research.
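The abstract gives no implementation details, so the following is only a minimal sketch of what a first-order graph matching step trained with a contrastive objective might look like. It is not the authors' code. "First-order" is read here as matching each label node to a region using node-to-region similarity alone, with no edge terms. The function names, the cosine similarity, the mean-of-max image score, the InfoNCE-style loss, and the temperature value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def similarity_matrix(label_embs, region_embs):
    """Rows index label nodes, columns index region proposals."""
    return [[cosine(l, r) for r in region_embs] for l in label_embs]


def first_order_match(sim):
    """Assign each label node to its most similar region.

    Only unary (node-to-region) affinities are used; relations between
    nodes play no role. Several labels may pick the same region.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


def graph_image_score(sim):
    """Assumed image-level score: mean over labels of the best region similarity."""
    return sum(max(row) for row in sim) / len(sim)


def contrastive_loss(pos_score, neg_scores, temperature=0.1):
    """InfoNCE-style loss (an assumption): the paired image should
    outscore unpaired images for the same label set."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Hypothetical 2-D embeddings for two label nodes and three region proposals.
labels = [[1.0, 0.0], [0.0, 1.0]]
regions = [[0.1, 0.9], [0.9, 0.2], [0.5, 0.5]]

sim = similarity_matrix(labels, regions)
matches = first_order_match(sim)  # label index -> region index
```

In this toy case the first label is matched to the second region and the second label to the first region. In a two-stage pipeline of the kind the abstract describes, matches like these would serve as pseudo ground truth for training a standard scene graph generation model. A one-to-one assignment (for example, via the Hungarian algorithm) could replace the per-row argmax if each region should receive at most one label.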

Related Material


[bibtex]
@InProceedings{Shi_2021_ICCV,
    author    = {Shi, Jing and Zhong, Yiwu and Xu, Ning and Li, Yin and Xu, Chenliang},
    title     = {A Simple Baseline for Weakly-Supervised Scene Graph Generation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {16393-16402}
}