It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning

Monica Haurilet, Alina Roitberg, Rainer Stiefelhagen; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 1930-1939

Abstract


Visual Reasoning remains a challenging task, as it has to deal with long-range and multi-step object relationships in the scene. We present a new model for Visual Reasoning, aimed at capturing the interplay among individual objects in the image represented as a scene graph. As not all graph components are relevant for the query, we introduce the concept of a question-based visual guide, which constrains the potential solution space by learning an optimal traversal scheme, where the final destination nodes alone are used to produce the answer. We show that finding relevant semantic structures facilitates generalization to new tasks by introducing a novel problem of knowledge transfer: training on one question type and answering questions from a different domain without any training data. Furthermore, we report state-of-the-art results for Visual Reasoning on multiple query types and diverse image and video datasets.

Related Material


[bibtex]
@InProceedings{Haurilet_2019_CVPR,
author = {Haurilet, Monica and Roitberg, Alina and Stiefelhagen, Rainer},
title = {It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}