Inductive Biases for Low Data VQA: A Data Augmentation Approach

Narjes Askarian, Ehsan Abbasnejad, Ingrid Zukerman, Wray Buntine, Gholamreza Haffari; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2022, pp. 231-240

Abstract


Visual question answering (VQA) is the problem of understanding rich image contexts and answering complex natural language questions about them. VQA models have recently achieved remarkable results when trained on large-scale labeled datasets. However, annotating large amounts of data is not feasible in many domains. In this paper, we address the problem of VQA in a low-labeled data regime, which is under-explored in the literature. We take a data augmentation approach that enlarges the initial small labeled dataset in order to inject proper inductive biases into the VQA model. We encode the additional inductive biases in the questions by generating new questions from the image annotations. Our results show accuracy improvements of up to 34% compared to baselines trained only on the initial labeled data.
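
The idea of producing new questions from image annotations can be illustrated with a minimal sketch. This is not the authors' method: the function name `augment_questions`, the two question templates, and the assumption that annotations arrive as a flat list of object labels are all hypothetical choices made for illustration.

```python
from collections import Counter


def augment_questions(annotations):
    """Build (question, answer) pairs from one image's object labels.

    Hypothetical illustration only: the templates below are assumptions,
    not the question types used in the paper.
    """
    counts = Counter(annotations)
    pairs = []
    # Sort labels so the output order is deterministic.
    for label, n in sorted(counts.items()):
        # Existence question: the label is annotated, so the answer is "yes".
        pairs.append((f"Is there a {label} in the image?", "yes"))
        # Counting question: the answer is the number of annotated instances.
        pairs.append((f"How many {label} objects are in the image?", str(n)))
    return pairs


# Example: an image annotated with two dogs and one ball.
labels = ["dog", "dog", "ball"]
pairs = augment_questions(labels)
for question, answer in pairs:
    print(question, "->", answer)
```

Each generated pair could be added to the small labeled set alongside the original image, so that the model sees question types tied directly to the annotated content.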

Related Material

@InProceedings{Askarian_2022_WACV,
    author    = {Askarian, Narjes and Abbasnejad, Ehsan and Zukerman, Ingrid and Buntine, Wray and Haffari, Gholamreza},
    title     = {Inductive Biases for Low Data VQA: A Data Augmentation Approach},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2022},
    pages     = {231-240}
}