Choosing 'Right' from Wrong: A Closer Look at Selection Bias in Spatial Multiple-Choice Questions in Large Multimodal Models

Giselle Zeno, Nour Jedidi, Steven Gomez; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2025, pp. 535-544

Abstract


This paper examines the limitations of a probing method for determining whether Large Multimodal Models (LMMs) actually understand spatial concepts in image inputs, which may be important for users calibrating their trust during visual question answering. A typical approach is to prompt these models to choose the correct matching text caption from multiple-choice options. However, similar prompting with Large Language Models has been shown to give biased results as an artifact of the multiple-choice question (MCQ) format. We explore the extent to which this bias persists for spatial understanding tasks in LMMs, which take visual inputs alongside the MCQ text prompt, and go beyond recent work in characterizing this selection bias for current models. First, we demonstrate how image-text alignment benchmarks can be used to formulate a question for the LMM to reveal potential bias. We discuss prompting options and approaches to analyzing the outputs. Then, we use eight LMM architectures to perform this task using image-caption sets from the What's Up benchmark. We find that this multiple-choice selection bias exists in LMMs even with state-of-the-art models and under different methods of analysis. Finally, we discuss future steps, as well as alternatives to MCQs for probing spatial understanding in LMMs that could minimize bias introduced by the prompt format.
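As a rough illustration of the kind of probe the abstract describes, the sketch below builds one MCQ prompt per ordering of an image's candidate captions and then summarizes how often each option letter is chosen. It is not the authors' code: the prompt wording, the four-option setup, and the names `build_prompts` and `selection_rates` are all assumptions made for this example, and the model call itself is left out.

```python
from collections import Counter
from itertools import permutations

# Assumption: four options per question, labeled A-D.
LABELS = ["A", "B", "C", "D"]


def build_prompts(captions, correct_index):
    """Yield (prompt, correct_label) for every ordering of the captions.

    Presenting every ordering places the correct caption under each
    letter equally often, so a preference for a letter is not
    confounded with a preference for the right answer.
    """
    for order in permutations(range(len(captions))):
        lines = ["Which caption best describes the image?"]  # hypothetical wording
        for label, idx in zip(LABELS, order):
            lines.append(f"{label}. {captions[idx]}")
        lines.append("Answer with a single letter.")
        correct_label = LABELS[order.index(correct_index)]
        yield "\n".join(lines), correct_label


def selection_rates(chosen_labels):
    """Return the fraction of answers that fell on each option letter."""
    if not chosen_labels:
        raise ValueError("chosen_labels must not be empty")
    counts = Counter(chosen_labels)
    total = len(chosen_labels)
    return {label: counts.get(label, 0) / total for label in LABELS}
```

With four captions there are 24 orderings, and the correct caption sits under each letter in 6 of them. Under this setup, a model free of position bias that always picks the right caption would show a selection rate of 0.25 per letter, so a rate well above that for one letter would suggest a preference for the option position rather than the content.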

Related Material


@InProceedings{Zeno_2025_CVPR,
    author    = {Zeno, Giselle and Jedidi, Nour and Gomez, Steven},
    title     = {Choosing `Right' from Wrong: A Closer Look at Selection Bias in Spatial Multiple-Choice Questions in Large Multimodal Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {535-544}
}