Evaluating the Impact of Racial Cues on MLLMs Judgements of Politeness and Offensiveness

Mahammed Kamruzzaman, Gene Louis Kim; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 7663-7672

Abstract


Multimodal large language models (MLLMs) increasingly arbitrate socially sensitive decisions, yet we know little about how they respond to visual demographic cues. We present the first systematic audit comparing text-only versus image-only race-gender signals across four open-source MLLMs for two subjective tasks: politeness and offensiveness. Textual cues sharply amplify bias: models are far more likely to label the same utterance as "very offensive" or "not polite at all" when it is attributed to a Black persona, yet judge it "somewhat polite" or "not offensive" when linked to an Asian persona, with the steepest disparities at the Black-man and Black-woman intersections. Visual cues narrow these gaps but do not erase them. We also find higher model-human alignment for Asian and woman annotators than for Black and man annotators, and adding demographic information rarely improves agreement. Our findings highlight modality as a critical, yet overlooked, axis of fairness in MLLMs.

Related Material


[bibtex]
@InProceedings{Kamruzzaman_2025_ICCV,
  author    = {Kamruzzaman, Mahammed and Kim, Gene Louis},
  title     = {Evaluating the Impact of Racial Cues on MLLMs Judgements of Politeness and Offensiveness},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  month     = {October},
  year      = {2025},
  pages     = {7663-7672}
}