I Spy with My Little Eye A Minimum Cost Multicut Investigation of Dataset Frames

Katharina Prasse, Isaac Bravo, Stefanie Walter, Margret Keuper; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 2134-2143

Abstract


Visual framing analysis is a key method in the social sciences for determining common themes and concepts in a given discourse. To reduce manual effort, image clustering can significantly speed up the annotation process. In this work, we phrase the clustering task as a Minimum Cost Multicut Problem (MP). Solutions to the MP have been shown to provide clusterings that maximize the posterior probability solely from provided local pairwise probabilities of two images belonging to the same cluster. We discuss the efficacy of numerous embedding spaces for detecting visual frames and show the superiority of the MP over other clustering methods. To this end, we employ the climate change dataset ClimateTV, which contains images commonly used for visual frame analysis. For broad visual frames, DINOv2 is a suitable embedding space, while ConvNeXt V2 returns a larger number of clusters that capture fine-grained differences, e.g. speech and protest. Our insights into embedding space differences, in combination with a clustering that is optimal by definition, advance automated visual frame detection. Our code can be found at https://github.com/KathPra/MP4VisualFrameDetection.
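
To make the multicut formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used logit transform from a pairwise same-cluster probability p to an edge cost log(p / (1 - p)), and it finds the minimum cost multicut by exhaustively enumerating partitions, which is only feasible for a handful of nodes. The function names (`edge_cost`, `partitions`, `multicut_cost`, `brute_force_multicut`) and the toy probabilities are hypothetical and chosen for illustration; the linked repository may use different costs and a different solver.

```python
import math


def edge_cost(p, eps=1e-9):
    """Logit cost (assumed): positive if two images likely share a cluster."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0) and division by zero
    return math.log(p / (1 - p))


def partitions(items):
    """Yield every set partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + part


def multicut_cost(partition, probs):
    """Sum the costs of all edges whose endpoints lie in different blocks."""
    label = {node: k for k, block in enumerate(partition) for node in block}
    return sum(
        edge_cost(p) for (i, j), p in probs.items() if label[i] != label[j]
    )


def brute_force_multicut(nodes, probs):
    """Return the partition with minimum total cut cost (toy sizes only)."""
    return min(
        partitions(list(nodes)), key=lambda part: multicut_cost(part, probs)
    )


# Toy pairwise probabilities that two images belong to the same cluster.
probs = {
    (0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.85,
    (3, 4): 0.95,
    (0, 3): 0.1, (0, 4): 0.2, (1, 3): 0.15,
    (1, 4): 0.1, (2, 3): 0.2, (2, 4): 0.05,
}

best = brute_force_multicut(range(5), probs)
clusters = sorted(sorted(block) for block in best)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

Note that the number of clusters is never specified: it follows from the edge costs alone, since cutting an edge with p < 0.5 lowers the objective and cutting an edge with p > 0.5 raises it. Real instances are NP-hard to solve exactly and call for dedicated multicut solvers or heuristics rather than enumeration.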

Related Material


[bibtex]
@InProceedings{Prasse_2025_WACV,
    author    = {Prasse, Katharina and Bravo, Isaac and Walter, Stefanie and Keuper, Margret},
    title     = {I Spy with My Little Eye A Minimum Cost Multicut Investigation of Dataset Frames},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {2134-2143}
}