Learning Intra-Class Multimodal Distributions With Orthonormal Matrices

Jumpei Goto, Yohei Nakata, Kiyofumi Abe, Yasunori Ishii, Takayoshi Yamashita; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 1870-1879


In this paper, we address the challenge of representing intra-class multimodal feature distributions in deep neural networks. Existing online clustering methods employ sub-centroids to capture intra-class variations. However, online clustering has two limitations: it assigns only a single sub-centroid to each feature vector extracted from a backbone, ignoring the relationship between that feature vector and the other sub-centroids, and updating sub-centroids in an online clustering manner incurs significant storage costs. To address these limitations, we propose a novel method that uses orthonormal matrices instead of sub-centroids, relaxing discrete assignments into continuous assignments. We update the orthonormal matrices using a gradient-based method, which eliminates the need for online clustering or additional storage. Experimental results on the CIFAR and ImageNet datasets show that the proposed method outperforms current online clustering techniques in classification accuracy, sub-category discovery, and transferability, providing an efficient solution to the challenges posed by complex recognition targets.
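The abstract does not state the formulation, so the following is only a minimal NumPy sketch of the general idea, under stated assumptions. It contrasts a discrete assignment (only the best-matching sub-centroid counts) with a continuous one (every column of an orthonormal matrix contributes). The scoring rule (norm of the projection onto the matrix columns), the QR-based re-orthonormalization after a gradient step, and all function names are illustrative assumptions, not the authors' method.

```python
import numpy as np


def random_orthonormal(dim, k, rng):
    """Return a (dim, k) matrix with orthonormal columns via reduced QR."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    return q


def hard_assignment_score(feature, sub_centroids):
    """Discrete assignment: keep only the best-matching sub-centroid."""
    sims = sub_centroids @ feature  # shape (k,)
    return float(sims.max()), int(sims.argmax())


def continuous_assignment_score(feature, basis):
    """Continuous assignment (assumed form): every column contributes
    through the norm of the projection coefficients."""
    coeffs = basis.T @ feature  # shape (k,)
    return float(np.linalg.norm(coeffs)), coeffs


def gradient_step_with_qr(feature, basis, lr=0.1):
    """One gradient-ascent step on ||basis.T @ feature||^2, followed by a
    QR re-orthonormalization (an assumed way to keep columns orthonormal)."""
    grad = 2.0 * np.outer(feature, basis.T @ feature)  # shape (dim, k)
    q, _ = np.linalg.qr(basis + lr * grad)
    return q


rng = np.random.default_rng(0)
dim, k = 8, 3

basis = random_orthonormal(dim, k, rng)
feature = rng.standard_normal(dim)
feature /= np.linalg.norm(feature)  # unit-length feature vector

# Treat the k columns as stand-in sub-centroids for the discrete baseline.
hard_score, hard_index = hard_assignment_score(feature, basis.T)
soft_score, coeffs = continuous_assignment_score(feature, basis)

# Update the matrix by gradient only; no cluster memory is stored.
new_basis = gradient_step_with_qr(feature, basis)
new_score, _ = continuous_assignment_score(feature, new_basis)
```

In this sketch the continuous score is never smaller than the magnitude of the discrete one, because it aggregates all projection coefficients instead of discarding all but the largest. The update needs only the current matrix and the gradient, which mirrors the abstract's point about avoiding extra storage.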

Related Material

@InProceedings{Goto_2024_WACV,
    author    = {Goto, Jumpei and Nakata, Yohei and Abe, Kiyofumi and Ishii, Yasunori and Yamashita, Takayoshi},
    title     = {Learning Intra-Class Multimodal Distributions With Orthonormal Matrices},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {1870-1879}
}