Learning Decorrelated Representations Efficiently Using Fast Fourier Transform

Yutaro Shigeto, Masashi Shimbo, Yuya Yoshikawa, Akikazu Takeuchi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 2052-2060

Abstract


Barlow Twins and VICReg are self-supervised representation learning models that use regularizers to decorrelate features. Although these models are as effective as conventional representation learning models, their training can be computationally demanding when the dimension d of the projected embeddings is high. Because the regularizers are defined in terms of individual elements of a cross-correlation or covariance matrix, computing the loss for n samples takes O(n d^2) time. In this paper, we propose a relaxed decorrelating regularizer that can be computed in O(n d log d) time using the Fast Fourier Transform. We also propose an inexpensive technique to mitigate undesirable local minima that develop with the relaxation. The proposed regularizer exhibits accuracy comparable to that of existing regularizers in downstream tasks, while its training requires less memory and is faster for large d. The source code is available.
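To illustrate the complexity gap described in the abstract, the following is a minimal NumPy sketch, not the authors' released code. It assumes that the relaxation works on sums along the wrap-around diagonals of the cross-correlation matrix rather than on its individual elements; the abstract does not spell this out, so treat it as an assumption. Under that assumption, the sums equal a circular cross-correlation of the two embedding batches, which the FFT gives without ever forming the d x d matrix. The function names `diagonal_sums_naive` and `diagonal_sums_fft` are hypothetical.

```python
import numpy as np


def diagonal_sums_naive(a, b):
    """O(n d^2) reference: form the d x d matrix, then sum wrap-around diagonals.

    a, b: arrays of shape (n, d) holding two batches of embeddings.
    Returns s of shape (d,), where s[k] = sum_i c[i, (i + k) mod d].
    """
    n, d = a.shape
    c = a.T @ b / n  # c[i, j] = mean over samples of a[:, i] * b[:, j]
    idx = np.arange(d)
    return np.array([c[idx, (idx + k) % d].sum() for k in range(d)])


def diagonal_sums_fft(a, b):
    """O(n d log d): the same sums via the circular cross-correlation theorem.

    Assumption (illustrative): the quantity of interest is the vector of
    wrap-around diagonal sums, so the d x d matrix is never materialized.
    """
    n, d = a.shape
    fa = np.fft.rfft(a, axis=1)  # per-sample FFT along the feature axis
    fb = np.fft.rfft(b, axis=1)
    # conj(F(a)) * F(b) is the spectrum of the circular cross-correlation;
    # averaging over samples in the frequency domain is valid by linearity.
    spectrum = (np.conj(fa) * fb).sum(axis=0) / n
    return np.fft.irfft(spectrum, n=d)
```

Both functions return the same vector up to floating-point error. The FFT version needs only O(n d) working memory, which is in line with the memory saving the abstract reports for large d.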

Related Material


@InProceedings{Shigeto_2023_CVPR,
    author    = {Shigeto, Yutaro and Shimbo, Masashi and Yoshikawa, Yuya and Takeuchi, Akikazu},
    title     = {Learning Decorrelated Representations Efficiently Using Fast Fourier Transform},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {2052-2060}
}