Orthogonal Transforms for Learning Invariant Representations in Equivariant Neural Networks

Jaspreet Singh, Chandan Singh, Ankur Rana; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 1523-1530

Abstract


The convolutional layers of standard convolutional neural networks (CNNs) are equivariant to translation. Recently, a new class of CNNs has been introduced that is also equivariant to other affine geometric transformations, such as rotation and reflection, by replacing the standard convolutional layer with a group convolutional layer or by using steerable filters in the convolutional layer. We propose to embed a 2D positional encoding that is invariant to rotation, reflection, and translation, computed with orthogonal polar harmonic transforms (PHTs), before the feature maps are flattened for the fully-connected or classification layers of the equivariant CNN architecture. Among several invariant transforms, we select the PHTs because they offer a favorable trade-off between accuracy and computational speed. The proposed 2D positional encoding scheme, placed between the convolutional and fully-connected layers of the equivariant networks, is shown to provide significant performance improvements on the rotated MNIST, CIFAR-10, and CIFAR-100 datasets.
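To make the idea concrete, the following is a minimal sketch (not the authors' code) of how a rotation-invariant descriptor based on a polar harmonic transform could be inserted between the convolutional feature maps and the classification layer. It uses the polar complex exponential transform (PCET) variant of the PHT family and takes the magnitudes of the moments, which are invariant to rotation of the input; the choice of orders, the normalisation, and the hook-up to an equivariant backbone are illustrative assumptions.

import torch
import torch.nn as nn


def pcet_magnitudes(fmap: torch.Tensor, max_order: int = 3) -> torch.Tensor:
    """Return |M_{n,l}| for n, l in [-max_order, max_order] per channel.

    fmap: (B, C, H, W) feature maps, with pixels mapped onto the unit disk.
    Output: (B, C * (2*max_order + 1)**2) rotation-invariant descriptors.
    """
    B, C, H, W = fmap.shape
    ys = torch.linspace(-1.0, 1.0, H, device=fmap.device)
    xs = torch.linspace(-1.0, 1.0, W, device=fmap.device)
    y, x = torch.meshgrid(ys, xs, indexing="ij")
    r2 = x**2 + y**2
    theta = torch.atan2(y, x)
    mask = (r2 <= 1.0).float()                      # keep the unit disk only

    feats = []
    for n in range(-max_order, max_order + 1):
        for l in range(-max_order, max_order + 1):
            # PCET basis H_{n,l}(r, theta) = exp(j*2*pi*n*r^2) * exp(j*l*theta)
            basis = torch.exp(1j * (2 * torch.pi * n * r2 + l * theta)) * mask
            m = (fmap.to(torch.cfloat) * torch.conj(basis)).sum(dim=(-2, -1))
            feats.append(m.abs())                    # magnitude is rotation invariant
    # Normalisation by the pixel count is an assumption for illustration.
    return torch.cat(feats, dim=1) / (H * W)


class InvariantHead(nn.Module):
    """Classification head fed with PCET magnitudes instead of raw flattened maps."""

    def __init__(self, channels: int, num_classes: int, max_order: int = 3):
        super().__init__()
        self.max_order = max_order
        dim = channels * (2 * max_order + 1) ** 2
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, fmap: torch.Tensor) -> torch.Tensor:
        return self.fc(pcet_magnitudes(fmap, self.max_order))

In this sketch, fmap would be the output of the (group-)equivariant convolutional stack, so rotations of the input that permute or rotate the feature maps leave the moment magnitudes, and hence the classifier input, approximately unchanged.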

Related Material


[pdf]
[bibtex]
@InProceedings{Singh_2023_WACV,
    author    = {Singh, Jaspreet and Singh, Chandan and Rana, Ankur},
    title     = {Orthogonal Transforms for Learning Invariant Representations in Equivariant Neural Networks},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {1523-1530}
}