Learning to Transform for Generalizable Instance-wise Invariance

Utkarsh Singhal, Carlos Esteves, Ameesh Makadia, Stella X. Yu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 6211-6221


Computer vision research has long aimed to build systems that are robust to transformations found in natural data. Traditionally, this is done using data augmentation or hard-coding invariances into the architecture. However, too much or too little invariance can hurt, and the correct amount is unknown a priori and dependent on the instance. Ideally, the appropriate invariance would be learned from data and inferred at test-time. We treat invariance as a prediction problem. Given any image, we predict a distribution over transformations. We use variational inference to learn this distribution end-to-end. Combined with a graphical model approach, this distribution forms a flexible, generalizable, and adaptive form of invariance. Our experiments show that it can be used to align datasets and discover prototypes, adapt to out-of-distribution poses, and generalize invariances across classes. When used for data augmentation, our method shows consistent gains in accuracy and robustness on CIFAR 10, CIFAR10-LT, and TinyImageNet.

Related Material

@InProceedings{Singhal_2023_ICCV, author = {Singhal, Utkarsh and Esteves, Carlos and Makadia, Ameesh and Yu, Stella X.}, title = {Learning to Transform for Generalizable Instance-wise Invariance}, booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)}, month = {October}, year = {2023}, pages = {6211-6221} }