Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling

Sourajit Saha, Tejas Gokhale; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 620-629

Abstract


Convolutional neural networks (CNNs), widely deployed across many applications, contain downsampling operators in their pooling layers that have been observed to be sensitive to pixel-level shifts, which degrades the robustness of CNNs. We study shift invariance through the lens of maximum sampling bias (MSB) and find that MSB is negatively correlated with shift invariance. Based on this insight, we propose a learnable pooling operator called Translation Invariant Polyphase Sampling (TIPS) that reduces MSB and learns translation-invariant representations. TIPS yields consistent performance gains on multiple benchmarks for image classification, object detection, and semantic segmentation in terms of accuracy, shift consistency, and shift fidelity, as well as improvements in adversarial and distributional robustness. TIPS achieves the lowest MSB among all previous methods, which explains its strong empirical results. TIPS can be integrated into any CNN and trained end-to-end with marginal computational overhead. Code: https://github.com/sourajitcs/tips/
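
To make the kind of operator described above concrete, here is a minimal PyTorch sketch of stride-2 polyphase downsampling with a learned, soft selection over the four polyphase components. This is not the authors' TIPS implementation (see the linked repository for that); the layer name `LearnablePolyphasePool2d`, the global-average scoring head, and the soft-combination rule are illustrative assumptions.

```python
# Sketch only: generic learnable polyphase pooling, not the paper's TIPS code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnablePolyphasePool2d(nn.Module):
    """Stride-2 pooling that softly combines the four polyphase components
    of the input instead of always keeping the (0, 0) sub-grid, which is a
    source of sampling bias in standard strided pooling."""

    def __init__(self, channels: int):
        super().__init__()
        # Tiny scoring head: one logit per polyphase component, computed
        # from that component's globally averaged features (an assumption,
        # not the paper's exact scoring function).
        self.score = nn.Linear(channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) with even H and W for simplicity.
        phases = [
            x[:, :, 0::2, 0::2],
            x[:, :, 0::2, 1::2],
            x[:, :, 1::2, 0::2],
            x[:, :, 1::2, 1::2],
        ]  # each component: (B, C, H/2, W/2)
        stacked = torch.stack(phases, dim=1)       # (B, 4, C, H/2, W/2)
        pooled = stacked.mean(dim=(-2, -1))        # (B, 4, C)
        logits = self.score(pooled).squeeze(-1)    # (B, 4)
        weights = F.softmax(logits, dim=1)         # soft phase selection
        # Weighted combination of the components; replacing the softmax with
        # a hard argmax would recover adaptive-polyphase-style pooling.
        return (weights[:, :, None, None, None] * stacked).sum(dim=1)


if __name__ == "__main__":
    pool = LearnablePolyphasePool2d(channels=16)
    feat = torch.randn(2, 16, 32, 32)
    print(pool(feat).shape)  # torch.Size([2, 16, 16, 16])
```

Because the phase weighting is differentiable, a layer of this form can be dropped into a CNN in place of strided pooling and trained end-to-end, which is the integration pattern the abstract describes.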

Related Material


@InProceedings{Saha_2025_WACV,
    author    = {Saha, Sourajit and Gokhale, Tejas},
    title     = {Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {620-629}
}