Naturally Computed Scale Invariance in the Residual Stream of ResNet18

André Longon; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops, 2025, pp. 4804-4808

Abstract


An important capacity in visual object recognition is invariance to image-altering variables which leave the identity of objects unchanged, such as lighting, rotation, and scale. How do neural networks achieve this? Prior mechanistic interpretability research has illuminated some invariance-building circuitry in InceptionV1, but the results are limited and networks with different architectures have remained largely unexplored. This work investigates ResNet18 with a particular focus on its residual stream, an architectural component which InceptionV1 lacks. We observe that many convolutional channels in intermediate blocks exhibit scale invariant properties, computed by the element-wise residual summation of scale equivariant representations: the block input's smaller-scale copy with the block pre-sum output's larger-scale copy. Through subsequent ablation experiments, we attempt to causally link these neural properties with scale-robust object recognition behavior. Our tentative findings suggest how the residual stream computes scale invariance and its possible role in behavior. Code will be made publicly available.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Longon_2025_CVPR, author = {Longon, Andr\'e}, title = {Naturally Computed Scale Invariance in the Residual Stream of ResNet18}, booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops}, month = {June}, year = {2025}, pages = {4804-4808} }