Steerers: A Framework for Rotation Equivariant Keypoint Descriptors

Georg Bökman, Johan Edstedt, Michael Felsberg, Fredrik Kahl; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 4885-4895

Abstract


Image keypoint descriptions that are discriminative and matchable over large changes in viewpoint are vital for 3D reconstruction. However, descriptions output by learned descriptors are typically not robust to camera rotation. While they can be made more robust by, e.g., data augmentation, this degrades performance on upright images. Another approach is test-time augmentation, which incurs a significant increase in runtime. Instead, we learn a linear transform in description space that encodes rotations of the input image. We call this linear transform a steerer since it allows us to transform the descriptions as if the image was rotated. From representation theory we know all possible steerers for the rotation group. Steerers can be optimized (A) given a fixed descriptor, (B) jointly with a descriptor, or (C) we can optimize a descriptor given a fixed steerer. We perform experiments in these three settings and obtain state-of-the-art results on the rotation-invariant image matching benchmarks AIMS and Roto-360.
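As a toy illustration of the steerer idea (not the paper's implementation), the sketch below builds a block-diagonal linear map S for the group of 90-degree rotations and applies it to a hypothetical 4-dimensional description. The dimensions, block frequencies, and description values are all invented for illustration; the point is only that S acts on descriptions as image rotation would, and that S applied four times is the identity.

```python
import numpy as np

def rotation_block(angle):
    """2x2 rotation matrix, the basic building block of the toy steerer."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

# Hypothetical 4-dimensional description space: up to a change of basis,
# a steerer for 90-degree rotations decomposes into blocks that each
# rotate at some multiple (frequency) of 90 degrees.
S = np.zeros((4, 4))
S[:2, :2] = rotation_block(np.pi / 2)  # frequency-1 block
S[2:, 2:] = rotation_block(np.pi)      # frequency-2 block

# Applying S four times returns to the identity, as required for
# a representation of the four-element rotation group.
assert np.allclose(np.linalg.matrix_power(S, 4), np.eye(4))

# "Steering": transform a stored description instead of re-running
# the descriptor network on a rotated copy of the image.
desc = np.array([0.5, -1.0, 0.3, 0.8])  # invented keypoint description
desc_rot90 = S @ desc  # description as if the image were rotated 90 degrees
```

This is why steering avoids the runtime cost of test-time augmentation: a matrix multiply in description space replaces an extra forward pass per rotation.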

Related Material


[bibtex]
@InProceedings{Bokman_2024_CVPR,
    author    = {B\"okman, Georg and Edstedt, Johan and Felsberg, Michael and Kahl, Fredrik},
    title     = {Steerers: A Framework for Rotation Equivariant Keypoint Descriptors},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {4885-4895}
}