Learning to Distill Convolutional Features Into Compact Local Descriptors

Jongmin Lee, Yoonwoo Jeong, Seungwook Kim, Juhong Min, Minsu Cho; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 898-908

Abstract


Extracting local descriptors or features is an essential step in solving image matching problems. Recent methods in the literature mainly focus on extracting effective descriptors, without much attention to the size of the descriptors. In this work, we study how to learn a compact yet effective local descriptor. The proposed method distills multiple intermediate features of a pretrained convolutional neural network to encode different levels of visual information, from local textures to non-local semantics, resulting in local descriptors with a designated dimension. Experiments on standard benchmarks for semantic correspondence show that it achieves significantly improved performance over existing models, with descriptors up to 100 times smaller. Furthermore, while trained on a small dataset for semantic correspondence, the proposed method also generalizes well to other image matching tasks, performing comparably to the state of the art on wide-baseline matching and visual localization benchmarks.
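For the exact architecture, training objective, and distillation scheme, the paper itself should be consulted. As a rough, assumption-laden illustration of the idea stated above (not the authors' implementation), the PyTorch sketch below samples features from several intermediate layers of a pretrained ResNet-50 at given keypoint locations, concatenates them, and projects them with a learned linear layer into a descriptor of a designated dimension D. The backbone, layer choice, sampling scheme, and projection head here are hypothetical choices for illustration only.

# Minimal sketch: multi-level CNN features pooled at keypoints and projected
# to a compact D-dimensional descriptor. All design choices are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class CompactMultiLevelDescriptor(nn.Module):
    def __init__(self, out_dim=128):
        super().__init__()
        backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        # Stem plus the first three residual stages serve as feature extractors.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.layer1, self.layer2, self.layer3 = (backbone.layer1,
                                                 backbone.layer2,
                                                 backbone.layer3)
        # Learned linear projection from concatenated multi-level features to D dims.
        self.proj = nn.Linear(256 + 512 + 1024, out_dim)

    def forward(self, image, keypoints_xy):
        # image: (B, 3, H, W); keypoints_xy: (B, N, 2), normalized (x, y) in [-1, 1].
        x = self.stem(image)
        f1 = self.layer1(x)   # (B,  256, H/4,  W/4): texture-level features
        f2 = self.layer2(f1)  # (B,  512, H/8,  W/8)
        f3 = self.layer3(f2)  # (B, 1024, H/16, W/16): more semantic features
        grid = keypoints_xy.unsqueeze(2)  # (B, N, 1, 2) for grid_sample
        feats = [
            F.grid_sample(f, grid, align_corners=True).squeeze(-1).transpose(1, 2)
            for f in (f1, f2, f3)
        ]  # each: (B, N, C_l), features sampled at the keypoint locations
        desc = self.proj(torch.cat(feats, dim=-1))  # (B, N, D)
        return F.normalize(desc, dim=-1)            # unit-norm compact descriptors

# Example usage: 512x512 image, 100 random keypoints, 128-D descriptors.
model = CompactMultiLevelDescriptor(out_dim=128).eval()
img = torch.randn(1, 3, 512, 512)
kps = torch.rand(1, 100, 2) * 2 - 1   # normalized (x, y) in [-1, 1]
with torch.no_grad():
    descs = model(img, kps)           # shape: (1, 100, 128)

How the projection is trained (the distillation objective and the supervision drawn from semantic correspondence data) is described in the paper; the sketch only illustrates multi-level feature aggregation and dimensionality reduction at inference time.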

Related Material


@InProceedings{Lee_2021_WACV,
  author    = {Lee, Jongmin and Jeong, Yoonwoo and Kim, Seungwook and Min, Juhong and Cho, Minsu},
  title     = {Learning to Distill Convolutional Features Into Compact Local Descriptors},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2021},
  pages     = {898-908}
}