HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


Local feature extraction remains an active research area, driven by advances in fields such as SLAM, 3D reconstruction, and AR. Success in these applications relies on the performance of the feature detector, the descriptor, and the matching process. While most methods handle detector-descriptor interaction by unifying the two into a single network, we propose an alternative approach that treats both components independently and focuses on their interaction during the learning process. We formulate the classical hard-mining triplet loss as a new detector optimisation term to improve keypoint positions based on the descriptor map. Moreover, we introduce a dense descriptor that uses a multi-scale approach within the architecture and a hybrid combination of hand-crafted and learnt features to obtain rotation and scale robustness by design. We evaluate our method extensively on several benchmarks and show improvements over the state of the art in image matching and 3D reconstruction quality, while remaining on par in camera localisation tasks.
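
For readers unfamiliar with the classical hard-mining triplet loss that the abstract builds on, the following is a minimal sketch of its usual descriptor-side form. It is not the detector optimisation term proposed in the paper, and it is not taken from the authors' code. The function names, the margin value, and the choice of searching for negatives in both directions are assumptions made for illustration only.

```python
import math


def _normalise(v):
    # Scale a descriptor to unit L2 norm (assumes v is not the zero vector).
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def _dist(u, v):
    # Euclidean distance between two descriptors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def hard_mining_triplet_loss(anchors, positives, margin=1.0):
    """Hypothetical helper: hardest-in-batch triplet margin loss.

    anchors[i] and positives[i] are assumed to describe the same keypoint
    seen in two images; every other row is a candidate negative.
    Requires at least two descriptor pairs.
    """
    a = [_normalise(v) for v in anchors]
    p = [_normalise(v) for v in positives]
    n = len(a)
    total = 0.0
    for i in range(n):
        pos = _dist(a[i], p[i])
        # Hardest negative: the closest non-matching descriptor,
        # searched in both the anchor and the positive direction.
        candidates = [_dist(a[i], p[j]) for j in range(n) if j != i]
        candidates += [_dist(a[j], p[i]) for j in range(n) if j != i]
        neg = min(candidates)
        # Penalise pairs whose match is not closer than the hardest
        # negative by at least the margin.
        total += max(0.0, margin + pos - neg)
    return total / n


if __name__ == "__main__":
    descs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(hard_mining_triplet_loss(descs, descs, margin=1.0))
```

Mining only the hardest negative per pair concentrates the penalty on the most confusable descriptors, which is the property the abstract reuses when it turns the loss into a signal for keypoint positions.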

Related Material

[pdf] [code]
@InProceedings{Barroso-Laguna_2020_ACCV,
    author    = {Barroso-Laguna, Axel and Verdie, Yannick and Busam, Benjamin and Mikolajczyk, Krystian},
    title     = {HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}