@InProceedings{Liang_2025_ICCV,
  author    = {Liang, Yingping and Hu, Yutao and Shao, Wenqi and Fu, Ying},
  title     = {Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2025},
  pages     = {6621-6631}
}
Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space
Abstract
Feature matching plays a fundamental role in many computer vision tasks, yet existing methods rely on scarce and clean multi-view image collections, which constrains their generalization to diverse and challenging scenarios. Moreover, conventional feature encoders are typically trained on single-view 2D images, limiting their capacity to capture 3D-aware correspondences. In this paper, we propose a novel two-stage framework, named Lift to Match (L2M), that lifts 2D images into 3D space, taking full advantage of large-scale and diverse single-view images. Specifically, in the first stage, we learn a 3D-aware feature encoder using a combination of multi-view image synthesis and a 3D feature Gaussian representation, which injects 3D geometric knowledge into the encoder. In the second stage, a novel-view rendering strategy, combined with large-scale synthetic data generation from single-view images, is employed to learn a feature decoder for robust feature matching, achieving generalization across diverse domains. Extensive experiments demonstrate that our method achieves superior generalization on zero-shot evaluation benchmarks, highlighting the effectiveness of the proposed framework for robust feature matching.
Related Material
