@InProceedings{Lu_2025_ICCV,
    author    = {Lu, Xiaoyan and Weng, Qihao},
    title     = {Tree Mapping with Limited Data: Fine-Tuning Foundation Models for Multimodal Fusion},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {2829-2834}
}
Tree Mapping with Limited Data: Fine-Tuning Foundation Models for Multimodal Fusion
Abstract
Tree mapping plays a crucial role in remote sensing for ecological monitoring and resource management. Accurate tree mapping relies on labeled training data; however, annotating geometrically complex trees in large-scale remote sensing imagery is time-consuming and labor-intensive. We propose a multimodal fusion framework that leverages fine-tuned foundation models to enable data-efficient tree mapping. To enhance spatial understanding and structural perception, we introduce depth information as an auxiliary modality alongside high-resolution RGB remote sensing imagery. By leveraging the complementary strengths of depth and visual data, our method mitigates the limitations of unimodal inputs. Experimental results demonstrate that integrating depth information significantly improves recognition accuracy and boundary delineation, particularly when training samples are scarce. This work highlights the potential of depth-aware multimodal learning to boost performance in data-constrained scenarios, offering a promising direction for scalable and cost-efficient environmental monitoring.
