Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow

Zhou, Hanyu; Chang, Yi; Shi, Zhiwei

Hanyu Zhou, Yi Chang, Zhiwei Shi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 26477-26486

Abstract

Single RGB or LiDAR is the mainstream sensor for the challenging scene flow which relies heavily on visual features to match motion features. Compared with single modality existing methods adopt a fusion strategy to directly fuse the cross-modal complementary knowledge in motion space. However these direct fusion methods may suffer the modality gap due to the visual intrinsic heterogeneous nature between RGB and LiDAR thus deteriorating motion features. We discover that event has the homogeneous nature with RGB and LiDAR in both visual and motion spaces. In this work we bring the event as a bridge between RGB and LiDAR and propose a novel hierarchical visual-motion fusion framework for scene flow which explores a homogeneous space to fuse the cross-modal complementary knowledge for physical interpretation. In visual fusion we discover that event has a complementarity (relative v.s. absolute) in luminance space with RGB for high dynamic imaging and has a complementarity (local boundary v.s. global shape) in scene structure space with LiDAR for structure integrity. In motion fusion we figure out that RGB event and LiDAR are complementary (spatial-dense temporal-dense v.s. spatiotemporal-sparse) to each other in correlation space which motivates us to fuse their motion correlations for motion continuity. The proposed hierarchical fusion can explicitly fuse the multimodal knowledge to progressively improve scene flow from visual space to motion space. Extensive experiments have been performed to verify the superiority of the proposed method.

Related Material

[pdf] [supp] [arXiv]

[bibtex]

@InProceedings{Zhou_2024_CVPR, author = {Zhou, Hanyu and Chang, Yi and Shi, Zhiwei}, title = {Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {26477-26486} }