Revisiting the Receptive Field of Conv-GRU in DROID-SLAM

Antyanta Bangunharcana, Soohyun Kim, Kyung-Soo Kim; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 1906-1916

Abstract


This work focuses on improving the Conv-GRU-based optical flow update within the DROID-SLAM framework. Prior optical flow models typically follow a UNet or coarse-to-fine architecture in order to extract long-range cross-correlation and context cues, which helps flow estimation in the presence of large motion and challenging image regions, e.g., textureless regions. We propose modifications to the Conv-GRU module that follow the rationale of these prior models by integrating (Atrous) Spatial Pyramid Pooling and global self-attention into the Conv-GRU block. By enlarging the receptive field through these modifications, the model can integrate information from a larger context window, improving robustness even when the inputs contain challenging image regions. We show empirically, through extensive experiments, the accuracy gains from these modifications.
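As a rough illustration of why atrous (dilated) convolutions enlarge the receptive field, the arithmetic can be sketched as follows. This is only a minimal sketch: the helper names and the dilation rates (1, 2, 4, 8) are assumptions chosen for illustration and are not taken from the paper. It relies on the standard relation that a convolution with kernel size k and dilation d spans d(k - 1) + 1 positions.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated (atrous) convolution."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied in sequence.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        rf += effective_kernel(kernel_size, dilation) - 1
    return rf


def parallel_receptive_field(branches):
    """Receptive field of parallel branches whose outputs are fused.

    The fused output sees as far as the widest branch.
    """
    return max(effective_kernel(k, d) for k, d in branches)


# Hypothetical dilation rates, chosen only for illustration.
plain = parallel_receptive_field([(3, 1)])
pyramid = parallel_receptive_field([(3, 1), (3, 2), (3, 4), (3, 8)])
print(plain, pyramid)  # prints "3 17"
```

Under these assumed rates, a single 3x3 convolution sees a 3-pixel-wide window, while a pyramid of parallel dilated 3x3 branches sees 17 pixels at the same parameter count per branch. A global self-attention layer goes further, since every position can attend to every other position in the feature map.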

Related Material


[bibtex]
@InProceedings{Bangunharcana_2022_CVPR,
    author    = {Bangunharcana, Antyanta and Kim, Soohyun and Kim, Kyung-Soo},
    title     = {Revisiting the Receptive Field of Conv-GRU in DROID-SLAM},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {1906-1916}
}