Faster Self-adaptive Deep Stereo

Haiyang Wang, Xinchao Wang, Jie Song, Jie Lei, Mingli Song; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


Fueled by the power of deep learning, stereo vision has made unprecedented advances in recent years. Existing deep stereo models, however, can hardly be deployed in real-world scenarios where data arrives on the fly without any ground-truth information and the data distribution changes continuously over time. Recently, Tonioni et al. [??] proposed the first real-time self-adaptive deep stereo system (MADNet) to address this problem; it nevertheless still runs at a relatively low speed, and its accuracy leaves room for improvement. In this paper, we significantly improve on their work in both speed and accuracy by incorporating two key components. First, instead of adopting only the image reconstruction loss as the proxy supervision, we propose a second, more powerful supervision, termed Knowledge Reverse Distillation (KRD), to guide the learning of deep stereo models. Second, we introduce a straightforward yet surprisingly effective Adapt-or-Hold (AoH) mechanism that automatically determines whether or not to fine-tune the stereo model in the online environment. Both components are lightweight and can be integrated into MADNet with only a few lines of code. Experiments demonstrate that the two proposed components improve the system by a large margin in both speed and accuracy. Our final system is twice as fast as MADNet while attaining considerably superior performance on the popular KITTI benchmark datasets.
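As a rough illustration of two ideas named in the abstract, the sketch below computes an image reconstruction loss on a single scanline and uses it to gate online fine-tuning. It is not the authors' implementation. The abstract does not state how AoH decides when to adapt, so the threshold rule in `should_adapt` is purely an assumption, as are the function names, the 1-D simplification, and the L1 error. KRD is not sketched because the abstract gives no detail on it.

```python
from typing import Sequence


def reconstruction_loss(left: Sequence[float],
                        right: Sequence[float],
                        disparity: Sequence[float]) -> float:
    """Mean absolute error between a left scanline and the right scanline
    warped into the left view (left pixel x is assumed to match right
    pixel x - d). Pixels that map outside the right scanline are skipped."""
    total, count = 0.0, 0
    width = len(right)
    for x, d in enumerate(disparity):
        src = x - d
        if src < 0 or src > width - 1:
            continue  # no valid correspondence for this pixel
        lo = int(src)
        hi = min(lo + 1, width - 1)
        frac = src - lo
        # Linear interpolation between the two nearest right-image pixels.
        warped = (1.0 - frac) * right[lo] + frac * right[hi]
        total += abs(left[x] - warped)
        count += 1
    return total / count if count else 0.0


def should_adapt(loss: float, threshold: float) -> bool:
    """Hypothetical Adapt-or-Hold gate: fine-tune only when the proxy loss
    exceeds a threshold. The actual criterion in the paper may differ."""
    return loss > threshold


# Toy scanlines: the left view equals the right view shifted by 2 pixels.
left = [0.0, 0.0, 10.0, 20.0, 30.0, 40.0]
right = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

good_loss = reconstruction_loss(left, right, [2.0] * 6)  # correct disparity
bad_loss = reconstruction_loss(left, right, [0.0] * 6)   # wrong disparity
```

With the correct disparity the warped scanline matches the left view, so the gate would hold; with the wrong disparity the loss rises and the gate would trigger adaptation. Skipping the update step on frames where the model already fits is one plausible way such a gate could save time, though the abstract does not say this is how the reported speed-up is obtained.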

Related Material

@InProceedings{Wang_2020_ACCV,
    author    = {Wang, Haiyang and Wang, Xinchao and Song, Jie and Lei, Jie and Song, Mingli},
    title     = {Faster Self-adaptive Deep Stereo},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}