[pdf]
[supp]
[arXiv]
[bibtex]
@InProceedings{Liu_2024_ACCV,
  author    = {Liu, Yu and Mahmood, Arif and Khan, Muhammad Haris},
  title     = {Depth Attention for Robust RGB Tracking},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {213-231}
}
Depth Attention for Robust RGB Tracking
Abstract
RGB video object tracking (VOT) is a vital task in computer vision, yet its effectiveness is often limited by the lack of depth information, which is crucial for handling very challenging scenarios, e.g., occlusions. In this work, we unveil a new framework that leverages monocular depth estimation to counter occlusions and motion blur in RGB video tracking. Specifically, our work introduces the following contributions. (a) To our knowledge, we are the first to propose a Gaussian attention mechanism and provide a simple framework that allows seamless integration of depth insights with cutting-edge tracking algorithms, without requiring RGB-D cameras, elevating accuracy and robustness. (b) We provide rigorous mathematical proofs to reveal the benefits of our method and offer a Fourier analysis for additional insight. (c) We conduct extensive experiments on six challenging tracking benchmarks. Results demonstrate that our method provides consistent gains over several strong baselines. We believe that our method will open up new possibilities for more sophisticated VOT solutions in real-world scenarios. Our code and models will be publicly released.
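The abstract describes weighting RGB tracking features with a Gaussian attention derived from monocular depth estimates. The following is a minimal illustrative sketch of that general idea, not the paper's actual implementation: the function name, the per-pixel Gaussian weighting around an assumed target depth, and the `sigma` bandwidth parameter are all assumptions for illustration.

```python
import numpy as np

def gaussian_depth_attention(features, depth, target_depth, sigma=0.5):
    """Modulate an RGB feature map by a Gaussian over depth similarity.

    Hypothetical sketch (not the paper's code):
      features:     (H, W, C) feature map from an RGB tracker backbone
      depth:        (H, W) monocular depth estimate for the frame
      target_depth: scalar, estimated depth of the tracked target
      sigma:        bandwidth controlling how sharply attention falls off
    Pixels whose estimated depth is close to the target's keep their
    features; pixels at other depths (e.g., occluders) are suppressed.
    """
    weights = np.exp(-((depth - target_depth) ** 2) / (2.0 * sigma ** 2))
    return features * weights[..., None]

# Toy usage: a 2x2 frame where one pixel lies at a different depth.
feat = np.ones((2, 2, 3))
depth = np.array([[0.5, 1.5],
                  [0.5, 0.5]])
out = gaussian_depth_attention(feat, depth, target_depth=0.5, sigma=0.5)
# Pixels at the target depth keep full weight; the off-depth pixel is damped.
```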
Related Material