TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration

Olszewski, Jan; Rymarczyk, Dawid Damian; Wojcik, Piotr; Pach, Mateusz; Zielinski, Bartosz

Jan Olszewski, Dawid Damian Rymarczyk, Piotr Wojcik, Mateusz Pach, Bartosz Zielinski; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 8595-8605

Abstract

Active Visual Exploration (AVE) optimizes the utilization of robotic resources in real-world scenarios by sequentially selecting the most informative observations. However, modern methods require a high computational budget because they process the same observations multiple times through the autoencoder transformers. As a remedy, we introduce a novel approach to AVE called TOken REcycling (TORE). It divides the encoder into extractor and aggregator components. The extractor processes each observation separately, enabling the reuse of tokens passed to the aggregator. Moreover, to further reduce computation, we shrink the decoder to a single block. Through extensive experiments, we demonstrate that TORE outperforms state-of-the-art methods while reducing computational overhead by up to 90%.
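The extractor/aggregator split described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: `TokenRecycler`, `toy_extractor`, `toy_aggregator`, and both `explore_*` functions are hypothetical names, and the toy functions are plain-Python stand-ins for transformer blocks. The sketch only shows the bookkeeping idea, namely that tokens from earlier observations are cached and reused, so the extractor runs once per observation instead of once per observation per step.

```python
from typing import Callable, List, Sequence, Tuple

Token = List[float]


class TokenRecycler:
    """Hypothetical helper: caches per-observation tokens for reuse."""

    def __init__(
        self,
        extractor: Callable[[Sequence[float]], List[Token]],
        aggregator: Callable[[List[Token]], Token],
    ) -> None:
        self.extractor = extractor
        self.aggregator = aggregator
        self.cache: List[Token] = []  # tokens from all past observations
        self.extractor_calls = 0

    def observe(self, observation: Sequence[float]) -> Token:
        # Only the NEW observation goes through the extractor.
        self.cache.extend(self.extractor(observation))
        self.extractor_calls += 1
        # The aggregator sees every token collected so far.
        return self.aggregator(self.cache)


def toy_extractor(observation: Sequence[float]) -> List[Token]:
    # Stand-in for the extractor: one [mean, max] token per observation.
    return [[sum(observation) / len(observation), max(observation)]]


def toy_aggregator(tokens: List[Token]) -> Token:
    # Stand-in for the aggregator: element-wise mean over all tokens.
    n = len(tokens)
    return [sum(t[i] for t in tokens) / n for i in range(len(tokens[0]))]


def explore_with_recycling(glimpses: List[Sequence[float]]) -> Tuple[Token, int]:
    recycler = TokenRecycler(toy_extractor, toy_aggregator)
    summary: Token = []
    for glimpse in glimpses:
        summary = recycler.observe(glimpse)
    return summary, recycler.extractor_calls


def explore_without_recycling(glimpses: List[Sequence[float]]) -> Tuple[Token, int]:
    # Baseline: every step re-extracts all observations seen so far.
    calls = 0
    summary: Token = []
    for step in range(1, len(glimpses) + 1):
        tokens: List[Token] = []
        for glimpse in glimpses[:step]:
            tokens.extend(toy_extractor(glimpse))
            calls += 1
        summary = toy_aggregator(tokens)
    return summary, calls


if __name__ == "__main__":
    glimpses = [[1.0, 3.0], [2.0, 4.0], [0.0, 2.0]]
    print(explore_with_recycling(glimpses))
    print(explore_without_recycling(glimpses))
```

In this toy setting both loops produce the same final summary, but the recycling loop calls the extractor once per glimpse while the baseline call count grows quadratically with the number of steps. The actual savings reported in the abstract depend on the real architecture and are not reproduced by this sketch.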

Related Material

@InProceedings{Olszewski_2025_WACV,
    author    = {Olszewski, Jan and Rymarczyk, Dawid Damian and Wojcik, Piotr and Pach, Mateusz and Zielinski, Bartosz},
    title     = {TORE: Token Recycling in Vision Transformers for Efficient Active Visual Exploration},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {8595--8605}
}