Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems

Luca Bompani, Manuele Rusci, Daniele Palossi, Francesco Conti, Luca Benini; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 2182-2190

Abstract


This paper introduces Multi-Resolution Rescored ByteTrack (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors. The method reduces the average compute load of an off-the-shelf Deep Neural Network (DNN) based object detector by up to 2.25x by alternating the processing of high-resolution images (320x320 pixels) with multiple down-sized frames (192x192 pixels). To tackle the accuracy degradation caused by the reduced input size, MR2-ByteTrack correlates the output detections over time using the ByteTrack tracker and corrects potential misclassifications using a novel probabilistic Rescore algorithm. By interleaving two down-sized images for every high-resolution one as the input to different state-of-the-art DNN object detectors, we demonstrate with MR2-ByteTrack an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller compared to a baseline frame-by-frame inference scheme that uses exclusively full-resolution images. Code available at: https://github.com/Bomps4/Multi_Resolution_Rescored_ByteTrack
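
To make the interleaving scheme concrete, the following is a minimal sketch and not the authors' implementation. It rests on two assumptions that the abstract does not state: that detector cost scales linearly with the number of input pixels, and that each cycle starts with the high-resolution frame. The function names `relative_cost` and `resolution_schedule` are hypothetical. Under these assumptions, two 192x192 frames per 320x320 frame give an average per-frame cost of 43/75 of the full-resolution baseline, a reduction of roughly 43%, which is in line with the latency reduction reported on GAP9.

```python
from fractions import Fraction


def relative_cost(high_res: int, low_res: int, low_per_high: int) -> Fraction:
    """Average per-frame cost relative to an all-high-resolution baseline.

    Assumption (not from the paper): cost is proportional to pixel count.
    """
    # Cost of one low-resolution frame, in units of one high-resolution frame.
    low_cost = Fraction(low_res * low_res, high_res * high_res)
    # One high-resolution frame plus `low_per_high` low-resolution frames per cycle.
    return (1 + low_per_high * low_cost) / (1 + low_per_high)


def resolution_schedule(num_frames: int, high_res: int, low_res: int,
                        low_per_high: int) -> list[int]:
    """Input side length for each frame; high-resolution first is an assumption."""
    period = low_per_high + 1
    return [high_res if i % period == 0 else low_res for i in range(num_frames)]


ratio = relative_cost(320, 192, 2)
print(f"relative cost: {float(ratio):.3f}, reduction: {float(1 - ratio):.1%}")
print(resolution_schedule(6, 320, 192, 2))
```

Measured latency on real hardware also depends on memory transfers and fixed per-inference overheads, so the pixel-count model should be read only as a first-order estimate.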

Related Material


[bibtex]
@InProceedings{Bompani_2024_CVPR,
    author    = {Bompani, Luca and Rusci, Manuele and Palossi, Daniele and Conti, Francesco and Benini, Luca},
    title     = {Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {2182-2190}
}