Interpolation-Based Object Detection Using Motion Vectors for Embedded Real-Time Tracking Systems

Takayuki Ujiie, Masayuki Hiromoto, Takashi Sato; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018, pp. 616-624

Abstract


Deep convolutional neural networks (CNNs) have achieved outstanding performance in object detection, a crucial task in computer vision. Because repeated convolutions are computationally intensive, CNNs consume a large amount of power, which makes them difficult to deploy on power-constrained embedded platforms. In this work, we present MVint, a power-efficient detection and tracking framework. MVint combines a motion-vector-based interpolator with a CNN-based detector to achieve both high accuracy and energy efficiency. It exploits motion vectors, which are available inexpensively in environments where video encoding is performed at the camera. Through evaluations on the MOT16 multiple-object-tracking benchmark, we show that MVint maintains 88% MOTA while reducing the detection frequency to 1/12. An implementation of MVint as a system prototype on a Xilinx Zynq UltraScale+ MPSoC ZCU102 confirmed that MVint achieves the ideal 12x FPS compared with a vanilla detection approach.
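The idea of alternating a CNN detector with a motion-vector interpolator can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`shift_box`, `track`), the box format `(x1, y1, x2, y2)`, the motion-vector format `(cx, cy, dx, dy)`, and the rule of shifting each box by the mean vector of the blocks inside it are all assumptions made for illustration. The vectors are also assumed to be already converted to forward displacements (previous frame to current frame), and `detect` stands in for any CNN detector.

```python
from statistics import mean


def shift_box(box, motion_vectors):
    """Shift a box by the mean motion vector of the blocks centered inside it.

    box: (x1, y1, x2, y2)
    motion_vectors: list of (cx, cy, dx, dy), where (cx, cy) is a block
    center and (dx, dy) is its assumed forward displacement in pixels.
    """
    x1, y1, x2, y2 = box
    inside = [
        (dx, dy)
        for cx, cy, dx, dy in motion_vectors
        if x1 <= cx <= x2 and y1 <= cy <= y2
    ]
    if not inside:
        # No motion information for this box: keep it where it is.
        return box
    mdx = mean(d[0] for d in inside)
    mdy = mean(d[1] for d in inside)
    return (x1 + mdx, y1 + mdy, x2 + mdx, y2 + mdy)


def track(frames, motion_vectors_per_frame, detect, interval=12):
    """Run the detector every `interval` frames and interpolate in between.

    detect: hypothetical callable mapping a frame to a list of boxes.
    motion_vectors_per_frame[i]: vectors describing motion into frame i.
    Returns one list of boxes per frame.
    """
    boxes = []
    results = []
    for i, frame in enumerate(frames):
        if i % interval == 0:
            # Key frame: pay the full cost of the CNN detector.
            boxes = detect(frame)
        else:
            # Intermediate frame: move the previous boxes using motion vectors.
            boxes = [shift_box(b, motion_vectors_per_frame[i]) for b in boxes]
        results.append(list(boxes))
    return results
```

With `interval=12` the detector runs on one frame in twelve, which corresponds to the 1/12 detection frequency mentioned above; all other frames reuse the motion vectors the encoder has already computed.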

Related Material


[bibtex]
@InProceedings{Ujiie_2018_CVPR_Workshops,
author = {Ujiie, Takayuki and Hiromoto, Masayuki and Sato, Takashi},
title = {Interpolation-Based Object Detection Using Motion Vectors for Embedded Real-Time Tracking Systems},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}