E2VTS: Energy-Efficient Video Text Spotting From Unmanned Aerial Vehicles

Zhenyu Hu, Pengcheng Pi, Zhenyu Wu, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 905-913

Abstract


Video text spotting from Unmanned Aerial Vehicles (UAVs) has been used extensively in civil and military domains. The limited battery capacity of UAVs motivates us to develop an energy-efficient video text spotting solution. In this paper, we first revisit RCNN's crop & resize training strategy and empirically find that it outperforms aligned RoI sampling on a real-world video text dataset captured by a UAV. To reduce energy consumption, we further propose a multi-stage image processor that takes videos' redundancy, continuity, and mixed degradation into account. The model is pruned and quantized before being deployed on a Raspberry Pi. Our proposed energy-efficient video text spotting solution, dubbed E^2VTS, outperforms all previous methods by achieving a competitive tradeoff between energy efficiency and performance. All our code and pre-trained models are available at https://github.com/wuzhenyusjtu/LPCVC20-VideoTextSpotting.

Related Material


[bibtex]
@InProceedings{Hu_2021_CVPR,
    author    = {Hu, Zhenyu and Pi, Pengcheng and Wu, Zhenyu and Xue, Yunhe and Shen, Jiayi and Tan, Jianchao and Lian, Xiangru and Wang, Zhangyang and Liu, Ji},
    title     = {E2VTS: Energy-Efficient Video Text Spotting From Unmanned Aerial Vehicles},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {905-913}
}