QUICKSAL: A small and sparse visual saliency model for efficient inference in resource constrained hardware

Vignesh Ramanathan, Pritesh Dwivedi, Bharath Katabathuni, Anirban Chakraborty, Chetan Singh Thakur; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 1678-1688

Abstract


Visual saliency is an important problem in cognitive science and computer vision, with applications such as surveillance, adaptive compression, detection of unknown objects, and scene understanding. In this paper, we propose a small and sparse neural network model for salient object segmentation that is suitable for mobile and embedded applications. Our model is built from depthwise separable convolutions and inverted residual bottlenecks, which have been shown to support memory-efficient inference and can be easily implemented using standard functions available in all deep learning frameworks. The extracted multiscale features, together with the deep residual layers, allow our network to learn high-quality saliency maps. We present quantitative results for our QUICKSAL model on publicly available benchmark datasets at multiple levels of model sparsity, ranging from 0% to 96%, with the non-zero parameter count varying from 3.3M to 0.14M respectively. These results show that our highly constrained approach is comparable to other state-of-the-art approaches (parameter count 35M). We also present qualitative results on camouflage images and show that our model can distinguish between the salient and non-salient parts even when the two appear blended together.
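
To illustrate why depthwise separable convolutions and sparsity shrink a model, the sketch below counts weights for a single layer and for a pruned network. It is a back-of-the-envelope illustration, not the QUICKSAL architecture: the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) and all function names are assumptions chosen for the example, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def nonzero_after_pruning(total_params, sparsity):
    """Non-zero weights left when a fraction `sparsity` of all weights is zeroed."""
    return round(total_params * (1.0 - sparsity))


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(std / sep, 2))  # 73728 8768 8.41

# Uniform 96% sparsity applied to the 3.3M dense parameter count from the abstract.
print(nonzero_after_pruning(3_300_000, 0.96))  # 132000
```

Under this uniform-sparsity assumption, 96% sparsity leaves about 0.13M non-zero weights, close to the 0.14M quoted in the abstract. The abstract does not say how sparsity is distributed across layers, so the small difference should not be over-interpreted.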

Related Material


[bibtex]
@InProceedings{Ramanathan_2020_WACV,
author = {Ramanathan, Vignesh and Dwivedi, Pritesh and Katabathuni, Bharath and Chakraborty, Anirban and Thakur, Chetan Singh},
title = {QUICKSAL: A small and sparse visual saliency model for efficient inference in resource constrained hardware},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {March},
year = {2020},
pages = {1678-1688}
}