NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models

Simin Chen, Zihe Song, Mirazul Haque, Cong Liu, Wei Yang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 15365-15374

Abstract


Neural image caption generation (NICG) models have received massive attention from the research community due to their excellent performance in visual understanding. Existing work focuses on improving NICG model accuracy while efficiency is less explored. However, many real-world applications require real-time feedback, which highly relies on the efficiency of NICG models. Recent research observed that the efficiency of NICG models could vary for different inputs. This observation brings in a new attack surface of NICG models, i.e., An adversary might be able to slightly change inputs to cause the NICG models to consume more computational resources. To further understand such efficiency-oriented threats, we propose a new attack approach, NICGSlowDown, to evaluate the efficiency robustness of NICG models. Our experimental results show that NICGSlowDown can generate images with human-unnoticeable perturbations that will increase the NICG model latency up to 483.86%. We hope this research could raise the community's concern about the efficiency robustness of NICG models.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Chen_2022_CVPR, author = {Chen, Simin and Song, Zihe and Haque, Mirazul and Liu, Cong and Yang, Wei}, title = {NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2022}, pages = {15365-15374} }