IR Reasoner: Real-Time Infrared Object Detection by Visual Reasoning

Gündoğan, Meryem Mine; Aksoy, Tolga; Temizel, Alptekin; Halici, Ugur

Meryem Mine Gündoğan, Tolga Aksoy, Alptekin Temizel, Ugur Halici; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 422-430

Abstract

Thermal Infrared (IR) imagery is utilized in several applications due to its unique properties. However, there are a number of challenges, such as small target objects, image noise, lack of textural information, and background clutter, that negatively affect the detection of objects in IR images. Current real-time object detection methods treat each image region separately and, in the face of these challenges, this sole dependency on feature maps extracted by convolutional layers is not ideal. In this paper, we introduce a new architecture for real-time object detection in IR images that reasons about the relations between image regions using self-attention. The proposed method, IR Reasoner, takes the spatial and semantic coherency between image regions into account to enhance the feature maps. We integrated this approach into the current state-of-the-art one-stage object detectors YOLOv4, YOLOR, and YOLOv7, and trained them from scratch on the FLIR ADAS dataset. Experimental evaluations show that the Reasoner variants perform better than the baseline models while still running in real time. Our best performing Reasoner model, YOLOv7-W6-Reasoner, achieves 40.5% AP at 32.7 FPS. The code is publicly available.
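
To illustrate the general idea of relating image regions through self-attention, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a tiny hand-written feature map, uses identity query/key/value projections instead of learned ones, and omits multi-head attention, positional encoding, and the YOLO detection heads. The names `flatten_grid` and `self_attention` are hypothetical and chosen only for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_grid(feature_map):
    """Turn an H x W x C nested list into a list of H*W region vectors."""
    return [cell for row in feature_map for cell in row]


def self_attention(regions):
    """Scaled dot-product self-attention over region vectors.

    Assumption: queries, keys, and values are the region features
    themselves (identity projections); a trained model would learn them.
    A residual connection adds the attended context to each region.
    """
    d = len(regions[0])
    scale = math.sqrt(d)
    enhanced = []
    for q in regions:
        # Similarity of this region to every region (including itself).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in regions]
        weights = softmax(scores)
        # Context vector: attention-weighted average of all region features.
        context = [sum(w * v[j] for w, v in zip(weights, regions)) for j in range(d)]
        # Residual: original feature plus context gathered from other regions.
        enhanced.append([qi + ci for qi, ci in zip(q, context)])
    return enhanced


# Toy 2 x 2 feature map with 2 channels per region (made-up values).
feature_map = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.0, 1.0], [0.1, 0.9]],
]
regions = flatten_grid(feature_map)
enhanced = self_attention(regions)
```

In this sketch each output vector mixes its own feature with those of similar regions, so a region is no longer described in isolation from the rest of the image.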

Related Material

[bibtex]

@InProceedings{Gundogan_2023_CVPR,
    author    = {G\"undo\u{g}an, Meryem Mine and Aksoy, Tolga and Temizel, Alptekin and Halici, Ugur},
    title     = {IR Reasoner: Real-Time Infrared Object Detection by Visual Reasoning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2023},
    pages     = {422-430}
}