Cross Modality Knowledge Distillation for Multi-Modal Aerial View Object Classification
In bad weather or low-light conditions, a single sensor may not capture enough information for object identification. Compared with traditional optical imagery, synthetic aperture radar (SAR) imaging has clear advantages, such as the ability to penetrate fog and smoke. However, SAR images have low resolution and are contaminated by high levels of speckle noise, which makes it difficult to extract powerful and robust features from them. In this paper, we explore whether multiple imaging modalities can improve object classification performance. We propose a Cross Modality Knowledge Distillation (CMKD) paradigm and explore two network structures, named CMKD-s and CMKD-m, for the object classification task. Specifically, CMKD-s transfers the information captured by the two sensors via online knowledge distillation, which enables cross-modal knowledge sharing and enhances the robustness of the aerial view object classification model. Moreover, leveraging semi-supervised enhanced training, we propose a novel method named CMKD-m, which strengthens mutual knowledge transfer between the models. Quantitative comparison on the NTIRE2021 SAR-EO challenge dataset shows that both CMKD-s and CMKD-m outperform the baseline without knowledge transfer.
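The core of the online distillation described above can be sketched as a mutual, temperature-softened KL-divergence loss in which each modality branch acts as both teacher and student. The sketch below is a minimal illustration under common knowledge-distillation assumptions (softened softmax, KL divergence scaled by T^2); the function names and temperature value are hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    (the usual gradient-magnitude correction in distillation)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))) / p.shape[0]
    return float(kl) * T * T

# Online mutual distillation: each branch is teacher for the other,
# so knowledge flows in both directions within a single training step.
sar_logits = np.array([[2.0, 0.5, -1.0]])  # illustrative SAR-branch outputs
eo_logits  = np.array([[1.5, 1.0, -0.5]])  # illustrative EO-branch outputs
loss_sar = kd_loss(sar_logits, eo_logits)  # EO teaches SAR
loss_eo  = kd_loss(eo_logits, sar_logits)  # SAR teaches EO
```

In practice each of these distillation terms would be added to the corresponding branch's supervised cross-entropy loss; the sketch only shows the cross-modal transfer term.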