Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head

Feng Ni, Yuehan Yao; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 0-0

Abstract


As a concise and classic framework for object detection and instance segmentation, Mask R-CNN achieves promising performance on both tasks. However, there is room to strengthen the feature representation of Mask R-CNN-style frameworks in two respects. On the one hand, multi-task prediction needs more reliable feature extraction and multi-scale feature integration to handle objects of varied scales. In this paper, we address this problem with a novel neck module called SA-FPN (Scale Aware Feature Pyramid Networks). With the enhanced feature representations, our model can accurately detect and segment objects at multiple scales. On the other hand, in the Mask R-CNN framework the parallel detection branch and instance segmentation branch are isolated from each other, which causes a gap between the training and testing processes. To narrow this gap, we propose a unified head module named EJ-Head (Effective Joint Head) that combines the two branches into one head, which both enables interaction between the two tasks and makes multi-task learning more effective. Comprehensive experiments show that our proposed methods bring noticeable gains for object detection and instance segmentation. In particular, our model outperforms the original Mask R-CNN by 1-2 percent AP on both the object detection and instance segmentation tasks on the MS-COCO benchmark. Code will be available soon.
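The abstract does not describe the internals of SA-FPN, so the following is only a minimal sketch of the generic top-down fusion used in standard feature pyramid networks, which neck modules of this kind build on. It is not the authors' SA-FPN. Feature maps are modeled as plain 2D lists, and every name (`upsample2x`, `add_maps`, `top_down_fuse`) is hypothetical and chosen for illustration.

```python
# Generic FPN-style top-down fusion; NOT the paper's SA-FPN (assumption).
# Feature maps are plain 2D lists of floats; all names are hypothetical.

def upsample2x(fmap):
    """Nearest-neighbor upsampling by a factor of 2 in both dimensions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    assert len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def top_down_fuse(levels):
    """Fuse pyramid levels ordered finest to coarsest.

    Each level is assumed to have half the resolution of the previous one.
    The coarsest level is passed through unchanged; every finer level is
    summed with the upsampled fused map of the level above it.
    """
    fused = [levels[-1]]
    for fmap in reversed(levels[:-1]):
        fused.append(add_maps(fmap, upsample2x(fused[-1])))
    fused.reverse()  # return in the same finest-to-coarsest order
    return fused

# Toy three-level pyramid: 4x4, 2x2, and 1x1 maps.
fine = [[1.0] * 4 for _ in range(4)]
mid = [[10.0] * 2 for _ in range(2)]
coarse = [[100.0]]
pyramid = top_down_fuse([fine, mid, coarse])
```

In this toy example, coarse-level information propagates down to the finest map, which is the basic mechanism by which a pyramid neck gives small-object features access to higher-level context.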

Related Material


[bibtex]
@InProceedings{Ni_2019_ICCV,
author = {Ni, Feng and Yao, Yuehan},
title = {Multi-Task Learning via Scale Aware Feature Pyramid Networks and Effective Joint Head},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2019}
}