Transductive Learning for Zero-Shot Object Detection

Shafin Rahman, Salman Khan, Nick Barnes; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 6082-6091


Zero-shot object detection (ZSD) is a relatively unexplored research problem as compared to the conventional zero-shot recognition task. ZSD aims to detect previously unseen objects during inference. Existing ZSD works suffer from two critical issues: (a) large domain-shift between the source (seen) and target (unseen) domains since the two distributions are highly mismatched. (b) the learned model is biased against unseen classes, therefore in generalized ZSD settings, where both seen and unseen objects co-occur during inference, the learned model tends to misclassify unseen to seen categories. This brings up an important question: How effectively can a transductive setting address the aforementioned problems? To the best of our knowledge, we are the first to propose a transductive zero-shot object detection approach that convincingly reduces the domain-shift and model-bias against unseen classes. Our approach is based on a self-learning mechanism that uses a novel hybrid pseudo-labeling technique. It progressively updates learned model parameters by associating unlabeled data samples to their corresponding classes. During this process, our technique makes sure that knowledge that was previously acquired on the source domain is not forgotten. We report significant 'relative' improvements of 34.9% and 77.1% in terms of mAP and recall rates over the previous best inductive models on MSCOCO dataset.

Related Material

[pdf] [supp] [video]
author = {Rahman, Shafin and Khan, Salman and Barnes, Nick},
title = {Transductive Learning for Zero-Shot Object Detection},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}