SOAR: Scene-debiasing Open-set Action Recognition
Deep models risk relying on spurious cues to make predictions, e.g., recognizing actions by classifying the background scene. This problem severely degrades open-set action recognition performance when the testing samples exhibit scene distributions different from those of the training samples. To mitigate this scene bias, we propose a Scene-debiasing Open-set Action Recognition method (SOAR), which features an adversarial reconstruction module and an adaptive adversarial scene classification module. The former prevents a decoder from reconstructing the video background from the video features, and thus helps reduce the background information in feature learning. The latter aims to confuse scene type classification given the video features, and thus helps learn scene-invariant information. In addition, we design an experiment to quantify the scene bias. The results suggest that current open-set action recognizers are biased toward the scene, and that SOAR better mitigates this bias. Furthermore, extensive experiments show that our method outperforms state-of-the-art methods, with ablation studies demonstrating the effectiveness of the proposed modules.
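To make the idea of confusing scene classification concrete, the following is a minimal sketch of one common way to set up such an adversarial objective. It is not taken from the paper: the function names (`cross_entropy`, `confusion_loss`), the three-class scene logits, and the choice of negative entropy as the confusion term are all illustrative assumptions. A hypothetical scene classifier head minimizes cross-entropy against the scene label, while the feature extractor is pushed toward features that leave that classifier maximally uncertain.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Loss a (hypothetical) scene classifier head would minimize."""
    return -math.log(softmax(logits)[label])


def entropy(logits):
    """Shannon entropy of the predicted scene distribution."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confusion_loss(logits):
    """Assumed debiasing term for the feature extractor: negative entropy.

    It is lowest when the scene prediction is uniform, i.e. when the
    features carry no usable scene information.
    """
    return -entropy(logits)


# Illustrative scene logits for three made-up scene classes.
confident = [4.0, 0.0, 0.0]  # features that reveal the scene
confused = [0.0, 0.0, 0.0]   # features that hide the scene

print(cross_entropy(confident, 0), confusion_loss(confident))
print(cross_entropy(confused, 0), confusion_loss(confused))
```

In this sketch the two objectives pull in opposite directions: scene-revealing features give the classifier a low cross-entropy but a high confusion loss, whereas scene-invariant features do the reverse. How SOAR actually weights or adapts this trade-off is not specified in the abstract.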