@InProceedings{Chen_2021_CVPR,
    author    = {Chen, Zhihao and Wan, Liang and Zhu, Lei and Shen, Jia and Fu, Huazhu and Liu, Wennan and Qin, Jing},
    title     = {Triple-Cooperative Video Shadow Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {2715-2724}
}
Triple-Cooperative Video Shadow Detection
Abstract
Shadow detection in a single image has received significant research interest in recent years. However, much less work has explored shadow detection in dynamic scenes. The bottleneck is the lack of a well-established dataset with high-quality annotations for video shadow detection. In this work, we collect a new video shadow detection dataset (ViSha), which contains 120 videos with 11,685 frames, covering 60 object categories, varying lengths, and different motion/lighting conditions. All the frames are annotated with a high-quality pixel-level shadow mask. To the best of our knowledge, this is the first learning-oriented dataset for video shadow detection. Furthermore, we develop a new baseline model, named triple-cooperative video shadow detection network (TVSD-Net). It utilizes triple parallel networks in a cooperative manner to learn discriminative representations at intra-video and inter-video levels. Within the network, a dual gated co-attention module is proposed to constrain features from neighboring frames in the same video, while an auxiliary similarity loss is introduced to mine semantic information between different videos. Finally, we conduct a comprehensive study on the ViSha dataset, systematically evaluating 10 state-of-the-art models (including single image shadow detectors, video object and saliency detection methods). Experimental results demonstrate that our model outperforms SOTA competitors.