@InProceedings{Luddecke_2021_CVPR,
  author    = {Luddecke, Timo and Ecker, Alexander},
  title     = {The Role of Data for One-Shot Semantic Segmentation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2021},
  pages     = {2653-2658}
}
The Role of Data for One-Shot Semantic Segmentation
Abstract
In this work, we investigate the potential of larger datasets for one-shot semantic segmentation. While computer vision models are often trained on millions of diverse samples, current one-shot semantic segmentation datasets encompass only a small number of samples (Pascal-5i), a small number of classes (Pascal-5i and COCO-20i), or little variability (FSS-1000). To improve this situation, we introduce LVIS-OneShot, a one-shot variant of the LVIS dataset. With 718 classes and 114,347 images, it substantially exceeds previous datasets in size. Through controlled experiments, we show that not only the number of images but also the number of distinct classes is crucial. We analyze transfer learning across common datasets and find that training on LVIS-OneShot outperforms current state-of-the-art models on Pascal-5i. In particular, we observe that a simple baseline model (MaRF) learns to perform one-shot segmentation when trained on a large dataset, even though it has a generic architecture without strong inductive biases. Code and dataset are available at: eckerlab.org/code/one-shot-segmentation