High Quality Entity Segmentation

Qi, Lu; Kuen, Jason; Shen, Tiancheng; Gu, Jiuxiang; Li, Wenbo; Guo, Weidong; Jia, Jiaya; Lin, Zhe; Yang, Ming-Hsuan

Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 4047-4056

Abstract

Dense image segmentation tasks (e.g., semantic, panoptic) are useful for image editing, but existing methods can hardly generalize well in an in-the-wild setting where there are unrestricted image domains, classes, and image resolution & quality variations. Motivated by these observations, we construct a new entity segmentation dataset, with a strong focus on high-quality dense segmentation in the wild. The dataset contains images spanning diverse image domains and entities, along with plentiful high-resolution images and high-quality mask annotations for training and testing. Given the high-quality and -resolution nature of the dataset, we propose CropFormer, which is designed to tackle the intractability of instance-level segmentation on high-resolution images. It improves mask prediction by fusing high-res image crops, which provide more fine-grained image details, with the full image. CropFormer is the first query-based Transformer architecture that can effectively fuse mask predictions from multiple image views, by learning queries that effectively associate the same entities across the full image and its crop. With CropFormer, we achieve a significant AP gain of 1.9 on the challenging entity segmentation task. Furthermore, CropFormer consistently improves the accuracy of traditional segmentation tasks and datasets. The dataset and code are released at http://luqi.info/entityv2.github.io/.
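To make the idea of fusing mask predictions from the full image and a crop concrete, the following is a minimal sketch and not CropFormer's actual fusion module. It assumes that query index q refers to the same entity in both views (the association the abstract says is learned), that the crop's logits have already been resized to the crop's footprint in full-image pixels, and that a simple weighted average is used. The function name `fuse_full_and_crop`, the `box` layout, and `crop_weight` are hypothetical.

```python
import numpy as np

def fuse_full_and_crop(full_logits, crop_logits, box, crop_weight=0.5):
    """Blend per-query mask logits from a full view and one crop view.

    full_logits: array of shape (Q, H, W), one mask logit map per query.
    crop_logits: array of shape (Q, y1 - y0, x1 - x0) for the same Q queries.
    box: (y0, x0, y1, x1), the crop's location in full-image pixels (assumed layout).
    crop_weight: weight given to the crop view inside the box (assumed hyperparameter).
    """
    y0, x0, y1, x1 = box
    if crop_logits.shape != (full_logits.shape[0], y1 - y0, x1 - x0):
        raise ValueError("crop_logits must have shape (Q, y1 - y0, x1 - x0)")
    fused = full_logits.astype(np.float64).copy()
    # Inside the crop region, average the two views; outside it, keep the full view.
    region = fused[:, y0:y1, x0:x1]
    fused[:, y0:y1, x0:x1] = (1.0 - crop_weight) * region + crop_weight * crop_logits
    return fused

# Toy usage: 2 queries on a 4x4 image, with a crop covering the top-left 2x2 corner.
full = np.zeros((2, 4, 4))
crop = np.full((2, 2, 2), 4.0)
fused = fuse_full_and_crop(full, crop, box=(0, 0, 2, 2))
masks = fused > 0  # binarize the fused logits at zero
```

In this toy case the crop view is confident where the full view is neutral, so only pixels inside the crop box become foreground for each query.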

Related Material

@InProceedings{Qi_2023_ICCV,
    author    = {Qi, Lu and Kuen, Jason and Shen, Tiancheng and Gu, Jiuxiang and Li, Wenbo and Guo, Weidong and Jia, Jiaya and Lin, Zhe and Yang, Ming-Hsuan},
    title     = {High Quality Entity Segmentation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {4047-4056}
}