Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

Ke Fan, Zechen Bai, Tianjun Xiao, Tong He, Max Horn, Yanwei Fu, Francesco Locatello, Zheng Zhang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 23062-23071

Abstract


Object-centric learning (OCL) extracts the representation of objects with slots offering an exceptional blend of flexibility and interpretability for abstracting low-level perceptual features. A widely adopted method within OCL is slot attention which utilizes attention mechanisms to iteratively refine slot representations. However a major drawback of most object-centric models including slot attention is their reliance on predefining the number of slots. This not only necessitates prior knowledge of the dataset but also overlooks the inherent variability in the number of objects present in each instance. To overcome this fundamental limitation we present a novel complexity-aware object auto-encoder framework. Within this framework we introduce an adaptive slot attention (AdaSlot) mechanism that dynamically determines the optimal number of slots based on the content of the data. This is achieved by proposing a discrete slot sampling module that is responsible for selecting an appropriate number of slots from a candidate list. Furthermore we introduce a masked slot decoder that suppresses unselected slots during the decoding process. Our framework tested extensively on object discovery tasks with various datasets shows performance matching or exceeding top fixed-slot models. Moreover our analysis substantiates that our method exhibits the capability to dynamically adapt the slot number according to each instance's complexity offering the potential for further exploration in slot attention research. Project will be available at https://kfan21.github.io/AdaSlot/

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Fan_2024_CVPR, author = {Fan, Ke and Bai, Zechen and Xiao, Tianjun and He, Tong and Horn, Max and Fu, Yanwei and Locatello, Francesco and Zhang, Zheng}, title = {Adaptive Slot Attention: Object Discovery with Dynamic Slot Number}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {23062-23071} }