A Dataset and Framework for Learning State-invariant Object Representations

Rohan Sarkar, Avinash Kak; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026, pp. 4715-4723

Abstract


We introduce state invariance, that is, robustness to changes in an object's structural form (as with a folded umbrella or crumpled clothing), which complements other common invariances when learning object representations for recognition and retrieval tasks. To this end, we present ObjectsWithStateChange (OWSC), a novel dataset that captures variations in object appearance arising from state changes, along with pose, viewpoint, and illumination variations, to advance research in fine-grained 3D object recognition and retrieval. A key challenge is that objects within and across categories may look visually similar under certain state changes, making discrimination difficult. To address this, we propose a curriculum-learning-based mining strategy that, after each epoch, uses inter-object distances in the learned embedding space to sample progressively harder-to-distinguish pairs of visually similar objects from within and across categories. Our ablation shows that this curriculum learning strategy enhances the model's ability to learn discriminative invariant features for fine-grained tasks, improving object recognition accuracy by 7.9% and retrieval mAP by 9.2% over prior methods on our new OWSC dataset and three other multi-view datasets (ModelNet40, ObjectPI, and FG3D). Our OWSC dataset is available at https://github.com/sarkar-rohan/ObjectsWithStateChange .
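
The mining idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the schedule or the distance measure, so the per-object mean embedding, the Euclidean distance, the linearly shrinking neighbour pool, and all function names below (object_centroids, candidate_pool_size, sample_hard_pairs) are assumptions made for illustration only.

```python
import numpy as np


def object_centroids(embeddings, object_ids):
    """Return (sorted unique object ids, mean embedding per object)."""
    ids = np.unique(object_ids)
    cents = np.stack([embeddings[object_ids == i].mean(axis=0) for i in ids])
    return ids, cents


def candidate_pool_size(epoch, num_epochs, num_objects, min_pool=1):
    """Assumed schedule: shrink the neighbour pool linearly over training,
    from all other objects (easy) down to min_pool nearest objects (hard)."""
    max_pool = num_objects - 1
    frac = epoch / max(num_epochs - 1, 1)
    return max(min_pool, int(round(max_pool - frac * (max_pool - min_pool))))


def sample_hard_pairs(embeddings, object_ids, epoch, num_epochs, rng):
    """For each object, pick a partner among its k nearest other objects
    in the current embedding space; k shrinks as the epochs advance."""
    ids, cents = object_centroids(embeddings, object_ids)
    # Pairwise Euclidean distances between object centroids.
    dists = np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # an object is never paired with itself
    k = candidate_pool_size(epoch, num_epochs, len(ids))
    nearest = np.argsort(dists, axis=1)[:, :k]
    choice = rng.integers(0, k, size=len(ids))
    partners = ids[nearest[np.arange(len(ids)), choice]]
    return [(int(a), int(b)) for a, b in zip(ids, partners)]
```

In a training loop, one would recompute the embeddings at the end of every epoch and call sample_hard_pairs with the current epoch index, so that early epochs draw partners from almost any object while late epochs draw only from each object's closest, most confusable neighbours.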

Related Material


@InProceedings{Sarkar_2026_WACV,
    author    = {Sarkar, Rohan and Kak, Avinash},
    title     = {A Dataset and Framework for Learning State-invariant Object Representations},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {4715-4723}
}