Act the Part: Learning Interaction Strategies for Articulated Object Part Discovery

Samir Yitzhak Gadre, Kiana Ehsani, Shuran Song; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 15752-15761

Abstract


People often use physical intuition when manipulating articulated objects, irrespective of object semantics. Motivated by this observation, we identify an important embodied task where an agent must play with objects to recover their parts. To this end, we introduce Act the Part (AtP) to learn how to interact with articulated objects to discover and segment their pieces. By coupling action selection and motion segmentation, AtP is able to isolate structures to make perceptual part recovery possible without semantic labels. Our experiments show AtP learns efficient strategies for part discovery, can generalize to unseen categories, and is capable of conditional reasoning for the task. Although AtP is trained in simulation, we show convincing transfer to real-world data with no fine-tuning. A summary video, interactive demo, and code will be available at atp.cs.columbia.edu.
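The abstract's central idea is a loop that couples action selection with motion segmentation: the agent pushes a part, then labels the pixels that moved. As a rough illustration only, and not the authors' implementation, the sketch below pairs a random action policy with naive frame differencing; in AtP both of these are learned networks. Every name here (motion_mask, act_then_segment, env_step, toy_env) is a hypothetical stand-in introduced for this example.

```python
import numpy as np

def motion_mask(before, after, threshold=0.1):
    # Pixels whose intensity changed are assumed to belong to the moved part.
    return np.abs(after - before) > threshold

def act_then_segment(env_step, num_actions, num_interactions=5, seed=0):
    # Toy interact-then-segment loop. A random policy stands in for a
    # learned action network; frame differencing stands in for a learned
    # motion segmentation network.
    rng = np.random.default_rng(seed)
    part_masks = []
    for _ in range(num_interactions):
        action = int(rng.integers(num_actions))
        before, after = env_step(action)  # hypothetical environment interface
        mask = motion_mask(before, after)
        if mask.any():  # something moved: record the moved region as a part
            part_masks.append(mask)
    return part_masks

# Minimal synthetic "environment": one bright column slides right per push.
_state = {"col": 2}
def toy_env(action):
    before = np.zeros((8, 8))
    before[:, _state["col"]] = 1.0
    _state["col"] = min(_state["col"] + 1, 7)
    after = np.zeros((8, 8))
    after[:, _state["col"]] = 1.0
    return before, after

if __name__ == "__main__":
    masks = act_then_segment(toy_env, num_actions=4, num_interactions=3)
    print(f"recovered {len(masks)} motion masks")
```

Run as a script, this prints three recovered motion masks; the point is only the structure of the perceive-act-segment loop, not the (deliberately trivial) policy and segmenter used here.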

Related Material


BibTeX:
@InProceedings{Gadre_2021_ICCV,
    author    = {Gadre, Samir Yitzhak and Ehsani, Kiana and Song, Shuran},
    title     = {Act the Part: Learning Interaction Strategies for Articulated Object Part Discovery},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {15752-15761}
}