Intelligent Robot Manipulation Requires Self-Directed Learning

Li Chen, Chonghao Sima, Kashyap Chitta, Antonio Loquercio, Ping Luo, Yi Ma, Hongyang Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2026, pp. 842-853

Abstract


The Embodied AI community has long aspired to create robotic systems with human-level intelligence and dexterity. Recent advances in vision and language models have motivated researchers to follow a similar paradigm for robotics and scale up imitation learning from demonstrations. However, imitation learning lacks a mechanism for incorporating feedback from the agent's own experience as it interacts with the environment. This perspective argues that enabling agents to learn from their own experience, which we term self-directed learning, is indispensable for advancing the intelligence and dexterity of robotic systems. Although it shares concepts and tools with existing reinforcement learning methods, self-directed learning poses additional challenges that could completely alter the algorithmic landscape of robot learning: the lack of resets and the lack of an explicit, noise-free reward signal. To overcome these challenges, we argue that future work on self-directed learning should focus on three aspects: goal identification, skill acquisition, and performance evaluation. To improve the efficiency of each step, we draw on education theory, which suggests that learning is not confined to a single modality but relies on shared mechanisms across visual, textual, and kinesthetic processes. We outline the key challenges and prospective research avenues for self-directed learning. We also discuss alternatives to self-directed learning for training robots on physically dexterous tasks.

Related Material


@InProceedings{Chen_2026_CVPR,
    author    = {Chen, Li and Sima, Chonghao and Chitta, Kashyap and Loquercio, Antonio and Luo, Ping and Ma, Yi and Li, Hongyang},
    title     = {Intelligent Robot Manipulation Requires Self-Directed Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2026},
    pages     = {842-853}
}