LAM: Language Articulated Object Modelers

Gao, Yipeng; Ge, Yunhao; Cai, Peilin; Seita, Daniel; Itti, Laurent

Yipeng Gao, Yunhao Ge, Peilin Cai, Daniel Seita, Laurent Itti; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 16010-16020

Abstract

We introduce LAM, a system that explores collaboration between large language models and vision-language models to generate articulated objects from text prompts without a visual prior or prebuilt 3D assets. Rather than depending on such priors or assets, we formulate articulated object generation as a unified code-generation task in which geometry and articulations are co-designed from scratch. Given an input text, LAM coordinates a team of specialized modules that generate code representing the desired articulated object procedurally. LAM first reasons about the hierarchical structure of parts (links) with the Link Designer. The Geometry and Articulation Coders then write, compile, and debug the code, and the Geometry and Articulation Checkers drive self-correction. The code serves as a structured, interpretable bridge between individual links, ensuring the correct relationships among them. Experiments show that using code as a generative medium within a collaborative system is effective for automatically constructing complex articulated objects.
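The design-code-check loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Link`, `check_hierarchy`, `generate_with_repair`, and `stub_coder` are hypothetical, the model-backed coder is replaced by a deterministic stub, and the checker only validates the link tree rather than geometry or joint behavior.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Link:
    """One part of an articulated object (hypothetical representation)."""
    name: str
    parent: Optional[str] = None  # None marks the root link
    joint: str = "fixed"          # e.g. "revolute", "prismatic", "fixed"


def check_hierarchy(links: List[Link]) -> List[str]:
    """Return a list of problems; an empty list means the tree is consistent."""
    names = {link.name for link in links}
    problems = []
    roots = [link.name for link in links if link.parent is None]
    if len(roots) != 1:
        problems.append(f"expected exactly one root link, found {len(roots)}")
    for link in links:
        if link.parent is not None and link.parent not in names:
            problems.append(
                f"link '{link.name}' has unknown parent '{link.parent}'"
            )
    return problems


def generate_with_repair(
    prompt: str,
    coder: Callable[[str, List[str]], List[Link]],
    checker: Callable[[List[Link]], List[str]],
    max_rounds: int = 3,
) -> List[Link]:
    """Ask the coder for a candidate, check it, and feed problems back."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = coder(prompt, feedback)
        feedback = checker(candidate)
        if not feedback:
            return candidate
    raise RuntimeError(f"no valid object after {max_rounds} rounds: {feedback}")


def stub_coder(prompt: str, feedback: List[str]) -> List[Link]:
    """Stand-in for a model-backed coder: wrong at first, fixed after feedback."""
    parent = "body" if feedback else "bodyy"  # first attempt has a bad parent
    return [Link("body"), Link("door", parent=parent, joint="revolute")]


result = generate_with_repair(
    "a cabinet with one door", stub_coder, check_hierarchy
)
```

In this sketch the first candidate fails the check because the door refers to a parent that does not exist, and the second candidate passes once the checker's feedback is supplied. A real system would presumably replace the stub with model calls and extend the checker to compiled geometry and joint limits.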

Related Material


[bibtex]

@InProceedings{Gao_2026_CVPR,
    author    = {Gao, Yipeng and Ge, Yunhao and Cai, Peilin and Seita, Daniel and Itti, Laurent},
    title     = {LAM: Language Articulated Object Modelers},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {16010-16020}
}