@InProceedings{Myers-Dean_2025_CVPR,
    author    = {Myers-Dean, Josh and Price, Brian and Fan, Yifei and Gurari, Danna},
    title     = {Hierarchical Semantic Segmentation with Autoregressive Language Modeling},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {4129-4139}
}
Hierarchical Semantic Segmentation with Autoregressive Language Modeling
Abstract
Hierarchical semantic segmentation entails progressively decomposing objects into smaller nested parts. Existing approaches require either multiple inference passes or multiple fixed decoders. We instead introduce HALLUMI, an autoregressive language modeling framework that performs the task in a single inference pass, relying on special tokens to indicate parent-child relationships so that the hierarchy can be recovered from the generated text. Experiments on SPIN, a hierarchical semantic segmentation dataset that extends to the subpart level, show that HALLUMI achieves state-of-the-art results.
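To illustrate how a hierarchy might be recovered from generated text, the sketch below rebuilds a nested tree from a flat token sequence. It is not HALLUMI's actual decoding scheme: the abstract does not specify the special tokens, so `<children>` and `</children>`, the function name `parse_hierarchy`, and the example labels are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical special tokens (assumed for this sketch; the abstract
# does not give the model's real token vocabulary).
OPEN = "<children>"    # the names that follow are children of the previous name
CLOSE = "</children>"  # return to the parent level


def parse_hierarchy(tokens: List[str]) -> Dict[str, dict]:
    """Rebuild a nested {name: children} tree from a flat token sequence."""
    root: Dict[str, dict] = {}
    stack = [root]  # stack[-1] is the dict that receives new nodes
    last: Optional[Dict[str, dict]] = None  # most recently created node

    for tok in tokens:
        if tok == OPEN:
            if last is None:
                raise ValueError("OPEN token without a preceding parent name")
            stack.append(last)  # descend into the last node
            last = None
        elif tok == CLOSE:
            if len(stack) == 1:
                raise ValueError("unbalanced CLOSE token")
            stack.pop()  # climb back to the parent level
            last = None
        else:
            node: Dict[str, dict] = {}
            stack[-1][tok] = node  # attach a new node at the current level
            last = node

    if len(stack) != 1:
        raise ValueError("unclosed OPEN token")
    return root


# Illustrative sequence: object -> part -> subpart.
example = ["car", OPEN, "wheel", OPEN, "tire", CLOSE, "door", CLOSE, "person"]
tree = parse_hierarchy(example)
```

A single stack suffices because the sequence is consumed left to right, which mirrors how one autoregressive pass could emit the whole hierarchy. In a real system each name would also need to be tied to a segmentation mask, which this sketch omits.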