Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models

Ziqian Lu, Yunlong Yu, Qinyue Tong, Jun Liu; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 3575-3584

Abstract


Existing adaptation methods for pre-trained vision-language models such as CLIP often rely on base-class samples during fine-tuning, introducing systematic biases that distort decision boundaries and degrade performance on novel classes. In this work, we propose a hierarchical divide-and-conquer framework that addresses classification bias at its root. Our method first segregates the label space into base and novel subspaces, ensuring domain separation. It then applies text-embedding clustering within each subspace to decompose ambiguous intra-domain classes into disentangled, fine-grained clusters. This two-stage grouping strategy not only alleviates class confusion but also enables domain-specific model training in isolated subspaces, fostering specialized learning without overfitting to base categories. Experiments on three classification benchmarks show that our approach achieves state-of-the-art performance, surpassing the second-best competitor by 10% in average accuracy.
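The two-stage grouping described above can be sketched in miniature. The snippet below is a hypothetical illustration, not the authors' implementation: the label names and 2-D vectors are toy stand-ins for CLIP text embeddings, and a plain k-means routine stands in for whatever clustering procedure and hyperparameters the paper actually uses. Stage 1 partitions the label space into base and novel subspaces; Stage 2 clusters text embeddings within each subspace separately.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain-Python k-means over small embedding lists (no external deps)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Recompute each center as the mean of its assigned points.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return assign

# Stage 1: segregate the label space into base and novel subspaces.
# Vectors are toy 2-D stand-ins for CLIP text embeddings (illustration only).
base = {"cat": (0.0, 0.1), "dog": (0.1, 0.0), "wolf": (0.2, 0.1),
        "airliner": (5.0, 5.0), "biplane": (5.1, 4.9)}
novel = {"okapi": (0.0, 3.0), "zebra": (0.1, 3.1), "tram": (4.0, 0.0)}

# Stage 2: cluster embeddings within each subspace separately, so ambiguous
# classes end up in fine-grained intra-domain groups.
for name, subspace in (("base", base), ("novel", novel)):
    labels = list(subspace)
    groups = kmeans([subspace[l] for l in labels], k=2)
    print(name, {l: g for l, g in zip(labels, groups)})
```

Clustering each subspace in isolation is the point of the hierarchy: base and novel labels never compete inside one partition, which mirrors the paper's goal of training domain-specific models in separated subspaces.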

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Lu_2025_ICCV,
    author    = {Lu, Ziqian and Yu, Yunlong and Tong, Qinyue and Liu, Jun},
    title     = {Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {3575-3584}
}