LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

Qihao Zhao, Yalun Dai, Hao Li, Wei Hu, Fan Zhang, Jun Liu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 19510-19520

Abstract

Long-tail recognition is challenging because it requires the model to learn good representations from tail categories and to address imbalances across all categories. In this paper, we propose a novel generative and fine-tuning framework, LTGC, that handles long-tail recognition by leveraging generated content. First, inspired by the rich implicit knowledge in large-scale models (e.g., large language models, LLMs), LTGC leverages the power of these models to parse and reason over the original tail data to produce diverse tail-class content. We then propose several novel designs for LTGC to ensure the quality of the generated data and to efficiently fine-tune the model using both the generated and the original data. Visualizations demonstrate the effectiveness of the generation module in LTGC, which produces accurate and diverse tail data. Additionally, experimental results show that LTGC outperforms existing state-of-the-art methods on popular long-tailed benchmarks.
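As a rough illustration of the pipeline the abstract describes, the sketch below expands each tail class by prompting an LLM for diverse class descriptions and rendering them with a text-to-image model, then merging the synthetic samples with the original data before fine-tuning. This is a minimal sketch under stated assumptions: query_llm and text_to_image are hypothetical stand-ins for an LLM API and a generative image model, not the paper's actual interfaces, and the quality-control and efficient fine-tuning designs LTGC proposes are omitted.

    """Illustrative LTGC-style tail-class expansion (not the paper's code)."""
    from typing import Callable, Dict, List

    def expand_tail_class(
        class_name: str,
        n_samples: int,
        query_llm: Callable[[str], List[str]],    # prompt -> list of descriptions (hypothetical)
        text_to_image: Callable[[str], "Image"],  # description -> synthetic image (hypothetical)
    ) -> List["Image"]:
        """Generate diverse synthetic examples for one tail class."""
        prompt = (
            f"List {n_samples} varied, concrete visual descriptions of "
            f"'{class_name}', covering different poses, backgrounds, and lighting."
        )
        descriptions = query_llm(prompt)
        return [text_to_image(d) for d in descriptions[:n_samples]]

    def build_synthetic_extras(
        dataset: Dict[str, list],      # class name -> original samples
        tail_classes: List[str],
        target_per_class: int,
        query_llm: Callable[[str], List[str]],
        text_to_image: Callable[[str], "Image"],
    ) -> Dict[str, List["Image"]]:
        """Top up each tail class to a target size; head classes are untouched."""
        extras = {}
        for cls in tail_classes:
            deficit = target_per_class - len(dataset[cls])
            if deficit > 0:
                extras[cls] = expand_tail_class(cls, deficit, query_llm, text_to_image)
        return extras  # merge with the original data, then fine-tune as usual

Generating only the per-class deficit mirrors the stated goal of addressing imbalance across categories: tail classes are topped up toward a common size while the original head-class data is left unchanged.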

Related Material

[pdf] [supp] [arXiv]
@InProceedings{Zhao_2024_CVPR,
    author    = {Zhao, Qihao and Dai, Yalun and Li, Hao and Hu, Wei and Zhang, Fan and Liu, Jun},
    title     = {LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {19510-19520}
}