Hierarchical Prompting for Diffusion Classifiers

Wenxin Ning, Dongliang Chang, Yujun Tong, Zhongjiang He, Kongming Liang, Zhanyu Ma; Proceedings of the Asian Conference on Computer Vision (ACCV), 2024, pp. 1284-1300

Abstract

Recently, large-scale pre-trained text-to-image models like Stable Diffusion have demonstrated unparalleled capabilities, revolutionizing many tasks. Recent studies have found that these advanced generative models can be applied to discriminative tasks, showing strong accuracy and robustness in zero-shot recognition. However, the current pipeline suffers from impractical inference speed (about 1 minute per image). In this paper, we introduce Hierarchical Prompt Learning, a simple and effective pipeline for high-speed classification with diffusion generators. Our method first uses a hierarchical evaluation strategy that leverages a prior class-tree taxonomy to avoid modeling unnecessary classes. To reduce the excessive number of sampling steps, we employ prompt learning, a parameter-efficient technique, to adapt downstream task-specific knowledge into the conditional text embedding. This allows our method to sample diffusion models in just 25 steps while maintaining high accuracy. The proposed hierarchical evaluation achieves up to 3.5x speedups compared to previous diffusion classifiers, and its combination with prompt learning achieves up to 20x speedups. Beyond efficiency, our method also maintains high performance in zero-shot and few-shot scenarios, both in-distribution and out-of-distribution. Moreover, our visualization analysis sheds light on what our diffusion prompts learn, providing insights into the model's decision-making process. Code is available at https://github.com/PRIS-CV/Hierarchical-Prompting-for-Diffusion-Classifiers.
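
The hierarchical evaluation idea described in the abstract can be illustrated with a minimal sketch. It assumes a toy class tree (`TREE`), a greedy descent that keeps only the best child at each level, and a placeholder `score` function standing in for the per-class diffusion denoising error. None of these names come from the paper or its repository, and the actual pruning rule may differ.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy taxonomy: internal nodes map to their children;
# any node that is not a key is treated as a leaf class.
TREE: Dict[str, List[str]] = {
    "root": ["animal", "vehicle"],
    "animal": ["cat", "dog", "horse"],
    "vehicle": ["car", "truck", "plane"],
}

# Hypothetical per-node errors (lower is better). In a diffusion classifier
# this would be a noise-prediction error conditioned on the node's prompt.
ERR: Dict[str, float] = {
    "animal": 0.2, "vehicle": 0.9,
    "cat": 0.5, "dog": 0.1, "horse": 0.7,
    "car": 0.8, "truck": 0.6, "plane": 0.4,
}


def hierarchical_classify(
    tree: Dict[str, List[str]],
    score: Callable[[str], float],
    root: str = "root",
) -> Tuple[str, int]:
    """Greedily descend the class tree; return (leaf, number of score calls)."""
    node = root
    calls = 0
    while node in tree:  # stop once a leaf class is reached
        best_child, best_err = None, float("inf")
        for child in tree[node]:
            err = score(child)  # one (expensive) evaluation per child
            calls += 1
            if err < best_err:
                best_child, best_err = child, err
        node = best_child  # siblings' subtrees are pruned, never evaluated
    return node, calls


prediction, n_calls = hierarchical_classify(TREE, ERR.__getitem__)
print(prediction, n_calls)
```

In this toy setup the descent scores two coarse nodes and then three leaves, so the "vehicle" subtree is never evaluated. The saving grows with the number of leaf classes, at the risk of an unrecoverable mistake at a coarse level, which is a property of this greedy sketch rather than a claim about the paper's method.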

Related Material

@InProceedings{Ning_2024_ACCV,
    author    = {Ning, Wenxin and Chang, Dongliang and Tong, Yujun and He, Zhongjiang and Liang, Kongming and Ma, Zhanyu},
    title     = {Hierarchical Prompting for Diffusion Classifiers},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2024},
    pages     = {1284-1300}
}