Revelio: Interpreting and leveraging semantic information in diffusion models

Dahye Kim, Xavier Thomas, Deepti Ghadiyaram; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025, pp. 4659-4669

Abstract


We study how rich visual semantic information is represented across the layers and denoising timesteps of different diffusion architectures. We uncover monosemantic interpretable features by leveraging k-sparse autoencoders (k-SAE). We substantiate our mechanistic interpretations via transfer learning, training lightweight classifiers on features from off-the-shelf diffusion models. On 4 datasets, we demonstrate the effectiveness of diffusion features for representation learning. We provide an in-depth analysis of how different diffusion architectures, pre-training datasets, and language model conditioning impact visual representation granularity, inductive biases, and transfer learning capabilities. Our work is a critical step towards deepening the interpretability of black-box diffusion models. Code and visualizations are available at: https://github.com/revelio-diffusion/revelio
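As a rough illustration of the k-SAE idea the abstract refers to, the sketch below encodes a batch of feature vectors and keeps only the k largest pre-activations per sample (a TopK sparsity constraint), then reconstructs with a linear decoder. This is a generic minimal sketch in NumPy, not the paper's implementation; all names (`W_enc`, `W_dec`, dimensions, and the use of diffusion features as input) are illustrative assumptions.

```python
import numpy as np

def topk_sparse_code(x, W_enc, b_enc, k):
    """Encode x, keeping only the k largest pre-activations per sample (TopK SAE)."""
    pre = x @ W_enc + b_enc                          # (batch, n_latents)
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]   # indices of the top-k units
    z = np.zeros_like(pre)
    rows = np.arange(pre.shape[0])[:, None]
    z[rows, idx] = np.maximum(pre[rows, idx], 0.0)   # ReLU on the kept units only
    return z

rng = np.random.default_rng(0)
d, n_latents, k = 16, 64, 4                          # illustrative sizes
x = rng.standard_normal((8, d))                      # stand-in for diffusion features
W_enc = rng.standard_normal((d, n_latents)) * 0.1
b_enc = np.zeros(n_latents)
W_dec = rng.standard_normal((n_latents, d)) * 0.1

z = topk_sparse_code(x, W_enc, b_enc, k)
x_hat = z @ W_dec                                    # linear decoder reconstruction
assert (z != 0).sum(axis=1).max() <= k               # at most k active latents per sample
```

In practice the encoder/decoder would be trained to minimize reconstruction error on features collected from a chosen layer and timestep; the hard TopK constraint is what encourages each latent unit to capture a single interpretable concept.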

Related Material


[bibtex]
@InProceedings{Kim_2025_ICCV,
  author    = {Kim, Dahye and Thomas, Xavier and Ghadiyaram, Deepti},
  title     = {Revelio: Interpreting and leveraging semantic information in diffusion models},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2025},
  pages     = {4659-4669}
}