@InProceedings{Abraham_2025_CVPR,
    author    = {Abraham, Sophia J. and Hauenstein, Jonathan D. and Scheirer, Walter J.},
    title     = {Wavelet-Based Mechanistic Interpretability of Vision Transformers via Frequency-Aware Ablations},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {4830-4834}
}
Wavelet-Based Mechanistic Interpretability of Vision Transformers via Frequency-Aware Ablations
Abstract
We explore a wavelet-based interpretability framework for Vision Transformers (ViTs) that analyzes their reliance on frequency-specific representations. Through systematic ablations of wavelet subbands, we assess how different frequency components contribute to latent representations and attention mechanisms. Our empirical study on CIFAR-10 indicates that high-frequency details, particularly those captured by Haar wavelets, may influence reconstruction fidelity and attention distributions. While these preliminary findings suggest frequency-dependent behavior in ViT representations, further investigation is needed to generalize across datasets and architectures. This study highlights the potential of frequency-based interpretability but also underscores the need for more rigorous evaluation in larger, more diverse settings. To encourage further exploration, all experiment and method code is available in our GitHub repository.
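To make the idea of a wavelet subband ablation concrete, the following is a minimal sketch, not the authors' released code. It assumes a single-level orthonormal 2D Haar transform implemented directly in NumPy; the function names (`haar_dwt2`, `haar_idwt2`, `ablate_subbands`) and the `LL`/`LH`/`HL`/`HH` labels are illustrative choices, and subband naming conventions differ between libraries. An ablated image produced this way could then be fed to a ViT to compare latent representations or attention maps against the unablated input.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform.

    x has shape (H, W) or (H, W, C) with even H and W.
    Returns a dict of four subbands, each of shape (H/2, W/2, ...).
    """
    # The four pixels of every non-overlapping 2x2 block.
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return {
        "LL": (a + b + c + d) / 2.0,  # local average (low frequency)
        "LH": (a - b + c - d) / 2.0,  # differences between adjacent columns
        "HL": (a + b - c - d) / 2.0,  # differences between adjacent rows
        "HH": (a - b - c + d) / 2.0,  # diagonal differences
    }


def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the image from its four subbands."""
    ll, lh, hl, hh = bands["LL"], bands["LH"], bands["HL"], bands["HH"]
    h, w = ll.shape[:2]
    out = np.empty((2 * h, 2 * w) + ll.shape[2:], dtype=ll.dtype)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


def ablate_subbands(x, names):
    """Zero the named subbands and reconstruct the image."""
    bands = haar_dwt2(np.asarray(x, dtype=np.float64))
    for name in names:
        bands[name] = np.zeros_like(bands[name])
    return haar_idwt2(bands)


if __name__ == "__main__":
    # Random stand-in for one CIFAR-10 image (32x32 RGB); not real data.
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3))
    no_high_freq = ablate_subbands(image, ["LH", "HL", "HH"])
    print("mean absolute change:", np.abs(image - no_high_freq).mean())
```

Zeroing all three detail subbands replaces each 2x2 block with its mean, so the size of the resulting change gives a rough measure of how much high-frequency content the input carried.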