@InProceedings{Pan_2026_CVPR,
    author    = {Pan, Yuwen and Wang, Yuan and Li, Shaohui and Li, Zhi and Liu, Yu and He, You},
    title     = {From Attraction to Equilibrium: Physics-Inspired Semantic Gravitons for Zero-Shot Anomaly Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {35628-35637}
}
From Attraction to Equilibrium: Physics-Inspired Semantic Gravitons for Zero-Shot Anomaly Detection
Abstract
Zero-shot anomaly detection (ZSAD) aims to identify unseen anomalies without abnormal supervision, which is essential for open-world scenarios. Recent vision-language models such as CLIP enable anomaly reasoning through shared visual-textual embeddings, but existing methods often rely on coarse prompt fusion, leading to unstable alignment and imprecise localization under domain shifts. To address this issue, we propose the Semantic Graviton Network (SGNet), a physics-inspired framework that models multimodal alignment as an adaptive potential field. We introduce semantic gravitons, learnable dynamic mediators that bridge visual and textual modalities by establishing localized semantic equilibria through attraction and equilibrium forces. A graviton interaction network alternates text-to-graviton and vision-to-graviton coupling to progressively refine multimodal correspondence, while an energy-based potential regularization further stabilizes the interaction process. Extensive experiments on ten industrial and medical benchmarks show that SGNet achieves state-of-the-art performance for zero-shot anomaly detection.