SwinIA: Self-Supervised Blind-Spot Image Denoising without Convolutions

Mikhail Papkov, Pavel Chizhov, Leopold Parts; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 7071-7080

Abstract


Self-supervised image denoising implies restoring the signal from a noisy image without access to the ground truth. State-of-the-art solutions for this task rely on predicting masked pixels with a fully-convolutional neural network. This most often requires multiple forward passes, information about the noise model, or intricate regularization functions. In this paper, we propose a Swin Transformer-based Image Autoencoder (SwinIA), the first fully-transformer architecture for self-supervised denoising. The flexibility of the attention mechanism helps to fulfill the blind-spot property that convolutional counterparts normally approximate. SwinIA can be trained end-to-end with a simple mean squared error loss without masking and does not require any prior knowledge about clean data or noise distribution. Simple to use, SwinIA establishes the state of the art on several common benchmarks.
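To illustrate what the blind-spot property means for attention, here is a minimal toy sketch in pure Python. It is not the SwinIA architecture: the function names, the single-head layout, and the choice to build queries from positional encodings only (so that a query never sees its own pixel) are assumptions made for this illustration. Each output position attends to every position except itself, so its prediction cannot depend on its own noisy input.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def blind_spot_attention(x, pos, w_q, w_k, w_v):
    """Toy single-head attention in which position i never reads x[i].

    Assumption of this sketch: queries come from positional encodings `pos`,
    keys and values come from the noisy input `x`, and the diagonal of the
    attention map is excluded.
    """
    q = matmul(pos, w_q)  # queries: position only, no pixel content
    k = matmul(x, w_k)    # keys: from noisy pixels
    v = matmul(x, w_v)    # values: from noisy pixels
    scale = math.sqrt(len(k[0]))
    out = []
    for i, q_i in enumerate(q):
        # Exclude the diagonal: position i attends to all j != i.
        others = [j for j in range(len(x)) if j != i]
        scores = [sum(a * b for a, b in zip(q_i, k[j])) / scale for j in others]
        top = max(scores)  # subtract the maximum for a stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[j][c] for w, j in zip(weights, others)) / total
            for c in range(len(v[0]))
        ])
    return out


random.seed(0)
n, d = 6, 4  # six "pixels", four features each (arbitrary toy sizes)


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


x = rand_matrix(n, d)
pos = rand_matrix(n, d)
w_q, w_k, w_v = rand_matrix(d, d), rand_matrix(d, d), rand_matrix(d, d)

out = blind_spot_attention(x, pos, w_q, w_k, w_v)

# Perturb pixel 2 only and recompute.
x_perturbed = [row[:] for row in x]
x_perturbed[2] = [value + 1.0 for value in x_perturbed[2]]
out_perturbed = blind_spot_attention(x_perturbed, pos, w_q, w_k, w_v)
```

In this sketch, changing pixel 2 leaves the output at position 2 unchanged while the outputs at other positions shift. That is the behaviour a blind-spot denoiser needs in order to be trained with a plain mean squared error against the noisy image itself.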

Related Material


@InProceedings{Papkov_2025_WACV,
    author    = {Papkov, Mikhail and Chizhov, Pavel and Parts, Leopold},
    title     = {SwinIA: Self-Supervised Blind-Spot Image Denoising without Convolutions},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {7071-7080}
}