@InProceedings{Kwiatkowski_2025_WACV,
    author    = {Kwiatkowski, Monika and Matern, Simon and Hellwich, Olaf},
    title     = {Swin-: Gradient-Based Image Restoration from Image Sequences using Video Swin-Transformers},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {1383-1391}
}
Swin-: Gradient-Based Image Restoration from Image Sequences using Video Swin-Transformers
Abstract
Most deep-learning models for vision tasks rely on RGB images as their primary input layer, assuming the model inherently discovers an optimal representation. In this work, we challenge this assumption and show that image gradients offer a straightforward yet robust representation for multi-frame image restoration. We demonstrate that clusters naturally emerge within gradient patches, indicating improved estimation of the underlying signal. We develop a Video Swin-Transformer model operating in the gradient domain, facilitated by the implementation of two differentiable gradient modules. One module computes image gradients using convolutions with gradient filters, while the other reconstructs an RGB image from its gradient representation using deconvolution in the frequency domain. Additionally, we employ a composite training loss that measures the error both in the color domain and in its gradient counterpart. Applied to a multi-frame image restoration task involving the removal of lighting, shadows, and occlusions, our model consistently outperforms RGB-based counterparts without introducing additional parameters, thanks to its gradient regularization. We further apply our framework to various restoration tasks, discussing its advantages and limitations. Qualitative results highlight the model's improved generalization to real-world video scenarios, demonstrating successful adaptation from synthetic image training to real video data deployment.
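The two gradient modules and the composite loss described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the forward-difference filters, the periodic boundary handling, the explicit per-channel mean used to restore the constant offset that gradients discard, the L1 error, and the names `image_gradients`, `reconstruct_from_gradients`, `composite_loss`, and `grad_weight` are all assumptions made for illustration. The actual model would use differentiable modules inside a deep-learning framework.

```python
import numpy as np


def image_gradients(img):
    """Forward differences with periodic wrap-around. img has shape (H, W, C)."""
    gx = np.roll(img, -1, axis=1) - img  # horizontal gradient
    gy = np.roll(img, -1, axis=0) - img  # vertical gradient
    return gx, gy


def reconstruct_from_gradients(gx, gy, mean):
    """Least-squares inversion of image_gradients in the frequency domain.

    mean is the per-channel image mean (shape (C,)); it must be supplied
    because the gradient representation carries no constant offset.
    """
    h, w = gx.shape[:2]
    # Frequency responses of the two circular forward-difference filters.
    dx = (np.exp(2j * np.pi * np.fft.fftfreq(w)) - 1)[None, :, None]
    dy = (np.exp(2j * np.pi * np.fft.fftfreq(h)) - 1)[:, None, None]
    gx_hat = np.fft.fft2(gx, axes=(0, 1))
    gy_hat = np.fft.fft2(gy, axes=(0, 1))
    denom = np.abs(dx) ** 2 + np.abs(dy) ** 2  # shape (H, W, 1)
    denom[0, 0, 0] = 1.0  # avoid division by zero at the DC term
    u_hat = (np.conj(dx) * gx_hat + np.conj(dy) * gy_hat) / denom
    u_hat[0, 0, :] = mean * h * w  # restore the constant offset
    return np.real(np.fft.ifft2(u_hat, axes=(0, 1)))


def composite_loss(pred, target, grad_weight=1.0):
    """L1 error in the color domain plus a weighted L1 error on the gradients."""
    color_err = np.mean(np.abs(pred - target))
    pgx, pgy = image_gradients(pred)
    tgx, tgy = image_gradients(target)
    grad_err = np.mean(np.abs(pgx - tgx)) + np.mean(np.abs(pgy - tgy))
    return color_err + grad_weight * grad_err


# Round trip: RGB -> gradients -> RGB.
rng = np.random.default_rng(0)
image = rng.random((16, 24, 3))
gx, gy = image_gradients(image)
restored = reconstruct_from_gradients(gx, gy, image.mean(axis=(0, 1)))
```

Under these assumptions the round trip recovers the input up to floating-point error, which is the property that lets a network predict gradients while the loss is still evaluated on RGB output.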