Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring

Huicong Zhang, Haozhe Xie, Hongxun Yao; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 2673-2681

Abstract


Video deblurring relies on leveraging information from other frames in the video sequence to restore the blurred regions in the current frame. Mainstream approaches employ bidirectional feature propagation, spatio-temporal transformers, or a combination of both to extract information from the video sequence. However, limitations in memory and computational resources constrain the temporal window length of the spatio-temporal transformer, preventing the extraction of longer temporal contextual information from the video sequence. Additionally, bidirectional feature propagation is highly sensitive to inaccurate optical flow in blurry frames, leading to error accumulation during the propagation process. To address these issues, we propose BSSTNet, a Blur-aware Spatio-temporal Sparse Transformer Network. It introduces the blur map, which converts the originally dense attention into a sparse form, enabling more extensive utilization of information throughout the entire video sequence. Specifically, BSSTNet (1) uses a longer temporal window in the transformer, leveraging information from more distant frames to restore the blurry pixels in the current frame, and (2) introduces bidirectional feature propagation guided by blur maps, which reduces error accumulation caused by blurry frames. The experimental results demonstrate that the proposed BSSTNet outperforms state-of-the-art methods on the GoPro and DVD datasets.
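To illustrate the core idea of converting dense attention into a sparse form with a blur map, below is a minimal PyTorch sketch, not the authors' implementation: the function name, the keep_ratio parameter, and the token layout are assumptions. The sketch simply restricts every query token to attend only to the tokens the blur map marks as sharpest, which is one plausible reading of blur-guided sparsity.

import torch

def blur_guided_sparse_attention(q, k, v, blur_map, keep_ratio=0.5):
    """Toy blur-guided sparse attention (illustrative only, not BSSTNet's code).

    q, k, v:    (B, N, C) spatio-temporal token features
    blur_map:   (B, N) per-token blurriness in [0, 1] (higher = blurrier)
    keep_ratio: fraction of tokens retained as keys/values (assumed knob)
    """
    B, N, C = k.shape
    n_keep = max(1, int(N * keep_ratio))

    # Indices of the sharpest tokens (lowest blur scores) per batch element.
    _, idx = torch.topk(-blur_map, n_keep, dim=1)          # (B, n_keep)
    idx_exp = idx.unsqueeze(-1).expand(-1, -1, C)           # (B, n_keep, C)

    # Keep only the selected tokens as keys and values.
    k_sparse = torch.gather(k, 1, idx_exp)                  # (B, n_keep, C)
    v_sparse = torch.gather(v, 1, idx_exp)                  # (B, n_keep, C)

    # Every query attends to the reduced key/value set.
    attn = torch.softmax(q @ k_sparse.transpose(1, 2) / C ** 0.5, dim=-1)
    return attn @ v_sparse                                   # (B, N, C)

# Example with toy sizes: 2 clips, 196 tokens, 64-dim features.
q = k = v = torch.randn(2, 196, 64)
blur = torch.rand(2, 196)
out = blur_guided_sparse_attention(q, k, v, blur)            # (2, 196, 64)

Because the key/value set shrinks with the sparsity ratio, the attention cost drops accordingly, which is consistent with the paper's motivation for fitting a longer temporal window into the same memory and compute budget.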

Related Material


[bibtex]
@InProceedings{Zhang_2024_CVPR,
    author    = {Zhang, Huicong and Xie, Haozhe and Yao, Hongxun},
    title     = {Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {2673-2681}
}