Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models

Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6641-6651

Abstract


Though diffusion models have been successfully applied to various image restoration (IR) tasks, their performance is sensitive to the choice of training datasets. Typically, diffusion models trained on specific datasets fail to recover images that have out-of-distribution degradations. To address this problem, this work leverages a capable vision-language model and a synthetic degradation pipeline to learn image restoration in the wild (wild IR). More specifically, all low-quality images are simulated with a synthetic degradation pipeline that contains multiple common degradations such as blur, resize, noise, and JPEG compression. Then we introduce robust training for a degradation-aware CLIP model to extract enriched image content features to assist high-quality image restoration. Our base diffusion model is the image restoration SDE (IR-SDE). Built upon it, we further present a posterior sampling strategy for fast noise-free image generation. We evaluate our model on both synthetic and real-world degradation datasets. Moreover, experiments on the unified image restoration task illustrate that the proposed posterior sampling improves image generation quality for various degradations.
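To make the synthetic degradation step concrete, the following is a minimal sketch of a pipeline that chains the four degradations named in the abstract (blur, resize, noise, JPEG compression) using Pillow and NumPy. It is not the authors' implementation: the function name `degrade`, the order of operations, and every default parameter value are assumptions chosen for illustration only.

```python
import io

import numpy as np
from PIL import Image, ImageFilter


def degrade(img, rng, blur_radius=1.5, scale=0.5, noise_std=10.0, jpeg_quality=40):
    """Turn a clean PIL image into a low-quality one: blur, resize, noise, JPEG.

    All defaults are illustrative assumptions, not settings from the paper.
    """
    img = img.convert("RGB")
    w, h = img.size

    # 1) Blur with a Gaussian kernel.
    out = img.filter(ImageFilter.GaussianBlur(radius=blur_radius))

    # 2) Resize: downsample, then upsample back to the original resolution.
    small = (max(1, int(w * scale)), max(1, int(h * scale)))
    out = out.resize(small, Image.BICUBIC).resize((w, h), Image.BICUBIC)

    # 3) Additive Gaussian noise in pixel space, clipped to the valid range.
    arr = np.asarray(out, dtype=np.float32)
    arr = arr + rng.normal(0.0, noise_std, size=arr.shape)
    out = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

    # 4) JPEG compression via an in-memory encode/decode round trip.
    buf = io.BytesIO()
    out.save(buf, format="JPEG", quality=jpeg_quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


# Usage: degrade a random stand-in image (a real photo would be used in practice).
rng = np.random.default_rng(0)
clean = Image.fromarray(rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8))
low_quality = degrade(clean, rng)
```

This sketch uses fixed strengths for readability. A pipeline meant to cover in-the-wild degradations would more plausibly sample the strength of each step at random per image, which is left out here.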

Related Material


[bibtex]
@InProceedings{Luo_2024_CVPR,
    author    = {Luo, Ziwei and Gustafsson, Fredrik K. and Zhao, Zheng and Sj\"olund, Jens and Sch\"on, Thomas B.},
    title     = {Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {6641-6651}
}