MISF: Multi-Level Interactive Siamese Filtering for High-Fidelity Image Inpainting

Xiaoguang Li, Qing Guo, Di Lin, Ping Li, Wei Feng, Song Wang; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 1869-1878


Although achieving significant progress, existing deep generative inpainting methods still show low generalization across different scenes. As a result, the generated images usually contain artifacts or the filled pixels differ greatly from the ground truth, making them far from real-world applications. Image-level predictive filtering is a widely used restoration technique by predicting suitable kernels adaptively according to different input scenes. Inspired by this inherent advantage, we explore the possibility of addressing image inpainting as a filtering task. To this end, we first study the advantages and challenges of the image-level predictive filtering for inpainting: the method can preserve local structures and avoid artifacts but fails to fill large missing areas. Then, we propose the semantic filtering by conducting filtering on deep feature level, which fills the missing semantic information but fails to recover the details. To address the issues while adopting the respective advantages, we propose a novel filtering technique, i.e., Multi-level Interactive Siamese Filtering (MISF) containing two branches: kernel prediction branch (KPB) and semantic & image filtering branch (SIFB). These two branches are interactively linked: SIFB provides multi-level features for KPB while KPB predicts dynamic kernels for SIFB. As a result, the final method takes the advantage of effective semantic & image-level filling for high-fidelity inpainting. Moreover, we discuss the relationship between MISF and the naive encoder-decoder-based inpainting, inferring that MISF provides novel dynamic convolutional operations to enhance the high generalization capability across scenes. We validate our method on three challenging datasets, i.e., Dunhuang, Places2, and CelebA. Our method outperforms state-of-the-art baselines on four metrics, i.e., L1, PSNR, SSIM, and LPIPS.

Related Material

[pdf] [arXiv]
@InProceedings{Li_2022_CVPR, author = {Li, Xiaoguang and Guo, Qing and Lin, Di and Li, Ping and Feng, Wei and Wang, Song}, title = {MISF: Multi-Level Interactive Siamese Filtering for High-Fidelity Image Inpainting}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2022}, pages = {1869-1878} }