Learned Image Compression With Super-Resolution Residual Modules and DISTS Optimization
Neural network-based image compressors can be optimized for various perceptual image quality metrics. We propose an improved method based on selective-detail decoding, which uses two decoders (a main decoder and a selective-detail decoder) optimized for different image quality metrics and applies the output of the more suitable decoder to each part of an image. The proposed method introduces three improvements. (1) Inspired by the super-resolution task, we add a super-resolution residual module to the main decoder, which is trained to up-sample an image to a resolution beyond that of the source image, aiming to output a visually clearer image. (2) To improve the perceptual image quality of the main decoder, we use an image quality metric based on Deep Image Structure and Texture Similarity (DISTS), whose similarity judgments are close to human perception with respect to texture. (3) To improve the mask accuracy for decoder selection, we use a cross-entropy loss between the predicted masks and the ground-truth masks. We also use a weighted mean squared error to improve the visual quality of the text regions of an image.
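
As a rough illustration of the selective-detail decoding idea described above, the following NumPy sketch shows per-pixel selection between two decoder outputs, a binary cross-entropy between predicted and ground-truth masks, and a weighted mean squared error. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`blend_decoders`, `mask_bce`, `weighted_mse`), the convention that a mask value of 1 selects the selective-detail decoder, the binary form of the cross-entropy, and the per-pixel weight map are all hypothetical choices made for illustration.

```python
import numpy as np


def blend_decoders(main_out, detail_out, mask):
    """Pick one decoder output per pixel.

    Assumption: mask == 1 selects the selective-detail decoder and
    mask == 0 selects the main decoder. Images are (H, W, C), mask is (H, W).
    """
    m = mask[..., None]  # add a channel axis so the mask broadcasts over C
    return m * detail_out + (1.0 - m) * main_out


def mask_bce(pred_prob, gt_mask, eps=1e-7):
    """Mean binary cross-entropy between predicted mask probabilities
    and a ground-truth mask with values in {0, 1}."""
    p = np.clip(pred_prob, eps, 1.0 - eps)  # avoid log(0)
    loss = -(gt_mask * np.log(p) + (1.0 - gt_mask) * np.log(1.0 - p))
    return float(np.mean(loss))


def weighted_mse(recon, target, weight):
    """Mean squared error with a per-pixel weight map of shape (H, W).

    Assumption: larger weights mark regions (for example, text) whose
    reconstruction errors should count more.
    """
    w = weight[..., None]
    squared_error = (recon - target) ** 2
    return float(np.sum(w * squared_error) / (np.sum(w) * recon.shape[-1]))


# Toy usage on a 2x2 RGB image.
main_out = np.zeros((2, 2, 3))
detail_out = np.ones((2, 2, 3))
gt_mask = np.array([[1.0, 0.0], [0.0, 1.0]])
blended = blend_decoders(main_out, detail_out, gt_mask)
```

In this toy example, `blended` takes its values from `detail_out` on the diagonal pixels and from `main_out` elsewhere. In a trained system the mask would be predicted by the network rather than given, which is why a loss on the mask itself is useful.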