Deep Cropping via Attention Box Prediction and Aesthetics Assessment

Wenguan Wang, Jianbing Shen; Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2186-2194

Abstract


We model the photo cropping problem as a cascade of attention box regression and aesthetic quality classification, based on deep learning. A neural network is designed with two branches, one for predicting an attention bounding box and one for analyzing aesthetics. The predicted attention box is treated as an initial crop window, and a set of cropping candidates is generated around it so that no important information is lost. Aesthetics assessment is then employed to select the final crop as the candidate with the best aesthetic quality. With our network, cropping candidates share features within full-image convolutional feature maps, which avoids repeated feature computation and leads to higher computational efficiency. By leveraging rich data for attention prediction and aesthetics assessment, the proposed method produces high-quality cropping results even with the limited availability of training data for photo cropping. Experiments demonstrate competitive results and fast processing speed (5 fps with all steps).
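The cascade described above (an initial attention box, candidates generated around it, then selection by aesthetic score) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the margin values, and the toy aspect-ratio scorer are all assumptions made for illustration. In the paper both the attention box and the aesthetic score come from the two branches of a trained network.

```python
from typing import Callable, List, Tuple

# (x0, y0, x1, y1) in pixels; hypothetical box convention for this sketch.
Box = Tuple[int, int, int, int]


def generate_candidates(attention_box: Box,
                        image_size: Tuple[int, int],
                        margins: Tuple[int, ...] = (0, 20, 40)) -> List[Box]:
    """Grow the attention box outward by every combination of margins.

    Candidates only expand the box (clipped to the image bounds), so each
    one still contains the whole attention box. The margin values are
    illustrative assumptions, not taken from the paper.
    """
    width, height = image_size
    x0, y0, x1, y1 = attention_box
    candidates: List[Box] = []
    for left in margins:
        for top in margins:
            for right in margins:
                for bottom in margins:
                    box = (max(0, x0 - left), max(0, y0 - top),
                           min(width, x1 + right), min(height, y1 + bottom))
                    if box not in candidates:  # clipping can create duplicates
                        candidates.append(box)
    return candidates


def select_crop(attention_box: Box,
                image_size: Tuple[int, int],
                aesthetic_score: Callable[[Box], float]) -> Box:
    """Return the candidate crop with the highest aesthetic score."""
    candidates = generate_candidates(attention_box, image_size)
    return max(candidates, key=aesthetic_score)


def toy_score(box: Box) -> float:
    """Stand-in scorer that prefers a 4:3 aspect ratio.

    A placeholder for the learned aesthetics branch, used only so the
    sketch runs end to end.
    """
    x0, y0, x1, y1 = box
    return -abs((x1 - x0) / (y1 - y0) - 4 / 3)


if __name__ == "__main__":
    crop = select_crop((100, 80, 300, 200), (400, 300), toy_score)
    print(crop)
```

In this sketch each candidate is scored independently from its coordinates. The efficiency gain described in the abstract comes from a different arrangement, in which features for every candidate are read from one shared full-image feature map; that part is not modeled here.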

Related Material


[bibtex]
@InProceedings{Wang_2017_ICCV,
author = {Wang, Wenguan and Shen, Jianbing},
title = {Deep Cropping via Attention Box Prediction and Aesthetics Assessment},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}