Is Bigger Always Better? An Empirical Study on Efficient Architectures for Style Transfer and Beyond

Jie An, Tao Li, Haozhi Huang, Jinwen Ma, Jiebo Luo; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 4084-4094

Abstract


Network architecture plays a pivotal role in the performance of style transfer algorithms. Most existing algorithms use VGG19 as the feature extractor, which incurs a high computational cost. In this work, we conduct an empirical study of popular network architectures and find that several more efficient networks can replace VGG19 with comparable style transfer performance. Beyond that, we show that an efficient network can be further accelerated by removing its empty channels via a simple channel pruning method tweaked for style transfer. To prevent a potential performance drop from using a more lightweight network, and to obtain better style transfer results, we introduce a more accurate deep feature alignment strategy that improves existing style transfer modules. Taking GoogLeNet as an exemplary efficient network, the pruned GoogLeNet with the improved style transfer module is 2.3x to 107.4x faster than state-of-the-art approaches and achieves 68.03 FPS on 512 x 512 images. Extensive experiments demonstrate that VGG19 can be replaced by a more lightweight network with significantly improved efficiency and comparable style transfer quality.
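The abstract's channel pruning idea — dropping "empty" channels that carry negligible activation — can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's actual method: the function name `find_empty_channels`, the mean-absolute-activation criterion, and the threshold value are all hypothetical choices introduced here for illustration.

```python
import numpy as np

def find_empty_channels(features, threshold=1e-3):
    """Toy illustration of empty-channel detection (hypothetical helper,
    not the paper's algorithm). Given a feature map of shape (C, H, W),
    return indices of channels whose mean absolute activation falls
    below `threshold`; such channels contribute almost nothing to the
    extracted style statistics and are candidates for pruning."""
    per_channel = np.abs(features).mean(axis=(1, 2))  # one scalar per channel
    return np.where(per_channel < threshold)[0]

# Toy feature map: 4 channels, two of which are all zero.
feats = np.zeros((4, 8, 8))
feats[0] = 1.0
feats[2] = 0.5
print(find_empty_channels(feats))  # → [1 3]
```

In a real pipeline the statistics would be accumulated over many images before the corresponding filters are removed from the network, shrinking both compute and memory at inference time.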

Related Material


[bibtex]
@InProceedings{An_2023_WACV,
  author    = {An, Jie and Li, Tao and Huang, Haozhi and Ma, Jinwen and Luo, Jiebo},
  title     = {Is Bigger Always Better? An Empirical Study on Efficient Architectures for Style Transfer and Beyond},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2023},
  pages     = {4084-4094}
}