Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models

Bin Fu, Fanghua Yu, Anran Liu, Zixuan Wang, Jie Wen, Junjun He, Yu Qiao; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 6892-6901

Abstract


Few-shot font generation (FFG) produces stylized font images from a limited number of reference samples, which can significantly reduce labor costs in manual font design. Most existing FFG methods follow the style-content disentanglement paradigm and employ a Generative Adversarial Network (GAN) to generate target fonts by combining the decoupled content and style representations. These methods generate the complicated structure and the detailed style simultaneously, which may be sub-optimal for the FFG task. Inspired by the manual font design process of expert designers, in this paper we model font generation as a multi-stage generative process. Specifically, as the injected noise and the data distribution in diffusion models can be well separated into different sub-spaces, we are able to incorporate the font transfer process into these models. Based on this observation, we generalize diffusion methods to model the font generative process by separating the reverse diffusion process into three stages with different functions. The structure construction stage first generates the structure information for the target character based on the source image, and the font transfer stage subsequently transforms the source font to the target font. Finally, the font refinement stage enhances the appearance and local details of the target font images. Based on this multi-stage generative process, we construct our font generation framework, named MSD-Font, with a dual-network approach to generate font images. The superior performance demonstrates the effectiveness of our model. The code is available at: https://github.com/fubinfb/MSD-Font.
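
As a rough illustration of the three-stage split of the reverse diffusion process described above, the following minimal Python sketch assigns each reverse timestep to a stage. It is not taken from the MSD-Font code: the function names, the stage labels, and the boundary parameters `transfer_start` and `transfer_end` are hypothetical, and the abstract does not state where the stage boundaries lie or how they are chosen.

```python
from typing import List

# Hypothetical stage labels, named after the three stages in the abstract.
STRUCTURE = "structure_construction"
TRANSFER = "font_transfer"
REFINEMENT = "font_refinement"


def stage_for_timestep(t: int, total_steps: int,
                       transfer_start: int, transfer_end: int) -> str:
    """Map a reverse-diffusion timestep to one of three stages.

    Timesteps run from total_steps - 1 (most noise) down to 0 (clean image).
    The boundaries are illustrative assumptions, not values from the paper:
      t >= transfer_start                -> structure construction
      transfer_end <= t < transfer_start -> font transfer
      t < transfer_end                   -> font refinement
    """
    if not 0 < transfer_end < transfer_start < total_steps:
        raise ValueError("need 0 < transfer_end < transfer_start < total_steps")
    if not 0 <= t < total_steps:
        raise ValueError("t must lie in [0, total_steps)")
    if t >= transfer_start:
        return STRUCTURE
    if t >= transfer_end:
        return TRANSFER
    return REFINEMENT


def stage_schedule(total_steps: int, transfer_start: int,
                   transfer_end: int) -> List[str]:
    """List the stage of every step, in the order the reverse process visits them."""
    return [
        stage_for_timestep(t, total_steps, transfer_start, transfer_end)
        for t in range(total_steps - 1, -1, -1)
    ]


if __name__ == "__main__":
    # Toy example: 10 steps, with assumed boundaries at t = 7 and t = 3.
    for step, stage in zip(range(9, -1, -1), stage_schedule(10, 7, 3)):
        print(step, stage)
```

In a real sampler, each stage would call a different denoising function or network at its timesteps; the sketch only shows the ordering of the stages, with structure construction at the noisiest steps and refinement at the final ones.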

Related Material


@InProceedings{Fu_2024_CVPR,
    author    = {Fu, Bin and Yu, Fanghua and Liu, Anran and Wang, Zixuan and Wen, Jie and He, Junjun and Qiao, Yu},
    title     = {Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {6892-6901}
}