SyntheticManga: Training-Free Manga Generation with Phased Diffusion

Xuelei Peng, Chi-Keung Tang, Yu-Wing Tai; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings, 2026, pp. 4410-4418

Abstract


Synthesizing visually consistent characters across sequential frames is a fundamental yet largely unsolved challenge in manga generation, where practitioners must navigate a critical trade-off between preserving character identity and faithfully adhering to textual prompts. We introduce SyntheticManga, a training-free framework that reconciles this tension through a principled, phased control strategy over the diffusion sampling trajectory. In the high-noise phase, we propose Boltzmann Fourier Guidance (BFG) -- to our knowledge, the first application of Boltzmann distribution principles to the character-consistency problem -- which constructs a probabilistic fusion mask derived from spectral feature drift to adaptively inject structural information from a reference image. In the subsequent mid-noise phase, our Adaptive Drift Modulator (ADM) leverages classical PID control theory to continuously minimize the L1 drift between noise predictions, thereby enabling fine-grained identity correction. Extensive experiments on the ConsiStory+ benchmark demonstrate that SyntheticManga achieves state-of-the-art performance, attaining a superior balance between identity consistency and prompt fidelity compared to existing methods.
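The two phases described above can be illustrated with a small, self-contained sketch. This is not the authors' implementation; the abstract does not give the formulas, so every detail here is an assumption made for illustration: drift values are treated as Boltzmann "energies" (so low drift yields a high fusion weight), the mask is applied as a per-element linear blend, and the PID output is clamped to [0, 1] and used as a blend gain toward the reference noise prediction. All names (`boltzmann_mask`, `fuse`, `l1_drift`, `PID`, `correct`) and the gain values are hypothetical.

```python
import math
from dataclasses import dataclass


def boltzmann_mask(drift, temperature=1.0):
    """Map per-element drift 'energies' to fusion weights in (0, 1].

    Assumption: weight = exp(-(E - E_min) / T), an unnormalized Boltzmann
    factor, so the lowest-drift element gets weight 1.
    """
    e_min = min(drift)
    return [math.exp(-(e - e_min) / temperature) for e in drift]


def fuse(current, reference, mask):
    """Blend reference values into the current ones, element by element."""
    return [(1.0 - m) * c + m * r for c, r, m in zip(current, reference, mask)]


def l1_drift(a, b):
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative only)."""
    kp: float = 0.5
    ki: float = 0.05
    kd: float = 0.1
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error):
        self.integral += error
        # On the first call prev_error is 0, so the derivative equals the error.
        derivative = error - self.prev_error
        self.prev_error = error
        u = self.kp * error + self.ki * self.integral + self.kd * derivative
        return min(max(u, 0.0), 1.0)  # clamp so it can act as a blend gain


def correct(eps_current, eps_reference, pid):
    """Pull the current noise prediction toward the reference by the PID gain."""
    gain = pid.step(l1_drift(eps_current, eps_reference))
    return [c + gain * (r - c) for c, r in zip(eps_current, eps_reference)]


if __name__ == "__main__":
    mask = boltzmann_mask([0.0, 1.0, 2.0], temperature=1.0)
    print("mask:", mask)
    print("fused:", fuse([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], mask))

    pid = PID()
    eps, ref = [1.0, -1.0], [0.0, 0.0]
    for _ in range(3):
        eps = correct(eps, ref, pid)
        print("drift:", l1_drift(eps, ref))
```

In this toy setting the temperature controls how sharply the mask separates low-drift from high-drift elements, and the PID loop shrinks the L1 drift at each call. How the actual method defines spectral drift, chooses the mask direction, or applies the control signal is not specified in the abstract.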

Related Material


@InProceedings{Peng_2026_CVPR,
    author    = {Peng, Xuelei and Tang, Chi-Keung and Tai, Yu-Wing},
    title     = {SyntheticManga: Training-Free Manga Generation with Phased Diffusion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
    month     = {June},
    year      = {2026},
    pages     = {4410-4418}
}