DGBD: Depth Guided Branched Diffusion for Comprehensive Controllability in Multi-View Generation

Hovhannes Margaryan, Daniil Hayrapetyan, Wenyan Cong, Zhangyang Wang, Humphrey Shi; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 747-756

Abstract


This paper presents an innovative approach to multi-view generation that offers comprehensive control over both perspective (viewpoint) and non-perspective attributes (such as depth maps). Our controllable dual-branch pipeline, named Depth Guided Branched Diffusion (DGBD), leverages depth maps and perspective information to generate images from alternative viewpoints while preserving shape and size fidelity. In the first DGBD branch, we fine-tune a pre-trained diffusion model on multi-view data, introducing a regularized batch-aware self-attention mechanism for multi-view consistency and generalization. Direct control over perspective is then achieved through cross-attention conditioned on camera position. Meanwhile, the second DGBD branch introduces non-perspective control using depth maps. Qualitative and quantitative experiments validate the effectiveness of our approach, which surpasses or matches the performance of state-of-the-art novel-view and multi-view synthesis methods.
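The abstract mentions a batch-aware self-attention mechanism for multi-view consistency. The paper's exact formulation (including its regularization) is not reproduced here, but the general idea behind batch-aware attention in multi-view diffusion models can be sketched as follows: tokens from all views of the same scene are folded into a single attention sequence, so attention spans views jointly rather than operating within each view independently. The function name, shapes, and single-head unprojected attention below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def batch_aware_self_attention(x, num_views):
    """Illustrative batch-aware self-attention (single head, no learned
    projections; assumption, not the paper's exact mechanism).

    x: token array of shape (batch, tokens, dim), where batch is
       num_scenes * num_views and consecutive entries are views of
       the same scene.
    """
    b, n, d = x.shape
    scenes = b // num_views
    # Fold the view axis into the token axis so attention sees all
    # views of a scene at once: (scenes, num_views * tokens, dim).
    xs = x.reshape(scenes, num_views * n, d)
    # Scaled dot-product attention over the joint multi-view sequence.
    attn = softmax(xs @ xs.transpose(0, 2, 1) / np.sqrt(d))
    out = attn @ xs
    # Restore the original per-view batch layout.
    return out.reshape(b, n, d)
```

In a diffusion U-Net, a layer like this would replace the per-image self-attention during fine-tuning, letting each view's tokens attend to every other view of the same scene and thereby encouraging consistent shape and appearance across viewpoints.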

Related Material


@InProceedings{Margaryan_2024_CVPR,
  author    = {Margaryan, Hovhannes and Hayrapetyan, Daniil and Cong, Wenyan and Wang, Zhangyang and Shi, Humphrey},
  title     = {DGBD: Depth Guided Branched Diffusion for Comprehensive Controllability in Multi-View Generation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2024},
  pages     = {747-756}
}