PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description

Ziqi Cai, Shuchen Weng, Yifei Xia, Boxin Shi; Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025, pp. 7867-7876

Abstract


Achieving joint control over material properties, lighting, and high-level semantics in images is essential for applications in digital media, advertising, and interactive design. Existing methods often isolate these properties, lacking a cohesive approach to manipulating materials, lighting, and semantics simultaneously. We introduce PhyS-EdiT, a novel diffusion-based model that enables precise control over four critical material properties (roughness, metallicity, albedo, and transparency) while integrating lighting and semantic adjustments within a single framework. To facilitate this disentangled control, we present PR-TIPS, a large and diverse synthetic dataset designed to improve the disentanglement of material and lighting effects. PhyS-EdiT incorporates a dual-network architecture and robust training strategies to balance low-level physical realism with high-level semantic coherence, supporting localized and continuous property adjustments. Extensive experiments demonstrate the superiority of PhyS-EdiT in editing both synthetic and real-world images, achieving state-of-the-art performance on material, lighting, and semantic editing tasks.

Related Material


[bibtex]
@InProceedings{Cai_2025_CVPR,
    author    = {Cai, Ziqi and Weng, Shuchen and Xia, Yifei and Shi, Boxin},
    title     = {PhyS-EdiT: Physics-aware Semantic Image Editing with Text Description},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {7867-7876}
}