Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap

Bowen Qu, Xiaoyu Liang, Shangkun Sun, Wei Gao; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6652-6660

Abstract


Recent advancements in Text-to-Video Artificial Intelligence Generated Content (AIGC) have been remarkable. Compared with traditional videos, the assessment of AIGC videos encounters various challenges: visual inconsistencies that defy common sense, discrepancies between the content and the textual prompt, and the distribution gap between various generative models, among others. Targeting these challenges, in this work we categorize the assessment of AIGC video quality into three dimensions: visual harmony, video-text consistency, and domain distribution gap. For each dimension, we design specific modules to provide a comprehensive quality assessment of AIGC videos. Furthermore, our research identifies significant variations in visual quality, fluidity, and style among videos generated by different text-to-video models. Predicting the source generative model can make the AIGC video features more discriminative, which enhances the quality assessment performance. The proposed method was used by the third-place winner of the NTIRE 2024 Quality Assessment for AI-Generated Content - Track 2 Video, demonstrating its effectiveness.

Related Material


@InProceedings{Qu_2024_CVPR,
    author    = {Qu, Bowen and Liang, Xiaoyu and Sun, Shangkun and Gao, Wei},
    title     = {Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {6652-6660}
}