Short-form UGC Video Quality Assessment Based on Multi-Level Video Fusion with Rank-Aware

Haoran Xu, Mengduo Yang, Jie Zhou, Jiaze Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6297-6306

Abstract


Short-form UGC video platforms such as Kwai and TikTok have undergone vigorous development. However, due to the variety of short-video types and their uneven quality, the workload of manual annotation is heavy. In this paper, from the perspective of data augmentation and multi-level fusion, a video is decomposed into three levels (frame level, segment level, and video level), and a new integrated framework is proposed to capture the spatial-temporal characteristics and relative rank information at different levels. It uses a spatial-temporal data augmentation strategy, multi-level feature fusion, an adaptive rank-aware loss, and a redistributed model ensemble across all levels. These components allow our method not only to capture features at each level but also to mitigate the difficulty of identifying the relative rank of two kinds of hard samples. Our framework achieved 5th place among all methods in the NTIRE 2024 Short-form UGC Video Quality Assessment Challenge. Extensive experiments show that our framework performs well not only on the KVQ dataset but also on other benchmark VQA datasets, demonstrating its generalization and superiority.
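Note: the abstract does not specify how the rank-aware objective is implemented. As an illustration only, the sketch below shows one common way to combine a regression term with a pairwise rank term for VQA, using PyTorch. The class name RankAwareLoss, the margin, and the loss weighting are assumptions for this example, not the authors' code.

    # Illustrative sketch only: combines an L1 regression term with a pairwise
    # margin ranking term so predictions preserve the relative quality order.
    # Module name, margin, and weighting are assumptions, not the paper's code.
    import torch
    import torch.nn as nn

    class RankAwareLoss(nn.Module):
        def __init__(self, margin: float = 0.05, rank_weight: float = 1.0):
            super().__init__()
            self.l1 = nn.L1Loss()                      # regression toward MOS labels
            self.rank = nn.MarginRankingLoss(margin)   # enforces relative ordering
            self.rank_weight = rank_weight

        def forward(self, pred: torch.Tensor, mos: torch.Tensor) -> torch.Tensor:
            # Regression term: predicted scores should match the ground-truth MOS.
            reg_loss = self.l1(pred, mos)

            # Rank term: for each pair (i, j) in the batch, the prediction of the
            # higher-MOS video should exceed the other by at least the margin.
            pred_i, pred_j = pred.unsqueeze(0), pred.unsqueeze(1)
            mos_i, mos_j = mos.unsqueeze(0), mos.unsqueeze(1)
            target = torch.sign(mos_i - mos_j)         # +1, 0, or -1 per pair
            mask = target != 0                          # skip equal-quality pairs
            rank_loss = self.rank(pred_i.expand_as(target)[mask],
                                  pred_j.expand_as(target)[mask],
                                  target[mask])

            return reg_loss + self.rank_weight * rank_loss

    if __name__ == "__main__":
        loss_fn = RankAwareLoss()
        pred = torch.rand(8)   # predicted quality scores for a batch of 8 videos
        mos = torch.rand(8)    # ground-truth mean opinion scores
        print(loss_fn(pred, mos).item())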

Related Material


[bibtex]
@InProceedings{Xu_2024_CVPR,
  author    = {Xu, Haoran and Yang, Mengduo and Zhou, Jie and Li, Jiaze},
  title     = {Short-form UGC Video Quality Assessment Based on Multi-Level Video Fusion with Rank-Aware},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month     = {June},
  year      = {2024},
  pages     = {6297-6306}
}