@InProceedings{Wu_2026_CVPR,
    author    = {Wu, Aiqiu and Qiu, Zhaofan and Yao, Ting and Mei, Tao},
    title     = {PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {38218-38227}
}
PS-SR: Pseudo-Single-Step Video Super-Resolution via Speculative Diffusion
Abstract
Video Super-Resolution (VSR) faces a critical trade-off: single-step models offer unmatched efficiency but often lack the high-frequency detail, creativity, and visual quality of their multi-step diffusion counterparts, which in turn are computationally prohibitive for practical use. In this paper, we propose PS-SR, a novel "pseudo" single-step VSR framework that overcomes this trade-off through a computationally asymmetric sampling pipeline. The key to PS-SR is its speculative diffusion mechanism: a powerful base model performs only a single, comprehensive sampling step that establishes the global structure and content fidelity, after which a lightweight draft model, directly augmented by the base model's features, speculatively performs the subsequent refinements. Crucially, we further enforce a frequency-domain update rule that constrains these refinements to inject only high-frequency details, preserving the foundational low-frequency content and preventing semantic drift across sampling steps. In this way, PS-SR creates the "illusion" of a single-step model, delivering similar inference speed and input-output content consistency, while achieving the visual richness and creativity typically reserved for costly multi-step generative models. We demonstrate that our "pseudo-single-step" paradigm achieves state-of-the-art quality at a speed comparable to that of single-step models, paving the way for real-time, high-fidelity video enhancement. Please refer to our project page for more results: https://waq2001.github.io/PS-SR-page/.
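To make the frequency-domain update rule more concrete, the following is a minimal sketch of one way such a constraint could look for a single grayscale frame. It is not the authors' implementation: the function names, the hard radial mask, and the `cutoff` value are all assumptions chosen for illustration, and the abstract does not specify how the low- and high-frequency bands are actually separated.

```python
import numpy as np


def low_pass_mask(height, width, cutoff):
    """Boolean mask that is True where the normalized frequency radius <= cutoff.

    Hypothetical helper: a hard radial cutoff is an assumption, not the paper's rule.
    """
    fy = np.fft.fftfreq(height)[:, None]  # cycles per pixel, in [-0.5, 0.5)
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2) <= cutoff


def high_frequency_update(base, refined, cutoff=0.1):
    """Keep the low frequencies of `base`; take high frequencies from `refined`.

    `base` stands in for the base model's single-step output and `refined`
    for a draft-model refinement (both 2D real arrays of the same shape).
    """
    mask = low_pass_mask(base.shape[0], base.shape[1], cutoff)
    base_f = np.fft.fft2(base)
    refined_f = np.fft.fft2(refined)
    # Low band comes from the base output, high band from the refinement.
    merged_f = np.where(mask, base_f, refined_f)
    # The mask is symmetric in frequency, so the result is real up to rounding.
    return np.real(np.fft.ifft2(merged_f))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((16, 16))
    refined = base + 0.1 * rng.standard_normal((16, 16))
    out = high_frequency_update(base, refined)
    # The mean (zero-frequency term) is inherited from the base frame.
    print(np.isclose(out.mean(), base.mean()))
```

Under these assumptions, whatever the refinement changes below the cutoff is discarded, so the coarse content of the base output cannot drift; only the band above the cutoff is allowed to differ.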

