Adaptive Human-Centric Video Compression for Humans and Machines

Wei Jiang, Hyomin Choi, Fabien Racapé; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023, pp. 1121-1129


We propose a novel framework to compress human-centric videos for both human viewing and machine analytics. Our system uses three coding branches to combine the power of generic face-prior learning with data-dependent detail recovery. The generic branch embeds faces into a discrete code space described by a learned high-quality (HQ) codebook, to reconstruct an HQ baseline face. The domain-adaptive branch adjusts reconstruction to fit the current data domain by adding domain-specific information through a supplementary codebook. The task-adaptive branch derives assistive details from a low-quality (LQ) input to help machine analytics on the restored face. Adaptive weights are introduced to balance the use of domain-adaptive and task-adaptive features in reconstruction, driving trade-offs among criteria including perceptual quality, fidelity, bitrate, and task accuracy. Moreover, the proposed online learning mechanism automatically adjusts the adaptive weights according to the actual compression needs. By sharing the main generic branch, our framework can extend to multiple data domains and multiple tasks more flexibly compared to conventional coding schemes. Our experiments demonstrate that at very low bitrates we can restore faces with high perceptual quality for human viewing while maintaining high recognition accuracy for machine use.

Related Material

[pdf] [supp]
@InProceedings{Jiang_2023_CVPR, author = {Jiang, Wei and Choi, Hyomin and Racap\'e, Fabien}, title = {Adaptive Human-Centric Video Compression for Humans and Machines}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops}, month = {June}, year = {2023}, pages = {1121-1129} }