Hiding Video in Audio via Reversible Generative Models

Hyukryul Yang, Hao Ouyang, Vladlen Koltun, Qifeng Chen; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1100-1109

Abstract


We present a method for hiding video content inside audio files while preserving the perceptual fidelity of the cover audio. This is a form of cross-modal steganography and is particularly challenging due to the high bitrate of video. Our scheme uses recent advances in flow-based generative models, which enable mapping audio to latent codes such that nearby codes correspond to perceptually similar signals. We show that compressed video data can be concealed in the latent codes of audio sequences while preserving the fidelity of both the hidden video and the cover audio. We can embed 128x128 video inside same-duration audio, or higher-resolution video inside longer audio sequences. Quantitative experiments show that our approach outperforms relevant baselines in steganographic capacity and fidelity.
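The core idea of hiding data in the latent codes of an invertible model can be illustrated with a toy sketch. Nothing below is taken from the paper: the single additive coupling layer is only a stand-in for a trained flow-based audio model, the parity-quantization rule is one simple way to write bits into a latent code and is assumed here for illustration, and the names `encode`, `decode`, `hide`, `reveal`, and `STEP` are hypothetical.

```python
import math

STEP = 0.01  # hypothetical quantization step in latent space (assumption)


def _shift(half):
    # Fixed nonlinear function standing in for a learned network (assumption).
    return [0.5 * math.tanh(3.0 * v) for v in half]


def encode(x):
    """Toy additive coupling layer: audio samples -> latent code."""
    n = len(x) // 2
    a, b = x[:n], x[n:]
    return a + [bi + si for bi, si in zip(b, _shift(a))]


def decode(z):
    """Inverse of encode: latent code -> audio samples."""
    n = len(z) // 2
    a, c = z[:n], z[n:]
    return a + [ci - si for ci, si in zip(c, _shift(a))]


def hide(audio, bits):
    """Write one bit per latent value by forcing the parity of its quantization index."""
    assert len(audio) % 2 == 0 and len(bits) <= len(audio)
    z = encode(audio)
    for i, bit in enumerate(bits):
        q = round(z[i] / STEP)
        if q % 2 != bit:
            q += 1  # move to the neighbouring level with the required parity
        z[i] = q * STEP
    return decode(z)  # stego audio, close to the cover audio


def reveal(stego, n_bits):
    """Re-encode the stego audio and read the parities back."""
    z = encode(stego)
    return [round(z[i] / STEP) % 2 for i in range(n_bits)]


if __name__ == "__main__":
    cover = [math.sin(0.1 * t) for t in range(16)]
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    stego = hide(cover, payload)
    print(reveal(stego, len(payload)) == payload)
    print(max(abs(s - c) for s, c in zip(stego, cover)))
```

Because the map is invertible, the receiver recovers the same latent values from the stego audio (up to floating-point rounding) and can read the payload without access to the cover signal. In this toy, a smaller `STEP` means a smaller change to the audio but less tolerance to noise; the paper's actual model, payload coding, and capacity are not reproduced here.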

Related Material


[bibtex]
@InProceedings{Yang_2019_ICCV,
author = {Yang, Hyukryul and Ouyang, Hao and Koltun, Vladlen and Chen, Qifeng},
title = {Hiding Video in Audio via Reversible Generative Models},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019},
pages = {1100-1109}
}