Prompt the Missing: Prompt-Based Robust Audio-Visual Classification under Uncertain Modalities

Eunju Park; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2025, pp. 1654-1662

Abstract


Audio-visual classification models typically assume the availability of all modalities at inference time. However, real-world conditions often result in missing or corrupted modalities due to noise, sensor failures, or transmission errors. To address this challenge, we propose Prompt the Missing, a lightweight and robust framework that leverages prompt learning to adaptively handle uncertain modality availability. Our method introduces learnable prompt tokens at both the input and attention levels, enabling dynamic adjustment to various degradation scenarios without modifying the backbone. We further employ a case-wise training strategy that simulates diverse missing-modality conditions, allowing the model to generalize effectively. Experiments on UrbanSound8K-AV and CIFAR10-AV show that our approach matches full fine-tuning performance under complete inputs and significantly outperforms existing baselines under missing-modality settings, achieving up to a +10.4% accuracy gain while reducing training time by 96% and memory usage by 82.3%. Our model also consistently surpasses parameter-efficient tuning methods such as LoRA and Adapter, with ablation studies confirming the effectiveness of our prompt design, fusion mechanisms, and prompt length choices. Notably, even under Concat-based evaluation, where degradation types are unknown, our method outperforms full fine-tuning, demonstrating strong generalization and deployment readiness. Code is available at https://github.com/pej0918/Prompt-The-Missing.
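The input-level prompting and case-wise training described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the case names, the choice to zero out a dropped modality, and every function name below are assumptions made for illustration, and the attention-level prompts mentioned in the abstract are not covered. Plain lists of floats stand in for token embeddings, and the prompt vectors stand in for parameters that would be learned while the backbone stays frozen.

```python
import random

# Hypothetical case set; the abstract does not enumerate the actual cases.
CASES = ("complete", "audio_missing", "visual_missing")


def make_prompts(cases, prompt_len, dim, seed=0):
    """Create one small bank of prompt vectors per case.

    In a real model these would be learnable parameters; here they are
    fixed random values used only to show the data flow.
    """
    rng = random.Random(seed)
    return {
        case: [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(prompt_len)]
        for case in cases
    }


def simulate_case(audio_tokens, visual_tokens, case):
    """Degrade the inputs for the given case (assumption: zero the dropped modality)."""
    if case == "audio_missing":
        audio_tokens = [[0.0] * len(tok) for tok in audio_tokens]
    elif case == "visual_missing":
        visual_tokens = [[0.0] * len(tok) for tok in visual_tokens]
    return audio_tokens, visual_tokens


def build_input(audio_tokens, visual_tokens, case, prompts):
    """Prepend the case-specific prompt tokens to the audio and visual tokens."""
    audio_tokens, visual_tokens = simulate_case(audio_tokens, visual_tokens, case)
    return prompts[case] + audio_tokens + visual_tokens


def sample_training_input(audio_tokens, visual_tokens, prompts, rng):
    """Case-wise training step: sample a case, then build the prompted input."""
    case = rng.choice(CASES)
    return case, build_input(audio_tokens, visual_tokens, case, prompts)
```

In this sketch only the prompt banks would receive gradient updates, which is one plausible reading of how a prompt-based method keeps training time and memory low relative to full fine-tuning.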

Related Material


[bibtex]
@InProceedings{Park_2025_CVPR,
    author    = {Park, Eunju},
    title     = {Prompt the Missing: Prompt-Based Robust Audio-Visual Classification under Uncertain Modalities},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {1654-1662}
}