Learning by Taking Notes: Memory-Guided Continual Learning for Generative Multimodal Models

Yanhui Guo, Chenghuan Guo, Yan Gao, Yi Sun; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 4400-4410

Abstract


In the era of large models, emerging Generative Vision-Language Models (VLMs) have exhibited impressive zero-shot learning capabilities on generative tasks by leveraging knowledge acquired through pre-training on large-scale datasets. However, for specific downstream tasks such as classification and detection, VLMs require either prompt engineering with carefully crafted task-specific instructions or fine-tuning to align with task objectives and suppress hallucinations. These issues are further exacerbated under continual learning (CL) settings. In gradient-free in-context learning, generalization to novel tasks relies heavily on prompt design, which may be suboptimal or unavailable at test time. In contrast, gradient-based sequential fine-tuning across tasks tends to intensify hallucination due to the well-known phenomenon of catastrophic forgetting, a fundamental challenge in CL paradigms. To address these challenges, we propose OME, an optimal memory transport framework that leverages generative VLMs for CL. To mitigate hallucinations and alleviate catastrophic forgetting, OME integrates memory prompts and employs a lightweight adapter network, while maintaining a memorandum module to store task-relevant meta-information. Empirical results demonstrate that OME consistently outperforms state-of-the-art approaches across a range of challenging CL benchmarks, including both few-shot and conventional class-incremental learning scenarios.

Related Material


@InProceedings{Guo_2025_ICCV,
    author    = {Guo, Yanhui and Guo, Chenghuan and Gao, Yan and Sun, Yi},
    title     = {Learning by Taking Notes: Memory-Guided Continual Learning for Generative Multimodal Models},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {4400-4410}
}