Resolving the Stability-Plasticity Dilemma in Reinforcement Learning via Complementary Continual Critics

Bo Sun, Peixi Peng, Guang Tan, Haoran Xu, Yaokun Li, Yiqian Chang, Shuaixian Wang, Luntong Li; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026, pp. 22348-22357

Abstract


This paper proposes the Continual Dual-Critic with Cross-Attention (CD-CCA) framework for visual reinforcement learning to address the plasticity-stability conflict. Our method introduces continual learning techniques into the visual RL architecture, constructing two complementary critics using Continual Backpropagation (CBP) and Elastic Weight Consolidation (EWC): one maintains representational plasticity for rapid environmental adaptation, and the other preserves knowledge stability to prevent catastrophic forgetting. Furthermore, we design a cross-attention-based fusion mechanism that balances the value estimates from the dual critics according to observation characteristics. Experimental results on the DeepMind Control and CARLA benchmarks show that CD-CCA effectively mitigates representation drift and policy degradation. Compared to existing visual RL methods, our approach exhibits enhanced robustness and adaptability in non-stationary environments and long-horizon decision-making tasks, providing a new architectural paradigm for continual reinforcement learning.
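
As a rough illustration of two ingredients named in the abstract, the sketch below shows the standard quadratic EWC penalty and a simple weighted blend of two critic estimates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `strength` parameter, and the use of two scalar scores passed through a softmax are hypothetical stand-ins, and the paper's cross-attention fusion and its CBP component are not reproduced here.

```python
import math


def ewc_penalty(params, anchor_params, fisher_diag, strength):
    """Standard EWC penalty: (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current parameter values
    anchor_params -- parameter values saved after the earlier task
    fisher_diag   -- diagonal Fisher information estimates (importance weights)
    """
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def fuse_values(q_plastic, q_stable, score_plastic, score_stable):
    """Blend two critic estimates with softmax weights over two scores.

    Assumption: the scores are plain numbers supplied by the caller. In the
    paper they would come from a cross-attention module conditioned on the
    observation; that module is not modeled here.
    """
    m = max(score_plastic, score_stable)  # subtract the max for numerical stability
    e_p = math.exp(score_plastic - m)
    e_s = math.exp(score_stable - m)
    w_p = e_p / (e_p + e_s)  # weight given to the plastic critic
    return w_p * q_plastic + (1.0 - w_p) * q_stable


# Toy usage with made-up numbers.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [2.0, 5.0], strength=1.0)
fused = fuse_values(q_plastic=1.0, q_stable=3.0, score_plastic=0.0, score_stable=0.0)
```

With equal scores the blend reduces to the plain average of the two estimates; a much larger score for one critic pushes the fused value toward that critic's estimate.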

Related Material


[bibtex]
@InProceedings{Sun_2026_CVPR,
    author    = {Sun, Bo and Peng, Peixi and Tan, Guang and Xu, Haoran and Li, Yaokun and Chang, Yiqian and Wang, Shuaixian and Li, Luntong},
    title     = {Resolving the Stability-Plasticity Dilemma in Reinforcement Learning via Complementary Continual Critics},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {22348-22357}
}