Plastic and Stable Gated Classifiers for Continual Learning
Conventional neural networks tend to be high in plasticity but low in stability. As a result, catastrophic forgetting tends to occur when multiple tasks are trained sequentially, and the backbone learner loses its ability to solve previously learnt tasks. Several studies have shown that catastrophic forgetting can be partially mitigated by freezing the feature extractor weights and sequentially training only the classifier network. Although these methods are effective at retaining knowledge, forgetting can still become severe if an over-parameterised classifier network is trained over many tasks. As a remedy, this paper presents a novel classifier design with high stability: Highway-Connection Classifier Networks (HCNs), which leverage gated units to alleviate forgetting. When employed alone, HCNs exhibit strong robustness against forgetting. In addition, they synergise well with many popular existing continual learning archetypes.
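To make the notion of a gated unit concrete, the sketch below implements a generic highway layer in plain Python. It is an illustration only and should not be read as the HCN architecture itself, whose details are not given here. It assumes the commonly used highway formulation y = T(x) * H(x) + (1 - T(x)) * x, with a sigmoid transform gate T and a tanh transform H. The names `HighwayLayer`, `affine`, and `gate_bias` are hypothetical and chosen for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    # Computes W x + b, where weights is a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HighwayLayer:
    """Generic highway layer (assumed formulation, not the paper's HCN)."""

    def __init__(self, dim, gate_bias=-2.0, seed=0):
        rng = random.Random(seed)

        def init_matrix():
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                    for _ in range(dim)]

        self.w_h = init_matrix()      # transform weights
        self.b_h = [0.0] * dim        # transform bias
        self.w_t = init_matrix()      # gate weights
        # A negative gate bias pushes T(x) towards 0, so the layer
        # starts out mostly carrying its input through unchanged.
        self.b_t = [gate_bias] * dim

    def forward(self, x):
        h = [math.tanh(v) for v in affine(self.w_h, self.b_h, x)]
        t = [sigmoid(v) for v in affine(self.w_t, self.b_t, x)]
        # Gated mix of the transformed signal and the carried input.
        return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

With a strongly negative `gate_bias` the gate stays nearly closed and the layer behaves almost like the identity map, whereas a strongly positive value makes it behave like an ordinary tanh layer. Under the assumptions above, this carry path is one plausible intuition for why gating could favour stability: the input passes through largely unchanged wherever the gate stays closed.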