Momentum Contrastive Pruning
Momentum Contrast (MoCo) for unsupervised visual representation learning achieves performance close to that of supervised learning, but it sometimes carries excess parameters. Extracting a subnetwork from an over-parameterized unsupervised network without sacrificing performance is therefore of particular interest for accelerating inference. Typical pruning methods are not directly applicable to MoCo because, in the fine-tuning stage after pruning, the slow update of the momentum encoder undermines the pretrained encoder. In this paper, we propose Momentum Contrastive Pruning (MCP), which instead prunes the momentum encoder to obtain a momentum subnet. MCP maintains an unpruned momentum encoder as a smooth transition scheme to alleviate the representation gap between the encoder and the momentum subnet. To meet the sparsity requirements of the encoder, the alternating direction method of multipliers (ADMM) is adopted. Experiments show that MCP obtains a momentum subnet that performs almost as well as the over-parameterized MoCo when transferred to downstream tasks, while having far fewer parameters and floating-point operations (FLOPs).
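For readers unfamiliar with the two mechanisms named above, the following is a minimal sketch, not the implementation used in this paper. It assumes the standard MoCo momentum (exponential moving average) update and the common ADMM formulation of pruning, in which an auxiliary variable is projected onto a cardinality constraint. The function names, the flat-list weight representation, and the `keep` parameter are hypothetical and chosen only for illustration.

```python
def momentum_update(key_params, query_params, m=0.999):
    """MoCo-style EMA update: key <- m * key + (1 - m) * query."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def project_to_sparsity(weights, keep):
    """Euclidean projection onto vectors with at most `keep` nonzeros.

    Keeps the `keep` largest-magnitude entries and zeroes the rest.
    """
    if keep <= 0:
        return [0.0] * len(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def admm_auxiliary_step(weights, dual, keep):
    """One ADMM auxiliary (Z) and scaled-dual (U) update for a sparsity constraint.

    Z <- projection of (W + U) onto the sparse set; U <- U + W - Z.
    The W-update (gradient steps on the training loss plus the quadratic
    penalty (rho / 2) * ||W - Z + U||^2) is assumed to happen elsewhere.
    """
    shifted = [w + u for w, u in zip(weights, dual)]
    z = project_to_sparsity(shifted, keep)
    new_dual = [u + w - zi for u, w, zi in zip(dual, weights, z)]
    return z, new_dual
```

In such a scheme, the weights are trained with the quadratic penalty pulling them toward the sparse auxiliary variable, and the small entries are removed once the two have converged; how MCP schedules these steps relative to the momentum update is not specified here.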