Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm

Liu, Zechun; Wu, Baoyuan; Luo, Wenhan; Yang, Xin; Liu, Wei; Cheng, Kwang-Ting

Zechun Liu, Baoyuan Wu, Wenhan Luo, Xin Yang, Wei Liu, Kwang-Ting Cheng; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 722-737

Abstract

In this work, we study 1-bit convolutional neural networks (CNNs), in which both the weights and the activations are binary. Although efficient, current 1-bit CNNs achieve much lower classification accuracy than their real-valued counterparts on large-scale datasets such as ImageNet. To shrink the performance gap between 1-bit and real-valued CNN models, we propose a novel model, dubbed Bi-Real net, which connects the real activations (after the 1-bit convolution and/or BatchNorm layer, before the sign function) to the activations of the consecutive block through an identity shortcut. Consequently, compared to the standard 1-bit CNN, the representational capability of the Bi-Real net is significantly enhanced, at only a negligible additional computational cost. Moreover, we develop a specific training algorithm for 1-bit CNNs that includes three technical novelties. First, we derive a tight approximation to the derivative of the non-differentiable sign function with respect to the activation. Second, we propose a magnitude-aware gradient with respect to the weight for updating the weight parameters. Last, we pre-train the real-valued CNN model with a clip function, rather than the ReLU function, to provide a better initialization for the Bi-Real net. Experiments on ImageNet show that the Bi-Real net with the proposed training algorithm achieves 56.4% and 62.2% top-1 accuracy with 18 layers and 34 layers, respectively, and achieves up to 23.9X memory saving and 17.0X computational reduction.
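
The shortcut idea and the sign-derivative approximation described in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' implementation. The abstract does not state the form of the approximation, so the piecewise-linear derivative below (the derivative of a piecewise-quadratic surrogate for sign) is an assumption, as are the clip range of [-1, 1], the 1-D toy convolution, the omission of BatchNorm, and every function name. The magnitude-aware weight gradient is not covered.

```python
def sign(x):
    """Binarize a real value to -1 or +1 (mapping 0 to +1 is an assumption)."""
    return 1.0 if x >= 0 else -1.0


def approx_sign_grad(x):
    """Assumed surrogate derivative of sign: a triangle peaking at 2 for x = 0
    and falling to 0 at |x| = 1 (derivative of a piecewise quadratic)."""
    if -1.0 <= x < 0.0:
        return 2.0 + 2.0 * x
    if 0.0 <= x < 1.0:
        return 2.0 - 2.0 * x
    return 0.0


def clip(x, lo=-1.0, hi=1.0):
    """Clip nonlinearity for real-valued pre-training (range is an assumption)."""
    return max(lo, min(hi, x))


def binary_conv1d(x_bits, w_bits):
    """Toy 1-D 'same' convolution of +/-1 inputs with a +/-1 kernel, zero padded."""
    pad = len(w_bits) // 2
    padded = [0.0] * pad + list(x_bits) + [0.0] * pad
    return [
        sum(w * padded[i + k] for k, w in enumerate(w_bits))
        for i in range(len(x_bits))
    ]


def bi_real_block(x_real, w_real):
    """One toy block: binarize the input and weights, convolve, then add the
    real-valued input back through an identity shortcut."""
    x_bits = [sign(v) for v in x_real]
    w_bits = [sign(w) for w in w_real]
    conv_out = binary_conv1d(x_bits, w_bits)
    # The shortcut carries the real activations past the sign function.
    return [c + v for c, v in zip(conv_out, x_real)]


if __name__ == "__main__":
    print(bi_real_block([0.5, -0.2, 1.5], [0.3, -0.7, 0.1]))
    print([approx_sign_grad(v) for v in (-0.5, 0.0, 0.5, 2.0)])
```

Without the shortcut, the block output would take only a few integer values determined by the binary convolution. Adding the real input back makes the output real-valued, which is one way to read the abstract's claim about enhanced representational capability at negligible extra cost.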

Related Material

[bibtex]

@InProceedings{Liu_2018_ECCV,
author = {Liu, Zechun and Wu, Baoyuan and Luo, Wenhan and Yang, Xin and Liu, Wei and Cheng, Kwang-Ting},
title = {Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}