HaLViT: Half of the Weights are Enough

Onur Can Koyun, Behçet Uğur Töreyin; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 3669-3678

Abstract


Deep learning architectures like Transformers and Convolutional Neural Networks (CNNs) have led to groundbreaking advances across numerous fields. However, their extensive need for parameters poses challenges for deployment in resource-constrained environments. In our research, we propose a strategy that exploits both the column and row spaces of weight matrices, significantly reducing the number of required model parameters without substantially affecting performance. This technique is applied to both Bottleneck and Attention layers, achieving a notable reduction in parameters with minimal impact on model efficacy. Our proposed model, HaLViT, exemplifies a parameter-efficient Vision Transformer. Through rigorous experiments on the ImageNet and COCO datasets, HaLViT's performance validates the effectiveness of our method, offering results comparable to those of conventional models.
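As a rough illustration of the idea stated in the abstract (this is a hypothetical sketch, not the authors' implementation): a single weight matrix W can back two linear projections, one through its row space (multiplying by W) and one through its column space (multiplying by W^T), so that two layers share one parameter set and the weight count is halved relative to two independent matrices.

```python
# Hypothetical sketch of the column/row-space weight-sharing idea described
# in the abstract. Class and method names here are illustrative, not from
# the paper.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def transpose(W):
    """Return the transpose of W."""
    return [list(col) for col in zip(*W)]

class SharedProjection:
    """Two d x d projections backed by ONE weight matrix.

    Storing a single W instead of two independent matrices halves the
    parameter count for this pair of projections.
    """
    def __init__(self, W):
        self.W = W

    def project_rows(self, x):
        # e.g. one projection uses W directly: y = W x
        return matvec(self.W, x)

    def project_cols(self, x):
        # the paired projection reuses the same parameters via W^T: y = W^T x
        return matvec(transpose(self.W), x)

W = [[1.0, 2.0],
     [3.0, 4.0]]
p = SharedProjection(W)
print(p.project_rows([1.0, 0.0]))  # [1.0, 3.0]
print(p.project_cols([1.0, 0.0]))  # [1.0, 2.0]
```

In a real attention layer this would correspond to, for example, the query and key projections sharing one matrix; the sketch only demonstrates the parameter-sharing mechanism itself.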

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Koyun_2024_CVPR,
    author    = {Koyun, Onur Can and T\"oreyin, Beh\c{c}et U\u{g}ur},
    title     = {HaLViT: Half of the Weights are Enough},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {3669-3678}
}