[pdf]
[supp]
[bibtex]
@InProceedings{Biswas_2025_WACV,
    author    = {Biswas, Koushik and Reza, Amit and Karri, Meghana and Jha, Debesh and Pan, Hongyi and Tomar, Nikhil and Subedi, Aliza and Regmi, Smriti and Bagci, Ulas},
    title     = {Optimizing Neural Network Effectiveness via Non-Monotonicity Refinement},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {4300-4309}
}
Optimizing Neural Network Effectiveness via Non-Monotonicity Refinement
Abstract
Activation functions play a crucial role in artificial neural networks by introducing the non-linearities that enable networks to learn complex patterns in data. An appropriate choice of activation function strongly influences the training dynamics of a neural network and can boost performance significantly. The Rectified Linear Unit (ReLU) and its variants, such as leaky ReLU and parametric ReLU, have emerged as the most popular activations because they enable faster training and generalization in deep neural networks, despite significant issues such as vanishing gradients. In this paper, we propose a family of smooth functions, which we call the AMSU family, that are smooth approximations of the maximum function. We derive three activations from the AMSU family, namely AMSU-1, AMSU-2, and AMSU-3, and show their effectiveness on different deep learning problems. By simply replacing the ReLU function, Top-1 accuracy improves by 5.88%, 5.96%, and 5.32% on the CIFAR100 dataset with the ShuffleNet V2 model. Replacing ReLU with AMSU-1, AMSU-2, and AMSU-3 also improves Top-1 accuracy by 8.50%, 8.29%, and 7.70% on the CIFAR100 dataset with the ShuffleNet V2 model under an FGSM attack. Furthermore, replacing ReLU with AMSU-1, AMSU-2, and AMSU-3 on ImageNet-1K yields a 3%-5% improvement on ShuffleNet and MobileNet models. The source code is publicly available at https://github.com/koushik313/AMSU.
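The abstract does not give the closed-form AMSU expressions, so the snippet below is only a minimal PyTorch sketch of the general idea it describes: a smooth approximation of the maximum function used as a drop-in replacement for ReLU. It relies on the identity max(x, 0) = (x + |x|)/2 with |x| replaced by the smooth surrogate x * erf(mu * x); the module name SmoothMaxActivation and the sharpness parameter mu are illustrative assumptions, not the paper's definitions.

import torch
import torch.nn as nn

class SmoothMaxActivation(nn.Module):
    """Illustrative smooth approximation of max(x, 0); NOT the exact AMSU formula.

    Uses max(x, 0) = (x + |x|) / 2 with |x| approximated by x * erf(mu * x).
    As mu grows large, the activation approaches ReLU.
    """

    def __init__(self, mu: float = 1.0, learnable: bool = True):
        super().__init__()
        mu_tensor = torch.tensor(float(mu))
        # A trainable sharpness parameter is a common design choice for smooth
        # units; whether the AMSU activations use one is an assumption here.
        self.mu = nn.Parameter(mu_tensor) if learnable else mu_tensor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        smooth_abs = x * torch.erf(self.mu * x)   # smooth surrogate for |x|
        return 0.5 * (x + smooth_abs)             # smooth surrogate for max(x, 0)

if __name__ == "__main__":
    act = SmoothMaxActivation(mu=2.0)
    x = torch.linspace(-3, 3, 7)
    print(act(x))  # close to ReLU(x), but smooth and non-monotonic near zero

In practice, such a module would replace nn.ReLU() layers inside an architecture like ShuffleNet V2 or MobileNet; the actual AMSU-1/2/3 definitions and their recommended settings are given in the paper and its released code.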
Related Material