Training Dynamics Aware Neural Network Optimization with Stabilization

Zilin Fang, Mohamad Shahbazi, Thomas Probst, Danda Pani Paudel, Luc Van Gool; Proceedings of the Asian Conference on Computer Vision (ACCV), 2022, pp. 4276-4292


We investigate the process of neural network training using gradient-descent-based optimizers from a dynamical-systems point of view. To this end, we model the iterative parameter updates as a time-discrete switched linear system and analyze its stability behavior over the course of training. Accordingly, we develop a regularization scheme that encourages stable training dynamics by penalizing divergent parameter updates. Our experiments show promising stabilization and convergence effects on regression tasks, density-based crowd counting, and generative adversarial networks (GANs). Our results indicate that stable network training reduces the variance of performance across different parameter initializations and increases robustness to the choice of learning rate. In the GAN setup in particular, the stability regularization enables faster convergence and lower FID, with more consistency across runs. Our source code is publicly available.
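To make the idea concrete, the following is a minimal illustrative sketch (not the authors' exact formulation) of the abstract's core mechanism: gradient descent on a quadratic loss 0.5 * x^T H x can be written as the linear system x_{t+1} = (I - lr*H) x_t, which diverges when the spectral radius of I - lr*H exceeds 1. A hypothetical stability penalty damps any proposed update whose norm-growth ratio exceeds 1; the function names, the penalty form, and the constants are all assumptions for illustration.

```python
import numpy as np

def loss_grad(x, H):
    """Gradient of the quadratic loss 0.5 * x^T H x."""
    return H @ x

def stability_correction(x, x_next, lam):
    """Hypothetical stabilizer: if the proposed step grows the iterate
    norm (ratio r = ||x_next|| / ||x|| > 1, i.e. a divergent update),
    pull the update back toward the previous iterate."""
    r = np.linalg.norm(x_next) / (np.linalg.norm(x) + 1e-12)
    if r <= 1.0:
        return np.zeros_like(x)
    return lam * (x_next - x)

def train(H, x0, lr, steps=50, lam=0.5):
    """Gradient descent viewed as x_{t+1} = A_t x_t with A_t = I - lr*H;
    training is stable when the spectral radius of A_t stays <= 1."""
    x = x0.copy()
    for _ in range(steps):
        x_prop = x - lr * loss_grad(x, H)                 # proposed update
        x = x_prop - stability_correction(x, x_prop, lam)  # damped update
    return x

# lr = 1.2 with eigenvalue 2 gives |1 - lr*2| = 1.4 > 1: plain gradient
# descent diverges, while the penalized updates remain stable.
H = np.diag([1.0, 2.0])
x0 = np.array([1.0, 1.0])
x_reg = train(H, x0, lr=1.2, lam=0.5)    # stabilized run
x_unreg = train(H, x0, lr=1.2, lam=0.0)  # unregularized run (diverges)
print(np.linalg.norm(x_reg), np.linalg.norm(x_unreg))
```

The same contrast motivates the paper's observation that stabilized training is more robust to the choice of learning rate: here the penalty rescues a step size that is otherwise outside the convergent regime.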

Related Material

@InProceedings{Fang_2022_ACCV,
    author    = {Fang, Zilin and Shahbazi, Mohamad and Probst, Thomas and Paudel, Danda Pani and Van Gool, Luc},
    title     = {Training Dynamics Aware Neural Network Optimization with Stabilization},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {December},
    year      = {2022},
    pages     = {4276-4292}
}