Beyond Gradient Descent for Regularized Segmentation Losses

Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 10187-10196

Abstract


The simplicity of gradient descent (GD) has made it the default method for training ever deeper and more complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In the context of weakly-supervised CNN segmentation, we demonstrate a well-motivated loss function where an alternative optimizer (ADM) achieves the state-of-the-art while GD performs poorly. Interestingly, GD obtains its best result for a "smoother" tuning of the loss function. The results are consistent across different network architectures. Our loss is motivated by well-understood MRF/CRF regularization models in "shallow" segmentation and their known global solvers. Our work suggests that network design/training should pay more attention to optimization methods.
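For intuition, a hedged sketch of the kind of regularized weakly-supervised segmentation loss and ADM-style splitting the abstract alludes to. The notation (softmax outputs s(theta), seed set Omega_L, cross-entropy H, relaxed CRF/Potts regularizer R, weight lambda) and the choice of sub-solvers are illustrative assumptions, not details stated in the abstract.

% Regularized loss over network predictions s(\theta) on pixels \Omega,
% with partial (seed) supervision only on \Omega_L \subset \Omega (assumed form):
\min_{\theta} \; \sum_{p \in \Omega_L} H(y_p, s_p(\theta)) \;+\; \lambda \, R\big(s(\theta)\big)

% Plain gradient descent back-propagates through both terms jointly.
% An ADM-style splitting introduces a latent discrete labeling x and alternates:
x^{t} \;\in\; \arg\min_{x} \; R(x) + \sum_{p \in \Omega} H(x_p, s_p(\theta^{t-1}))
\quad \text{(discrete sub-problem; e.g. a graph-cut style "shallow" solver)}
\theta^{t} \;\in\; \arg\min_{\theta} \; \sum_{p \in \Omega} H(x^{t}_p, s_p(\theta))
\quad \text{(standard back-propagation toward the latent targets } x^{t}\text{)}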

Related Material


BibTeX:
@InProceedings{Marin_2019_CVPR,
author = {Marin, Dmitrii and Tang, Meng and Ayed, Ismail Ben and Boykov, Yuri},
title = {Beyond Gradient Descent for Regularized Segmentation Losses},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}