Single-objective and multi-objective optimization for variance counterbalancing in stochastic learning

DG Triantali, KE Parsopoulos, IE Lagaris - Applied Soft Computing, 2023 - Elsevier
Abstract
Artificial neural networks have proved useful in a host of demanding applications and are therefore becoming increasingly important in science and engineering. Large-scale problems pose a challenge for training neural networks with the stochastic gradient descent method and its variants, which rely on the random selection of mini-batches of training points at every iteration. The challenge lies in the mandatory use of diminishing step sizes, required to retain mild error fluctuations across the training set and thereby preserve the quality of the network's generalization capability. Variance counterbalancing was recently proposed as a remedy for the diminishing step sizes in neural network training with stochastic gradient methods. It is based on the concurrent minimization of the network's average mean squared error and of the error variance over random sets of mini-batches. It also promotes the use of advanced optimization algorithms in place of the slowly convergent gradient descent. The present work aims at enriching our understanding of the original variance counterbalancing approach, as well as reformulating it as a multi-objective problem by exploiting its bi-objective nature. Experimental analysis reveals the performance of the studied approaches and their competitive edge over the established Adam method.
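To make the two quantities in the abstract concrete, the following is a minimal sketch of the scalarized variance-counterbalancing objective: the mean of the per-mini-batch MSEs plus a weighted variance of those MSEs. It is an illustration inferred from the abstract only; the function names (vc_objective, batch_mse), the weight lambda_var, and the toy linear model are assumptions, not the authors' implementation, and in the paper's multi-objective reformulation the mean and variance would be treated as two separate objectives rather than summed.

```python
# Hedged sketch of the variance-counterbalancing (VC) idea from the abstract:
# jointly penalize the average mini-batch MSE and the variance of per-batch
# MSEs over a random set of mini-batches. All names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def batch_mse(w, X, y):
    """Mean squared error of a linear model y_hat = X @ w on one mini-batch."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))

def vc_objective(w, batches, lambda_var=1.0):
    """Single-objective (scalarized) VC form: mean of per-batch MSEs plus a
    weighted variance term. lambda_var is an assumed trade-off weight."""
    errors = np.array([batch_mse(w, X, y) for X, y in batches])
    return errors.mean() + lambda_var * errors.var()

# Toy data: a few random mini-batches from a synthetic regression task.
w_true = np.array([2.0, -1.0])
batches = []
for _ in range(8):
    X = rng.normal(size=(32, 2))
    y = X @ w_true + 0.1 * rng.normal(size=32)
    batches.append((X, y))

w = rng.normal(size=2)
print("VC objective at a random w:", vc_objective(w, batches))
```

Any general-purpose optimizer (rather than plain gradient descent) could then be applied to vc_objective over w, which is consistent with the abstract's remark that the approach promotes advanced optimization algorithms.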