Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights

C Fiscko, A Agarwal, Y Ruan, S Kar, L Pileggi… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a stochastic first-order optimization method specialized for deep neural networks
(DNNs), ECCO-DNN. This method models the optimization variable trajectory as a …

Lookahead optimizer: k steps forward, 1 step back

M Zhang, J Lucas, J Ba… - Advances in neural …, 2019 - proceedings.neurips.cc
The vast majority of successful deep neural networks are trained using variants of stochastic
gradient descent (SGD) algorithms. Recent attempts to improve SGD can be broadly …
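
The "k steps forward, 1 step back" in the title is the whole algorithm: an inner optimizer takes k fast steps, after which the slow weights are pulled part of the way toward where the fast weights ended up. A minimal NumPy sketch of that two-loop structure, assuming plain SGD as the inner optimizer and illustrative values for k and the interpolation factor alpha (not the paper's tuned configuration):

```python
import numpy as np

def lookahead_sgd(grad_fn, w0, k=5, alpha=0.5, inner_lr=0.1, outer_steps=100):
    """Lookahead-style loop: k fast SGD steps, then one slow interpolation step."""
    slow = np.asarray(w0, dtype=float)
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k):                 # k steps forward with the inner optimizer
            fast -= inner_lr * grad_fn(fast)
        slow += alpha * (fast - slow)      # 1 step back: interpolate toward the fast weights
    return slow

# Toy usage: minimize f(w) = 0.5 * ||w||^2, whose gradient is w.
print(lookahead_sgd(lambda w: w, w0=np.ones(3)))  # ends up close to the minimizer at the origin
```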

Training aware sigmoidal optimizer

D Macêdo, P Dreyer, T Ludermir… - arXiv preprint arXiv …, 2021 - arxiv.org
Proper optimization of deep neural networks is an open research question since an optimal
procedure to change the learning rate throughout training is still unknown. Manually defining …
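
The snippet's point is that hand-defined learning-rate changes are hard to get right; a sigmoid-shaped decay tied to training progress is one automatic alternative. A hedged sketch of such a schedule, where the initial/final rates and the steepness constant are illustrative assumptions rather than the paper's settings:

```python
import math

def sigmoidal_lr(step, total_steps, lr_max=0.1, lr_min=1e-3, steepness=10.0):
    """Sigmoid-shaped decay: stays near lr_max early in training, drops toward lr_min late."""
    progress = step / total_steps                               # fraction of training completed
    gate = 1.0 / (1.0 + math.exp(steepness * (progress - 0.5)))
    return lr_min + (lr_max - lr_min) * gate

# Example: the schedule at a few points of a 1000-step run.
for s in (0, 250, 500, 750, 1000):
    print(s, round(sigmoidal_lr(s, 1000), 5))
```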

AdaFisher: Adaptive Second Order Optimization via Fisher Information

DM Gomes, Y Zhang, E Belilovsky, G Wolf… - arXiv preprint arXiv …, 2024 - arxiv.org
First-order optimization methods are currently the mainstream in training deep neural
networks (DNNs). Optimizers like Adam incorporate limited curvature information by …
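
One standard way to "incorporate limited curvature information" is to precondition the gradient with a diagonal estimate of the empirical Fisher, which can be approximated by a running average of squared gradients. AdaFisher itself uses a richer Fisher approximation, so the sketch below shows only this generic diagonal-Fisher idea, with assumed constants:

```python
import numpy as np

def diag_fisher_step(w, grad, fisher, lr=0.01, beta=0.95, eps=1e-8):
    """Precondition the gradient by a running diagonal empirical-Fisher estimate of E[g^2]."""
    fisher = beta * fisher + (1.0 - beta) * grad**2   # running squared-gradient (Fisher) estimate
    w = w - lr * grad / (np.sqrt(fisher) + eps)       # scaled, natural-gradient-like step
    return w, fisher

# Toy usage: minimize 0.5 * ||w||^2, whose gradient is w.
w, fisher = np.ones(3), np.zeros(3)
for _ in range(200):
    w, fisher = diag_fisher_step(w, w, fisher)
print(w)
```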

SANIA: Polyak-type optimization framework leads to scale invariant stochastic algorithms

F Abdukhakimov, C Xiang, D Kamzolov… - arXiv preprint arXiv …, 2023 - arxiv.org
Adaptive optimization methods are widely recognized as among the most popular
approaches for training Deep Neural Networks (DNNs). Techniques such as Adam …
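
For context on the "Polyak-type" label: the classical Polyak step size divides the current suboptimality gap by the squared gradient norm, so the step scales with the objective instead of a hand-tuned learning rate. A minimal deterministic sketch, assuming the optimal value f* is known (the paper's stochastic, scale-invariant variants go beyond this):

```python
import numpy as np

def polyak_gd(f, grad_f, w0, f_star=0.0, steps=50, eps=1e-12):
    """Gradient descent with the classical Polyak step size (f(w) - f*) / ||grad f(w)||^2."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        g = grad_f(w)
        step = (f(w) - f_star) / (np.dot(g, g) + eps)   # Polyak step size
        w -= step * g
    return w

# Toy usage: f(w) = 0.5 * ||w||^2 has f* = 0 and gradient w.
print(polyak_gd(lambda w: 0.5 * np.dot(w, w), lambda w: w, np.ones(3)))
```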

Understanding Stochastic Optimization Behavior at the Layer Update Level (Student Abstract)

J Zhang, GX Qiao, A Lopotenco, IT Pan - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Popular first-order stochastic optimization methods for deep neural networks (DNNs) are
usually either accelerated schemes (e.g., stochastic gradient descent (SGD) with momentum) …

PID controller-based stochastic optimization acceleration for deep neural networks

H Wang, Y Luo, W An, Q Sun, J Xu… - IEEE transactions on …, 2020 - ieeexplore.ieee.org
Deep neural networks (DNNs) are widely used and have demonstrated their power in many
applications, such as computer vision and pattern recognition. However, the training of these …
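
The PID view treats SGD with momentum as a PI controller (a proportional term on the current gradient plus an integral of past gradients) and adds a derivative term on the gradient's change to damp oscillation. A hedged sketch of a generic PID-style update, with assumed gains kp, ki, kd rather than the coefficients derived in the paper:

```python
import numpy as np

def pid_step(w, grad, state, lr=0.1, kp=1.0, ki=0.9, kd=0.1):
    """One PID-style step: P = current gradient, I = accumulated gradients, D = gradient change."""
    integral = state.get("integral", np.zeros_like(w)) + grad
    derivative = grad - state.get("prev_grad", np.zeros_like(w))
    state["integral"], state["prev_grad"] = integral, grad
    return w - lr * (kp * grad + ki * integral + kd * derivative), state

# Toy usage: minimize 0.5 * ||w||^2, whose gradient is w.
w, state = np.ones(3), {}
for _ in range(100):
    w, state = pid_step(w, w, state)
print(np.linalg.norm(w))  # the iterate is driven toward the minimizer at the origin
```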

AdaPID: An adaptive PID optimizer for training deep neural networks

B Weng, J Sun, A Sadeghi… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Deep neural networks (DNNs) have well-documented merits in learning nonlinear functions
in high-dimensional spaces. Stochastic gradient descent (SGD)-type optimization algorithms …

DDPNOpt: Differential dynamic programming neural optimizer

GH Liu, T Chen, EA Theodorou - arXiv preprint arXiv:2002.08809, 2020 - arxiv.org
Interpretation of deep neural network (DNN) training as an optimal control problem with
nonlinear dynamical systems has received considerable attention recently, yet the …

DeepOBS: A deep learning optimizer benchmark suite

F Schneider, L Balles, P Hennig - arXiv preprint arXiv:1903.05499, 2019 - arxiv.org
Because the choice and tuning of the optimizer affect the speed, and ultimately the
performance, of deep learning, there is significant past and recent research in this area. Yet …