AdaSwarm: Augmenting gradient-based optimizers in deep learning with swarm intelligence

R Mohapatra, S Saha, CAC Coello… - … on Emerging Topics …, 2021 - ieeexplore.ieee.org
This paper introduces AdaSwarm, a novel gradient-free optimizer that achieves similar or even
better performance than the Adam optimizer adopted in neural networks. In order to support …
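
A minimal sketch of the swarm-intelligence idea behind a gradient-free optimizer like this: a particle swarm optimization (PSO) loop that searches over a weight vector using only loss evaluations, never gradients. The toy loss, swarm size, and coefficients below are illustrative choices, and this generic PSO is not the specific variant AdaSwarm builds on.

import numpy as np

def loss(w):
    # Toy quadratic loss standing in for a network's training loss.
    return np.sum((w - 3.0) ** 2)

def pso_minimize(loss_fn, dim, n_particles=20, iters=100,
                 inertia=0.7, c_cog=1.5, c_soc=1.5, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5.0, 5.0, size=(n_particles, dim))  # candidate weight vectors
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                      # per-particle best positions
    pbest_val = np.array([loss_fn(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()              # swarm-wide best position

    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity blends inertia, attraction to the personal best, and
        # attraction to the global best; no gradient information is used.
        vel = (inertia * vel
               + c_cog * r1 * (pbest - pos)
               + c_soc * r2 * (gbest - pos))
        pos = pos + vel
        vals = np.array([loss_fn(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, loss_fn(gbest)

best_w, best_val = pso_minimize(loss, dim=4)
print(best_w, best_val)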

Xgrad: Boosting gradient-based optimizers with weight prediction

L Guan, D Li, Y Shi, J Meng - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
In this paper, we propose XGrad, a general deep learning training framework that
introduces weight prediction into popular gradient-based optimizers to boost their …
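
To illustrate the general flavor of weight prediction in a hedged way (the snippet does not give XGrad's actual prediction rule or horizon): gradients are evaluated at weights extrapolated ahead along the optimizer's own momentum direction rather than at the current weights. The toy loss and the one-step lookahead below are assumptions for the sketch.

import numpy as np

def grad(w):
    # Gradient of a toy quadratic loss 0.5 * ||w - 1||^2.
    return w - 1.0

w = np.zeros(3)
velocity = np.zeros_like(w)
lr, momentum, lookahead_steps = 0.1, 0.9, 1  # lookahead horizon is an illustrative choice

for step in range(100):
    # Predict where the weights will be a few steps ahead if the velocity
    # stays roughly constant, and evaluate the gradient at that point.
    w_pred = w + lookahead_steps * momentum * velocity
    g = grad(w_pred)
    velocity = momentum * velocity - lr * g
    w = w + velocity

print(w)  # close to the minimizer [1, 1, 1]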

Optimization methods in deep learning: A comprehensive overview

D Shulman - arXiv preprint arXiv:2302.09566, 2023 - arxiv.org
In recent years, deep learning has achieved remarkable success in various fields such as
image recognition, natural language processing, and speech recognition. The effectiveness …

Normalized direction-preserving Adam

Z Zhang, L Ma, Z Li, C Wu - arXiv preprint arXiv:1709.04546, 2017 - arxiv.org
Adaptive optimization algorithms, such as Adam and RMSprop, have shown better
optimization performance than stochastic gradient descent (SGD) in some scenarios …
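
For reference, the Adam update that this line of work modifies combines exponentially decayed first and second moments of the gradient with bias correction:

\[
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2,
\]
\[
\hat m_t = \frac{m_t}{1-\beta_1^t}, \qquad
\hat v_t = \frac{v_t}{1-\beta_2^t}, \qquad
\theta_t = \theta_{t-1} - \frac{\alpha\, \hat m_t}{\sqrt{\hat v_t} + \epsilon}.
\]

The element-wise division by \sqrt{\hat v_t} is what alters the update direction relative to the raw gradient, which is the behavior a direction-preserving variant would need to control.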

AdaNorm: Adaptive gradient norm correction based optimizer for CNNs

SR Dubey, SK Singh… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Stochastic gradient descent (SGD) optimizers are generally used to train
convolutional neural networks (CNNs). In recent years, several adaptive momentum-based …
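
A hedged sketch of the general idea the title suggests (correcting the gradient norm against a running statistic); the exact AdaNorm rule is not shown in this snippet, so the rescaling condition and coefficients below are illustrative assumptions rather than the paper's method.

import numpy as np

def grad(w):
    return 2.0 * (w - 0.5)  # gradient of a toy quadratic loss

w = np.zeros(4)
lr, beta = 0.05, 0.95
ema_norm = 0.0  # exponential moving average of past gradient norms

for t in range(1, 201):
    g = grad(w)
    g_norm = np.linalg.norm(g)
    ema_norm = beta * ema_norm + (1.0 - beta) * g_norm
    ema_hat = ema_norm / (1.0 - beta ** t)  # bias correction, as in Adam
    # Illustrative norm correction: rescale the gradient when its norm falls
    # below the smoothed historical norm, keeping update magnitudes more uniform.
    if 0.0 < g_norm < ema_hat:
        g = g * (ema_hat / g_norm)
    w = w - lr * g

print(w)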

AdamL: A fast adaptive gradient method incorporating loss function

L Xia, S Massei - arXiv preprint arXiv:2312.15295, 2023 - arxiv.org
Adaptive first-order optimizers are fundamental tools in deep learning, although they may
suffer from poor generalization due to nonuniform gradient scaling. In this work, we …
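
To make "incorporating the loss function" concrete in a hedged way: the sketch below modulates an otherwise standard Adam step by the current loss value, so steps shrink as the loss approaches zero. This particular scaling is an illustrative assumption, not the rule proposed in the paper.

import numpy as np

def loss_and_grad(w):
    diff = w - 2.0
    return 0.5 * np.sum(diff ** 2), diff  # toy quadratic loss and its gradient

w = np.zeros(3)
m = np.zeros_like(w)
v = np.zeros_like(w)
lr, b1, b2, eps = 0.01, 0.9, 0.999, 1e-8

for t in range(1, 1001):
    L, g = loss_and_grad(w)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # Illustrative loss-aware scaling: larger steps while the loss is high,
    # smaller steps as it approaches zero.
    scale = L / (1.0 + L)
    w = w - lr * scale * m_hat / (np.sqrt(v_hat) + eps)

print(w)  # moves toward the minimizer [2, 2, 2], slowing as the loss shrinks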

[PDF] Comparative analysis of optimizers in deep neural networks

C Desai - International Journal of Innovative Science and …, 2020 - researchgate.net
The role of the optimizer in a deep neural network model impacts the accuracy of the model.
Deep learning comes under the umbrella of parametric approaches; however, it tries to relax …

HyperAdam: A learnable task-adaptive Adam for network training

S Wang, J Sun, Z Xu - Proceedings of the AAAI Conference on Artificial …, 2019 - aaai.org
Deep neural networks are traditionally trained using human-designed stochastic optimization
algorithms, such as SGD and Adam. Recently, the approach of learning to optimize network …
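
A minimal sketch of the learning-to-optimize structure that a learnable, task-adaptive Adam suggests: several candidate momentum estimates with different decay rates are mixed into one update. In the learning-to-optimize setting the mixing would be produced by a trained model; here the mixing weights are fixed, hypothetical constants purely to show the structure.

import numpy as np

def grad(w):
    return w - 1.0  # gradient of a toy quadratic loss

# Candidate exponential decay rates for the first moment. In a learned
# optimizer the weights in `mix` would come from a trained network; here
# they are placeholder values that sum to one.
betas = np.array([0.5, 0.9, 0.99])
mix = np.array([0.2, 0.5, 0.3])

w = np.zeros(2)
m = np.zeros((len(betas), 2))  # one momentum buffer per candidate
lr = 0.1

for step in range(300):
    g = grad(w)
    m = betas[:, None] * m + (1.0 - betas)[:, None] * g  # update every candidate
    update = mix @ m                                       # combine candidates into one step
    w = w - lr * update

print(w)  # close to the minimizer [1, 1]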

DiffMoment: an adaptive optimization technique for convolutional neural network

S Bhakta, U Nandi, T Si, SK Ghosal, C Changdar… - Applied …, 2023 - Springer
Stochastic Gradient Descent (SGD) is a very popular basic optimizer applied in the
learning algorithms of deep neural networks. However, it has fixed-sized steps for every …
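
The "fixed-sized steps" limitation of SGD is easy to see on an ill-conditioned problem: a single global learning rate must stay small enough for the steep direction, which leaves the flat direction almost untouched. The contrast below uses an RMSprop-style per-parameter step only as a familiar adaptive baseline; it is not the diffMoment update itself.

import numpy as np

def grad(w):
    # Gradient of an ill-conditioned quadratic: one steep and one flat direction.
    return np.array([100.0 * w[0], 0.01 * w[1]])

# Plain SGD: one fixed step size shared by every parameter.
w_sgd = np.array([1.0, 1.0])
for _ in range(500):
    w_sgd = w_sgd - 0.009 * grad(w_sgd)

# RMSprop-style update: per-parameter step sizes adapted from squared gradients.
w_ada = np.array([1.0, 1.0])
v = np.zeros(2)
for _ in range(500):
    g = grad(w_ada)
    v = 0.9 * v + 0.1 * g * g
    w_ada = w_ada - 0.01 * g / (np.sqrt(v) + 1e-8)

print(w_sgd)  # the flat second coordinate has barely moved
print(w_ada)  # the adaptive variant makes progress on both coordinates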

Effective neural network training with a new weighting mechanism-based optimization algorithm

Y Yu, F Liu - IEEE Access, 2019 - ieeexplore.ieee.org
First-order gradient-based optimization algorithms have been of core practical importance in
the field of deep learning. In this paper, we propose a new weighting mechanism-based first …