MFFGD: An adaptive Caputo fractional-order gradient algorithm for DNN

Z Huang, S Mao, Y Yang - Neurocomputing, 2024 - Elsevier
As a primary optimization method for neural networks, the gradient descent algorithm has
received significant attention in the recent development of deep neural networks. However …
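
For background (a standard definition, not drawn from the snippet itself): fractional-order gradient methods of this kind are typically built on the Caputo derivative, which for order $\alpha \in (0,1)$ and base point $c$ reads

$${}^{C}D_{c}^{\alpha} f(t) \;=\; \frac{1}{\Gamma(1-\alpha)} \int_{c}^{t} \frac{f'(\tau)}{(t-\tau)^{\alpha}}\, d\tau,$$

so the update direction depends on the history of $f$ over $[c, t]$ rather than on the pointwise derivative alone.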

Decentralized Gradient-Free Methods for Stochastic Non-smooth Non-convex Optimization

Z Lin, J Xia, Q Deng, L Luo - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
We consider decentralized gradient-free optimization for minimizing Lipschitz continuous
functions that satisfy neither smoothness nor convexity assumptions. We propose two novel …
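
As a rough illustration of the kind of gradient-free (zeroth-order) oracle this line of work builds on, here is a generic randomized-smoothing estimator; it is a sketch, not one of the methods proposed in the paper:

import numpy as np

def two_point_zo_grad(f, x, mu=1e-4, rng=None):
    # Generic two-point zeroth-order estimator for the smoothed surrogate
    # f_mu(x) = E_v[f(x + mu*v)], v uniform in the unit ball; a sketch of the
    # oracle gradient-free methods rely on, not the paper's algorithm.
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # random direction, uniform on the unit sphere
    return d * (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

# hypothetical usage on a nonsmooth, nonconvex test function
f = lambda x: np.abs(x).sum() - 0.5 * np.linalg.norm(x)
print(two_point_zo_grad(f, np.ones(10)))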

On the Hardness of Meaningful Local Guarantees in Nonsmooth Nonconvex Optimization

G Kornowski, S Padmanabhan, O Shamir - arXiv preprint arXiv …, 2024 - arxiv.org
We study the oracle complexity of nonsmooth nonconvex optimization, with the algorithm
assumed to have access only to local function information. It has been shown by Davis …

General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization

K Ahn, G Magakyan, A Cutkosky - arXiv preprint arXiv:2411.07061, 2024 - arxiv.org
This work investigates the effectiveness of schedule-free methods, developed by A. Defazio
et al. (NeurIPS 2024), in nonconvex optimization settings, inspired by their remarkable …
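
A sketch of the schedule-free iteration as I understand it from Defazio et al.; the step size, interpolation parameter, and uniform averaging weight below are placeholders, not values from either paper:

def schedule_free_sgd_step(x, z, t, grad, gamma=0.1, beta=0.9):
    # One schedule-free step (paraphrase of Defazio et al., 2024): the gradient
    # is queried at an interpolation y between the averaged iterate x and the
    # base iterate z; x tracks a running average of z.
    y = (1.0 - beta) * z + beta * x
    z_next = z - gamma * grad(y)
    c = 1.0 / (t + 1)                 # uniform averaging weight (placeholder)
    x_next = (1.0 - c) * x + c * z_next
    return x_next, z_next

# hypothetical usage on a scalar quadratic f(y) = y^2
x = z = 5.0
for t in range(100):
    x, z = schedule_free_sgd_step(x, z, t, grad=lambda y: 2.0 * y)
print(x)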

Bilevel Optimization without Lower-Level Strong Convexity from the Hyper-Objective Perspective

L Chen, J Xu, J Zhang - arXiv preprint arXiv:2301.00712, 2023 - arxiv.org
Bilevel optimization reveals the inner structure of otherwise opaque optimization problems,
such as hyperparameter tuning and meta-learning. A common goal in bilevel optimization is …
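
For orientation (standard notation, not necessarily the paper's): the hyper-objective reformulation collapses the bilevel problem into a single-level one,

$$\min_{x}\; \varphi(x), \qquad \varphi(x) := \min_{y \in S(x)} F(x, y), \qquad S(x) := \operatorname*{arg\,min}_{y}\; G(x, y),$$

and without lower-level strong convexity the difficulty is that $S(x)$ can be set-valued and $\varphi$ nonsmooth or even discontinuous.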

Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization

E Sahinoglu, S Shahrampour - arXiv preprint arXiv:2406.01484, 2024 - arxiv.org
We investigate the finite-time analysis of finding $(\delta,\epsilon)$-stationary points for
nonsmooth nonconvex objectives in decentralized stochastic optimization. A set of agents …
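
As generic background on the decentralized setup (not the paper's specific scheme): each agent $i$ mixes with its neighbors through a doubly stochastic matrix $W$ and then takes a local step,

$$x_i^{t+1} = \sum_{j=1}^{n} W_{ij}\, x_j^{t} \;-\; \gamma\, g_i^{t},$$

where $g_i^{t}$ is agent $i$'s local stochastic first-order or zeroth-order gradient estimate.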

First-Order Methods for Linearly Constrained Bilevel Optimization

G Kornowski, S Padmanabhan, K Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Algorithms for bilevel optimization often encounter Hessian computations, which are
prohibitive in high dimensions. While recent works offer first-order methods for …
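
The Hessians arise from the classical implicit-differentiation formula for the hypergradient; with an unconstrained, strongly convex lower level $y^*(x) = \arg\min_y G(x, y)$ and upper objective $F$, it reads

$$\nabla \varphi(x) = \nabla_x F(x, y^*(x)) \;-\; \nabla^2_{xy} G(x, y^*(x))\, \big[\nabla^2_{yy} G(x, y^*(x))\big]^{-1} \nabla_y F(x, y^*(x)),$$

which is exactly the second-order information that first-order methods try to avoid.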

Improved Sample Complexity for Private Nonsmooth Nonconvex Optimization

G Kornowski, D Liu, K Talwar - arXiv preprint arXiv:2410.05880, 2024 - arxiv.org
We study differentially private (DP) optimization algorithms for stochastic and empirical
objectives which are neither smooth nor convex, and propose methods that return a …
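
For reference, the standard privacy notion referenced here: a randomized algorithm $\mathcal{M}$ is $(\varepsilon,\delta)$-differentially private if for all neighboring datasets $S, S'$ (differing in one record) and all measurable events $E$,

$$\Pr[\mathcal{M}(S) \in E] \;\le\; e^{\varepsilon} \Pr[\mathcal{M}(S') \in E] + \delta.$$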

Unlocking TriLevel Learning with Level-Wise Zeroth Order Constraints: Distributed Algorithms and Provable Non-Asymptotic Convergence

Y Jiao, K Yang, C Jian - arXiv preprint arXiv:2412.07138, 2024 - arxiv.org
Trilevel learning (TLL) has found diverse applications across machine learning,
ranging from robust hyperparameter optimization to domain adaptation …
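
One generic way to write a TLL problem (illustrative only; the paper's constrained, distributed, zeroth-order setting may differ):

$$\min_{x}\; f_1(x, y^*, z^*) \quad \text{s.t.}\quad y^* \in \operatorname*{arg\,min}_{y} \Big\{ f_2(x, y, z^*) \;:\; z^* \in \operatorname*{arg\,min}_{z} f_3(x, y, z) \Big\},$$

with, for example, $f_1$ a robust validation loss, $f_2$ a training loss, and $f_3$ an inner adversarial or adaptation objective.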

Quantum Algorithms for Non-smooth Non-convex Optimization

C Liu, C Guan, J He, J Lui - arXiv preprint arXiv:2410.16189, 2024 - arxiv.org
This paper considers the problem of finding $(\delta,\epsilon)$-Goldstein stationary
points of Lipschitz continuous objectives, a rich function class that covers a great number …
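
For reference, the Goldstein notion referred to here: with $\partial f$ the Clarke subdifferential and $B_\delta(x)$ the ball of radius $\delta$ around $x$, a point $x$ is $(\delta,\epsilon)$-Goldstein stationary if

$$\min\big\{ \|g\| \;:\; g \in \partial_\delta f(x) \big\} \le \epsilon, \qquad \partial_\delta f(x) := \operatorname{conv}\Big( \bigcup_{y \in B_\delta(x)} \partial f(y) \Big).$$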