A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications

S Liu, PY Chen, B Kailkhura, G Zhang… - IEEE Signal …, 2020 - ieeexplore.ieee.org
Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …
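
As a concrete illustration of the ZO setting this primer surveys, here is a minimal sketch of the standard two-point random-direction gradient estimator; the quadratic test function, smoothing parameter mu, and step size are illustrative choices, not taken from the paper.

import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=20, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u over random
    Gaussian directions u; only function evaluations are used.
    """
    rng = np.random.default_rng(rng)
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs

# Toy usage: minimize a quadratic with ZO gradient descent.
f = lambda x: 0.5 * np.dot(x, x)
x = np.ones(5)
for _ in range(200):
    x -= 0.1 * zo_gradient(f, x)
print(np.linalg.norm(x))  # should be close to 0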

Fine-tuning language models with just forward passes

S Malladi, T Gao, E Nichani… - Advances in …, 2023 - proceedings.neurips.cc
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but
as LMs grow in size, backpropagation requires a prohibitively large amount of memory …
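
A minimal sketch of the forward-pass-only idea, assuming an SPSA-style two-point estimate: perturb the parameters along a seeded random direction, evaluate the loss twice, and step along that direction scaled by the finite-difference quotient. The model, loss, and hyperparameters below are toy stand-ins, not the paper's implementation.

import numpy as np

def spsa_step(loss, theta, lr=1e-2, mu=1e-3, seed=0):
    """One SPSA-style update using only two forward passes.

    The direction u is regenerated deterministically from `seed`,
    so it could be recreated on demand rather than stored, which is
    what makes forward-only fine-tuning memory-efficient.
    """
    u = np.random.default_rng(seed).standard_normal(theta.shape)
    proj_grad = (loss(theta + mu * u) - loss(theta - mu * u)) / (2 * mu)
    return theta - lr * proj_grad * u

# Toy usage on a least-squares "model".
X, y = np.random.randn(50, 4), np.random.randn(50)
loss = lambda w: np.mean((X @ w - y) ** 2)
w = np.zeros(4)
for t in range(500):
    w = spsa_step(loss, w, seed=t)
print(loss(w))  # typically well below the initial loss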

Faster single-loop algorithms for minimax optimization without strong concavity

J Yang, A Orvieto, A Lucchi… - … Conference on Artificial …, 2022 - proceedings.mlr.press
Gradient descent ascent (GDA), the simplest single-loop algorithm for nonconvex minimax
optimization, is widely used in practical applications such as generative adversarial …
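
For reference, the single-loop GDA iteration on min_x max_y f(x, y) is just a simultaneous descent step in x and ascent step in y; the quadratic saddle objective below is a toy example, not one from the paper.

# Toy objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2,
# whose unique saddle point is (0, 0).
grad_x = lambda x, y: x + y   # df/dx
grad_y = lambda x, y: x - y   # df/dy

x, y, eta = 1.0, 1.0, 0.1
for _ in range(200):
    # simultaneous descent in x, ascent in y
    x, y = x - eta * grad_x(x, y), y + eta * grad_y(x, y)
print(x, y)  # both approach 0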

Stochastic gradient descent-ascent: Unified theory and new efficient methods

A Beznosikov, E Gorbunov… - International …, 2023 - proceedings.mlr.press
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent
algorithms for solving min-max optimization and variational inequality problems (VIPs) …
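
In the notation common to this line of work, SGDA runs simultaneous stochastic descent and ascent steps; written out as a worked equation (my rendering, not quoted from the paper):

\[
x_{t+1} = x_t - \eta_x\,\hat\nabla_x f(x_t, y_t), \qquad
y_{t+1} = y_t + \eta_y\,\hat\nabla_y f(x_t, y_t),
\]

and the VIP counterpart asks for \(z^*\) such that \(\langle F(z^*),\, z - z^*\rangle \ge 0\) for all feasible \(z\), where \(F(z) = (\nabla_x f,\, -\nabla_y f)\) in the min-max case.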

SimFBO: Towards simple, flexible and communication-efficient federated bilevel learning

Y Yang, P Xiao, K Ji - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Federated bilevel optimization (FBO) has recently shown great potential in machine learning
and edge computing due to the nested optimization structure emerging in meta-learning …
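
The nested structure referred to here is the standard bilevel problem (standard formulation, not quoted from the paper), with the federated variant distributing the upper- and lower-level objectives \(f\) and \(g\) across clients:

\[
\min_x \; \Phi(x) := f\big(x, y^*(x)\big)
\quad \text{s.t.} \quad
y^*(x) \in \arg\min_y \; g(x, y).
\]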

Accelerated zeroth-order and first-order momentum methods from mini to minimax optimization

F Huang, S Gao, J Pei, H Huang - Journal of Machine Learning Research, 2022 - jmlr.org
In this paper, we propose a class of accelerated zeroth-order and first-order momentum
methods for both nonconvex mini-optimization and minimax optimization. Specifically, we …
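
As a rough sketch of the accelerated-momentum template this line of work builds on (a generic variance-reduced momentum recursion, not necessarily the paper's exact estimator):

\[
v_t = \hat\nabla f(x_t; \xi_t) + (1 - \alpha_t)\big(v_{t-1} - \hat\nabla f(x_{t-1}; \xi_t)\big),
\qquad x_{t+1} = x_t - \eta_t\, v_t,
\]

where \(\hat\nabla\) denotes a stochastic gradient, or its zeroth-order estimate when only function values are available.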

Communication-efficient federated hypergradient computation via aggregated iterative differentiation

P Xiao, K Ji - International Conference on Machine Learning, 2023 - proceedings.mlr.press
Federated bilevel optimization has attracted increasing attention due to emerging machine
learning and communication applications. The biggest challenge lies in computing the …
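
The hypergradient behind the abstract's "biggest challenge" is, under standard smoothness and lower-level strong-convexity assumptions, given by implicit differentiation (a standard identity, not quoted from the paper); iterative differentiation approximates the matrix-inverse-vector product below:

\[
\nabla \Phi(x) = \nabla_x f\big(x, y^*(x)\big)
- \nabla^2_{xy} g\big(x, y^*(x)\big)\,\big[\nabla^2_{yy} g\big(x, y^*(x)\big)\big]^{-1}\,\nabla_y f\big(x, y^*(x)\big).
\]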

Global convergence to local minmax equilibrium in classes of nonconvex zero-sum games

T Fiez, L Ratliff, E Mazumdar… - Advances in Neural …, 2021 - proceedings.neurips.cc
We study gradient descent-ascent learning dynamics with timescale separation ($\tau$-GDA)
in unconstrained continuous action zero-sum games where the minimizing player …
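
Timescale separation here means the ascent player moves a factor \(\tau\) faster than the descent player; in one common convention (my rendering, not quoted from the paper):

\[
x_{k+1} = x_k - \eta\,\nabla_x f(x_k, y_k), \qquad
y_{k+1} = y_k + \tau\eta\,\nabla_y f(x_k, y_k),
\]

so that, heuristically, for large \(\tau\) the ascent player approximately best-responds between descent steps.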

Zeroth-order alternating gradient descent ascent algorithms for a class of nonconvex-nonconcave minimax problems

Z Xu, ZQ Wang, JL Wang, YH Dai - Journal of Machine Learning Research, 2023 - jmlr.org
In this paper, we consider a class of nonconvex-nonconcave minimax problems, i.e., NC-PL
minimax problems, whose objective functions satisfy the Polyak-Łojasiewicz (PL) condition …
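
For context, the one-sided PL condition assumed on the max player reads (standard definition, my notation): \(f(x, \cdot)\) satisfies PL with modulus \(\mu > 0\) if

\[
\frac{1}{2}\,\big\|\nabla_y f(x, y)\big\|^2 \;\ge\; \mu\Big(\max_{y'} f(x, y') - f(x, y)\Big)
\quad \text{for all } x, y,
\]

which relaxes strong concavity while still permitting convergence guarantees for alternating (zeroth-order) GDA.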

AdaGDA: Faster adaptive gradient descent ascent methods for minimax optimization

F Huang, X Wu, Z Hu - International Conference on Artificial …, 2023 - proceedings.mlr.press
In this paper, we propose a class of faster adaptive Gradient Descent Ascent (GDA) methods
for solving nonconvex-strongly-concave minimax problems by using the unified adaptive …
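
A minimal sketch of the adaptive-step-size idea in GDA form, using an AdaGrad-style accumulator; the toy objective and hyperparameters are illustrative and do not reproduce the paper's unified adaptive matrices.

import numpy as np

# Toy saddle objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2 again.
grad_x = lambda x, y: x + y
grad_y = lambda x, y: x - y

x, y = 1.0, 1.0
sx, sy, eta, eps = 0.0, 0.0, 0.5, 1e-8
for _ in range(500):
    gx, gy = grad_x(x, y), grad_y(x, y)
    sx, sy = sx + gx ** 2, sy + gy ** 2      # AdaGrad accumulators
    x -= eta / (np.sqrt(sx) + eps) * gx      # adaptive descent on x
    y += eta / (np.sqrt(sy) + eps) * gy      # adaptive ascent on y
print(x, y)  # both approach the saddle at (0, 0)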