Sample complexity of asynchronous Q-learning: Sharper analysis and variance reduction

G Li, Y Wei, Y Chi, Y Gu… - Advances in neural …, 2020 - proceedings.neurips.cc
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a
Markov decision process (MDP), based on a single trajectory of Markovian samples induced …
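
As a rough illustration of the technique named here, the sketch below runs tabular asynchronous Q-learning along a single Markovian trajectory generated by a uniform behavior policy, updating only the visited (state, action) pair at each step. The toy 2-state MDP, learning rate, and horizon are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def async_q_learning(P, R, gamma=0.9, n_steps=50_000, lr=0.1, seed=0):
    """Tabular asynchronous Q-learning along one Markovian trajectory.

    P: transition tensor of shape (S, A, S); R: reward matrix of shape (S, A).
    Only the (state, action) pair visited at each step is updated.
    """
    rng = np.random.default_rng(seed)
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    s = 0
    for _ in range(n_steps):
        a = rng.integers(A)                      # uniform behavior policy
        s_next = rng.choice(S, p=P[s, a])        # Markovian sample from the trajectory
        target = R[s, a] + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])       # asynchronous (single-entry) update
        s = s_next
    return Q

# Toy 2-state, 2-action MDP (illustrative only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])
print(async_q_learning(P, R))
```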

Stability and generalization for Markov chain stochastic gradient methods

P Wang, Y Lei, Y Ying, DX Zhou - Advances in Neural …, 2022 - proceedings.neurips.cc
Recently there has been a large amount of work devoted to the study of Markov chain stochastic
gradient methods (MC-SGMs), which mainly focuses on their convergence analysis for solving …
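
For orientation, a minimal sketch of the MC-SGM setting: SGD in which the index of the sampled data point evolves along a Markov chain rather than being drawn i.i.d. The least-squares objective, function name, and the lazy random-walk kernel over indices are assumptions made for illustration only.

```python
import numpy as np

def mc_sgm(X, y, T_kernel, lr=0.01, n_steps=20_000, seed=0):
    """SGD for least squares where the data index follows a Markov chain."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    i = 0                                        # current chain state = data index
    for _ in range(n_steps):
        grad = (X[i] @ w - y[i]) * X[i]          # stochastic gradient at the visited index
        w -= lr * grad
        i = rng.choice(n, p=T_kernel[i])         # Markovian transition to the next index
    return w

# Illustrative data and a lazy random-walk kernel over the indices.
rng = np.random.default_rng(1)
n, d = 50, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
T_kernel = 0.5 * np.eye(n) + 0.5 * np.roll(np.eye(n), 1, axis=1)  # stay or step to next index
print(mc_sgm(X, y, T_kernel))
```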

First order methods with Markovian noise: from acceleration to variational inequalities

A Beznosikov, S Samsonov… - Advances in …, 2024 - proceedings.neurips.cc
This paper delves into stochastic optimization problems that involve Markovian noise. We
present a unified approach for the theoretical analysis of first-order gradient methods for …
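
A small sketch of the kind of accelerated first-order method with Markovian noise this abstract alludes to: Nesterov-style momentum where the gradient oracle is corrupted by a bias depending on the state of a slowly mixing two-state Markov chain. The quadratic objective, chain, and step sizes are assumed, not drawn from the paper.

```python
import numpy as np

def nesterov_markov(grad_oracle, x0, n_steps=5_000, lr=0.05, momentum=0.9, seed=0):
    """Nesterov-style first-order method whose gradient oracle is driven by a Markov chain."""
    rng = np.random.default_rng(seed)
    x, v = x0.copy(), np.zeros_like(x0)
    state = 0                                    # state of the noise Markov chain
    for _ in range(n_steps):
        lookahead = x + momentum * v             # momentum lookahead point
        g, state = grad_oracle(lookahead, state, rng)
        v = momentum * v - lr * g
        x = x + v
    return x

# Illustrative quadratic f(x) = 0.5 * ||x - b||^2 observed through 2-state Markovian noise.
b = np.array([3.0, -1.0])
noise_by_state = [np.array([0.5, 0.5]), np.array([-0.5, -0.5])]
T = np.array([[0.95, 0.05], [0.05, 0.95]])       # slowly mixing chain

def grad_oracle(x, state, rng):
    g = (x - b) + noise_by_state[state]          # gradient corrupted by state-dependent bias
    next_state = rng.choice(2, p=T[state])
    return g, next_state

print(nesterov_markov(grad_oracle, np.zeros(2)))
```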

Least squares regression with Markovian data: Fundamental limits and algorithms

D Nagaraj, X Wu, G Bresler, P Jain… - Advances in neural …, 2020 - proceedings.neurips.cc
We study the problem of least squares linear regression where the datapoints are
dependent and are sampled from a Markov chain. We establish sharp information theoretic …
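
To make the data model concrete, here is a hedged sketch in which the covariates are generated by a Markov chain (an AR(1) process as a stand-in), responses are linear in the covariates, and plain SGD is run over the dependent stream; the paper's actual algorithms and information-theoretic bounds are not reproduced.

```python
import numpy as np

def markovian_least_squares(w_star, rho=0.9, n=100_000, lr=0.005, noise=0.1, seed=0):
    """SGD for least squares when covariates come from a Markov chain (AR(1) as a stand-in)."""
    rng = np.random.default_rng(seed)
    d = w_star.size
    w = np.zeros(d)
    x = rng.normal(size=d)                       # initial chain state
    for _ in range(n):
        y = x @ w_star + noise * rng.normal()    # response generated from the current state
        w -= lr * (x @ w - y) * x                # SGD step on the dependent sample
        x = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=d)  # AR(1) Markov transition
    return w

# Illustrative ground-truth parameter vector.
print(markovian_least_squares(np.array([1.0, -1.0, 2.0])))
```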

On the decentralized stochastic gradient descent with Markov chain sampling

T Sun, D Li, B Wang - IEEE Transactions on Signal Processing, 2023 - ieeexplore.ieee.org
The decentralized stochastic gradient method has emerged as a promising solution for solving
large-scale machine learning problems. This paper studies the decentralized Markov chain …
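
A minimal sketch of this setting, assuming a fixed doubly stochastic mixing matrix: each node alternates a gossip-averaging step with a local gradient computed on a sample drawn by a lazy Markov chain over its own data indices. Topology, step size, and data are illustrative assumptions.

```python
import numpy as np

def decentralized_mcsgd(local_X, local_y, W, lr=0.01, n_steps=10_000, seed=0):
    """Decentralized SGD: gossip averaging + local gradients from Markov-chain-sampled data.

    local_X[k], local_y[k]: data held by node k; W: doubly stochastic mixing matrix.
    Each node walks a lazy random chain over its own sample indices.
    """
    rng = np.random.default_rng(seed)
    n_nodes = len(local_X)
    d = local_X[0].shape[1]
    models = np.zeros((n_nodes, d))
    idx = np.zeros(n_nodes, dtype=int)
    for _ in range(n_steps):
        grads = np.zeros_like(models)
        for k in range(n_nodes):
            x, y = local_X[k][idx[k]], local_y[k][idx[k]]
            grads[k] = (x @ models[k] - y) * x
            # Lazy chain over local indices: stay with prob 1/2, else move to the next sample.
            if rng.random() < 0.5:
                idx[k] = (idx[k] + 1) % len(local_y[k])
        models = W @ models - lr * grads          # mixing (gossip) step plus local gradient step
    return models.mean(axis=0)

# Two-node network with illustrative data and a uniform mixing matrix.
rng = np.random.default_rng(2)
w_star = np.array([2.0, -1.0])
local_X = [rng.normal(size=(30, 2)) for _ in range(2)]
local_y = [X @ w_star + 0.05 * rng.normal(size=30) for X in local_X]
W = np.array([[0.5, 0.5], [0.5, 0.5]])
print(decentralized_mcsgd(local_X, local_y, W))
```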

Federated learning under heterogeneous and correlated client availability

A Rodio, F Faticanti, O Marfoq, G Neglia… - IEEE INFOCOM 2023 …, 2023 - ieeexplore.ieee.org
The enormous amount of data produced by mobile and IoT devices has motivated the
development of federated learning (FL), a framework allowing such devices (or clients) to …
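
As a rough sketch of the phenomenon this paper studies (not its algorithm), the snippet below runs FedAvg-style aggregation in which each client's availability follows its own two-state Markov chain, so participation is heterogeneous across clients and correlated over time; all names and parameters are assumed for illustration.

```python
import numpy as np

def fedavg_markov_availability(client_grads, p_stay_on, p_stay_off,
                               n_rounds=200, lr=0.1, d=2, seed=0):
    """FedAvg-style aggregation where each client's availability is a two-state Markov chain."""
    rng = np.random.default_rng(seed)
    n_clients = len(client_grads)
    w = np.zeros(d)
    available = np.ones(n_clients, dtype=bool)   # start with every client available
    for _ in range(n_rounds):
        active = np.flatnonzero(available)
        if active.size > 0:
            updates = np.array([client_grads[k](w) for k in active])
            w -= lr * updates.mean(axis=0)       # aggregate only the available clients
        # Correlated availability: each client keeps its current state with its own probability.
        stay = np.where(available, p_stay_on, p_stay_off)
        available = np.where(rng.random(n_clients) < stay, available, ~available)
    return w

# Two clients with different local optima and different availability dynamics (illustrative).
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
client_grads = [lambda w, t=t: w - t for t in targets]
print(fedavg_markov_availability(client_grads,
                                 p_stay_on=np.array([0.99, 0.5]),
                                 p_stay_off=np.array([0.01, 0.5])))
```

The result is pulled toward the optimum of the client that is available more often, which is exactly the kind of participation bias the availability model makes visible.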

Finite-time analysis of Markov gradient descent

TT Doan - IEEE Transactions on Automatic Control, 2022 - ieeexplore.ieee.org
Motivated by broad applications in system identification, stochastic control, and machine
learning, we study the popular stochastic gradient descent (SGD) when the gradient …

Online target Q-learning with reverse experience replay: Efficiently finding the optimal policy for linear MDPs

N Agarwal, S Chaudhuri, P Jain, D Nagaraj… - arXiv preprint arXiv …, 2021 - arxiv.org
Q-learning is a popular Reinforcement Learning (RL) algorithm which is widely used in
practice with function approximation (Mnih et al., 2015). In contrast, existing theoretical …
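
A hedged, tabular sketch of the ingredients named in the title: transitions are collected along one trajectory in short buffers, and each buffer is replayed in reverse order against a target table that stays frozen during the replay. The paper works with linear function approximation; the toy MDP here (the same one used in the asynchronous Q-learning sketch above) is only for illustration.

```python
import numpy as np

def q_learning_reverse_replay(P, R, gamma=0.9, n_buffers=500, buffer_len=20, lr=0.5, seed=0):
    """Q-learning with a frozen target and reverse experience replay (tabular sketch)."""
    rng = np.random.default_rng(seed)
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    s = 0
    for _ in range(n_buffers):
        buffer = []
        for _ in range(buffer_len):              # collect a consecutive chunk of the trajectory
            a = rng.integers(A)
            s_next = rng.choice(S, p=P[s, a])
            buffer.append((s, a, R[s, a], s_next))
            s = s_next
        Q_target = Q.copy()                      # freeze the target before replaying
        for (si, ai, ri, sni) in reversed(buffer):
            Q[si, ai] += lr * (ri + gamma * Q_target[sni].max() - Q[si, ai])
    return Q

# Toy 2-state, 2-action MDP, for illustration only.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])
print(q_learning_reverse_replay(P, R))
```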

A two-time-scale stochastic optimization framework with applications in control and reinforcement learning

S Zeng, TT Doan, J Romberg - SIAM Journal on Optimization, 2024 - SIAM
We study a new two-time-scale stochastic gradient method for solving optimization
problems, where the gradients are computed with the aid of an auxiliary variable under …
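
To illustrate the two-time-scale structure described here, a minimal sketch in which a fast auxiliary variable y tracks an unknown quantity from noisy observations while the slow decision variable x takes gradient steps that use y in place of that quantity; the scalar objective and step-size schedules are assumptions, not the paper's framework.

```python
import numpy as np

def two_time_scale_sgd(c=3.0, n_steps=100_000, seed=0):
    """Two-time-scale stochastic gradient sketch (illustrative scalar problem).

    The auxiliary variable y tracks an unknown quantity c on a fast time scale from noisy
    observations; the decision variable x takes gradient steps for f(x) = 0.5*(x - c)^2
    on a slow time scale, using y in place of the unknown c.
    """
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0
    for t in range(1, n_steps + 1):
        beta = 1.0 / t ** 0.6                    # fast step size (auxiliary variable)
        alpha = 1.0 / t ** 0.9                   # slow step size (decision variable)
        y += beta * (c + rng.normal() - y)       # fast scale: y tracks c from noisy samples
        x -= alpha * (x - y)                     # slow scale: gradient step using y, not c
    return x, y

print(two_time_scale_sgd())
```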