Distributed learning in wireless networks: Recent progress and future challenges

M Chen, D Gündüz, K Huang, W Saad… - IEEE Journal on …, 2021 - ieeexplore.ieee.org
The next generation of wireless networks will enable many machine learning (ML) tools and
applications to efficiently analyze various types of data collected by edge devices for …

Improved communication efficiency in federated natural policy gradient via ADMM-based gradient updates

G Lan, H Wang, J Anderson, C Brinton… - arXiv preprint arXiv …, 2023 - arxiv.org
Federated reinforcement learning (FedRL) enables agents to collaboratively train a global
policy without sharing their individual data. However, high communication overhead …

Byzantine-robust online and offline distributed reinforcement learning

Y Chen, X Zhang, K Zhang… - International …, 2023 - proceedings.mlr.press
We consider a distributed reinforcement learning setting where multiple agents separately
explore the environment and communicate their experiences through a central server …

Multi-agent multi-armed bandits with limited communication

M Agarwal, V Aggarwal, K Azizzadenesheli - Journal of Machine Learning …, 2022 - jmlr.org
We consider the problem where N agents collaboratively interact with an instance of a
stochastic K arm bandit problem for K > N. The agents aim to simultaneously minimize the …

Asynchronous federated reinforcement learning with policy gradient updates: Algorithm design and convergence analysis

G Lan, DJ Han, A Hashemi, V Aggarwal… - arXiv preprint arXiv …, 2024 - arxiv.org
To improve the efficiency of reinforcement learning, we propose a novel asynchronous
federated reinforcement learning framework termed AFedPG, which constructs a global …

Federated Q-learning: Linear regret speedup with low communication cost

Z Zheng, F Gao, L Xue, J Yang - arXiv preprint arXiv:2312.15023, 2023 - arxiv.org
In this paper, we consider federated reinforcement learning for tabular episodic Markov
Decision Processes (MDP) where, under the coordination of a central server, multiple …

Society of agents: Regret bounds of concurrent Thompson sampling

Y Chen, P Dong, Q Bai… - Advances in neural …, 2022 - proceedings.neurips.cc
We consider the concurrent reinforcement learning problem where $n$ agents
simultaneously learn to make decisions in the same environment by sharing experience with …

One policy is enough: Parallel exploration with a single policy is near-optimal for reward-free reinforcement learning

P Cisneros-Velarde, B Lyu… - International …, 2023 - proceedings.mlr.press
Although parallelism has been extensively used in Reinforcement Learning (RL), the
quantitative effects of parallel exploration are not well understood theoretically. We study the …

Federated reinforcement learning at the edge: Exploring the learning-communication tradeoff

K Gatsis - 2022 European Control Conference (ECC), 2022 - ieeexplore.ieee.org
Modern cyber-physical architectures use data collected from systems at different physical
locations to learn appropriate behaviors and adapt to uncertain environments. However, an …

Federated Q-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost

Z Zheng, H Zhang, L Xue - arXiv preprint arXiv:2405.18795, 2024 - arxiv.org
In this paper, we consider model-free federated reinforcement learning for tabular episodic
Markov decision processes. Under the coordination of a central server, multiple agents …