Communication efficient parallel reinforcement learning

M Chen, D Gündüz, K Huang, W Saad… - IEEE Journal on …, 2021 - ieeexplore.ieee.org

The next-generation of wireless networks will enable many machine learning (ML) tools and
applications to efficiently analyze various types of data collected by edge devices for …

Cited by 501 · Related articles · All 14 versions

[PDF] arxiv.org

Improved communication efficiency in federated natural policy gradient via admm-based gradient updates

G Lan, H Wang, J Anderson, C Brinton… - arXiv preprint arXiv …, 2023 - arxiv.org

Federated reinforcement learning (FedRL) enables agents to collaboratively train a global
policy without sharing their individual data. However, high communication overhead …

Cited by 26 · Related articles · All 7 versions

[PDF] mlr.press

Byzantine-robust online and offline distributed reinforcement learning

Y Chen, X Zhang, K Zhang… - International …, 2023 - proceedings.mlr.press

We consider a distributed reinforcement learning setting where multiple agents separately
explore the environment and communicate their experiences through a central server …

Cited by 20 · Related articles · All 6 versions

[PDF] jmlr.org

Multi-agent multi-armed bandits with limited communication

M Agarwal, V Aggarwal, K Azizzadenesheli - Journal of Machine Learning …, 2022 - jmlr.org

We consider the problem where N agents collaboratively interact with an instance of a
stochastic K arm bandit problem for K > N. The agents aim to simultaneously minimize the …

Cited by 45 · Related articles · All 6 versions

[PDF] arxiv.org

Asynchronous federated reinforcement learning with policy gradient updates: Algorithm design and convergence analysis

G Lan, DJ Han, A Hashemi, V Aggarwal… - arXiv preprint arXiv …, 2024 - arxiv.org

To improve the efficiency of reinforcement learning, we propose a novel asynchronous
federated reinforcement learning framework termed AFedPG, which constructs a global …

Cited by 17 · Related articles · All 2 versions

[PDF] arxiv.org

Federated Q-learning: Linear regret speedup with low communication cost

Z Zheng, F Gao, L Xue, J Yang - arXiv preprint arXiv:2312.15023, 2023 - arxiv.org

In this paper, we consider federated reinforcement learning for tabular episodic Markov
Decision Processes (MDP) where, under the coordination of a central server, multiple …

Cited by 9 · Related articles · All 3 versions

[PDF] neurips.cc

Society of agents: Regret bounds of concurrent Thompson sampling

Y Chen, P Dong, Q Bai… - Advances in neural …, 2022 - proceedings.neurips.cc

We consider the concurrent reinforcement learning problem where $n$ agents
simultaneously learn to make decisions in the same environment by sharing experience with …

Cited by 4 · Related articles · All 5 versions

[PDF] mlr.press

One policy is enough: Parallel exploration with a single policy is near-optimal for reward-free reinforcement learning

P Cisneros-Velarde, B Lyu… - International …, 2023 - proceedings.mlr.press

Although parallelism has been extensively used in Reinforcement Learning (RL), the
quantitative effects of parallel exploration are not well understood theoretically. We study the …

Cited by 5 · Related articles · All 6 versions

[PDF] ox.ac.uk

Federated reinforcement learning at the edge: Exploring the learning-communication tradeoff

K Gatsis - 2022 European Control Conference (ECC), 2022 - ieeexplore.ieee.org

Modern cyber-physical architectures use data collected from systems at different physical
locations to learn appropriate behaviors and adapt to uncertain environments. However, an …

Cited by 7 · Related articles · All 2 versions

[PDF] arxiv.org

Federated Q-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost

Z Zheng, H Zhang, L Xue - arXiv preprint arXiv:2405.18795, 2024 - arxiv.org

In this paper, we consider model-free federated reinforcement learning for tabular episodic
Markov decision processes. Under the coordination of a central server, multiple agents …

Cited by 3 · Related articles · All 2 versions
