Towards playing full moba games with deep reinforcement learning

M Nakamoto, S Zhai, A Singh… - Advances in …, 2024 - proceedings.neurips.cc

A compelling use case of offline reinforcement learning (RL) is to obtain a policy initialization
from existing datasets followed by fast online fine-tuning with limited interaction. However …

Cited by 98 - Related articles - All 7 versions

[PDF] arxiv.org

Open problems in cooperative ai

A Dafoe, E Hughes, Y Bachrach, T Collins… - arXiv preprint arXiv …, 2020 - arxiv.org

Problems of cooperation--in which agents seek ways to jointly improve their welfare--are
ubiquitous and important. They can be found at scales ranging from our daily routines--such …

Cited by 253 - Related articles - All 3 versions

[PDF] mlr.press

Douzero: Mastering doudizhu with self-play deep reinforcement learning

D Zha, J Xie, W Ma, S Zhang, X Lian… - … on machine learning, 2021 - proceedings.mlr.press

Games are abstractions of the real world, where artificial agents learn to compete and
cooperate with other agents. While significant achievements have been made in various …

Cited by 145 - Related articles - All 6 versions

[PDF] mlr.press

Scalable evaluation of multi-agent reinforcement learning with melting pot

JZ Leibo, EA Dueñez-Guzman… - International …, 2021 - proceedings.mlr.press

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess
generalization to novel situations as their primary objective (unlike supervised learning …

Cited by 98 - Related articles - All 6 versions

[PDF] neurips.cc

Unpacking reward shaping: Understanding the benefits of reward engineering on sample complexity

A Gupta, A Pacchiano, Y Zhai… - Advances in Neural …, 2022 - proceedings.neurips.cc

The success of reinforcement learning in a variety of challenging sequential decision-
making problems has been much discussed, but often ignored in this discussion is the …

Cited by 66 - Related articles - All 8 versions

[PDF] arxiv.org

Learning safe control for multi-robot systems: Methods, verification, and open challenges

K Garg, S Zhang, O So, C Dawson, C Fan - Annual Reviews in Control, 2024 - Elsevier

In this survey, we review the recent advances in control design methods for robotic multi-
agent systems (MAS), focusing on learning-based methods with safety considerations. We …

Cited by 12 - Related articles - All 4 versions

[PDF] mdpi.com

A review: machine learning for combinatorial optimization problems in energy areas

X Yang, Z Wang, H Zhang, N Ma, N Yang, H Liu… - Algorithms, 2022 - mdpi.com

Combinatorial optimization problems (COPs) are a class of NP-hard problems with great
practical significance. Traditional approaches for COPs suffer from high computational time …

Cited by 27 - Related articles - All 3 versions

[PDF] arxiv.org

A survey on transformers in reinforcement learning

W Li, H Luo, Z Lin, C Zhang, Z Lu, D Ye - arXiv preprint arXiv:2301.03044, 2023 - arxiv.org

Transformer has been considered the dominating neural architecture in NLP and CV, mostly
under supervised settings. Recently, a similar surge of using Transformers has appeared in …

Cited by 69 - Related articles - All 3 versions

Attacking deep reinforcement learning with decoupled adversarial policy

K Mo, W Tang, J Li, X Yuan - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

While Deep Reinforcement Learning (DRL) has achieved outstanding performance in
extensive applications, exploiting its vulnerability with adversarial attacks is essential …

Cited by 79 - Related articles - All 2 versions

[PDF] neurips.cc

Towards unifying behavioral and response diversity for open-ended learning in zero-sum games

X Liu, H Jia, Y Wen, Y Hu, Y Chen… - Advances in …, 2021 - proceedings.neurips.cc

Measuring and promoting policy diversity is critical for solving games with strong non-
transitive dynamics where strategic cycles exist, and there is no consistent winner (eg, Rock …

Cited by 51 - Related articles - All 5 versions
