- Academic resource search

Mastering complex control in MOBA games with deep reinforcement learning

D Ye, Z Liu, M Sun, B Shi, P Zhao, H Wu, H Yu… - Proceedings of the AAAI …, 2020 - aaai.org

We study the reinforcement learning problem of complex action control in the Multi-player
Online Battle Arena (MOBA) 1v1 games. This problem involves far more complicated state …

Cited by 360 · Related articles · All 10 versions

[PDF] neurips.cc

Towards playing full MOBA games with deep reinforcement learning

D Ye, G Chen, W Zhang, S Chen… - Advances in …, 2020 - proceedings.neurips.cc

MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose grand challenges
to AI systems such as multi-agent, enormous state-action space, complex action control, etc …

Cited by 212 · Related articles · All 7 versions

[PDF] arxiv.org

Supervised learning achieves human-level performance in MOBA games: A case study of Honor of Kings

D Ye, G Chen, P Zhao, F Qiu, B Yuan… - … on Neural Networks …, 2020 - ieeexplore.ieee.org

We present JueWu-SL, the first supervised-learning-based artificial intelligence (AI) program
that achieves human-level performance in playing multiplayer online battle arena (MOBA) …

Cited by 63 · Related articles · All 8 versions

[PDF] mlr.press

Striving for simplicity and performance in off-policy DRL: Output normalization and non-uniform sampling

C Wang, Y Wu, Q Vuong… - … Conference on Machine …, 2020 - proceedings.mlr.press

We aim to develop off-policy DRL algorithms that not only exceed state-of-the-art
performance but are also simple and minimalistic. For standard continuous control …

Cited by 48 · Related articles · All 9 versions

[PDF] arxiv.org

Regularly updated deterministic policy gradient algorithm

S Han, W Zhou, S Lü, J Yu - Knowledge-Based Systems, 2021 - Elsevier

Deep Deterministic Policy Gradient (DDPG) algorithm is one of the most well-known
reinforcement learning methods. However, this method is inefficient and unstable in practical …

Cited by 25 · Related articles · All 3 versions

[PDF] neurips.cc

Factored policy gradients: Leveraging structure for efficient learning in MOMDPs

T Spooner, N Vadori, S Ganesh - Advances in Neural …, 2021 - proceedings.neurips.cc

Policy gradient methods can solve complex tasks but often fail when the dimensionality of
the action-space or objective multiplicity grow very large. This occurs, in part, because the …

Cited by 12 · Related articles · All 9 versions

[PDF] mi-research.net

Guided Proximal Policy Optimization with Structured Action Graph for Complex Decision-making

Y Yang, D Xing, W Xia, P Wang - Machine Intelligence Research, 2025 - Springer

Reinforcement learning encounters formidable challenges when tasked with intricate
decision-making scenarios, primarily due to the expansive parameterized action spaces and …

Towards simplicity in deep reinforcement learning: Streamlined off-policy learning

C Wang, Y Wu, Q Vuong, K Ross - 2019 - openreview.net

The field of Deep Reinforcement Learning (DRL) has recently seen a surge in the popularity
of maximum entropy reinforcement learning algorithms. Their popularity stems from the …

Cited by 4 · Related articles

[PDF] proquest.com

[BOOK][B] Sample-Efficient Deep Reinforcement Learning for Continuous Control

C Wang - 2023 - search.proquest.com

Sample efficiency in deep reinforcement learning (DRL) is measured by the amount of new
data collected to learn a task. It is one of the most important research topics in DRL …

[PDF] openreview.net

Neural Ordinary Differential Equation Value Networks for Parametrized Action Spaces

S Massaroli, M Poli, S Bakhtiyarov… - ICLR 2020 Workshop …, 2020 - openreview.net

Action spaces equipped with parameter sets are a common occurrence in reinforcement
learning applications. Solutions to problems of this class have been developed under …

Cited by 2 · Related articles
