Mastering complex control in MOBA games with deep reinforcement learning

D Ye, Z Liu, M Sun, B Shi, P Zhao, H Wu, H Yu… - Proceedings of the AAAI …, 2020 - aaai.org
We study the reinforcement learning problem of complex action control in the Multi-player
Online Battle Arena (MOBA) 1v1 games. This problem involves far more complicated state …

Towards playing full MOBA games with deep reinforcement learning

D Ye, G Chen, W Zhang, S Chen… - Advances in …, 2020 - proceedings.neurips.cc
MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose grand challenges
to AI systems such as multi-agent, enormous state-action space, complex action control, etc …

Supervised learning achieves human-level performance in MOBA games: A case study of Honor of Kings

D Ye, G Chen, P Zhao, F Qiu, B Yuan… - … on Neural Networks …, 2020 - ieeexplore.ieee.org
We present JueWu-SL, the first supervised-learning-based artificial intelligence (AI) program
that achieves human-level performance in playing multiplayer online battle arena (MOBA) …

Striving for simplicity and performance in off-policy DRL: Output normalization and non-uniform sampling

C Wang, Y Wu, Q Vuong… - … Conference on Machine …, 2020 - proceedings.mlr.press
We aim to develop off-policy DRL algorithms that not only exceed state-of-the-art
performance but are also simple and minimalistic. For standard continuous control …

Regularly updated deterministic policy gradient algorithm

S Han, W Zhou, S Lü, J Yu - Knowledge-Based Systems, 2021 - Elsevier
Deep Deterministic Policy Gradient (DDPG) algorithm is one of the most well-known
reinforcement learning methods. However, this method is inefficient and unstable in practical …

Factored policy gradients: Leveraging structure for efficient learning in MOMDPs

T Spooner, N Vadori, S Ganesh - Advances in Neural …, 2021 - proceedings.neurips.cc
Policy gradient methods can solve complex tasks but often fail when the dimensionality of
the action-space or objective multiplicity grow very large. This occurs, in part, because the …

Guided Proximal Policy Optimization with Structured Action Graph for Complex Decision-making

Y Yang, D Xing, W Xia, P Wang - Machine Intelligence Research, 2025 - Springer
Reinforcement learning encounters formidable challenges when tasked with intricate
decision-making scenarios, primarily due to the expansive parameterized action spaces and …

Towards simplicity in deep reinforcement learning: Streamlined off-policy learning

C Wang, Y Wu, Q Vuong, K Ross - 2019 - openreview.net
The field of Deep Reinforcement Learning (DRL) has recently seen a surge in the popularity
of maximum entropy reinforcement learning algorithms. Their popularity stems from the …

[BOOK][B] Sample-Efficient Deep Reinforcement Learning for Continuous Control

C Wang - 2023 - search.proquest.com
Sample efficiency in deep reinforcement learning (DRL) is measured by the amount of new
data collected to learn a task. It is one of the most important research topics in DRL …

Neural Ordinary Differential Equation Value Networks for Parametrized Action Spaces

S Massaroli, M Poli, S Bakhtiyarov… - ICLR 2020 Workshop …, 2020 - openreview.net
Action spaces equipped with parameter sets are a common occurrence in reinforcement
learning applications. Solutions to problems of this class have been developed under …