Discrete sequential prediction of continuous actions for deep rl

K Arulkumaran, MP Deisenroth… - IEEE Signal …, 2017 - ieeexplore.ieee.org

Deep reinforcement learning (DRL) is poised to revolutionize the field of artificial intelligence
(AI) and represents a step toward building autonomous systems with a higher-level …

Cited by 3491 - Related articles - All 6 versions

[PDF] arxiv.org

A brief survey of deep reinforcement learning

K Arulkumaran, MP Deisenroth, M Brundage… - arXiv preprint arXiv …, 2017 - arxiv.org

Deep reinforcement learning is poised to revolutionise the field of AI and represents a step
towards building autonomous systems with a higher level understanding of the visual world …

Cited by 1042 - Related articles - All 12 versions

[PDF] mlr.press

Q-transformer: Scalable offline reinforcement learning via autoregressive q-functions

Y Chebotar, Q Vuong, K Hausman… - … on Robot Learning, 2023 - proceedings.mlr.press

In this work, we present a scalable reinforcement learning method for training multi-task
policies from large offline datasets that can leverage both human demonstrations and …

Cited by 52 - Related articles - All 6 versions

Grandmaster level in StarCraft II using multi-agent reinforcement learning

O Vinyals, I Babuschkin, WM Czarnecki, M Mathieu… - nature, 2019 - nature.com

Many real-world applications require artificial agents to compete and coordinate with other
agents in complex environments. As a stepping stone to this goal, the domain of StarCraft …

Cited by 4598 - Related articles - All 11 versions

[PDF] neurips.cc

Hindsight experience replay

M Andrychowicz, F Wolski, A Ray… - Advances in neural …, 2017 - proceedings.neurips.cc

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning
(RL). We present a novel technique called Hindsight Experience Replay which allows …

Cited by 2854 - Related articles - All 6 versions

[PDF] neurips.cc

A generalized algorithm for multi-objective reinforcement learning and policy adaptation

R Yang, X Sun, K Narasimhan - Advances in neural …, 2019 - proceedings.neurips.cc

We introduce a new algorithm for multi-objective reinforcement learning (MORL) with linear
preferences, with the goal of enabling few-shot adaptation to new tasks. In MORL, the aim is …

Cited by 277 - Related articles - All 11 versions

[PDF] arxiv.org

Deep reinforcement learning based volt-var optimization in smart distribution systems

Y Zhang, X Wang, J Wang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

This paper develops a model-free volt-VAR optimization (VVO) algorithm via multi-agent
deep reinforcement learning (DRL) in unbalanced distribution systems. This method is novel …

Cited by 208 - Related articles - All 6 versions

[PDF] neurips.cc

Language as an abstraction for hierarchical deep reinforcement learning

Y Jiang, SS Gu, KP Murphy… - Advances in Neural …, 2019 - proceedings.neurips.cc

Solving complex, temporally-extended tasks is a long-standing problem in reinforcement
learning (RL). We hypothesize that one critical element of solving such problems is the …

Cited by 237 - Related articles - All 9 versions

[PDF] aaai.org

Action branching architectures for deep reinforcement learning

A Tavakoli, F Pardo, P Kormushev - … of the aaai conference on artificial …, 2018 - ojs.aaai.org

Discrete-action algorithms have been central to numerous recent successes of deep
reinforcement learning. However, applying these algorithms to high-dimensional action …

Cited by 304 - Related articles - All 14 versions

[PDF] arxiv.org

Temporal difference models: Model-free deep rl for model-based control

V Pong, S Gu, M Dalal, S Levine - arXiv preprint arXiv:1802.09081, 2018 - arxiv.org

Model-free reinforcement learning (RL) is a powerful, general tool for learning complex
behaviors. However, its sample efficiency is often impractically large for solving challenging …

Cited by 294 - Related articles - All 7 versions
