deep reinforcement high level policies - Academic Resource Search

Reinforcement learning with deep energy-based policies

T Haarnoja, H Tang, P Abbeel… - … conference on machine …, 2017 - proceedings.mlr.press

… Deep reinforcement learning (deep RL) has emerged as a … entropy policies with approximate
inference for reinforcement … Our reinforcement learning problem can be defined as policy …

Cited by 1411 Related articles All 5 versions

Approximate policy-based accelerated deep reinforcement learning

X Wang, Y Gu, Y Cheng, A Liu… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org

… The proposed APA is proven to be convergent even with a more aggressive learning rate, …
algorithm with deep Q-network (DQN), Double DQN and deep deterministic policy gradient (…

Cited by 52 Related articles All 4 versions

[PDF] arxiv.org

Illuminating generalization in deep reinforcement learning through procedural level generation

N Justesen, RR Torrado, P Bontrager, A Khalifa… - arXiv preprint arXiv …, 2018 - arxiv.org

… in deep reinforcement learning research (e.g. the Arcade Learning Environment [3]). Our
findings suggest that policies … on a higher level, to the distribution of generated levels presented …

Cited by 223 Related articles All 12 versions

[PDF] arxiv.org

What matters in on-policy reinforcement learning? a large-scale empirical study

M Andrychowicz, A Raichuk, P Stańczyk… - arXiv preprint arXiv …, 2020 - arxiv.org

… Deep reinforcement learning (RL) has seen increased interest in recent years due to its …
the importance of hyperparameter tuning and the high level of stochasticity due to random seeds…

Cited by 211 Related articles All 5 versions

[PDF] mlr.press

Benchmarking deep reinforcement learning for continuous control

Y Duan, X Chen, R Houthooft… - International …, 2016 - proceedings.mlr.press

… policies, we use the notation µθ : S→A to denote the policy … , where higher level decisions
can reuse lower level skills (… where both low-level motor controls and high-level decisions are …

Cited by 2022 Related articles All 14 versions

[PDF] arxiv.org

How to discount deep reinforcement learning: Towards new dynamic strategies

V François-Lavet, R Fonteneau, D Ernst - arXiv preprint arXiv:1512.02011, 2015 - arxiv.org

… Following the ideas discussed above, Figure 7 represents the general update scheme that
we propose to further improve the performance of deep reinforcement learning algorithms. …

Cited by 143 Related articles All 4 versions

[PDF] arxiv.org

Supervised policy update for deep reinforcement learning

Q Vuong, Y Zhang, KW Ross - arXiv preprint arXiv:1805.11706, 2018 - arxiv.org

… Policy Update (SPU), for deep reinforcement learning. Starting with data generated by the
current policy… in the non-parameterized proximal policy space. Using supervised regression, it …

Cited by 27 Related articles All 4 versions

[PDF] arxiv.org

Automated lane change strategy using proximal policy optimization-based deep reinforcement learning

F Ye, X Cheng, P Wang, CY Chan… - 2020 IEEE Intelligent …, 2020 - ieeexplore.ieee.org

… Using SUMO and its associated traffic control interface (TraCI), we can access the vehicle
information in the road network, and execute high-level decisions in the learner model and take …

Cited by 113 Related articles All 4 versions

Hierarchical deep reinforcement learning for continuous action control

Z Yang, K Merrick, L Jin… - IEEE transactions on …, 2018 - ieeexplore.ieee.org

… -deep deterministic policy gradient (h-DDPG). The proposed algorithm comprises two levels
of … learning methods to further improve the performance of top level hierarchy (which they …

Cited by 166 Related articles All 5 versions

[PDF] arxiv.org

Accelerated methods for deep reinforcement learning

A Stooke, P Abbeel - arXiv preprint arXiv:1803.02811, 2018 - arxiv.org

… deep RL algorithms for modern computers, specifically for a combination of CPUs and GPUs.
We confirm that both policy … DGX-1 to learn successful strategies in Atari games in mere …

Cited by 138 Related articles All 2 versions
