Deep reinforcement learning: An overview

Y Li - arXiv preprint arXiv:1701.07274, 2017 - arxiv.org
… , function approximation, policy optimization, deep RL, RL … To have a good understanding
of deep reinforcement learning, … a top level action value function and a lower level action …

Deep reinforcement learning: A brief survey

K Arulkumaran, MP Deisenroth… - IEEE Signal …, 2017 - ieeexplore.ieee.org
… ), policies could also run other policies (multi-time-step “actions”) [79]. This approach allows
top-level policies to focus on higher-level … by using one top-level policy that chooses between …

Using deep reinforcement learning to learn high-level policies on the ATRIAS biped

T Li, H Geyer, CG Atkeson, A Rai - … International Conference on …, 2019 - ieeexplore.ieee.org
… In this work, we used deep reinforcement learning to learn two neural network policies to …
One of the policies uses a general neural network, while the second builds on the structure …

A brief survey of deep reinforcement learning

K Arulkumaran, MP Deisenroth, M Brundage… - arXiv preprint arXiv …, 2017 - arxiv.org
… with a higher level understanding of the visual world. Currently, … and policy-based methods.
Our survey will cover central algorithms in deep reinforcement learning, including the deep Q-…

Language as an abstraction for hierarchical deep reinforcement learning

Y Jiang, SS Gu, KP Murphy… - Advances in Neural …, 2019 - proceedings.neurips.cc
… , particularly when combined with existing reinforcement learning algorithms. We explore
how we might incorporate a language model into the high level policy in Appendix A, which …

EDGE: Explaining deep reinforcement learning policies

W Guo, X Wu, U Khan, X Xing - Advances in Neural …, 2021 - proceedings.neurips.cc
… At a high level, our method identifies the important time steps by approximating the target
agent’s decision-making process with a self-explainable model and extracting the explanations …

Multi-level policy and reward-based deep reinforcement learning framework for image captioning

N Xu, H Zhang, AA Liu, W Nie, Y Su… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
… multi-level policy and reward reinforcement learning framework … -level policy network aims
to jointly update the word- and sentence-level policies for word generation, and the multi-level

Human-level control through deep reinforcement learning

V Mnih, K Kavukcuoglu, D Silver, AA Rusu, J Veness… - Nature, 2015 - nature.com
high-dimensional data (colour video at 60 Hz) as input—to demonstrate that our approach
robustly learns successful policies … to break through to the top level of bricks and the value …

DeepLoco: Dynamic locomotion skills using hierarchical deep reinforcement learning

XB Peng, G Berseth, KK Yin… - ACM Transactions on …, 2017 - dl.acm.org
deep reinforcement learning (RL) to learn control policies at both timescales. The use of deep
… of objective functions for low-level and high-level policies. Taken together, the hierarchical …

Adversarial policies: Attacking deep reinforcement learning

A Gleave, M Dennis, C Wild, N Kant, S Levine… - arXiv preprint arXiv …, 2019 - arxiv.org
policies are more successful in high-dimensional environments, and induce substantially
different activations in the victim policy … Additionally, we find policies are easier to attack in high-…