How to train your robot with deep reinforcement learning: lessons we have learned

J Ibarz, J Tan, C Finn, M Kalakrishnan… - … Journal of Robotics …, 2021 - journals.sagepub.com
Deep reinforcement learning (RL) has emerged as a promising approach for autonomously
acquiring complex behaviors from low-level sensor observations. Although a large portion of …

A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Meta-learning in neural networks: A survey

T Hospedales, A Antoniou, P Micaelli… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent
years. Contrary to conventional approaches to AI where tasks are solved from scratch using …

Replay-guided adversarial environment design

M Jiang, M Dennis, J Parker-Holder… - Advances in …, 2021 - proceedings.neurips.cc
Deep reinforcement learning (RL) agents may successfully generalize to new settings if
trained on an appropriately diverse set of environment and task configurations …

Automatic curriculum learning for deep RL: A short survey

R Portelas, C Colas, L Weng, K Hofmann… - arXiv preprint arXiv …, 2020 - arxiv.org
Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in
Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of …

Offline meta-reinforcement learning with online self-supervision

VH Pong, AV Nair, LM Smith… - … on Machine Learning, 2022 - proceedings.mlr.press
Meta-reinforcement learning (RL) methods can meta-train policies that adapt to new tasks
with orders of magnitude less data than standard RL, but meta-training itself is costly and …

NovelD: A simple yet effective exploration criterion

T Zhang, H Xu, X Wang, Y Wu… - Advances in …, 2021 - proceedings.neurips.cc
Efficient exploration under sparse rewards remains a key challenge in deep reinforcement
learning. Previous exploration methods (e.g., RND) have achieved strong results in multiple …

Explore, discover and learn: Unsupervised discovery of state-covering skills

V Campos, A Trott, C Xiong, R Socher… - International …, 2020 - proceedings.mlr.press
Acquiring abilities in the absence of a task-oriented reward function is at the frontier of
reinforcement learning research. This problem has been studied through the lens of …

One solution is not all you need: Few-shot extrapolation via structured MaxEnt RL

S Kumar, A Kumar, S Levine… - Advances in Neural …, 2020 - proceedings.neurips.cc
While reinforcement learning algorithms can learn effective policies for complex tasks, these
policies are often brittle to even minor task variations, especially when variations are not …

Hierarchical reinforcement learning by discovering intrinsic options

J Zhang, H Yu, W Xu - arXiv preprint arXiv:2101.06521, 2021 - arxiv.org
We propose a hierarchical reinforcement learning method, HIDIO, that can learn
task-agnostic options in a self-supervised manner while jointly learning to utilize them to solve …