A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Minigrid & Miniworld: Modular & customizable reinforcement learning environments for goal-oriented tasks

M Chevalier-Boisvert, B Dai… - Advances in …, 2024 - proceedings.neurips.cc
We present the Minigrid and Miniworld libraries which provide a suite of goal-oriented 2D
and 3D environments. The libraries were explicitly created with a minimalistic design …

Evolving curricula with regret-based environment design

J Parker-Holder, M Jiang, M Dennis… - International …, 2022 - proceedings.mlr.press
Training generally-capable agents with reinforcement learning (RL) remains a significant
challenge. A promising avenue for improving the robustness of RL agents is through the use …

A comprehensive survey of data augmentation in visual reinforcement learning

G Ma, Z Wang, Z Yuan, X Wang, B Yuan… - arXiv preprint arXiv …, 2022 - arxiv.org
Visual reinforcement learning (RL), which makes decisions directly from high-dimensional
visual inputs, has demonstrated significant potential in various domains. However …

Contrastive behavioral similarity embeddings for generalization in reinforcement learning

R Agarwal, MC Machado, PS Castro… - arXiv preprint arXiv …, 2021 - arxiv.org
Reinforcement learning methods trained on few environments rarely learn policies that
generalize to unseen environments. To improve generalization, we incorporate the inherent …

Graph information bottleneck for subgraph recognition

J Yu, T Xu, Y Rong, Y Bian, J Huang, R He - arXiv preprint arXiv …, 2020 - arxiv.org
Given the input graph and its label/property, several key problems of graph learning, such as
finding interpretable subgraphs, graph denoising and graph compression, can be attributed …

Why generalization in RL is difficult: Epistemic POMDPs and implicit partial observability

D Ghosh, J Rahme, A Kumar, A Zhang… - Advances in neural …, 2021 - proceedings.neurips.cc
Generalization is a central challenge for the deployment of reinforcement learning (RL)
systems in the real world. In this paper, we show that the sequential structure of the RL …

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Improving generalization in reinforcement learning with mixture regularization

K Wang, B Kang, J Shao… - Advances in Neural …, 2020 - proceedings.neurips.cc
Deep reinforcement learning (RL) agents trained in a limited set of environments tend to
suffer overfitting and fail to generalize to unseen testing environments. To improve their …

Decoupling value and policy for generalization in reinforcement learning

R Raileanu, R Fergus - International Conference on …, 2021 - proceedings.mlr.press
Standard deep reinforcement learning algorithms use a shared representation for the policy
and value function, especially when training directly from images. However, we argue that …