Generalization in reinforcement learning with selective noise injection and information bottleneck

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 360 - Related articles - All 9 versions

Minigrid & Miniworld: Modular & customizable reinforcement learning environments for goal-oriented tasks

M Chevalier-Boisvert, B Dai… - Advances in …, 2024 - proceedings.neurips.cc

We present the Minigrid and Miniworld libraries which provide a suite of goal-oriented 2D
and 3D environments. The libraries were explicitly created with a minimalistic design …

Cited by 132 - Related articles - All 7 versions

Evolving curricula with regret-based environment design

J Parker-Holder, M Jiang, M Dennis… - International …, 2022 - proceedings.mlr.press

Training generally-capable agents with reinforcement learning (RL) remains a significant
challenge. A promising avenue for improving the robustness of RL agents is through the use …

Cited by 110 - Related articles - All 5 versions

A comprehensive survey of data augmentation in visual reinforcement learning

G Ma, Z Wang, Z Yuan, X Wang, B Yuan… - arXiv preprint arXiv …, 2022 - arxiv.org

Visual reinforcement learning (RL), which makes decisions directly from high-dimensional
visual inputs, has demonstrated significant potential in various domains. However …

Cited by 27 - Related articles - All 3 versions

Contrastive behavioral similarity embeddings for generalization in reinforcement learning

R Agarwal, MC Machado, PS Castro… - arXiv preprint arXiv …, 2021 - arxiv.org

Reinforcement learning methods trained on few environments rarely learn policies that
generalize to unseen environments. To improve generalization, we incorporate the inherent …

Cited by 200 - Related articles - All 11 versions

Graph information bottleneck for subgraph recognition

J Yu, T Xu, Y Rong, Y Bian, J Huang, R He - arXiv preprint arXiv …, 2020 - arxiv.org

Given the input graph and its label/property, several key problems of graph learning, such as
finding interpretable subgraphs, graph denoising and graph compression, can be attributed …

Cited by 167 - Related articles - All 3 versions

Why generalization in RL is difficult: Epistemic POMDPs and implicit partial observability

D Ghosh, J Rahme, A Kumar, A Zhang… - Advances in neural …, 2021 - proceedings.neurips.cc

Generalization is a central challenge for the deployment of reinforcement learning (RL)
systems in the real world. In this paper, we show that the sequential structure of the RL …

Cited by 112 - Related articles - All 10 versions

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org

Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Cited by 104 - Related articles - All 4 versions

Improving generalization in reinforcement learning with mixture regularization

K Wang, B Kang, J Shao… - Advances in Neural …, 2020 - proceedings.neurips.cc

Deep reinforcement learning (RL) agents trained in a limited set of environments tend to
suffer overfitting and fail to generalize to unseen testing environments. To improve their …

Cited by 126 - Related articles - All 5 versions

Decoupling value and policy for generalization in reinforcement learning

R Raileanu, R Fergus - International Conference on …, 2021 - proceedings.mlr.press

Standard deep reinforcement learning algorithms use a shared representation for the policy
and value function, especially when training directly from images. However, we argue that …

Cited by 105 - Related articles - All 6 versions
