Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

Scaling laws for reward model overoptimization

L Gao, J Schulman, J Hilton - International Conference on …, 2023 - proceedings.mlr.press
In reinforcement learning from human feedback, it is common to optimize against a reward
model trained to predict human preferences. Because the reward model is an imperfect …

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Leveraging procedural generation to benchmark reinforcement learning

K Cobbe, C Hesse, J Hilton… - … conference on machine …, 2020 - proceedings.mlr.press
Abstract We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like
environments designed to benchmark both sample efficiency and generalization in …

Quantifying generalization in reinforcement learning

K Cobbe, O Klimov, C Hesse, T Kim… - … on machine learning, 2019 - proceedings.mlr.press
In this paper, we investigate the problem of overfitting in deep reinforcement learning.
Among the most common benchmarks in RL, it is customary to use the same environments …

A comprehensive survey of data augmentation in visual reinforcement learning

G Ma, Z Wang, Z Yuan, X Wang, B Yuan… - arXiv preprint arXiv …, 2022 - arxiv.org
Visual reinforcement learning (RL), which makes decisions directly from high-dimensional
visual inputs, has demonstrated significant potential in various domains. However …

Contrastive behavioral similarity embeddings for generalization in reinforcement learning

R Agarwal, MC Machado, PS Castro… - arXiv preprint arXiv …, 2021 - arxiv.org
Reinforcement learning methods trained on few environments rarely learn policies that
generalize to unseen environments. To improve generalization, we incorporate the inherent …

Stabilizing deep q-learning with convnets and vision transformers under data augmentation

N Hansen, H Su, X Wang - Advances in neural information …, 2021 - proceedings.neurips.cc
While agents trained by Reinforcement Learning (RL) can solve increasingly challenging
tasks directly from visual observations, generalizing learned skills to novel environments …

Deep reinforcement learning

SE Li - Reinforcement learning for sequential decision and …, 2023 - Springer
Similar to humans, RL agents use interactive learning to successfully obtain satisfactory
decision strategies. However, in many cases, it is desirable to learn directly from …