Visual reinforcement learning (RL), which makes decisions directly from high-dimensional visual inputs, has demonstrated significant potential in various domains. However …
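The setting described in the snippet above, a policy that maps raw high-dimensional pixels directly to actions, can be sketched minimally as follows. The observation shape, action count, and the random linear "encoder plus policy head" are illustrative assumptions, not any specific published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C = 84, 84, 3   # a typical pixel-observation size in visual RL (assumed)
N_ACTIONS = 4         # hypothetical discrete action space

# Stand-in for a learned encoder + policy head: one linear map from
# flattened pixels to action logits.
policy_weights = rng.normal(scale=0.01, size=(H * W * C, N_ACTIONS))

def act(observation: np.ndarray) -> int:
    """Choose an action directly from pixels (greedy w.r.t. the logits)."""
    features = observation.reshape(-1)   # no hand-crafted state: raw pixels in
    logits = features @ policy_weights
    return int(np.argmax(logits))

obs = rng.uniform(0.0, 1.0, size=(H, W, C))  # a fake camera frame
action = act(obs)
print(action)  # an integer in [0, N_ACTIONS)
```

The point of the sketch is only the interface: decisions are computed from the image itself, with no access to underlying simulator state.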

Reinforcement learning methods trained on few environments rarely learn policies that generalize to unseen environments. To improve generalization, we incorporate the inherent …

N Hansen, H Su, X Wang - Advances in neural information …, 2021 - proceedings.neurips.cc
While agents trained by Reinforcement Learning (RL) can solve increasingly challenging tasks directly from visual observations, generalizing learned skills to novel environments …

D Ghosh, J Rahme, A Kumar, A Zhang… - Advances in neural …, 2021 - proceedings.neurips.cc
Generalization is a central challenge for the deployment of reinforcement learning (RL) systems in the real world. In this paper, we show that the sequential structure of the RL …

R Shah, V Kumar - arXiv preprint arXiv:2107.03380, 2021 - arxiv.org
The ability to autonomously learn behaviors via direct interactions in uninstrumented environments can lead to generalist robots capable of enhancing productivity or providing …

F Deng, I Jang, S Ahn - International conference on machine …, 2022 - proceedings.mlr.press
Reconstruction-based Model-Based Reinforcement Learning (MBRL) agents, such as Dreamer, often fail to discard task-irrelevant visual distractions that are prevalent in …

Z Yuan, S Yang, P Hua, C Chang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Visual Reinforcement Learning (Visual RL), coupled with high-dimensional observations, has consistently confronted the long-standing challenge of out-of-distribution …

X Fu, G Yang, P Agrawal… - … Conference on Machine …, 2021 - proceedings.mlr.press
Current model-based reinforcement learning methods struggle when operating from complex visual scenes due to their inability to prioritize task-relevant features. To mitigate …

Visual model-based RL methods typically encode image observations into low-dimensional representations in a manner that does not eliminate redundant information. This leaves them …
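Several snippets above make the same point: a pixel-reconstruction objective weights every pixel equally, so task-irrelevant regions that fill most of the frame dominate the loss and the learned representation retains redundant information. A toy numeric sketch (not any specific agent's implementation; the frame size, object mask, and noise scale are assumptions) makes this concrete:

```python
import numpy as np

rng = np.random.default_rng(1)

frame = rng.uniform(size=(64, 64))          # fake 64x64 grayscale observation
task_mask = np.zeros((64, 64), dtype=bool)
task_mask[28:36, 28:36] = True              # assume an 8x8 task-relevant object

# An imperfect reconstruction: uniform per-pixel error everywhere.
recon = frame + rng.normal(scale=0.1, size=frame.shape)
sq_err = (recon - frame) ** 2

task_loss = sq_err[task_mask].sum()         # 64 task-relevant pixels
distractor_loss = sq_err[~task_mask].sum()  # 4032 background pixels
print(distractor_loss > task_loss)          # prints True: background dominates
```

Since the squared error is roughly uniform per pixel, the total is proportional to pixel count, so the background contributes about 60x more loss than the object. A model optimized this way has every incentive to reconstruct distractors faithfully, which is the failure mode the Deng et al. and Fu et al. snippets describe.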