We propose a new policy representation based on score-based diffusion models (SDMs). We apply our new policy representation in the domain of Goal-Conditioned Imitation …
Unsupervised pre-training has recently become the bedrock for computer vision and natural language processing. In reinforcement learning (RL), goal-conditioned RL can potentially …
Offline reinforcement learning (RL) provides a promising direction to exploit massive amount of offline data for complex decision-making tasks. Due to the distribution shift issue, current …
Offline reinforcement learning (RL) methods can generally be categorized into two types: RL-based and Imitation-based. RL-based methods could in principle enjoy out-of-distribution …
W Li, X Wang, B Jin, H Zha - International Conference on …, 2023 - proceedings.mlr.press
Offline reinforcement learning typically introduces a hierarchical structure to solve the long-horizon problem so as to address its thorny issue of variance accumulation. Problems of …
While large-scale sequence modeling from offline data has led to impressive performance gains in natural language and image generation, directly translating such ideas to robotics …
J Liu, H Zhang, Z Zhuang, Y Kang… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we decouple the iterative bi-level offline RL (value estimation and policy extraction) from the offline training phase, forming a non-iterative bi-level paradigm and …
H Sun, L Han, R Yang, X Ma… - Advances in neural …, 2022 - proceedings.neurips.cc
In this work, we study the simple yet universally applicable case of reward shaping in value-based Deep Reinforcement Learning (DRL). We show that reward shifting in the form of a …
JY Ma, J Yan, D Jayaraman… - Advances in neural …, 2022 - proceedings.neurips.cc
Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill learning in the form of reaching diverse goals from purely offline datasets. We propose …