Sample-efficiency guarantees for offline reinforcement learning (RL) often rely on strong assumptions on both the function classes (e.g., Bellman-completeness) and the data …
L Shi, G Li, Y Wei, Y Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper investigates model robustness in reinforcement learning (RL) via the framework of distributionally robust Markov decision processes (RMDPs). Despite recent efforts, the …
Offline reinforcement learning (RL) enables learning a decision-making policy without interaction with the environment. This makes it particularly beneficial in situations where …
J Guan, G Chen, J Ji, L Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Offline safe reinforcement learning (RL) algorithms promise to learn policies that satisfy safety constraints directly in offline datasets without interacting with the environment. This …
J Liu, H Zhang, Z Zhuang, Y Kang… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we decouple the iterative bi-level offline RL (value estimation and policy extraction) from the offline training phase, forming a non-iterative bi-level paradigm and …
A Huang, J Chen, N Jiang - International Conference on …, 2023 - proceedings.mlr.press
MDPs with low-rank transitions—that is, the transition matrix can be factored into the product of two matrices, left and right—is a highly representative structure that enables tractable …
Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-making using a pre-collected dataset, without further interaction with the environment …
Offline reinforcement learning (RL), which refers to decision-making from a previously-collected dataset of interactions, has received significant attention over the past years. Much …
JY Ma, J Yan, D Jayaraman… - Advances in neural …, 2022 - proceedings.neurips.cc
Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill learning in the form of reaching diverse goals from purely offline datasets. We propose …