A survey of imitation learning: Algorithms, recent developments, and challenges

M Zare, PM Kebria, A Khosravi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In recent years, the development of robotics and artificial intelligence (AI) systems has been
nothing short of remarkable. As these systems continue to evolve, they are being utilized in …

Offline reinforcement learning with realizability and single-policy concentrability

W Zhan, B Huang, A Huang… - … on Learning Theory, 2022 - proceedings.mlr.press
Sample-efficiency guarantees for offline reinforcement learning (RL) often rely on strong
assumptions on both the function classes (e.g., Bellman-completeness) and the data …

The curious price of distributional robustness in reinforcement learning with a generative model

L Shi, G Li, Y Wei, Y Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper investigates model robustness in reinforcement learning (RL) via the framework
of distributionally robust Markov decision processes (RMDPs). Despite recent efforts, the …

Beyond uniform sampling: Offline reinforcement learning with imbalanced datasets

ZW Hong, A Kumar, S Karnik… - Advances in …, 2023 - proceedings.neurips.cc
Offline reinforcement learning (RL) enables learning a decision-making policy without
interaction with the environment. This makes it particularly beneficial in situations where …

VOCE: Variational optimization with conservative estimation for offline safe reinforcement learning

J Guan, G Chen, J Ji, L Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Offline safe reinforcement learning (RL) algorithms promise to learn policies that satisfy
safety constraints directly in offline datasets without interacting with the environment. This …

Design from policies: Conservative test-time adaptation for offline policy optimization

J Liu, H Zhang, Z Zhuang, Y Kang… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we decouple the iterative bi-level offline RL (value estimation and policy
extraction) from the offline training phase, forming a non-iterative bi-level paradigm and …

Reinforcement learning in low-rank MDPs with density features

A Huang, J Chen, N Jiang - International Conference on …, 2023 - proceedings.mlr.press
MDPs with low-rank transitions—that is, the transition matrix can be factored into the product
of two matrices, left and right—is a highly representative structure that enables tractable …

Revisiting the linear-programming framework for offline RL with general function approximation

AE Ozdaglar, S Pattathil, J Zhang… - … on Machine Learning, 2023 - proceedings.mlr.press
Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-
making using a pre-collected dataset, without further interaction with the environment …

Optimal conservative offline RL with general function approximation via augmented Lagrangian

P Rashidinejad, H Zhu, K Yang, S Russell… - arXiv preprint arXiv …, 2022 - arxiv.org
Offline reinforcement learning (RL), which refers to decision-making from a previously-
collected dataset of interactions, has received significant attention over the past years. Much …

Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression

JY Ma, J Yan, D Jayaraman… - Advances in neural …, 2022 - proceedings.neurips.cc
Offline goal-conditioned reinforcement learning (GCRL) promises general-purpose skill
learning in the form of reaching diverse goals from purely offline datasets. We propose …