Regularized inverse reinforcement learning

Y Matsuo, Y LeCun, M Sahani, D Precup, D Silver… - Neural Networks, 2022 - Elsevier

Deep learning (DL) and reinforcement learning (RL) methods seem to be a part of
indispensable factors to achieve human-level or super-human AI systems. On the other …

Cited by 269 - Related articles - All 7 versions

[PDF] mlr.press

Identifiability and generalizability in constrained inverse reinforcement learning

A Schlaginhaufen… - … Conference on Machine …, 2023 - proceedings.mlr.press

Two main challenges in Reinforcement Learning (RL) are designing appropriate reward
functions and ensuring the safety of the learned policy. To address these challenges, we …

Cited by 10 - Related articles - All 9 versions

[PDF] neurips.cc

Distributed inverse constrained reinforcement learning for multi-agent systems

S Liu, M Zhu - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc

This paper considers the problem of recovering the policies of multiple interacting experts by
estimating their reward functions and constraints where the demonstration data of the …

Cited by 17 - Related articles - All 7 versions

[PDF] arxiv.org

Offline inverse reinforcement learning

F Jarboui, V Perchet - arXiv preprint arXiv:2106.05068, 2021 - arxiv.org

The objective of offline RL is to learn optimal policies when a fixed exploratory
demonstrations data-set is available and sampling additional observations is impossible …

Cited by 14 - Related articles - All 3 versions

[PDF] arxiv.org

Curricular Subgoals for Inverse Reinforcement Learning

S Liu, Y Qing, S Xu, H Wu, J Zhang, J Cong… - arXiv preprint arXiv …, 2023 - arxiv.org

Inverse Reinforcement Learning (IRL) aims to reconstruct the reward function from expert
demonstrations to facilitate policy learning, and has demonstrated its remarkable success in …

Cited by 1 - Related articles - All 3 versions

[PDF] neurips.cc

Robust imitation via mirror descent inverse reinforcement learning

DS Han, H Kim, H Lee, JH Ryu… - Advances in Neural …, 2022 - proceedings.neurips.cc

Recently, adversarial imitation learning has shown a scalable reward acquisition method for
inverse reinforcement learning (IRL) problems. However, estimated reward signals often …

Cited by 3 - Related articles - All 6 versions

[PDF] arxiv.org