A Comprehensive Survey on Inverse Constrained Reinforcement Learning: Definitions, Progress and Challenges

G Liu, S Xu, S Liu, A Gaurav, SG Subramanian… - arXiv preprint arXiv …, 2024 - arxiv.org
Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit
constraints followed by expert agents from their demonstration data. As an emerging …
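
As background for this setup (a sketch in standard constrained-MDP notation, not taken from the abstract): the expert is modelled as solving
$$\max_\pi \; \mathbb{E}_\pi\Big[\textstyle\sum_t \gamma^t r(s_t, a_t)\Big] \quad \text{s.t.} \quad \mathbb{E}_\pi\Big[\textstyle\sum_t \gamma^t c(s_t, a_t)\Big] \le \epsilon,$$
where the reward $r$ is known but the cost function $c$ is not; ICRL is the inverse problem of recovering $c$ (and thereby the constraint) from the demonstrations.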

Partial Identifiability and Misspecification in Inverse Reinforcement Learning

J Skalse, A Abate - arXiv preprint arXiv:2411.15951, 2024 - arxiv.org
The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function $R$ from a
policy $\pi$. This problem is difficult for several reasons. First of all, there are typically …
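
For orientation (a standard fact, not specific to this paper): one well-known source of partial identifiability is potential shaping, which preserves optimal policies. From $\pi$ alone, $R$ is therefore at best recoverable up to transformations such as
$$R'(s, a, s') = R(s, a, s') + \gamma\,\Phi(s') - \Phi(s)$$
for an arbitrary potential function $\Phi$, together with positive rescaling; misspecification of the behavioural model compounds this ambiguity.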

Learning true objectives: Linear algebraic characterizations of identifiability in inverse reinforcement learning

ML Shehab, A Aspeel, N Arechiga… - 6th Annual Learning …, 2024 - proceedings.mlr.press
Inverse reinforcement learning (IRL) has emerged as a powerful paradigm for extracting
expert skills from observed behavior, with applications ranging from autonomous systems to …
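
As context for the linear-algebraic view (a classical sketch following Ng and Russell (2000); the paper's exact characterization may differ): in a finite MDP with deterministic expert policy $\pi$ and state-dependent reward $R$, optimality of $\pi$ is the linear condition
$$(P_{\pi} - P_{a})\,(I - \gamma P_{\pi})^{-1} R \;\ge\; 0 \quad \text{for every action } a,$$
so the set of consistent rewards is a polyhedral cone, and identifiability amounts to asking when that cone is essentially one-dimensional.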

Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning

A Schlaginhaufen, M Kamgarpour - arXiv preprint arXiv:2406.01793, 2024 - arxiv.org
Inverse reinforcement learning (IRL) aims to infer a reward from expert demonstrations,
motivated by the idea that the reward, rather than the policy, is the most succinct and …
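
A sketch of why transfer is delicate (standard entropy-regularized notation, assumed here rather than quoted): the expert is typically modelled as soft-optimal,
$$\pi_E(a \mid s) \;\propto\; \exp\big(Q^{\mathrm{soft}}_R(s, a)\big),$$
and many distinct rewards induce the same $\pi_E$ under the training dynamics while ranking policies differently under new dynamics, so transfer guarantees require pinning down the reward beyond this equivalence class.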

The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret

L Fluri, L Lang, A Abate, P Forré, D Krueger… - arXiv preprint arXiv …, 2024 - arxiv.org
In reinforcement learning, specifying reward functions that capture the intended task can be
very challenging. Reward learning aims to address this issue by learning the reward …
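
The failure mode in the title can be stated concisely (standard notation, not from the snippet): a learned reward $\hat R$ may have small error $\mathbb{E}_{(s,a) \sim D}\big[(\hat R - R)^2\big]$ on the training distribution $D$, yet the policy $\hat\pi$ optimizing $\hat R$ can incur large regret $J_R(\pi^*) - J_R(\hat\pi)$ under the true reward $R$, because optimization drives $\hat\pi$ toward state-action pairs outside the support of $D$, where $\hat R$ is unconstrained.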

A Novel Variational Lower Bound for Inverse Reinforcement Learning

Y Gui, P Doshi - arXiv preprint arXiv:2311.03698, 2023 - arxiv.org
Inverse reinforcement learning (IRL) seeks to learn the reward function from expert
trajectories, to understand the task for imitation or collaboration, thereby removing the need …
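
For orientation only (a generic variational sketch; the paper's specific bound is presumably structured differently): such methods maximize a lower bound on the demonstration likelihood of the form
$$\log p(\tau) \;\ge\; \mathbb{E}_{q(\theta)}\big[\log p(\tau \mid \theta)\big] - \mathrm{KL}\big(q(\theta)\,\|\,p(\theta)\big),$$
where $\theta$ parameterizes the reward and $q$ is a tractable approximate posterior.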

Inference of Utilities and Time Preference in Sequential Decision-Making

H Cao, Z Wu, R Xu - arXiv preprint arXiv:2405.15975, 2024 - arxiv.org
This paper introduces a novel stochastic control framework to enhance the capabilities of
automated investment managers, or robo-advisors, by accurately inferring clients' …
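
A common model behind this kind of inference (assumed here for illustration, not quoted from the abstract): the client is taken to maximize discounted utility
$$\mathbb{E}\Big[\textstyle\sum_t \delta^t\, u(c_t)\Big],$$
and the inverse problem is to recover both the utility function $u$ and the time-preference parameter $\delta$ from observed investment decisions.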

Partial Identifiability in Inverse Reinforcement Learning For Agents With Non-Exponential Discounting

J Skalse, A Abate - arXiv preprint arXiv:2412.11155, 2024 - arxiv.org
The aim of inverse reinforcement learning (IRL) is to infer an agent's preferences from
observing their behaviour. Usually, preferences are modelled as a reward function, $R$ …
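
The departure from the standard setting (background, not from the snippet): exponential discounting weights a delay of $t$ steps by $\gamma^t$, whereas non-exponential schemes such as hyperbolic discounting use
$$D(t) = \frac{1}{1 + k t},$$
which can produce time-inconsistent behaviour and changes which pairs of reward and discount function are behaviourally distinguishable.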

Rethinking Adversarial Inverse Reinforcement Learning: From the Angles of Policy Imitation and Transferable Reward Recovery

Y Zhang, W Zhou, Y Zhou - arXiv preprint arXiv:2410.07643, 2024 - arxiv.org
In scenarios of inverse reinforcement learning (IRL) with a single expert, adversarial inverse
reinforcement learning (AIRL) serves as a foundational approach to providing …
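
For reference, AIRL's defining choice (Fu et al., 2018) is a discriminator of the form
$$D_\theta(s, a) = \frac{\exp f_\theta(s, a)}{\exp f_\theta(s, a) + \pi(a \mid s)},$$
so that at optimality $f_\theta$ recovers the reward up to shaping; this recovery property is what the paper re-examines.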

Convergence of a model-free entropy-regularized inverse reinforcement learning algorithm

T Renard, A Schlaginhaufen, T Ni… - arXiv preprint arXiv …, 2024 - arxiv.org
Given a dataset of expert demonstrations, inverse reinforcement learning (IRL) aims to
recover a reward for which the expert is optimal. This work proposes a model-free algorithm …
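
As context (standard maximum-entropy IRL notation, assumed here): algorithms of this kind ascend the entropy-regularized demonstration likelihood, whose gradient for a linear reward $r_\theta = \theta^\top \phi$ is the feature-expectation gap
$$\nabla_\theta L(\theta) = \mathbb{E}_{\mathrm{expert}}[\phi] - \mathbb{E}_{\pi_\theta}[\phi],$$
with $\pi_\theta$ the soft-optimal policy for $r_\theta$; "model-free" means this gradient is estimated from samples without access to the transition kernel.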