A Schlaginhaufen… - … Conference on Machine …, 2023 - proceedings.mlr.press
Two main challenges in Reinforcement Learning (RL) are designing appropriate reward functions and ensuring the safety of the learned policy. To address these challenges, we …
S Levine, V Koltun - arXiv preprint arXiv:1206.4617, 2012 - arxiv.org
Inverse optimal control, also known as inverse reinforcement learning, is the problem of recovering an unknown reward function in a Markov decision process from expert …
While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent's policy or demonstrated …
Inverse reinforcement learning (IRL) denotes a powerful family of algorithms for recovering a reward function justifying the behavior demonstrated by an expert agent. A well-known …
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy that best fits observed sequences of states and actions implemented by an …
AM Metelli, G Ramponi, A Concetti… - … on Machine Learning, 2021 - proceedings.mlr.press
The reward function is widely accepted as a succinct, robust, and transferable representation of a task. Typical approaches, at the basis of Inverse Reinforcement Learning …
D Brown, S Niekum - Proceedings of the AAAI conference on artificial …, 2018 - ojs.aaai.org
In the field of reinforcement learning there has been recent progress towards safety and high-confidence bounds on policy performance. However, to our knowledge, no practical …
A Jacq, M Geist, A Paiva… - … Conference on Machine …, 2019 - proceedings.mlr.press
In this paper, we propose a novel setting for Inverse Reinforcement Learning (IRL), namely "Learning from a Learner" (LfL). As opposed to standard IRL, it does not consist in learning a …
G Kalweit, M Huegle, M Werling… - Advances in neural …, 2020 - proceedings.neurips.cc
Popular Maximum Entropy Inverse Reinforcement Learning approaches require the computation of expected state visitation frequencies for the optimal policy under an estimate …