PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward

W Zhou, W Li - arXiv preprint arXiv:2306.01731, 2023 - arxiv.org
Many imitation learning (IL) algorithms employ inverse reinforcement learning (IRL) to infer
the intrinsic reward function that an expert is implicitly optimizing for based on their …

Intrinsic reward driven imitation learning via generative model

X Yu, Y Lyu, I Tsang - International conference on machine …, 2020 - proceedings.mlr.press
Imitation learning in a high-dimensional environment is challenging. Most inverse
reinforcement learning (IRL) methods fail to outperform the demonstrator in such a high …

Inverse Reinforcement Learning by Estimating Expertise of Demonstrators

M Beliaev, R Pedarsani - arXiv preprint arXiv:2402.01886, 2024 - arxiv.org
In Imitation Learning (IL), utilizing suboptimal and heterogeneous demonstrations presents a
substantial challenge due to the varied nature of real-world data. However, standard IL …

Probability Density Estimation Based Imitation Learning

Y Liu, Y Chang, S Jiang, X Wang, B Liang… - arXiv preprint arXiv …, 2021 - arxiv.org
Imitation Learning (IL) is an effective learning paradigm that exploits interactions between
agents and environments. It does not require explicit reward signals and instead tries to …

Support-weighted adversarial imitation learning

R Wang, C Ciliberto, P Amadori, Y Demiris - arXiv preprint arXiv …, 2020 - arxiv.org
Adversarial Imitation Learning (AIL) is a broad family of imitation learning methods designed
to mimic expert behaviors from demonstrations. While AIL has shown state-of-the-art …
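
The adversarial recipe this family shares is a discriminator trained to tell expert state-action pairs from the learner's, with the discriminator's output recycled as a reward for the policy. Below is a minimal PyTorch sketch of that generic AIL step, not of the support-weighting this particular paper adds; the network shape, batch tensors, and the 1e-8 clamp are illustrative assumptions.

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        """Classifies (state, action) pairs as expert (label 1) or policy (label 0)."""
        def __init__(self, obs_dim, act_dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),  # logit of P(expert | s, a)
            )

        def forward(self, obs, act):
            return self.net(torch.cat([obs, act], dim=-1))

    def ail_step(disc, opt, expert_obs, expert_act, policy_obs, policy_act):
        """One discriminator update; returns surrogate rewards for the policy batch."""
        bce = nn.BCEWithLogitsLoss()
        loss = bce(disc(expert_obs, expert_act), torch.ones(len(expert_obs), 1)) \
             + bce(disc(policy_obs, policy_act), torch.zeros(len(policy_obs), 1))
        opt.zero_grad(); loss.backward(); opt.step()
        # Surrogate reward: large when the discriminator finds a sample expert-like.
        with torch.no_grad():
            d = torch.sigmoid(disc(policy_obs, policy_act))
            return -torch.log(1.0 - d + 1e-8)

The policy itself is then updated with a standard RL algorithm (e.g. PPO or TRPO) on the returned surrogate reward.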

Curricular Subgoals for Inverse Reinforcement Learning

S Liu, Y Qing, S Xu, H Wu, J Zhang, J Cong… - arXiv preprint arXiv …, 2023 - arxiv.org
Inverse Reinforcement Learning (IRL) aims to reconstruct the reward function from expert
demonstrations to facilitate policy learning, and has demonstrated its remarkable success in …
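
The reward-recovery problem stated here is ill-posed on its own, since many reward functions explain the same demonstrations, so most IRL methods add a regularizing principle. One common choice, shown for context rather than as this paper's curricular method, is the maximum-entropy formulation, which fits reward parameters theta so that the demonstration set D is most likely under the induced trajectory distribution:

    \max_{\theta} \sum_{\tau \in \mathcal{D}} \log p_{\theta}(\tau),
    \qquad
    p_{\theta}(\tau) \;\propto\; \exp\!\Big( \sum_{t} r_{\theta}(s_t, a_t) \Big)

The policy is then trained against the recovered reward r_theta.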

Extrapolating beyond suboptimal demonstrations via inverse reinforcement learning from observations

D Brown, W Goo, P Nagarajan… - … conference on machine …, 2019 - proceedings.mlr.press
A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to
significantly outperform the demonstrator. This is because IRL typically seeks a reward …
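
The truncated sentence refers to the usual IRL goal of explaining, and hence at best matching, the demonstrator. The extrapolation idea pursued in this line of work is to learn instead from rankings over suboptimal trajectories, so the reward becomes a relative quantity the learner can push beyond the best demonstration. A minimal PyTorch sketch of such a pairwise trajectory-ranking loss follows; the reward_net interface and tensor shapes are illustrative assumptions, not the paper's exact architecture or training pipeline.

    import torch
    import torch.nn.functional as F

    def ranking_loss(reward_net, traj_low, traj_high):
        """Bradley-Terry style loss for a ranked pair where traj_high is preferred to traj_low.

        traj_low, traj_high: float tensors of shape (T, obs_dim) holding trajectory observations.
        reward_net: module mapping (T, obs_dim) -> (T, 1) per-step learned rewards.
        """
        # Predicted return = sum of learned per-step rewards along each trajectory.
        return_low = reward_net(traj_low).sum()
        return_high = reward_net(traj_high).sum()
        # Cross-entropy over the pair pushes the preferred trajectory's return higher.
        logits = torch.stack([return_low, return_high]).unsqueeze(0)  # shape (1, 2)
        target = torch.tensor([1])  # index of the preferred trajectory
        return F.cross_entropy(logits, target)

Training the reward network on many such ranked pairs and then running standard RL on the learned reward is what, in principle, lets the final policy outperform the demonstrator.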

Sample Efficient Imitation Learning via Reward Function Trained in Advance

L Zhang - arXiv preprint arXiv:2111.11711, 2021 - arxiv.org
Imitation learning (IL) is a framework for learning to imitate expert behavior from
demonstrations. Recently, IL has shown promising results on high-dimensional control tasks …

EvIL: Evolution Strategies for Generalisable Imitation Learning

S Sapora, G Swamy, C Lu, YW Teh… - arXiv preprint arXiv …, 2024 - arxiv.org
Oftentimes in imitation learning (IL), the environment in which we collect expert demonstrations
and the environment in which we want to deploy our learned policy aren't exactly the same (e.g. …

Hybrid inverse reinforcement learning

J Ren, G Swamy, ZS Wu, JA Bagnell… - arXiv preprint arXiv …, 2024 - arxiv.org
The inverse reinforcement learning approach to imitation learning is a double-edged sword.
On the one hand, it can enable learning from a smaller number of expert demonstrations …