PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward

W Zhou, W Li - arXiv preprint arXiv:2306.01731, 2023 - arxiv.org
Many imitation learning (IL) algorithms employ inverse reinforcement learning (IRL) to infer
the intrinsic reward function that an expert is implicitly optimizing for based on their …

Intrinsic reward driven imitation learning via generative model

X Yu, Y Lyu, I Tsang - International conference on machine …, 2020 - proceedings.mlr.press
Imitation learning in a high-dimensional environment is challenging. Most inverse
reinforcement learning (IRL) methods fail to outperform the demonstrator in such a high …

Inverse Reinforcement Learning by Estimating Expertise of Demonstrators

M Beliaev, R Pedarsani - arXiv preprint arXiv:2402.01886, 2024 - arxiv.org
In Imitation Learning (IL), utilizing suboptimal and heterogeneous demonstrations presents a
substantial challenge due to the varied nature of real-world data. However, standard IL …

Probability Density Estimation Based Imitation Learning

Y Liu, Y Chang, S Jiang, X Wang, B Liang… - arXiv preprint arXiv …, 2021 - arxiv.org
Imitation Learning (IL) is an effective learning paradigm that exploits interactions between
agents and environments. It does not require explicit reward signals and instead tries to …

Support-weighted adversarial imitation learning

R Wang, C Ciliberto, P Amadori, Y Demiris - arXiv preprint arXiv …, 2020 - arxiv.org
Adversarial Imitation Learning (AIL) is a broad family of imitation learning methods designed
to mimic expert behaviors from demonstrations. While AIL has shown state-of-the-art …
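
The adversarial recipe this family shares is a discriminator trained to tell expert state-action pairs from the learner's, with the discriminator's output recycled as a reward for the policy. Below is a minimal PyTorch sketch of that generic AIL step, not of the support-weighting this particular paper adds; the network shape, batch tensors, and the 1e-8 clamp are illustrative assumptions.

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        """Classifies (state, action) pairs as expert (label 1) or policy (label 0)."""
        def __init__(self, obs_dim, act_dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),  # logit of P(expert | s, a)
            )

        def forward(self, obs, act):
            return self.net(torch.cat([obs, act], dim=-1))

    def ail_step(disc, opt, expert_obs, expert_act, policy_obs, policy_act):
        """One discriminator update; returns surrogate rewards for the policy batch."""
        bce = nn.BCEWithLogitsLoss()
        loss = bce(disc(expert_obs, expert_act), torch.ones(len(expert_obs), 1)) \
             + bce(disc(policy_obs, policy_act), torch.zeros(len(policy_obs), 1))
        opt.zero_grad(); loss.backward(); opt.step()
        # Surrogate reward: large when the discriminator finds a sample expert-like.
        with torch.no_grad():
            d = torch.sigmoid(disc(policy_obs, policy_act))
            return -torch.log(1.0 - d + 1e-8)

The policy itself is then updated with a standard RL algorithm (e.g. PPO or TRPO) on the returned surrogate reward.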

Curricular Subgoals for Inverse Reinforcement Learning

S Liu, Y Qing, S Xu, H Wu, J Zhang, J Cong… - arXiv preprint arXiv …, 2023 - arxiv.org
Inverse Reinforcement Learning (IRL) aims to reconstruct the reward function from expert
demonstrations to facilitate policy learning, and has demonstrated its remarkable success in …
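
The reward-recovery problem stated here is ill-posed on its own, since many reward functions explain the same demonstrations, so most IRL methods add a regularizing principle. One common choice, shown for context rather than as this paper's curricular method, is the maximum-entropy formulation, which fits reward parameters theta so that the demonstration set D is most likely under the induced trajectory distribution:

    \max_{\theta} \sum_{\tau \in \mathcal{D}} \log p_{\theta}(\tau),
    \qquad
    p_{\theta}(\tau) \;\propto\; \exp\!\Big( \sum_{t} r_{\theta}(s_t, a_t) \Big)

The policy is then trained against the recovered reward r_theta.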

Extrapolating beyond suboptimal demonstrations via inverse reinforcement learning from observations

D Brown, W Goo, P Nagarajan… - … conference on machine …, 2019 - proceedings.mlr.press
A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to
significantly outperform the demonstrator. This is because IRL typically seeks a reward …
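
The truncated sentence refers to the usual IRL goal of explaining, and hence at best matching, the demonstrator. The extrapolation idea pursued in this line of work is to learn instead from rankings over suboptimal trajectories, so the reward becomes a relative quantity the learner can push beyond the best demonstration. A minimal PyTorch sketch of such a pairwise trajectory-ranking loss follows; the reward_net interface and tensor shapes are illustrative assumptions, not the paper's exact architecture or training pipeline.

    import torch
    import torch.nn.functional as F

    def ranking_loss(reward_net, traj_low, traj_high):
        """Bradley-Terry style loss for a ranked pair where traj_high is preferred to traj_low.

        traj_low, traj_high: float tensors of shape (T, obs_dim) holding trajectory observations.
        reward_net: module mapping (T, obs_dim) -> (T, 1) per-step learned rewards.
        """
        # Predicted return = sum of learned per-step rewards along each trajectory.
        return_low = reward_net(traj_low).sum()
        return_high = reward_net(traj_high).sum()
        # Cross-entropy over the pair pushes the preferred trajectory's return higher.
        logits = torch.stack([return_low, return_high]).unsqueeze(0)  # shape (1, 2)
        target = torch.tensor([1])  # index of the preferred trajectory
        return F.cross_entropy(logits, target)

Training the reward network on many such ranked pairs and then running standard RL on the learned reward is what, in principle, lets the final policy outperform the demonstrator.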

Sample Efficient Imitation Learning via Reward Function Trained in Advance

L Zhang - arXiv preprint arXiv:2111.11711, 2021 - arxiv.org
Imitation learning (IL) is a framework for learning to imitate expert behavior from
demonstrations. Recently, IL has shown promising results on high-dimensional control tasks …

EvIL: Evolution Strategies for Generalisable Imitation Learning

S Sapora, G Swamy, C Lu, YW Teh… - arXiv preprint arXiv …, 2024 - arxiv.org
Oftentimes in imitation learning (IL), the environment in which we collect expert demonstrations
and the environment in which we want to deploy our learned policy aren't exactly the same (e.g. …

Hybrid inverse reinforcement learning

J Ren, G Swamy, ZS Wu, JA Bagnell… - arXiv preprint arXiv …, 2024 - arxiv.org
The inverse reinforcement learning approach to imitation learning is a double-edged sword.
On the one hand, it can enable learning from a smaller number of expert demonstrations …