Disagreement-regularized imitation learning

M Zare, PM Kebria, A Khosravi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

In recent years, the development of robotics and artificial intelligence (AI) systems has been
nothing short of remarkable. As these systems continue to evolve, they are being utilized in …

Cited by 62 - Related articles - All 2 versions

[HTML] nih.gov

Transfer learning in deep reinforcement learning: A survey

Z Zhu, K Lin, AK Jain, J Zhou - IEEE Transactions on Pattern …, 2023 - ieeexplore.ieee.org

Reinforcement learning is a learning paradigm for solving sequential decision-making
problems. Recent years have witnessed remarkable progress in reinforcement learning …

Cited by 715 - Related articles - All 12 versions

[PDF] thecvf.com

TrafficSim: Learning to simulate realistic multi-agent behaviors

S Suo, S Regalado, S Casas… - Proceedings of the …, 2021 - openaccess.thecvf.com

Simulation has the potential to massively scale evaluation of self-driving systems, enabling
rapid development as well as safe deployment. Bridging the gap between simulation and …

Cited by 236 - Related articles - All 5 versions

[PDF] arxiv.org

Reward model ensembles help mitigate overoptimization

T Coste, U Anwar, R Kirk, D Krueger - arXiv preprint arXiv:2310.02743, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) is a standard approach for fine-tuning
large language models to follow instructions. As part of this process, learned reward models …

Cited by 76 - Related articles - All 4 versions

[PDF] thecvf.com

Social NCE: Contrastive learning of socially-aware motion representations

Y Liu, Q Yan, A Alahi - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

Learning socially-aware motion representations is at the core of recent advances in multi-
agent problems, such as human motion forecasting and robot navigation in crowds. Despite …

Cited by 127 - Related articles - All 9 versions

[PDF] neurips.cc

Mitigating covariate shift in imitation learning via offline data with partial coverage

J Chang, M Uehara, D Sreenivas… - Advances in Neural …, 2021 - proceedings.neurips.cc

This paper studies offline Imitation Learning (IL) where an agent learns to imitate an expert
demonstrator without additional online environment interactions. Instead, the learner is …

Cited by 103 - Related articles - All 7 versions

[PDF] arxiv.org

Primal Wasserstein imitation learning

R Dadashi, L Hussenot, M Geist, O Pietquin - arXiv preprint arXiv …, 2020 - arxiv.org

Imitation Learning (IL) methods seek to match the behavior of an agent with that of an expert.
In the present work, we propose a new IL method based on a conceptually simple algorithm …

Cited by 148 - Related articles - All 8 versions

[PDF] mlr.press

f-IRL: Inverse reinforcement learning via state marginal matching

T Ni, H Sikchi, Y Wang, T Gupta… - … on Robot Learning, 2021 - proceedings.mlr.press

Imitation learning is well-suited for robotic tasks where it is difficult to directly program the
behavior or specify a cost for optimal control. In this work, we propose a method for learning …

Cited by 80 - Related articles - All 5 versions

[PDF] neurips.cc

Towards unifying behavioral and response diversity for open-ended learning in zero-sum games

X Liu, H Jia, Y Wen, Y Hu, Y Chen… - Advances in …, 2021 - proceedings.neurips.cc

Measuring and promoting policy diversity is critical for solving games with strong
non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock …

Cited by 51 - Related articles - All 5 versions

[PDF] arxiv.org

Improved deep reinforcement learning with expert demonstrations for urban autonomous driving

H Liu, Z Huang, J Wu, C Lv - 2022 IEEE Intelligent Vehicles …, 2022 - ieeexplore.ieee.org

Learning-based approaches, such as reinforcement learning (RL) and imitation learning
(IL), have indicated superiority over rule-based approaches in complex urban autonomous …

Cited by 86 - Related articles - All 4 versions
