Imitation learning from observation with automatic discount scheduling

4 results (0.02 seconds)

[PDF] arxiv.org

Copa: General robotic manipulation through spatial constraints of parts with foundation models

H Huang, F Lin, Y Hu, S Wang, Y Gao - arXiv preprint arXiv:2403.08248, 2024 - arxiv.org

Foundation models pre-trained on web-scale data are shown to encapsulate extensive
world knowledge beneficial for robotic manipulation in the form of task planning. However …

Cited by 32 · Related articles · All 4 versions

[PDF] arxiv.org

[PDF] Robot Policy Learning with Temporal Optimal Transport Reward

Y Fu, H Zhang, D Wu, W Xu, B Boulet - arXiv preprint arXiv:2410.21795, 2024 - arxiv.org

Reward specification is one of the most tricky problems in Reinforcement Learning, which
usually requires tedious hand engineering in practice. One promising approach to tackle this …

[PDF] SCaR: Refining skill chaining for long-horizon robotic manipulation via dual regularization

Z Chen, Z Ji, J Huo, Y Gao - 2024 - orca.cardiff.ac.uk

Long-horizon robotic manipulation tasks typically involve a series of interrelated sub-tasks
spanning multiple execution stages. Skill chaining offers a feasible solution for these tasks …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

On Generalization and Distributional Update for Mimicking Observations with Adequate Exploration

Y Zhou, X Liu, X Zhang, Y Zhang - arXiv preprint arXiv:2501.12785, 2025 - arxiv.org

This paper tackles the efficiency and stability issues in learning from observations (LfO). We
commence by investigating how reward functions and policies generalize in LfO …
