An offline time-aware apprenticeship learning framework for evolving reward functions

X Yang, G Gao, M Chi - arXiv preprint arXiv:2305.09070, 2023 - arxiv.org

Apprenticeship learning (AL) is the process of inducing effective decision-making policies by observing and imitating experts' demonstrations. Most existing AL approaches, however, are not designed to cope with the evolving reward functions commonly found in human-centric tasks such as healthcare, where offline learning is required. In this paper, we propose an offline Time-aware Hierarchical EM Energy-based Sub-trajectory (THEMES) AL framework to tackle the evolving reward functions in such tasks. The effectiveness of THEMES is evaluated on a challenging task: sepsis treatment. The experimental results demonstrate that THEMES can significantly outperform competitive state-of-the-art baselines.

Cited by: 3 · All 2 versions
