Hierarchies of reward machines

D Furelos-Blanco, M Law, A Jonsson… - International …, 2023 - proceedings.mlr.press
Reward machines (RMs) are a recent formalism for representing the reward function of a
reinforcement learning task through a finite-state machine whose edges encode subgoals of …

A survey of inverse reinforcement learning algorithms, theory, and applications

宋莉, 李大字, 徐昕 - Acta Automatica Sinica, 2023 - aas.net.cn
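With the research and development of deep reinforcement learning, applications of
reinforcement learning to real-world problems such as game playing, optimal decision-making,
and intelligent driving have also made significant progress. However, in the interaction
between agent and environment, reinforcement learning faces the difficulty of manually
designing reward functions …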

Learning Reward Machines through Preference Queries over Sequences

E Hsiung, J Biswas, S Chaudhuri - arXiv preprint arXiv:2308.09301, 2023 - arxiv.org
Reward machines have shown great promise at capturing non-Markovian reward functions
for learning tasks that involve complex action sequencing. However, no algorithm currently …