Learning Reward Machines through Preference Queries over Sequences

E Hsiung, J Biswas, S Chaudhuri - arXiv preprint arXiv:2308.09301, 2023 - arxiv.org
Reward machines have shown great promise at capturing non-Markovian reward functions
for learning tasks that involve complex action sequencing. However, no algorithm currently …

Learning Reward Machines through Preference Queries over Sequences

E Hsiung, J Biswas, S Chaudhuri - arXiv e-prints, 2023 - ui.adsabs.harvard.edu