Reinforcement Learning (RL) algorithms suffer from the dependency on accurately engineered reward functions to properly guide the learning agents to do the required tasks …
Reinforcement learning (RL) techniques optimize the accumulated long-term reward of a suitably chosen reward function. However, designing such a reward function often requires …
J Hejna, D Sadigh - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Reward functions are difficult to design and often hard to align with human intent. Preference- based Reinforcement Learning (RL) algorithms address these problems by learning reward …
X Hu, J Li, X Zhan, QS Jia, YQ Zhang - arXiv preprint arXiv:2305.17400, 2023 - arxiv.org
Preference-based reinforcement learning (PbRL) provides a natural way to align RL agents' behavior with human desired outcomes, but is often restrained by costly human feedback …
R Liu, F Bai, Y Du, Y Yang - Advances in Neural …, 2022 - proceedings.neurips.cc
Abstract Setting up a well-designed reward function has been challenging for many reinforcement learning applications. Preference-based reinforcement learning (PbRL) …
X Liang, K Shu, K Lee, P Abbeel - arXiv preprint arXiv:2205.12401, 2022 - arxiv.org
Conveying complex objectives to reinforcement learning (RL) agents often requires meticulous reward engineering. Preference-based RL methods are able to learn a more …
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories, rather …
This paper makes a first step toward the integration of two subfields of machine learning, namely preference learning and reinforcement learning (RL). An important motivation for a …
Abstract Preference-based Reinforcement Learning (PbRL) replaces reward values in traditional reinforcement learning by preferences to better elicit human opinion on the target …