X Liang, K Shu, K Lee, P Abbeel - 10th International Conference …, 2022 - koasas.kaist.ac.kr
Conveying complex objectives to reinforcement learning (RL) agents often requires
meticulous reward engineering. Preference-based RL methods can learn a more …