All versions - Academic Resource Search

Reward uncertainty for exploration in preference-based reinforcement learning

X Liang, K Shu, K Lee, P Abbeel - arXiv preprint arXiv:2205.12401, 2022 - arxiv.org

Conveying complex objectives to reinforcement learning (RL) agents often requires
meticulous reward engineering. Preference-based RL methods are able to learn a more …

Cited by: 57

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

X Liang, K Shu, K Lee, P Abbeel - arXiv e-prints, 2022 - ui.adsabs.harvard.edu

Conveying complex objectives to reinforcement learning (RL) agents often requires
meticulous reward engineering. Preference-based RL methods are able to learn a more …

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

X Liang, K Shu, K Lee, P Abbeel - 10th International Conference …, 2022 - koasas.kaist.ac.kr

Conveying complex objectives to reinforcement learning (RL) agents often requires
meticulous reward engineering. Preference-based RL methods are able to learn a more …

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

X Liang, K Shu, K Lee, P Abbeel - International Conference on Learning … - openreview.net

Conveying complex objectives to reinforcement learning (RL) agents often requires
meticulous reward engineering. Preference-based RL methods are able to learn a more …

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

X Liang, K Shu, K Lee, P Abbeel - Deep RL Workshop NeurIPS 2021 - openreview.net

Conveying complex objectives to reinforcement learning (RL) agents often requires
meticulous reward engineering. Preference-based RL methods are able to learn a more …
