Hyperparameter selection for offline reinforcement learning

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Cited by 229 · Related articles · All 9 versions

A minimalist approach to offline reinforcement learning

S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …

Cited by 661 · Related articles · All 6 versions

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint arXiv …, 2021 - arxiv.org

Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …

Cited by 291 · Related articles · All 4 versions

Adversarially trained actor critic for offline reinforcement learning

CA Cheng, T Xie, N Jiang… - … Conference on Machine …, 2022 - proceedings.mlr.press

We propose Adversarially Trained Actor Critic (ATAC), a new model-free algorithm
for offline reinforcement learning (RL) under insufficient data coverage, based on the …

Cited by 112 · Related articles · All 8 versions

RAMBO-RL: Robust adversarial model-based offline reinforcement learning

M Rigter, B Lacerda, N Hawes - Advances in neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) aims to find performant policies from logged data without
further environment interaction. Model-based algorithms, which learn a model of the …

Cited by 97 · Related articles · All 7 versions

RvS: What is essential for offline RL via supervised learning?

S Emmons, B Eysenbach, I Kostrikov… - arXiv preprint arXiv …, 2021 - arxiv.org

Recent work has shown that supervised learning alone, without temporal difference (TD)
learning, can be remarkably effective for offline RL. When does this hold true, and which …

Cited by 178 · Related articles · All 4 versions

Offline-to-online reinforcement learning via balanced replay and pessimistic Q-ensemble

S Lee, Y Seo, K Lee, P Abbeel… - Conference on Robot …, 2022 - proceedings.mlr.press

Recent advance in deep offline reinforcement learning (RL) has made it possible to train
strong robotic agents from offline datasets. However, depending on the quality of the trained …

Cited by 151 · Related articles · All 5 versions

Acme: A research framework for distributed reinforcement learning

MW Hoffman, B Shahriari, J Aslanides… - arXiv preprint arXiv …, 2020 - arxiv.org

Deep reinforcement learning (RL) has led to many recent and groundbreaking advances.
However, these advances have often come at the cost of both increased scale in the …

Cited by 242 · Related articles · All 2 versions

A practical deep reinforcement learning framework for multivariate occupant-centric control in buildings

Y Lei, S Zhan, E Ono, Y Peng, Z Zhang, T Hasama… - Applied Energy, 2022 - Elsevier

Reinforcement learning (RL) has been shown to have the potential for optimal control of
heating, ventilation, and air conditioning (HVAC) systems. Although research on RL-based …

Cited by 57 · Related articles · All 5 versions

Q-learning decision transformer: Leveraging dynamic programming for conditional sequence modelling in offline RL

T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press

Recent works have shown that tackling offline reinforcement learning (RL) with a conditional
policy produces promising results. The Decision Transformer (DT) combines the conditional …

Cited by 50 · Related articles · All 9 versions
